Reinforcement Learning Course

Google’s new AI training method helps small models tackle complex reasoning

Google's SRL framework provides a step-by-step "curriculum" that makes LLMs more reliable for complex reasoning tasks.

Weibo's new open source AI model VibeThinker-1.5B outperforms DeepSeek-R1 on $7,800 post-training budget

Chinese social networking company Weibo's AI division recently released its open source VibeThinker-1.5B, a 1.5 billion ...

Meta’s SPICE framework pushes AI toward self-learning without human supervision

The new reinforcement learning system lets large language models challenge and improve themselves using real-world data ...

Deep Learning with Yacine on MSN

DeepSeek R1 Explained: GRPO, Reinforcement Learning & SFT

Dive into DeepSeek R1 and explore GRPO, reinforcement learning, and supervised fine-tuning (SFT) in an easy-to-understand way ...

The post-training revolution: How reinforcement learning is upending the AI infra stack

TechCrunch was proud to host Scale Venture Partners at Disrupt 2025 in San Francisco. Here’s an overview of their AI Stage session. The reinforcement learning market has exploded, with enterprises ...

The Robot Report

AgiBot deploys its Real-World Reinforcement Learning system

AgiBot said its Real-World Reinforcement Learning system lets robots learn new skills in minutes on a pilot production line.

IEEE

Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications

Abstract: Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. Most of the existing ...

english.newsnationtv

From Algorithms to Intelligence: How AI Is Reshaping Quantitative Finance Education

... from QuantInsti make these skills more accessible. The Rise of AI in Financial Markets. Financial markets produce massive amounts of data every second: prices, order books, news, social media sentiment ...

NBC 10 Philadelphia

Tiger Woods attends Learning Lab ceremony at Cobbs Creek Golf Course in Philly

Golf legend Tiger Woods attended a ribbon cutting ceremony for a facility in Philadelphia’s Cobbs Creek neighborhood that provides free educational programming for children and teenagers. The ceremony ...

GitHub

The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning

This repo is forked from verl. We build our code on the dapo recipe. Before training, you need to ensure that the AIME, AIME25 and AMC datasets are with "data_source" of "aime", "aime25" and "amc" ...
