Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 10 days ago • 85
FORGE:Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 16 days ago • 94
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published 16 days ago • 114
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published 11 days ago • 141
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published 21 days ago • 367
CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning Paper • 2512.02551 • Published Dec 2, 2025 • 13
CRINN: Contrastive Reinforcement Learning for Approximate Nearest Neighbor Search Paper • 2508.02091 • Published Aug 4, 2025 • 13
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning Paper • 2507.14111 • Published Jul 18, 2025 • 25