-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 23 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
Collections
Discover the best community collections!
Collections including paper arxiv:2508.20722
-
ExGRPO: Learning to Reason from Experience
Paper • 2510.02245 • Published • 80 -
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Paper • 2508.07407 • Published • 98 -
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 116 -
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
Paper • 2508.19828 • Published • 7
-
FastVLM: Efficient Vision Encoding for Vision Language Models
Paper • 2412.13303 • Published • 72 -
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 116 -
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper • 2508.16279 • Published • 53 -
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
Paper • 2509.12201 • Published • 104
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 626 -
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 301 -
Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 315 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 211
-
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Paper • 2406.04151 • Published • 24 -
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
Paper • 2510.16872 • Published • 104 -
Scaling Generalist Data-Analytic Agents
Paper • 2509.25084 • Published • 18 -
Scaling Agents via Continual Pre-training
Paper • 2509.13310 • Published • 117
-
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Paper • 2504.08600 • Published • 32 -
Think-on-Graph 3.0: Efficient and Adaptive LLM Reasoning on Heterogeneous Graphs via Multi-Agent Dual-Evolving Context Retrieval
Paper • 2509.21710 • Published • 18 -
TTRL: Test-Time Reinforcement Learning
Paper • 2504.16084 • Published • 120 -
Agent Lightning: Train ANY AI Agents with Reinforcement Learning
Paper • 2508.03680 • Published • 121
-
Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge
Paper • 2506.21506 • Published • 51 -
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Paper • 2505.17612 • Published • 81 -
Efficient Agent Training for Computer Use
Paper • 2505.13909 • Published • 44 -
Scaling Agents via Continual Pre-training
Paper • 2509.13310 • Published • 117
-
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 225 -
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper • 2508.16279 • Published • 53 -
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 116
-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 23 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
-
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Paper • 2406.04151 • Published • 24 -
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
Paper • 2510.16872 • Published • 104 -
Scaling Generalist Data-Analytic Agents
Paper • 2509.25084 • Published • 18 -
Scaling Agents via Continual Pre-training
Paper • 2509.13310 • Published • 117
-
ExGRPO: Learning to Reason from Experience
Paper • 2510.02245 • Published • 80 -
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Paper • 2508.07407 • Published • 98 -
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 116 -
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
Paper • 2508.19828 • Published • 7
-
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Paper • 2504.08600 • Published • 32 -
Think-on-Graph 3.0: Efficient and Adaptive LLM Reasoning on Heterogeneous Graphs via Multi-Agent Dual-Evolving Context Retrieval
Paper • 2509.21710 • Published • 18 -
TTRL: Test-Time Reinforcement Learning
Paper • 2504.16084 • Published • 120 -
Agent Lightning: Train ANY AI Agents with Reinforcement Learning
Paper • 2508.03680 • Published • 121
-
FastVLM: Efficient Vision Encoding for Vision Language Models
Paper • 2412.13303 • Published • 72 -
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 116 -
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper • 2508.16279 • Published • 53 -
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
Paper • 2509.12201 • Published • 104
-
Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge
Paper • 2506.21506 • Published • 51 -
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Paper • 2505.17612 • Published • 81 -
Efficient Agent Training for Computer Use
Paper • 2505.13909 • Published • 44 -
Scaling Agents via Continual Pre-training
Paper • 2509.13310 • Published • 117
-
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 225 -
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper • 2508.16279 • Published • 53 -
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 116
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 626 -
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 301 -
Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 315 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 211