-
Agentic Reasoning for Large Language Models
Paper • 2601.12538 • Published • 194 -
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation
Paper • 2601.09688 • Published • 126 -
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization
Paper • 2512.24615 • Published • 119 -
Recursive Language Models
Paper • 2512.24601 • Published • 84
Collections
Discover the best community collections!
Collections including paper arxiv:2601.06002
-
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models
Paper • 2512.24618 • Published • 149 -
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
Paper • 2512.24873 • Published • 104 -
AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents
Paper • 2512.23343 • Published • 29 -
Figure It Out: Improving the Frontier of Reasoning with Active Visual Thinking
Paper • 2512.24297 • Published • 6
-
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models
Paper • 2503.09567 • Published • 1 -
AutoPR: Let's Automate Your Academic Promotion!
Paper • 2510.09558 • Published • 53 -
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning
Paper • 2601.06002 • Published • 52
-
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
Paper • 2508.07629 • Published • 43 -
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
Paper • 2508.07101 • Published • 14 -
Compressing Chain-of-Thought in LLMs via Step Entropy
Paper • 2508.03346 • Published • 8 -
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
Paper • 2508.08940 • Published • 27
-
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
Paper • 2501.18585 • Published • 61 -
RWKV-7 "Goose" with Expressive Dynamic State Evolution
Paper • 2503.14456 • Published • 153 -
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
Paper • 2503.15265 • Published • 46 -
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning
Paper • 2503.15558 • Published • 50
-
GARDO: Reinforcing Diffusion Models without Reward Hacking
Paper • 2512.24138 • Published • 29 -
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models
Paper • 2512.24165 • Published • 51 -
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space
Paper • 2512.24617 • Published • 64 -
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
Paper • 2512.23447 • Published • 98
-
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 223 -
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models
Paper • 2511.23319 • Published • 24 -
Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information
Paper • 2511.22176 • Published • 5 -
FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning
Paper • 2511.22265 • Published • 2
-
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Paper • 2504.13837 • Published • 139 -
TTRL: Test-Time Reinforcement Learning
Paper • 2504.16084 • Published • 120 -
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models
Paper • 2503.24235 • Published • 54 -
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Paper • 2506.01939 • Published • 188
-
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models
Paper • 2402.10986 • Published • 81 -
Aria Everyday Activities Dataset
Paper • 2402.13349 • Published • 31 -
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Paper • 2403.04132 • Published • 40 -
SaulLM-7B: A pioneering Large Language Model for Law
Paper • 2403.03883 • Published • 89
-
Agentic Reasoning for Large Language Models
Paper • 2601.12538 • Published • 194 -
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation
Paper • 2601.09688 • Published • 126 -
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization
Paper • 2512.24615 • Published • 119 -
Recursive Language Models
Paper • 2512.24601 • Published • 84
-
GARDO: Reinforcing Diffusion Models without Reward Hacking
Paper • 2512.24138 • Published • 29 -
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models
Paper • 2512.24165 • Published • 51 -
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space
Paper • 2512.24617 • Published • 64 -
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
Paper • 2512.23447 • Published • 98
-
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models
Paper • 2512.24618 • Published • 149 -
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
Paper • 2512.24873 • Published • 104 -
AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents
Paper • 2512.23343 • Published • 29 -
Figure It Out: Improving the Frontier of Reasoning with Active Visual Thinking
Paper • 2512.24297 • Published • 6
-
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 223 -
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models
Paper • 2511.23319 • Published • 24 -
Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information
Paper • 2511.22176 • Published • 5 -
FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning
Paper • 2511.22265 • Published • 2
-
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models
Paper • 2503.09567 • Published • 1 -
AutoPR: Let's Automate Your Academic Promotion!
Paper • 2510.09558 • Published • 53 -
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning
Paper • 2601.06002 • Published • 52
-
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
Paper • 2508.07629 • Published • 43 -
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
Paper • 2508.07101 • Published • 14 -
Compressing Chain-of-Thought in LLMs via Step Entropy
Paper • 2508.03346 • Published • 8 -
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
Paper • 2508.08940 • Published • 27
-
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Paper • 2504.13837 • Published • 139 -
TTRL: Test-Time Reinforcement Learning
Paper • 2504.16084 • Published • 120 -
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models
Paper • 2503.24235 • Published • 54 -
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Paper • 2506.01939 • Published • 188
-
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
Paper • 2501.18585 • Published • 61 -
RWKV-7 "Goose" with Expressive Dynamic State Evolution
Paper • 2503.14456 • Published • 153 -
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
Paper • 2503.15265 • Published • 46 -
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning
Paper • 2503.15558 • Published • 50
-
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models
Paper • 2402.10986 • Published • 81 -
Aria Everyday Activities Dataset
Paper • 2402.13349 • Published • 31 -
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Paper • 2403.04132 • Published • 40 -
SaulLM-7B: A pioneering Large Language Model for Law
Paper • 2403.03883 • Published • 89