Sujay Shahare
sujayshahare
AI & ML interests
LLM post-training, RL
Organizations
None yet
Reasoning LLMs
Collection of research papers I find interesting on reasoning models
- Learning to Reason without External Rewards
  Paper • 2505.19590 • Published • 29
- Scalable Best-of-N Selection for Large Language Models via Self-Certainty
  Paper • 2502.18581 • Published
- Training Large Language Models to Reason in a Continuous Latent Space
  Paper • 2412.06769 • Published • 90
- Fractured Chain-of-Thought Reasoning
  Paper • 2505.12992 • Published • 23
Reasoning Datasets
A collection of reasoning datasets used for post-training base models with RL
models
0
None public yet
datasets
0
None public yet