Sujay Shahare

sujayshahare

sujay_shahare
SujayShahare
sujay-shahare

AI & ML interests

LLM post-training, RL

Organizations

None yet

Collections 3

Reasoning Datasets

A collection of reasoning datasets used for post-training base models with RL

allenai/tulu-3-sft-mixture

Viewer • Updated Dec 2, 2024 • 939k • 11.8k • 196
a-m-team/AM-DeepSeek-R1-Distilled-1.4M

Preview • Updated Mar 30 • 1.74k • 170
PrimeIntellect/SYNTHETIC-1

Viewer • Updated Feb 21 • 1.99M • 850 • 60
openai/gsm8k

Viewer • Updated Jan 4, 2024 • 17.6k • 434k • 1.01k

A collection of research papers I find interesting on reasoning models

Learning to Reason without External Rewards

Paper • 2505.19590 • Published May 26 • 29
Scalable Best-of-N Selection for Large Language Models via Self-Certainty

Paper • 2502.18581 • Published Feb 25
Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 90
Fractured Chain-of-Thought Reasoning

Paper • 2505.12992 • Published May 19 • 23

models 0

None public yet

datasets 0

None public yet
