tian

Xiaotiank

AI & ML interests

None yet

Recent Activity

updated a model about 2 months ago

TsinghuaC3I/Qwen2.5-7B-VL-ReAd-R

8B • Updated Oct 21 • 121 • 2

upvoted 2 papers 3 months ago

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18 • 114

MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook

Paper • 2509.14142 • Published Sep 17 • 10

updated a dataset 3 months ago

TsinghuaC3I/AdsQA

Viewer • Updated Sep 16 • 800 • 64 • 3

published a dataset 3 months ago

TsinghuaC3I/AdsQA

Viewer • Updated Sep 16 • 800 • 64 • 3

upvoted 3 papers 3 months ago

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Paper • 2509.09674 • Published Sep 11 • 80

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10 • 189

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Paper • 2509.07894 • Published Sep 9 • 31

published a model 3 months ago

TsinghuaC3I/Qwen2.5-7B-VL-ReAd-R

8B • Updated Oct 21 • 121 • 2

upvoted a paper 3 months ago

Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published Sep 4 • 75

upvoted a paper 4 months ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14 • 97

liked 2 Spaces 6 months ago

The Ultimate Guide to LLM Training | The Ultra-Scale Playbook

🔥

240

Learn every aspect of LLM training

FineWeb: Distilling the Web at Scale for High-Quality Text Data

🍷

upvoted a paper 8 months ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 120
