Zhanfeng Mo

mzf666

AI & ML interests

None yet

Recent Activity

upvoted a paper 8 days ago

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

Paper • 2511.20785 • Published 15 days ago • 150

upvoted a paper 16 days ago

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published 20 days ago • 91

upvoted 2 papers 21 days ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published 23 days ago • 132

MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

Paper • 2511.11793 • Published 26 days ago • 159

authored 5 papers about 2 months ago

Panda LLM: Training Data and Evaluation for Open-Sourced Chinese Instruction-Following Large Language Models

Paper • 2305.03025 • Published May 4, 2023

100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models

Paper • 2505.00551 • Published May 1 • 36

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6 • 30

First Try Matters: Revisiting the Role of Reflection in Reasoning Models

Paper • 2510.08308 • Published Oct 9 • 24

upvoted 3 papers 2 months ago

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3 • 75

First Try Matters: Revisiting the Role of Reflection in Reasoning Models

Paper • 2510.08308 • Published Oct 9 • 24

Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6 • 30

upvoted a paper 5 months ago

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

upvoted a paper 7 months ago

100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models

Paper • 2505.00551 • Published May 1 • 36