Bingxiang He

hbx

https://hbx-hbx.github.io/

AI & ML interests

NLP

Recent Activity

upvoted a paper 23 days ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published 24 days ago • 132

updated a model 28 days ago

hbx/JustRL-DeepSeek-1.5B

Text Generation • 2B • Updated 28 days ago • 351 • 5

liked 2 models 28 days ago

hbx/JustRL-DeepSeek-1.5B

Text Generation • 2B • Updated 28 days ago • 351 • 5

hbx/JustRL-Nemotron-1.5B

Text Generation • 2B • Updated Nov 4 • 104 • 1

upvoted a paper about 1 month ago

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

Paper • 2511.02734 • Published Nov 4 • 20

updated a model about 1 month ago

hbx/JustRL-Nemotron-1.5B

Text Generation • 2B • Updated Nov 4 • 104 • 1

updated a collection about 1 month ago

JustRL

Collection

2 items • Updated Nov 1

published a model about 1 month ago

hbx/JustRL-Nemotron-1.5B

Text Generation • 2B • Updated Nov 4 • 104 • 1

updated a collection about 1 month ago

JustRL

Collection

2 items • Updated Nov 1

published a model about 1 month ago

hbx/JustRL-DeepSeek-1.5B

Text Generation • 2B • Updated 28 days ago • 351 • 5

authored 6 papers 2 months ago

Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents

Paper • 2402.09205 • Published Feb 14, 2024

AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset

Paper • 2504.03612 • Published Apr 4 • 2

The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning

Paper • 2406.11721 • Published Jun 17, 2024

upvoted 2 papers 3 months ago

UserRL: Training Interactive User-Centric Agent via Reinforcement Learning

Paper • 2509.19736 • Published Sep 24 • 12

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16 • 51

authored a paper 3 months ago

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16 • 51

liked a model 3 months ago

openbmb/VoxCPM-0.5B

Text-to-Speech • Updated Sep 19 • 2.43k • 773
