
LIU Shih-yang

sliuau

AI & ML interests

None yet

Recent Activity


upvoted 2 papers 4 days ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published 11 days ago • 95

Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models

Paper • 2511.18890 • Published 13 days ago • 29

liked a model 5 days ago

mistralai/Ministral-3-3B-Reasoning-2512

4B • Updated 3 days ago • 3.39k • 54

New activity in allenai/Olmo-3-7B-Think 11 days ago

Endless reasoning loop when serving the model with vLLM

#2 opened 12 days ago by sliuau

liked a model 12 days ago

allenai/Olmo-3-7B-Think

Text Generation • 528k • Updated 13 days ago • 15.1k • 65

published a dataset about 1 month ago

sliuau/DeepScaleR-Preview-Dataset-verl-format

Viewer • Updated Nov 3 • 40.8k • 43

updated a dataset about 1 month ago

sliuau/DeepScaleR-Preview-Dataset-verl-format

Viewer • Updated Nov 3 • 40.8k • 43

upvoted a paper about 1 month ago

DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning

Paper • 2510.15110 • Published Oct 16 • 15

updated 3 models about 1 month ago

liked 3 models about 1 month ago

nvidia/DLER-Llama-Nemotron-8B-Merge-Research

8B • Updated Oct 25 • 91 • 13

nvidia/DLER-R1-1.5B-Research

2B • Updated Oct 25 • 449 • 15

nvidia/DLER-R1-7B-Research

8B • Updated Oct 25 • 234 • 14

commented a paper about 1 month ago

DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning

Paper • 2510.15110 • Published Oct 16 • 15

published a model about 1 month ago

nvidia/DLER-R1-1.5B-Research

2B • Updated Oct 25 • 449 • 15

authored 2 papers about 1 month ago

CMOSE: Comprehensive Multi-Modality Online Student Engagement Dataset with High-Quality Labels

Paper • 2312.09066 • Published Dec 14, 2023

DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning

Paper • 2510.15110 • Published Oct 16 • 15

published 2 models about 2 months ago

nvidia/DLER-R1-7B-Research

8B • Updated Oct 25 • 234 • 14

nvidia/DLER-Llama-Nemotron-8B-Merge-Research

8B • Updated Oct 25 • 91 • 13
