Dawei Li

wjldw

https://david-li0406.github.io/

AI & ML interests

LLM, NLP, Data Mining

Recent Activity

upvoted a paper about 2 months ago

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published Apr 11 • 82

upvoted a paper 3 months ago

RubricBench: Aligning Model-Generated Rubrics with Human Standards

Paper • 2603.01562 • Published Mar 2 • 63

upvoted a paper 4 months ago

OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

Paper • 2602.05400 • Published Feb 5 • 355

upvoted a paper 5 months ago

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

Paper • 2601.14004 • Published Jan 20 • 49

authored a paper 5 months ago

ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents

Paper • 2601.12294 • Published Jan 18 • 19

upvoted a paper 5 months ago

ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents

Paper • 2601.12294 • Published Jan 18 • 19

submitted a paper to Daily Papers 5 months ago

ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents

Paper • 2601.12294 • Published Jan 18 • 19

upvoted 4 papers 5 months ago

RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation

Paper • 2601.08430 • Published Jan 13 • 62

MMFormalizer: Multimodal Autoformalization in the Wild

Paper • 2601.03017 • Published Jan 6 • 106

EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis

Paper • 2601.05808 • Published Jan 9 • 37

Agent-as-a-Judge

Paper • 2601.05111 • Published Jan 8 • 20

updated 2 models 5 months ago

wjldw/ToolPRM-GRPO-synthesis

4B • Updated Jan 4 • 1

wjldw/ToolPRM-GRPO-v4

4B • Updated Jan 3 • 4

published a model 5 months ago

wjldw/ToolPRM-GRPO-v4

4B • Updated Jan 3 • 4

updated a model 5 months ago

wjldw/ToolPRM-Base-v4

Text Generation • 196k • Updated Jan 3 • 5

published a model 5 months ago

wjldw/ToolPRM-Base-v4

Text Generation • 196k • Updated Jan 3 • 5

updated a model 5 months ago

wjldw/ToolPRM-CoT-v4

Text Generation • 196k • Updated Jan 3 • 2

published a model 5 months ago

wjldw/ToolPRM-CoT-v4

Text Generation • 196k • Updated Jan 3 • 2

updated a model 5 months ago

wjldw/ToolPRM-Base-synthesis

Text Generation • 196k • Updated Jan 3 • 4

published a model 5 months ago

wjldw/ToolPRM-Base-synthesis

Text Generation • 196k • Updated Jan 3 • 4
