Geyang
geyang627
AI & ML interests
None yet
Recent Activity
upvoted a paper 14 days ago
Safe and Scalable Web Agent Learning via Recreated Websites
upvoted an article 28 days ago
Deriving the PPO Loss from First Principles
upvoted an article 28 days ago
A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond