KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration
Paper • 2605.14278 • Published • 37
Post-Trained MoE Can Skip Half Experts via Self-Distillation