Imagination Helps Visual Reasoning, But Not Yet in Latent Space Paper • 2602.22766 • Published 22 days ago • 42
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published Feb 3 • 47
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published Feb 13 • 57
Next Embedding Prediction Makes World Models Stronger Paper • 2603.02765 • Published 17 days ago • 20
Article NEO-unify: Building Native Multimodal Unified Models End to End • 15 days ago • 102
NLE: Non-autoregressive LLM-based ASR by Transcript Editing Paper • 2603.08397 • Published 11 days ago • 21
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 Text Generation • 124B • Updated about 22 hours ago • 69.6k • 273
Weak-SIGReg: Covariance Regularization for Stable Deep Learning Paper • 2603.05924 • Published 14 days ago • 1