Collections
Collections including paper arxiv:2601.00393
-
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
Paper • 2509.26507 • Published • 547 -
mHC: Manifold-Constrained Hyper-Connections
Paper • 2512.24880 • Published • 306 -
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
Paper • 2601.00393 • Published • 131 -
LTX-2: Efficient Joint Audio-Visual Foundation Model
Paper • 2601.03233 • Published • 149
-
PersonaLive! Expressive Portrait Image Animation for Live Streaming
Paper • 2512.11253 • Published • 36 -
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
Paper • 2601.00393 • Published • 131 -
Agent READMEs: An Empirical Study of Context Files for Agentic Coding
Paper • 2511.12884 • Published • 23 -
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion
Paper • 2503.11576 • Published • 141
-
MagicWorld: Interactive Geometry-driven Video World Exploration
Paper • 2511.18886 • Published • 19 -
EvoVLA: Self-Evolving Vision-Language-Action Model
Paper • 2511.16166 • Published • 6 -
MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots
Paper • 2511.17889 • Published • 5 -
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
Paper • 2601.00393 • Published • 131
-
NitroGen: An Open Foundation Model for Generalist Gaming Agents
Paper • 2601.02427 • Published • 45 -
mHC: Manifold-Constrained Hyper-Connections
Paper • 2512.24880 • Published • 306 -
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models
Paper • 2512.24165 • Published • 51 -
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting
Paper • 2601.02151 • Published • 109
-
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations
Paper • 2512.21004 • Published • 13 -
Spatia: Video Generation with Updatable Spatial Memory
Paper • 2512.15716 • Published • 33 -
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
Paper • 2601.00393 • Published • 131
-
Captain Safari: A World Engine
Paper • 2511.22815 • Published • 11 -
Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Paper • 2512.08478 • Published • 77 -
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling
Paper • 2512.14614 • Published • 71 -
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
Paper • 2601.00393 • Published • 131
-
Seedance 1.0: Exploring the Boundaries of Video Generation Models
Paper • 2506.09113 • Published • 105 -
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion
Paper • 2506.08009 • Published • 30 -
Seeing Voices: Generating A-Roll Video from Audio with Mirage
Paper • 2506.08279 • Published • 27 -
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement
Paper • 2506.07848 • Published • 4