Collections

Discover the best community collections!

Collections including paper arxiv:2601.00393

Self-Improving World Modelling with Latent Actions

Paper • 2602.06130 • Published 9 days ago • 29
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 131

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30, 2025 • 547
mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 306
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 131
LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 149

Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

Paper • 2512.17532 • Published Dec 19, 2025 • 67
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 131

PersonaLive! Expressive Portrait Image Animation for Live Streaming

Paper • 2512.11253 • Published Dec 12, 2025 • 36
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 131
Agent READMEs: An Empirical Study of Context Files for Agentic Coding

Paper • 2511.12884 • Published Nov 17, 2025 • 23
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14, 2025 • 141

MagicWorld: Interactive Geometry-driven Video World Exploration

Paper • 2511.18886 • Published Nov 24, 2025 • 19
EvoVLA: Self-Evolving Vision-Language-Action Model

Paper • 2511.16166 • Published Nov 20, 2025 • 6
MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots

Paper • 2511.17889 • Published Nov 22, 2025 • 5
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 131

NitroGen: An Open Foundation Model for Generalist Gaming Agents

Paper • 2601.02427 • Published Jan 4 • 45
mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 306
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

Paper • 2512.24165 • Published Dec 30, 2025 • 51
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published Jan 5 • 109

NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 131

Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations

Paper • 2512.21004 • Published Dec 24, 2025 • 13
Spatia: Video Generation with Updatable Spatial Memory

Paper • 2512.15716 • Published Dec 17, 2025 • 33
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 131

Captain Safari: A World Engine

Paper • 2511.22815 • Published Nov 28, 2025 • 11
Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

Paper • 2512.08478 • Published Dec 9, 2025 • 77
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling

Paper • 2512.14614 • Published Dec 16, 2025 • 71
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 131

Video Generation

Seedance 1.0: Exploring the Boundaries of Video Generation Models

Paper • 2506.09113 • Published Jun 10, 2025 • 105
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion

Paper • 2506.08009 • Published Jun 9, 2025 • 30
Seeing Voices: Generating A-Roll Video from Audio with Mirage

Paper • 2506.08279 • Published Jun 9, 2025 • 27
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement

Paper • 2506.07848 • Published Jun 9, 2025 • 4
