-
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Paper • 2507.13344 • Published • 57 -
π^3: Scalable Permutation-Equivariant Visual Geometry Learning
Paper • 2507.13347 • Published • 64 -
MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second
Paper • 2507.10065 • Published • 24 -
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering
Paper • 2507.08776 • Published • 54
Collections
Discover the best community collections!
Collections including paper arxiv:2507.02813
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 523 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
Controllable Text Generation for Large Language Models: A Survey
Paper • 2408.12599 • Published • 65 -
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations
Paper • 2408.12590 • Published • 36 -
Real-Time Video Generation with Pyramid Attention Broadcast
Paper • 2408.12588 • Published • 17 -
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Paper • 2408.11039 • Published • 63
-
TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
Paper • 2401.09416 • Published • 11 -
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
Paper • 2401.10171 • Published • 14 -
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
Paper • 2311.09217 • Published • 22 -
GALA: Generating Animatable Layered Assets from a Single Scan
Paper • 2401.12979 • Published • 9
-
Diffusion Classifiers Understand Compositionality, but Conditions Apply
Paper • 2505.17955 • Published • 22 -
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
Paper • 2506.14429 • Published • 44 -
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
Paper • 2507.02813 • Published • 60
-
Flux Fill Outpainting
👈4Extend images using AI to change size and alignment
-
FlexPainter: Flexible and Multi-View Consistent Texture Generation
Paper • 2506.02620 • Published • 14 -
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
Paper • 2507.02813 • Published • 60 -
UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding
Paper • 2506.23219 • Published • 7
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 7 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 23 -
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 14 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 69
-
FlashWorld: High-quality 3D Scene Generation within Seconds
Paper • 2510.13678 • Published • 71 -
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks
Paper • 2510.15019 • Published • 63 -
GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction
Paper • 2509.18090 • Published • 4 -
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation
Paper • 2509.19296 • Published • 23
-
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Paper • 2507.13344 • Published • 57 -
π^3: Scalable Permutation-Equivariant Visual Geometry Learning
Paper • 2507.13347 • Published • 64 -
MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second
Paper • 2507.10065 • Published • 24 -
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering
Paper • 2507.08776 • Published • 54
-
Diffusion Classifiers Understand Compositionality, but Conditions Apply
Paper • 2505.17955 • Published • 22 -
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
Paper • 2506.14429 • Published • 44 -
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
Paper • 2507.02813 • Published • 60
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 523 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
Flux Fill Outpainting
👈4Extend images using AI to change size and alignment
-
FlexPainter: Flexible and Multi-View Consistent Texture Generation
Paper • 2506.02620 • Published • 14 -
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
Paper • 2507.02813 • Published • 60 -
UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding
Paper • 2506.23219 • Published • 7
-
Controllable Text Generation for Large Language Models: A Survey
Paper • 2408.12599 • Published • 65 -
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations
Paper • 2408.12590 • Published • 36 -
Real-Time Video Generation with Pyramid Attention Broadcast
Paper • 2408.12588 • Published • 17 -
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Paper • 2408.11039 • Published • 63
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 7 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 23 -
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 14 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 69
-
TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
Paper • 2401.09416 • Published • 11 -
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
Paper • 2401.10171 • Published • 14 -
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
Paper • 2311.09217 • Published • 22 -
GALA: Generating Animatable Layered Assets from a Single Scan
Paper • 2401.12979 • Published • 9
-
FlashWorld: High-quality 3D Scene Generation within Seconds
Paper • 2510.13678 • Published • 71 -
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks
Paper • 2510.15019 • Published • 63 -
GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction
Paper • 2509.18090 • Published • 4 -
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation
Paper • 2509.19296 • Published • 23