Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2507.02813

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17 • 57
π^3: Scalable Permutation-Equivariant Visual Geometry Learning

Paper • 2507.13347 • Published Jul 17 • 64
MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second

Paper • 2507.10065 • Published Jul 14 • 24
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering

Paper • 2507.08776 • Published Jul 11 • 54

about 4 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 523 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

AI Math: Diffusion

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 65
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Paper • 2408.12590 • Published Aug 22, 2024 • 36
Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22, 2024 • 17
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 63

TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion

Paper • 2401.09416 • Published Jan 17, 2024 • 11
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

Paper • 2401.10171 • Published Jan 18, 2024 • 14
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

Paper • 2311.09217 • Published Nov 15, 2023 • 22
GALA: Generating Animatable Layered Assets from a Single Scan

Paper • 2401.12979 • Published Jan 23, 2024 • 9

Diffusion Classifiers Understand Compositionality, but Conditions Apply

Paper • 2505.17955 • Published May 23 • 22
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Paper • 2506.14429 • Published Jun 17 • 44
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Paper • 2507.02813 • Published Jul 3 • 60

Fill outpaint image extender

Runtime error

4

Flux Fill Outpainting

👈

4

Extend images using AI to change size and alignment
FlexPainter: Flexible and Multi-View Consistent Texture Generation

Paper • 2506.02620 • Published Jun 3 • 14
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Paper • 2507.02813 • Published Jul 3 • 60
UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding

Paper • 2506.23219 • Published Jun 29 • 7

Large Language Model (LLM) and NLP related papers.

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19, 2024 • 7
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20, 2024 • 23
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20, 2024 • 14
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 69

FlashWorld: High-quality 3D Scene Generation within Seconds

Paper • 2510.13678 • Published Oct 15 • 71
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks

Paper • 2510.15019 • Published Oct 16 • 63
GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction

Paper • 2509.18090 • Published Sep 22 • 4
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation

Paper • 2509.19296 • Published Sep 23 • 23

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17 • 57
π^3: Scalable Permutation-Equivariant Visual Geometry Learning

Paper • 2507.13347 • Published Jul 17 • 64
MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second

Paper • 2507.10065 • Published Jul 14 • 24
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering

Paper • 2507.08776 • Published Jul 11 • 54

Diffusion Classifiers Understand Compositionality, but Conditions Apply

Paper • 2505.17955 • Published May 23 • 22
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Paper • 2506.14429 • Published Jun 17 • 44
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Paper • 2507.02813 • Published Jul 3 • 60

about 4 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 523 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

Fill outpaint image extender

Runtime error

4

Flux Fill Outpainting

👈

4

Extend images using AI to change size and alignment
FlexPainter: Flexible and Multi-View Consistent Texture Generation

Paper • 2506.02620 • Published Jun 3 • 14
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Paper • 2507.02813 • Published Jul 3 • 60
UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding

Paper • 2506.23219 • Published Jun 29 • 7

AI Math: Diffusion

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 65
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Paper • 2408.12590 • Published Aug 22, 2024 • 36
Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22, 2024 • 17
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 63

Large Language Model (LLM) and NLP related papers.

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19, 2024 • 7
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20, 2024 • 23
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20, 2024 • 14
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 69

TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion

Paper • 2401.09416 • Published Jan 17, 2024 • 11
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

Paper • 2401.10171 • Published Jan 18, 2024 • 14
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

Paper • 2311.09217 • Published Nov 15, 2023 • 22
GALA: Generating Animatable Layered Assets from a Single Scan

Paper • 2401.12979 • Published Jan 23, 2024 • 9

FlashWorld: High-quality 3D Scene Generation within Seconds

Paper • 2510.13678 • Published Oct 15 • 71
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks

Paper • 2510.15019 • Published Oct 16 • 63
GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction

Paper • 2509.18090 • Published Sep 22 • 4
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation

Paper • 2509.19296 • Published Sep 23 • 23

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs