Collections including paper arxiv:2510.12747

FlashVSR: Towards Real-Time Diffusion-Based Streaming Video Super-Resolution

Paper • 2510.12747 • Published Oct 14 • 37
DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution

Paper • 2505.16239 • Published May 22

Read Later Stack

Demystifying Reinforcement Learning in Agentic Reasoning

Paper • 2510.11701 • Published Oct 13 • 31
Self-Improving LLM Agents at Test-Time

Paper • 2510.07841 • Published Oct 9 • 9
Making Mathematical Reasoning Adaptive

Paper • 2510.04617 • Published Oct 6 • 22
DocReward: A Document Reward Model for Structuring and Stylizing

Paper • 2510.11391 • Published Oct 13 • 27

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

Towards Real-Time Diffusion-Based Streaming Video Super-Resolution

JunhaoZhuang/FlashVSR

Video-to-Video • Updated Nov 5 • 158
FlashVSR: Towards Real-Time Diffusion-Based Streaming Video Super-Resolution

Paper • 2510.12747 • Published Oct 14 • 37
JunhaoZhuang/FlashVSR-v1.1

Video-to-Video • Updated Nov 5 • 72

Image-Video General Tasks

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Paper • 2501.04001 • Published Jan 7 • 47
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published Jan 7 • 52
An Empirical Study of Autoregressive Pre-training from Videos

Paper • 2501.05453 • Published Jan 9 • 41
MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training

Paper • 2501.07556 • Published Jan 13 • 7
