Collections

Discover the best community collections!

Collections including paper arxiv:2512.20578

WTF GENIUS PAPERS

Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are about improving tiny models.

Diffusion Language Models Know the Answer Before Decoding

Paper • 2508.19982 • Published Aug 27, 2025 • 27
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

Paper • 2512.13586 • Published Dec 15, 2025 • 93
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following

Paper • 2601.06431 • Published Jan 10 • 12
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published Jan 14 • 63

[papers] Gameplay Optimization

Research papers that may contribute to a broader approach to teaching machines how to play complex strategy games beyond just Chess.

OptiMind: Teaching LLMs to Think Like Optimization Experts

Paper • 2509.22979 • Published Sep 26, 2025 • 4
LFM2 Technical Report

Paper • 2511.23404 • Published Nov 28, 2025 • 53
Zero-Overhead Introspection for Adaptive Test-Time Compute

Paper • 2512.01457 • Published Dec 1, 2025 • 2
Confidence Estimation for LLMs in Multi-turn Interactions

Paper • 2601.02179 • Published Jan 5 • 17

Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation

Paper • 2512.24271 • Published Dec 30, 2025 • 63
Nested Learning: The Illusion of Deep Learning Architectures

Paper • 2512.24695 • Published Dec 31, 2025 • 44
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits

Paper • 2512.20578 • Published Dec 23, 2025 • 86
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling

Paper • 2512.23959 • Published Dec 30, 2025 • 112

Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits

Paper • 2512.20578 • Published Dec 23, 2025 • 86
The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

Paper • 2601.03425 • Published Jan 6 • 16

ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning

Paper • 2512.02835 • Published Dec 2, 2025 • 10
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image

Paper • 2512.05044 • Published Dec 4, 2025 • 17
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning

Paper • 2512.05591 • Published Dec 5, 2025 • 17
SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling

Paper • 2512.05343 • Published Dec 5, 2025 • 25

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 16 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 24
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 85
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 152
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

facebook/locate-3d

Updated Apr 17, 2025 • 204 • 11
facebook/locate-3d-plus

Updated Apr 17, 2025 • 37 • 8
facebook/3d-jepa

Updated Apr 17, 2025 • 34 • 6
Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

Paper • 2512.22238 • Published Dec 23, 2025 • 29

Evals and hallucinations

Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits

Paper • 2512.20578 • Published Dec 23, 2025 • 86

Read But Not Implemented

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Paper • 2512.16093 • Published Dec 18, 2025 • 95
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 238
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Paper • 2512.16676 • Published Dec 18, 2025 • 219
Sharp Monocular View Synthesis in Less Than a Second

Paper • 2512.10685 • Published Dec 11, 2025 • 28

Small Language Models are the Future of Agentic AI

Paper • 2506.02153 • Published Jun 2, 2025 • 24
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published Dec 2, 2025 • 256
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

Paper • 2511.22570 • Published Nov 27, 2025 • 90
DeepSeek-OCR: Contexts Optical Compression

Paper • 2510.18234 • Published Oct 21, 2025 • 93
