-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 58 -
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 52 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 45 -
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 63
Collections
Collections including paper arxiv:2602.12099
-
openai/gpt-oss-120b
Text Generation • 120B • Updated • 3.48M • 4.51k -
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning
Paper • 2512.20605 • Published • 62 -
Nested Browser-Use Learning for Agentic Information Seeking
Paper • 2512.23647 • Published • 19 -
TimeBill: Time-Budgeted Inference for Large Language Models
Paper • 2512.21859 • Published • 25
-
ADS-Edit: A Multimodal Knowledge Editing Dataset for Autonomous Driving Systems
Paper • 2503.20756 • Published • 7 -
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset
Paper • 2505.09568 • Published • 99 -
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Paper • 2508.18265 • Published • 214 -
Qwen3-Omni Technical Report
Paper • 2509.17765 • Published • 149
-
Foundation Models in Robotics: Applications, Challenges, and the Future
Paper • 2312.07843 • Published • 16 -
Neural Fields in Robotics: A Survey
Paper • 2410.20220 • Published • 5 -
Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Dataset
Paper • 2410.22325 • Published • 10 -
Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning
Paper • 2410.21845 • Published • 16
-
Beyond Imitation: Reinforcement Learning for Active Latent Planning
Paper • 2601.21598 • Published • 9 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 40 -
Self-Hinting Language Models Enhance Reinforcement Learning
Paper • 2602.03143 • Published • 29 -
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning
Paper • 2602.12099 • Published • 55
-
Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning
Paper • 2506.06205 • Published • 30 -
BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation
Paper • 2506.07530 • Published • 20 -
Ark: An Open-source Python-based Framework for Robot Learning
Paper • 2506.21628 • Published • 16 -
RoboBrain 2.0 Technical Report
Paper • 2507.02029 • Published • 35
-
OpenVLA: An Open-Source Vision-Language-Action Model
Paper • 2406.09246 • Published • 42 -
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Paper • 2411.19650 • Published -
Octo: An Open-Source Generalist Robot Policy
Paper • 2405.12213 • Published • 29 -
Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression
Paper • 2412.03293 • Published
-
Woodpecker: Hallucination Correction for Multimodal Large Language Models
Paper • 2310.16045 • Published • 17 -
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Paper • 2310.14566 • Published • 27 -
SILC: Improving Vision Language Pretraining with Self-Distillation
Paper • 2310.13355 • Published • 9 -
Conditional Diffusion Distillation
Paper • 2310.01407 • Published • 20