VST Collection A comprehensive framework designed to cultivate VLMs with human-like visuospatial abilities. • 5 items • Updated Nov 12, 2025 • 6
MolmoAct Data Mixture Collection All datasets for the MolmoAct (Multimodal Open Language Model for Action) release. • 4 items • Updated 26 days ago • 18
MolmoAct Collection All models for the MolmoAct (Multimodal Open Language Model for Action) release. • 10 items • Updated 26 days ago • 33
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 Mar 12, 2025 • 482
Cohere Labs Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated Jul 31, 2025 • 70
Cosmos Collection ⚠️ This collection is archived. https://huggingface.co/collections/nvidia/nvidia-cosmos-2 • 31 items • Updated 2 days ago • 299
Llama 3.2 Collection This collection hosts the transformers and original repos of Llama 3.2 and Llama Guard 3. • 15 items • Updated Dec 6, 2024 • 649
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 26 days ago • 309
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions Paper • 2309.10150 • Published Sep 18, 2023 • 25
Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping Paper • 2309.07970 • Published Sep 14, 2023 • 8
Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control Paper • 2307.00117 • Published Jun 30, 2023 • 6