VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated 7 days ago • 160
Article Asynchronous Robot Inference: Decoupling Action Prediction and Execution +5 Jul 10 • 45
Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints +2 May 1, 2024 • 80
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9 • 723
💬Urdu ASR Models Collection Collection of fine-tuned Urdu speech recognition models. • 9 items • Updated Jul 14 • 2
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13 • 173
Article LeMaterial: an open source initiative to accelerate materials discovery and research +8 Dec 10, 2024 • 54
D-FINE Collection State-of-the-art real-time object detection model with Apache 2.0 licence • 15 items • Updated May 5 • 56
Llama 4 Collection Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 9 days ago • 53
MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 8 items • Updated Mar 21 • 22
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated Jul 21 • 160