InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14, 2025 • 306
view article Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community +1 Apr 15, 2024 • 191
Bytes Are All You Need: Transformers Operating Directly On File Bytes Paper • 2306.00238 • Published May 31, 2023 • 6