Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published Feb 11 • 193
📋 Twinkle Eval Logs Collection Benchmark logs generated with Twinkle Eval, recording the model's outputs for each prompt. See more at https://github.com/ai-twinkle/Eval • 22 items • Updated 3 days ago • 1
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 Text Generation • 124B • Updated 4 days ago • 163k • 308
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 15 items • Updated 4 days ago • 245
Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries Article • 19 days ago • 79
🤏 Smol-Data Collection Tried and tested mixes for strong pretraining. Inspired by https://huggingface.co/blog/codelion/optimal-dataset-mixing • 14 items • Updated 26 days ago • 12
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 Explore synthetic data experiments as an interactive bookshelf • Running on CPU Upgrade • 208
GGML and llama.cpp join HF to ensure the long-term progress of Local AI Article • Feb 20 • 490
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark Paper • 2409.02813 • Published Sep 4, 2024 • 33
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI Paper • 2404.16006 • Published Apr 24, 2024 • 2