54 430

sakuma

sakumaXIII

AI & ML interests

None yet

Recent Activity

liked a model about 7 hours ago

mistralai/Devstral-Small-2-24B-Instruct-2512

liked a model about 7 hours ago

mistralai/Devstral-2-123B-Instruct-2512

liked a model about 7 hours ago

unsloth/Devstral-2-123B-Instruct-2512-GGUF

Organizations

None yet

upvoted an article 2 days ago

We Got Claude to Fine-Tune an Open Source LLM

Article • 7 days ago • 433

upvoted 4 collections 8 days ago

Ministral 3 - Additional Checkpoints

Different formats and Quantized versions of our Ministral 3 family; 14B/8B/3B Instruct/Reasoning GGUF, 3B Instruct ONNX and 14B/8B/3B Instruct BF16. • 13 items • Updated 8 days ago • 13

Ministral 3

A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated 8 days ago • 120

Mistral Large 3

A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated 8 days ago • 73

Trinity

Collection of Arcee AI models in the Trinity family • 6 items • Updated 9 days ago • 16

upvoted an article 22 days ago

Easily Build and Share ROCm Kernels with Hugging Face

Article • 24 days ago • 35

upvoted an article 25 days ago

We’re open-sourcing our text-to-image model and the process behind it

Article • 29 days ago • 74

upvoted a paper 27 days ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9 • 129

upvoted a collection 28 days ago

RLVE

Models for "RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments" - https://arxiv.org/abs/2511.07317 • 3 items • Updated 29 days ago • 5

upvoted a paper 29 days ago

Gemini Embedding: Generalizable Embeddings from Gemini

Paper • 2503.07891 • Published Mar 10 • 45

upvoted a collection 29 days ago

C2S-Scale-Gemma-Models

C2S-Scale Gemma models trained using the Cell2Sentence framework, described in the C2S-Scale paper. • 2 items • Updated Oct 13 • 12

upvoted 4 articles about 1 month ago

Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face

Article • Feb 11 • 89

Exploring Synthetic Data Generation with DataDreamer

Article • Jan 21 • 9

Synthetic dataset generation techniques: Self-Instruct

Article • May 15, 2024 • 21

Synthetic data: save money, time and carbon with open source

Article • Feb 16, 2024 • 83

upvoted 5 collections about 1 month ago

Dream-Coder 7B

https://hkunlp.github.io/blog/2025/dream-coder • 2 items • Updated Jul 15 • 6

Dream 7B

https://hkunlp.github.io/blog/2025/dream/ • 2 items • Updated Jul 16 • 5

Qwen3-Next

4 items • Updated Sep 22 • 163

LLaDA 2.0

4 items • Updated about 12 hours ago • 21

Granite 4.0 Language Models

13 items • Updated 23 days ago • 195