casinca committed
Commit
6a676cd
·
verified ·
1 Parent(s): b93f873

Update README.md

Files changed (1):
  1. README.md +2 -2
README.md CHANGED
@@ -4,7 +4,7 @@ library_name: transformers
 
 # tiny-mimo-v2-flash
 
-A ~1.85B-parameter tiny random-weight checkpoint of [XiaomiMiMo/MiMo-V2-Flash](https://huggingface.co/XiaomiMiMo/MiMo-V2-Flash), used for internal testing in Hugging Face `transformers` for the native HF implementation.
+A ~2.34B-parameter tiny random-weight checkpoint of [XiaomiMiMo/MiMo-V2-Flash](https://huggingface.co/XiaomiMiMo/MiMo-V2-Flash), used for internal testing in Hugging Face `transformers` for the native HF implementation.
 
 ## Configuration
 
@@ -21,5 +21,5 @@ A ~1.85B-parameter tiny random-weight checkpoint of [XiaomiMiMo/MiMo-V2-Flash](h
 | `num_attention_heads` / `num_key_value_heads` | 16 / 1 | 64 / 4 (**ratio 4.0**) |
 | `head_dim` / `v_head_dim` | 192 / 128 | 192 / 128 |
 | `n_routed_experts` / `num_experts_per_tok` | 64 / 2 | 256 / 8 (**ratio 4.0**) |
-| parameters | 1.85B | 300B |
+| parameters | 2.34B | 300B |
 
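
The README above describes a shrunken random-weight checkpoint used for internal testing in `transformers`. A minimal sketch of how such a tiny checkpoint is typically produced — shrink the architecture's config, then instantiate the model class with random weights. The config values and the Llama classes below are illustrative stand-ins only, not the actual MiMo-V2-Flash model class or the settings of this repo:

```python
# Sketch: build a tiny random-weight model for CI-style testing.
# LlamaConfig/LlamaForCausalLM and all numbers here are placeholders,
# NOT the MiMo-V2-Flash architecture or this checkpoint's real config.
from transformers import LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    hidden_size=64,          # shrunk from the full model's width
    intermediate_size=128,
    num_hidden_layers=2,     # only a couple of layers for speed
    num_attention_heads=4,
    num_key_value_heads=1,   # keep the GQA head-sharing pattern
    vocab_size=512,
)

# Instantiating from a config (rather than from_pretrained) gives
# randomly initialized weights — exactly what a test checkpoint needs.
model = LlamaForCausalLM(config)
n_params = sum(p.numel() for p in model.parameters())
print(f"tiny model parameters: {n_params:,}")
```

The resulting model can then be pushed to the Hub (e.g. with `model.push_to_hub(...)`) so test suites download a few MB instead of hundreds of GB.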