AzalKhan commited on
Commit
a611ce3
·
verified ·
1 Parent(s): c666385

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +25 -0
README.md ADDED
@@ -0,0 +1,25 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ base_model: Qwen/Qwen2.5-1.5B-Instruct
4
+ datasets:
5
+ - open-r1/DAPO-Math-17k-Processed
6
+ library_name: transformers
7
+ tags:
8
+ - grpo
9
+ - reinforcement-learning
10
+ - reasoning
11
+ - qwen
12
+ ---
13
+
14
+ # Qwen2.5-1.5B-Instruct_BF16_open-r1-DAPO-Math-17k-Processed_1176_FlashRL_G4-L2048_new
15
+
16
+ This repository contains a checkpoint trained with GRPO on `open-r1/DAPO-Math-17k-Processed` starting from `Qwen/Qwen2.5-1.5B-Instruct`.\
17
+ This snapshot corresponds to training step `1176`.
18
+
19
+ Contents include:
20
+ - Model weights (`.safetensors`)
21
+ - Config files (`config.json`, `generation_config.json`)
22
+ - Tokenizer files (`tokenizer.json`, `tokenizer_config.json`, `vocab.json`, `merges.txt`, `special_tokens_map.json`, `added_tokens.json`)
23
+ - Optional chat template (`chat_template.jinja`)
24
+
25
+ Training artifacts (optimizer/scheduler states and RNG) have been intentionally excluded.