Update README.md
README.md CHANGED
@@ -51,7 +51,7 @@ For details on training and evaluation, read [our paper](https://arxiv.org/abs/2
 
 
 | Model | Size | Alignment | GSM8k 8-shot CoT Acc. | AlpacaEval 2 Winrate (LC) |
-
+|-|-|-|-|-|
 | **Tulu V2.5 PPO Llama 3 8B (this model)** | 8B | PPO with 8B RM | 61.5 | 22.7 |
 | **Tulu V2.5 PPO 13B** | 13B | PPO with 70B RM | 58.0 | **26.7** |
 | **Tulu V2 DPO 13B** | 13B | DPO | 50.5 | 16.0 |
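Note on the change: the hunk replaces the blank line under the table header with `|-|-|-|-|-|`. GitHub-Flavored Markdown only treats pipe-separated rows as a table when the header is followed by a delimiter row with the same number of cells, so without that line the rows would likely render as plain text. A minimal sketch of a check for this, assuming simple pipe tables with no escaped `|` inside cells; `split_cells` and `has_delimiter_row` are hypothetical helpers for illustration, not part of the README or any library:

```python
import re

# Hypothetical helper pattern: one delimiter-row cell such as "-", "---", ":--", ":-:".
_DELIM_CELL = re.compile(r"^:?-+:?$")


def split_cells(row: str) -> list[str]:
    """Split a pipe-table row into trimmed cells, ignoring the outer pipes."""
    return [cell.strip() for cell in row.strip().strip("|").split("|")]


def has_delimiter_row(table_lines: list[str]) -> bool:
    """True if line 2 is a delimiter row with as many cells as the header."""
    if len(table_lines) < 2:
        return False
    header = split_cells(table_lines[0])
    delimiter = split_cells(table_lines[1])
    return len(header) == len(delimiter) and all(
        _DELIM_CELL.match(cell) for cell in delimiter
    )


header = "| Model | Size | Alignment | GSM8k 8-shot CoT Acc. | AlpacaEval 2 Winrate (LC) |"
print(has_delimiter_row([header, ""]))             # before the change: False
print(has_delimiter_row([header, "|-|-|-|-|-|"]))  # after the change: True
```

The check only looks at the first two lines of a table, which is enough to tell the old and new versions of this hunk apart.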