Update README.md
README.md
CHANGED
@@ -206,19 +206,6 @@ curl http://localhost:8000/v1/chat/completions \
 
 ---
 
-## Benchmarks
-
-This README currently reflects results for:
-
-* **MMLU** (+ 8 lm-eval tasks, 62.05% avg)
-* **MBPP** & **MBPP+** (EvalPlus)
-* **HumanEval** & **HumanEval+** (EvalPlus)
-* **LiveCodeBench**
-* **GSM8K**
-* **MATH-500**
-
-Evaluation status: **75% complete (6/8 benchmarks)** — **WildBench** and **SWE-Bench** will be added here once finalized.
-
 ---
 
 ## License