Update README.md
README.md
<!-- Provide a quick summary of what the model is/does. -->

Llama-3.2V-11B-cot is an early version of [LLaVA-o1](https://github.com/PKU-YuanGroup/LLaVA-o1), a visual language model capable of spontaneous, systematic reasoning.

## Model Details

- **License:** apache-2.0
- **Finetuned from model:** meta-llama/Llama-3.2-11B-Vision-Instruct
## Benchmark Results

| MMStar | MMBench | MMVet | MathVista | AI2D | Hallusion | Average |
|--------|---------|-------|-----------|------|-----------|---------|
| 57.6   | 75.0    | 60.3  | 54.8      | 85.7 | 47.8      | 63.5    |
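As a quick sanity check on the table above — assuming the Average column is a simple unweighted mean of the six benchmark scores, which the card does not state explicitly — the reported value can be reproduced as follows:

```python
# Sanity check (assumption): "Average" is the unweighted mean of the six scores.
scores = {
    "MMStar": 57.6,
    "MMBench": 75.0,
    "MMVet": 60.3,
    "MathVista": 54.8,
    "AI2D": 85.7,
    "Hallusion": 47.8,
}

# Mean over all six benchmarks, rounded to one decimal place as in the table.
average = round(sum(scores.values()) / len(scores), 1)
print(average)  # 63.5
```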
## Reproduction

<!-- This section describes the evaluation protocols and provides the results. -->