Upload README.md with huggingface_hub
README.md (CHANGED)

@@ -1,38 +1,164 @@
Previous version (removed):

---
language:
- en
datasets:
- liuhaotian/LLaVA-Instruct-150K
pipeline_tag: image-text-to-text
arxiv: 2304.08485
license: llama2
tags:
- vision
- image-text-to-text
---

# LLaVA Model Card

## Model details

LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
It is an auto-regressive language model, based on the transformer architecture.

https://llava-vl.github.io/
New version (added):

---
library_name: llima
license: llama2
tags:
- vision
- image-text-to-text
- generative_ai
- embedded
- sima
pipeline_tag: image-text-to-text
base_model: llava-hf/llava-1.5-7b-hf
---

# LLaVA-1.5-7b-hf: Optimized for SiMa.ai Modalix

## Overview

This repository contains the **llava-1.5-7b-hf** model, optimized and compiled for the **SiMa.ai Modalix** platform.

- **Model Architecture:** LLaVA 1.5 (7B parameters)
- **Quantization:** Hybrid
  - **Prompt Processing:** A16W8 (16-bit activations, 8-bit weights)
  - **Token Generation:** A16W4 (16-bit activations, 4-bit weights)
- **Maximum Context Length:** 2048 tokens
- **Input Resolution:** 336x336
- **Source Model:** [llava-hf/llava-1.5-7b-hf](https://huggingface.co/llava-hf/llava-1.5-7b-hf)

## Performance

The following performance metrics were measured with an image input and a 50-token text prompt.

| Model | Precision | Device | Response Rate (tokens/sec) | Time To First Token (sec) |
|---|---|---|---|---|
| llava-1.5-7b-hf | A16W8/A16W4 | Modalix | 10.2 | 1.43 |

## Prerequisites

To run this model, you need:

1. **SiMa.ai Modalix Device**
2. **SiMa.ai CLI**: [Installed](https://docs.sima.ai/pages/sima_cli/main.html#installation) on your Modalix device.
3. **Hugging Face CLI**: For downloading the model.

## Installation & Deployment

Follow these steps to deploy the model to your Modalix device.

### 1. Install the LLiMa Demo Application

> **Note:** This is a **one-time setup**. If you have already installed the LLiMa demo application (e.g. for another model), you can skip this step and continue with the model download.

On your Modalix device, install the LLiMa demo application using `sima-cli`:

```bash
# Create a directory for LLiMa
cd /media/nvme
mkdir llima
cd llima

# Install the LLiMa runtime code
sima-cli install -v 2.0.0 samples/llima -t select
```

> **Note:** To only download the LLiMa runtime code, select **🚫 Skip** when prompted.

### 2. Download the Model

Download the compiled model assets from this repository directly to your device:

```bash
# Download the model to a local directory
cd /media/nvme/llima
hf download simaai/llava-1.5-7b-hf-a16w4 --local-dir llava-1.5-7b-hf-a16w4
```

Alternatively, you can download the compiled model to a host machine and copy it to the Modalix device:

```bash
hf download simaai/llava-1.5-7b-hf-a16w4 --local-dir llava-1.5-7b-hf-a16w4
scp -r llava-1.5-7b-hf-a16w4 sima@<modalix-ip>:/media/nvme/llima/
```

*Replace \<modalix-ip\> with the IP address of your Modalix device.*
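If you prefer to script the download instead of using the `hf` CLI, here is a minimal Python sketch using the `huggingface_hub` library (an assumption, not part of the official instructions: it requires `pip install huggingface_hub` wherever you run it, and writes to the same relative directory as the CLI commands above):

```python
# Hedged sketch: fetch the compiled model assets via the huggingface_hub Python API
# instead of the `hf download` CLI shown above.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="simaai/llava-1.5-7b-hf-a16w4",   # this model repository
    local_dir="llava-1.5-7b-hf-a16w4",        # same target directory as the CLI example
)
```

Either way, the model directory should end up directly under `/media/nvme/llima/`.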
**Expected Directory Structure:**

```text
/media/nvme/llima/
├── simaai-genai-demo/        # The demo app
└── llava-1.5-7b-hf-a16w4/    # Your downloaded model
```

## Usage

### Run the Application

Navigate to the demo directory and start the application:

```bash
cd /media/nvme/llima/simaai-genai-demo
./run.sh
```

The script will detect the installed model(s) and prompt you to select one.

Once the application is running, open a browser and navigate to:

```text
https://<modalix-ip>:5000/
```

*Replace \<modalix-ip\> with the IP address of your Modalix device.*

### API Usage

To use the OpenAI-compatible API, run the model in API mode:

```bash
cd /media/nvme/llima/simaai-genai-demo
./run.sh --httponly --api-only
```

You can interact with it using `curl` or Python; both are shown below.

**Example: Chat Completion**

```bash
# Note: Replace <YOUR_BASE64_STRING_HERE> with an actual base64-encoded image string.
curl -N -k -X POST "https://<modalix-ip>:5000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/jpeg;base64,<YOUR_BASE64_STRING_HERE>"
            }
          },
          {
            "type": "text",
            "text": "Describe this image"
          }
        ]
      }
    ],
    "stream": true
  }'
```

*Replace \<modalix-ip\> with the IP address of your Modalix device.*
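The same request from Python, as a minimal sketch rather than an official client: it assumes the `requests` package is installed, that an `example.jpg` exists in the working directory, and that the device presents a self-signed certificate (so verification is disabled, mirroring `curl -k`).

```python
# Hedged sketch: call the OpenAI-compatible chat completions endpoint from Python.
import base64
import requests

MODALIX_IP = "<modalix-ip>"  # replace with the IP address of your Modalix device

# Encode a local image as a base64 data URL, as in the curl example above.
with open("example.jpg", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{img_b64}"},
                },
                {"type": "text", "text": "Describe this image"},
            ],
        }
    ],
    "stream": True,
}

# Stream the response and print each raw line as it arrives.
with requests.post(
    f"https://{MODALIX_IP}:5000/v1/chat/completions",
    json=payload,
    stream=True,
    verify=False,  # equivalent of curl's -k for a self-signed certificate
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if line:
            print(line.decode("utf-8"))
```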
## Limitations

- **Quantization**: This model is quantized (A16W8/A16W4) for optimal performance on embedded devices. While this maintains high accuracy, minor deviations from the full-precision model may occur.

## Troubleshooting

- **`sima-cli` not found**: Ensure that `sima-cli` is installed on your Modalix device.
- **Model can't be run**: Verify that the model directory sits directly inside `/media/nvme/llima/` and is not nested (e.g., `/media/nvme/llima/llava-1.5-7b-hf-a16w4/llava-1.5-7b-hf-a16w4`).
- **Permission denied**: Ensure you have read/write permissions for the `/media/nvme` directory.

## Resources

- [SiMa.ai Documentation](https://docs.sima.ai)
- [SiMa.ai Hugging Face Organization](https://huggingface.co/simaai)
- [LLaVA Website](https://llava-vl.github.io/)