florianvoss committed
Commit fb566c2 · verified · 1 Parent(s): 59e3de8

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +148 -22
README.md CHANGED
@@ -1,38 +1,164 @@
  ---
- language:
- - en
- datasets:
- - liuhaotian/LLaVA-Instruct-150K
- pipeline_tag: image-text-to-text
- arxiv: 2304.08485
  license: llama2
  tags:
  - vision
  - image-text-to-text
  ---
- # LLaVA Model Card
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62441d1d9fdefb55a0b7d12c/FPshq08TKYD0e-qwPLDVO.png)
-
- Below is the model card of Llava model 7b, which is copied from the original Llava model card that you can find [here](https://huggingface.co/liuhaotian/llava-v1.5-13b).
-
- Check out also the Google Colab demo to run Llava on a free-tier Google Colab instance: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1qsl6cd2c8gGtEW1xV5io7S8NHh-Cp1TV?usp=sharing)
-
- Or check out our Spaces demo! [![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-md-dark.svg)](https://huggingface.co/spaces/llava-hf/llava-4bit)
-
-
- ## Model details
-
- **Model type:**
- LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
- It is an auto-regressive language model, based on the transformer architecture.
-
- **Model date:**
- LLaVA-v1.5-7B was trained in September 2023.
-
- **Paper or resources for more information:**
- https://llava-vl.github.io/
-
- ## License
- Llama 2 is licensed under the LLAMA 2 Community License,
- Copyright (c) Meta Platforms, Inc. All Rights Reserved.

  ---
+ library_name: llima
  license: llama2
  tags:
  - vision
  - image-text-to-text
+ - generative_ai
+ - embedded
+ - sima
+ pipeline_tag: image-text-to-text
+ base_model: llava-hf/llava-1.5-7b-hf
  ---
+
+ # LLaVA-1.5-7b-hf: Optimized for SiMa.ai Modalix
+
+ ## Overview
+
+ This repository contains the **llava-1.5-7b-hf** model, optimized and compiled for the **SiMa.ai Modalix** platform.
+
+ - **Model Architecture:** LLaVA 1.5 (7B parameters)
+ - **Quantization:** Hybrid
+ - **Prompt Processing:** A16W8 (16-bit activations, 8-bit weights)
+ - **Token Generation:** A16W4 (16-bit activations, 4-bit weights)
+ - **Maximum Context Length:** 2048
+ - **Input Resolution:** 336x336
+ - **Source Model:** [llava-hf/llava-1.5-7b-hf](https://huggingface.co/llava-hf/llava-1.5-7b-hf)
+
+ ## Performance
+
+ The following performance metrics were measured with an image input and a 50-token text prompt.
+
+ | Model | Precision | Device | Response Rate (tokens/sec) | Time to First Token (sec) |
+ |---|---|---|---|---|
+ | llava-1.5-7b-hf | A16W8/A16W4 | Modalix | 10.2 | 1.43 |
+
+ ## Prerequisites
+
+ To run this model, you need:
+
+ 1. **SiMa.ai Modalix Device**
+ 2. **SiMa.ai CLI**: [Installed](https://docs.sima.ai/pages/sima_cli/main.html#installation) on your Modalix device.
+ 3. **Hugging Face CLI**: For downloading the model.
+
+ ## Installation & Deployment
+
+ Follow these steps to deploy the model to your Modalix device.
+
+ ### 1. Install LLiMa Demo Application
+
+ > **Note:** This is a **one-time setup**. If you have already installed the LLiMa demo application (e.g., for another model), you can skip this step and continue with the model download.
+
+ On your Modalix device, install the LLiMa demo application using `sima-cli`:
+
+ ```bash
+ # Create a directory for LLiMa
+ cd /media/nvme
+ mkdir llima
+ cd llima
+
+ # Install the LLiMa runtime code
+ sima-cli install -v 2.0.0 samples/llima -t select
+ ```
+
+ > **Note:** To download only the LLiMa runtime code, select **🚫 Skip** when prompted.
+
+ ### 2. Download the Model
+
+ Download the compiled model assets from this repository directly to your device:
+
+ ```bash
+ # Download the model to a local directory
+ cd /media/nvme/llima
+ hf download simaai/llava-1.5-7b-hf-a16w4 --local-dir llava-1.5-7b-hf-a16w4
+ ```
+
+ Alternatively, you can download the compiled model to a host machine and copy it to the Modalix device:
+
+ ```bash
+ hf download simaai/llava-1.5-7b-hf-a16w4 --local-dir llava-1.5-7b-hf-a16w4
+ scp -r llava-1.5-7b-hf-a16w4 sima@<modalix-ip>:/media/nvme/llima/
+ ```
+ *Replace \<modalix-ip\> with the IP address of your Modalix device.*
+
+ **Expected Directory Structure:**
+
+ ```text
+ /media/nvme/llima/
+ ├── simaai-genai-demo/       # The demo app
+ └── llava-1.5-7b-hf-a16w4/   # Your downloaded model
+ ```
+
+ ## Usage
+
+ ### Run the Application
+
+ Navigate to the demo directory and start the application:
+
+ ```bash
+ cd /media/nvme/llima/simaai-genai-demo
+ ./run.sh
+ ```
+
+ The script will detect the installed model(s) and prompt you to select one.
+
+ Once the application is running, open a browser and navigate to:
+
+ ```text
+ https://<modalix-ip>:5000/
+ ```
+ *Replace \<modalix-ip\> with the IP address of your Modalix device.*
+
+ ### API Usage
+
+ To use the OpenAI-compatible API, run the model in API mode:
+
+ ```bash
+ cd /media/nvme/llima/simaai-genai-demo
+ ./run.sh --httponly --api-only
+ ```
+
+ You can interact with it using `curl` or Python.
+
+ **Example: Chat Completion**
+
+ ```bash
+ # Note: Replace <YOUR_BASE64_STRING_HERE> with an actual base64-encoded image string.
+ curl -N -k -X POST "https://<modalix-ip>:5000/v1/chat/completions" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "messages": [
+       {
+         "role": "user",
+         "content": [
+           {
+             "type": "image_url",
+             "image_url": {
+               "url": "data:image/jpeg;base64,<YOUR_BASE64_STRING_HERE>"
+             }
+           },
+           {
+             "type": "text",
+             "text": "Describe this image"
+           }
+         ]
+       }
+     ],
+     "stream": true
+   }'
+ ```
+ *Replace \<modalix-ip\> with the IP address of your Modalix device.*
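+
+ **Example: Chat Completion from Python**
+
+ The same request can be sent from Python. The snippet below is a minimal sketch using the `requests` library; `sample.jpg` is only a placeholder image path, and `verify=False` mirrors the `-k` flag passed to `curl` above for the self-signed certificate.
+
+ ```python
+ import base64
+ import requests
+
+ MODALIX_IP = "<modalix-ip>"  # replace with the IP address of your Modalix device
+
+ # Base64-encode a local image for the image_url payload
+ with open("sample.jpg", "rb") as f:
+     image_b64 = base64.b64encode(f.read()).decode("utf-8")
+
+ payload = {
+     "messages": [
+         {
+             "role": "user",
+             "content": [
+                 {
+                     "type": "image_url",
+                     "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
+                 },
+                 {"type": "text", "text": "Describe this image"},
+             ],
+         }
+     ],
+     "stream": True,
+ }
+
+ # stream=True reads the response incrementally; verify=False skips TLS verification
+ response = requests.post(
+     f"https://{MODALIX_IP}:5000/v1/chat/completions",
+     json=payload,
+     verify=False,
+     stream=True,
+ )
+ for line in response.iter_lines():
+     if line:
+         print(line.decode("utf-8"))
+ ```
+
+ Each printed line is a raw chunk of the streamed response, in the same format the `curl` example receives.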
+
+ ## Limitations
+
+ - **Quantization**: This model is quantized (A16W4/A16W8) for optimal performance on embedded devices. While this maintains high accuracy, minor deviations from the full-precision model may occur.
+
+ ## Troubleshooting
+
+ - **`sima-cli` not found**: Ensure that `sima-cli` is installed on your Modalix device.
+ - **Model cannot be run**: Verify that the model directory sits directly inside `/media/nvme/llima/` and is not nested (e.g., `/media/nvme/llima/llava-1.5-7b-hf-a16w4/llava-1.5-7b-hf-a16w4`).
+ - **Permission Denied**: Ensure you have read/write permissions for the `/media/nvme` directory.
+
+ ## Resources
+
+ - [SiMa.ai Documentation](https://docs.sima.ai)
+ - [SiMa.ai Hugging Face Organization](https://huggingface.co/simaai)
+ - [LLaVA Website](https://llava-vl.github.io/)