florianvoss committed
Commit fb566c2 · verified · 1 Parent(s): 59e3de8

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +148 -22
README.md CHANGED
@@ -1,38 +1,164 @@
  ---
- language:
- - en
- datasets:
- - liuhaotian/LLaVA-Instruct-150K
- pipeline_tag: image-text-to-text
- arxiv: 2304.08485
  license: llama2
  tags:
  - vision
  - image-text-to-text
  ---
- # LLaVA Model Card
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62441d1d9fdefb55a0b7d12c/FPshq08TKYD0e-qwPLDVO.png)
-
- Below is the model card of Llava model 7b, which is copied from the original Llava model card that you can find [here](https://huggingface.co/liuhaotian/llava-v1.5-13b).
-
- Check out also the Google Colab demo to run Llava on a free-tier Google Colab instance: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1qsl6cd2c8gGtEW1xV5io7S8NHh-Cp1TV?usp=sharing)
-
- Or check out our Spaces demo! [![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-md-dark.svg)](https://huggingface.co/spaces/llava-hf/llava-4bit)
-
-
- ## Model details
-
- **Model type:**
- LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
- It is an auto-regressive language model, based on the transformer architecture.
-
- **Model date:**
- LLaVA-v1.5-7B was trained in September 2023.
-
- **Paper or resources for more information:**
- https://llava-vl.github.io/
-
- ## License
- Llama 2 is licensed under the LLAMA 2 Community License,
- Copyright (c) Meta Platforms, Inc. All Rights Reserved.

  ---
+ library_name: llima
  license: llama2
  tags:
  - vision
  - image-text-to-text
+ - generative_ai
+ - embedded
+ - sima
+ pipeline_tag: image-text-to-text
+ base_model: llava-hf/llava-1.5-7b-hf
  ---
+
+ # LLaVA-1.5-7b-hf: Optimized for SiMa.ai Modalix
+
+ ## Overview
+
+ This repository contains the **llava-1.5-7b-hf** model, optimized and compiled for the **SiMa.ai Modalix** platform.
+
+ - **Model Architecture:** LLaVA 1.5 (7B parameters)
+ - **Quantization:** Hybrid
+ - **Prompt Processing:** A16W8 (16-bit activations, 8-bit weights)
+ - **Token Generation:** A16W4 (16-bit activations, 4-bit weights)
+ - **Maximum Context Length:** 2048
+ - **Input Resolution:** 336x336
+ - **Source Model:** [llava-hf/llava-1.5-7b-hf](https://huggingface.co/llava-hf/llava-1.5-7b-hf)
+
+ ## Performance
+
+ The following performance metrics were measured with an image input and a 50-token text prompt.
+
+ | Model | Precision | Device | Response Rate (tokens/sec) | Time to First Token (sec) |
+ |---|---|---|---|---|
+ | llava-1.5-7b-hf | A16W8/A16W4 | Modalix | 10.2 | 1.43 |
+
+ ## Prerequisites
+
+ To run this model, you need:
+
+ 1. **SiMa.ai Modalix Device**
+ 2. **SiMa.ai CLI**: [Installed](https://docs.sima.ai/pages/sima_cli/main.html#installation) on your Modalix device.
+ 3. **Hugging Face CLI**: For downloading the model.
+
+ ## Installation & Deployment
+
+ Follow these steps to deploy the model to your Modalix device.
+
+ ### 1. Install LLiMa Demo Application
+
+ > **Note:** This is a **one-time setup**. If you have already installed the LLiMa demo application (e.g., for another model), you can skip this step and continue with the model download.
+
+ On your Modalix device, install the LLiMa demo application using `sima-cli`:
+
+ ```bash
+ # Create a directory for LLiMa
+ cd /media/nvme
+ mkdir llima
+ cd llima
+
+ # Install the LLiMa runtime code
+ sima-cli install -v 2.0.0 samples/llima -t select
+ ```
+
+ > **Note:** To download only the LLiMa runtime code, select **🚫 Skip** when prompted.
+
+ ### 2. Download the Model
+
+ Download the compiled model assets from this repository directly to your device:
+
+ ```bash
+ # Download the model to a local directory
+ cd /media/nvme/llima
+ hf download simaai/llava-1.5-7b-hf-a16w4 --local-dir llava-1.5-7b-hf-a16w4
+ ```
+
+ Alternatively, you can download the compiled model to a host machine and copy it to the Modalix device:
+
+ ```bash
+ hf download simaai/llava-1.5-7b-hf-a16w4 --local-dir llava-1.5-7b-hf-a16w4
+ scp -r llava-1.5-7b-hf-a16w4 sima@<modalix-ip>:/media/nvme/llima/
+ ```
+ *Replace \<modalix-ip\> with the IP address of your Modalix device.*
+
+ **Expected Directory Structure:**
+
+ ```text
+ /media/nvme/llima/
+ ├── simaai-genai-demo/       # The demo app
+ └── llava-1.5-7b-hf-a16w4/   # Your downloaded model
+ ```
+
+ ## Usage
+
+ ### Run the Application
+
+ Navigate to the demo directory and start the application:
+
+ ```bash
+ cd /media/nvme/llima/simaai-genai-demo
+ ./run.sh
+ ```
+
+ The script will detect the installed model(s) and prompt you to select one.
+
+ Once the application is running, open a browser and navigate to:
+
+ ```text
+ https://<modalix-ip>:5000/
+ ```
+ *Replace \<modalix-ip\> with the IP address of your Modalix device.*
+
+ ### API Usage
+
+ To use the OpenAI-compatible API, run the model in API mode:
+
+ ```bash
+ cd /media/nvme/llima/simaai-genai-demo
+ ./run.sh --httponly --api-only
+ ```
+
+ You can interact with it using `curl` or Python.
+
+ **Example: Chat Completion**
+
+ ```bash
+ # Note: Replace <YOUR_BASE64_STRING_HERE> with an actual base64-encoded image string.
+ curl -N -k -X POST "https://<modalix-ip>:5000/v1/chat/completions" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "messages": [
+       {
+         "role": "user",
+         "content": [
+           {
+             "type": "image_url",
+             "image_url": {
+               "url": "data:image/jpeg;base64,<YOUR_BASE64_STRING_HERE>"
+             }
+           },
+           {
+             "type": "text",
+             "text": "Describe this image"
+           }
+         ]
+       }
+     ],
+     "stream": true
+   }'
+ ```
+ *Replace \<modalix-ip\> with the IP address of your Modalix device.*
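+
+ **Example: Chat Completion from Python**
+
+ The same request can be sent from Python. The snippet below is a minimal sketch using the `requests` library; `sample.jpg` is only a placeholder image path, and `verify=False` mirrors the `-k` flag passed to `curl` above for the self-signed certificate.
+
+ ```python
+ import base64
+ import requests
+
+ MODALIX_IP = "<modalix-ip>"  # replace with the IP address of your Modalix device
+
+ # Base64-encode a local image for the image_url payload
+ with open("sample.jpg", "rb") as f:
+     image_b64 = base64.b64encode(f.read()).decode("utf-8")
+
+ payload = {
+     "messages": [
+         {
+             "role": "user",
+             "content": [
+                 {
+                     "type": "image_url",
+                     "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
+                 },
+                 {"type": "text", "text": "Describe this image"},
+             ],
+         }
+     ],
+     "stream": True,
+ }
+
+ # stream=True reads the response incrementally; verify=False skips TLS verification
+ response = requests.post(
+     f"https://{MODALIX_IP}:5000/v1/chat/completions",
+     json=payload,
+     verify=False,
+     stream=True,
+ )
+ for line in response.iter_lines():
+     if line:
+         print(line.decode("utf-8"))
+ ```
+
+ Each printed line is a raw chunk of the streamed response, in the same format the `curl` example receives.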
+
+ ## Limitations
+
+ - **Quantization**: This model is quantized (A16W4/A16W8) for optimal performance on embedded devices. While this maintains high accuracy, minor deviations from the full-precision model may occur.
+
+ ## Troubleshooting
+
+ - **`sima-cli` not found**: Ensure that `sima-cli` is installed on your Modalix device.
+ - **Model cannot be run**: Verify that the model directory sits directly inside `/media/nvme/llima/` and is not nested (e.g., `/media/nvme/llima/llava-1.5-7b-hf-a16w4/llava-1.5-7b-hf-a16w4`).
+ - **Permission Denied**: Ensure you have read/write permissions for the `/media/nvme` directory.
+
+ ## Resources
+
+ - [SiMa.ai Documentation](https://docs.sima.ai)
+ - [SiMa.ai Hugging Face Organization](https://huggingface.co/simaai)
+ - [LLaVA Website](https://llava-vl.github.io/)