Mungert committed
Commit d670d26 · verified · 0 Parent(s)

Super-squash history to reclaim storage
.gitattributes ADDED
@@ -0,0 +1,68 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-bf16.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q4_k_l.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q4_k_s.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q8.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q5_k_m.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q3_k_s.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q6_k_m.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q3_k_l.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-bf16-q8.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-f16-q8.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q5_k_s.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-mmproj-f16.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-mmproj-q8.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-iq3_xs.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-mmproj-bf16.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q4_0.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q4_1.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-iq4_xs.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-iq4_nl.gguf filter=lfs diff=lfs merge=lfs -text
mmproj.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q5_k_l.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q3_k_m.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q2_k_l.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-mmproj-f32.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q6_k_l.gguf filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
car-1.jpg filter=lfs diff=lfs merge=lfs -text
llama-gemma3-cli filter=lfs diff=lfs merge=lfs -text
google_gemma-3-4b-it-f16.gguf filter=lfs diff=lfs merge=lfs -text
gemma-3-4b-it-f16.gguf filter=lfs diff=lfs merge=lfs -text
gemma-3-4b-it-f16-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
gemma-3-4b-it-bf16-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
gemma-3-4b-it-f16-q6_k.gguf filter=lfs diff=lfs merge=lfs -text
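
These rules follow the pattern Git LFS writes into `.gitattributes` when a path is tracked. As a rough sketch (assuming the `git-lfs` CLI is installed; the wildcard pattern below is illustrative and not part of this commit), a rule like the ones above is typically added with:

```bash
# Track a new pattern with Git LFS; this appends a
# "<pattern> filter=lfs diff=lfs merge=lfs -text" line to .gitattributes
git lfs track "*.gguf"
git add .gitattributes
```
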
README.md ADDED
@@ -0,0 +1,268 @@
---
license: gemma
pipeline_tag: image-text-to-text
tags:
- vision
- gemma
- llama.cpp
---

# <span style="color: #7FFF7F;">Gemma-3 4B Instruct GGUF Models</span>

## How to Use Gemma 3 Vision with llama.cpp

To use the experimental Gemma 3 Vision support in `llama.cpp`, follow these steps:

1. **Clone the latest llama.cpp repository**:
```bash
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
```

2. **Build llama.cpp**:

Build llama.cpp as usual: https://github.com/ggml-org/llama.cpp#building-the-project (a typical CPU-only build is sketched below).

Once llama.cpp is built, copy ./llama.cpp/build/bin/llama-gemma3-cli to a folder of your choice.

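For reference, a minimal CPU-only CMake build along the lines of the linked instructions usually looks like this (the exact flags depend on your platform, and GPU backends need extra options; see the official guide):

```bash
# Configure and build llama.cpp (CPU-only sketch)
cmake -B build
cmake --build build --config Release -j
# The vision CLI should end up in ./build/bin/
ls ./build/bin/llama-gemma3-cli
```
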
3. **Download the Gemma 3 gguf file**:

https://huggingface.co/Mungert/gemma-3-4b-it-gguf/tree/main

Choose a gguf file without mmproj in the name.

Example gguf file: https://huggingface.co/Mungert/gemma-3-4b-it-gguf/resolve/main/google_gemma-3-4b-it-q4_k_l.gguf

Copy this file to your chosen folder.

4. **Download the Gemma 3 mmproj file**:

https://huggingface.co/Mungert/gemma-3-4b-it-gguf/tree/main

Choose a file with mmproj in the name.

Example mmproj file: https://huggingface.co/Mungert/gemma-3-4b-it-gguf/resolve/main/google_gemma-3-4b-it-mmproj-bf16.gguf

Copy this file to your chosen folder.

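If you prefer the command line, one way to fetch the two example files above into your chosen folder (assuming `wget` is available; `curl -L -O` works similarly) is:

```bash
# Download the example model and mmproj files referenced in steps 3 and 4
wget https://huggingface.co/Mungert/gemma-3-4b-it-gguf/resolve/main/google_gemma-3-4b-it-q4_k_l.gguf
wget https://huggingface.co/Mungert/gemma-3-4b-it-gguf/resolve/main/google_gemma-3-4b-it-mmproj-bf16.gguf
```
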
5. Copy images to the same folder as the gguf files, or alter the paths appropriately.

In the example below, the gguf files, images, and llama-gemma3-cli are all in the same folder.

Example image: https://huggingface.co/Mungert/gemma-3-4b-it-gguf/resolve/main/car-1.jpg

Copy this file to your chosen folder.

6. **Run the CLI Tool**:

From your chosen folder:

```bash
llama-gemma3-cli -m google_gemma-3-4b-it-q4_k_l.gguf --mmproj google_gemma-3-4b-it-mmproj-bf16.gguf
```

```
Running in chat mode, available commands:
   /image <path>    load an image
   /clear           clear the chat history
   /quit or /exit   exit the program

> /image car-1.jpg
Encoding image car-1.jpg
Image encoded in 46305 ms
Image decoded in 19302 ms

> what is the image of
Here's a breakdown of what's in the image:

**Subject:** The primary subject is a black Porsche Panamera Turbo driving on a highway.

**Details:**

* **Car:** It's a sleek, modern Porsche Panamera Turbo, identifiable by its distinctive rear design, the "PORSCHE" lettering, and the "Panamera Turbo" badge. The license plate reads "CVC-911".
* **Setting:** The car is on a multi-lane highway, with a blurred background of trees, a distant building, and a cloudy sky. The lighting suggests it's either dusk or dawn.
* **Motion:** The image captures the car in motion, with a slight motion blur to convey speed.

**Overall Impression:** The image conveys a sense of speed, luxury, and power. It's a well-composed shot that highlights the car's design and performance.

Do you want me to describe any specific aspect of the image in more detail, or perhaps analyze its composition?
```

# <span id="testllm" style="color: #7F7FFF;">🚀 If you find these models useful</span>

Please click like ❤️. Also, I'd really appreciate it if you could test my Network Monitor Assistant at 👉 [Network Monitor Assistant](https://readyforquantum.com).
💬 Click the **chat icon** (bottom right of the main and dashboard pages). Choose an LLM; toggle between the LLM types TurboLLM -> FreeLLM -> TestLLM.

### What I'm Testing
I'm experimenting with **function calling** against my network monitoring service, using small open-source models. The question I'm interested in is: how small can a model be and still function?
🟡 **TestLLM** – Runs **Phi-4-mini-instruct** using phi-4-mini-q4_0.gguf with llama.cpp on 6 threads of a CPU VM (it should take about 15s to load; inference is quite slow, and it only processes one user prompt at a time; still working on scaling!). If you're curious, I'd be happy to share how it works!

### Other Available AI Assistants
🟢 **TurboLLM** – Uses **gpt-4o-mini**. Fast! Note: tokens are limited since OpenAI models are pricey, but you can [Login](https://readyforquantum.com) or [Download](https://readyforquantum.com/download/?utm_source=huggingface&utm_medium=referral&utm_campaign=huggingface_repo_readme) the Quantum Network Monitor agent to get more tokens, or alternatively use the TestLLM.
🔵 **HugLLM** – Runs **open-source Hugging Face models**. Fast, but runs small models (≈8B), hence lower quality. Get 2x more tokens (subject to Hugging Face API availability).

## **Choosing the Right Model Format**

Selecting the correct model format depends on your **hardware capabilities** and **memory constraints**.

### **BF16 (Brain Float 16) – Use if BF16 acceleration is available**
- A 16-bit floating-point format designed for **faster computation** while retaining good precision.
- Provides a **similar dynamic range** to FP32 but with **lower memory usage**.
- Recommended if your hardware supports **BF16 acceleration** (check your device's specs; a quick CPU check is sketched below).
- Ideal for **high-performance inference** with a **reduced memory footprint** compared to FP32.

📌 **Use BF16 if:**
✔ Your hardware has native **BF16 support** (e.g., newer GPUs, TPUs).
✔ You want **higher precision** while saving memory.
✔ You plan to **requantize** the model into another format.

📌 **Avoid BF16 if:**
❌ Your hardware does **not** support BF16 (it may fall back to FP32 and run slower).
❌ You need compatibility with older devices that lack BF16 optimization.

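As a quick, non-exhaustive check for native BF16 support on a Linux x86 CPU, you can look for BF16-related CPU flags (for GPUs and TPUs, check the vendor specs instead; this snippet is an assumption about your environment, not a requirement):

```bash
# Prints avx512_bf16 / amx_bf16 if the CPU advertises BF16 support; prints nothing otherwise
grep -o -E 'avx512_bf16|amx_bf16' /proc/cpuinfo | sort -u
```
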
---

### **F16 (Float 16) – More widely supported than BF16**
- A 16-bit floating-point format with **high precision**, but a smaller range of values than BF16.
- Works on most devices with **FP16 acceleration support** (including many GPUs and some CPUs).
- Slightly lower numerical precision than BF16, but generally sufficient for inference.

📌 **Use F16 if:**
✔ Your hardware supports **FP16** but **not BF16**.
✔ You need a **balance between speed, memory usage, and accuracy**.
✔ You are running on a **GPU** or another device optimized for FP16 computations.

📌 **Avoid F16 if:**
❌ Your device lacks **native FP16 support** (it may run slower than expected).
❌ You have memory limitations.

---

### **Quantized Models (Q4_K, Q6_K, Q8, etc.) – For CPU & Low-VRAM Inference**
Quantization reduces model size and memory usage while maintaining as much accuracy as possible.
- **Lower-bit models (Q4_K)** → **Best for minimal memory usage**; may have lower precision.
- **Higher-bit models (Q6_K, Q8_0)** → **Better accuracy**; require more memory.

📌 **Use Quantized Models if:**
✔ You are running inference on a **CPU** and need an optimized model.
✔ Your device has **low VRAM** and cannot load full-precision models.
✔ You want to reduce **memory footprint** while keeping reasonable accuracy.

📌 **Avoid Quantized Models if:**
❌ You need **maximum accuracy** (full-precision models are better for this).
❌ Your hardware has enough VRAM for higher-precision formats (BF16/F16).

---

### **Summary Table: Model Format Selection**

| Model Format | Precision | Memory Usage | Device Requirements | Best Use Case |
|--------------|------------|---------------|----------------------|---------------|
| **BF16** | Highest | High | BF16-supported GPU/CPUs | High-speed inference with reduced memory |
| **F16** | High | High | FP16-supported devices | GPU inference when BF16 isn't available |
| **Q4_K** | Low | Very Low | CPU or low-VRAM devices | Best for memory-constrained environments |
| **Q6_K** | Medium-Low | Low | CPU with more memory | Better accuracy while still being quantized |
| **Q8** | Medium | Moderate | CPU or GPU with enough VRAM | Best accuracy among quantized models |


## **Included Files & Details**

### `google_gemma-3-4b-it-bf16.gguf`
- Model weights preserved in **BF16**.
- Use this if you want to **requantize** the model into a different format (a sketch follows below).
- Best if your device supports **BF16 acceleration**.

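As a rough sketch of what requantizing from the BF16 file can look like (assuming you built llama.cpp's `llama-quantize` tool as above; the output name and target type here are just examples):

```bash
# Requantize the BF16 GGUF to an example target type (Q4_K_M)
./build/bin/llama-quantize google_gemma-3-4b-it-bf16.gguf my-gemma-3-4b-it-q4_k_m.gguf Q4_K_M
```
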
### `google_gemma-3-4b-it-f16.gguf`
- Model weights stored in **F16**.
- Use if your device supports **FP16**, especially if BF16 is not available.

### `google_gemma-3-4b-it-bf16-q8.gguf`
- **Output & embeddings** remain in **BF16**.
- All other layers quantized to **Q8_0**.
- Use if your device supports **BF16** and you want a quantized version.

### `google_gemma-3-4b-it-f16-q8.gguf`
- **Output & embeddings** remain in **F16**.
- All other layers quantized to **Q8_0**.

### `google_gemma-3-4b-it-q4_k_l.gguf`
- **Output & embeddings** quantized to **Q8_0**.
- All other layers quantized to **Q4_K**.
- Good for **CPU inference** with limited memory.

### `google_gemma-3-4b-it-q4_k_m.gguf`
- Similar to Q4_K.
- Another option for **low-memory CPU inference**.

### `google_gemma-3-4b-it-q4_k_s.gguf`
- Smallest **Q4_K** variant, using less memory at the cost of accuracy.
- Best for **very low-memory setups**.

### `google_gemma-3-4b-it-q6_k_l.gguf`
- **Output & embeddings** quantized to **Q8_0**.
- All other layers quantized to **Q6_K**.

### `google_gemma-3-4b-it-q6_k_m.gguf`
- A mid-range **Q6_K** quantized model for balanced performance.
- Suitable for **CPU-based inference** with **moderate memory**.

### `google_gemma-3-4b-it-q8.gguf`
- Fully **Q8** quantized model for better accuracy.
- Requires **more memory** but offers higher precision.

# Gemma 3 model card

**Model Page**: [Gemma](https://ai.google.dev/gemma/docs/core)

**Resources and Technical Documentation**:

* [Gemma 3 Technical Report][g3-tech-report]
* [Responsible Generative AI Toolkit][rai-toolkit]
* [Gemma on Kaggle][kaggle-gemma]
* [Gemma on Vertex Model Garden][vertex-mg-gemma3]

**Terms of Use**: [Terms][terms]

**Authors**: Google DeepMind

## Model Information

Summary description and brief definition of inputs and outputs.

### Description

Gemma is a family of lightweight, state-of-the-art open models from Google,
built from the same research and technology used to create the Gemini models.
Gemma 3 models are multimodal, handling text and image input and generating text
output, with open weights for both pre-trained variants and instruction-tuned
variants. Gemma 3 has a large, 128K context window, multilingual support in over
140 languages, and is available in more sizes than previous versions. Gemma 3
models are well-suited for a variety of text generation and image understanding
tasks, including question answering, summarization, and reasoning. Their
relatively small size makes it possible to deploy them in environments with
limited resources such as laptops, desktops or your own cloud infrastructure,
democratizing access to state of the art AI models and helping foster innovation
for everyone.

### Inputs and outputs

- **Input:**
  - Text string, such as a question, a prompt, or a document to be summarized
  - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each
  - Total input context of 128K tokens for the 4B, 12B, and 27B sizes, and
    32K tokens for the 1B size

- **Output:**
  - Generated text in response to the input, such as an answer to a
    question, analysis of image content, or a summary of a document
  - Total output context of 8192 tokens


## Credits

Thanks to [Bartowski](https://huggingface.co/bartowski) for the imatrix upload, and for the guidance on quantization that enabled me to produce these gguf files.
car-1.jpg ADDED

Git LFS Details

  • SHA256: 43588f762fb740cbde1d33ab04d3ef539ff4ce88254b6d28daddd5a1792c2cd9
  • Pointer size: 131 Bytes
  • Size of remote file: 280 kB
gemma-3-4b-it-bf16-q8_0.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e6a32ecc44453a6538d81b0cb4115ca58e0863a003ee88ce1ee6ac049098f29d
size 4759372096
gemma-3-4b-it-f16-q6_k.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9dbfaa2e93a3aac10d6ba1db9efa9cf0e46efc02ebe5575ec35063ccfa3387e3
size 3982279264
gemma-3-4b-it-f16-q8_0.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0835b91284b341dbf1250e0828780319dd788b9c6822ad06bd8ba7e9acb1b049
size 4759372096
google_gemma-3-4b-it-bf16-q8.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:43c9324ba0174824e80ac9348284c060344561206d7200bda85bdb0df308b53d
size 4759372096
google_gemma-3-4b-it-bf16.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:80f21bfd9fb13c6e67c77f445dd55c9a03fdc912def4894bf741f1511ba1342b
size 7767474272
google_gemma-3-4b-it-f16-q8.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2d37b7be6c52c077ab88d942bed8e8c971a939891cf1ca14eadb137d22aa74cc
size 4759372096
google_gemma-3-4b-it-iq3_xs.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:86dde70bd1739395a684b016ef96fc6156210f451de659a75713e2b9430b4822
size 1863254336
google_gemma-3-4b-it-iq4_nl.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:73c52c7078b58fcd15cd0d481b42a48ad3cd74e7cf8034352bb3101c95ad58b5
size 2363375936
google_gemma-3-4b-it-iq4_xs.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3c44379412daf2e010150b90e6373c0d98a147dd634eb72fced2a6b092125a2c
size 2263105856
google_gemma-3-4b-it-mmproj-bf16.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dfd94f00498e29cf2bf391ebe1a1a7ed6d151e75d05063f61ebbb2aaf8859cb5
size 851251104
google_gemma-3-4b-it-mmproj-f16.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8c0fb064b019a6972856aaae2c7e4792858af3ca4561be2dbf649123ba6c40cb
size 851251104
google_gemma-3-4b-it-mmproj-f32.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:804f41f3860612815bdd915ae382cd1966ec94b801dd051ecb4a96018cd97acf
size 1679290272
google_gemma-3-4b-it-mmproj-q8.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:33b32907e4e7ef4ab8d29da6444f3033b27a242a772db7136fdc3e4d9195edc3
size 588612384
google_gemma-3-4b-it-q3_k_m.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0ca6b1f6a78b27734cbac7e7c3d5cde3f7420d56478fc9186c82fe5c7553725f
size 2098323776
google_gemma-3-4b-it-q3_k_s.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6a13953c679d88db95ce8c5050da64aedd83281dae90e301990e34d3d1f2c733
size 1937228096
google_gemma-3-4b-it-q4_0.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2f2105104de860cdbd6042347d67e1f0b132ea64d0b53e5b6619706e4d6f6193
size 2525905216
google_gemma-3-4b-it-q4_1.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cce8329e50eb457e869e2733ef7b0323cd85a9ba7e28186fe226e77e03d14fe0
size 2726445376
google_gemma-3-4b-it-q4_k_m.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2067cd277a277b16f2cd465eb10765d6d768682c7896e78269ed4e509b041c59
size 2489758016
google_gemma-3-4b-it-q4_k_s.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5e86d5c003c337e486b4faa8608ac94b56b8bcfa1d23ac0519b31a33ab519138
size 2377793856
google_gemma-3-4b-it-q5_k_m.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8d2180aa3eeb65dfae0bb777ecf956320a47d1ce5575711bbdebc37c87082d82
size 2829562176
google_gemma-3-4b-it-q5_k_s.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2e4ece964fe19d232ed52b2b8fc74863e221f4230f7e3065f7c549352970fe4c
size 2764456256
google_gemma-3-4b-it-q6_k_m.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4dda4283f62cbc5cbf16e11cf55c48490a9a57ad5d0125f136943e228b52ea8c
size 3190604096
google_gemma-3-4b-it-q8.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ec257cb2b77f753e0ec77091c3046e5d6d9fe4ad622e65f7b894de1b16f369b2
size 4130226272
llama-gemma3-cli ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b91510639e5a0eb0528d5d5c868d3f0e5392376bd0502f12676699c828174ac2
size 2186584
mmproj.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8c0fb064b019a6972856aaae2c7e4792858af3ca4561be2dbf649123ba6c40cb
size 851251104
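
Each of the files above is committed as a Git LFS pointer (spec version, `oid sha256:`, and `size` in bytes) rather than as the raw weights. After downloading a file, you can check it against the `oid` recorded in its pointer, for example (assuming `sha256sum` is available):

```bash
# The printed digest should match the "oid sha256:" value shown for this file above
sha256sum google_gemma-3-4b-it-q4_k_m.gguf
```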