---
base_model: nintwentydo/Razorback-12B-v0.2
base_model_relation: quantized
library_name: transformers
tags:
- mergekit
- merge
- multimodal
- mistral
- pixtral
language:
- en
- fr
- de
- es
- it
- pt
- ru
- zh
- ja
license: other
pipeline_tag: image-text-to-text
---

# Razorback 12B v0.2 ExLlamaV2 8.0bpw Quant
#### UnslopNemo with Vision!

<img src="https://huggingface.co/nintwentydo/Razorback-12B-v0.1/resolve/main/razorback.jpg" style="width: 100%; max-width:700px"></img>

A more robust attempt at merging TheDrummer's UnslopNemo v3 into Pixtral 12B.

It has been really stable in my testing so far, though it needs more testing to see which samplers it does and doesn't like.

It seems to be the best of both worlds: less sloppy, more engaging content, with decent intelligence and visual understanding.


## Merging Approach
First, I loaded up Pixtral 12B Base and Mistral Nemo Base to compare their parameter differences.
Looking at the L2 norm / relative difference values, I was able to isolate which parts of Pixtral 12B deviate significantly from Mistral Nemo.
This matters because, while the language model architecture is the same between the two, a lot of vision understanding has been trained into Pixtral's language model, and it can break very easily.

Then I calculated merging weights for each parameter using an exponential falloff. The smaller the difference, the higher the weight.

I applied this recipe to Pixtral Instruct (Pixtral-12B-2409) and TheDrummer's UnslopNemo-12B-v3. The goal is to infuse as much Drummer goodness as possible without breaking vision input. And it looks like it's worked!
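
To make the idea above concrete, here is a minimal sketch of difference-based merge weights on toy data. It is not the actual recipe used for this model: the function names, the `decay` constant, the choice of Mistral Nemo as the reference norm, and the linear interpolation step are all assumptions made for illustration.

```python
import math

def relative_l2_diff(a, b):
    """Relative L2 difference ||a - b|| / ||b|| for two flat parameter vectors.
    Using b as the reference norm is an assumption."""
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ref = math.sqrt(sum(y * y for y in b))
    return diff / ref if ref > 0 else 0.0

def falloff_weight(rel_diff, max_weight=1.0, decay=10.0):
    """Exponential falloff: the smaller the difference, the higher the weight.
    max_weight and decay are hypothetical tuning knobs."""
    return max_weight * math.exp(-decay * rel_diff)

def merge_param(target, donor, weight):
    """Move the target parameter toward the donor by `weight` (assumed linear blend)."""
    return [(1.0 - weight) * t + weight * d for t, d in zip(target, donor)]

# Toy stand-ins for the two base models' parameters (hypothetical names and values).
pixtral_base = {"layer.a": [1.0, 2.0], "layer.b": [1.0, 2.0]}
nemo_base = {"layer.a": [1.0, 2.0], "layer.b": [2.0, 0.0]}

# One merge weight per parameter, derived from how far the bases diverge.
weights = {
    name: falloff_weight(relative_l2_diff(pixtral_base[name], nemo_base[name]))
    for name in pixtral_base
}

for name, w in weights.items():
    print(f"{name}: merge weight {w:.6f}")
```

In this toy setup, `layer.a` is identical in both bases and gets the full weight, while `layer.b` diverges strongly and is left almost untouched, which is the behaviour you would want for parameters that carry vision understanding.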

## Usage
More testing is needed to identify the best sampling params, but so far just using ~0.7 temp + 0.03 min p has been rock solid.

Use the included chat template (Mistral). No ChatML support yet.
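
For anyone unfamiliar with the sampling settings mentioned above, here is a rough, backend-independent sketch of what temperature plus min p does to a token distribution. It is plain Python for illustration only, not the ExLlamaV2 sampler; the function names are hypothetical, and applying temperature before the min p filter is an assumption (backends differ on the order).

```python
import math
import random

def min_p_filter(logits, temperature=0.7, min_p=0.03):
    """Return {token_index: probability} after temperature scaling and min-p filtering.
    Tokens whose probability is below min_p * (top probability) are dropped."""
    scaled = [l / temperature for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract peak for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}
    norm = sum(kept.values())
    return {i: p / norm for i, p in kept.items()}  # renormalise the survivors

def min_p_sample(logits, temperature=0.7, min_p=0.03, rng=random):
    """Draw one token index from the filtered distribution."""
    kept = min_p_filter(logits, temperature, min_p)
    indices = list(kept.keys())
    return rng.choices(indices, weights=[kept[i] for i in indices], k=1)[0]

# Toy logits: two plausible tokens and one very unlikely one.
toy_logits = [5.0, 4.0, -5.0]
print(min_p_filter(toy_logits))
print(min_p_sample(toy_logits, rng=random.Random(0)))
```

With these toy logits the very unlikely third token falls below the min p cutoff and can never be sampled, while the two plausible tokens keep their relative odds.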

## Credits
- Mistral for [mistralai/Pixtral-12B-2409](https://huggingface.co/mistralai/Pixtral-12B-2409)
- Unsloth for [unsloth/Pixtral-12B-2409](https://huggingface.co/unsloth/Pixtral-12B-2409) transformers conversion
- TheDrummer for [TheDrummer/UnslopNemo-12B-v3](https://huggingface.co/TheDrummer/UnslopNemo-12B-v3)

## Available Sizes
| Repo | Bits | Head Bits | Size |
| ----------- | ------ | ------ | ------ |
| [nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw](https://huggingface.co/nintwentydo/Razorback-12B-v0.2-exl2-4.0bpw) | 4.0 | 6.0 | 8.19 GB |
| [nintwentydo/Razorback-12B-v0.2-exl2-5.0bpw](https://huggingface.co/nintwentydo/Razorback-12B-v0.2-exl2-5.0bpw) | 5.0 | 6.0 | 9.54 GB |
| [nintwentydo/Razorback-12B-v0.2-exl2-6.0bpw](https://huggingface.co/nintwentydo/Razorback-12B-v0.2-exl2-6.0bpw) | 6.0 | 8.0 | 11.1 GB |
| [nintwentydo/Razorback-12B-v0.2-exl2-8.0bpw](https://huggingface.co/nintwentydo/Razorback-12B-v0.2-exl2-8.0bpw) | 8.0 | 8.0 | 13.7 GB |