David Belton PRO

DavidAU

AI & ML interests

Application(s) of single/multiple LLMs in specialized use cases & automation tasks. LLM, Prompt, System Role and Parameter engineering via chat / API. 500+ LLMs graded.

Recent Activity

updated a model about 8 hours ago

DavidAU/Qwen3-24B-A4B-Freedom-Thinking-Abliterated-Heretic-NEO-Imatrix-GGUF

updated a collection about 12 hours ago

Heretic - Abliterated, Uncensored, Unrestricted POWER.

Organizations

None yet

replied to their post 29 days ago

The 80Bs will soon be on the docket.

Issue with ablits: there may be some losses in the brain dept, so to speak.

Ablits are much better than they used to be, but when it comes to tuning, they can be a bit of a nightmare.

replied to their post 29 days ago

@salsaman

Received your message, you can contact me via discord:
David_AU [note underscore]

Or open/set up a model on your Hugging Face account and I will contact you via the community tab there if you prefer.

posted an update about 1 month ago

*** Happy Halloween - Embrace the Horror! ***

Unsloth fine-tunes using an in-house horror dataset.

Gemma 3 - 1B, 4B, two 12Bs and 27B (uploaded yesterday)
Qwen 3 - 1.7B [two] - new today... and 4B, 6B, 42B ...

And 32 MORE horror models:

https://huggingface.co/DavidAU/models?search=horror

Collection:

https://huggingface.co/collections/DavidAU/grand-horror-165b-horror-and-fiction-generation

Enjoy ;

6 replies

replied to hoteo's post about 1 month ago

Hey;

I am DavidAU
[ https://huggingface.co/DavidAU ]

Also in the fine Perth, WA area; please take a look at my repo and see if there is something we have in common (?).

Your desc of "Algorithm in progress" is not too clear.
I build models including quants, merges and fine tunes.

You can contact me on DISCORD too:
David_AU

Cheers.
David

replied to mlabonne's post 5 months ago

@pmshaikh

Try:
https://huggingface.co/google/medgemma-27b-it
and/or
Gemma 3 27B model - strong in math.

Links to quants are on these repo pages.

Also: Qwen's repo -> MATH model(s).

You can reach me via the "Community tabs" at any of the model repos here:
https://huggingface.co/DavidAU

replied to bartowski's post over 1 year ago

Hear you there. In my case, the issue is source -> GGUF:
which source and outfile config = best quality GGUF.

For creative use cases the cumulative errors do add up, and add up to different, but nuanced, results.

For some of my experiments these "rounding errors" are the target to improve output.

That being said, it can also lead to improvements (or not) in logic/problem solving too. This is not completely understood, but an observation from testing.
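To make the "rounding errors" concrete, here is a minimal sketch (not the actual conversion path of any GGUF tool) of how a single value lands differently when stored as fp16 versus bf16. The helper names `to_fp16` and `to_bf16` are hypothetical, the bf16 emulation assumes a finite, normal input, and real conversions apply this across billions of weights, which is where the small differences can accumulate.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE half precision (fp16) and back to a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bf16(x: float) -> float:
    """Emulate bfloat16: keep the top 16 bits of the float32 encoding.

    Uses round-to-nearest-even on the discarded 16 bits.
    Assumption: x is finite and not near the float32 overflow limit.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Illustrative values only; not taken from any real model.
for w in (0.1, 0.3333, 1.0):
    print(f"{w!r}: fp16={to_fp16(w)!r}  bf16={to_bf16(w)!r}")
```

fp16 keeps more mantissa bits and bf16 keeps more exponent range, so the same source weight can round to slightly different stored values depending on the outfile setting.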

reacted to singhsidhukuldeep's post with 🚀 over 1 year ago

✨ Feeling thankful...

🇮🇳 15th August, 2024; on India's 78th Independence Day

🎉 Crossed 100 followers on Hugging Face

🏆 Got LinkedIn Top Voice

🤖 AI has never been more exciting and I am here for it

👀 @clem Can I be a Hugging Face fellow now?

replied to bartowski's post over 1 year ago

I do not expect meaningful differences between FP16, BF16, and FP32 or the models derived from them and so far I have not seen any evidence to the contrary either.

There is a difference when running test prompts (i.e. Q4KM), at temp 0 for all three, depending on:

1 - Original source "fp"
2 - Outfile settings - fp16, fp32, or bf16.

Although the PPL differences are minor, it does show when using a test prompt.
There are word changes, sentence changes and the like.
On longer gen, conclusions change as well.

It is not a big contrast, but it does show when testing this way.
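One way to run this kind of check is a word-level diff between two temp 0 generations from the same prompt. The sketch below uses only Python's standard `difflib`; the function names and the two sample strings are hypothetical stand-ins, not real model output.

```python
import difflib


def word_changes(text_a: str, text_b: str):
    """List the word-level spans where two generations differ."""
    a, b = text_a.split(), text_b.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def similarity(text_a: str, text_b: str) -> float:
    """Ratio in [0, 1]; 1.0 means the word sequences are identical."""
    matcher = difflib.SequenceMatcher(
        a=text_a.split(), b=text_b.split(), autojunk=False
    )
    return matcher.ratio()


# Hypothetical outputs standing in for two quants built from different sources.
out_fp16 = "the cat sat on the mat"
out_bf16 = "the cat lay on the mat"

for change in word_changes(out_fp16, out_bf16):
    print(change)
print(similarity(out_fp16, out_bf16))
```

A diff like this surfaces word and sentence changes directly. Whether a conclusion changed on a longer generation still needs a human read.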

reacted to mlabonne's post with ❤️ over 1 year ago

⚡ AutoQuant

AutoQuant is the evolution of my previous AutoGGUF notebook (https://colab.research.google.com/drive/1P646NEg33BZy4BfLDNpTz0V0lwIU3CHu). It allows you to quantize your models in five different formats:

- GGUF: perfect for inference on CPUs (and LM Studio)
- GPTQ/EXL2: fast inference on GPUs
- AWQ: super fast inference on GPUs with vLLM (https://github.com/vllm-project/vllm)
- HQQ: extreme quantization with decent 2-bit and 3-bit models

Once the model is converted, it automatically uploads it on the Hugging Face Hub. To quantize a 7B model, GGUF only needs a T4 GPU, while the other methods require an A100 GPU.

Here's an example of a model I quantized using HQQ and AutoQuant: mlabonne/AlphaMonarch-7B-2bit-HQQ

I hope you'll enjoy it and quantize lots of models! :)

💻 AutoQuant: https://colab.research.google.com/drive/1b6nqC7UZVt8bx4MksX7s656GXPM-eWw4

19 replies

replied to mlabonne's post over 1 year ago

Fantastic - thanks so much for sharing. Only a couple of thousand models I want to quant! Using GGUF-my-repo at the moment (a Space):

https://huggingface.co/spaces/ggml-org/gguf-my-repo

Do you know of any way to use the same Colab-type method (or a Space or other) to make GGUFs with Imatrix?
