Nice work!

#1
by edwarddddr - opened

I have not tested the model's capabilities at full precision, but the Q5_K_M.gguf has been pretty good so far. I am still tuning performance on my old GTX 1050 Ti and have managed to hit 18 tokens per second.
It's really cool and usable: it handles tool calling and it's pretty smart.
Thank you for your work!

Awesome to hear! If you want even better quality at the same GPU memory usage and performance, you could use our i1-Q5_K_M quants from https://huggingface.co/mradermacher/AgentCPM-Explore-i1-GGUF instead. That said, even the static Q5_K_M quants are so close to the original model that no normal user will ever be able to tell the difference, except in some very specific niche use cases.
