pswitala/pllum-8B-instruct-Q5_k_m-gguf
This model was converted to GGUF format from CYFRAGOVPL/Llama-PLLuM-8B-instruct using llama.cpp.
Refer to the original model card for more details on the model.
Really fast on an RTX 4080
This model runs smoothly on an RTX 4080 at 70 tokens/sec.
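A throughput figure like the one above can be checked locally. The following is a minimal sketch, assuming the llama-cpp-python package is installed with GPU support and that the GGUF file has already been downloaded. The function names `tokens_per_second` and `benchmark`, the default prompt, and the context size are illustrative assumptions, not part of the original model card. The timing covers prompt processing as well as generation, so it slightly understates pure generation speed.

```python
import time


def tokens_per_second(n_tokens: int, seconds: float) -> float:
    """Return generation throughput; reject non-positive elapsed times."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return n_tokens / seconds


def benchmark(model_path: str,
              prompt: str = "Napisz krótki wiersz o Wiśle.",
              max_tokens: int = 128) -> float:
    """Load a local GGUF file and time a single completion."""
    # Imported lazily so tokens_per_second works without llama-cpp-python.
    from llama_cpp import Llama

    # n_gpu_layers=-1 asks llama.cpp to offload all layers to the GPU.
    # n_ctx=2048 is an arbitrary choice for this sketch.
    llm = Llama(model_path=model_path, n_gpu_layers=-1, n_ctx=2048,
                verbose=False)

    start = time.perf_counter()
    out = llm.create_completion(prompt, max_tokens=max_tokens)
    elapsed = time.perf_counter() - start

    # The completion dict reports how many tokens were actually generated.
    return tokens_per_second(out["usage"]["completion_tokens"], elapsed)
```

Calling `benchmark("path/to/model.gguf")` with the path to the downloaded file returns the measured tokens per second. Results will vary with the prompt, `max_tokens`, driver version, and how many layers fit in GPU memory.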
Quantization: 5-bit (Q5_K_M)
Base model: CYFRAGOVPL/Llama-PLLuM-8B-instruct