Testing this pruning version is not very convenient.
#1, opened by chfm
Testing has shown that this pruned version is not as effective as the original version at Q3 quantization.
By the way, Mr. Phr00t has shared fast versions, including NSFW ones. Could you please provide GGUF quantizations of these?
https://huggingface.co/Phr00t/Qwen-Rapid-AIO