Q3_K_M (112 GB) is bigger than Q3_K_XL (104 GB)?
#8, opened by rtzurtz
as per title
Yes, that's correct. K_XL is usually smaller.
But what are the implications?
I have a Strix Halo with 128 GB, so I can run either Q3_K_XL at ~100 GB or Q3_K_M at ~115 GB, but what's the difference in perplexity or benchmarks?
Couldn't we have a UD K_XL which sits between them?
There's 20 GB+ of 'free real estate' on the device.
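For a rough sense of that headroom, here is a quick sketch using the file sizes from the thread title. It assumes the full 128 GB is usable for weights, which is optimistic: the OS, the KV cache, and compute buffers all take a share, so treat the numbers as upper bounds. The names `DEVICE_GB`, `QUANT_SIZES_GB`, and `headroom_gb` are made up for illustration.

```python
# Sizes in GB, taken from the thread title.
# DEVICE_GB assumes all 128 GB is available to the model (an assumption).
DEVICE_GB = 128
QUANT_SIZES_GB = {"Q3_K_XL": 104, "Q3_K_M": 112}


def headroom_gb(model_gb, device_gb=DEVICE_GB):
    """Memory left after loading the weights (ignores KV cache and OS overhead)."""
    return device_gb - model_gb


# Headroom per quant, in GB.
headroom = {name: headroom_gb(size) for name, size in QUANT_SIZES_GB.items()}

for name, free in headroom.items():
    print(f"{name}: {free} GB free")
```

With these figures, Q3_K_XL leaves 24 GB and Q3_K_M leaves 16 GB before context and runtime overhead are accounted for.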