mmproj precision

#1
by BigWhoop - opened

Hi, I saw that the Qwen3-VL GGUFs often come with a F32 mmproj. Is there any benefit in using the F32 over F16?

Unsloth AI org

Not really: F16/BF16 is enough. F32 works too, but in practice BF16 is sufficient.

The vision tensors are almost always stored in BF16, so converting to F32 adds no new information. BF16 -> F16 is technically lossy (F16 has a narrower exponent range), but it's debatable whether you'd notice the difference in practice.
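To illustrate where the BF16 -> F16 loss actually comes from, here's a small pure-Python sketch (the `bf16`/`f16` helpers are illustrative, not part of any GGUF tooling): BF16 keeps float32's 8-bit exponent with a 7-bit mantissa, while F16 has a 10-bit mantissa but only a 5-bit exponent, so the loss shows up at extreme magnitudes rather than in the mantissa.

```python
import struct

def bf16(x: float) -> float:
    """Truncate a float to bfloat16 precision (top 16 bits of the float32 pattern)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def f16(x: float) -> float:
    """Round-trip a float through IEEE-754 half precision (struct's 'e' format)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# BF16 shares float32's 8-bit exponent, so tiny magnitudes survive...
v = bf16(1.0e-10)
print(v != 0.0)   # True: still representable in BF16

# ...but F16's 5-bit exponent bottoms out around 6e-8, so the same value underflows
print(f16(v))     # 0.0
```

In the normal numeric range the conversion is exact (F16's 10 mantissa bits cover BF16's 7), which is why the difference is rarely visible in practice.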

If your hardware doesn't support BF16, however, you can use F32 for a lossless mmproj.
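Why F32 is lossless here can be checked directly: a BF16 value is just a float32 whose low 16 mantissa bits are zero, so storing it as F32 changes nothing. A minimal sketch (the sample bit patterns are arbitrary, chosen for illustration):

```python
import struct

def as_f32(x: float) -> float:
    """Round-trip a value through 32-bit float (struct's 'f' format)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Each BF16 value corresponds to a float32 bit pattern with the low 16 bits zero.
# Samples: 1.0, ~3.14, -100.0, and the smallest normal float32.
for hi in (0x3F80, 0x4049, 0xC2C8, 0x0080):
    v = struct.unpack("<f", struct.pack("<I", hi << 16))[0]
    assert as_f32(v) == v  # lossless: every BF16 value is exactly representable in F32
print("all BF16 samples round-trip exactly through F32")
```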
