r/LocalLLaMA

Anyone seen benchmarks comparing Gemma 4 4-bit QAT vs. 8-bit standard quants?

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

I'm trying to find out if anyone has done any benchmarking comparing the Gemma 4 4-bit QAT models (via Unsloth) against standard 8-bit non-QAT quants.

I know QAT is supposed to retain a ton of accuracy relative to the BF16 baseline, but I'm curious how a 4-bit QAT model actually fares against a traditional 8-bit PTQ (post-training quantization) quant. I've read mixed feedback across different threads, but I haven't been able to find hard numbers or a direct head-to-head comparison between the two.

Has anyone run any evaluations on this yet?

submitted by /u/Character_Split4906