r/LocalLLaMA

gemma4 QATs vs higher-bit regular quantizations?

I have enough RAM+VRAM to run gemma4 26b a4b at up to q6_K with decent performance. Does anyone have comparisons of the Q4_0 QATs (4 bits/wt) vs. non-QAT quantizations at >4 bits/wt (e.g. q6_K)?

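IIUC, KLD vs. the originals wouldn't be an appropriate metric here, since the QAT weights were trained further and aren't expected to reproduce the original model's output distribution.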
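For context, here is a minimal sketch of what a KLD comparison computes: the mean per-token KL divergence between a reference model's next-token distribution and a test model's. The function names and the toy logits are made up for illustration; real logits would come from running both models over the same text.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one token position's logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def kl_divergence(ref_logits, test_logits):
    """KL(ref || test) in nats for a single token position."""
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(test_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def mean_kld(ref_batch, test_batch):
    """Average KL divergence across token positions."""
    vals = [kl_divergence(r, t) for r, t in zip(ref_batch, test_batch)]
    return sum(vals) / len(vals)

# Toy logits (hypothetical): two token positions, vocabulary of three.
ref = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
quant = [[1.9, 1.1, 0.1], [0.4, 0.6, 2.8]]
print(mean_kld(ref, quant))
```

The score only measures how closely the test model tracks the reference, not how good its answers are, so a QAT model that drifted from the original during training could score worse on KLD without actually being worse. A task benchmark or perplexity on held-out text would sidestep that.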

submitted by /u/Fun_Tangerine_1086