r/LocalLLaMA

Let’s talk quants of Gemma and Qwen - 16 vs Q8 vs Q4 - any experiences?

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Some people say they’d never go under Q8, and others say they find Q3 acceptable! What’s your take?

submitted by /u/Borkato