r/LocalLLaMA

Is there any case of a less quantised smaller model outperforming a more quantised larger model?


As per the title.

For example, Gemma 4 31B Q4_K_S vs Gemma 4 26B A4B Q8,
or
Qwen 3.6 27B Q4_K_M vs Qwen 3.6 35B A3B Q6_K,
etc.

At what point is it worth switching?

My use case is mostly creative writing.

submitted by /u/opoot_
