Is there any case of a less quantised smaller model outperforming a more quantised larger model?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
As per the title
Such as Gemma 4 31B Q4_K_S vs Gemma 4 26B A4B Q8
Or
Qwen 3.6 27B Q4_K_M vs Qwen 3.6 35B A3B Q6_K
Etc.
At what point is it worth switching?
My use case is mostly creative writing.
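To make the comparison concrete, here is a rough sketch that estimates how much memory the weights of each pairing need. It is only an approximation under stated assumptions: the parameter counts are read off the model names, "Q8" is assumed to mean Q8_0, and Q4_K_S / Q4_K_M are both approximated by the plain Q4_K block size (real GGUF files mix tensor types, so actual files will be somewhat larger). The helper name `weight_gib` is made up for this example. It says nothing about output quality, only about footprint.

```python
# Bits per weight implied by the llama.cpp block layouts.
# Assumption: one quant type for every tensor, which real files do not do.
BITS_PER_WEIGHT = {
    "Q4_K": 4.5,     # 144 bytes per 256-weight super-block
    "Q6_K": 6.5625,  # 210 bytes per 256-weight super-block
    "Q8_0": 8.5,     # 34 bytes per 32-weight block
}


def weight_gib(params_billion: float, quant: str) -> float:
    """Approximate size of the weights alone, in GiB (no KV cache, no overhead)."""
    total_bits = params_billion * 1e9 * BITS_PER_WEIGHT[quant]
    return total_bits / 8 / 2**30


# Pairings from the post; sizes and quant mappings are assumptions (see above).
candidates = [
    ("Gemma 4 31B Q4_K_S", 31, "Q4_K"),
    ("Gemma 4 26B A4B Q8", 26, "Q8_0"),
    ("Qwen 3.6 27B Q4_K_M", 27, "Q4_K"),
    ("Qwen 3.6 35B A3B Q6_K", 35, "Q6_K"),
]

for name, params_b, quant in candidates:
    print(f"{name}: ~{weight_gib(params_b, quant):.1f} GiB of weights")
```

Note that for the A4B / A3B mixture-of-experts models the total parameter count is what has to fit in memory; the smaller active-parameter count mainly affects speed per token.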