What's the best Qwen 27B Q8 quant?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Everyone is talking about Q4, Q5, and Q6, but I have some coding tasks that I feel the lower quants keep getting wrong. I can run the Q8 from unsloth, but it feels a bit slow even with MTP on. Should I just resort to the Q8 of the 35B A3B at this point?