Folks running Qwen 3.6 27B for agentic work: do you dare to use q4_k_m?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I don't have good experience running q4_k_m. Compared with q6, the difference is "a few errors an hour" with q4_k_m versus "a few errors every couple of days" with q6.
Edit: How does it fail? Just as users DifficultDog8435 and FullstackSensei explained in the comments. They worded it better than I could.