High VRAM local coding model — still Qwen 3.6 27B?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I’ve been using Qwen 3.6 27B and it’s amazing. Not exactly an Opus replacement, but great for small tasks and for checking work. But if you had 224 GB of VRAM, would it still be your choice? Or is there something you consider better in the 100B+ range (GPT-OSS, DeepSeek, etc.) that just isn’t talked about as much because fewer people can run it? I care more about intelligence than t/s.