r/LocalLLaMA · May 30, 2026 · 1 min read

125 tok/s for Qwen3.6 q4xl on 2x 4060ti is insane perf/dollar

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Under $1000 for 32 GB of VRAM on 2023 hardware, at ~300 W draw... and this thing is outperforming the latest pick-your-vendor $5k mini PCs from 2026.

So... next question: can I squeeze 150 t/s out of the same q4xl on CUDA 13.3 this weekend? Has anyone tried it yet?
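
For anyone who wants the perf/dollar claim as actual numbers, here is a rough back-of-envelope sketch in Python. It only uses the figures quoted in the post (125 tok/s, under $1000, ~300 W, a $5k mini PC, the 150 t/s target). The price and power values are treated as round upper bounds rather than measurements, and the function and variable names are made up for illustration. The post gives no throughput figure for the mini PCs, so the script only computes what one would have to hit to match on perf/dollar.

```python
# Back-of-envelope perf/dollar and perf/watt math from the post's figures.
# Assumptions: $1000 and 300 W are the post's rough numbers ("under $1000",
# "~300 watts"), not measured values. All names here are illustrative.

def perf_per_dollar(tok_per_s: float, price_usd: float) -> float:
    """Tokens per second bought by each dollar of hardware."""
    return tok_per_s / price_usd


def joules_per_token(watts: float, tok_per_s: float) -> float:
    """Energy cost of one generated token (W = J/s, so W / (tok/s) = J/tok)."""
    return watts / tok_per_s


def break_even_tok_per_s(ref_tok_per_s: float, ref_price_usd: float,
                         other_price_usd: float) -> float:
    """Throughput a pricier box needs to match the reference perf/dollar."""
    return ref_tok_per_s * other_price_usd / ref_price_usd


RIG_TOK_S = 125.0       # reported: Qwen3.6 q4xl on 2x 4060 Ti
RIG_PRICE = 1000.0      # "under $1000", so this is an upper bound
RIG_WATTS = 300.0       # "~300 watts draw"
MINI_PC_PRICE = 5000.0  # the "$5k mini pcs" the post compares against
TARGET_TOK_S = 150.0    # the weekend goal

rig_ppd = perf_per_dollar(RIG_TOK_S, RIG_PRICE)
rig_jpt = joules_per_token(RIG_WATTS, RIG_TOK_S)
mini_needed = break_even_tok_per_s(RIG_TOK_S, RIG_PRICE, MINI_PC_PRICE)
speedup_needed = TARGET_TOK_S / RIG_TOK_S

print(f"rig: {rig_ppd:.3f} tok/s per dollar, {rig_jpt:.2f} J per token")
print(f"a ${MINI_PC_PRICE:.0f} box needs {mini_needed:.0f} tok/s to match perf/dollar")
print(f"150 t/s target is a {100 * (speedup_needed - 1):.0f}% speedup over 125 t/s")
```

So on these numbers a $5k box would need about 5x the throughput just to tie on perf/dollar, and the 150 t/s goal means finding roughly 20% more from the same hardware and quant.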
