
Running Qwen 3.6 35B A3B on 2x 5060 Ti

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

I'm running Qwen 3.6 35B A3B on two 5060 Ti 16 GB cards (32 GB VRAM total; I also have 32 GB of system RAM, but I don't like offloading). I used Q4 in LM Studio so the full context fits, and I get 90 t/s. Are there any tricks to optimize this further so I can move up to Q6 or Q8?
Thanks!

Another thing: can you recommend something for cooling? I'm using two stacked GPUs with zero gap between them (I have an mATX motherboard). Right now the second GPU isn't that hot, but it is hotter than the bottom one.
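
For the Q6/Q8 question, a back-of-the-envelope estimate of weight memory may help frame what fits in 32 GB. This is only a rough sketch: it assumes a nominal 35 billion parameters (taken from the "35B" in the model name) and approximate bits-per-weight figures for common GGUF quant types. Real file sizes vary by quant recipe, and KV cache and runtime overhead are not counted.

```python
# Rough weight-memory estimate. Every constant here is an assumption,
# not a measurement of the actual model files.
PARAMS = 35e9    # nominal parameter count, from the "35B" in the model name
VRAM_GB = 32.0   # 2 x 16 GB cards

# Approximate bits per weight for common GGUF quant types (assumed values).
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q6_K": 6.56, "Q8_0": 8.5}


def weights_gb(params: float, bpw: float) -> float:
    """Size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params * bpw / 8 / 1e9


def headroom_gb(quant: str) -> float:
    """VRAM left for KV cache and overhead after loading the weights."""
    return VRAM_GB - weights_gb(PARAMS, BITS_PER_WEIGHT[quant])


for quant, bpw in BITS_PER_WEIGHT.items():
    print(f"{quant}: weights ~{weights_gb(PARAMS, bpw):.1f} GB, "
          f"headroom {headroom_gb(quant):+.1f} GB")
```

Under these assumptions, Q8 weights alone would exceed 32 GB, and Q6 would leave only a few GB for context, so full context at Q6 would likely need a smaller or quantized KV cache, or some offloading.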

submitted by /u/chocofoxy