Running Qwen 3.6 35B A3B on 2x 5060 Ti
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I'm running Qwen 3.6 35B A3B on two 5060 Ti 16 GB cards (32 GB VRAM total; I also have 32 GB of DRAM, but I don't like offloading). I used Q4 in LM Studio to fit the full context, and I get 90 t/s. Any tricks to optimize this further so I can move up to Q6 or Q8?
Thanks!
Another thing: can you recommend something for cooling? I'm using two stacked GPUs with zero gap between them (I have an mATX motherboard). Right now the second GPU isn't that hot, but it is hotter than the bottom one.
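For reference, here is a rough back-of-the-envelope check of whether Q6 or Q8 weights would fit in 32 GB. It is only a sketch: the bits-per-weight figures are approximate values for llama.cpp-style quants (assumed, not measured for this model), the parameter count is taken from the "35B" in the model name, and KV cache, context buffers, and per-GPU overhead are ignored.

```python
# Rough weight-size estimate. All constants are assumptions for illustration.
PARAMS = 35e9   # total parameters, taken from the "35B" in the model name
VRAM_GB = 32    # 2 x 16 GB cards, nominal (usable VRAM is somewhat lower)

# Approximate average bits per weight for common GGUF quant types (assumed).
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q6_K": 6.56, "Q8_0": 8.5}


def weight_gb(params, bits_per_weight):
    """Size of the weights alone, in GB (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def estimate(params=PARAMS, vram_gb=VRAM_GB):
    """Return {quant: (weights_gb, vram_left_gb)} for each quant type."""
    result = {}
    for name, bpw in BITS_PER_WEIGHT.items():
        size = weight_gb(params, bpw)
        result[name] = (size, vram_gb - size)
    return result


if __name__ == "__main__":
    for name, (size, left) in estimate().items():
        verdict = "fits" if left > 0 else "does not fit"
        print(f"{name}: ~{size:.1f} GB weights, {left:+.1f} GB left ({verdict})")
```

Under these assumptions, Q6 weights come to roughly 29 GB, which leaves only a few GB for context, and Q8 weights come to roughly 37 GB, which exceeds 32 GB before any context is allocated.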