r/LocalLLaMA · 1 min read

Looking for Suggestions — Single 5090 & 64gb DDR5

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Hi Reddit,

I'm planning to run Qwen 3.6 27b NVFP4 via vLLM on my 5090, but I was wondering whether something like 35b a3b at Q8 on Llama would produce better results for agentic coding and make use of the system memory. My research says no, but if that's the case, what would y'all do to utilize the system memory?

submitted by /u/icedgz