
Qwen 3.6-27B on vLLM with dual RTX 3090s: looking for launch parameters

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Hi everyone. Please share your working launch commands for running Qwen 3.6-27B via vLLM on dual RTX 3090s (each card running at PCIe 4.0 x8). I'm interested in setups both with and without an NVLink bridge.

I'm familiar with the club-3090 repo, but its ready-to-use vLLM recipes focus on 4-bit models. With 48GB of total VRAM, I'd rather not compress the model that much. I want to use a larger quant to retain as much generation quality as possible.
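For context on the VRAM budget, here is a rough back-of-the-envelope sketch of the weight-only footprint at a few bit widths. It assumes exactly 27 billion parameters (taken from the "27B" in the model name) and 24 GB nominal per RTX 3090. It ignores KV cache, activations, CUDA overhead, and quantization metadata such as scales, so the real usable headroom will be smaller than these figures.

```python
# Rough weight-only footprint for a 27B-parameter model at several bit widths.
# Assumptions (not from any measurement): exactly 27e9 parameters, 24 GB per card.
PARAMS = 27e9
TOTAL_VRAM_GB = 48.0  # two RTX 3090s, 24 GB nominal each


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Weight storage in GB (10^9 bytes), ignoring quantization metadata."""
    return params * bits_per_weight / 8 / 1e9


def headroom_gb(bits_per_weight: float) -> float:
    """VRAM left across both cards after weights; negative means it does not fit."""
    return TOTAL_VRAM_GB - weight_gb(PARAMS, bits_per_weight)


if __name__ == "__main__":
    for bits in (4, 8, 16):
        print(
            f"{bits:>2}-bit: weights ~{weight_gb(PARAMS, bits):.1f} GB, "
            f"headroom ~{headroom_gb(bits):.1f} GB"
        )
```

Under these assumptions, 8-bit weights take roughly 27 GB of the 48 GB, while unquantized 16-bit weights (about 54 GB) would not fit at all. That is why the question is about quants between 4-bit and 16-bit.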

Questions for anyone running this model on similar hardware:

  1. Which specific quantization of Qwen 3.6-27B are you using?
  2. What exact commands/parameters are you using to launch vLLM?

I'd appreciate any configs or launch advice you can share.

submitted by /u/xspider2000
