r/LocalLLaMA · · 1 min read

1000 tps generation on Qwen3.6 27B with V100s

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

I wanted to see what the absolute best-case scenario for generation on this setup was, and I was not disappointed. 128 concurrent requests is far removed from what I need, but it's funny to see a big number. For a single user (batch 1, not 128), generation is around 80 t/s with 3000 t/s prompt processing, no MTP!!
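To put the two numbers side by side, here is a rough back-of-the-envelope sketch. It assumes the 1000 t/s figure is the aggregate across all 128 concurrent requests and that the requests share it evenly; the post does not state either explicitly, and the function names are just illustrative.

```python
# Figures quoted in the post. Treating 1000 t/s as the aggregate over
# all 128 requests, shared evenly, is an assumption.
AGGREGATE_TPS = 1000.0   # total generation throughput at 128 concurrent requests
CONCURRENCY = 128        # number of concurrent requests
SINGLE_USER_TPS = 80.0   # generation speed at batch 1


def per_request_tps(aggregate_tps: float, concurrency: int) -> float:
    """Average generation speed each request sees under even sharing."""
    return aggregate_tps / concurrency


def batching_gain(aggregate_tps: float, single_user_tps: float) -> float:
    """How many times more total tokens per second batching yields vs. batch 1."""
    return aggregate_tps / single_user_tps


per_req = per_request_tps(AGGREGATE_TPS, CONCURRENCY)
gain = batching_gain(AGGREGATE_TPS, SINGLE_USER_TPS)

print(f"per-request speed at {CONCURRENCY} concurrent: {per_req:.2f} t/s")
print(f"total throughput gain over batch 1: {gain:.1f}x")
```

Under those assumptions each request would crawl along at under 8 t/s, which is why the big number is fun to look at but not what a single user would actually want.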

submitted by /u/Simple_Library_2700