r/LocalLLaMA · · 1 min read

vLLM has a new streaming parser for Qwen3+ available in nightly

The new parser reportedly fixes two issues many people were seeing with Qwen3.6-27b: the model stopping mid-turn, and streaming tool calls failing because of where chunk boundaries fell.

The mid-turn stopping is especially annoying when using the model for agentic workflows. I haven't seen it happen in the limited testing I've done this evening. Fingers crossed that it's gone for good!

submitted by /u/rmhubbert