vLLM releases

27 articles archived

vLLM releases dev-tools 3d ago

v0.24.0

[CI] Raise gsm8k startup timeout for MoE Refactor Qwen3 NVFP4 configs…

vLLM releases dev-tools 4d ago

v0.24.0rc2: Fix P/D with DP Supervisor (#46628)

Signed-off-by: Robert Shaw (cherry picked from commit c5e3c40)

vLLM releases dev-tools 5d ago

v0.24.0rc1: [CI/Build] Fix topk histogram build on SM75 (#46550)

Signed-off-by: Mohammad Miadh Angkad (cherry picked from commit 191826e)

vLLM releases dev-tools 15d ago

v0.23.1rc0: [Bugfix][CI] Update Dockerfile dependency graph PNG (#45602)

Signed-off-by: sfeng33

vLLM releases dev-tools 18d ago

v0.23.0rc2: [Docker] Fix CUTLASS DSL cu13 install order in Dockerfile (#45204)

Signed-off-by: Mohammad Miadh Angkad (cherry picked from commit 40e065e)

vLLM releases dev-tools 18d ago

v0.23.0: [Docker] Fix CUTLASS DSL cu13 install order in Dockerfile (#45204)

Signed-off-by: Mohammad Miadh Angkad (cherry picked from commit 40e065e)

vLLM releases dev-tools 26d ago

v0.22.1rc2: fix: resolve CUTLASS fmin compatibility for DeepSeek-V4 init

Signed-off-by: khluu

vLLM releases dev-tools 26d ago

v0.22.1: fix: resolve CUTLASS fmin compatibility for DeepSeek-V4 init

Signed-off-by: khluu

vLLM releases dev-tools 27d ago

v0.22.1rc1: [docker] Stop using extra-index-url for flashinfer-jit-cache (#44366)

Signed-off-by: Kevin H. Luu

vLLM releases dev-tools 1mo ago

v0.22.1rc0: [CI] Make Model Executor test hangs fail fast with a traceback (#43971)

Signed-off-by: khluu; Co-authored-by: Claude

vLLM releases dev-tools 1mo ago

v0.22.0rc3: [BugFix] Fix hard-coded timeout for multi-API-server startup (#43768)

Signed-off-by: Vadim Gimpelson; Co-authored-by: Nick Hill

vLLM releases dev-tools 1mo ago

v0.22.0: [BugFix] Fix hard-coded timeout for multi-API-server startup (#43768)

Signed-off-by: Vadim Gimpelson; Co-authored-by: Nick Hill

vLLM releases dev-tools 1mo ago

v0.22.0rc2: Fix early CUDA init (#43791)

Signed-off-by: Harry Mellor (cherry picked from commit 41688e2)

vLLM releases dev-tools 1mo ago

v0.22.0rc1: [MRV2][BugFix] Fix KV connector handling in spec decode case (#43719)

Signed-off-by: Nick Hill; Co-authored-by: Wentao Ye (cherry picked from commit 8c94938)

vLLM releases dev-tools 1mo ago

v0.21.1rc0: [ROCm][CI] Stage B gating (#42025)

Signed-off-by: Andreas Karatzas

vLLM releases dev-tools 1mo ago

v0.21.0

Highlights: This release features 367 commits from 202 contributors (49 new)! Transformers v4 deprecated: This release formally deprecates transformers v4 support (#40389). Users should migrate to transformers v5. C++20 build requirement: vLLM now requires a C++20-compatible…

vLLM releases dev-tools 1mo ago

v0.21.0rc3

[MLA Attention Backend] Add TOKENSPEED_MLA backend for DSR1/Kimi K25 …

vLLM releases dev-tools 1mo ago

v0.21.0rc2

[Bugfix] Install nvidia-cutlass-dsl[cu13] extra on CUDA 13 platforms …

vLLM releases dev-tools 1mo ago

v0.21.0rc1

[Build] Build bundled DeepGEMM _C per-Python so the wheel imports o…

vLLM releases dev-tools 1mo ago

v0.20.2

vLLM v0.20.2 Highlights: This release features 6 commits from 6 contributors (0 new)! This is a small patch release with bug fixes for DeepSeek V4, gpt-oss, and Qwen3-VL. Bug Fixes: DeepSeek V4 sparse attention: Re-enable the persistent topk path on Hopper and ensure the memset…

vLLM releases dev-tools 1mo ago

v0.20.1

vLLM v0.20.1: This is a patch release on top of v0.20.0 primarily focused on DeepSeek V4 stabilization and performance improvements, along with several important bug fixes. DeepSeek V4: Base model support (#41006). Multi-stream pre-attention GEMM (#41061), configurable…

vLLM releases dev-tools 1mo ago

v0.20.2rc0: [MRV2] Add shutdown() method (#41297)

Signed-off-by: Woosuk Kwon

vLLM releases dev-tools 2mo ago

v0.20.0

vLLM v0.20.0 Highlights: This release features 752 commits from 320 contributors (123 new)! DeepSeek V4: Initial DeepSeek V4 support landed (#40860), with DSML token-leakage fix in DSV4/3.2 (#40806), DSA + MTP IMA fix (#40772), and a silu clamp limit on the shared expert (…

vLLM releases dev-tools 2mo ago

v0.20.1rc0: Add system_fingerprint field to OpenAI-compatible API responses (#40537)

Co-authored-by: Claude

vLLM releases dev-tools 2mo ago

v0.20.0rc1

Revert "[Misc] Move pyav and soundfile to common requirements" (#…

vLLM releases dev-tools 2mo ago

v0.19.1

This is a patch release on top of v0.19.0 with a Transformers v5.5.3 upgrade and bug fixes for Gemma4: Update to transformers v5 (#30566); [Bugfix] Fix invalid JSON in Gemma 4 streaming tool calls by stripping partial delimiters (#38992); [Bugfix][Frontend] Fix Gemma4 streaming…

vLLM releases dev-tools 2mo ago

v0.19.2rc0: [Bugfix] Fix k_proj's bias for GLM-ASR (#40160)

Signed-off-by: Rishapveer Singh
