vLLM PR adding native HIP W4A16 kernel was merged
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
The performance increase introduced by the PR is awesome. Makes my ROCm rig a lot more useful. Numbers from the PR:
EDIT: The numbers are for Qwen3.6-27B-GPTQ-W4A16-G32. See more here: PR link