llama.cpp releases · May 21, 2026 · 1 min read

b9255

Mirrored from llama.cpp releases for archival readability. Support the source by reading on the original site.

hexagon: HMX quantized matmul rework (#23368)

hmx-mm: update debug logging in hmx-mm
hmx-mm: update dequant logic to use HVX_vector_x2/4
hmx-mm: remove non-pipelined version of the quantize matmul

It seems that we don't really need the non-pipelined version.

hmx-mm: use activation depth mode and update naming

Co-authored-by: Kim-Chyan Gan

hex-mm: minor hmx matmul naming updates
hmx-mm: remove unused vars
snapdragon: scripts bump default ubatch-size to 1K
hexagon: combine HMX, power, and clock settings into a single set_power call
hmx-mm: remove leftover of the scale repl helper
hexagon: fix editconf error

Co-authored-by: Kim-Chyan Gan

macOS/iOS:

Linux:

Android:

Android arm64 (CPU)

Windows:

openEuler:
