llama.cpp releases · June 1, 2026

b9464

Mirrored from llama.cpp releases for archival readability. Support the source by reading on the original site.

speculative : fix n_outputs_max and remove draft-simple auto-enable (#23988)

Extract the speculative max-draft-size logic from server_n_outputs_max
into a reusable common_speculative_n_max() function in common/speculative.
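
As a rough illustration of what such an extraction might look like, the sketch below separates "what is the largest draft size" from "how many outputs the server must reserve". Everything here is hypothetical: the `sketch_*` names, the config struct, and the `1 + n_max` outputs-per-slot formula are assumptions made for illustration, not the actual signature or logic of common_speculative_n_max() or server_n_outputs_max.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one speculative configuration (not a llama.cpp type).
struct sketch_spec_config {
    bool    enabled; // whether this speculative mode is active
    int32_t n_max;   // maximum number of draft tokens it may propose per step
};

// Hypothetical shared helper: the largest draft size among the enabled configurations.
// Returns 0 when nothing is enabled.
inline int32_t sketch_speculative_n_max(const std::vector<sketch_spec_config> & configs) {
    int32_t n_max = 0;
    for (const auto & c : configs) {
        if (c.enabled) {
            n_max = std::max(n_max, c.n_max);
        }
    }
    return n_max;
}

// Hypothetical caller: assumes each slot needs one output for the sampled token
// plus one per drafted token that has to be verified.
inline int32_t sketch_n_outputs_max(int32_t n_slots, const std::vector<sketch_spec_config> & configs) {
    return n_slots * (1 + sketch_speculative_n_max(configs));
}
```

With the draft-size computation in one place, any caller that needs the bound can reuse it instead of duplicating the loop inside server code.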

Assisted-by: llama.cpp:local pi

