llama.cpp releases

454 articles archived

llama.cpp releases dev-tools 11d ago

b9701

mtmd: refactor preprocessor, add mtmd_image_preproc_out (#24736): add mtmd_image_preproc_out; add dev docs; remove unused clip API; remove unused clip_image_f32_batch::grid; change preprocess() call signature. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI…

15
llama.cpp releases dev-tools 11d ago

b9700

[SYCL] rename GGML_SYCL_SUPPORT_LEVEL_ZERO (#24719): rename GGML_SYCL_SUPPORT_LEVEL_ZERO to GGML_SYCL_SUPPORT_LEVEL_ZERO_API, and GGML_SYCL_ENABLE_LEVEL_ZERO to GGML_SYCL_USE_LEVEL_ZERO_API; fix code format; fix error introduced during rebase. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple…

31
llama.cpp releases dev-tools 11d ago

b9699

sycl : support MUL_MAT and OUT_PROD with Q1_0 ( #24721 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…

31
llama.cpp releases dev-tools 11d ago

b9698

app : enable self-update only when built with llama-install.sh (#24754). Signed-off-by: Adrien Gallouët. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU)…

34
llama.cpp releases dev-tools 11d ago

b9697

ci : fix check-release message parsing ( #24751 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…

25
llama.cpp releases dev-tools 12d ago

b9694

ci : fix Windows x64 (OpenVINO) release link ( #24731 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…

28
llama.cpp releases dev-tools 12d ago

b9693

metal : check for BF16 support in concat kernel ( #24747 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu…

16
llama.cpp releases dev-tools 12d ago

b9692

mtmd: llava_uhd should no longer use batch dim ( #24732 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu…

32
llama.cpp releases dev-tools 12d ago

b9691

ggml-cpu: Conditionally enable power11 backend based on compiler support (#24687). Guard POWER11 backend creation behind a compiler flag check for -mcpu=power11. This avoids build failures on current GCC/Clang…

14
llama.cpp releases dev-tools 12d ago

b9690

metal : implement rope_back operator (#24725). Reuse existing rope kernels with a function constant to toggle forward/backward rotation, avoiding duplicate kernel code. Assisted-by: pi:llama.cpp/Qwen3.6-27B. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64,…

27
llama.cpp releases dev-tools 12d ago

b9689

metal : add f16 and bf16 support for concat operator (#24724). Extend the Metal backend concat operator to support f16 and bf16 tensor types in addition to the existing f32 and i32 support. Template kernel_concat on type T…

34
llama.cpp releases dev-tools 12d ago

b9688

server: (router) add model management API (#23976): wip; server: (router) add SSE realtime updates API; nits; wip; add download API; add download api; update docs; add delete endpoint; fix std::terminate; fix crash; fix 2; add tests; nits. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple…

17
llama.cpp releases dev-tools 12d ago

b9687

llama : skip main_gpu validation when no devices are available ( #23405 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64…

11
llama.cpp releases dev-tools 12d ago

b9686

spec: fix segfault error on long prompts for eagle3 ( #24707 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu…

17
llama.cpp releases dev-tools 12d ago

b9685

[SYCL] add dev2dev memcpy by SYCL API (#24476): add dev2dev memcpy by SYCL API; move GGML_SYCL_DEV2DEV_MEMCPY to runtime table; update the detection method for p2p comm; fix the error created while fixing a conflict. Co-authored-by: Neo Zhang. macOS/iOS: macOS Apple Silicon (arm64) macOS…

33
llama.cpp releases dev-tools 12d ago

b9684

[SYCL] Add conv_3d (#24691): add conv_3d; optimize; update ops.md; restore test script; remove unused code; remove copyright notes. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu…

15
llama.cpp releases dev-tools 12d ago

b9682

vulkan: record actual memory properties during buffer creation ( #24326 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64…

9
llama.cpp releases dev-tools 12d ago

b9678

opencl: optimize mul_mat_f16_f32_l4 for decode ( #24504 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu…

4
llama.cpp releases dev-tools 12d ago

b9677

common: update logging to enforce max_capacity and optimize queue resizing (#24490): common: update logging to enforce max_capacity and optimize queue resizing logic; common/log: remove queue expansion logic. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64,…

35
llama.cpp releases dev-tools 12d ago

b9675

sycl : Enable to support fp16 by OPs: SQR, SQRT, LOG, SIN, COS, CLAMP ( #24692 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu…

33
llama.cpp releases dev-tools 12d ago

b9674

SYCL: fix use-after-free bug with async memcpy in MoE prefill (#24676): SYCL: fix a bug with async memcpy; make mmid_row_mapping_host persistent; comment on stream->wait; apply suggestions from @sanmai. macOS/iOS: macOS…

34
llama.cpp releases dev-tools 13d ago

b9680: ci: fix vulkan docker images (#24595)

Update vulkan-shaders-gen.cpp; add comment describing code change intention; fix potential UB.

24
llama.cpp releases dev-tools 13d ago

b9673

sycl: Add optional USM system allocations (#22526). This introduces an optional feature to allocate large GPU buffers (≥ 1GB) using USM system allocations if supported by the device. It allows using buffers from the system allocator then letting the system manage memory…

18
llama.cpp releases dev-tools 13d ago

b9672

vendor : update BoringSSL to 0.20260616.0 ( #24693 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…

31
llama.cpp releases dev-tools 13d ago

b9670

Fix and restrict NVFP4 edge-cases in llama-graph (#24331). Move post-GEMM MUL required for dequant before LoRA and bias add; see #23484: for LoRA, I would presume we want fully dequantized values before doing the residuals, but this depends on how the LoRAs were generated.…

26
llama.cpp releases dev-tools 13d ago

b9669

spec: add backend sampling support for eagle3 ( #24655 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…

27
llama.cpp releases dev-tools 13d ago

b9668

vulkan: prefer host-visible memory buffers on UMA devices ( #22930 ) implement UMA host-visible memory update based on 0cc4m's suggestion macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu…

37
llama.cpp releases dev-tools 13d ago

b9667

vulkan: Support gated_delta_net with S_v=16 ( #24581 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…

38
llama.cpp releases dev-tools 13d ago

b9665

bench : add --offline (#24511): add default. Signed-off-by: Adrien Gallouët. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon…

29
llama.cpp releases dev-tools 14d ago

b9663

[SYCL] Support OP EXPM1, support all UT cases of FLOOR, TRUNC, ROUND (#24363): support OP EXPM1, support all UT cases of FLOOR, TRUNC, ROUND; fix conflict; rebase, support new UT case of repeat, concat. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI…

17
llama.cpp releases dev-tools 14d ago

b9664: sycl: support reordered Q4_K/Q5_K/Q6_K MoE MUL_MAT_ID (#24452)

sycl: support reordered Q4_K and Q5_K MoE MUL_MAT_ID. Extend reordered-weight handling to fused MoE MUL_MAT_ID for Q4_K and Q5_K expert tensors and add Q5_K reordered DMMV coverage. Unsupported 3D reorder cases now fall back instead of aborting. sycl: extend MoE reorder to Q6_K…

21
llama.cpp releases dev-tools 14d ago

b9661

vulkan: add col2im_1d op (#24425): vulkan: add GGML_OP_COL2IM_1D, follow-up to the CPU op; vulkan: col2im_1d bounded gather loop instead of full-K scan with modulo; vulkan: col2im_1d address review from @jeffbolznv; vulkan: col2im_1d return nullptr for unsupported types, address…

20
llama.cpp releases dev-tools 14d ago

b9660

chat : fix LFM2 tool-call parsing double-escaping (#24667): add escape test cases; chat : fix LFM2 tool-call parsing double-escaping. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64…

20
llama.cpp releases dev-tools 14d ago

b9659

mtmd: fix miscounting n_tokens ( #24656 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan)…

20
llama.cpp releases dev-tools 14d ago

b9658

chat: include full unparsed prompt in debug message on parse error (#24650). macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64…

5
llama.cpp releases dev-tools 14d ago

b9656

chat: harden peg-native tool call parsing (#24329). Accept an optional leading type: function field in build_json_tools_flat_keys so OpenAI-style tool calls parse on templates whose serialization opens on the name field. Return a clean…

27
llama.cpp releases dev-tools 14d ago

b9655

chat: fix an "oldie but goodie" grammar generator bug that surfaced during last changes (#24653): update erroneous case in PEG parser test. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple…

38
llama.cpp releases dev-tools 14d ago

b9654

mtmd : add post-decode callback ( #24645 ) Assisted-by: pi:llama.cpp/Qwen3.6-27B macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu…

14
llama.cpp releases dev-tools 14d ago

b9653

vulkan: support more CONCAT types ( #24579 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan)…

35
llama.cpp releases dev-tools 14d ago

b9652

wasm : fix fallback symbol collision ( #24639 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan)…

4
llama.cpp releases dev-tools 14d ago

b9651

SYCL: use native subgroup size for K-quant DMMV ( #21700 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu…

15
llama.cpp releases dev-tools 14d ago

b9650

sycl: fix soft_max_f32 max reduction ( #24451 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan)…

18
llama.cpp releases dev-tools 14d ago

b9649

sycl : fix reorder function; add fp32/fp16 in build script ( #24578 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan)…

31
llama.cpp releases dev-tools 14d ago

b9647

[SYCL] add to support pool_1d, move pool_1d/2d code to pool.cpp/hpp (#24584): update ops.md. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS…

25
llama.cpp releases dev-tools 14d ago

b9646

[SYCL]: Remove per-allocation Level Zero runtime checks (#23399): [SYCL] Centralize Level Zero detection in ggml_sycl_init; use the same wording; get back the warning; [SYCL] Remove per-allocation getenv() for GGML_SYCL_ENABLE_LEVEL_ZERO; bring back the comment; move it up to make…

10
llama.cpp releases dev-tools 14d ago

b9645

metal : add repeat bf16 ( #24638 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan) Ubuntu x64…

36
llama.cpp releases dev-tools 14d ago

b9644

chat: fix whitespace problems once and for all (#24624): purge trailing spaces from grammar generation; revert "Purge trailing spaces from grammar generation" (this reverts commit b0827ec). macOS/iOS: macOS Apple Silicon (arm64)…

30
llama.cpp releases dev-tools 14d ago

b9642

CUDA: only support F32/F16 for GGML_OP_REPEAT ( #24533 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…

33
llama.cpp releases dev-tools 15d ago

b9641

ggml-webgpu: improve i-quants mul_mat performance and speed up prefil…

23
llama.cpp releases dev-tools 15d ago

b9637

chat: add dedicated Cohere2MoE (North Code) parser (#24615). Some renames to make @CISC happy. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework…

25

Page 3 of 10 · 454 articles

Prismix · © 2026 · AI Hub

All product names and logos are trademarks of their respective owners.
