llama.cpp releases

b9319

Mirrored from llama.cpp releases for archival readability. Support the source by reading on the original site.

ggml: gguf_init_from_callback and gguf_init_from_buffer (#22341)

  • ggml: implement gguf_init_from_buffer

  • test: gguf_init_from_buffer

  • fix: memory breakdown for a model loaded with no_alloc from a file is consistent with being loaded from a buffer

  • fix: use GGML_UNUSED

Co-authored-by: Copilot

  • fix: remove total_size from gguf_reader

  • fix: file offset calculation, rename offset to data_offset

  • refactor: extract model loader bug fixes to another PR

  • feat: add gguf_init_from_callback

  • fix: always require a max expected size

  • fix: change gguf_reader_callback_t's output type to void *, change max_expected_size and offsets to uint64_t

  • fix: harden against offset overflow in buffer read

  • fix: remove seek behavior from the callback

  • feat: max_chunk_read == 0 means SIZE_MAX

  • fix: seeking in a gguf file with no tensors
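Several of the fixes above concern reading from an untrusted byte source: offsets held as uint64_t, a required maximum expected size, a sequential callback with no seek, and max_chunk_read == 0 meaning no limit. The following is a minimal sketch of those ideas in isolation. Every name in it (sketch_read_cb_t, sketch_buffer_read, sketch_callback_read, sketch_mem_source, sketch_mem_cb) is hypothetical and chosen for illustration; it is not the ggml API, and the real gguf_init_from_buffer and gguf_init_from_callback signatures may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical sequential-read callback (not the ggml typedef): copy up to
// `len` bytes into `dst` and return how many bytes were copied. No seeking.
typedef size_t (*sketch_read_cb_t)(void * dst, size_t len, void * userdata);

// Bounds-checked read from an in-memory buffer. The sum `offset + len` is
// never computed, so a huge offset or length cannot wrap around.
static bool sketch_buffer_read(const uint8_t * buf, uint64_t buf_size,
                               uint64_t offset, void * dst, uint64_t len) {
    if (offset > buf_size || len > buf_size - offset) {
        return false;
    }
    // Both casts are safe here: offset and len are bounded by buf_size.
    memcpy(dst, buf + (size_t) offset, (size_t) len);
    return true;
}

// Read exactly `len` bytes through `cb`, asking for at most `max_chunk_read`
// bytes per call. max_chunk_read == 0 is treated as "no limit" (SIZE_MAX).
// `*consumed` is the running total, checked against `max_expected_size`.
static bool sketch_callback_read(sketch_read_cb_t cb, void * userdata,
                                 void * dst, size_t len, size_t max_chunk_read,
                                 uint64_t * consumed, uint64_t max_expected_size) {
    if (max_chunk_read == 0) {
        max_chunk_read = SIZE_MAX;
    }
    if (*consumed > max_expected_size || len > max_expected_size - *consumed) {
        return false; // would read past the declared maximum size
    }
    uint8_t * out = (uint8_t *) dst;
    size_t done = 0;
    while (done < len) {
        size_t want = len - done;
        if (want > max_chunk_read) {
            want = max_chunk_read;
        }
        size_t got = cb(out + done, want, userdata);
        if (got == 0 || got > want) {
            return false; // source ended early or the callback misbehaved
        }
        done += got;
    }
    *consumed += len;
    return true;
}

// Example source for the callback: a memory region with a read cursor.
struct sketch_mem_source {
    const uint8_t * data;
    uint64_t        size;
    uint64_t        pos;
};

static size_t sketch_mem_cb(void * dst, size_t len, void * userdata) {
    struct sketch_mem_source * src = (struct sketch_mem_source *) userdata;
    uint64_t left = src->size - src->pos;
    size_t   n    = len < left ? len : (size_t) left;
    if (!sketch_buffer_read(src->data, src->size, src->pos, dst, n)) {
        return 0;
    }
    src->pos += n;
    return n;
}
```

In this sketch the size limit is enforced by the reader rather than by the callback, so a callback that never signals end of input still cannot make the reader consume more than max_expected_size bytes.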

