b9310

Mirrored from llama.cpp releases for archival readability. Support the source by reading on the original site.

server: fix checkpoints creation (#22929)

  • common : add common_chat_split_by_role

  • cont : fix spans to reach end of message

  • server: fix checkpoints creation

      • extract message_spans from chat templates
      • find the prompt token position before the latest user message
      • split prompt batching at that position
      • create a context checkpoint before the latest user input
      • avoid periodic mid-prompt checkpoints when that position is known
      • handle multimodal prompts when mapping text/template positions to server prompt tokens
      • add --checkpoint-min-step to control minimum spacing between checkpoints

  • cont : clean-up

  • Support autoparser detection for message barriers

  • server: fix message span delimiter and update docs


Co-authored-by: Alde Rojas
Co-authored-by: Georgi Gerganov
Co-authored-by: Piotr Wilkin
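
The checkpoint placement described in the commit message can be illustrated with a small sketch. This is not the server's code: it is written in Python rather than the project's C++, and `MessageSpan`, `latest_user_pos`, `plan_batches`, and `plan_checkpoints` are hypothetical names. It assumes message spans are already expressed as prompt token ranges, so the multimodal mapping is out of scope. It also assumes a minimum-step rule in which a checkpoint is skipped when it falls closer than `min_step` tokens to the previous one; the actual semantics of `--checkpoint-min-step` may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MessageSpan:
    """Hypothetical span of one chat message, in prompt token positions."""
    role: str   # e.g. "system", "user", "assistant"
    begin: int  # first token of the message (inclusive)
    end: int    # one past the last token of the message (exclusive)


def latest_user_pos(spans: List[MessageSpan]) -> Optional[int]:
    """Return the token position just before the latest user message, if any."""
    for span in reversed(spans):
        if span.role == "user":
            return span.begin
    return None


def plan_batches(n_prompt: int, n_batch: int,
                 split_pos: Optional[int]) -> List[Tuple[int, int]]:
    """Cut [0, n_prompt) into batches of at most n_batch tokens.

    When split_pos is known, a batch boundary is forced at that position.
    """
    cuts = [0, n_prompt]
    if split_pos is not None and 0 < split_pos < n_prompt:
        cuts.insert(1, split_pos)
    batches = []
    for lo, hi in zip(cuts, cuts[1:]):
        pos = lo
        while pos < hi:
            nxt = min(pos + n_batch, hi)
            batches.append((pos, nxt))
            pos = nxt
    return batches


def plan_checkpoints(batches: List[Tuple[int, int]],
                     split_pos: Optional[int],
                     min_step: int) -> List[int]:
    """Choose the batch-end positions at which to create a checkpoint.

    Assumption: if split_pos is known, it is the only candidate, so no
    periodic mid-prompt checkpoints are made. Otherwise every batch end is
    a candidate. Candidates closer than min_step to the previous checkpoint
    are skipped.
    """
    checkpoints: List[int] = []
    last: Optional[int] = None
    for _, end in batches:
        wanted = (end == split_pos) if split_pos is not None else True
        if wanted and (last is None or end - last >= min_step):
            checkpoints.append(end)
            last = end
    return checkpoints


# Usage: a four-message conversation rendered to 130 prompt tokens.
spans = [
    MessageSpan("system", 0, 10),
    MessageSpan("user", 10, 40),
    MessageSpan("assistant", 40, 100),
    MessageSpan("user", 100, 130),
]
split = latest_user_pos(spans)
batches = plan_batches(n_prompt=130, n_batch=64, split_pos=split)
checkpoints = plan_checkpoints(batches, split, min_step=32)
```

With these inputs, the forced boundary produces three batches, and the only checkpoint lands at the start of the latest user message. The previous turns can then be reused when that message is edited or regenerated.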

Release assets: macOS/iOS, Linux, Android, Windows, openEuler, UI
