llama.cpp releases

b9481

Mirrored from llama.cpp releases for archival readability. Support the source by reading on the original site.

model : support granite multilingual embeddings R2 (ibm-granite/granite-embedding-{97,311}m-multilingual-r2) (#22716)

  • Add support for the ibm-granite/granite-embedding-{97m,311m}-multilingual-r2 embedding models:

  • Added a version of the gpt4o tokenizer with a fixed regex (better handling of marks) and a different token-merging setting for the 97m model

  • Reused gemma4 tokenizer for the 311m model

  • granite-embedding-*-multilingual-r2 : add SwiGLU FFN support for Granite Embedding Multilingual R2

  • added new GGUF key .hidden_activation (LLM_KV_HIDDEN_ACT) + writer

  • added a forward declaration of llm_ffn_op_type to llama-hparams.h

  • added llm_ffn_op in hparams

  • added an LLM_FFN_NONE = 0 sentinel to llm_ffn_op_type (the result of value-initialization); modern-bert explicitly assigns LLM_FFN_GEGLU before reading the GGUF key, so its default is unchanged

  • centralized hidden_act mapping in llama-model.cpp, added llm_ffn_op_type_from_string() helper, mirroring rope_scaling_type/llama_rope_scaling_type_from_string()

  • modern-bert reads the GGUF key (when present) and uses the resulting op in its FFN graph

  • Added granite-embedding-{97m,311m}-multilingual-r2 to the converter code

  • Added the hashes for the granite embedding multilingual R2 models

  • Set the hidden_activation in the GGUF if the field is present in config.json (such as for the granite embedding models)
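
The FFN-op changes listed above follow a common pattern: a zero-valued sentinel, a string-to-enum helper, and an architecture default that is overridden only when the key is present. The following is a minimal sketch of that pattern, not the code from the PR. Everything in the `sketch` namespace is hypothetical: the trimmed enum, the `hparams` struct, the `load_ffn_op` function, and in particular the activation strings `"gelu"` and `"silu"`, which are assumed here because the notes do not say which values the hidden activation key holds. Falling back to the default on an unrecognized string is also an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {

// Trimmed, illustrative stand-in for llm_ffn_op_type.
enum ffn_op_type {
    FFN_NONE = 0,  // sentinel: the value a value-initialized field holds
    FFN_GEGLU,
    FFN_SWIGLU,
};

// Illustrative stand-in for the hparams field that stores the FFN op.
struct hparams {
    ffn_op_type ffn_op = FFN_NONE;
};

// Map an activation name to an FFN op; unknown names yield the sentinel.
// The string values are assumptions, not taken from the release notes.
inline ffn_op_type ffn_op_type_from_string(const std::string & name) {
    static const std::map<std::string, ffn_op_type> table = {
        { "gelu", FFN_GEGLU  },
        { "silu", FFN_SWIGLU },
    };
    const auto it = table.find(name);
    return it == table.end() ? FFN_NONE : it->second;
}

// hidden_act is nullptr when the key is absent from the model file.
inline void load_ffn_op(hparams & hp, const std::string * hidden_act) {
    hp.ffn_op = FFN_GEGLU;  // architecture default, assigned before the read
    if (hidden_act != nullptr) {
        const ffn_op_type parsed = ffn_op_type_from_string(*hidden_act);
        if (parsed != FFN_NONE) {
            hp.ffn_op = parsed;  // override only on a recognized value
        }
    }
    assert(hp.ffn_op != FFN_NONE);  // graph code never sees the sentinel
}

} // namespace sketch
```

With this shape, a model file that lacks the key keeps the GEGLU default, while one that carries a recognized value switches the FFN op without any per-model branching in the graph code.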

openEuler:

  • DISABLED
  • openEuler x86 (310p)
  • openEuler x86 (910b, ACL Graph)
  • openEuler aarch64 (310p)
  • openEuler aarch64 (910b, ACL Graph)
