r/LocalLLaMA · 1 min read

LFM2.5-Embedding-350M & LFM2.5-ColBERT-350M

LFM2.5-Embedding-350M is a dense bi-encoder for fast multilingual retrieval. It produces a single vector per document, which keeps the index small and fast to search, and it targets reliable cross-lingual search across 11 languages.

  • Best-in-class multilingual accuracy for a dense embedder of its size.
  • Inference speed is on par with much smaller models, thanks to the efficient LFM2 backbone.
  • You can use it as a drop-in replacement in your current RAG pipelines.

https://huggingface.co/LiquidAI/LFM2.5-Embedding-350M-GGUF
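
For anyone wondering what "a single vector per document" means at query time, here is a minimal sketch of dense retrieval. It does not load the model. The 3-dimensional vectors and the document IDs are made-up stand-ins for real embeddings, and the choice of cosine similarity is an assumption, so check the model card for the recommended metric.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted by descending similarity."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy vectors: one vector per document, as a dense bi-encoder stores them.
doc_vecs = {
    "doc_en": [0.9, 0.1, 0.0],
    "doc_fr": [0.1, 0.9, 0.1],
    "doc_de": [0.0, 0.2, 0.9],
}
# Hypothetical query embedding; in practice this comes from the same model.
query_vec = [0.8, 0.2, 0.1]

ranking = rank(query_vec, doc_vecs)
print([doc_id for doc_id, _ in ranking])
```

Because each document is one vector, the index grows with the number of documents rather than the number of tokens, which is where the small index comes from.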

LFM2.5-ColBERT-350M is a late interaction retriever with best-in-class multilingual performance. It stores one vector per token and matches queries to documents with MaxSim, so you can store documents in one language (for example, a product description in English) and retrieve them with queries in many other languages with high accuracy.

  • LFM2.5-ColBERT-350M offers best-in-class accuracy across 11 languages.
  • Inference speed is on par with much smaller models, thanks to the efficient LFM2 backbone.
  • You can use it as a drop-in replacement in your current RAG pipelines to improve performance.

https://huggingface.co/LiquidAI/LFM2.5-ColBERT-350M-GGUF
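
MaxSim is easy to state in code: for each query token, take its best similarity against any document token, then sum those maxima. The sketch below uses the standard ColBERT-style formulation with dot products. The 2-dimensional unit vectors and document IDs are hypothetical toy data, not output from this model.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim(query_tokens, doc_tokens):
    """Sum over query tokens of the highest similarity to any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

# Hypothetical unit-length token vectors: one vector per token, not per document.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]],
    "doc_b": [[0.6, 0.8], [0.8, 0.6]],
}

scores = {doc_id: maxsim(query_tokens, tokens) for doc_id, tokens in docs.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

The trade-off against the dense model is index size: storing one vector per token costs more space, in exchange for token-level matching.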

submitted by /u/pmttyji