
Cognitor: open-source semantic search engine. Automatically chunks, embeds and indexes the content of a target folder, making it searchable semantically.

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

https://github.com/tanaos/cognitor

Cognitor is an open-source semantic search engine and vector database which automatically chunks, embeds and indexes the entire content of a target folder (and its subfolders), making it easily searchable by both AI agents and humans. Processing happens 100% locally by default, via sentence-transformers.

It provides a simple REST API to query the indexed data via natural language, and can be used as a standalone semantic search engine, a vector database, or as a backend for your applications.

How does it work?

Cognitor consists of two main components:

  • Search engine: a vector database which stores document embeddings, full text and metadata, and provides a simple REST API to query the indexed information.
  • Worker: a background process that monitors a specified folder for changes, automatically chunks and embeds the content of the files, and updates the vector database accordingly.
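To make the division of labour concrete, here is a toy, dependency-free sketch of the chunk, embed, index and query flow described above. It is not Cognitor's code: the hashed bag-of-words `embed` function is a stand-in for a sentence-transformers model, and `ToyIndex`, `chunk`, the chunk sizes and the file names are all hypothetical names chosen for illustration.

```python
import hashlib
import math
from collections import Counter

DIM = 256  # toy embedding size (assumption, not Cognitor's setting)


def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    step = size - overlap
    starts = range(0, max(len(words) - overlap, 1), step)
    return [" ".join(words[i:i + size]) for i in starts if words[i:i + size]]


def embed(text: str) -> list[float]:
    """Stand-in embedding: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for word, count in Counter(text.lower().split()).items():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[idx] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class ToyIndex:
    """In-memory stand-in for the vector database."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, str, list[float]]] = []

    def add_document(self, path: str, text: str) -> None:
        # What the worker does for each new or changed file.
        for piece in chunk(text):
            self.rows.append((path, piece, embed(piece)))

    def search(self, query: str, top_k: int = 3):
        # Vectors are normalised, so the dot product is the cosine similarity.
        q = embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), path, piece)
            for path, piece, vec in self.rows
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:top_k]


index = ToyIndex()
index.add_document("a.txt", "a vector database stores document embeddings and metadata")
index.add_document("b.txt", "the worker watches a folder for file changes")
for score, path, piece in index.search("document embeddings", top_k=2):
    print(f"{score:.2f}  {path}  {piece}")
```

A real embedding model captures meaning rather than exact word overlap, which is what makes the search semantic; the overall structure of the pipeline stays the same.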

How to use?

1. Clone the repo

git clone https://github.com/tanaos/cognitor.git
cd cognitor

2. Start search engine + worker

Configure the following environment variables in your .env file (at the root of the project):

# Absolute path on your host machine to ingest
DOCS_FOLDER=/path/to/your/docs
# Name of the collection in which the worker will store the indexed documents
COGNITOR_COLLECTION_NAME=cognitor-worker-documents
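Before starting the containers, it can help to sanity-check the .env file. The following is a minimal sketch, not part of Cognitor: `parse_env` and `check_env` are hypothetical helpers, and the parser only handles plain KEY=VALUE lines (no quoting or variable expansion).

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_env(values: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; empty means the file looks fine."""
    problems = []
    docs = values.get("DOCS_FOLDER", "")
    if not docs:
        problems.append("DOCS_FOLDER is not set")
    elif not Path(docs).is_absolute():
        problems.append("DOCS_FOLDER must be an absolute path")
    elif not Path(docs).is_dir():
        problems.append("DOCS_FOLDER does not exist or is not a directory")
    if not values.get("COGNITOR_COLLECTION_NAME"):
        problems.append("COGNITOR_COLLECTION_NAME is not set")
    return problems


# Usage, run from the project root:
#   print(check_env(parse_env(Path(".env").read_text())))
```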

Start both the search engine and the worker with

docker compose --profile worker up -d 

3. Integrate with your applications

We provide SDKs, including a Python SDK (see the sample integration below).

Alternatively, you can use any HTTP client to interact with the REST API exposed on http://localhost:7530 or the Swagger UI at http://localhost:7530/docs.
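The post does not list the REST endpoints, so one way to discover them from a script is to read the OpenAPI schema behind the Swagger UI. This sketch assumes the schema is served at /openapi.json, which is the usual default for servers that expose Swagger UI at /docs but is not confirmed by the source. `fetch_schema` and `list_endpoints` are hypothetical helper names.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def fetch_schema(base_url: str = "http://localhost:7530") -> dict:
    # ASSUMPTION: the schema lives at /openapi.json; adjust if your
    # deployment serves it elsewhere.
    with urlopen(f"{base_url}/openapi.json", timeout=5) as resp:
        return json.load(resp)


def list_endpoints(schema: dict) -> list[tuple[str, str, str]]:
    """Return (METHOD, path, summary) for every operation in an OpenAPI schema."""
    endpoints = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method, operation in operations.items():
            if method in HTTP_METHODS:
                endpoints.append((method.upper(), path, operation.get("summary", "")))
    return endpoints


# Usage, with the search engine running:
#   for method, path, summary in list_endpoints(fetch_schema()):
#       print(method, path, summary)
```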

Sample Python integration

Install the SDK:

pip install cognitor 

Use it in your code:

from cognitor import Cognitor

with Cognitor("http://localhost:7530") as client:
    # Check if the search engine is ready to accept requests
    print(client.health_ready())  # "ready" or "loading"

    # Search by text query
    response = client.search("my-collection", query_text="Hello", top_k=10)
    print(response)
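Since the engine can report "loading" right after startup, a script may want to wait before sending its first query. The helper below is a sketch that assumes `health_ready()` returns the plain string "ready" or "loading", as the comment in the sample suggests. `wait_until_ready` is a hypothetical name and not part of the SDK.

```python
import time


def wait_until_ready(client, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll client.health_ready() until it reports "ready" or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if client.health_ready() == "ready":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage with the SDK client from the sample:
#   with Cognitor("http://localhost:7530") as client:
#       if wait_until_ready(client):
#           print(client.search("my-collection", query_text="Hello", top_k=10))
```

To search the documents indexed by the worker, the collection name passed to `search` would presumably need to match the COGNITOR_COLLECTION_NAME value set in the .env file.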

See the Python SDK page for more examples and documentation.

submitted by /u/Ok_Hold_5385