RAGless: Q-Q retrieval with score aggregation for closed-domain FAQ [P]
Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.
What it does
RAGless is a semantic retrieval system based on Question-to-Question matching. At ingestion, an LLM generates 3–5 question variants per answer, and each variant gets its own embedding. At query time, the user question is embedded, the Top-K nearest question variants are retrieved, and their scores are aggregated by answer_id. The answer with the highest aggregated score wins.
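The aggregation step can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the post does not say which aggregation function is used, so summing similarity scores per answer_id is an assumption here, and the names `aggregate_by_answer` and `best_answer` are hypothetical.

```python
from collections import defaultdict


def aggregate_by_answer(hits):
    """Group Top-K hits by answer_id and combine their similarity scores.

    `hits` is an iterable of (answer_id, similarity) pairs, one per retrieved
    question variant. Summation is an assumed aggregation function.
    """
    totals = defaultdict(float)
    for answer_id, score in hits:
        totals[answer_id] += score
    return dict(totals)


def best_answer(hits):
    """Return (answer_id, aggregated_score) for the top answer, or None if no hits."""
    totals = aggregate_by_answer(hits)
    if not totals:
        return None
    return max(totals.items(), key=lambda kv: kv[1])


# Two variants of answer "a1" and one variant of "a2" land in the Top-K.
hits = [("a1", 0.5), ("a2", 0.75), ("a1", 0.5)]
print(best_answer(hits))
```

In this toy input, "a2" has the strongest single hit, but "a1" wins after aggregation because two of its variants were retrieved. That is the effect Q-Q matching with multiple variants is meant to produce.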
Threshold logic uses two gates: a minimum aggregated score (default 0.70), plus a fallback on the best single-hit score (0.82) that avoids false negatives when only one variant makes it into the Top-K. Embeddings use asymmetric task types (RETRIEVAL_DOCUMENT at ingestion, RETRIEVAL_QUERY at runtime).
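One way to read the two-gate logic is as an OR condition: accept the top answer if its aggregated score clears the first threshold, or, failing that, if its best single hit clears the second. The sketch below assumes that reading. The function name `accept` and its parameter names are hypothetical, and the scale of the aggregated score depends on the aggregation function, which the post does not specify.

```python
def accept(aggregated_score, best_single_score,
           min_aggregated=0.70, min_single=0.82):
    """Decide whether the top answer is confident enough to return.

    Gate 1: the aggregated score reaches min_aggregated (0.70 in the post).
    Gate 2 (fallback): the best single-hit score reaches min_single (0.82 in
    the post). Treating the gates as an OR is an assumed interpretation.
    """
    if aggregated_score >= min_aggregated:
        return True
    return best_single_score >= min_single


print(accept(0.75, 0.60))  # passes the aggregated gate
print(accept(0.50, 0.85))  # passes only through the single-hit fallback
print(accept(0.50, 0.60))  # fails both gates, so no answer is returned
```

The fallback threshold is stricter than the aggregated one, which fits the stated goal: a lone variant in the Top-K can still trigger an answer, but only when the match is very strong.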
Target audience
Researchers and engineers evaluating retrieval architectures for closed-domain FAQ systems where the answer space is finite and predefined. Production-ready for that scope. Not intended for open-ended generative Q&A.
Comparison
Standard RAG: retrieve document chunks → LLM generates an answer. RAGless: retrieve pre-generated question variants → return the pre-written answer. The generation step is eliminated entirely. Compared to dense passage retrieval (DPR) and similar approaches, RAGless operates at the question level rather than the passage level, which improves precision for FAQ-style retrieval at the cost of flexibility.
GitHub: github.com/EmilResearch/RAGless
Open to feedback, and happy to answer questions.
If you find it useful, a ⭐ on GitHub is appreciated.