RAG

500 articles archived under #rag

arXiv — Machine Learning research 1h ago

Geometric Measurements of the Axiom of Choice in Neural Proof Embeddings

arXiv:2606.28572v1 Announce Type: new Abstract: The axiom of choice has divided the foundations of mathematics for over a century, but the distinction between classical and constructive proofs has remained a philosophical and methodological one. We use Lean 4's kernel-level…

8
arXiv — Machine Learning research 1h ago

How Token Influence Decays with Distance: A Green-Function View of Trained Language Models

arXiv:2606.29139v1 Announce Type: new Abstract: We study how the next-token prediction of an autoregressive Transformer language model changes under small perturbations of earlier input token embeddings. Motivated by operator learning and iterative solvers for differential…

27
arXiv — Machine Learning research 1h ago

Deterministic Decisions for High-Stakes AI. A Zero-Egress Pipeline with the Deployability of RAG and the Accuracy of Machine Learning

arXiv:2606.29280v1 Announce Type: new Abstract: We identify intervention bias as a previously unquantified failure mode of zero-shot large-language-model (LLM) educational advisory agents: without task-specific training, they recommend action when a hindsight-optimal oracle…

31
arXiv — Machine Learning research 1h ago

SP-CACW: Convergence-Aware Client Weighting for Selfish Personalized Learning

arXiv:2606.29322v1 Announce Type: new Abstract: Collaborative learning is sustainable only when it benefits each participant. Standard federated learning optimizes a global average objective, which can underperform for clients whose data distributions differ substantially from…

35
arXiv — Machine Learning research 1h ago

Deciphering Region-Level Signatures from Latency Measurements in LEO Satellite Internet

arXiv:2606.29324v1 Announce Type: new Abstract: Low-Earth orbit (LEO) satellite Internet has become an indispensable infrastructure that provides growing coverage for global users. Despite extensive measurement efforts, the principles underlying region-level performance…

32
arXiv — Machine Learning research 1h ago

The Mirage of Optimizing Training Policies: Monotonic Inference Policies as the Real Objective for LLM Reinforcement Learning

arXiv:2606.29526v1 Announce Type: new Abstract: Reinforcement learning (RL) has gained growing attention in large language model (LLM) post-training, yet RL training remains fragile and can suffer from instability or collapse. One vital cause is training-inference mismatch: LLM…

17
arXiv — NLP / Computation & Language research 1h ago

Turn-Averaged SAEs for Feature Discovery and Long-Context Attribution

arXiv:2606.28548v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) have become a useful tool for extracting interpretable features in language models. However, standard SAE architectures operate on individual token activations, meaning that the number of active features…

25
arXiv — NLP / Computation & Language research 1h ago

AnTenA: Actionable and Explainable Tensor Analysis System with Large Language Models

arXiv:2606.28708v1 Announce Type: new Abstract: Accurately explaining hidden patterns in multi-aspect data has typically been done by leveraging labels and/or accompanying auxiliary metadata. However, labels and auxiliary data may be inaccurate (e.g. nonstandard, inconsistent),…

21
arXiv — NLP / Computation & Language research 1h ago

5ting at SemEval-2026 Task 8: Strong End-to-End Multi-Turn RAG via LLM-Based Reranking and Faithfulness Control

arXiv:2606.28737v1 Announce Type: new Abstract: We introduce 5ting, our system for the SemEval-2026 Task 8 (MTRAGEval), which evaluates multi-turn Retrieval-Augmented Generation (RAG) systems. Multi-turn RAG involves context drift, underspecification, and hallucination risk. Our…

5
arXiv — NLP / Computation & Language research 1h ago

BERTomelo: Your Portuguese Encoder Best Friend

arXiv:2606.28999v1 Announce Type: new Abstract: Encoders have become the state of the art for multiple NLP tasks, especially those requiring deep contextual understanding. While multilingual models offer broad coverage, dedicated monolingual encoders are essential for capturing…

16
arXiv — NLP / Computation & Language research 1h ago

How to Leverage Synthetic Speech for LLM-Based ASR Systems?

arXiv:2606.29031v1 Announce Type: new Abstract: In regulated domains such as banking and healthcare, where privacy constraints make real speech costly to collect and retain, synthetic speech from modern text-to-speech (TTS) is an appealing alternative for training automatic…

15
arXiv — NLP / Computation & Language research 1h ago

A Comparative Study on Affective Cues in Text Embeddings Across Psychological Emotion Theories

arXiv:2606.29068v1 Announce Type: new Abstract: Text encoders are known for their utility in natural language processing, as they are able to efficiently compress inputs into dense vectors while preserving semantics. These models have been applied to affective computing, in…

19
arXiv — NLP / Computation & Language research 1h ago

AB-RAG: Adaptive Budgeted Retrieval-Augmented Generation for Reliable Question Answering

arXiv:2606.29090v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has become the standard way to ground large language models in external knowledge, yet most systems retrieve a fixed number of passages for every question regardless of its difficulty. This…

11
arXiv — NLP / Computation & Language research 1h ago

MIThinker: A Plug-and-Play Policy-Optimized Thinker For Motivational Interviewing Counseling

arXiv:2606.29265v1 Announce Type: new Abstract: Reasoning large language models (LLMs) have recently made much progress in complex problem-solving, leveraging internal reasoning (or thought) to guide their solution generation. However, existing LLM-based counseling agents,…

17
arXiv — NLP / Computation & Language research 1h ago

TriageRA-CCF: Source-Side Clinical Confidence and Coverage Signals for Adaptive Rank Budgeting in Medical LLMs

arXiv:2606.29375v1 Announce Type: new Abstract: Medical large language models are commonly adapted with a fixed low-rank budget, even though medical questions differ substantially in confidence, clinical coverage, and cross-domain difficulty. We study adaptive rank budgeting for…

15
arXiv — NLP / Computation & Language research 1h ago

mamabench and mamaretrieval: Benchmarks for Evaluating Medical Retrieval-Augmented Generation in Maternal, Neonatal, and Reproductive Health

arXiv:2606.29467v1 Announce Type: new Abstract: Medical question-answering benchmarks rarely cover the maternal, neonatal, child, and reproductive-health questions a nurse-midwife asks, and, to our knowledge, no public chunk-level relevance benchmark exists for maternal-health…

25
arXiv — NLP / Computation & Language research 1h ago

Coverage-Driven KV Cache Eviction for Efficient and Improved Inference of LLM

arXiv:2606.29563v1 Announce Type: new Abstract: Large language models (LLMs) excel at complex tasks like question answering and summarization, thanks to their ability to handle long-context inputs. However, deploying LLMs is costly, not only due to the high computational demands…

7
arXiv — NLP / Computation & Language research 1h ago

Anisotropy Decides Cosine vs. Rank Metrics for Text Embeddings

arXiv:2606.29571v1 Announce Type: new Abstract: The standard way to compare two text embeddings is cosine similarity. Scattered studies report that a different metric does better, but never pin down the geometric condition that decides when, or why. We settle both with a…

10
arXiv — NLP / Computation & Language research 1h ago

MAM-AI: An On-Device Medical Retrieval-Augmented Generation System for Nurses and Midwives in Zanzibar

arXiv:2606.29580v1 Announce Type: new Abstract: Maternal and newborn mortality remain among the highest in sub-Saharan Africa, where midwifery care is often delivered by nurses who lack midwifery training to international standards, and consulting authoritative guidance at the…

7
arXiv — NLP / Computation & Language research 1h ago

Managing Map Cardinality in Automatic Disease Classification Mapping: Balancing Precision, Recall and Coverage

arXiv:2606.29750v1 Announce Type: new Abstract: Automatic mapping between disease classification systems, such as the International Classification of Diseases (ICD), is a challenging yet essential task for integrating health data and conducting longitudinal data analysis.…

32
arXiv — NLP / Computation & Language research 1h ago

MemDelta: Controlled Baselines and Hidden Confounds in Agent Memory Evaluation

arXiv:2606.29914v1 Announce Type: new Abstract: Agent memory systems are increasingly evaluated against RAG and full-context baselines, but reported gains often mix changes in the memory method with changes in the language model, embedding model, or retrieval pipeline, making it…

4
arXiv — NLP / Computation & Language research 1h ago

Parametric Skills

arXiv:2606.30015v1 Announce Type: new Abstract: Since intelligence fundamentally relies on efficient skill acquisition (Chollet, 2019), the ability to leverage skills is critical. For LLMs, skills, manually authored or extracted from task trajectories, are textual recipes…

16
arXiv — NLP / Computation & Language research 1h ago

Efficient Retrieval-Augmented Generation via Token Co-occurrence Graphs

arXiv:2606.30093v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) mitigates hallucinations in Large Language Models (LLMs) by grounding the generation process on external knowledge. However, standard RAG approaches struggle with multi-hop reasoning. While…

10
arXiv — NLP / Computation & Language research 1h ago

Estimating Grammatical Gender Directions in Contextual Embeddings under Controlled and Natural Contexts

arXiv:2606.30152v1 Announce Type: new Abstract: Contextual language models conflate grammatical gender and social semantic bias in gendered languages such as Spanish. Existing gender debiasing approaches only operate on static word embeddings, leaving contextual representations…

26
arXiv — NLP / Computation & Language research 1h ago

Forewarned is Forearmed: When Non-Sequential Embedding Turns Into an Anomaly Detector

arXiv:2606.30196v1 Announce Type: new Abstract: This paper offers an in-depth analysis of non-sequential multimodal sentence-level embeddings, with a particular focus on the SONAR model. We demonstrate that certain embedding dimensions are sensitive to perturbations and can…

25
r/MachineLearning community 12h ago

I built a demo agricultural planning system with an AI advisor for small-scale farmers in Nicaragua using NASA data [p]

(This was deleted before, but I don't know whether it was Reddit's filters or the moderators; if it was the moderators, I will not post it again after you delete it, sorry.) (The name will probably change soon because I didn't realize "AgroVision" is already a registered trademark lol.)…

15
Hacker News — AI on Front Page community 19h ago

Pollen (CEO Negus-Fancey, CTO Wright) tried to remove article, and Google helped

Article URL: https://blog.pragmaticengineer.com/pollen-tried-to-remove-my-article-about-callum-negus-fancey-and-google-is-assisting-to-it/ Comments URL: https://news.ycombinator.com/item?id=48716902 Points: 264 # Comments: 32

15
r/MachineLearning community 21h ago

RAGless: Q-Q retrieval with score aggregation for closed-domain FAQ [P]

What it does RAGless is a semantic retrieval system based on Question-to-Question matching. At ingestion, an LLM generates multiple question variants per answer (3–5) and each variant gets its own embedding. At query time, the user question is embedded, Top-K nearest question…

23
arXiv — Machine Learning research 1d ago

Boundary condition fidelity for bottom-hole pressure and CO2 plume prediction in geological carbon storage

arXiv:2606.27515v1 Announce Type: new Abstract: Accurate prediction of bottom-hole pressure (BHP) and CO2 plume migration is essential for safe geological carbon storage, yet practical simulations often rely on truncated domains where artificial boundaries distort pressure…

32
arXiv — Machine Learning research 1d ago

TeRoR: Decoupled Temporal Rotation with Relational Circular Region for Temporal Knowledge Graph Embedding

arXiv:2606.27651v1 Announce Type: new Abstract: In recent years, with the emergence of Temporal Knowledge Graphs (TKGs), research on learning entity and relation representations in TKGs has attracted increasing attention, giving rise to a large number of TKG embedding methods.…

35
arXiv — Machine Learning research 1d ago

Are Time-Series Foundation Models Ready for E-Nose Data? An Empirical Assessment of Their Embeddings

arXiv:2606.27672v1 Announce Type: new Abstract: Inspired by advances in natural language processing and computer vision, "time-series foundation models" (TSFMs) have recently been introduced with the promise of strong generalization across diverse time-series tasks, including…

5
arXiv — Machine Learning research 1d ago

Aurora: A Leverage-Aware Spectral Optimizer

arXiv:2606.27715v1 Announce Type: new Abstract: We show that for tall matrix parameters, like projection matrices in the MLP layers, the Muon update can have row norms that are arbitrarily non-uniform. This can lead to a self-reinforcing feedback loop whereby neurons receive…

13
arXiv — Machine Learning research 1d ago

Dangerous Liaisons of Convex Learning and Non-Affine Aggregation

arXiv:2606.28123v1 Announce Type: new Abstract: Last-iterate convergence and generalization guarantees in first-order convex learning hinge on the monotonicity of the update operator. While linear averaging preserves the monotonicity of gradient updates, this property is often…

17
arXiv — Machine Learning research 1d ago

On the Inseparability of Instructions and Data in Shared-Embedding Sequence Models

arXiv:2606.27567v1 Announce Type: cross Abstract: Prompt injection is the top security risk for LLM-integrated applications, yet every defense proposed so far has been broken. We prove this is not a coincidence: in shared-embedding architectures that lack enforced control-data…

20
arXiv — NLP / Computation & Language research 1d ago

Causal Connections: Leveraging Multilingual Fine-Tuning for Financial QA@FinCausal 2026

arXiv:2606.27446v1 Announce Type: new Abstract: This paper describes team HSA_CORAL's submission to the FinCausal 2026 shared task on extracting cause-effect relations from financial narratives via extractive question answering in English and Spanish. We compare three modeling…

4
arXiv — NLP / Computation & Language research 1d ago

Mitigating Position Bias in Transformers via Layer-Specific Positional Embedding Scaling

arXiv:2606.27705v1 Announce Type: new Abstract: Large Language Models (LLMs) still struggle with the "lost-in-the-middle" problem, where critical information located in the middle of long-context inputs is often underrepresented or lost. While existing methods attempt to…

4
arXiv — NLP / Computation & Language research 1d ago

SHIFT: Gate-Modulated Activation Steering for Knowledge Conflict Mitigation in Retrieval-Augmented Generation

arXiv:2606.27786v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) enhances LLMs by incorporating external knowledge to support response generation. However, conflicts between retrieved context and parametric knowledge have emerged as a critical challenge in…

16
arXiv — NLP / Computation & Language research 1d ago

The Signal-Coverage Matrix: Stratifying Type and Semantic Errors in Statement Autoformalization

arXiv:2606.28013v1 Announce Type: new Abstract: Headline type-correctness (TC%) of LLM autoformalization has climbed from ~53% to ~76% in two years, yet this scalar conceals which errors each method resolves. We propose a signal-coverage matrix that crosses the Lean…

23
arXiv — NLP / Computation & Language research 1d ago

MultiHashFormer: Hash-based Generative Language Models

arXiv:2606.28057v1 Announce Type: new Abstract: Language models (LMs) represent tokens using embedding matrices that scale linearly with the vocabulary size. To constrain the parameter footprint, prior work proposes hashing many tokens into a single vector within encoder-only…

15
arXiv — NLP / Computation & Language research 1d ago

HPRO: Hierarchical Progressive Reward Optimization via Preference Extraction for Emotional Text-to-Speech

arXiv:2606.28249v1 Announce Type: cross Abstract: Recently, Large Language Model (LLM)-based Text-to-Speech (TTS) models have achieved remarkable naturalness. However, the standard Supervised Fine-Tuning paradigm often converges to statistically averaged prosody, limiting…

20
r/LocalLLaMA community 1d ago

A lot of good M5 Max options available at Apple Refurbished

Just a heads-up. After Apple's price hike announcement, they added a bunch of top-of-the-line 14" M5 Pro/Max options to their refurbished website. If you got discouraged by the price hike, check out their refurbished store. Submitted by /u/Hanthunius…

13
r/MachineLearning community 1d ago

I shrank a transformer until every number fitted on the screen and made the weights editable [R]

I've been teaching myself how LLMs actually work, not at the API level, but down to the matrix multiplications. To force myself to really understand the forward pass, I first built a complete transformer by hand in a spreadsheet from embeddings through to the loss. Then I turned…

31
r/MachineLearning community 2d ago

Benchmarking Self-Hosted Gemma 2 9B vs. Frontier APIs: The FP8 Quantization Prefill Tax and VRAM Realities on an NVIDIA L4 [P]

When evaluating migrating production LLM workloads off commercial cloud APIs, the conversation usually gets oversimplified into a trade-off between quality and infrastructure cost. To look past clean, isolated averages, I built a repeatable evaluation matrix using a real-world…

29
Hugging Face Daily Papers research 2d ago

Fast LeWorldModel

Abstract: Fast-LeWM accelerates visual planning by replacing autoregressive rollout with parallel action-prefix prediction, reducing computational costs and latency accumulation during long-horizon predictions. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Joint-Embedding…

20
TechCrunch — AI news-outlet 2d ago

Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on

New models are launching in Asia that promise Mythos-like capabilities without fear of an export ban. U.S. AI labs may never recover this enormous market.

29
r/LocalLLaMA community 3d ago

What's one local AI workflow you wish you'd discovered sooner?

There are a lot of posts about the models and benchmarks, but I am more interested in the workflows that people use. What is one workflow that really saved you time or made your local LLM more useful? It could be anything—RAG, MCP, coding agents, organizing prompt, document…

23
Hugging Face Daily Papers research 4d ago

Hallucination in World Models is Predictable and Preventable

Abstract: World models exhibit hallucinations in low-data regions of state-action space, which can be detected and mitigated using data-centric signals and coverage-aware sampling techniques. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Modern generative world models render…

25
arXiv — Machine Learning research 4d ago

Fast LeWorldModel

arXiv:2606.26217v1 Announce Type: new Abstract: Joint-Embedding Predictive Architectures (JEPAs), including recent LeWorldModel (LeWM), have become a promising foundation for reconstruction-free visual world models. For visual planning, however, LeWM evaluates candidate action…

32
arXiv — Machine Learning research 4d ago

Embedding Foundation Model Predictions in Discrete-Choice Models with Structural Guarantees

arXiv:2606.26432v1 Announce Type: new Abstract: Tabular foundation models achieve strong accuracy on choice prediction tasks, but their predictions often violate the economic logic those tasks require: raising a price can increase predicted demand, implied willingness-to-pay…

36
arXiv — Machine Learning research 4d ago

PersistentKV: Page-Aware Decode Scheduling for Long-Context LLM Serving on Commodity GPUs

arXiv:2606.26666v1 Announce Type: new Abstract: Autoregressive large language model (LLM) serving is increasingly limited by key-value (KV) cache movement rather than dense matrix multiplication. Modern paged-attention systems reduce KV-cache fragmentation and mature kernels…

20
