Tag: Rag · 500 articles archived under #rag
arXiv — Machine Learning research 30m ago Geometric Measurements of the Axiom of Choice in Neural Proof Embeddings arXiv:2606.28572v1 Announce Type: new Abstract: The axiom of choice has divided the foundations of mathematics for over a century, but the distinction between classical and constructive proofs has remained a philosophical and methodological one. We use Lean 4's kernel-level… 8 arXiv — Machine Learning research 30m ago How Token Influence Decays with Distance: A Green-Function View of Trained Language Models arXiv:2606.29139v1 Announce Type: new Abstract: We study how the next-token prediction of an autoregressive Transformer language model changes under small perturbations of earlier input token embeddings. Motivated by operator learning and iterative solvers for differential… 27 arXiv — Machine Learning research 30m ago Deterministic Decisions for High-Stakes AI. A Zero-Egress Pipeline with the Deployability of RAG and the Accuracy of Machine Learning arXiv:2606.29280v1 Announce Type: new Abstract: We identify intervention bias as a previously unquantified failure mode of zero-shot large-language-model (LLM) educational advisory agents: without task-specific training, they recommend action when a hindsight-optimal oracle… 31 arXiv — Machine Learning research 30m ago SP-CACW: Convergence-Aware Client Weighting for Selfish Personalized Learning arXiv:2606.29322v1 Announce Type: new Abstract: Collaborative learning is sustainable only when it benefits each participant. 
Standard federated learning optimizes a global average objective, which can underperform for clients whose data distributions differ substantially from… 35 arXiv — Machine Learning research 30m ago Deciphering Region-Level Signatures from Latency Measurements in LEO Satellite Internet arXiv:2606.29324v1 Announce Type: new Abstract: Low-Earth orbit (LEO) satellite Internet has become an indispensable infrastructure that provides growing coverage for global users. Despite extensive measurement efforts, the principles underlying region-level performance… 32 arXiv — Machine Learning research 30m ago The Mirage of Optimizing Training Policies: Monotonic Inference Policies as the Real Objective for LLM Reinforcement Learning arXiv:2606.29526v1 Announce Type: new Abstract: Reinforcement learning (RL) has gained growing attention in large language model (LLM) post-training, yet RL training remains fragile and can suffer from instability or collapse. One vital cause is training-inference mismatch: LLM… 17 arXiv — NLP / Computation & Language research 30m ago Turn-Averaged SAEs for Feature Discovery and Long-Context Attribution arXiv:2606.28548v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) have become a useful tool for extracting interpretable features in language models. However, standard SAE architectures operate on individual token activations, meaning that the number of active features… 25 arXiv — NLP / Computation & Language research 30m ago AnTenA: Actionable and Explainable Tensor Analysis System with Large Language Models arXiv:2606.28708v1 Announce Type: new Abstract: Accurately explaining hidden patterns in multi-aspect data has typically been done by leveraging labels and/or accompanying auxiliary metadata. However, labels and auxiliary data may be inaccurate (e.g. 
nonstandard, inconsistent),… 21 arXiv — NLP / Computation & Language research 30m ago 5ting at SemEval-2026 Task 8: Strong End-to-End Multi-Turn RAG via LLM-Based Reranking and Faithfulness Control arXiv:2606.28737v1 Announce Type: new Abstract: We introduce 5ting, our system for the SemEval-2026 Task 8 (MTRAGEval), which evaluates multi-turn Retrieval-Augmented Generation (RAG) systems. Multi-turn RAG involves context drift, underspecification, and hallucination risk. Our… 5 arXiv — NLP / Computation & Language research 30m ago BERTomelo: Your Portuguese Encoder Best Friend arXiv:2606.28999v1 Announce Type: new Abstract: Encoders have become the state of the art for multiple NLP tasks, especially those requiring deep contextual understanding. While multilingual models offer broad coverage, dedicated monolingual encoders are essential for capturing… 16 arXiv — NLP / Computation & Language research 30m ago How to Leverage Synthetic Speech for LLM-Based ASR Systems? arXiv:2606.29031v1 Announce Type: new Abstract: In regulated domains such as banking and healthcare, where privacy constraints make real speech costly to collect and retain, synthetic speech from modern text-to-speech (TTS) is an appealing alternative for training automatic… 15 arXiv — NLP / Computation & Language research 30m ago A Comparative Study on Affective Cues in Text Embeddings Across Psychological Emotion Theories arXiv:2606.29068v1 Announce Type: new Abstract: Text encoders are known for their utility in natural language processing, as they are able to efficiently compress inputs into dense vectors while preserving semantics. 
These models have been applied to affective computing, in… 19 arXiv — NLP / Computation & Language research 30m ago AB-RAG: Adaptive Budgeted Retrieval-Augmented Generation for Reliable Question Answering arXiv:2606.29090v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has become the standard way to ground large language models in external knowledge, yet most systems retrieve a fixed number of passages for every question regardless of its difficulty. This… 11 arXiv — NLP / Computation & Language research 30m ago MIThinker: A Plug-and-Play Policy-Optimized Thinker For Motivational Interviewing Counseling arXiv:2606.29265v1 Announce Type: new Abstract: Reasoning large language models (LLMs) have recently made much progress in complex problem-solving, leveraging internal reasoning (or thought) to guide their solution generation. However, existing LLM-based counseling agents,… 17 arXiv — NLP / Computation & Language research 30m ago TriageRA-CCF: Source-Side Clinical Confidence and Coverage Signals for Adaptive Rank Budgeting in Medical LLMs arXiv:2606.29375v1 Announce Type: new Abstract: Medical large language models are commonly adapted with a fixed low-rank budget, even though medical questions differ substantially in confidence, clinical coverage, and cross-domain difficulty. 
We study adaptive rank budgeting for… 15 arXiv — NLP / Computation & Language research 30m ago mamabench and mamaretrieval: Benchmarks for Evaluating Medical Retrieval-Augmented Generation in Maternal, Neonatal, and Reproductive Health arXiv:2606.29467v1 Announce Type: new Abstract: Medical question-answering benchmarks rarely cover the maternal, neonatal, child, and reproductive-health questions a nurse-midwife asks, and, to our knowledge, no public chunk-level relevance benchmark exists for maternal-health… 25 arXiv — NLP / Computation & Language research 30m ago Coverage-Driven KV Cache Eviction for Efficient and Improved Inference of LLM arXiv:2606.29563v1 Announce Type: new Abstract: Large language models (LLMs) excel at complex tasks like question answering and summarization, thanks to their ability to handle long-context inputs. However, deploying LLMs is costly, not only due to the high computational demands… 7 arXiv — NLP / Computation & Language research 30m ago Anisotropy Decides Cosine vs. Rank Metrics for Text Embeddings arXiv:2606.29571v1 Announce Type: new Abstract: The standard way to compare two text embeddings is cosine similarity. Scattered studies report that a different metric does better, but never pin down the geometric condition that decides when, or why. 
We settle both with a… 10 arXiv — NLP / Computation & Language research 30m ago MAM-AI: An On-Device Medical Retrieval-Augmented Generation System for Nurses and Midwives in Zanzibar arXiv:2606.29580v1 Announce Type: new Abstract: Maternal and newborn mortality remain among the highest in sub-Saharan Africa, where midwifery care is often delivered by nurses who lack midwifery training to international standards, and consulting authoritative guidance at the… 7 arXiv — NLP / Computation & Language research 30m ago Managing Map Cardinality in Automatic Disease Classification Mapping: Balancing Precision, Recall and Coverage arXiv:2606.29750v1 Announce Type: new Abstract: Automatic mapping between disease classification systems, such as the International Classification of Diseases (ICD), is a challenging yet essential task for integrating health data and conducting longitudinal data analysis.… 32 arXiv — NLP / Computation & Language research 30m ago MemDelta: Controlled Baselines and Hidden Confounds in Agent Memory Evaluation arXiv:2606.29914v1 Announce Type: new Abstract: Agent memory systems are increasingly evaluated against RAG and full-context baselines, but reported gains often mix changes in the memory method with changes in the language model, embedding model, or retrieval pipeline, making it… 4 arXiv — NLP / Computation & Language research 30m ago Parametric Skills arXiv:2606.30015v1 Announce Type: new Abstract: Since intelligence fundamentally relies on efficient skill acquisition (Chollet, 2019), the ability to leverage skills is critical. For LLMs, skills, manually authored or extracted from task trajectories, are textual recipes… 16 arXiv — NLP / Computation & Language research 30m ago Efficient Retrieval-Augmented Generation via Token Co-occurrence Graphs arXiv:2606.30093v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) mitigates hallucinations in Large Language Models (LLMs) by grounding the generation process on external knowledge. 
However, standard RAG approaches struggle with multi-hop reasoning. While… 10 arXiv — NLP / Computation & Language research 30m ago Estimating Grammatical Gender Directions in Contextual Embeddings under Controlled and Natural Contexts arXiv:2606.30152v1 Announce Type: new Abstract: Contextual language models conflate grammatical gender and social semantic bias in gendered languages such as Spanish. Existing gender debiasing approaches only operate on static word embeddings leaving contextual representations… 26 arXiv — NLP / Computation & Language research 30m ago Forewarned is Forearmed: When Non-Sequential Embedding Turns Into an Anomaly Detector arXiv:2606.30196v1 Announce Type: new Abstract: This paper offers an in-depth analysis of non-sequential multimodal sentence-level embeddings, with a particular focus on the SONAR model. We demonstrate that certain embedding dimensions are sensitive to perturbations and can… 25 r/MachineLearning community 11h ago I built a demo agricultural planning system with an AI advisor for small-scale farmers in Nicaragua using NASA data [p] (this was deleted before but i dont know if it was the filters of reddit or the moderators, if is the moderators i will not post it again after you delete it sorry.) (The name will probably change soon because I didn't realize "AgroVision" is already a registered trademark lol.)… 15 Hacker News — AI on Front Page community 19h ago Pollen (CEO Negus-Fancey, CTO Wright) tried to remove article, and Google helped Article URL: https://blog.pragmaticengineer.com/pollen-tried-to-remove-my-article-about-callum-negus-fancey-and-google-is-assisting-to-it/ Comments URL: https://news.ycombinator.com/item?id=48716902 Points: 264 # Comments: 32 15 r/MachineLearning community 20h ago RAGless: Q-Q retrieval with score aggregation for closed-domain FAQ [P] What it does RAGless is a semantic retrieval system based on Question-to-Question matching. 
At ingestion, an LLM generates multiple question variants per answer (3–5) and each variant gets its own embedding. At query time, the user question is embedded, Top-K nearest question… 23 arXiv — Machine Learning research 1d ago Boundary condition fidelity for bottom-hole pressure and CO2 plume prediction in geological carbon storage arXiv:2606.27515v1 Announce Type: new Abstract: Accurate prediction of bottom-hole pressure (BHP) and CO2 plume migration is essential for safe geological carbon storage, yet practical simulations often rely on truncated domains where artificial boundaries distort pressure… 32 arXiv — Machine Learning research 1d ago TeRoR: Decoupled Temporal Rotation with Relational Circular Region for Temporal Knowledge Graph Embedding arXiv:2606.27651v1 Announce Type: new Abstract: In recent years, with the emergence of Temporal Knowledge Graphs (TKGs), research on learning entity and relation representations in TKGs has attracted increasing attention, giving rise to a large number of TKG embedding methods.… 35 arXiv — Machine Learning research 1d ago Are Time-Series Foundation Models Ready for E-Nose Data? An Empirical Assessment of Their Embeddings arXiv:2606.27672v1 Announce Type: new Abstract: Inspired by advances in natural language processing and computer vision, "time-series foundation models" (TSFMs) have recently been introduced with the promise of strong generalization across diverse time-series tasks, including… 5 arXiv — Machine Learning research 1d ago Aurora: A Leverage-Aware Spectral Optimizer arXiv:2606.27715v1 Announce Type: new Abstract: We show that for tall matrix parameters, like projection matrices in the MLP layers, the Muon update can have row norms that are arbitrarily non-uniform. 
This can lead to a self-reinforcing feedback loop whereby neurons receive… 13 arXiv — Machine Learning research 1d ago Dangerous Liaisons of Convex Learning and Non-Affine Aggregation arXiv:2606.28123v1 Announce Type: new Abstract: Last-iterate convergence and generalization guarantees in first-order convex learning hinge on the monotonicity of the update operator. While linear averaging preserves the monotonicity of gradient updates, this property is often… 17 arXiv — Machine Learning research 1d ago On the Inseparability of Instructions and Data in Shared-Embedding Sequence Models arXiv:2606.27567v1 Announce Type: cross Abstract: Prompt injection is the top security risk for LLM-integrated applications, yet every defense proposed so far has been broken. We prove this is not a coincidence: in shared-embedding architectures that lack enforced control-data… 20 arXiv — NLP / Computation & Language research 1d ago Causal Connections: Leveraging Multilingual Fine-Tuning for Financial QA@FinCausal 2026 arXiv:2606.27446v1 Announce Type: new Abstract: This paper describes team HSA_CORAL's submission to the FinCausal 2026 shared task on extracting cause-effect relations from financial narratives via extractive question answering in English and Spanish. We compare three modeling… 4 arXiv — NLP / Computation & Language research 1d ago Mitigating Position Bias in Transformers via Layer-Specific Positional Embedding Scaling arXiv:2606.27705v1 Announce Type: new Abstract: Large Language Models (LLMs) still struggle with the "lost-in-the-middle" problem, where critical information located in the middle of long-context inputs is often underrepresented or lost. 
While existing methods attempt to… 4 arXiv — NLP / Computation & Language research 1d ago SHIFT: Gate-Modulated Activation Steering for Knowledge Conflict Mitigation in Retrieval-Augmented Generation arXiv:2606.27786v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) enhances LLMs by incorporating external knowledge to support response generation. However, conflicts between retrieved context and parametric knowledge have emerged as a critical challenge in… 16 arXiv — NLP / Computation & Language research 1d ago The Signal-Coverage Matrix: Stratifying Type and Semantic Errors in Statement Autoformalization arXiv:2606.28013v1 Announce Type: new Abstract: Headline type-correctness (TC%) of LLM autoformalization has climbed from ~53% to ~76% in two years, yet this scalar conceals which errors each method resolves. We propose a signal-coverage matrix that crosses the Lean… 23 arXiv — NLP / Computation & Language research 1d ago MultiHashFormer: Hash-based Generative Language Models arXiv:2606.28057v1 Announce Type: new Abstract: Language models (LMs) represent tokens using embedding matrices that scale linearly with the vocabulary size. To constrain the parameter footprint, prior work proposes hashing many tokens into a single vector within encoder-only… 15 arXiv — NLP / Computation & Language research 1d ago HPRO: Hierarchical Progressive Reward Optimization via Preference Extraction for Emotional Text-to-Speech arXiv:2606.28249v1 Announce Type: cross Abstract: Recently, Large Language Model (LLM)-based Text-to-Speech (TTS) models have achieved remarkable naturalness. However, the standard Supervised Fine-Tuning paradigm often converges to statistically averaged prosody, limiting… 20 r/LocalLLaMA community 1d ago A lot of good M5 Max options available at Apple Refurbished Just a heads-up. After Apple's price hike announcement, they added a bunch of top-of-the-line 14" M5 Pro/Max options to their refurbished website. 
If you got discouraged by the price hike, check out their refurbished store. submitted by /u/Hanthunius … 13 r/MachineLearning community 1d ago I shrank a transformer until every number fitted on the screen and made the weights editable [R] I've been teaching myself how LLMs actually work, not at the API level, but down to the matrix multiplications. To force myself to really understand the forward pass, I first built a complete transformer by hand in a spreadsheet from embeddings through to the loss. Then I turned… 31 r/MachineLearning community 2d ago Benchmarking Self-Hosted Gemma 2 9B vs. Frontier APIs: The FP8 Quantization Prefill Tax and VRAM Realities on an NVIDIA L4 [P] When evaluating migrating production LLM workloads off commercial cloud APIs, the conversation usually gets oversimplified into a trade-off between quality and infrastructure cost. To look past clean, isolated averages, I built a repeatable evaluation matrix using a real-world… 29 Hugging Face Daily Papers research 2d ago Fast LeWorldModel Abstract Fast-LeWM accelerates visual planning by replacing autoregressive rollout with parallel action-prefix prediction, reducing computational costs and latency accumulation during long-horizon predictions. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Joint-Embedding… 20 TechCrunch — AI news-outlet 2d ago Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on New models are launching in Asia that promise Mythos-like capabilities without fear of an export ban. U.S. AI labs may never recover this enormous market. 29 r/LocalLLaMA community 3d ago What's one local AI workflow you wish you'd discovered sooner? There are a lot of posts about the models and benchmarks, but I am more interested in the workflows that people use. What is one workflow that really saved you time or made your local LLM more useful? 
It could be anything—RAG, MCP, coding agents, organizing prompt, document… 23 Hugging Face Daily Papers research 3d ago Hallucination in World Models is Predictable and Preventable Abstract World models exhibit hallucinations in low-data regions of state-action space, which can be detected and mitigated using data-centric signals and coverage-aware sampling techniques. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Modern generative world models render… 25 arXiv — Machine Learning research 4d ago Fast LeWorldModel arXiv:2606.26217v1 Announce Type: new Abstract: Joint-Embedding Predictive Architectures (JEPAs), including recent LeWorldModel (LeWM), have become a promising foundation for reconstruction-free visual world models. For visual planning, however, LeWM evaluates candidate action… 32 arXiv — Machine Learning research 4d ago Embedding Foundation Model Predictions in Discrete-Choice Models with Structural Guarantees arXiv:2606.26432v1 Announce Type: new Abstract: Tabular foundation models achieve strong accuracy on choice prediction tasks, but their predictions often violate the economic logic those tasks require: raising a price can increase predicted demand, implied willingness-to-pay… 36 arXiv — Machine Learning research 4d ago PersistentKV: Page-Aware Decode Scheduling for Long-Context LLM Serving on Commodity GPUs arXiv:2606.26666v1 Announce Type: new Abstract: Autoregressive large language model (LLM) serving is increasingly limited by key-value (KV) cache movement rather than dense matrix multiplication. Modern paged-attention systems reduce KV-cache fragmentation and mature kernels… 20