500 articles archived under #rag

arXiv — Machine Learning research 19d ago Fourier Features Let Agents Learn High Precision Policies with Imitation Learning arXiv:2606.12334v1 Announce Type: new Abstract: High-precision robotic manipulation requires fine-grained spatial reasoning that is often difficult to achieve with RGB-only policies due to depth ambiguity and perspective scale issues. Policies that leverage 3D information… 14

arXiv — NLP / Computation & Language research 19d ago The Structural Attention Tax: How Retrieval Format Hijacks In-Context Learning Independent of Content arXiv:2606.11198v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) systems inject external knowledge to improve LLM outputs, yet the format of injected content -- distinct from its semantic relevance -- can independently distort the model's attention… 6

arXiv — NLP / Computation & Language research 19d ago NightFeats @ MMU-RAGent NeurIPS 2025: A Context-Optimized Multi-Agent RAG System for the Text-to-Text Track arXiv:2606.11199v1 Announce Type: new Abstract: We present NightFeats, a structured multi-agent retrieval-augmented generation (RAG) system submitted to the MMU-RAGent competition at NeurIPS 2025, where it was awarded Best Dynamic Evaluation in the text-to-text track. Rather… 24

arXiv — NLP / Computation & Language research 19d ago EverydayGPT: Confidence-Gated Routing for Efficient and Safe Hybrid GPT-RAG Conversational QA arXiv:2606.11212v1 Announce Type: new Abstract: Standard Retrieval-Augmented Generation (RAG) pipelines route every query through retrieval and generation unconditionally, incurring unnecessary computation and propagating low-quality context to the generator.
We introduce… 12

arXiv — NLP / Computation & Language research 19d ago Energy-Efficient On-Device RAG on a Mobile NPU: System Design and Benchmark on Snapdragon X Elite arXiv:2606.11257v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) pipelines are compute-intensive, combining embedding, retrieval, reranking, and large language model (LLM) generation. Running them entirely on-device benefits privacy, latency, and offline use,… 35

arXiv — NLP / Computation & Language research 19d ago When More Documents Hurt RAG: Mitigating Vector Search Dilution with Domain-Scoped, Model-Agnostic Retrieval arXiv:2606.11350v1 Announce Type: new Abstract: Retrieval-augmented generation degrades when scaled to large, heterogeneous document collections, where dense similarity loses discriminative power, and top-k retrieval increasingly returns semantically similar but contextually… 13

arXiv — NLP / Computation & Language research 19d ago When Probing Accuracy Saturates, Fragility Resolves: A Complementary Metric for LLM Pre-Training Analysis arXiv:2606.11375v1 Announce Type: new Abstract: Standard linear probing declares a property "encoded" when a classifier on hidden states achieves high accuracy. The protocol works well on a snapshot but breaks across pre-training: probe accuracy saturates within the first few… 17

arXiv — NLP / Computation & Language research 19d ago An Ontology-Guided Multi-Anchor Graph Retrieval Framework for Traffic Legal Liability Determination arXiv:2606.11910v1 Announce Type: new Abstract: Traffic law liability determination is critical for assigning legal penalties, requiring the simultaneous identification of interdependent statutory provisions across multiple legal dimensions.
However, existing retrieval-augmented… 28

arXiv — NLP / Computation & Language research 19d ago uva-irlab-conv at SemEval-2026 Task 8: Multi-Turn RAG with Learned Sparse Retrieval and Listwise Reranking arXiv:2606.11945v1 Announce Type: new Abstract: This report describes our participation in SemEval-2026 Task 8 on multi-turn retrieval and question answering. The task evaluates conversational systems across four domains (finance, cloud documentation, government, Wikipedia), and… 28

arXiv — NLP / Computation & Language research 19d ago Augmenting Molecular Language Models with Local $n$-gram Memory arXiv:2606.12113v1 Announce Type: new Abstract: Transformer-based language models for SMILES strings suffer from a locality gap: standard character-level tokenization fragments chemically meaningful motifs, forcing models to repeatedly learn local syntax at the expense of… 34

arXiv — NLP / Computation & Language research 19d ago Measuring Epistemic Resilience of LLMs Under Misleading Medical Context arXiv:2606.12291v1 Announce Type: new Abstract: Large language models (LLMs) now reach expert-level scores on medical licensing exams, encouraging the assumption that high scores imply safe medical judgment while patients increasingly use them for health advice. We show this… 14

TechCrunch — AI news-outlet 19d ago How memory tools can make AI models worse New research suggests that AI memory systems can degrade model performance and encourage sycophantic tendencies. 28

NVIDIA Developer Blog official-blog 19d ago Designing Production-Ready Battery Energy Storage Systems for AI Factories AI factories are changing what data-center infrastructure must do. Unlike traditional data centers, AI factories are built to manufacture intelligence at scale....
29

llama.cpp releases dev-tools 19d ago b9589 CUDA: Fix ssm_scan_f32 data-races (#24360) Add missing syncthreads before reusing cub_temp_storage __syncthreads() is required before being allowed to reuse TempStorage smem:… 32

arXiv — Machine Learning research 20d ago Conformal Risk Prediction for Non-Alcoholic Fatty Liver Disease Using Gradient Boosting with Distribution-Free Coverages arXiv:2606.09860v1 Announce Type: new Abstract: Non-alcoholic fatty liver disease (NAFLD) affects roughly 25% of global adults, posing substantial hepatic and cardiovascular risks. Yet, population-level screening tools remain inadequate. We present Method, a machine-learning… 4

arXiv — Machine Learning research 20d ago Hyperparameter Learning for Latent Factorization of Tensors for Representation Learning to Large-scale Dynamic Weighted Directed Network arXiv:2606.09880v1 Announce Type: new Abstract: Large-scale dynamic weighted directed networks (DWDNs) are widely used to model time-varying interactions among nodes. Latent factorization of tensors (LFT) extracts target knowledge from DWDNs via low-rank embedding. However,… 27

arXiv — Machine Learning research 20d ago Integrating Out, Twice: The Open-System Case That Neural-Network Ensemble Theory Is Missing arXiv:2606.09950v1 Announce Type: new Abstract: Averaging a neural network over its random parameters and marginalizing a Gaussian sector are the same operation, the Schur complement of the eliminated block, and when that block is closed it returns a covariance and its inverse.… 25

arXiv — Machine Learning research 20d ago Compositional Generative Modeling from Decentralized Data arXiv:2606.10153v1 Announce Type: new Abstract: Learning the compositional nature of the physical world requires joint observation of interacting factors. However, because practical data is often decentralized, these factors are fragmented across isolated silos.
Existing… 33

arXiv — Machine Learning research 20d ago DUET -- Dual User Embedding Transformers for Offsite Conversion Prediction arXiv:2606.10243v1 Announce Type: new Abstract: Offsite conversion rate (OCVR) prediction is an important ranking problem in computational recommendation systems. This task presents a modeling challenge: click signals are abundant and exhibit short temporal horizons, whereas… 25

arXiv — NLP / Computation & Language research 20d ago MIRAGE: A Polarity-Flipping Encoding Subspace in LLM Agents arXiv:2606.10304v1 Announce Type: new Abstract: When LLM agents are coerced into covertly encoding sensitive data (Base64, ROT13, acrostic, synonym chains, and beyond), the resulting outputs evade output-side detection but the underlying computation does not. Across nine… 37

arXiv — NLP / Computation & Language research 20d ago Attention Expansion: Enhancing Keyphrase Extraction from Long Documents with Attention-Augmented Contextualized Embeddings arXiv:2606.10716v1 Announce Type: new Abstract: Pre-trained language models (PLMs) have achieved strong performance in keyphrase extraction (KPE), largely due to their ability to generate rich contextualized representations.
However, long-document KPE remains challenging because… 30

arXiv — NLP / Computation & Language research 20d ago Attention-Discounted Adaptive Sampler for Masked Diffusion Language Models arXiv:2606.10829v1 Announce Type: new Abstract: Masked diffusion language models can reduce inference steps by revealing multiple tokens per denoising iteration, but this parallelism is fragile: positions that are individually confident may be unsafe to commit together when… 18

arXiv — NLP / Computation & Language research 20d ago Agentic Hybrid RAG for Evidence-Grounded Muon Collider Analysis arXiv:2606.10381v1 Announce Type: cross Abstract: Muon collider research spans accelerator physics, detector instrumentation, and high-energy phenomenology, with relevant evidence scattered across a rapidly expanding and heterogeneous body of scientific literature. As… 37

arXiv — NLP / Computation & Language research 20d ago Leveraging Social Media Data for COVID-19 Studies arXiv:2606.10459v1 Announce Type: cross Abstract: Nowadays, social media networks have become widely preferred sources of information. Especially during the time of the Coronavirus disease 2019 (COVID-19) pandemic, social media has been one of the most used platforms to get the… 20

arXiv — NLP / Computation & Language research 20d ago Infini Memory: Maintainable Topic Documents for Long-Term LLM Agent Memory arXiv:2606.10677v1 Announce Type: cross Abstract: Long-term LLM agents need persistent memory that can track changing facts and provide relevant evidence across sessions.
Existing memory systems often store observations as isolated records, summaries, or indexed fragments, which… 20

Hugging Face Daily Papers research 20d ago One Token per Multimodal Evidence: Latent Memory for Resource-Constrained QA Abstract Latent Memory introduces a compressed representation approach for external memory in question answering, reducing token consumption and storage requirements while maintaining competitive performance across text-only and multimodal benchmarks. Generated by… 28

Hugging Face Daily Papers research 20d ago Precision Is Not Faithfulness: Coverage-Aware Evaluation of Grounded Generation with a Complete Oracle Abstract Reference-free faithfulness metrics suffer from a blind spot measuring only precision, leading to rewards for abstention; completeness in deterministic domains enables measurement of both precision and recall, revealing that high-precision models often have poor fact… 34

llama.cpp releases dev-tools 20d ago b9585 graph: Fix granite speech model inference by applying embedding scale when deepstack is not used (#24357) llama-graph : apply embedding scale when deepstack is not used nits: remove non-existent hunyuan-vl from the tests apply suggestion from @gabe-l-hart Co-authored-by: Xuan… 25

Hugging Face Daily Papers research 20d ago SDR: Set-Distance Rewards for Radiology Report Generation Abstract Set-based rewards using embedding distances improve chest X-ray report generation by enabling effective post-training and test-time selection without requiring causal reasoning structures. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Reinforcement learning with… 14

r/LocalLLaMA community 20d ago Still a VERY lightweight open web-search tool for smaller local LLMs - now with SearXNG support Hey everyone, TinySearch v0.2.0 (first stable beta) is out. The first version used DuckDuckGo directly, which worked well enough to prove the idea, but yeah.. relying on one search source was way too fragile lol.
DDG started throwing limits/CAPTCHAs more often in the last 2… 25

Hugging Face Daily Papers research 21d ago Text-to-Image Models Need Less from Text Encoders Than You Think Abstract Text-to-image models primarily utilize basic text representation aspects like word merging and order rather than complex contextual information encoded in full text embeddings. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Text-to-image models rely on text prompts as… 36

Hugging Face Daily Papers research 21d ago Answer Presence Drives RAG Rewriting Gains Abstract Controlled interventions reveal that gold answer presence in rewritten contexts significantly boosts QA performance, with removal causing substantial F1 drops and injection improving results, while conventional probing methods show fragility to sentinel changes.… 35

Hugging Face Daily Papers research 21d ago Trajectory-Refined Distillation Abstract On-policy distillation suffers from prefix failure where dense token-level supervision creates fragmented gradients; trajectory-refined distillation addresses this by correcting student rollouts at the trajectory level before distillation. Generated by… 37

arXiv — Machine Learning research 21d ago UNIQ: Conformal Calibration for Adaptive Conservatism in Offline Reinforcement Learning arXiv:2606.07592v1 Announce Type: new Abstract: Offline reinforcement learning requires careful conservatism to mitigate distribution shift, yet most existing methods apply a fixed penalty uniformly across all states regardless of local data coverage.
We present UNIQ… 5

arXiv — Machine Learning research 21d ago A Topological Characterization of Graph Neural Networks via Stochastic Block Model Embeddings on the n-Sphere arXiv:2606.07598v1 Announce Type: new Abstract: We propose a topological framework for comparing trained Graph Neural Networks (GNNs) by mapping the Stochastic Block Models (SBMs) induced on the graphon-signal space of a Message Passing Neural Network (MPNN) onto the unit… 15

arXiv — Machine Learning research 21d ago Large Language Models Should Learn Personalized Rather Than Aggregated Human Preferences arXiv:2606.07629v1 Announce Type: new Abstract: Current approaches to aligning large language models (LLMs) aggregate diverse human preferences into a single reward signal, effectively optimizing for a hypothetical "average user" who represents no real person particularly… 10

arXiv — Machine Learning research 21d ago Temporal Coverage over Density: Parsimonious Training-Set Design for ML Climate Downscaling arXiv:2606.07898v1 Announce Type: new Abstract: High-resolution regional climate simulations provide critical information for climate impacts assessments but remain computationally expensive, motivating the development of machine-learning downscalers and emulators. A key… 16

arXiv — Machine Learning research 21d ago Minibatch Selection via Partition Matroid Constrained Gradient Matching arXiv:2606.07954v1 Announce Type: new Abstract: Training large language models (LLMs) on heterogeneous data requires selecting minibatches that balance convergence speed with coverage across domains.
Existing methods either select samples independently within each domain or rely… 5

arXiv — Machine Learning research 21d ago CausShield: Sample Reconstruction-Resilient Vertical FL via Causal Representation Learning arXiv:2606.08027v1 Announce Type: new Abstract: Vertical federated learning (VFL) is a distributed learning paradigm that leverages vertically partitioned features across isolated parties without sharing raw samples; however, it remains vulnerable to active sample reconstruction… 38

llama.cpp releases dev-tools 21d ago b9568 mtp: support for gemma-4 E2B and E4B assistants (#24282) models: update converter to support smaller assistants models: add masked_embd tensors to gemma4-assist arch gemma-4: remove temp debug for conversion gemma-4-mtp: filter out masked_embedding tensors during conversion… 23

r/MachineLearning community 21d ago Why I stopped using semantic embeddings for tool selection and switched back to BM25 [D] I've been building agents for about a year and recently shipped one for a client running ~140 MCP-exposed tools at peak. Along the way I made the canonical mistake. I used cosine similarity over tool description embeddings to pick which tools the model could see per turn. Worked… 9

Hugging Face Daily Papers research 21d ago GENEB: Why Genomic Models Are Hard to Compare Abstract GENEB presents a comprehensive benchmark for evaluating genomic foundation models across diverse tasks and architectures under a unified protocol.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct Progress in genomic foundation models is difficult to assess due to fragmented… 25

arXiv — Machine Learning research 22d ago FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models arXiv:2606.06547v1 Announce Type: new Abstract: Diffusion Large Language Models (dLLMs) refine tokens iteratively but commit them irreversibly, leading to a "stability lag" where early decisions remain fragile even after being written. We reveal that Post-Training Quantization… 21

arXiv — Machine Learning research 22d ago PandaAI: A Practical Agent CQ2 for Neuro-symbolic Data Analysis And Integrated Decision-Making in Quantitative Finance arXiv:2606.06823v1 Announce Type: new Abstract: While deep learning has excelled in various domains, its application to sequential decision-making in finance remains challenging due to the low Signal-to-Noise Ratio (SNR) and non-stationarity of financial data. Leveraging the… 20

arXiv — Machine Learning research 22d ago Accelerating Multi-Objective Bayesian Optimisation via Predictive-Gradient Catalysts arXiv:2606.06984v1 Announce Type: new Abstract: This paper presents a general acceleration mechanism for multi-objective Bayesian optimisation (MOBO) that leverages Gaussian process predictive gradients as auxiliary signals.
Rather than replacing existing Pareto-compliant… 7

arXiv — Machine Learning research 22d ago Closed-Form Spectral Regularization for Multi-Task Model Merging arXiv:2606.07289v1 Announce Type: new Abstract: Model merging combines several independently fine-tuned experts into a single multi-task model without any training data, reducing the storage, serving, and decentralized-development costs of large foundation models.… 38

arXiv — Machine Learning research 22d ago Bootstrap Theory of Representational Emergence: Explanatory Insufficiency as a Driver of Representation Learning and World Models arXiv:2606.07303v1 Announce Type: new Abstract: Representation learning is central to modern machine learning, enabling transitions from handcrafted features to learned embeddings, latent spaces, foundation models, world models, and digital twins. Yet most research examines how… 29

arXiv — Machine Learning research 22d ago Graph Neural Network leveraging Higher-order Class Label Connectivity for Heterophilous Graphs arXiv:2606.07475v1 Announce Type: new Abstract: Node classification in graph neural networks (GNNs) has been widely applied in various fields of graph analysis. GNNs achieve high-accuracy node classification in homophilous graphs, where nodes with the same class label tend to be… 32

arXiv — NLP / Computation & Language research 22d ago Evidence Graph Consistency in Retrieval-Augmented Generation: A Model-Dependent Analysis of Hallucination Detection arXiv:2606.06748v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) reduces but does not eliminate hallucination in large language models.
Existing detection methods rely on flat similarity between generated answers and retrieved passages, ignoring structural… 36

arXiv — NLP / Computation & Language research 22d ago A Four-Condition Diagnostic Protocol for Evidence Utilization in Long-Context and Retrieval-Augmented Language Models arXiv:2606.06758v1 Announce Type: new Abstract: Final-answer accuracy, retrieval recall, and citation overlap do not by themselves identify whether a long-context or retrieval-augmented language model used the evidence it was given. A model can answer from parametric memory,… 21