Tag

Rag

500 articles archived under #rag · RSS

arXiv — Machine Learning research 4d ago

A Generalization Theory for JEPA-Based World Models

arXiv:2606.27014v1 Announce Type: new Abstract: Joint Embedding Predictive Architectures (JEPAs) have recently emerged as a promising paradigm for world modeling by learning predictive dynamics in a latent space rather than generating future observations at the input level.…

5
arXiv — Machine Learning research 4d ago

Cross-Head Attention Uplift Network with Inverse Propensity Score under Unobserved Confounding

arXiv:2606.27114v1 Announce Type: new Abstract: Uplift modeling, crucial for estimating individual treatment effects (ITE), faces dual challenges: flexibly leveraging inter-group similarity to enhance discriminative power and debiasing under unobserved confounding scenarios. In…

19
arXiv — Machine Learning research 4d ago

BetXplain: An Explanation-Annotated Dataset for Detecting Manipulative Betting Advertisements on Social Media

arXiv:2606.27274v1 Announce Type: new Abstract: The promotion of betting applications on social media platforms has increased significantly in recent years. Many of these advertisements use persuasive techniques that may mislead users, encourage risky behavior, and potentially…

37
arXiv — NLP / Computation & Language research 4d ago

Charting the Growth of Social-Physical HRI (spHRI): A Systematic Review Pipeline Augmented by Small Language Models

arXiv:2606.26382v1 Announce Type: new Abstract: Social-physical human-robot interaction (spHRI) has grown rapidly across robotics, human-computer interaction, human-robot interaction, and haptics. Yet, fragmented terminology and inconsistent methodologies make systematic…

35
arXiv — NLP / Computation & Language research 4d ago

ProvenAI: Provenance-Native Traces of Evidence in Generated Answers

arXiv:2606.26449v1 Announce Type: new Abstract: Retrieval-augmented systems routinely present citations alongside generated answers, yet a citation does not confirm that the corresponding source meaningfully shaped the output. This paper introduces ProvenAI, a framework that…

17
arXiv — NLP / Computation & Language research 4d ago

Speaking Numbers to LLMs: Multi-Wavelet Number Embeddings for Time Series Forecasting

arXiv:2606.26487v1 Announce Type: new Abstract: Large language models (LLMs) are attractive for context-aware time series forecasting because they can integrate heterogeneous textual signals, yet their discrete, language-oriented tokenization and embedding interfaces are…

21
arXiv — NLP / Computation & Language research 4d ago

Comparing BERT Sentence-Pair Classification and Few-Shot LLM Prompting for Detecting Threat and Solution Framing in German Climate News

arXiv:2606.26489v1 Announce Type: new Abstract: News media play a central role in shaping public perceptions of climate change, and whether coverage emphasizes threats or solutions has measurable effects on audience engagement and policy support. Automated detection of these…

23
arXiv — NLP / Computation & Language research 4d ago

Temporal Validity in Retrieval Memory: Eliminating Stale-Fact Errors for AI Agents over Evolving Knowledge

arXiv:2606.26511v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) gives agents access to accumulated knowledge, but has no model of time. When a fact changes (e.g., a function is renamed or API restructured), RAG retrieves both the stale and current value with…

6
arXiv — NLP / Computation & Language research 4d ago

Improving General Role-Playing Agents via Psychology-Grounded Reasoning and Role-Aware Policy Optimization

arXiv:2606.27025v1 Announce Type: new Abstract: Building general-purpose role-playing agents that faithfully portray any character from a natural-language profile remains challenging. The dominant paradigm -- supervised fine-tuning -- encourages behavioral mimicry without deep,…

16
arXiv — NLP / Computation & Language research 4d ago

From Clicks to Intent: Cross-Platform Session Embeddings with LLM-Distilled Taxonomy for Financial Services Recommendations

arXiv:2606.26277v1 Announce Type: cross Abstract: Sequential user behavior modeling is widely adopted in industrial recommender systems; however, significant gaps remain in financial services, where pre-login web interactions and authenticated in-app experiences differ…

24
arXiv — NLP / Computation & Language research 4d ago

Jailbreaking for the Average Jane: Choosing Optimal Jailbreaks via Bandit Algorithms for Automatically Enhanced Queries

arXiv:2606.26936v1 Announce Type: cross Abstract: With a profusion of jailbreaks for LLMs now widely known, a growing concern is that non-expert malicious actors ("the average Jane") could elicit actionable responses to malicious requests. In this work, we examine whether this…

36
arXiv — NLP / Computation & Language research 4d ago

GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

arXiv:2506.15681v4 Announce Type: replace Abstract: Recent advancements in vision-language models (VLMs) have leveraged large language models (LLMs) to achieve performance on par with closed-source systems like GPT-4V. However, deploying these models in real-world scenarios,…

16
Vercel — AI dev-tools 4d ago

Sandboxes now expire based on last use

Vercel Sandbox snapshots now expire based on when they were last used, not when they were created. Active snapshots stay alive as long as workflows depend on them, while unused snapshots expire on their retention policy to keep storage in check. Every time a snapshot is used,…

7
r/LocalLLaMA community 4d ago

Fast medical RAG API to give your local LLMs access to facts

I created a simple RAG API using medical Wikipedia articles that you can point your agent to and use freely. It may be useful in allowing your local LLMs access to medical facts they might not be able to recall from their weights. I'm aiming for subsecond responses but cannot…

7
arXiv — Machine Learning research 5d ago

From Meta Idea to Advanced Mathematical Discovery -- Human-AI Co-Discovery of Sign-Embedding Quantum Algorithms

arXiv:2606.24899v1 Announce Type: new Abstract: AI-assisted mathematics is often evaluated on solving predefined problems. In practice, however, many important advances begin earlier, when a vague research intuition is transformed into a concrete problem, a promising route, and…

37
arXiv — Machine Learning research 5d ago

Holographic Memory for Zero-Shot Compositional Reasoning in Knowledge Graphs: A Mechanistic Study of Where and Why It Fails

arXiv:2606.24948v1 Announce Type: new Abstract: Knowledge graph embedding (KGE) models predict single-hop links well but have no mechanism for zero-shot compositional queries: multi-hop questions whose relation chains never appeared during training. Holographic Reduced…

31
arXiv — Machine Learning research 5d ago

A Single Stepsize Suffices for Unprojected Linear TD(0): Simultaneous Robust and Fast Rates via Polyak--Ruppert Averaging

arXiv:2606.24981v1 Announce Type: new Abstract: We study linear TD(0) under Markovian sampling, where data are generated along a single trajectory. We provide high-probability guarantees for a plain unprojected TD(0) algorithm with Polyak-Ruppert (PR) averaging, using a single…

38
arXiv — Machine Learning research 5d ago

Retrieval-Augmented Personalization with Foundation Models for Wearable Stress Detection

arXiv:2606.24985v1 Announce Type: new Abstract: Personalization in wearable-based stress detection remains challenging due to substantial inter-individual variability in physiological and behavioral responses. While traditional approaches rely on user-specific fine-tuning or…

5
arXiv — Machine Learning research 5d ago

What's in an Earth Embedding? An Explainability Analysis of Location Encoders

arXiv:2606.24997v1 Announce Type: new Abstract: Geographic implicit neural representations (INRs) learn to map any coordinate on Earth to a location embedding, implicitly encoding geospatial data into the weights of a neural network. Location embeddings are widely used off the…

15
arXiv — Machine Learning research 5d ago

Bias-Controlled Primal-Dual Natural Actor-Critic: Optimal Rates for Constrained Multi-Objective Average-Reward RL

arXiv:2606.25012v1 Announce Type: new Abstract: Many reinforcement learning (RL) problems in the infinite-horizon average-reward setting require optimizing multiple conflicting objectives while satisfying multiple safety constraints. A common approach is concave scalarization,…

27
arXiv — Machine Learning research 5d ago

Training for the Model You Return: Improving Optimization for Iterate-Averaged Language Models

arXiv:2606.25086v1 Announce Type: new Abstract: Many modern Language Model (LM) pipelines return an averaged model, such as an exponential moving average of the training iterates, rather than the final iterate itself. This raises a fundamental question: given that we will return…

15
arXiv — NLP / Computation & Language research 5d ago

Error-Aware TF-IDF Retrieval-Augmented Generation for ASR Error Correction

arXiv:2606.24915v1 Announce Type: new Abstract: End-to-end automatic speech recognition systems frequently hallucinate rare entities and domain-specific terms, especially in low-resource languages. While retrieval-augmented generation frameworks can mitigate these errors using…

18
arXiv — NLP / Computation & Language research 5d ago

Memory Makes the Difference: Evaluating How Different Memory Roles Shape Conversational Agents

arXiv:2606.25361v1 Announce Type: new Abstract: Prior research on memory mechanisms in RAG-based conversational systems has emphasized how memory is stored and retrieved. However, far less is known about how memories with different functional roles influence response quality.…

27
arXiv — NLP / Computation & Language research 5d ago

Story Operators: Decomposing the Original $\to$ Sequel Transformation in Embedding Space

arXiv:2606.25379v1 Announce Type: new Abstract: I treat a book as a point in a sentence-embedding space and a literary transformation as an operation on points. Given an original novel and its sequel, I ask what it takes, geometrically, to turn the first into the second. Using…

18
arXiv — NLP / Computation & Language research 5d ago

SFL-MTSC: Leveraging Semantic Frame-Level Multi-Task Self-Consistency for Robust Multi-Intent Spoken Language Understanding

arXiv:2606.25552v1 Announce Type: new Abstract: Prompt-based spoken language understanding (SLU) with large language models (LLMs) often suffers from inconsistent intent--slot structures due to decoding stochasticity, particularly in multi-intent scenarios. In view of this, we…

28
arXiv — NLP / Computation & Language research 5d ago

Is GraphRAG Needed? From Basic RAG to Graph-/Agentic Solutions with Context Optimization

arXiv:2606.25656v1 Announce Type: new Abstract: As advanced RAG variants like GraphRAG and Agentic RAG emerge, one leading question is when and how to use them. Here, we introduce a framework for different RAG scenarios evaluation and comparison on semi-structured knowledge…

21
arXiv — NLP / Computation & Language research 5d ago

BitNet Text Embeddings

arXiv:2606.25674v1 Announce Type: new Abstract: LLM-based text embedders have substantially improved retrieval and semantic representation quality, but their deployment remains costly: large backbone models slow down embedding inference, while high-dimensional full-precision…

32
arXiv — NLP / Computation & Language research 5d ago

Weave of Formal Thought

arXiv:2606.25987v1 Announce Type: new Abstract: Large language models (LLMs) attain remarkable surface fluency on code, yet they neither formally guarantee the syntactic validity of their output nor leverage the hierarchical structure defining the target language. While existing…

18
arXiv — NLP / Computation & Language research 5d ago

To Isolate or to Score? Model-Adaptive Assessment for Cost-Efficient Multi-Agent RAG

arXiv:2606.25191v1 Announce Type: cross Abstract: Multi-agent document assessment for retrieval-augmented generation is computationally expensive, driving practitioners toward smaller, deployable models whose assessment mechanisms remain poorly understood. We conduct a…

29
arXiv — NLP / Computation & Language research 5d ago

RAVEN: Long-Horizon Reasoning & Navigation with a Visuo-Spatio-Temporal Memory

arXiv:2606.25206v1 Announce Type: cross Abstract: Long-term robot deployment requires a compact and scalable memory that preserves fine-grained visual semantics, grounds observations in space and time, and enables efficient storage and retrieval. In this paper, we propose RAVEN,…

21
arXiv — NLP / Computation & Language research 5d ago

Security and Privacy in Retrieval-Augmented Generation: Architectures, Threats, Defenses, and Future Directions for Building Trustworthy Systems

arXiv:2606.25533v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) has emerged as a dominant paradigm for enhancing large language models with external knowledge. By coupling retrieval mechanisms with generative models, RAG systems improve factual grounding…

31
arXiv — NLP / Computation & Language research 5d ago

Tracing Target Answers in Poisoned Retrieval Corpora via Token Influence Attribution

arXiv:2606.25721v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) systems are vulnerable to corpus poisoning attacks that manipulate model outputs through malicious retrieved documents. Existing detection methods typically rely on auxiliary classifiers or…

30
arXiv — NLP / Computation & Language research 5d ago

Learning to Erase Private Knowledge from Multi-Documents for Retrieval-Augmented Large Language Models

arXiv:2504.09910v2 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) is a promising technique for applying LLMs to proprietary domains. However, retrieved documents may contain sensitive knowledge, posing risks of privacy leakage in generative results. Thus,…

4
arXiv — NLP / Computation & Language research 5d ago

How Pragmatics Shape Articulation: A Computational Case Study in STEM ASL Discourse

arXiv:2510.23842v2 Announce Type: replace Abstract: Most state-of-the-art sign language models are trained on interpreter or isolated vocabulary data, which overlooks the variability that characterizes natural dialogue. However, human communication dynamically adapts to contexts…

31
r/MachineLearning community 5d ago

High Dimensional, Dynamic Rotary Positional Embedding [P]

At the end of my last post, I presented an idea: what if I used the core of my last project, the cumulative matrix product, and repurposed it as a positional embedding? I just finished fleshing out the math behind HDD-RoPE and training a model with this positional embedding…

31
Hugging Face Daily Papers research 5d ago

ChartWalker: Benchmarking the Cross-Chart RAG Task

Abstract ChartWalker presents a novel framework for cross-chart retrieval-augmented generation with hierarchical knowledge graph construction and structure-aware sampling for challenging multi-modal analytical tasks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Cross-Chart…

33
r/LocalLLaMA community 5d ago

llama.cpp updates - granite-speech-4.1-2b, LFM2.5-ColBERT/Embedding-350M, Vulkan backend related changes & Misc items

Supported Models: granite-speech-4.1-2b-plus by 24818; LFM2.5-ColBERT-350M & LFM2.5-Embedding-350M by 24913. Vulkan: vulkan: link ggml-cpu when GGML_VULKAN_CHECK_RESULTS / RUN_TESTS are enabled #24444; vulkan: make mul_mm ALIGNED a spec constant #24689; vulkan: support CONV_3D…

27
llama.cpp releases dev-tools 5d ago

b9777

model: Add LFM2.5-ColBERT-350M and LFM2.5-Embedding-350M (#24913). Restore LFM2 models in README.md. macOS/iOS: macOS Apple Silicon (arm64); macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED; macOS Intel (x64); iOS…

7
arXiv — Machine Learning research 6d ago

Reconstructing GRACE Terrestrial Water Storage with Spatio-Temporal Graph Neural Networks: An Application to South America

arXiv:2606.23833v1 Announce Type: new Abstract: Terrestrial water storage (TWS) integrates snow, soil moisture, surface water, and groundwater and is a key indicator of how climate variability and human activity reshape the global water cycle. The GRACE and GRACE-FO satellite…

21
arXiv — Machine Learning research 6d ago

Learning to Trigger: Reinforcement Learning at the Large Hadron Collider

arXiv:2606.23993v1 Announce Type: new Abstract: High-throughput scientific facilities such as the Large Hadron Collider depend on real-time event filtering (triggering) under tight constraints on bandwidth, latency, and storage. In practice, trigger menus are largely…

24
arXiv — Machine Learning research 6d ago

Rapid FinFET Modelling Using an Autoencoder

arXiv:2606.24046v1 Announce Type: new Abstract: This work presents a machine learning framework that leverages an autoencoder (AE) for the efficient modeling of FinFET. We first calibrated a BSIM-CMG model to generate a dataset of current-voltage (ID-VG) characteristics. This…

7
arXiv — NLP / Computation & Language research 6d ago

Blockwise Policy-Drift Gating for On-Policy Distillation

arXiv:2606.24084v1 Announce Type: cross Abstract: On-policy distillation (OPD) trains a student policy using teacher signals computed on trajectories sampled by the student itself. Recent work shows that sampled-token OPD can be fragile on long-horizon reasoning tasks and that…

30
arXiv — NLP / Computation & Language research 6d ago

Do LLM Attribution Metrics Transfer? Auditing Retrieval-Augmented Generation Evaluation Across Datasets and Constructs

arXiv:2606.23915v1 Announce Type: new Abstract: Practice often treats automatic metrics for attribution in LLM retrieval-augmented generation as interchangeable. We audit eight automatic scorers -- lexical, embedding, and BERTScore baselines alongside…

28
arXiv — NLP / Computation & Language research 6d ago

Does My Embedding Reflect That $A = B$? Evaluating Mathematical Equivalence in Embedding Models

arXiv:2606.23959v1 Announce Type: new Abstract: Because mathematics is highly abstract, a single statement can take very different forms depending on what subfield it is framed in. There are many examples where breakthroughs occurred after researchers discovered that a question…

25
arXiv — NLP / Computation & Language research 6d ago

RASC+: Retrieval-Constrained LLM Adjudication for Clinical Value Set Authoring

arXiv:2606.23992v1 Announce Type: new Abstract: Clinical value sets define the standardized terminology codes used in quality measurement, phenotyping, cohort construction, and clinical decision support. The recently introduced Retrieval-Augmented Set Completion (RASC) benchmark…

32
arXiv — NLP / Computation & Language research 6d ago

A Pāninian Foundation for Indic Language Processing

arXiv:2606.24172v1 Announce Type: new Abstract: More than a billion people communicate in Indic languages, yet the natural language processing infrastructure serving them remains fragmented and underdeveloped. The cause is structural: the field organizes its tools and benchmarks…

24
arXiv — NLP / Computation & Language research 6d ago

EXPO-SQL: Execution-based Clause-level Policy Optimization for Text-to-SQL

arXiv:2606.23693v1 Announce Type: new Abstract: Text-to-SQL enables users to query databases using natural language by generating executable SQL queries. Recent methods have increasingly adopted Large Language Models based reinforcement learning (RL) to leverage execution…

18
arXiv — NLP / Computation & Language research 6d ago

Quantifying Prior Dominance in RAG Systems

arXiv:2606.23695v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) grounds Large Language Models in external knowledge, yet current evaluations rely on discrete heuristics that suffer from "epistemic blindness" - failing to distinguish genuine contextual…

28
arXiv — NLP / Computation & Language research 6d ago

MMed-Bench-IR: A Heterogeneous Benchmark for Multilingual Medical Information Retrieval

arXiv:2606.24200v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) in clinical settings increasingly requires multilingual retrieval against predominantly English evidence corpora. Multilingual medical retrieval demands three capabilities: cross-lingual…

36
arXiv — NLP / Computation & Language research 6d ago

Prague Dependency Treebank -- Consolidated 2.0: Enriching a Complex Annotation Scheme

arXiv:2606.24324v1 Announce Type: new Abstract: The Prague Dependency Treebank framework is unique in its attempt to systematically include and link different layers of language, including a meaning representation with several types of inter-sentential phenomena, especially…

12
