Tag

Rag

500 articles archived under #rag

arXiv — NLP / Computation & Language research 1mo ago

SERC: LDPC-Inspired Semantic Error Correction for Retrieval-Augmented Generation

arXiv:2605.28837v1 Announce Type: new Abstract: While Large Language Models (LLMs) have demonstrated remarkable capabilities, their reliability is significantly compromised by hallucinations. Existing intrinsic self-correction methods attempt to address this, but often fail due…

17
arXiv — NLP / Computation & Language research 1mo ago

Same Question, Different Source, Different Answer: Auditing Source-Dependence in Medical Multi-Source RAG

arXiv:2605.29084v1 Announce Type: new Abstract: A retrieval-augmented generation (RAG) system deployed over a multi-author institutional corpus can give a different answer to the same question depending on which source it retrieves -- a failure mode the dominant…

24
arXiv — NLP / Computation & Language research 1mo ago

Slogans or Stance? A Label-Light Diagnostic for Entrepreneurial-Discourse Measurement on Chinese SOE Speeches

arXiv:2605.29188v1 Announce Type: new Abstract: Dictionary methods, topic models, and embedding-similarity scorers are widely used in CSS and management research to measure constructs such as "entrepreneurial spirit" in corporate speeches. We contribute a label-light measurement…

5
arXiv — NLP / Computation & Language research 1mo ago

SkillBrew: Multi-Objective Curation of Skill Banks for LLM Agents

arXiv:2605.29440v1 Announce Type: new Abstract: Retrieval-augmented LLM agents increasingly rely on curated skill banks: collections of reusable textual principles that guide decision making on complex tasks. Existing approaches typically expand these banks in an append-only…

12
arXiv — NLP / Computation & Language research 1mo ago

Kronecker Embeddings: Byte-Level Structured Token Representations for Parameter-Efficient Language Models

arXiv:2605.29459v1 Announce Type: new Abstract: Large language models route every input through a learned embedding table of shape |V| x d_model, consuming hundreds of millions to billions of trainable parameters at frontier scale. We introduce Kronecker Embeddings, a…

38
arXiv — NLP / Computation & Language research 1mo ago

Entity-Collision: A Stratified Protocol for Attributing Retrieval Lift in Agent Memory

arXiv:2605.29630v1 Announce Type: new Abstract: End-to-end agent-memory benchmarks report a single hit@k per retriever, confounding lexical leakage (uncontrolled query/gold/distractor entity overlap) with tag-mixing (preferences, services, tools averaged together). We propose…

10
arXiv — NLP / Computation & Language research 1mo ago

Teaching Language Models to Check Grounded Claim Factuality with Human Test-Taking Strategies

arXiv:2605.29712v1 Announce Type: new Abstract: Grounded claim factuality checking is important for large language model (LLM) applications such as retrieval-augmented generation, as it helps users assess the correctness of generated outputs. Existing metrics using entailment…

37
arXiv — NLP / Computation & Language research 1mo ago

Leveraging Routing Dynamics in Mixture-of-Experts Models for Efficient Language Adaptation

arXiv:2605.29714v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) models are widely used to scale language models, yet their expert routing behavior and adaptation in a multilingual setting remain underexplored. In this work, we study multilingual routing dynamics during…

20
arXiv — NLP / Computation & Language research 1mo ago

User-Aware Active Knowledge Acquisition for Emotional Support Dialogue

arXiv:2605.29715v1 Announce Type: new Abstract: Emotional support plays an important role in dialogue systems, and its success depends on adapting to a user's evolving and implicit needs across multi-turn interactions while leveraging the strong reasoning capacity of large…

5
The Information — AI news-outlet 1mo ago

Dell Shares Rise Nearly 40% After Surge in AI Server Sales

Customer demand for AI is pouring jet fuel on Dell’s server, storage and networking businesses, as its overall revenue grew 88% to $43.8 billion during its April quarter compared to last year, beating its earlier forecast by more than $8 billion. Excluding items, Dell reported…

29
r/LocalLLaMA community 1mo ago

What's your favorite local MCP server?

I've seen so many rag this, memory that projects. What projects are people actually using day to day for agentic workloads? I only use 4, and I still consider that too much honestly. I just want to see what projects people recommend so I can bulk up or trim down my list. …

15
Smol AI News news-outlet 1mo ago

Anthropic raises $65B in Series H at a $965B post-money valuation, releases Opus 4.8 and Dynamic Workflows

**Anthropic** announced a massive **$65B Series H financing** at a **$965B valuation**, led by **Altimeter, Dragoneer, Greenoaks, and Sequoia**, with run-rate revenue surpassing **$47B**. They launched **Claude Opus 4.8**, an update to Opus 4.7 featuring "sharper judgment,"…

28
arXiv — Machine Learning research 1mo ago

Transferable Reinforcement Learning via Probabilistic Latent Embeddings and Dynamic Policy Adaptation for Sim-to-Real Deployment

arXiv:2605.27659v1 Announce Type: new Abstract: Due to limited resources and public safety concerns, deep reinforcement learning (RL) agents for many cyber-physical systems (e.g., autonomous vehicles) are first trained in simulators. However, when deployed in real world…

15
arXiv — Machine Learning research 1mo ago

Heterogeneous Parallelism for Multimodal Large Language Model Training

arXiv:2605.27678v1 Announce Type: new Abstract: Foundation model training is becoming multimodal, from post-training pipelines to large-scale pretraining. As modality coverage broadens, context windows grow, and encoder LLM scales diverge, a single LLM-centric TP/CP/PP/DP/EP…

34
arXiv — Machine Learning research 1mo ago

Restoring the Sweet Spot: Pass-Rate Weighted Self-Distillation for LLM Reasoning

arXiv:2605.27765v1 Announce Type: new Abstract: Self-Distillation Policy Optimization (SDPO) provides dense token-level credit assignment for reinforcement learning with large language models by leveraging the model's own feedback-conditioned predictions as a self-teacher.…

34
arXiv — Machine Learning research 1mo ago

QuITE: Query-Based Irregular Time Series Embedding

arXiv:2605.28166v1 Announce Type: new Abstract: Irregular Multivariate Time Series (IMTS) are common in practice, yet their irregular sampling complicates effective modeling. Existing approaches typically either (i) design specialized architectures that limit the reuse of proven…

7
arXiv — Machine Learning research 1mo ago

Hierarchical Synthetic Tabular Data Generation: A Hybrid Top-Down and Bottom-Up Framework

arXiv:2605.28198v1 Announce Type: new Abstract: Existing approaches for synthetic tabular data generation are based on either purely generative models or LLMs, both of which struggle with data heterogeneity, logical consistency, rare-event coverage, and robustness in low-data…

4
arXiv — Machine Learning research 1mo ago

IRDS: Interpretable RLVR Data Selection via Verifier-Coupled Sparse Autoencoder Coverage

arXiv:2605.28247v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has become a key technique for enhancing LLM reasoning, yet its data inefficiency remains a major bottleneck. Existing methods address this problem only partially, each…

7
arXiv — NLP / Computation & Language research 1mo ago

RAG-Coding: Enhancing LLM Medical Coding with Structured External Knowledge

arXiv:2605.27377v1 Announce Type: new Abstract: We present RAG-Coding, an agentic method for automated ICD-10-CM coding. RAG-Coding orchestrates four large language model (LLM) agents and grounds their coding decisions in external knowledge sources (e.g. the official coding…

33
arXiv — NLP / Computation & Language research 1mo ago

The Fragility of Chain-of-Thought Monitoring Across Typologically Diverse Languages

arXiv:2605.27901v1 Announce Type: new Abstract: Chain-of-thought (CoT) monitoring has been proposed as a promising safety mechanism for detecting misaligned behavior in large language models. However, its reliability remains largely unexplored beyond English and across diverse…

30
arXiv — NLP / Computation & Language research 1mo ago

AI Research Agents Narrow Scientific Exploration

arXiv:2605.27905v1 Announce Type: new Abstract: AI research agents can now generate research ideas, design experiments, run code, and draft papers, raising the possibility of large-scale AI-assisted scientific discovery. Many current agent frameworks explicitly encourage the…

4
arXiv — NLP / Computation & Language research 1mo ago

Boundary Suppression Asymmetry in Post-trained Assistants: Over-expansion as a Controllability Cost

arXiv:2605.27969v1 Announce Type: new Abstract: Post-trained language-model assistants are often optimized to avoid under-answering, encouraging complete, helpful, cautious, and proactive responses. We ask whether this optimization creates asymmetric controllability costs: when…

38
arXiv — NLP / Computation & Language research 1mo ago

Beyond Chunk-Local Extraction: Cross-Chunk Graph Augmentation for GraphRAG

arXiv:2605.28004v1 Announce Type: new Abstract: GraphRAG extends retrieval-augmented generation by organizing corpora as explicit knowledge graphs, enabling graph-based retrieval for complex question answering. However, existing frameworks extract entities and relations within…

15
arXiv — NLP / Computation & Language research 1mo ago

PromptEmbedder: Efficient and Transferable Text Embedding via Dual-LLM Soft Prompting

arXiv:2605.28066v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable efficacy in text embedding, yet current adaptation methods like LoRA face significant bottlenecks in computational efficiency and cross-architecture transferability.…

17
arXiv — NLP / Computation & Language research 1mo ago

ConRAG: Consensus-Driven Multi-View Retrieval for Multi-Hop Question Answering

arXiv:2605.28093v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) has emerged as a promising paradigm for enhancing large language models (LLMs) on multi-hop question answering (QA), which requires reasoning over evidence from multiple documents. Current…

19
arXiv — NLP / Computation & Language research 1mo ago

LegalGraphRAG: Multi-Agent Graph Retrieval-Augmented Generation for Reliable Legal Reasoning

arXiv:2605.28120v1 Announce Type: new Abstract: Graph-based Retrieval-Augmented Generation (GraphRAG) advances flat document retrieval by structuring knowledge as relational graphs, enabling more coherent and effective reasoning. However, applying it to specific domains like…

10
arXiv — NLP / Computation & Language research 1mo ago

Chinese Word Boundary Recovery through Character Alignment Projection

arXiv:2605.28128v1 Announce Type: new Abstract: Chinese word segmentation is especially fragile in non-standard text, where language learner errors and other character-level divergences disrupt the word boundaries assumed by downstream annotation and evaluation. This paper…

30
arXiv — NLP / Computation & Language research 1mo ago

The Harder Text Embedding Benchmark (HTEB): Beyond One-dimensional Static Robustness

arXiv:2605.28190v1 Announce Type: new Abstract: Embedding benchmarks like MTEB report a single score per model, implicitly treating robustness as a static, scalar property. We argue that embedding robustness is multidimensional, since models respond differently to different…

15
arXiv — NLP / Computation & Language research 1mo ago

Analyzing Quality-Latency-Resource Trade-offs in a Technical Documentation RAG Assistant Using LoRA Adaptation

arXiv:2605.28222v1 Announce Type: new Abstract: We study quality-latency-resource trade-offs in a documentation-grounded retrieval-augmented generation (RAG) system that uses Low-Rank Adaptation (LoRA) of the generator. We build a manually verified benchmark of 5,144…

17
arXiv — NLP / Computation & Language research 1mo ago

When Seekers Are Hard to Help: Evaluating Emotional Support Dialogue Systems in Worst-Case Interactions

arXiv:2605.28228v1 Announce Type: new Abstract: Emotional Support Dialogue Systems (ESDSes) are increasingly evaluated and trained with LLM-simulated seekers. However, such simulated seekers often behave as cooperative, average-case users who disclose clearly, respond…

29
r/LocalLLaMA community 1mo ago

CrankGPT by Squeez Labs - hand-cranked edge AI - talk about local AI!!!

I met Katrin from Squeez Labs at an event hosted by Pathway AI (the team behind Baby Dragon Hatchling) where she told me about CrankGPT, a literally hand-cranked device for running local LLMs. It's apparently real. It's apparently launched. It's apparently glorious. Check it…

15
Hacker News — AI on Front Page community 1mo ago

I'm Getting into Mesh Networks (Meshtastic, MeshCore, and Reticulum)

Article URL: https://www.jonaharagon.com/posts/im-getting-into-mesh-networks-meshtastic-meshcore-and-reticulum/ Comments URL: https://news.ycombinator.com/item?id=48299638 Points: 211 # Comments: 78

19
r/MachineLearning community 1mo ago

EMA-Gated Temporal Sequence Compression in Vision Transformers [P]

Vision Transformers waste 90% of their compute recalculating stationary asphalt. NeuroFlow tracks semantic surprise in embedding space, physically eliminating background tokens before the encoder. NeuroFlow is a dynamic routing framework for Vision Transformer video inference.…

34
r/LocalLLaMA community 1mo ago

Is Granite-4.1-30b Overshadowed by Qwen3.6 & Gemma4 models?

I don't see any threads on this model. Is it because it's dense and/or without reasoning? Anyone tried this for coding? Capabilities: summarization, text classification, text extraction, question-answering, Retrieval Augmented Generation (RAG), code-related tasks, function-calling…

38
Hugging Face Daily Papers research 1mo ago

STREAM: A Data-Centric Framework for Mining High-Value Task-Oriented Dialogues from Streaming Media

Abstract A data-centric framework called Stream generates large-scale, multi-domain service dialogues by synthesizing interactions from streaming media, incorporating persona construction and conversational blueprints with retrieval-augmented generation for knowledge-aware…

9
Hugging Face Daily Papers research 1mo ago

Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini

Abstract Gemini Embedding 2 is a multimodal embedding model that generates unified representations for video, audio, image, and text data, achieving superior performance across diverse retrieval tasks and demonstrating strong zero-shot capabilities across specialized domains.…

18
Hugging Face Daily Papers research 1mo ago

VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions

Abstract VitaBench 2.0 evaluates personalized and proactive agent behavior in long-term user interactions by requiring continuous extraction and updating of user preferences from fragmented interactions. AI-generated summary Large language models (LLMs) have evolved into…

27
arXiv — Machine Learning research 1mo ago

Planning Neural Dynamics with Lie Group Embedding through Supervised Projective Manifold Learning

arXiv:2605.26167v1 Announce Type: new Abstract: We propose Lie group embedded dynamical neural networks (LieEDNN) and the corresponding learning algorithms based on gradient descent and metric projection on smooth manifold, where we treat Lie group as an intrinsic representation…

28
arXiv — Machine Learning research 1mo ago

Reparametrizing Shampoo and SOAP for Subspace Basis Updates and BFloat16 Storage

arXiv:2605.26327v1 Announce Type: new Abstract: Shampoo-based methods, such as KL-Shampoo and SOAP, have demonstrated strong performance in training neural networks and rely on QR decomposition. Because existing QR implementations require single-precision (FP32) arithmetic and…

6
arXiv — Machine Learning research 1mo ago

Adversarial Training for Robust Coverage Network under Worst-case Facility Losses

arXiv:2605.26763v1 Announce Type: new Abstract: The Maximal Covering Location-Interdiction Problem (MCLIP) is a classic bi-level optimization problem, which is fundamental to resilient infrastructure planning yet remains computationally intractable. Specifically, the upper level…

5
arXiv — Machine Learning research 1mo ago

MONA: Muon Optimizer with Nesterov Acceleration for Scalable Language Model Training

arXiv:2605.26842v1 Announce Type: new Abstract: The Muon optimizer has recently offered a promising alternative to AdamW for large language model training, leveraging matrix orthogonalization to produce geometry-aware updates. However, like all first-order methods, Muon can…

14
arXiv — Machine Learning research 1mo ago

SPHERE-JEPA: Spherical Prediction with Homogeneous Embeddings

arXiv:2605.26900v1 Announce Type: new Abstract: A fundamental open question in self-supervised learning (SSL) is the explicit characterization of the optimal geometry of the learned representations. Recently, LeJEPA identified isotropic Gaussian embeddings as optimal for…

38
arXiv — NLP / Computation & Language research 1mo ago

In-Context Optimization for Retrieval-Augmented Generation: A Gradient-Descent Perspective

arXiv:2605.26356v1 Announce Type: new Abstract: In-context learning has recently been linked to implicit gradient descent in linear self-attention models, suggesting that context can induce a forward-pass update. Retrieval-augmented generation (RAG) also relies on context, but…

10
arXiv — NLP / Computation & Language research 1mo ago

FAB-Bench: A Framework for Adaptive RAG Benchmarking in Semiconductor Manufacturing

arXiv:2605.26476v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has become critical for knowledge-intensive applications, yet evaluating its performance in vertical domains remains difficult due to domain complexity, diverse context scales, and heavy…

6
arXiv — NLP / Computation & Language research 1mo ago

Hubness, Not Anisotropy, Drives Cross-Lingual Retrieval Asymmetry in Multilingual Embedding Models

arXiv:2605.26575v1 Announce Type: new Abstract: Multilingual embedding models are deployed under the assumption that cross-lingual retrieval is symmetric: if a query in language A retrieves its translation in language B, the reverse should also hold. In practice it does not.…

29
arXiv — NLP / Computation & Language research 1mo ago

DunbaaBERT: From Sacrifice to Semantics

arXiv:2605.26935v1 Announce Type: new Abstract: Large language models have achieved strong performance across many NLP tasks, yet Urdu remains comparatively underexplored due to limited resources and fragmented evaluation settings. To address this gap, we introduce DunbaaBERT, a…

22
arXiv — NLP / Computation & Language research 1mo ago

MiRD: Reliable Set-Valued Prediction for Open-Ended Question Answering via Miscoverage Risk Decomposition

arXiv:2605.27091v1 Announce Type: new Abstract: Reliable set-valued prediction provides a principled way to mitigate hallucinations in open-ended question answering (QA), yet existing conformal approaches typically rely on a fragile premise: finite sampling must already produce…

28
arXiv — NLP / Computation & Language research 1mo ago

LitSeg: Narrative-Aware Document Segmentation for Literary RAG

arXiv:2605.27156v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by incorporating external knowledge, particularly for long-tail domains such as literary works. However, the critical step of document segmentation in RAG…

18
arXiv — NLP / Computation & Language research 1mo ago

Grounding Text Embeddings in Stakeholder Associations

arXiv:2605.27168v1 Announce Type: new Abstract: Text embeddings are widely used to analyse large corpora of complex texts. However, it is unclear whether the embeddings capture the same semantic distances as the human experts using them. Ensuring alignment between embedding…

17
arXiv — NLP / Computation & Language research 1mo ago

The Coverage Illusion: From Pre-retrieval Routing Failure to Post-retrieval Cascades in a Production RAG System

arXiv:2605.27220v1 Announce Type: new Abstract: In modern RAG pipelines, query augmentation methods such as HyDE and query expansion are applied to every query, resulting in substantial LLM inference costs and increased end-to-end latency. The empirical justification for this…

31
