Tag: #rag · 500 articles archived under #rag

arXiv — NLP / Computation & Language research 28d ago Graph-Augmented Retrieval for Cross-Entity Financial Sentiment Analysis: A Comparative Study arXiv:2606.00062v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has become foundational for grounding large language models in domain-specific corpora, yet conventional vector-based RAG systems are fundamentally limited in their ability to capture the… 23

arXiv — NLP / Computation & Language research 28d ago DLLM-JEPA: Joint Embedding Predictive Architectures for Masked Diffusion Language Models arXiv:2606.00091v1 Announce Type: new Abstract: Joint Embedding Predictive Architectures (JEPAs) have reshaped self-supervised representation learning in vision. The recent LLM-JEPA ported JEPA to autoregressive language models but inherited two steep costs from the… 38

arXiv — NLP / Computation & Language research 28d ago OCC-RAG: Optimal Cognitive Core for Faithful Question Answering arXiv:2606.00683v1 Announce Type: new Abstract: Recent progress in the development of language models has been defined by scale, with each generation absorbing more of the world's knowledge into its weights. However, many practical applications benefit more from robust reasoning… 25

arXiv — NLP / Computation & Language research 28d ago Chunking Methods on Retrieval-Augmented Generation - Effectiveness Evaluation Against Computational Cost and Limitations arXiv:2606.00881v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has demonstrated significant capabilities in enhancing the performance of Large Language Models (LLMs). One of the key tasks in RAG systems is the chunking process.
Traditionally, fixed-size… 38

arXiv — NLP / Computation & Language research 28d ago ExpWeaver: LLM Agents Learn from Experience via Latent RAG arXiv:2606.01041v1 Announce Type: new Abstract: Experience learning has achieved promising results in enhancing LLM agent planning and reasoning by integrating past interactions as reusable knowledge. However, existing methods remain confined to explicit text space, retrieving… 28

arXiv — NLP / Computation & Language research 28d ago When Is 0.1% Enough? Analyzing the Combined Effects of Dimensionality Reduction and Quantization on Text Embedding Compression arXiv:2606.01074v1 Announce Type: new Abstract: Recent high-performing text embedding models often output high-dimensional real-valued vectors, resulting in substantial storage and computational costs. To address this issue, compression methods based on dimensionality reduction… 18

arXiv — NLP / Computation & Language research 28d ago DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation arXiv:2606.01212v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) systems are widely deployed and increasingly influential, but their reliance on external corpora exposes new security risks from poisoned retrieval content.
Existing RAG attacks are largely… 23

arXiv — NLP / Computation & Language research 28d ago Connecting the Dots: Benchmarking Reflective Memory in Long-Horizon Dialogue arXiv:2606.01223v1 Announce Type: new Abstract: Despite substantial progress in long-context modeling, existing benchmarks remain confined to factual memory for explicit recall, failing to measure the reflective memory required to synthesize fragmented, multimodal cues into… 10

arXiv — NLP / Computation & Language research 28d ago Efficient RAG with Intent-Aware Retrieval and Semantics-Preserving Chunking arXiv:2606.01240v1 Announce Type: new Abstract: The demand for powerful instruction following and reasoning capability of large language models (LLMs) has promoted rapid development of retrieval-augmented generation (RAG). The RAG system assists LLM generation by retrieving… 36

Hugging Face Daily Papers research 28d ago Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism Abstract Speculative Pipeline Decoding introduces a novel framework that leverages pipeline parallelism to accelerate large language model inference by enabling parallel token processing and reducing decoding latency. AI-generated summary Speculative Decoding (SD) accelerates… 17

Ollama releases dev-tools 28d ago v0.30.0-rc32: llama-server followups (#16353) llama-server followups Misc fixes for #16031 Add back dropped ROCm build flag for multi-GPU support on windows Fix amdhip64_*.dll version detection for "latest" selection Fix embeddings API for consistent normalize behavior with prior versions ci: set up for automated llama.cpp… 19

r/MachineLearning community 28d ago [D] Simple Questions Thread Please post your questions here instead of creating a new thread. Encourage others who create new posts for questions to post here instead! Thread will stay alive until next one so keep posting after the date in the title.
Thanks to everyone for answering questions in the… 32

Hugging Face Daily Papers research 28d ago A Topology-Aware Spatiotemporal Handover Framework for Continuous Multi-UAV Tracking Abstract A real-time multi-camera multi-vehicle tracking system addresses trajectory fragmentation in UAV-based traffic monitoring through a topology-based spatiotemporal handover mechanism and deterministic queue-based matching algorithm. AI-generated summary The integration of… 21

Hugging Face Daily Papers research 28d ago One Click per Cell Type Suffices: Training-free Group Interaction for Cell Instance Segmentation Abstract Group Prompting enables efficient cell instance segmentation by leveraging per-type prompting through a training-free framework that uses multi-scale encoder features and recursive prompt expansion. AI-generated summary Cell instance segmentation models trained on… 32

Hugging Face Daily Papers research 28d ago How can embedding models bind concepts? Abstract Vision-language models like CLIP struggle with concept binding despite recognizing individual concepts, but controlled transformer models can learn low-complexity binding functions that generalize better through multiplicative interactions.
AI-generated summary Humans… 11

arXiv — Machine Learning research 29d ago Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents arXiv:2605.30590v1 Announce Type: new Abstract: Two clinical AI systems can score nearly identically on coverage-based rubrics yet behave radically differently when their patient inputs change: one updates its recommendations to match the new clinical signal, while the other… 23

arXiv — Machine Learning research 29d ago ScaleMAP: Preserving Local Density and Neighborhood Structure in Low-Dimensional Embeddings arXiv:2605.30597v1 Announce Type: new Abstract: Nonlinear dimensionality-reduction methods such as UMAP and PaCMAP adaptively normalize local distances during graph construction, erasing neighborhood scale from the data. This distorts more than relative cluster sizes: sparse… 9

arXiv — Machine Learning research 29d ago TASER: Task-Aware Stein Regularisation for Geometry-Driven Robustness arXiv:2605.30601v1 Announce Type: new Abstract: Modern deep networks remain fragile under distribution shift and adversarial perturbations, often due to excessive or poorly structured input sensitivity. We introduce TASER (Task-Aware Stein Regularisation), a training-time… 20

arXiv — Machine Learning research 29d ago SemStruct: Contextualizing Semantic Embeddings with Structural Information for Schema Matching arXiv:2605.30729v1 Announce Type: new Abstract: Schema matching is a fundamental step in integrating heterogeneous data sources. While Pre-trained Language Models (PLMs) have revolutionized this task by capturing linguistic semantics, they typically process tabular data as… 35

arXiv — Machine Learning research 29d ago Efficient and Uncertainty-Aware Diffusion Framework for Offline-to-Online Reinforcement Learning arXiv:2605.30776v1 Announce Type: new Abstract: Offline-to-Online Reinforcement Learning (O2O-RL) leverages an offline, pre-trained policy to minimize costly online interactions.
Although data-efficient, O2O-RL is susceptible to shifts between offline and online distributions.… 8

arXiv — Machine Learning research 29d ago Federated Variational Preference Alignment with Gumbel-Softmax Prior for Personalized User Preferences arXiv:2605.30873v1 Announce Type: new Abstract: Federated Learning (FL) offers a privacy-preserving pathway for aligning Large Language Models (LLMs); however, existing frameworks typically enforce a monolithic reward model, inevitably averaging out inherently conflicting user… 35

arXiv — NLP / Computation & Language research 29d ago Protocol for evaluating ChatGPT in biomedical association generation and verification using a RAG-enabled, cross-model majority voting workflow arXiv:2605.30400v1 Announce Type: new Abstract: We present a protocol to evaluate ChatGPT's ability to generate disease-centric biomedical associations. It outlines how we generate the associations, validate the biological entities using biomedical ontologies, and verify… 26

arXiv — NLP / Computation & Language research 29d ago CanLegalRAGBench: Evaluating Retrieval-Augmented Generation on Canadian Case Law arXiv:2605.30497v1 Announce Type: new Abstract: RAG-based legal assistants have been growing in popularity, but LLM hallucinations remain a key issue and potentially undermines justice. While benchmarks have been developed to evaluate progress, many rely on synthetic queries… 37

arXiv — NLP / Computation & Language research 29d ago Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs arXiv:2605.30501v1 Announce Type: new Abstract: Watermarking embeds statistical signatures in AI-generated text for detection and attribution. We reveal a fundamental vulnerability: when users access multiple models (today's reality), watermarks trivially fail. Watermarks… 21

arXiv — NLP / Computation & Language research 29d ago Generalistic or Specific Embeddings, Which is Better?
An Empirical Study on Search for Clinical Coding in Non-English Languages arXiv:2605.30529v1 Announce Type: new Abstract: Sentence-embedding models for semantic search are overwhelmingly developed and evaluated on English corpora. When applied to clinical retrieval in other languages -- particularly retrieval of ICD-10-CM / CIE-10 codes -- recall… 26

arXiv — NLP / Computation & Language research 29d ago SAGE: A Novelty Gate for Efficient Memory Evolution in Agentic LLMs arXiv:2605.30711v1 Announce Type: new Abstract: Agentic LLMs must continuously decide whether newly extracted facts should be added, merged with existing memories, or ignored, yet prior work has focused more on retrieval and storage than on principled write-side control. We… 38

arXiv — NLP / Computation & Language research 29d ago MoG: Mixture of Experts for Graph-based Retrieval-Augmented Generation arXiv:2605.31010v1 Announce Type: new Abstract: Retrieval-augmented generation is intensively studied to ground large language models on external evidence. However, retrieving from a unified knowledge base could inevitably introduce irrelevant information that may mislead… 23

arXiv — NLP / Computation & Language research 29d ago On the Robustness of Multilingual Text Embedding Rankings Across Learning Tasks, Languages, and Benchmark Datasets arXiv:2605.31142v1 Announce Type: new Abstract: Large-scale multilingual text embedding models play crucial role in both research and industry, yet their behavior in language-specific, multi-task settings remains insufficiently understood.
Although benchmarking platforms such as… 30

arXiv — NLP / Computation & Language research 29d ago Learning Whom to Trust: Market-Feedback Adaptive Retrieval for Frozen LLMs in Event-Driven Financial RAG arXiv:2605.31201v1 Announce Type: new Abstract: Financial retrieval-augmented generation (RAG) systems typically rank evidence by textual relevance, but in financial markets the useful evidence source depends on event type, forecast horizon, and market context. We study… 20

Hugging Face Daily Papers research 29d ago From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors Abstract Multi-step trojan attacks in local LLM agents can bypass existing defenses by embedding malicious prompts across multiple operations, requiring new detection methods like DASGuard for effective protection. AI-generated summary LLM agents are evolving from conversational… 20

The Information — AI news-outlet 29d ago Why Forward Deployed Engineers Are the Rage AI researchers may have the hottest job in tech, but forward-deployed engineers who put the AI to good use are becoming indispensable too. The military-inspired job title, which Palantir began using in the context of business software more than a decade ago, has spread to all… 29

llama.cpp releases dev-tools 1mo ago b9442 vocab : add tokenizer support for jina-embeddings-v2-base-zh ( #18756 ) vocab : add jina-embeddings-v2-base-zh (whitespace tokenizer) lowercase defaults to true type fix Co-authored-by: Sigbjørn Skjæret [email protected] macOS/iOS: macOS Apple Silicon (arm64) macOS… 12

r/LocalLLaMA community 1mo ago Why does Thinking Output More Tokens Than a Response? I was too lazy to use a vector DB + Embedding + Clustering for this list of 1000 items I wanted to categorize. I was hoping to use a local LLM to do it, but it would only respond with a list of about 100 items or so and their categories.
It confused me because when I saw the… 22

r/MachineLearning community 1mo ago Why do the output layer weights become word vectors in Word2Vec? [D] I'm trying to understand the intuition behind Word2Vec training using a neural network. In Word2Vec (CBOW or Skip-gram), we often hear that the weight matrices learned during training contain the vector representations (embeddings) of words. However, I don't understand why the… 31

Hugging Face Daily Papers research 1mo ago CONF-KV: Confidence-Aware KV Cache Eviction with Mixed-Precision Storage for Long-Horizon LLM Abstract CONF-KV is a KV-cache management system that dynamically adjusts cache retention based on model uncertainty, improving memory efficiency and performance for long-sequence language model inference. AI-generated summary Long-horizon LLM inference turns the key--value (KV)… 12

The Information — AI news-outlet 1mo ago Kalshi, Coinbase Approved to Offer Crypto Perpetuals in U.S. Prediction market Kalshi won approval from U.S. regulators to offer crypto trading through bitcoin perpetuals, a type of highly-leveraged derivatives product, confirming an April report by The Information. Coinbase also won a greenlight from the U.S. Commodity Futures Trading… 14

Hugging Face Daily Papers research 1mo ago Xetrieval: Mechanistically Explaining Dense Retrieval Abstract Xetrieval is a mechanistic framework that explains dense retrieval by enhancing sentence embeddings with reasoning information and decomposing them into interpretable sparse features for retrieval decision explanations.
AI-generated summary Explaining why dense… 32

llama.cpp releases dev-tools 1mo ago b9406 llama: add llm_graph_input_mtp ( #23643 ) llama: add llm_graph_input_mtp rename input_mtp -> input_token_embd add TODO about mtmd embedding cont : clean-up Co-authored-by: Georgi Gerganov [email protected] macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64,… 38

MIT Technology Review — AI news-outlet 1mo ago How the Pope’s Magnifica Humanitas offers a template for individuals to meet the AI moment Pope Leo XIV’s new encyclical on artificial intelligence includes a statement that warrants serious attention from technologists and policymakers: “Technology is never neutral.” Magnifica Humanitas (“Magnificent Humanity”) is a clarion call to all people to act with courage and… 8

Hugging Face Daily Papers research 1mo ago PRISM: A Multi-Dimensional Benchmark for Evaluating LLM Peer Reviewers Abstract PRISM evaluates automated peer review systems across multiple dimensions using argument mining and retrieval-augmented verification, revealing that while LLMs match human performance in specific areas, no system consistently equals human reviewers across all evaluation… 19

Hugging Face Daily Papers research 1mo ago RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains Abstract RUBRIC-ARROW presents an alternating framework for reward modeling that improves upon rubric-based methods by reducing ties and leveraging pairwise preference data for training. AI-generated summary Pointwise reward modeling offers critical signals for LLM… 4

arXiv — Machine Learning research 1mo ago TaxDistill: Improving Metagenomic Taxonomic Annotation via Distilled Genomic Foundation Models arXiv:2605.28868v1 Announce Type: new Abstract: Metagenomic taxonomic annotation aims to identify the microbial origins of DNA fragments in environmental samples.
Traditional methods that rely on sequence similarity are often constrained by the high microbial diversity and the… 11

arXiv — Machine Learning research 1mo ago Spectral Guidance for Flexible and Efficient Control of Diffusion Models arXiv:2605.28900v1 Announce Type: new Abstract: We introduce Spectral Guidance, a framework for controlling diffusion models by leveraging the intrinsic geometry of the generative process. As data is progressively corrupted by noise, only a small number of features remain… 38

arXiv — Machine Learning research 1mo ago Cycle-Space Informed Detection of Autoencoded Blind False Data Injection Attacks on Power Systems arXiv:2605.28912v1 Announce Type: new Abstract: The rapid growth of AI-driven data centers and large-scale energy storage systems is increasing the reliance of power system operation on real-time measurement data and automated decision-making. However, many existing detection… 28

arXiv — Machine Learning research 1mo ago FedQHD: Closed-Form Function-Space Federated Reinforcement Learning arXiv:2605.29002v1 Announce Type: new Abstract: Federated reinforcement learning enables decentralized agents to collaboratively improve policies or value estimates without exchanging raw trajectories. However, FedAvg-style parameter averaging is not function-space consistent:… 24

arXiv — Machine Learning research 1mo ago Do Physics Foundation Models Learn Generalizable Physics? A Bias-Aware Benchmark Across Physical Regimes and Distribution Shifts arXiv:2605.29283v1 Announce Type: new Abstract: Recent physics foundation models claim general spatiotemporal forecasting ability, yet their evaluations often collapse performance into a single average score under a fixed training distribution.
This makes it difficult to… 22

arXiv — Machine Learning research 1mo ago K-FinHallu: A Hallucination Detection Benchmark for Multi-Turn RAG in Korean Finance arXiv:2605.29523v1 Announce Type: new Abstract: Large Language Models (LLMs) have advanced financial automation through Retrieval-Augmented Generation (RAG), yet hallucinations remain a critical barrier to deployment in high-stakes environments. Existing benchmarks focus on… 38

arXiv — NLP / Computation & Language research 1mo ago What are They Thinking? Delineation, Probing and Tracking of Concepts in LLMs arXiv:2605.28823v1 Announce Type: new Abstract: As the influence of LLMs expands, it is imperative to gain insight into their decisions. One way to do that is to develop probes that detect the presence or absence of a broad set of concepts within the embeddings computed in an… 33

arXiv — NLP / Computation & Language research 1mo ago A comparative study of transformer-based embeddings for topic coherence arXiv:2605.28832v1 Announce Type: new Abstract: Topic modeling is a branch of Natural Language Processing (NLP) that aims to organize large collections of texts into coherent groups according to word co-occurrence patterns, with Latent Dirichlet Allocation (LDA) remaining one of… 32

arXiv — NLP / Computation & Language research 1mo ago GenesisFunc: Multi-Agent Data Generation for Accurate and Generalizable Function-Calling arXiv:2605.28835v1 Announce Type: new Abstract: Large Language Models (LLMs) extend their capabilities through function-calling (FC), which relies on training data with high quality, diversity, and broad coverage of scenario. However, obtaining and annotating real… 15

Page 9 of 10 · 500 articles