Tag: Rag · 500 articles archived under #rag
arXiv — Machine Learning research 15d ago Numbers Already Carry Their Own Embeddings arXiv:2606.14108v1 Announce Type: new Abstract: We introduce Adelic operation-preserved embeddings (AOE), a training-free representation that captures both a number's real value and its modular (p-adic) signatures. This construction preserves additive and multiplicative… 10 arXiv — Machine Learning research 15d ago Learning High Coverage Discriminative Parsimonious Rulesets arXiv:2606.14156v1 Announce Type: new Abstract: Learning systems based on IF-THEN rule representations readily offer interpretability, making them a crucial focus in contemporary AI research. A key objective for such rule sets is to achieve both high discriminative power and… 9 arXiv — Machine Learning research 15d ago DRIVE: Distributional and Retrieval-Augmented Bidding with Value Evaluation arXiv:2606.14192v1 Announce Type: new Abstract: Auto-bidding is a core component of real-time advertising systems, where decisions must optimize long-term performance under budget and cost constraints, while online exploration is prohibitively risky. Offline reinforcement… 9 arXiv — Machine Learning research 15d ago Graph Structured Combinatorial Semi-Bandit with Nonlinear Reward Associations through Separable Signals arXiv:2606.14650v1 Announce Type: new Abstract: The identification of optimal structures within vast arrays of interconnected data necessitates significant sampling- and computational effort. Learning and leveraging underlying signal dependencies can improve efficiency and… 10 arXiv — Machine Learning research 15d ago Beyond task performance: Decoding bioacoustic embeddings with speech features arXiv:2606.14662v1 Announce Type: new Abstract: Pretrained audio embeddings are standard in bioacoustics, yet little is known about which acoustic features these models encode, nor which are useful for a given task.
This hinders transparency and limits extension to rare species… 6 arXiv — NLP / Computation & Language research 15d ago Fusing Stylometric and Embedding Systems to Estimate Authorship Likelihood Ratios in Japanese arXiv:2606.13991v1 Announce Type: new Abstract: The likelihood ratio framework is widely recognized as the logically and legally sound basis for evidential analysis across forensic sciences, and its importance is increasingly acknowledged in analyses of authorship in textual… 22 arXiv — NLP / Computation & Language research 15d ago The Holistic Storage of Verb+Up Phrases in Text-based and Audio-based Language Models arXiv:2606.13993v1 Announce Type: new Abstract: A crucial aspect of linguistic capability is the ability to trade off between stored representations and abstract knowledge: one must retrieve learned representations, but also generate novel ones by applying productive rules.… 34 arXiv — NLP / Computation & Language research 15d ago Decoupled Mixture-of-Experts for Parametric Knowledge Injection arXiv:2606.14243v1 Announce Type: new Abstract: Knowledge injection aims to equip large language models (LLMs) with external, domain-specific, or time-sensitive knowledge. Existing approaches typically face a trade-off between flexibility and integration: retrieval-augmented… 33 arXiv — NLP / Computation & Language research 15d ago ScoreGate: Adaptive Chunk Selection for Retrieval-Augmented Generation via Dual-Score Statistical Fusion arXiv:2606.14269v1 Announce Type: cross Abstract: Fixed-cardinality retrieval injects a constant top-K chunks into the generator regardless of query complexity, causing over-retrieval for narrow queries and under-retrieval for compositional ones. 
We describe ScoreGate, a… 11 arXiv — NLP / Computation & Language research 15d ago UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities arXiv:2504.20734v5 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) has shown substantial promise in improving factual accuracy by grounding model responses with external knowledge relevant to queries. However, most existing approaches are limited to a… 37 arXiv — NLP / Computation & Language research 15d ago Sentinel: Decoding Context Utilization via Attention Probing for Efficient LLM Context Compression arXiv:2505.23277v3 Announce Type: replace Abstract: Retrieval-augmented generation (RAG) often suffers from long and noisy retrieved contexts. Existing context compression methods typically rely on heuristic relevance estimation or supervised compression models rather than on… 29 arXiv — NLP / Computation & Language research 15d ago Pragmatic Inference for Moral Reasoning Acquisition: Generalization via Metapragmatic Links arXiv:2509.24102v5 Announce Type: replace Abstract: While moral reasoning has emerged as a promising research direction for large language models (LLMs), achieving robust generalization remains a critical challenge. 
This challenge arises from the gap between what is said and… 27 arXiv — NLP / Computation & Language research 15d ago Fragile Knowledge, Robust Instruction-Following: The Width Pruning Dichotomy in Llama-3.2 arXiv:2512.22671v3 Announce Type: replace Abstract: Structured width pruning of GLU-MLP layers in Llama-3.2 models, guided by the Peak-to-Peak Magnitude (PPM) criterion, reveals a systematic dichotomy in how reducing the expansion ratio affects different model capabilities.… 22 arXiv — NLP / Computation & Language research 15d ago C2-Faith: Benchmarking LLM Judges for Causal and Coverage Faithfulness in Chain-of-Thought Reasoning arXiv:2603.05167v2 Announce Type: replace Abstract: Large language models (LLMs) are increasingly used as judges of chain-of-thought (CoT) reasoning, yet it remains unclear whether they can reliably assess process faithfulness rather than merely answer plausibility. We introduce… 20 arXiv — NLP / Computation & Language research 15d ago ClaimFlow: Tracing the Evolution of Scientific Claims in NLP arXiv:2603.16073v2 Announce Type: replace Abstract: Scientific papers advance $\textit{claims}$ that later work supports, extends, or sometimes refutes. Yet existing methods for citation and claim analysis capture only fragments of this dialogue. In this work, we make these… 38 Vercel — AI dev-tools 15d ago Increased Blob store limit for Hobby users Hobby users can now create up to 100 Blob stores, up from 5. This gives teams more flexibility to organize data by project, environment, or region as applications grow. Storage, operations, and transfer limits still apply. Learn more in the Blob documentation.
21 r/LocalLLaMA community 15d ago Gemma 12b less than 10 watts 6.5pp 1.3tg Google pixel 10 pro Termux Llamacpp version: 9639 (ef8268fee) $ ./llama.cpp/build_vulkan/bin/llama-cli -m storage/downloads/gemma-4-12b-it-UD-Q3_K_XL.gguf --model-draft storage/downloads/mtp-gemma-4-12b-it.gguf --temp 1.0 --top-p 0.95 --top-k 64 --spec-type draft-mtp… 5 r/MachineLearning community 16d ago I’m building a free bilingual machine-learning notebook course — looking for feedback on structure and coverage [R] Hi everyone, I’m building an open-source machine-learning tutorial repository in Jupyter Notebook format: https://github.com/mohammadijoo/Machine_Learning_Tutorials The course is bilingual: English and Persian/Farsi versions are organized in parallel. The goal is to make a… 18 r/LocalLLaMA community 16d ago I don’t know who needs to hear this but 128GB BD-R XL M-DISC is SOTA for consumer-available archival optical storage (for backing up your models) If you’re trying to download and preserve your local LLMs in case of future availability issues due to AI-related politics, your best bet is either 128gb or 100gb Blu-Ray optical disks, more specifically BD-R XL M-DISC standard format which are archival-grade and built to last… 21 r/LocalLLaMA community 17d ago 3090 died, good night sweet prince Feelsbadman.jpeg Once you've tasted 4x GPUs and almost BF16 models with BF16 KV cache you can't go back 😞. AND IT'S THE WEEKEND OH MAN. submitted by /u/fragment_me 32 Vercel — AI dev-tools 17d ago Workflow SDK now runs natively in Nitro v3 Workflow SDK's native Nitro v3 integration is now in beta. Steps run inside the same bundled runtime as the rest of your app, instead of a separate bundle. Nitro's useStorage() and other server-side APIs work directly inside "use step" functions.
The Nitro dev server also… 26 TechCrunch — AI news-outlet 17d ago SpaceX IPO: Live updates on everything you need to know TechCrunch has followed SpaceX's start, struggles, and successes from the early days. And we're here for what happens next too. This package of SpaceX IPO coverage includes who stands to win (and maybe some who won't), pre-IPO deals, and what's tucked inside its S-1 registration… 4 NVIDIA Developer Blog official-blog 17d ago Deploy Long-Context Reasoning and Agentic Workflows with MiniMax M3 on NVIDIA Accelerated Infrastructure As enterprise AI adoption scales, developers are increasingly forced to stitch together fragmented pipelines—separate models for text, vision, and... 25 TechCrunch — AI news-outlet 17d ago SpaceX IPO: Everything you need to know TechCrunch has followed SpaceX's start, struggles, and successes from the early days. And we're here for what happens next too. This package of SpaceX IPO coverage includes who stands to win (and maybe some who won't), pre-IPO deals, and what's tucked inside its S-1 registration… 32 r/LocalLLaMA community 17d ago We should heavily discourage and moderate cloud API (deepseek api, GLM api, etc.) topics and discussion. This is LOCAL first. I’m just some fucking guy. This is just some fucking opinion. I’ve seen tons of stealth marketing or related topics on this subreddit about how great or how easy it is to use some random subscription api. Why the fuck are we allowing people to so casually talk about how much… 31 Hugging Face Daily Papers research 17d ago Leveraging Morphology for Historical Script Metrological Analysis Abstract A transformer-based architecture with prototype learning enables scalable paleographic measurements from historical documents using only line-level transcriptions, demonstrating its effectiveness on a 160-page codex with minimal training data requirements. 
Generated by… 37 arXiv — NLP / Computation & Language research 18d ago Rigel: Reverse-Engineering the Metal 4.1 Tensor Compute Path on the Apple M4 Max GPU arXiv:2606.12765v1 Announce Type: new Abstract: Apple's Metal 4.1 exposes a tensor compute path: the Metal Performance Primitives (MPP) matmul2d operation over cooperative_tensor fragments, whose interface is documented but whose hardware behavior is deliberately hidden. The… 18 arXiv — NLP / Computation & Language research 18d ago How Fine-Grained Should a RAG Benchmark Be? A Hierarchical Framework for Synthetic Question Generation arXiv:2606.12789v1 Announce Type: new Abstract: Evaluating retrieval-augmented generation (RAG) systems requires benchmarks that capture diverse question characteristics, yet practitioners lack empirical guidance on which dimensions to vary and at what granularity. We present… 22 arXiv — NLP / Computation & Language research 18d ago SafeLLM: Extraction as a Hallucination-Resistant Alternative to Rewriting in Safety-Critical Settings arXiv:2606.12897v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used to access organisational documentation, including standard operating procedures (SOPs), HR policies and institutional guidelines. However, retrieval-augmented generation (RAG)… 29 arXiv — NLP / Computation & Language research 18d ago X-MADAM-RAG: Diagnosing and Handling Chinese-English Evidence Conflict in Retrieval-Augmented Generation arXiv:2606.12903v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) systems may receive evidence that is not merely noisy but mutually contradictory. 
This issue becomes particularly salient in multilingual settings, where retrieved Chinese and English evidence… 8 arXiv — NLP / Computation & Language research 18d ago HyPE: Category-Aware Hypergraph Encoding with Persistent Edge Embeddings for Persona-Grounded Dialogue arXiv:2606.13142v1 Announce Type: new Abstract: Persona-grounded dialogue systems aim to produce responses consistent with a speaker's persona, yet existing methods treat personas as a flat set of sentences and fail to model the high-order relations among persona… 25 arXiv — NLP / Computation & Language research 18d ago SICI: A Semantic-Pragmatic Complexity Index Reveals Regime Shifts in LLM Stance Detection arXiv:2606.13189v1 Announce Type: new Abstract: Prompt-based LLMs are increasingly used for stance detection, but harder examples are not always repaired by clearer instructions, reasoning prompts, retrieval, or debate. We introduce SICI (Stance Inference Complexity Index), a… 10 arXiv — NLP / Computation & Language research 18d ago PolyAlign: Conditional Human-Distribution Alignment arXiv:2606.13227v1 Announce Type: new Abstract: Post-training methods such as supervised fine-tuning (SFT) and preference optimization typically align language models toward a single global assistant behavior. While effective for improving average helpfulness, this can suppress… 29 arXiv — NLP / Computation & Language research 18d ago Leveraging Audio-LLMs to Filter Speech-to-Speech Training Data arXiv:2606.13507v1 Announce Type: new Abstract: Large-scale mined corpora provide abundant training data for end-to-end speech-to-speech translation (S2ST) but may contain noise, misalignment, and semantic errors. Filtering noisy data is crucial to maintain robust speech… 30 arXiv — NLP / Computation & Language research 18d ago When Does Mixing Help? 
Analyzing Query Embedding Interpolation in Multilingual Dense Retrieval arXiv:2606.13537v1 Announce Type: new Abstract: While mixed-language querying is ubiquitous in multilingual communities, the sensitivity of dense retrievers to such queries remains poorly understood. We present a ratio-controlled study on mMARCO that systematically evaluates… 11 arXiv — NLP / Computation & Language research 18d ago SkMTEB: Slovak Massive Text Embedding Benchmark and Model Adaptation arXiv:2606.13647v1 Announce Type: new Abstract: We introduce SkMTEB, the first comprehensive MTEB-style text embedding benchmark for Slovak, a low-resource West Slavic language, comprising 31 datasets across 7 task types -- nearly 4$\times$ the depth of existing multilingual… 25 arXiv — NLP / Computation & Language research 18d ago Learning to Reason by Analogy via Retrieval-Augmented Reinforcement Fine-Tuning arXiv:2606.13680v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) has become a standard mechanism for grounding language models in external knowledge, yet conventional retrieval based on lexical or semantic similarity is poorly suited for complex reasoning… 11 arXiv — NLP / Computation & Language research 18d ago Identifiability Without Gaussianity: Symbolic World Models and Near-Infinite Temporal Consistency arXiv:2606.12471v1 Announce Type: cross Abstract: Klindt, LeCun, and Balestriero (arXiv:2605.26379) proved that Joint-Embedding Predictive Architectures (JEPAs) achieve linear identifiability, the linear recovery of the world's true latent variables, if and only if the world's… 28 arXiv — NLP / Computation & Language research 18d ago PersonaDrive: Human-Style Retrieval-Augmented VLA Agents for Closed-Loop Driving Simulation arXiv:2606.12616v1 Announce Type: cross Abstract: Closed-loop driving simulators typically populate their environments with non-ego traffic agents that behave largely the same way, produced either by rule-based traffic managers or by learned models 
trained toward a single… 16 arXiv — NLP / Computation & Language research 18d ago MiniPIC: Flexible Position-Independent Caching in <100LOC arXiv:2606.13126v1 Announce Type: cross Abstract: Retrieval-augmented and agentic workloads repeatedly prefill recurring predictable structured inputs (which we call "spans") such as documents and code files. Yet, prefix caching in engines such as vLLM cannot reuse their KV… 12 arXiv — NLP / Computation & Language research 18d ago ComAct: Reframing Professional Software Manipulation via COM-as-Action Paradigm arXiv:2606.13239v1 Announce Type: cross Abstract: Existing computer-use agents remain fundamentally limited in professional software manipulation: GUI-based agents suffer from fragile visual grounding and long-horizon error accumulation, while API-based approaches struggle with… 34 arXiv — NLP / Computation & Language research 18d ago TimeLens: On-Device Artifact Recognition with Retrieval-Augmented Question Answering for the Grand Egyptian Museum arXiv:2606.13267v1 Announce Type: cross Abstract: TimeLens is an AI-powered bilingual mobile guide for the Grand Egyptian Museum (GEM). Pointing a phone at an exhibit, a visitor sees the artifact recognized in real time and can ask follow-up questions answered in English or… 37 arXiv — NLP / Computation & Language research 18d ago Uncertainty-Aware Hybrid Retrieval for Long-Document RAG arXiv:2606.13550v1 Announce Type: cross Abstract: Retrieval augmented generation (RAG) depends critically on the quality and granularity of retrieved evidence.
Large retrieval units preserve context but often introduce irrelevant content, which can dilute answer bearing evidence… 38 Hugging Face Daily Papers research 18d ago N-GRPO: Embedding-Level Neighbor Mixing for Enhanced Policy Optimization Abstract N-GRPO, a novel exploration strategy within GRPO framework, enhances mathematical reasoning in large language models through semantic neighbor mixing that maintains semantic consistency while injecting diversity. Generated by Qwen/Qwen2.5-Coder-32B-Instruct The success… 27 Hugging Face Daily Papers research 18d ago Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation Abstract A lightweight approach combining a frozen pretrained time-series foundation model with a simple regression head achieves superior RUL prediction performance compared to various baseline methods on industrial sensor data. Generated by Qwen/Qwen2.5-Coder-32B-Instruct… 15 arXiv — Machine Learning research 19d ago RoVE: Rotary Value Embeddings Attention for Relative Position-dependent Value Pathways arXiv:2606.11275v1 Announce Type: new Abstract: Rotary Position Embeddings (RoPE) make attention scores position-relative but leave the value pathway position-blind: the message sent by a value token is the same regardless of its distance from the query. 
We propose RoVE, a… 10 arXiv — Machine Learning research 19d ago RePAIR: Predictive Self-Supervised Representation Learning in Chess arXiv:2606.11860v1 Announce Type: new Abstract: In this paper, we introduce Representation Prediction via Autoencoding using Iterative Refinement (RePAIR) - a novel self-supervised representation learning architecture that synthesizes Masked Autoencoders (MAE), Joint Embedding… 15 arXiv — Machine Learning research 19d ago Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation arXiv:2606.11990v1 Announce Type: new Abstract: Remaining Useful Life (RUL) prediction is essential for industrial predictive maintenance, yet many learning-based approaches rely on extensive feature engineering or large labeled datasets to train task-specific sequence models.… 20 arXiv — Machine Learning research 19d ago Bootstrapped Monitoring: Leveraging Transparent Reasoning to Oversee Stronger AI Agents arXiv:2606.11998v1 Announce Type: new Abstract: Trusted monitoring is a cornerstone of AI control. However, as frontier models grow more capable, the increasing capabilities gap between trusted and untrusted models may render trusted models unreliable monitors. We introduce… 30 arXiv — Machine Learning research 19d ago nD-RoPE: A Generalized RoPE for n-Dimensional Position Embedding arXiv:2606.12146v1 Announce Type: new Abstract: Rotary Position Embedding (RoPE) is widely adopted in Transformer models, yet its extension to high-dimensional domains lacks a unified theoretical formulation. Most existing approaches either apply rotations independently along… 8