Tag

Rag

500 articles archived under #rag

arXiv — Machine Learning research 15d ago

Numbers Already Carry Their Own Embeddings

arXiv:2606.14108v1 Announce Type: new Abstract: We introduce Adelic operation-preserved embeddings (AOE), a training-free representation that captures both a number's real value and its modular (p-adic) signatures. This construction preserves additive and multiplicative…

10
arXiv — Machine Learning research 15d ago

Learning High Coverage Discriminative Parsimonious Rulesets

arXiv:2606.14156v1 Announce Type: new Abstract: Learning systems based on IF-THEN rule representations readily offer interpretability, making them a crucial focus in contemporary AI research. A key objective for such rule sets is to achieve both high discriminative power and…

9
arXiv — Machine Learning research 15d ago

DRIVE: Distributional and Retrieval-Augmented Bidding with Value Evaluation

arXiv:2606.14192v1 Announce Type: new Abstract: Auto-bidding is a core component of real-time advertising systems, where decisions must optimize long-term performance under budget and cost constraints, while online exploration is prohibitively risky. Offline reinforcement…

9
arXiv — Machine Learning research 15d ago

Graph Structured Combinatorial Semi-Bandit with Nonlinear Reward Associations through Separable Signals

arXiv:2606.14650v1 Announce Type: new Abstract: The identification of optimal structures within vast arrays of interconnected data necessitates significant sampling- and computational effort. Learning and leveraging underlying signal dependencies can improve efficiency and…

10
arXiv — Machine Learning research 15d ago

Beyond task performance: Decoding bioacoustic embeddings with speech features

arXiv:2606.14662v1 Announce Type: new Abstract: Pretrained audio embeddings are standard in bioacoustics, yet little is known about which acoustic features these models encode, nor which are useful for a given task. This hinders transparency and limits extension to rare species…

6
arXiv — NLP / Computation & Language research 15d ago

Fusing Stylometric and Embedding Systems to Estimate Authorship Likelihood Ratios in Japanese

arXiv:2606.13991v1 Announce Type: new Abstract: The likelihood ratio framework is widely recognized as the logically and legally sound basis for evidential analysis across forensic sciences, and its importance is increasingly acknowledged in analyses of authorship in textual…

22
arXiv — NLP / Computation & Language research 15d ago

The Holistic Storage of Verb+Up Phrases in Text-based and Audio-based Language Models

arXiv:2606.13993v1 Announce Type: new Abstract: A crucial aspect of linguistic capability is the ability to trade off between stored representations and abstract knowledge: one must retrieve learned representations, but also generate novel ones by applying productive rules.…

34
arXiv — NLP / Computation & Language research 15d ago

Decoupled Mixture-of-Experts for Parametric Knowledge Injection

arXiv:2606.14243v1 Announce Type: new Abstract: Knowledge injection aims to equip large language models (LLMs) with external, domain-specific, or time-sensitive knowledge. Existing approaches typically face a trade-off between flexibility and integration: retrieval-augmented…

33
arXiv — NLP / Computation & Language research 15d ago

ScoreGate: Adaptive Chunk Selection for Retrieval-Augmented Generation via Dual-Score Statistical Fusion

arXiv:2606.14269v1 Announce Type: cross Abstract: Fixed-cardinality retrieval injects a constant top-K chunks into the generator regardless of query complexity, causing over-retrieval for narrow queries and under-retrieval for compositional ones. We describe ScoreGate, a…

11
arXiv — NLP / Computation & Language research 15d ago

UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities

arXiv:2504.20734v5 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) has shown substantial promise in improving factual accuracy by grounding model responses with external knowledge relevant to queries. However, most existing approaches are limited to a…

37
arXiv — NLP / Computation & Language research 15d ago

Sentinel: Decoding Context Utilization via Attention Probing for Efficient LLM Context Compression

arXiv:2505.23277v3 Announce Type: replace Abstract: Retrieval-augmented generation (RAG) often suffers from long and noisy retrieved contexts. Existing context compression methods typically rely on heuristic relevance estimation or supervised compression models rather than on…

29
arXiv — NLP / Computation & Language research 15d ago

Pragmatic Inference for Moral Reasoning Acquisition: Generalization via Metapragmatic Links

arXiv:2509.24102v5 Announce Type: replace Abstract: While moral reasoning has emerged as a promising research direction for large language models (LLMs), achieving robust generalization remains a critical challenge. This challenge arises from the gap between what is said and…

27
arXiv — NLP / Computation & Language research 15d ago

Fragile Knowledge, Robust Instruction-Following: The Width Pruning Dichotomy in Llama-3.2

arXiv:2512.22671v3 Announce Type: replace Abstract: Structured width pruning of GLU-MLP layers in Llama-3.2 models, guided by the Peak-to-Peak Magnitude (PPM) criterion, reveals a systematic dichotomy in how reducing the expansion ratio affects different model capabilities.…

22
arXiv — NLP / Computation & Language research 15d ago

C2-Faith: Benchmarking LLM Judges for Causal and Coverage Faithfulness in Chain-of-Thought Reasoning

arXiv:2603.05167v2 Announce Type: replace Abstract: Large language models (LLMs) are increasingly used as judges of chain-of-thought (CoT) reasoning, yet it remains unclear whether they can reliably assess process faithfulness rather than merely answer plausibility. We introduce…

20
arXiv — NLP / Computation & Language research 15d ago

ClaimFlow: Tracing the Evolution of Scientific Claims in NLP

arXiv:2603.16073v2 Announce Type: replace Abstract: Scientific papers advance claims that later work supports, extends, or sometimes refutes. Yet existing methods for citation and claim analysis capture only fragments of this dialogue. In this work, we make these…

38
Vercel — AI dev-tools 15d ago

Increased Blob store limit for Hobby users

Hobby users can now create up to 100 Blob stores, up from 5. This gives teams more flexibility to organize data by project, environment, or region as applications grow. Storage, operations, and transfer limits still apply. Learn more in the Blob documentation.

21
r/LocalLLaMA community 15d ago

Gemma 12b less than 10 watts 6.5pp 1.3tg

Google pixel 10 pro Termux Llamacpp version: 9639 (ef8268fee) $ ./llama.cpp/build_vulkan/bin/llama-cli -m storage/downloads/gemma-4-12b-it-UD-Q3_K_XL.gguf --model-draft storage/downloads/mtp-gemma-4-12b-it.gguf --temp 1.0 --top-p 0.95 --top-k 64 --spec-type draft-mtp…

5
r/MachineLearning community 16d ago

I’m building a free bilingual machine-learning notebook course — looking for feedback on structure and coverage [R]

Hi everyone, I’m building an open-source machine-learning tutorial repository in Jupyter Notebook format: https://github.com/mohammadijoo/Machine_Learning_Tutorials The course is bilingual: English and Persian/Farsi versions are organized in parallel. The goal is to make a…

18
r/LocalLLaMA community 16d ago

I don’t know who needs to hear this but 128GB BD-R XL M-DISC is SOTA for consumer-available archival optical storage (for backing up your models)

If you’re trying to download and preserve your local LLMs in case of future availability issues due to AI-related politics, your best bet is either 128gb or 100gb Blu-Ray optical disks, more specifically BD-R XL M-DISC standard format which are archival-grade and built to last…

21
r/LocalLLaMA community 17d ago

3090 died, good night sweet prince

Feelsbadman.jpeg Once you've tasted 4x GPUs and almost BF16 models with BF16 KV cache you can't go back 😞. AND IT'S THE WEEKEND OH MAN. Submitted by /u/fragment_me

32
Vercel — AI dev-tools 17d ago

Workflow SDK now runs natively in Nitro v3

Workflow SDK's native Nitro v3 integration is now in beta. Steps run inside the same bundled runtime as the rest of your app, instead of a separate bundle. Nitro's useStorage() and other server-side APIs work directly inside "use step" functions. The Nitro dev server also…

26
TechCrunch — AI news-outlet 17d ago

SpaceX IPO: Live updates on everything you need to know

TechCrunch has followed SpaceX's start, struggles, and successes from the early days. And we're here for what happens next too. This package of SpaceX IPO coverage includes who stands to win (and maybe some who won't), pre-IPO deals, and what's tucked inside its S-1 registration…

4
NVIDIA Developer Blog official-blog 17d ago

Deploy Long-Context Reasoning and Agentic Workflows with MiniMax M3 on NVIDIA Accelerated Infrastructure

As enterprise AI adoption scales, developers are increasingly forced to stitch together fragmented pipelines—separate models for text, vision, and...

25
TechCrunch — AI news-outlet 17d ago

SpaceX IPO: Everything you need to know

TechCrunch has followed SpaceX's start, struggles, and successes from the early days. And we're here for what happens next too. This package of SpaceX IPO coverage includes who stands to win (and maybe some who won't), pre-IPO deals, and what's tucked inside its S-1 registration…

32
r/LocalLLaMA community 17d ago

We should heavily discourage and moderate cloud API (deepseek api, GLM api, etc.) topics and discussion. This is LOCAL first.

I’m just some fucking guy. This is just some fucking opinion. I’ve seen tons of stealth marketing or related topics on this subreddit about how great or how easy it is to use some random subscription api. Why the fuck are we allowing people to so casually talk about how much…

31
Hugging Face Daily Papers research 17d ago

Leveraging Morphology for Historical Script Metrological Analysis

Abstract A transformer-based architecture with prototype learning enables scalable paleographic measurements from historical documents using only line-level transcriptions, demonstrating its effectiveness on a 160-page codex with minimal training data requirements. Generated by…

37
arXiv — NLP / Computation & Language research 18d ago

Rigel: Reverse-Engineering the Metal 4.1 Tensor Compute Path on the Apple M4 Max GPU

arXiv:2606.12765v1 Announce Type: new Abstract: Apple's Metal 4.1 exposes a tensor compute path: the Metal Performance Primitives (MPP) matmul2d operation over cooperative_tensor fragments, whose interface is documented but whose hardware behavior is deliberately hidden. The…

18
arXiv — NLP / Computation & Language research 18d ago

How Fine-Grained Should a RAG Benchmark Be? A Hierarchical Framework for Synthetic Question Generation

arXiv:2606.12789v1 Announce Type: new Abstract: Evaluating retrieval-augmented generation (RAG) systems requires benchmarks that capture diverse question characteristics, yet practitioners lack empirical guidance on which dimensions to vary and at what granularity. We present…

22
arXiv — NLP / Computation & Language research 18d ago

SafeLLM: Extraction as a Hallucination-Resistant Alternative to Rewriting in Safety-Critical Settings

arXiv:2606.12897v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used to access organisational documentation, including standard operating procedures (SOPs), HR policies and institutional guidelines. However, retrieval-augmented generation (RAG)…

29
arXiv — NLP / Computation & Language research 18d ago

X-MADAM-RAG: Diagnosing and Handling Chinese-English Evidence Conflict in Retrieval-Augmented Generation

arXiv:2606.12903v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) systems may receive evidence that is not merely noisy but mutually contradictory. This issue becomes particularly salient in multilingual settings, where retrieved Chinese and English evidence…

8
arXiv — NLP / Computation & Language research 18d ago

HyPE: Category-Aware Hypergraph Encoding with Persistent Edge Embeddings for Persona-Grounded Dialogue

arXiv:2606.13142v1 Announce Type: new Abstract: Persona-grounded dialogue systems aim to produce responses consistent with a speaker's persona, yet existing methods treat personas as a flat set of sentences and fail to model the high-order relations among persona…

25
arXiv — NLP / Computation & Language research 18d ago

SICI: A Semantic-Pragmatic Complexity Index Reveals Regime Shifts in LLM Stance Detection

arXiv:2606.13189v1 Announce Type: new Abstract: Prompt-based LLMs are increasingly used for stance detection, but harder examples are not always repaired by clearer instructions, reasoning prompts, retrieval, or debate. We introduce SICI (Stance Inference Complexity Index), a…

10
arXiv — NLP / Computation & Language research 18d ago

PolyAlign: Conditional Human-Distribution Alignment

arXiv:2606.13227v1 Announce Type: new Abstract: Post-training methods such as supervised fine-tuning (SFT) and preference optimization typically align language models toward a single global assistant behavior. While effective for improving average helpfulness, this can suppress…

29
arXiv — NLP / Computation & Language research 18d ago

Leveraging Audio-LLMs to Filter Speech-to-Speech Training Data

arXiv:2606.13507v1 Announce Type: new Abstract: Large-scale mined corpora provide abundant training data for end-to-end speech-to-speech translation (S2ST) but may contain noise, misalignment, and semantic errors. Filtering noisy data is crucial to maintain robust speech…

30
arXiv — NLP / Computation & Language research 18d ago

When Does Mixing Help? Analyzing Query Embedding Interpolation in Multilingual Dense Retrieval

arXiv:2606.13537v1 Announce Type: new Abstract: While mixed-language querying is ubiquitous in multilingual communities, the sensitivity of dense retrievers to such queries remains poorly understood. We present a ratio-controlled study on mMARCO that systematically evaluates…

11
arXiv — NLP / Computation & Language research 18d ago

SkMTEB: Slovak Massive Text Embedding Benchmark and Model Adaptation

arXiv:2606.13647v1 Announce Type: new Abstract: We introduce SkMTEB, the first comprehensive MTEB-style text embedding benchmark for Slovak, a low-resource West Slavic language, comprising 31 datasets across 7 task types -- nearly 4× the depth of existing multilingual…

25
arXiv — NLP / Computation & Language research 18d ago

Learning to Reason by Analogy via Retrieval-Augmented Reinforcement Fine-Tuning

arXiv:2606.13680v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) has become a standard mechanism for grounding language models in external knowledge, yet conventional retrieval based on lexical or semantic similarity is poorly suited for complex reasoning…

11
arXiv — NLP / Computation & Language research 18d ago

Identifiability Without Gaussianity: Symbolic World Models and Near-Infinite Temporal Consistency

arXiv:2606.12471v1 Announce Type: cross Abstract: Klindt, LeCun, and Balestriero (arXiv:2605.26379) proved that Joint-Embedding Predictive Architectures (JEPAs) achieve linear identifiability, the linear recovery of the world's true latent variables, if and only if the world's…

28
arXiv — NLP / Computation & Language research 18d ago

PersonaDrive: Human-Style Retrieval-Augmented VLA Agents for Closed-Loop Driving Simulation

arXiv:2606.12616v1 Announce Type: cross Abstract: Closed-loop driving simulators typically populate their environments with non-ego traffic agents that behave largely the same way, produced either by rule-based traffic managers or by learned models trained toward a single…

16
arXiv — NLP / Computation & Language research 18d ago

MiniPIC: Flexible Position-Independent Caching in <100LOC

arXiv:2606.13126v1 Announce Type: cross Abstract: Retrieval-augmented and agentic workloads repeatedly prefill recurring predictable structured inputs (which we call "spans") such as documents and code files. Yet, prefix caching in engines such as vLLM cannot reuse their KV…

12
arXiv — NLP / Computation & Language research 18d ago

ComAct: Reframing Professional Software Manipulation via COM-as-Action Paradigm

arXiv:2606.13239v1 Announce Type: cross Abstract: Existing computer-use agents remain fundamentally limited in professional software manipulation: GUI-based agents suffer from fragile visual grounding and long-horizon error accumulation, while API-based approaches struggle with…

34
arXiv — NLP / Computation & Language research 18d ago

TimeLens: On-Device Artifact Recognition with Retrieval-Augmented Question Answering for the Grand Egyptian Museum

arXiv:2606.13267v1 Announce Type: cross Abstract: TimeLens is an AI-powered bilingual mobile guide for the Grand Egyptian Museum (GEM). Pointing a phone at an exhibit, a visitor sees the artifact recognized in real time and can ask follow-up questions answered in English or…

37
arXiv — NLP / Computation & Language research 18d ago

Uncertainty-Aware Hybrid Retrieval for Long-Document RAG

arXiv:2606.13550v1 Announce Type: cross Abstract: Retrieval augmented generation (RAG) depends critically on the quality and granularity of retrieved evidence. Large retrieval units preserve context but often introduce irrelevant content, which can dilute answer bearing evidence…

38
Hugging Face Daily Papers research 18d ago

N-GRPO: Embedding-Level Neighbor Mixing for Enhanced Policy Optimization

Abstract N-GRPO, a novel exploration strategy within GRPO framework, enhances mathematical reasoning in large language models through semantic neighbor mixing that maintains semantic consistency while injecting diversity. Generated by Qwen/Qwen2.5-Coder-32B-Instruct The success…

27
Hugging Face Daily Papers research 18d ago

Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation

Abstract A lightweight approach combining a frozen pretrained time-series foundation model with a simple regression head achieves superior RUL prediction performance compared to various baseline methods on industrial sensor data. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…

15
arXiv — Machine Learning research 19d ago

RoVE: Rotary Value Embeddings Attention for Relative Position-dependent Value Pathways

arXiv:2606.11275v1 Announce Type: new Abstract: Rotary Position Embeddings (RoPE) make attention scores position-relative but leave the value pathway position-blind: the message sent by a value token is the same regardless of its distance from the query. We propose RoVE, a…

10
arXiv — Machine Learning research 19d ago

RePAIR: Predictive Self-Supervised Representation Learning in Chess

arXiv:2606.11860v1 Announce Type: new Abstract: In this paper, we introduce Representation Prediction via Autoencoding using Iterative Refinement (RePAIR) - a novel self-supervised representation learning architecture that synthesizes Masked Autoencoders (MAE), Joint Embedding…

15
arXiv — Machine Learning research 19d ago

Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation

arXiv:2606.11990v1 Announce Type: new Abstract: Remaining Useful Life (RUL) prediction is essential for industrial predictive maintenance, yet many learning-based approaches rely on extensive feature engineering or large labeled datasets to train task-specific sequence models.…

20
arXiv — Machine Learning research 19d ago

Bootstrapped Monitoring: Leveraging Transparent Reasoning to Oversee Stronger AI Agents

arXiv:2606.11998v1 Announce Type: new Abstract: Trusted monitoring is a cornerstone of AI control. However, as frontier models grow more capable, the increasing capabilities gap between trusted and untrusted models may render trusted models unreliable monitors. We introduce…

30
arXiv — Machine Learning research 19d ago

nD-RoPE: A Generalized RoPE for n-Dimensional Position Embedding

arXiv:2606.12146v1 Announce Type: new Abstract: Rotary Position Embedding (RoPE) is widely adopted in Transformer models, yet its extension to high-dimensional domains lacks a unified theoretical formulation. Most existing approaches either apply rotations independently along…

8
