Tag

Rag

500 articles archived under #rag · RSS

arXiv — NLP / Computation & Language research 6d ago

Prague Dependency Treebank -- Consolidated 2.0: Enriching a Complex Annotation Scheme

arXiv:2606.24324v1 Announce Type: new Abstract: The Prague Dependency Treebank framework is unique in its attempt to systematically include and link different layers of language, including a meaning representation with several types of inter-sentential phenomena, especially…

12
arXiv — NLP / Computation & Language research 6d ago

Meet UD_Czech-PDTC: A Large and Genre-Rich Treebank in Universal Dependencies

arXiv:2606.24337v1 Announce Type: new Abstract: Czech has been part of Universal Dependencies since its first release in 2015. It has also been one of the best represented languages, with the Prague Dependency Treebank being an order of magnitude larger than most other UD…

20
arXiv — NLP / Computation & Language research 6d ago

To Compare, or Not to Compare: On Methodological Practices in Evaluating Social Bias

arXiv:2606.24596v1 Announce Type: new Abstract: As Large Language Models are increasingly deployed in critical applications, robustly evaluating their social biases is paramount. However, the current literature suffers from widespread methodological fragmentation, which yields…

22
arXiv — NLP / Computation & Language research 6d ago

Privacy-Preserving RAG via Multi-Agent Semantic Rewriting: Achieving Confidentiality Without Compromising Contextual Fidelity

arXiv:2606.24623v1 Announce Type: new Abstract: Retrieval-Augmented Generation enhances large language models by incorporating external knowledge, but deploying it in sensitive scenarios risks privacy leakage via malicious prompts. To address this, we propose a multi-agent…

30
arXiv — NLP / Computation & Language research 6d ago

AI-PAVE-Br: Leveraging Large Language Models for Enhanced Product Attribute Value Extraction through a Golden Set Approach

arXiv:2606.24655v1 Announce Type: new Abstract: The explosive growth and complexity of product data within the dynamic Brazilian e-commerce landscape demand robust and specialized methods for structured information extraction. Traditional approaches to Product Attribute Value…

5
arXiv — NLP / Computation & Language research 6d ago

DREAM: Dense Retrieval Embeddings via Autoregressive Modeling

arXiv:2606.24667v1 Announce Type: new Abstract: Dense retrieval embedding models are a fundamental component of modern retrieval-based AI systems. Most dense retrievers are trained with contrastive objectives, which require labeled positive and negative document pairs that are…

23
arXiv — NLP / Computation & Language research 6d ago

Are We Ready For An Agent-Native Memory System?

arXiv:2606.24775v1 Announce Type: new Abstract: Memory for large language model (LLM) agents has rapidly evolved from simple retrieval-augmented mechanisms into a data management system that supports persistent information storage, retrieval, update, consolidation, and dynamic…

8
arXiv — NLP / Computation & Language research 6d ago

Neuro-Symbolic Drive: Rule-Grounded Faithful Reasoning for Driving VLAs

arXiv:2606.23938v1 Announce Type: cross Abstract: Driving VLA models incorporating Chain-of-Thought (CoT) reasoning are attractive because they leverage pretrained VLM representations and expose intermediate decisions in natural language, yet current rationales often lack the…

4
Hugging Face Daily Papers research 6d ago

DREAM: Dense Retrieval Embeddings via Autoregressive Modeling

Abstract DREAM trains dense retrieval embeddings using autoregressive language model attention mechanisms to supervise document-query similarity without requiring labeled examples. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Dense retrieval embedding models are a fundamental…

22
Hugging Face Daily Papers research 6d ago

Go-with-the-Track: Video Compositing and Motion Control with Point Tracking

Abstract Go-with-the-Track unifies motion control and reference image compositing in video generation by using point-track embeddings with spatial-aware encoding and video diffusion transformers. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Filmmaking demands precise motion…

32
Hugging Face Daily Papers research 7d ago

KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking

Abstract KaLM-Reranker-V1 is a fast reranker that decouples query and passage computation using encoder-decoder architecture with Matryoshka embedding pooling and cross-attention for efficient relevance modeling. Generated by Qwen/Qwen2.5-Coder-32B-Instruct As retrieval systems…

32
Hugging Face Daily Papers research 7d ago

EvoEmbedding: Evolvable Representations for Long-Context Retrieval and Agentic Memory

Abstract EvoEmbedding is a dynamic embedding model that generates adaptive representations by maintaining a continuously updated latent memory, enabling improved retrieval performance in long-context scenarios. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Existing embedding…

32
Hugging Face Daily Papers research 7d ago

DailyReport: An Open-ended Benchmark for Evaluating Search Agents on Daily Search Tasks

Abstract Search agents face challenges in real-world evaluation due to limited benchmarks and coarse metrics, necessitating more nuanced assessment approaches. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Search Agents (SAs) typically leverage large language models (LLMs) to…

14
Hugging Face official-blog 7d ago

Experimenting with the proposed Cross-Origin Storage API in Transformers.js

Published June 23, 2026. Thomas Steiner (tomayac, google). (This is a guest post by Developer Relations Engineer Thomas Steiner from the Chrome team at…

30
Hugging Face Daily Papers research 7d ago

SproutRAG: Attention-Guided Tree Search with Progressive Embeddings for Long-Document RAG

Abstract SproutRAG is an attention-guided hierarchical retrieval-augmented generation framework that organizes sentence-level chunks into semantically coherent units using learned inter-sentence attention, enabling multi-granularity retrieval without additional LLM calls or…

33
Hugging Face Daily Papers research 7d ago

MCompassRAG: Topic Metadata as a Semantic Compass for Paragraph-Level Retrieval

Abstract MCompassRAG enhances retrieval-augmented generation by using topic-level metadata to guide chunk selection, improving both efficiency and precision in complex research tasks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Retrieval-augmented generation (RAG) systems…

32
Hugging Face Daily Papers research 8d ago

BrainG3N: A Dual-Purpose Tokenizer for Controllable 3D Brain MRI Generation

Abstract A 3D brain MRI generative model uses a masked-autoencoder tokenizer to create clinically informative embeddings that support both medical task performance and controlled image generation. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Three-dimensional (3D) brain MRI is…

6
Vercel — AI dev-tools 8d ago

Workflow SDK now compresses run and step payloads

The Workflow SDK 5 beta now compresses all run, hook, and step inputs and outputs with zstd. Compression kicks in automatically, but only when it helps. Small payloads stay as-is, larger ones get compressed before they're persisted. Compressed payloads use less storage and are…

16
r/MachineLearning community 8d ago

An Update on Matrix Recurrent Units, an Attention Alternative [R]

I recently revisited my matrix recurrent units algorithm (the MRU), a novel linear-time sequence architecture I created as an alternative to attention. I explain it in depth at the repo, but the gist is the MRU works by transforming the embedding into an input state matrix,…

29
r/LocalLLaMA community 9d ago

Help with a Local Document RAG System (Storage + Ingestion + Query + Highlighting)

Hey folks, I’m working on designing a local, offline document retrieval + LLM pipeline and would love your input on the architecture. Here’s what I’m aiming for: Storage Upload PDF, DOCX, XLSX, CSV, tables All data stored locally (no cloud) Document Ingestion Watch folder (e.g.,…

29
Hugging Face Daily Papers research 10d ago

LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents

Abstract LEDGERAGENT is a method for customer service agents that maintains task states in a separate ledger to improve policy adherence and state management during tool calling. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Policy-adherent tool-calling agents in customer-service…

36
Hugging Face Daily Papers research 10d ago

Configurable Clinical Information Extraction with Agentic RAG: What Works, What Breaks, and Why

Abstract ACIE, an agentic RAG system deployed in a clinical setting, demonstrates high accuracy in extracting medical information from complex patient contexts, achieving 96.5% acceptance rate by nuclear-medicine physicians across 7,326 judgments. Generated by…

5
Hugging Face Daily Papers research 10d ago

Duration Aware Scheduling for ASR Serving Under Workload Drift

Abstract Duration-aware scheduling policies improve ASR serving latency by leveraging audio length as a predictor for processing time, with SJF and HRRN algorithms showing significant median latency reductions while maintaining throughput. Generated by…

26
Hugging Face Daily Papers research 10d ago

Taylor-Calibrate: Principled Initialization for Hybrid Linear Attention Distillation

Abstract Hybrid linear attention models can be improved through a novel initialization technique that enhances conversion from pretrained Transformers by leveraging teacher attention statistics and alignment steps. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Hybrid linear…

6
Hugging Face Daily Papers research 10d ago

DragMesh-2: Physically Plausible Dexterous Hand-Object Interaction with Articulated Objects

Abstract DragMesh-2 enables dexterous hand-object interaction through contact-driven manipulation, with PICA enhancing robustness under varying contact loads without tactile feedback. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Dexterous interaction with articulated objects is…

19
arXiv — Machine Learning research 11d ago

Spectral Retrieval-Augmented Time-Series Forecasting

arXiv:2606.19412v1 Announce Type: new Abstract: Time series forecasting leverages historical patterns to predict future values, but traditional methods face challenges when dealing with complex, non-stationary patterns that are difficult to memorize during training.…

24
arXiv — Machine Learning research 11d ago

Concept Flow Models: Anchoring Concept-Based Reasoning with Hierarchical Bottlenecks

arXiv:2606.19489v1 Announce Type: new Abstract: Concept Bottleneck Models (CBMs) enhance interpretability by projecting learned features into a human-understandable concept space. Recent approaches leverage vision-language models to generate concept embeddings, reducing the need…

8
arXiv — Machine Learning research 11d ago

On the Oracle Complexity of Interpolation-Based Gradient Descent

arXiv:2606.19878v1 Announce Type: new Abstract: Recent work on first-order optimizers for empirical risk minimization (ERM) has suggested that smoothness of ERM loss functions in the training data, rather than in the optimization parameters, can be leveraged to improve the…

19
arXiv — Machine Learning research 11d ago

Exploring the potential of AlphaEarth and TESSERA embeddings for Fine-scale Local Climate Zone Mapping: A case study across five cities in Switzerland

arXiv:2606.20034v1 Announce Type: new Abstract: Understanding urban spatial morphology is critical for climate modeling, risk assessment, and sustainable urban design, and Local Climate Zone (LCZ) mapping provides the basic framework for this. However, many cities still use…

10
arXiv — Machine Learning research 11d ago

Multi-Modal Contrastive Learning for Implicit Earth Embeddings via Location Tying

arXiv:2606.20167v1 Announce Type: new Abstract: Spatial prediction tasks are often limited by a lack of high-quality labelled ground-truth observations. To overcome this challenge, self-supervised pre-training is a possible solution, with contrastive learning dominant for…

6
arXiv — Machine Learning research 11d ago

Boundary Embedding Shaping with Adaptive Contrastive Learning for Graph Structural Disentanglement

arXiv:2606.20283v1 Announce Type: new Abstract: Graph neural networks (GNNs) excel at aggregating neighbor information for classification, yet their performance is hindered by graph structural entanglement, where spurious correlations from semantically irrelevant neighbors…

15
arXiv — NLP / Computation & Language research 11d ago

Sign-Language Datasets at Scale: A Comprehensive Survey on Resources, Benchmarks, and Annotation Standards

arXiv:2606.19352v1 Announce Type: new Abstract: Sign languages are expressive visual languages used by Deaf and Hard-of-Hearing (DHH) communities. Despite substantial progress in sign-language recognition, translation, and production, advances remain constrained by fragmented…

16
arXiv — NLP / Computation & Language research 11d ago

CacheWeaver: Cache-Aware Evidence Ordering for Efficient Grounded RAG Inference

arXiv:2606.19667v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) improves factual grounding, but it also lengthens prompts and raises prefill cost. Prefix caching in serving engines such as vLLM reduces this cost only when requests share the same token…

15
arXiv — NLP / Computation & Language research 11d ago

Leverage Is Not Reach: A Control-Window Law for Single-Neuron Steering in Language Models

arXiv:2606.19831v1 Announce Type: new Abstract: Aligned language models gate behaviors such as refusal and language routing through sparse feed forward neurons, yet no theory predicts when a single neuron intervention controls a behavior coherently rather than collapsing the…

8
arXiv — NLP / Computation & Language research 11d ago

When Does Streaming Tool Use Help? Characterizing Tool-Intent Stabilization in Streaming Retrieval-Augmented Generation

arXiv:2606.20113v1 Announce Type: new Abstract: Streaming Retrieval-Augmented Generation (Streaming RAG) reduces user-perceived latency by issuing tool queries in parallel with ongoing user input, before the utterance is complete. Reported gains are aggregate, yet the…

21
arXiv — NLP / Computation & Language research 11d ago

CATCH-ME if you RAG: a dataset of Contextually Annotated multi-Turn Counterspeech against Hate and Misinformation Exchanges

arXiv:2606.20369v1 Announce Type: new Abstract: Online hate speech and misinformation frequently overlap, yet NLP research has mainly treated them in isolation. While LLMs represent a scalable solution for assisting humans in the generation of counterspeech for both threats,…

5
arXiv — NLP / Computation & Language research 11d ago

Toten: Knowledge-Based Ontological Tokenization Of Physical Quantities And Technical Notation In Brazilian Portuguese

arXiv:2606.19626v1 Announce Type: cross Abstract: Byte-Pair Encoding tokenization is statistically efficient for vocabulary compression, but semantically blind to structured technical entities, fragmenting physical quantities, numbers, units, and symbolic expressions into…

10
arXiv — NLP / Computation & Language research 11d ago

A Layered Security Framework Against Prompt Injection in RAG-Based Chatbots

arXiv:2606.19660v1 Announce Type: cross Abstract: Prompt injection is ranked as the most critical vulnerability in large language model (LLM) deployments by the OWASP Top 10 for LLM Applications, yet existing defenses operate at isolated pipeline stages and remain incomplete.…

25
arXiv — NLP / Computation & Language research 11d ago

LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents

arXiv:2606.20529v1 Announce Type: cross Abstract: Policy-adherent tool-calling agents in customer-service domains must maintain task states across turns while calling tools and obeying domain policies. Task states consist of relevant facts, identifiers, constraints, and…

27
arXiv — NLP / Computation & Language research 11d ago

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

arXiv:2603.16606v3 Announce Type: replace Abstract: Cross-lingual sentence encoders typically cover only a few hundred languages and often trade downstream quality for stronger alignment, limiting their adoption. We introduce OmniSONAR, a new family of omnilingual, cross-lingual…

7
Hugging Face Daily Papers research 11d ago

REVES: REvision and VErification--Augmented Training for Test-Time Scaling

Abstract A two-stage iterative framework alternates between data augmentation and policy optimization to improve LLM reasoning by leveraging intermediate correction steps, achieving superior performance on coding benchmarks and constraint satisfaction problems. Generated by…

23
r/LocalLLaMA community 11d ago

LFM2.5-Embedding-350M & LFM2.5-ColBERT-350M

LFM2.5-Embedding-350M is a dense bi-encoder for fast multilingual retrieval. It produces a single vector per document — the smallest, fastest index — for reliable cross-lingual search across 11 languages. Best-in-class multilingual accuracy for a dense embedder of its size.…

27
r/MachineLearning community 11d ago

Is ACL now irrelevant? [D]

I just read in a comment on another post that an ACL paper is apparently considered a weak signal in the community, and that having an ACL first-author paper is not a great plus for improving chances at finding a PhD position. Is this some kind of ragebait or is academia becoming…

26
Hugging Face Daily Papers research 11d ago

Morpheus: A Morphology-Aware Neural Tokenizer and Word Embedder for Turkish

Abstract A neural morpheme-boundary model for Turkish achieves lossless tokenization and morphology-aware embeddings with improved efficiency and performance over traditional subword methods. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Turkish is agglutinative: meaning is…

27
arXiv — Machine Learning research 12d ago

Attribution-Guided and Coverage-Maximized Pruning for Structural MoE Compression

arXiv:2606.18304v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) models scale compute efficiently, yet remain expensive to deploy due to their substantial memory footprint and inference overhead. Prior compression methods mainly operate at the expert level, either…

10
arXiv — Machine Learning research 12d ago

Fisher Width: A Geometric Measure of Complexity on Statistical Manifolds

arXiv:2606.18306v1 Announce Type: new Abstract: Gaussian width is a central geometric complexity measure in high-dimensional probability, compressed sensing, convex optimization, and learning theory. It quantifies the average extent of a set along random directions, thereby…

16
arXiv — Machine Learning research 12d ago

Structured Representation Learning with Locally Linear Embeddings and Adaptive Feature Fusion

arXiv:2606.18469v1 Announce Type: new Abstract: Neuroscientific research has revealed that the brain encodes complex behaviors by leveraging structured, low-dimensional manifolds and dynamically fusing multiple sources of information through adaptive gating mechanisms. Inspired…

37
arXiv — Machine Learning research 12d ago

TS-Fault: Benchmarking Time Series Forecasters Against Structural Faults

arXiv:2606.18539v1 Announce Type: new Abstract: Time series forecasting (TSF) underpins consequential decisions in energy, transportation, finance, and healthcare, yet TSF models are almost universally ranked by a single number (e.g., average error) on clean held-out data, under…

7
arXiv — Machine Learning research 12d ago

InTrain: Intrinsic Trainability for Zero-Cost Neural Architecture Search

arXiv:2606.18676v1 Announce Type: new Abstract: Training-free neural architecture search promises efficient discovery of high-performance networks without costly training. However, existing zero-cost proxies rely on fragmented heuristics that fail to capture the fundamental…

5
arXiv — Machine Learning research 12d ago

Dual-Channel Grounded World Modeling (DCGWM): Structural Prevention of Objective Interference Collapse via Heterogeneous External Grounding with Inward-Only Gradient Flow

arXiv:2606.18688v1 Announce Type: new Abstract: Joint Embedding Predictive Architectures (JEPAs) are a leading approach to world model representation learning. We identify a failure mode in JEPA-based world models grounded against two qualitatively distinct external signals:…

12
