arXiv — NLP / Computation & Language

arXiv — NLP / Computation & Language research 4d ago

ProvenAI: Provenance-Native Traces of Evidence in Generated Answers

arXiv:2606.26449v1 Announce Type: new Abstract: Retrieval-augmented systems routinely present citations alongside generated answers, yet a citation does not confirm that the corresponding source meaningfully shaped the output. This paper introduces ProvenAI, a framework that…

AnySimLite: A Lightweight Few-Shot Similarity Encoder for On-Device Speech-Adjacent Classification

arXiv:2606.26452v1 Announce Type: new Abstract: To minimize privacy concerns and inference latency on edge devices like smartphones, lightweight on-device models remain important for end-user applications. Many of these applications involve natural language classification, but…

Soft Token Alignment for Cross-Lingual Reasoning

arXiv:2606.26466v1 Announce Type: new Abstract: Multilingual large language models often produce inconsistent reasoning and answers for semantically equivalent prompts in different languages. Prior work suggests that intermediate representations can be relatively…

Extracting Problem and Method Sentence from Scientific Papers: A Context-enhanced Transformer Using Formulaic Expression Desensitization

arXiv:2606.26481v1 Announce Type: new Abstract: Billions of scientific papers lead to the need to identify essential parts from the massive text. Scientific research is an activity from putting forward problems to using methods. To learn the main idea from scientific papers, we…

Utilizing Cognitive Signals Generated during Human Reading to Enhance Keyphrase Extraction from Microblogs

arXiv:2606.26485v1 Announce Type: new Abstract: Microblogging platforms generate massive amounts of short, noisy, and dispersed user content, making automatic keyphrase extraction (AKE) an important but challenging task. Prior studies have used eye-tracking signals to improve…

Speaking Numbers to LLMs: Multi-Wavelet Number Embeddings for Time Series Forecasting

arXiv:2606.26487v1 Announce Type: new Abstract: Large language models (LLMs) are attractive for context-aware time series forecasting because they can integrate heterogeneous textual signals, yet their discrete, language-oriented tokenization and embedding interfaces are…

Comparing BERT Sentence-Pair Classification and Few-Shot LLM Prompting for Detecting Threat and Solution Framing in German Climate News

arXiv:2606.26489v1 Announce Type: new Abstract: News media play a central role in shaping public perceptions of climate change, and whether coverage emphasizes threats or solutions has measurable effects on audience engagement and policy support. Automated detection of these…

Nemotron-TwoTower: Diffusion Language Modeling with Pretrained Autoregressive Context

arXiv:2606.26493v1 Announce Type: new Abstract: Diffusion language models offer a promising alternative to autoregressive models due to their potential for parallel and iterative generation. However, existing approaches use a single network for both context representation and…

Temporal Validity in Retrieval Memory: Eliminating Stale-Fact Errors for AI Agents over Evolving Knowledge

arXiv:2606.26511v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) gives agents access to accumulated knowledge, but has no model of time. When a fact changes (e.g., a function is renamed or API restructured), RAG retrieves both the stale and current value with…

Assessing Post-Reform Changes in Risk Disclosure Quality with a Multidimensional Text Analysis Approach

arXiv:2606.26522v1 Announce Type: new Abstract: While corporate narrative disclosures provide crucial information to capital markets, comprehensively evaluating their qualitative changes over time remains challenging. Narrative text is inherently multidimensional, meaning that…

The Inattentional Gap: Task-Conditioned Language and Vision Models Omit the Safety-Critical Signals They Can Otherwise Report

arXiv:2606.26529v1 Announce Type: new Abstract: AI safety is evaluated by how reliably a model detects the hazards it is told to find, yet accidents often arise from the hazard no one specified. We show that conditioning a language or vision model on a narrow task suppresses its…

DiARC: Distinguishing Positive and Negative Samples Helps Improving ARC-like Reasoning Ability of Large Language Models

arXiv:2606.26530v1 Announce Type: new Abstract: The Abstraction and Reasoning Corpus (ARC; Chollet, 2019) contains tasks that require summarizing patterns from limited grid samples and predicting output grids. Recently, many large language model based approaches…

Erase-then-Delta Attention: Decoupling Erase and Write Addresses in Delta-Rule Linear Attention

arXiv:2606.26560v1 Announce Type: new Abstract: Delta-rule linear attention improves recurrent memory updates by correcting what is already stored at the current write address before writing new content. However, the active correction is still anchored to that same write…

Zero-shot Tweet-Level Stance Detection Enhanced by External Knowledge and Reflective Chain-of-Thought Reasoning

arXiv:2606.26571v1 Announce Type: new Abstract: Zero-shot tweet-level stance detection confronts two primary challenges: (1) mitigating the context sparsity inherent in short texts, and (2) establishing the relevance between implicit targets and textual content. While existing…

Closing the Quality Gap in Low-Resource Text-to-Speech: LoRA Fine-Tuning of VoxCPM2 for Khmer and Korean

arXiv:2606.26618v1 Announce Type: new Abstract: Large pretrained text-to-speech (TTS) models sound almost human for well-resourced languages, but much worse for languages that are rare in their training data. We study this quality gap for Khmer and Korean using VoxCPM2, a…

CAT-Q: Cost-efficient and Accurate Ternary Quantization for LLMs

arXiv:2606.26650v1 Announce Type: new Abstract: In this paper, we present CAT-Q, Cost-efficient and Accurate Ternary Quantization, for compressing and accelerating LLMs. Unlike existing state-of-the-art ternary quantization methods that rely on data-intensive and costly…

SocialPersona: Benchmarking Personalized Profiling and Response with Multimodal Social-Media Context

arXiv:2606.26654v1 Announce Type: new Abstract: Personalized language-model assistants are often evaluated through a memory lens: can a model recall preferences users have explicitly stated in dialogue? More comprehensive personalization demands a harder capability -- inferring…

Beyond Logical Forms: LLM-Extracted Patterns for Fallacy Classification

arXiv:2606.26698v1 Announce Type: new Abstract: In today's fast-paced information era, logical fallacies, defined as defective patterns of reasoning, inevitably contribute to the growth of information disorder. However, often fallacies appear in nuanced forms that complicate…

ConvMemory v3: A Validity Context Layer for Conversational Memory via Target-Conditioned Relation Verification

arXiv:2606.26753v1 Announce Type: new Abstract: Conversational memory retrieval optimizes relevance, yet a retrieved memory can be relevant and simultaneously outdated: a later turn updates, corrects, or supersedes it. ConvMemory v3 adds a validity context layer that detects and…

Evaluation Pitfalls and Challenges in Multimedia Event Extraction

arXiv:2606.26775v1 Announce Type: new Abstract: Multimedia event extraction aims to jointly identify events and their arguments across multiple modalities, such as text and images, to support more comprehensive event understanding. While recent work reports steady and…

OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning

arXiv:2606.26790v1 Announce Type: new Abstract: Outcome-based reinforcement learning provides a stable optimization backbone for language agents, but its sparse trajectory-level rewards provide little guidance on which intermediate decisions should be reinforced or suppressed.…

From Vajrayana Tara to Bengali Baul: A Computational Study of Lexical Transmission Across Buddhist, Shakta, and Vaishnava Traditions in Bengal

arXiv:2606.26803v1 Announce Type: new Abstract: We present a computational corpus study of vocabulary relationships across eight tradition layers of Bengali and Sanskrit devotional literature spanning the 8th to 19th centuries, encompassing Buddhist Vajrayana, Shakta Tantra,…

FBK's Long-form SpeechLLMs for IWSLT 2026 Instruction Following

arXiv:2606.26819v1 Announce Type: new Abstract: This paper describes our submission to the IWSLT 2026 Instruction Following shared task. SpeechLLMs are developed for both short-form and long-form speech instruction following under constrained settings. For the short track,…

Cascaded Multi-Granularity Pruning for On-Device LLM Inference in Industrial IoT

arXiv:2606.26861v1 Announce Type: new Abstract: Deploying large language models (LLMs) on Industrial Internet of Things (IIoT) edge devices demands extreme compression, yet existing structured pruning methods collapse at high compression ratios due to one-shot importance…

Information-Aware KV Cache Compression for Long Reasoning

arXiv:2606.26875v1 Announce Type: new Abstract: Reasoning capability has advanced rapidly in large language models (LLMs), leading to an increasing size of key-value (KV) cache in both prefilling and decoding stages. Existing KV cache compression methods mainly rely on attention…

Heterogeneous Neural Predictivity from Language Models During Naturalistic Comprehension

arXiv:2606.26880v1 Announce Type: new Abstract: Language-model representations provide structured, high-dimensional annotations of naturalistic language stimuli and can serve as informative neural predictors during comprehension. We analyzed locked derived data from Brain…

SamaVaani: Auditing and Debiasing Multilingual Clinical ASR for Indian Languages

arXiv:2606.26901v1 Announce Type: new Abstract: Automatic Speech Recognition (ASR) is increasingly used to document clinical encounters, yet its reliability in multilingual and demographically diverse Indian healthcare context remains largely unknown. In this study, we first…

GAVEL: Grounded Caption Error Verification and Localization

arXiv:2606.26923v1 Announce Type: new Abstract: Vision-language models (VLMs) often produce hallucinated or inconsistent outputs, where text and images are not properly aligned. Addressing this issue requires not only detecting misalignment but also explaining the discrepancy…

Term-Centric Hierarchy Induction from Heterogeneous Corpora

arXiv:2606.26963v1 Announce Type: new Abstract: Organizing knowledge from diverse text sources into interpretable hierarchies is crucial for tasks such as policy analysis, innovation monitoring, and exploratory domain mapping. Existing taxonomy induction methods typically rely…

RedVox: Safety and Fairness Gaps in Speech Models Across Languages

arXiv:2606.26968v1 Announce Type: new Abstract: Speech-capable models are increasingly deployed in real-world applications across languages. Yet their safety and fairness beyond English settings and under naturalistic conditions remain understudied. We survey safety reporting…

Auditing Framing-Sensitive Behavioral Instability in Large Language Models for Mental Health Interactions

arXiv:2606.26982v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly being integrated into mental health support tools and other psychologically sensitive conversational applications. In such settings, behavioral stability and consistency are important…

ReaORE: Reasoning-Guided Progressive Open Relation Extraction Empowered by Large Reasoning Models

arXiv:2606.26986v1 Announce Type: new Abstract: Open Relation Extraction (OpenRE) requires a model to extract unseen relations between head and tail entities from unstructured text for real-world applications. The core challenge of OpenRE lies in achieving reliable…

Where Do Models Find Happiness? Emotion Vectors in Open-Source LLMs

arXiv:2606.26987v1 Announce Type: new Abstract: Recent work identified emotion vectors in Claude Sonnet 4.5, which are internal representations that encode emotion concepts, causally influence behavior, and exhibit geometry mirroring human psychological structure. We test the…

MinGram: A Minimalist Unigram Tokenizer with High Compression and Competitive Morphological Alignment

arXiv:2606.27019v1 Announce Type: new Abstract: The Unigram tokenizer uses an elegant representation which makes it straightforward to edit vocabularies, but its training is comparatively heavy and complex. We introduce MinGram (Minimalist Unigram), which keeps the token-list…

Improving General Role-Playing Agents via Psychology-Grounded Reasoning and Role-Aware Policy Optimization

arXiv:2606.27025v1 Announce Type: new Abstract: Building general-purpose role-playing agents that faithfully portray any character from a natural-language profile remains challenging. The dominant paradigm -- supervised fine-tuning -- encourages behavioral mimicry without deep,…

NuclearQAv2: A Structured Benchmark for Evaluating Domain-Science Competence in Large Language Models

arXiv:2606.27047v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated strong performance across a wide range of tasks, but ensuring their reliability in highly technical domains remains a significant challenge. In nuclear engineering, problem solving…

Towards Explainable Adjudicative Variance: Quantifying Judicial Discretion via Gated Multi-Task Learning

arXiv:2606.27069v1 Announce Type: new Abstract: Legal outcome prediction must disentangle objective case facts from adjudicative context. Merit-based rulings rely on factual evidence while technical disposals may hinge on judicial discretion. We propose a Judge-Aware Gated…

The Riddle Riddle: Testing Flexible Reasoning in Large Language Models and Humans

arXiv:2606.27103v1 Announce Type: new Abstract: Humans flexibly adapt their reasoning strategies to the requirements of a given problem. Large language models (LLMs) have performed well on many cognitive tasks, however, it is unclear whether this accuracy is a result of pattern…

Forecasting With LLMs: Improved Generalization Through Feature Steering

arXiv:2606.27199v1 Announce Type: new Abstract: Successful forecasting involves identifying patterns between historical and future states of the world which generalize to future observations. We apply LLMs to a variety of forecasting tasks and inspect their internal states using…

Syntactic Belief Update as the Driver of Garden Path Processing Difficulty

arXiv:2606.27206v1 Announce Type: new Abstract: Garden path sentences present a processing difficulty for humans -- the sentence prefix leads the listener towards one interpretation, until the listener hears a critical word that shows that the initial interpretation was wrong.…

Paved with True Intents: Intent-Aware Training Improves LLM Safety Classification Across Training Regimes

arXiv:2606.27210v1 Announce Type: new Abstract: We argue that safety classifiers should model user intent as an explicit signal between the prompt and the final label. To study this, we introduce AIMS, a human-annotated dataset of 1,724 difficult safety prompts, each paired with…

Compositionality and the lexicon in evolutionary semantics

arXiv:2606.27228v1 Announce Type: new Abstract: Formal semantics has shown that sentence meanings arise by recursively composing lexical meanings, yet much of the literature on semantic universals models either lexicons with fixed signal structures or holistic composition…

CARVE: Content-Aware Recurrent with Value Efficiency for Chunk-Parallel Linear Attention

arXiv:2606.27229v1 Announce Type: new Abstract: Recurrent models must forget in order to remember, yet the state of the art decides what to erase without consulting what is stored -- the gate sees only the arriving token, not the memory it is about to modify. This memory-blind…

Bridging Talk and Thought: Understanding Dialogue Dynamics Across Collaborative Problem-Solving Contexts

arXiv:2606.27233v1 Announce Type: new Abstract: We present a conceptual framework for analyzing dialogue in collaborative problem-solving contexts, with an emphasis on the emerging dynamics of human-AI and multi-agent collaboration. As intelligent systems become active agents…

LMs as Task-Specific Knowledge Bases: An Interpretability Analysis

arXiv:2606.27237v1 Announce Type: new Abstract: Language models (LMs) capture large amounts of factual knowledge applicable to a wide range of tasks, motivating the view of their parameters as a knowledge base. An important property of knowledge bases is that different queries…

How Surprising Is Historical Italian to Language Models? Tokenization Tax, Comprehension Tax, and a Simple Mitigation

arXiv:2606.27275v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly critical to digital library workflows, yet their ability to process historical language remains poorly understood. Historical difficulty is typically treated as a monolithic barrier,…

Multilingual Reasoning Cascades Need More Context

arXiv:2606.27306v1 Announce Type: new Abstract: Translation cascades for reasoning translate the query from another language to English, reason in English, and translate the answer back to the original language. This is a competitive approach to multilingual reasoning, but…

Beyond Surface Forms: A Comprehensive, Mechanism-Oriented Taxonomy of Indirect Linguistic Encoding for LLM-Based Coded Language Detection

arXiv:2606.27314v1 Announce Type: new Abstract: To avoid moderation and surveillance on social media, some users routinely invent indirect linguistic expressions (ILE) that camouflage sensitive meanings. Such expressions surface as algospeak, euphemisms, and adversarial…

LLM-Based Examination of Eligibility Criteria from Securities Prospectuses at the German Central Bank

arXiv:2606.27316v1 Announce Type: new Abstract: Verifying the eligibility of securities as collateral is a key responsibility of the German Central Bank. However, manually verifying these assets against legal and financial criteria within lengthy, semi-structured, and often…

Empowering GUI Agents via Autonomous Experience Exploration and Hindsight Experience Utilization for Task Planning

arXiv:2606.27330v1 Announce Type: new Abstract: Multimodal web agents can assist humans in operating repetitive GUI tasks, where effective task planning is essential for decomposing complex tasks into executable actions. While small open source MLLMs are cost efficient and…
