arXiv — NLP / Computation & Language
500 articles archived
-
arXiv — NLP / Computation & Language research 1d ago
Joint Transcription and Decryption of Images of Encrypted Handwritten Documents: A Comparison with the Traditional Pipeline
arXiv:2606.27700v1 Announce Type: cross Abstract: Historical encrypted manuscripts present a challenging problem at the intersection of cryptology, linguistics, paleography, and computer vision. Current automatic decipherment approaches usually rely on a two-stage pipeline:…
7 -
arXiv — NLP / Computation & Language research 1d ago
Verifiable Geometry Problem Solving: Solver-Driven Autoformalization and Theorem Proposing
arXiv:2606.27926v1 Announce Type: cross Abstract: Geometry Problem Solving has increasingly adopted the neuro-symbolic paradigm, combining neural intuition with symbolic rigor. However, current frameworks suffer from severe bottlenecks in two core stages: autoformalization, which…
28 -
arXiv — NLP / Computation & Language research 1d ago
AI Persuasive Framing in Collective Dilemmas
arXiv:2606.27951v1 Announce Type: cross Abstract: AI agents are promising tools that can act as flexible behavioral nudges to enhance human cooperation in addressing large-scale societal problems. However, evidence on whether AI agents can effectively boost cooperation remains…
32 -
arXiv — NLP / Computation & Language research 1d ago
DG^VoiC: Speaker Clustering for Fraud Investigation under Real Call-Centre Conditions
arXiv:2606.28048v1 Announce Type: cross Abstract: Insurance fraud remains costly and operationally difficult, particularly in call-centre workflows where many customer interactions begin at FNOL. While recent fraud detection methods mainly rely on structured data, text, or…
19 -
arXiv — NLP / Computation & Language research 1d ago
Single and Multi Truth Data Fusion using Large Language Models
arXiv:2606.28062v1 Announce Type: cross Abstract: Data fusion, also known as truth discovery, is a data integration problem that aims to determine the correct value or set of values for each attribute of an object when presented with potentially conflicting values from multiple…
27 -
arXiv — NLP / Computation & Language research 1d ago
Scaling limit of the Random Language Model
arXiv:2606.28105v1 Announce Type: cross Abstract: We develop a quantitative theory of the Random Language Model (RLM), an ensemble of stochastic context-free grammars, in a scaling limit where the number of hidden symbols $N \to \infty$ while the grammar temperature…
10 -
arXiv — NLP / Computation & Language research 1d ago
HPRO: Hierarchical Progressive Reward Optimization via Preference Extraction for Emotional Text-to-Speech
arXiv:2606.28249v1 Announce Type: cross Abstract: Recently, Large Language Model (LLM)-based Text-to-Speech (TTS) models have achieved remarkable naturalness. However, the standard Supervised Fine-Tuning paradigm often converges to statistically averaged prosody, limiting…
20 -
arXiv — NLP / Computation & Language research 1d ago
Towards Automating Scientific Review with Google's Paper Assistant Tool
arXiv:2606.28277v1 Announce Type: cross Abstract: Artificial intelligence is driving a revolution in scientific discovery, accelerating everything from hypothesis generation to mathematical theorem proving. However, this rapid acceleration is creating a systemic challenge:…
24 -
arXiv — NLP / Computation & Language research 1d ago
Continual Memorization of Factoids in Language Models
arXiv:2411.07175v3 Announce Type: replace Abstract: As new knowledge rapidly accumulates, language models (LMs) with pretrained knowledge quickly become obsolete. A common approach to updating LMs is fine-tuning them directly on new knowledge. However, recent studies have shown…
27 -
arXiv — NLP / Computation & Language research 1d ago
ReFreeKV: Towards Threshold-Free KV Cache Compression
arXiv:2502.16886v4 Announce Type: replace Abstract: To reduce memory consumption during LLM inference, a handful of methods have been proposed for KV cache pruning. While these techniques can accomplish lossless memory reduction on many datasets, they often hinge on an…
28 -
arXiv — NLP / Computation & Language research 1d ago
On the Effect of Uncertainty on Layer-wise Inference Dynamics
arXiv:2507.06722v2 Announce Type: replace Abstract: Understanding how large language models (LLMs) internally represent and process their predictions is central to detecting uncertainty and preventing hallucinations. While several studies have shown that models encode…
33 -
arXiv — NLP / Computation & Language research 1d ago
Training-free Truthfulness Detection via Sparse MLP Value Vectors
arXiv:2509.17932v2 Announce Type: replace Abstract: Large language models (LLMs) are prone to generating factually incorrect content, motivating methods for assessing truthfulness from internal model signals. While supervised probing approaches can be effective, they require…
5 -
arXiv — NLP / Computation & Language research 1d ago
Check Yourself Before You Wreck Yourself: Selectively Quitting Improves LLM Agent Safety
arXiv:2510.16492v4 Announce Type: replace Abstract: As Large Language Model (LLM) agents increasingly operate in complex environments with real-world consequences, their safety becomes critical. While uncertainty quantification is well-studied for single-turn tasks, multi-turn…
20 -
arXiv — NLP / Computation & Language research 1d ago
Hybrid Fact-Checking that Integrates Knowledge Graphs, Large Language Models, and Search-Based Retrieval Agents Improves Interpretable Claim Verification
arXiv:2511.03217v2 Announce Type: replace Abstract: Large language models (LLMs) excel in generating fluent utterances but can lack reliable grounding in verified information. At the same time, knowledge-graph-based fact-checkers deliver precise and interpretable evidence, yet…
4 -
arXiv — NLP / Computation & Language research 1d ago
Safe Language Generation in the Limit
arXiv:2601.08648v2 Announce Type: replace Abstract: Recent results in learning a language in the limit have shown that, although language identification is impossible, language generation is tractable. As this foundational area expands, we need to consider the implications of…
5 -
arXiv — NLP / Computation & Language research 1d ago
Learning to Evict from Key-Value Cache
arXiv:2602.10238v2 Announce Type: replace Abstract: The growing size of Large Language Models (LLMs) makes efficient inference challenging, primarily due to the memory demands of the autoregressive Key-Value (KV) cache. Existing eviction or compression methods reduce cost but…
25 -
arXiv — NLP / Computation & Language research 1d ago
Measuring the Redundancy of Decoder Layers in SpeechLLMs
arXiv:2603.05121v2 Announce Type: replace Abstract: Speech Large Language Models route speech encoder representations into an LLM decoder that typically accounts for over 90% of total parameters. We study how much of this decoder capacity is actually needed for speech tasks.…
36 -
arXiv — NLP / Computation & Language research 1d ago
LiveClawBench: Benchmarking LLM Agents on Complex, Real-World Assistant Tasks
arXiv:2604.13072v2 Announce Type: replace Abstract: OpenClaw-style personal assistants extend LLM agents from isolated tool use to open-ended, stateful, and personalized software environments. Evaluating these assistants is fundamentally a fidelity problem: benchmarks must be…
28 -
arXiv — NLP / Computation & Language research 1d ago
Copy First, Translate Later: Interpreting Translation Dynamics in Multilingual Pretraining
arXiv:2604.17633v2 Announce Type: replace Abstract: Large language models exhibit impressive cross-lingual capabilities. However, prior work analyzes this phenomenon through isolated factors and at sparse points during training, limiting our understanding of how cross-lingual…
29 -
arXiv — NLP / Computation & Language research 1d ago
Subject-level Inference for Realistic Text Anonymization Evaluation
arXiv:2604.21211v2 Announce Type: replace Abstract: Current text anonymization evaluation relies on span-based metrics that fail to capture what an adversary could actually infer, and assumes a single data subject, ignoring multi-subject scenarios. To address these limitations,…
6 -
arXiv — NLP / Computation & Language research 1d ago
Characterizing the Expressivity of Local Attention in Transformers
arXiv:2605.00768v3 Announce Type: replace Abstract: The transformer is the most popular neural architecture for language modeling. The cornerstone of the transformer is its global attention mechanism, which lets the model aggregate information from all preceding tokens before…
16 -
arXiv — NLP / Computation & Language research 1d ago
ELF: Embedded Language Flows
arXiv:2605.10938v2 Announce Type: replace Abstract: Diffusion and flow-based models have become the de facto approaches for generating continuous data, e.g., in domains such as images and videos. Their success has attracted growing interest in applying them to language modeling.…
22 -
arXiv — NLP / Computation & Language research 1d ago
Machine Learning for Coding Retail Product Names to Consumer-Price Categories: A Rule-plus-Bag-of-Words Pipeline with Reliability-Weighted Human-in-the-Loop Labeling
arXiv:2606.02004v2 Announce Type: replace Abstract: Consumer-price measurement increasingly draws on alternative data sources -- scanner, web-scraped, and transaction/receipt data -- whose product descriptions are short, noisy, and carry no standard product code, so each item…
4 -
arXiv — NLP / Computation & Language research 1d ago
Self-Stigma Is Not a Monolith, but Generic Empathy Is: Persona-Conditioned LLM Support for People Who Use Drugs
arXiv:2606.23387v2 Announce Type: replace Abstract: Self-stigma predicts treatment avoidance and disengagement among people who use drugs (PWUD), yet conversational systems aiming to provide support typically treat self-stigma expression as a uniform signal. We present a…
9 -
arXiv — NLP / Computation & Language research 1d ago
SIGNER: Temporally Grounded Sign Language Generation via Time-Resolved Conditioning
arXiv:2506.07460v2 Announce Type: replace-cross Abstract: Sign language generation (SLG), also known as text-to-sign generation, aims to bridge the communication gap between signers and non-signers. Unlike many other generative tasks, SLG must satisfy two fundamental linguistic…
16 -
arXiv — NLP / Computation & Language research 1d ago
PRISON: Unmasking the Criminal Potential of Large Language Models
arXiv:2506.16150v4 Announce Type: replace-cross Abstract: As large language models (LLMs) advance, concerns about their misconduct in complex social contexts intensify. Existing research has overlooked the systematic understanding and assessment of their criminal capability in…
37 -
arXiv — NLP / Computation & Language research 1d ago
Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting
arXiv:2510.18874v3 Announce Type: replace-cross Abstract: Adapting language models (LMs) to new tasks via post-training carries the risk of degrading existing capabilities -- a phenomenon classically known as catastrophic forgetting. In this paper, toward identifying guidelines…
38 -
arXiv — NLP / Computation & Language research 1d ago
Psychometric Comparability of LLM-Based Digital Twins
arXiv:2601.14264v2 Announce Type: replace-cross Abstract: Large language models (LLMs) act as digital twins for human respondents, yet their psychometric comparability remains uncertain. We propose a construct validity framework spanning construct representation and the…
23 -
arXiv — NLP / Computation & Language research 1d ago
EXPLORE-Bench: Egocentric Scene Prediction with Long-Horizon Reasoning
arXiv:2603.09731v3 Announce Type: replace-cross Abstract: Multimodal large language models (MLLMs) are increasingly considered as a foundation for embodied agents, yet it remains unclear whether they can reliably reason about the long-term physical consequences of actions from…
34 -
arXiv — NLP / Computation & Language research 1d ago
RateQuant: Optimal Mixed-Precision KV Cache Quantization via Rate-Distortion Theory
arXiv:2605.06675v2 Announce Type: replace-cross Abstract: Large language models cache all previously computed key-value (KV) pairs during generation, and this KV cache grows linearly with sequence length, making it a primary memory bottleneck for serving. Quantizing the KV cache…
5 -
arXiv — NLP / Computation & Language research 1d ago
Auto-Configuring Scientific Simulators with Lightweight Coding-Agent Adapters
arXiv:2606.09774v2 Announce Type: replace-cross Abstract: Configuring an advanced scientific simulator, translating a modeling goal into a valid, runnable input deck, is a persistent bottleneck that costs domain scientists hours to days. Input decks are executable interfaces:…
33 -
arXiv — NLP / Computation & Language research 1d ago
Multimodal Evaluator Preference Collapse: Cross-Modal Coupling in Self-Evolving Agents
arXiv:2606.16682v3 Announce Type: replace-cross Abstract: When AI agents use language models to evaluate their own outputs in a feedback loop, systematic biases emerge. We show that Evaluator Preference Collapse (EPC) is dramatically amplified in multimodal settings. Using…
4 -
arXiv — NLP / Computation & Language research 1d ago
SingGuard: A Policy-Adaptive Multimodal LLM Guardrail with Dynamic Reasoning
arXiv:2606.22873v3 Announce Type: replace-cross Abstract: Vision-language models (VLMs) are increasingly deployed in consumer, medical, financial, and enterprise applications. This broad deployment expands the safety surface: risks can arise from multimodal question answering,…
31 -
arXiv — NLP / Computation & Language research 4d ago
HierBias: Context-Conditioned Hierarchical Media Bias Detection with Multi-Task Type Classification
arXiv:2606.26100v1 Announce Type: new Abstract: Media bias detection is a critical task for ensuring fair and balanced information dissemination, yet existing sentence-level approaches classify each sentence independently, ignoring inter-sentence contextual signals that human…
17 -
arXiv — NLP / Computation & Language research 4d ago
Know2Guess: A Contamination-Aware Multi-Zone Benchmark for Knowledge-Boundary Evaluation in Large Language Models
arXiv:2606.26101v1 Announce Type: new Abstract: Reliable evaluation of large language models should separate supported answering from unsupported guessing without conflating either with data contamination, prompt idiosyncrasy, or generic refusal behavior. We present a…
21 -
arXiv — NLP / Computation & Language research 4d ago
Helpfulness Hurts: Domain-Dependent Degradation of Mid-Trained Compassion Values Under Post-Training
arXiv:2606.26102v1 Announce Type: new Abstract: Standard post-training pipelines apply supervised fine-tuning (SFT) and reinforcement learning (RL) to make language models helpful, but these processes may inadvertently degrade values instilled during pre-training. We investigate…
22 -
arXiv — NLP / Computation & Language research 4d ago
Investigating LLM's Problem Solving Capability -- a Study on Statics Questions
arXiv:2606.26103v1 Announce Type: new Abstract: Large Language Models (LLMs) have rapidly influenced many aspects of society, particularly education, due to their demonstrated ability to complete assignments and examinations across a wide range of subjects. Although prior…
35 -
arXiv — NLP / Computation & Language research 4d ago
Assert, don't describe: Linguistic features that shift LLM reasoning about animal welfare
arXiv:2606.26104v1 Announce Type: new Abstract: Animal-welfare advocates produce a lot of writing, and increasingly that writing trains the language models that millions of people then ask about animal welfare. Using vocabulary-matched stance-contrast probes on a held-out…
19 -
arXiv — NLP / Computation & Language research 4d ago
Context Recycling for Long-Horizon LLM Inference
arXiv:2606.26105v1 Announce Type: new Abstract: Large language models (LLMs) exhibit strong capabilities in short-context reasoning but degrade in performance over long conversational horizons due to context window limitations and inefficient token usage. We introduce…
27 -
arXiv — NLP / Computation & Language research 4d ago
Reducing Conversational Escalation in Large Language Model Dialogue with Nonviolent Communication Constraints
arXiv:2606.26106v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used in emotionally charged situations involving interpersonal conflict, frustration, and distress. While prior safety research has focused on preventing explicit harms such as toxic or…
26 -
arXiv — NLP / Computation & Language research 4d ago
Low Resource Multimodal Translation of Nepali Spoken Words into Emotion-Conditioned Sign Language Avatars
arXiv:2606.26107v1 Announce Type: new Abstract: Sign language communication systems that integrate emotional expression remain underexplored, particularly for low-resource languages. This pilot study presents NEST-V1 (Nepali Emotion and Speech Transformer - Version 1), a…
37 -
arXiv — NLP / Computation & Language research 4d ago
Where Larger Models Excel: The Primacy of Constraint-Guided Reasoning
arXiv:2606.26108v1 Announce Type: new Abstract: Larger language models consistently outperform smaller ones on reasoning benchmarks, yet the reasoning differences underlying this gap remain underexplored. Across benchmarks in mathematics, physics, chemistry, and programming, we…
35 -
arXiv — NLP / Computation & Language research 4d ago
From Lexicon to AI: A Structured-Data Pipeline for Specialized Conversational Systems in Low-Resource Languages
arXiv:2606.26112v1 Announce Type: new Abstract: Low-resource languages face a critical challenge in AI development: creating specialized conversational systems without access to massive training corpora. We present a systematic methodology for transforming structured linguistic…
36 -
arXiv — NLP / Computation & Language research 4d ago
Dynamic-dLLM: Dynamic Cache-Budget and Adaptive Parallel Decoding for Training-Free Acceleration of Diffusion LLM
arXiv:2606.26120v1 Announce Type: new Abstract: Diffusion Large Language Models (dLLMs) offer a promising alternative to autoregressive models, excelling in text generation tasks due to their bidirectional attention mechanisms. However, their computational complexity scales on…
15 -
arXiv — NLP / Computation & Language research 4d ago
Thinking Like a Scientist? A Structural Study of LLM-Generated Research Methods
arXiv:2606.26130v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly used to guide research methodology, yet their default methodological tendencies under minimal prompting remain unclear. Here, we prompt GPT-5.1, Gemini 3 Pro, and DeepSeek-V3.2 with an…
38 -
arXiv — NLP / Computation & Language research 4d ago
From Structure to Synergy: A Survey of Vision-Language Perception Paradigm Evolution in Multimodal Large Language Models
arXiv:2606.26196v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have recently made remarkable progress in unifying vision-language understanding and reasoning, especially following the introduction of models such as OpenAI's O-series and DeepSeek's…
12 -
arXiv — NLP / Computation & Language research 4d ago
Phonetic and semantic analyses of spoken corpora of Beijing and Taiwan Mandarin indicate that the neutral tone is a lexical tone
arXiv:2606.26360v1 Announce Type: new Abstract: The neutral, or floating, tone of Mandarin Chinese is a tone with an enigmatic set of properties. It has been described as a reduced tone, or as a tone that sometimes is lexically fixed but that can also be toneless. In…
4 -
arXiv — NLP / Computation & Language research 4d ago
Charting the Growth of Social-Physical HRI (spHRI): A Systematic Review Pipeline Augmented by Small Language Models
arXiv:2606.26382v1 Announce Type: new Abstract: Social-physical human-robot interaction (spHRI) has grown rapidly across robotics, human-computer interaction, human-robot interaction, and haptics. Yet, fragmented terminology and inconsistent methodologies make systematic…
35 -
arXiv — NLP / Computation & Language research 4d ago
ProfileFoundry: A Synthetic Person-Object Substrate for Privacy, Memory, and Tool-Use Evaluation in LLM Agent
arXiv:2606.26403v1 Announce Type: new Abstract: Foundation-model research increasingly needs data about people: user state, personal histories, relationships, contact-like fields, documents, and longitudinal updates. Real user data is difficult to share, perturb, audit, or…
34 -
arXiv — NLP / Computation & Language research 4d ago
ConflictScore: Identifying and Measuring How Language Models Handle Conflicting Evidence
arXiv:2606.26437v1 Announce Type: new Abstract: Existing metrics for factuality and faithfulness evaluate whether an answer is supported or contradicted by its grounding documents, but they fail to capture when both supporting and contradicting evidence coexist. We introduce…
6