Research papers: 500 articles archived under #paper
arXiv — NLP / Computation & Language research 1d ago DysLexLens: A Low-Resource LLM Framework for Analysing Dyslexic Learners Insights from Online Forums arXiv:2606.27619v1 Announce Type: cross Abstract: Dyslexic learners increasingly use artificial intelligence (AI) tools to support reading, writing, organisation, and study-related tasks. However, their lived experiences with these tools remain largely underexamined. This paper… 23 arXiv — Machine Learning research 1d ago Physics-Guided Robotic Radiation Source Localization along Arbitrary Measurement Paths in Unstructured Environments arXiv:2606.27624v1 Announce Type: cross Abstract: Using robots to estimate the location of the radiation source is an effective way to improve efficiency and safety. Existing methods focus on planning the robot's path to achieve precise estimation, typically approaching the… 19 arXiv — NLP / Computation & Language research 1d ago A Survey of Automated Presentation Coaching: Systems, Methods, and Open Challenges arXiv:2606.27380v1 Announce Type: new Abstract: Automated coaching for oral presentations sits at the intersection of computer-assisted pronunciation training (CAPT), prosody modeling, and speech synthesis, yet no prior work has systematically surveyed and compared existing… 6 arXiv — NLP / Computation & Language research 1d ago Causal Connections: Leveraging Multilingual Fine-Tuning for Financial QA@FinCausal 2026 arXiv:2606.27446v1 Announce Type: new Abstract: This paper describes team HSA_CORAL's submission to the FinCausal 2026 shared task on extracting cause-effect relations from financial narratives via extractive question answering in English and Spanish. 
We compare three modeling… 4 arXiv — NLP / Computation & Language research 1d ago Developmental approach reveals the statistical learning of Neural Language Models: Transformers generalize from the most abstract statistical patterns arXiv:2606.27460v1 Announce Type: new Abstract: In this study, we use a developmental approach to investigate the statistical learning and mental representation of neural language models (NLM). A series of Generative Transformer models are trained on a synthetic grammar. The… 4 arXiv — NLP / Computation & Language research 1d ago The Context-Ready Transformer arXiv:2606.27538v1 Announce Type: new Abstract: We introduce the context-ready transformer, a new recurrent neural network architecture built from a D-layer transformer block that pre-contextualizes each token before it enters the block. During left-to-right generation, a… 26 arXiv — NLP / Computation & Language research 1d ago Ko-WideSearch: A Korean Breadth-Search Benchmark for Exhaustive Set Enumeration by Web Agents arXiv:2606.27595v1 Announce Type: new Abstract: Web-agent benchmarks overwhelmingly measure depth -- pinning one obscure answer behind a chain of constraints -- while breadth, exhaustively enumerating a closed set and filling each item's attributes, is barely evaluated,… 32 arXiv — NLP / Computation & Language research 1d ago Narrative-UFET: Narrative Generation for Ultra-Fine Entity Typing arXiv:2606.27598v1 Announce Type: new Abstract: Ultra-fine entity typing (UFET) assigns highly specific types to entity mentions, but current approaches struggle with types in the long tail. We hypothesize that a key limitation is the reliance on sentence-level context, since… 14 arXiv — NLP / Computation & Language research 1d ago Cross-Platform Chinese Offensive Comment Detection via Dual-Threshold Hard Example Mining arXiv:2606.27629v1 Announce Type: new Abstract: Cross-platform deployment of offensive comment detection for Chinese social media suffers performance degradation. 
The paper proposes a dual-threshold hard mining method to address this. First, the clean-Chinese-base RoBERTa is… 16 arXiv — NLP / Computation & Language research 1d ago Yuvion LLM: An Adversarially-Aware Large Language Model for Content And AI Safety arXiv:2606.27632v1 Announce Type: new Abstract: As large language models are increasingly deployed in real-world systems, safety failures can still lead to harmful outputs and dangerous misuse. We argue that the essence of safety is adversarial: many failures arise not from… 29 arXiv — NLP / Computation & Language research 1d ago When Search Agents Should Ask: DiscoBench for Clarification-Aware Deep Search arXiv:2606.27669v1 Announce Type: new Abstract: Search agents powered by large language models (LLMs) are increasingly used to solve complex information-seeking tasks, requiring multi-step retrieval and reasoning to fulfill user goals. However, existing benchmarks often assume… 27 arXiv — NLP / Computation & Language research 1d ago From Signals to Transfer: A Factorised Study of Probe-Based Uncertainty Estimation in Large Language Models arXiv:2606.27679v1 Announce Type: new Abstract: Probe-based uncertainty estimation (UE) has emerged as a prominent approach to detect hallucinations in Large Language Models (LLMs) by learning uncertainty from internal model signals. Yet, recent methods vary simultaneously… 22 arXiv — NLP / Computation & Language research 1d ago Mitigating LLM-based p-Hacking by Preregistering for the Next LLM arXiv:2606.27687v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used to generate, classify, and annotate data whose outputs feed downstream hypothesis tests. 
However, LLM-based research is easy to p-hack: a researcher can tune the prompts, decoding… 32 arXiv — NLP / Computation & Language research 1d ago Mitigating Position Bias in Transformers via Layer-Specific Positional Embedding Scaling arXiv:2606.27705v1 Announce Type: new Abstract: Large Language Models (LLMs) still struggle with the "lost-in-the-middle" problem, where critical information located in the middle of long-context inputs is often underrepresented or lost. While existing methods attempt to… 4 arXiv — NLP / Computation & Language research 1d ago Low-Agreeableness Persona Conditioning for Safe LLM Fine-Tuning arXiv:2606.27709v1 Announce Type: new Abstract: Recent work has shown that fine-tuning large language models (LLMs) for social warmth degrades factual reliability and increases sycophancy. We investigate a related but distinct failure mode: warmth fine-tuning also weakens… 22 arXiv — NLP / Computation & Language research 1d ago Do Speech Emphasis Models Generalize across Languages and Emotions? arXiv:2606.27717v1 Announce Type: new Abstract: Prosodic emphasis varies across languages, emotions, and speaking styles, yet existing emphasis detection models are largely trained and evaluated on monolingual neutral read speech. We introduce MMEE (Multilingual Multi-Emotion… 12 arXiv — NLP / Computation & Language research 1d ago Enhancing Numerical Prediction in LLMs via Smooth MMD Alignment arXiv:2606.27731v1 Announce Type: new Abstract: Despite their strong general capabilities, large language models (LLMs) often remain unreliable when outputs must be numerically precise. 
A key reason is the training objective: standard cross-entropy treats numeric tokens as… 31 arXiv — NLP / Computation & Language research 1d ago KG2Cypher: Data-Centric Pipeline for Building Enterprise Text-to-Cypher Systems arXiv:2606.27742v1 Announce Type: new Abstract: Enterprise Knowledge Graphs (KGs) are increasingly used for internal search, analytics, and question answering, but building natural-language interfaces for private enterprise graphs remains costly. We present KG2Cypher, a… 14 arXiv — NLP / Computation & Language research 1d ago Output-Space Allocation Costs for Calibration-Guided LLM Compression: An Empirical Study arXiv:2606.27785v1 Announce Type: new Abstract: Training-free compression methods for large language models (LLMs) often use calibration data to guide compression decisions. ROCKET, a recent method combining sparse-dictionary factorization with multi-choice knapsack problem… 30 arXiv — NLP / Computation & Language research 1d ago SHIFT: Gate-Modulated Activation Steering for Knowledge Conflict Mitigation in Retrieval-Augmented Generation arXiv:2606.27786v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) enhances LLMs by incorporating external knowledge to support response generation. 
However, conflicts between retrieved context and parametric knowledge have emerged as a critical challenge in… 16 arXiv — NLP / Computation & Language research 1d ago NLL-Guided Full-Attention Layer Selection for Training-Free Sliding-Window Adaptation arXiv:2606.27791v1 Announce Type: new Abstract: Hybrid attention models that mix full and sliding-window attention across layers offer a promising approach to efficient long-context inference, but the critical question of which layers should retain full attention remains… 19 arXiv — NLP / Computation & Language research 1d ago Position Bias Correction is Insufficient for One-Pass Attention Sorting arXiv:2606.27793v1 Announce Type: new Abstract: Long-context language models suffer from position bias, where information in middle positions is underutilized. Attention Sorting addresses this by iteratively reordering documents based on attention patterns, but its multiple… 9 arXiv — NLP / Computation & Language research 1d ago Learning Complementary Action Modeling from Automotive Maintenance Instructions arXiv:2606.27808v1 Announce Type: new Abstract: A minute lexical variation can reverse the procedural meaning of an instruction even when the rest of the sentence remains unchanged. In automotive maintenance instructions, this pattern often appears when an action phrase turns an… 8 arXiv — NLP / Computation & Language research 1d ago A Study of Temporal Fusion Strategies for Named Entity Recognition in Historical Texts arXiv:2606.27881v1 Announce Type: new Abstract: Temporal variation poses a unique challenge for named entity recognition (NER) in historical texts, where entities drift in surface form and salience across time. 
While language models (LMs) have made progress in various NLP tasks,… 22 arXiv — NLP / Computation & Language research 1d ago Triadic Werewolf: A Jester Role for Multi-Hop Theory of Mind in LLMs arXiv:2606.27909v1 Announce Type: new Abstract: Theory-of-mind evaluations of large language models typically use dyadic social-deduction games, where every observable cue points to a single hidden side, so a model with strong language priors can score well without ever… 15 arXiv — NLP / Computation & Language research 1d ago VASAE: Naming SAE Dictionary Directions with Vocabulary-Aligned Anchoring arXiv:2606.27941v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) provide useful decompositions of Transformer residual streams, but their learned features are usually named post hoc rather than directly connected to the Transformer's token vocabulary. We introduce… 35 arXiv — NLP / Computation & Language research 1d ago An Empirical Analysis of Factual Errors in Human-Written Text and its Application arXiv:2606.27959v1 Announce Type: new Abstract: Factual Error Detection (FED), which is the task of identifying factually incorrect spans in a given text, has long been recognized as an important research problem. However, with the rapid rise of large language models (LLMs),… 21 arXiv — NLP / Computation & Language research 1d ago From Black-Box to Clinical Insight: A Multi-Stage Explainable Framework for Speech-Based Cognitive Impairment Detection arXiv:2606.27973v1 Announce Type: new Abstract: Speech-based cognitive impairment detection offers a noninvasive, accessible alternative to costly biomarker assays, yet transformer-based models remain clinically uninterpretable. We propose a multi-stage explainability framework… 23 arXiv — NLP / Computation & Language research 1d ago ToxiREX: A Dataset on Toxic REasoning in ConteXt arXiv:2606.27981v1 Announce Type: new Abstract: We introduce a new, contextual, multilingual dataset called ToxiREX: Toxic REasoning in ConteXt. 
The dataset consists of threads of Reddit comments and structured characterizations of what the comments imply, following a systematic… 5 arXiv — NLP / Computation & Language research 1d ago Dialogue to Detection: A Multimodal Hybrid NLP Pipeline for Insurance Fraud Detection arXiv:2606.28002v1 Announce Type: new Abstract: Insurance fraud imposes substantial financial losses and operational inefficiencies, raising premiums and impacting trust among legitimate policyholders. Early detection at FNOL remains a persistent challenge. Existing approaches… 25 arXiv — NLP / Computation & Language research 1d ago The Signal-Coverage Matrix: Stratifying Type and Semantic Errors in Statement Autoformalization arXiv:2606.28013v1 Announce Type: new Abstract: Headline type-correctness (TC%) of LLM autoformalization has climbed from ~53% to ~76% in two years, yet this scalar conceals which errors each method resolves. We propose a signal-coverage matrix that crosses the Lean… 23 arXiv — NLP / Computation & Language research 1d ago A Tree-of-Thoughts Inspired Hybrid Approach for Legal Case Judgement Summarization using LLMs arXiv:2606.28044v1 Announce Type: new Abstract: In recent times, Large Language Models (LLMs) are increasingly being used for legal case judgement summarization. Most prior works have tried traditional extractive and abstractive summarization of case judgements. However, hybrid… 34 arXiv — NLP / Computation & Language research 1d ago Can LLMs Judge Better Than They Generate? Evaluating Task Asymmetry, Mechanistic Interpretability and Transferability for In-Context QA arXiv:2606.28050v1 Announce Type: new Abstract: LLM-as-a-Judge and self-evaluation pipelines implicitly assume that evaluation is easier than generation. 
We test this in a controlled in-context QA setting where a context passage is the sole information source and each model… 29 arXiv — NLP / Computation & Language research 1d ago MultiHashFormer: Hash-based Generative Language Models arXiv:2606.28057v1 Announce Type: new Abstract: Language models (LMs) represent tokens using embedding matrices that scale linearly with the vocabulary size. To constrain the parameter footprint, prior work proposes hashing many tokens into a single vector within encoder-only… 15 arXiv — NLP / Computation & Language research 1d ago Mechanism-Driven Monitors for Preemptive Detection of LLM Training Instability arXiv:2606.28116v1 Announce Type: new Abstract: Frontier large language model training consumes massive accelerator fleets and long wall-clock computation, making stability failures costly when they occur. After a numerical or a hyperparameter fault has already destabilized the… 31 arXiv — NLP / Computation & Language research 1d ago From Tokens to States: LLMs as a Special Case of World Models and the Continuous Path Beyond arXiv:2606.28127v1 Announce Type: new Abstract: The AI community has framed the relationship between large language models (LLMs) and world models as a dichotomy: LLMs predict tokens; world models simulate reality. Yann LeCun argues in 2022 that reaching general intelligence… 25 arXiv — NLP / Computation & Language research 1d ago Cognitive Episodes in LLM Reasoning Traces Enable Interpretable Human Item Difficulty Prediction arXiv:2606.28186v1 Announce Type: new Abstract: Predicting human item difficulty is central to educational assessment, where reliable estimates support fairness and effective test construction. 
Existing methods often depend on costly human calibration or item-level textual… 35 arXiv — NLP / Computation & Language research 1d ago Vision-Default, Prior-Override: Causal Mechanisms of Perception-Knowledge Conflict in Vision-Language Models arXiv:2606.28273v1 Announce Type: new Abstract: Vision-language models must reconcile visual evidence with memorized world knowledge when the two conflict. How they resolve this conflict shapes the reliability of multimodal systems, yet prior work characterizes it behaviorally… 31 arXiv — NLP / Computation & Language research 1d ago CalBrief: A Pilot Diagnostic Benchmark for Evidence-Calibrated Scientific Briefing with Large Language Models arXiv:2606.27383v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used as research assistants, yet it remains unclear whether they can calibrate research takeaways to the strength and scope of the supporting evidence. We study evidence-calibrated… 17 arXiv — NLP / Computation & Language research 1d ago Cluster, Route, Escalate: Cascaded Framework for Cost-Aware LLM Serving arXiv:2606.27457v1 Announce Type: cross Abstract: Efficient deployment of large language models (LLMs) in production forces a trade-off between accuracy and cost. 
Operators often default to a single model that is either expensive for easy queries or insufficient for hard ones.… 20 arXiv — NLP / Computation & Language research 1d ago DMV-Bench: Diagnosing Long-Horizon Multimodal Agents' Visual Memory with Incidental Cue Injection arXiv:2606.27499v1 Announce Type: cross Abstract: Research on agent memory has matured rapidly, but almost entirely on the text side: few existing benchmarks ask, in an interactive environment, when an agent genuinely needs to remember what it saw rather than what it could write… 11 arXiv — NLP / Computation & Language research 1d ago Aloe-Vision: Robust Vision-Language Models for Healthcare arXiv:2606.27500v1 Announce Type: cross Abstract: Large Vision-Language Models (LVLMs) specialized in healthcare are emerging as a promising research direction due to their potential impact in clinical and biomedical applications. However, progress is constrained by the scarcity… 28 arXiv — NLP / Computation & Language research 1d ago Joint Transcription and Decryption of Images of Encrypted Handwritten Documents: A Comparison with the Traditional Pipeline arXiv:2606.27700v1 Announce Type: cross Abstract: Historical encrypted manuscripts present a challenging problem at the intersection of cryptology, linguistics, paleography, and computer vision. Current automatic decipherment approaches usually rely on a two-stage pipeline:… 7 arXiv — NLP / Computation & Language research 1d ago Verifiable Geometry Problem Solving: Solver-Driven Autoformalization and Theorem Proposing arXiv:2606.27926v1 Announce Type: cross Abstract: Geometry problem solving has increasingly adopted the neuro-symbolic paradigm, combining neural intuition with symbolic rigor. 
However, current frameworks suffer from severe bottlenecks in two core stages: autoformalization, which… 28 arXiv — NLP / Computation & Language research 1d ago AI Persuasive Framing in Collective Dilemmas arXiv:2606.27951v1 Announce Type: cross Abstract: AI agents are promising tools that can act as flexible behavioral nudges to enhance human cooperation in addressing large-scale societal problems. However, evidence on whether AI agents can effectively boost cooperation remains… 32 arXiv — NLP / Computation & Language research 1d ago DG^VoiC: Speaker Clustering for Fraud Investigation under Real Call-Centre Conditions arXiv:2606.28048v1 Announce Type: cross Abstract: Insurance fraud remains costly and operationally difficult, particularly in call-centre workflows where many customer interactions begin at FNOL. While recent fraud detection methods mainly rely on structured data, text, or… 19 arXiv — NLP / Computation & Language research 1d ago Single and Multi Truth Data Fusion using Large Language Models arXiv:2606.28062v1 Announce Type: cross Abstract: Data fusion, also known as truth discovery, is a data integration problem that aims to determine the correct value or set of values for each attribute of an object when presented with potentially conflicting values from multiple… 27 arXiv — NLP / Computation & Language research 1d ago Scaling limit of the Random Language Model arXiv:2606.28105v1 Announce Type: cross Abstract: We develop a quantitative theory of the Random Language Model (RLM), an ensemble of stochastic context-free grammars, in a scaling limit where the number of hidden symbols $N \to \infty$ while the grammar temperature… 10 arXiv — NLP / Computation & Language research 1d ago HPRO: Hierarchical Progressive Reward Optimization via Preference Extraction for Emotional Text-to-Speech arXiv:2606.28249v1 Announce Type: cross Abstract: Recently, Large Language Model (LLM)-based Text-to-Speech (TTS) models have achieved remarkable naturalness. 
However, the standard Supervised Fine-Tuning paradigm often converges to statistically averaged prosody, limiting… 20 arXiv — NLP / Computation & Language research 1d ago Continual Memorization of Factoids in Language Models arXiv:2411.07175v3 Announce Type: replace Abstract: As new knowledge rapidly accumulates, language models (LMs) with pretrained knowledge quickly become obsolete. A common approach to updating LMs is fine-tuning them directly on new knowledge. However, recent studies have shown… 27 Page 7 of 10 · 500 articles