#paper Tag · Research papers · 224 articles archived under #paper r/MachineLearning community 4h ago Have the "on-hold" durations been getting longer for arXiv submissions? [D] I have a paper that has been "on-hold" for about 2 weeks now. I understand that it might take a little longer now because of the inundation of AI-generated low-effort papers, but my papers have gone from "on-hold" to "submitted" within a couple of days in the past. Wondering if… 13 Google DeepMind official-blog 10h ago GPT-5 paper drops on arXiv — scaling laws revisited OpenAI researchers released a 47-page preprint examining how scaling laws hold up at trillion-parameter regimes, with new evidence for compute-optimal training. 27 2 r/LocalLLaMA community 10h ago AIDC-AI/Ovis2.6-80B-A3B · Hugging Face We introduce Ovis2.6-80B-A3B, the latest advancement in the Ovis series of Multimodal Large Language Models (MLLMs). Building on the strong foundation of Ovis2.5, Ovis2.6 upgrades the LLM backbone to a Mixture-of-Experts (MoE) architecture, delivering superior multimodal… 31 NVIDIA Developer Blog official-blog 12h ago Google DeepMind paper: reinforcement learning at scale New work demonstrates RL fine-tuning at unprecedented scale, with concrete benchmarks on reasoning tasks. 14 Hugging Face Daily Papers research 14h ago Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation Abstract Pion is a spectrum-preserving optimizer for large language model training that uses orthogonal equivalence transformations to maintain singular values during weight updates, offering stable performance comparable to standard optimizers. 
AI-generated summary We introduce… 34 Hacker News — AI on Front Page community 19h ago Deterministic Fully-Static Whole-Binary Translation Without Heuristics Article URL: https://arxiv.org/abs/2605.08419 Comments URL: https://news.ycombinator.com/item?id=48117810 Points: 227 # Comments: 53 27 Hugging Face Daily Papers research 19h ago AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via Decompositional Verifiable Reward Abstract AlphaGRPO enhances multimodal generation by applying Group Relative Policy Optimization to AR-Diffusion Unified Multimodal Models through self-reflective refinement and decompositional verifiable reward mechanisms. AI-generated summary In this paper, we propose… 26 arXiv — Machine Learning research 19h ago Interpretable EEG Microstate Discovery via Variational Deep Embedding: A Systematic Architecture Search with Multi-Quadrant Evaluation arXiv:2605.10947v1 Announce Type: new Abstract: EEG microstate analysis segments continuous brain electrical activity into brief, quasi-stable topographic configurations that reflect discrete functional brain states. Conventional approaches such as Modified K-Means operate… 22 arXiv — Machine Learning research 19h ago QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization arXiv:2605.10959v1 Announce Type: new Abstract: There is currently no unified metric for evaluating the efficiency of quantized neural networks. We propose QuIDE, built around the Intelligence Index I = (C x P)/log_2(T+1), which collapses the compression-accuracy-latency… 22 arXiv — Machine Learning research 19h ago Steering Without Breaking: Mechanistically Informed Interventions for Discrete Diffusion Language Models arXiv:2605.10971v1 Announce Type: new Abstract: Discrete diffusion language models (DLMs) generate text by iteratively denoising all positions in parallel, offering an alternative to autoregressive models. 
Controlled generation methods for DLMs, imported from autoregressive… 4 arXiv — Machine Learning research 19h ago Rotation-Preserving Supervised Fine-Tuning arXiv:2605.10973v1 Announce Type: new Abstract: Supervised fine-tuning (SFT) improves in-domain performance but can degrade out-of-domain (OOD) generalization. Prior work suggests that this degradation is related to changes in dominant singular subspaces of pretrained weight… 22 arXiv — Machine Learning research 19h ago Vertex-Softmax: Tight Transformer Verification via Exact Softmax Optimization arXiv:2605.10974v1 Announce Type: new Abstract: Certified verification of transformer attention requires bounding the softmax function over interval constraints on the pre-softmax scores. Existing verifiers relax softmax independently of the downstream objective, leaving… 26 arXiv — Machine Learning research 19h ago Hierarchical Multi-Scale Graph Neural Networks: Scalable Heterophilous Learning with Oversmoothing and Oversquashing Mitigation arXiv:2605.10975v1 Announce Type: new Abstract: Graphs with heterophily, where adjacent nodes carry different labels, are prevalent in real-world applications, from social networks to molecular interactions. However, existing spectral Graph Neural Network (GNN) approaches… 24 arXiv — Machine Learning research 19h ago LEAP: Unlocking dLLM Parallelism via Lookahead Early-Convergence Token Detection arXiv:2605.10980v1 Announce Type: new Abstract: Diffusion Language Models (dLLMs) have garnered significant attention for their potential in highly parallel processing. 
The parallel capabilities of existing dLLMs stem from the assumption of conditional independence at high… 35 arXiv — Machine Learning research 19h ago $\xi$-DPO: Direct Preference Optimization via Ratio Reward Margin arXiv:2605.10981v1 Announce Type: new Abstract: Reference-free preference optimization has emerged as an efficient alternative to reinforcement learning from human feedback, with Simple Preference Optimization (SimPO) demonstrating strong performance by eliminating the explicit… 23 arXiv — Machine Learning research 19h ago TMPO: Trajectory Matching Policy Optimization for Diverse and Efficient Diffusion Alignment arXiv:2605.10983v1 Announce Type: new Abstract: Reinforcement learning (RL) has shown extraordinary potential in aligning diffusion models to downstream tasks, yet most of them still suffer from significant reward hacking, which degrades generative diversity and quality by… 10 arXiv — Machine Learning research 19h ago Structural Interpretations of Protein Language Model Representations via Differentiable Graph Partitioning arXiv:2605.10985v1 Announce Type: new Abstract: Protein language models such as ESM-2 learn rich residue representations that achieve strong performance on protein function prediction, but their features remain difficult to interpret as structural and evolutionary signals are… 17 arXiv — Machine Learning research 19h ago AESOP: Adversarial Execution-path Selection to Overload Deep Learning Pipelines arXiv:2605.10987v1 Announce Type: new Abstract: Modern machine learning deployments increasingly compose specialized models into dynamic inference pipelines, where upstream components produce intermediate predictions that determine the workload and inputs of downstream… 21 arXiv — Machine Learning research 19h ago Seeing the Needle in the Haystack: Towards Weakly-Supervised Log Instance Anomaly Localization via Counterfactual Perturbation arXiv:2605.10988v1 Announce Type: new Abstract: Log anomaly detection is a critical task for 
system operations and security assurance. However, in networked systems at scale, log data are generated at massive scale while instance-level annotations are prohibitively expensive,… 29 arXiv — Machine Learning research 19h ago SURGE: Surrogate Gradient Adaptation in Binary Neural Networks arXiv:2605.10989v1 Announce Type: new Abstract: The training of Binary Neural Networks (BNNs) is fundamentally based on gradient approximation for non-differentiable binarization operations (e.g., sign function). However, prevailing methods including the Straight-Through… 11 arXiv — Machine Learning research 19h ago Test-Time Personalization: A Diagnostic Framework and Probabilistic Fix for Scaling Failures arXiv:2605.10991v1 Announce Type: new Abstract: Existing approaches to LLM personalization focus on constructing better personalized models or inputs, while treating inference as a single-shot process. In this work, we study Test-Time Personalization (TTP) along an unexplored… 11 arXiv — Machine Learning research 19h ago SkillGen: Verified Inference-Time Agent Skill Synthesis arXiv:2605.10999v1 Announce Type: new Abstract: Skills are a promising way to improve LLM agent capabilities without retraining, while keeping the added procedure reusable and controllable. However, high-quality skills are still largely written by hand. We introduce SkillGen, a… 33 arXiv — Machine Learning research 19h ago Finite Volume-Informed Neural Network Framework for 2D Shallow Water Equations: Rugged Loss Landscapes and the Importance of Data Guidance arXiv:2605.11001v1 Announce Type: new Abstract: Physics-informed neural networks (PINNs) are a simple surrogate-modelling paradigm for partial differential equations, but their standard strong-form residual formulation is ill suited to the shallow water equations (SWE). 
It… 20 arXiv — Machine Learning research 19h ago DisagMoE: Computation-Communication overlapped MoE Training via Disaggregated AF-Pipe Parallelism arXiv:2605.11005v1 Announce Type: new Abstract: Mixture-of-experts (MoE) architectures enable trillion-parameter LLMs with sparsely activated experts. Expert parallelism (EP) is a widely adopted MoE training strategy, but it suffers from severe all-to-all communication… 25 arXiv — Machine Learning research 19h ago RT-Transformer: The Transformer Block as a Spherical State Estimator arXiv:2605.11007v1 Announce Type: new Abstract: We show that the core components of the Transformer block -- attention, residual connections, and normalization -- arise naturally from a single geometric estimation problem. Modeling the latent state as a direction on the… 19 arXiv — Machine Learning research 19h ago When and How to Canonize: A Generalization Perspective arXiv:2605.11008v1 Announce Type: new Abstract: While invariant architectures are standard for processing symmetric data, there is growing interest in achieving invariance by applying group averaging or canonization to non-invariant backbones. However, the theoretical… 12 arXiv — Machine Learning research 19h ago ACSAC: Adaptive Chunk Size Actor-Critic with Causal Transformer Q-Network arXiv:2605.11009v1 Announce Type: new Abstract: Long-horizon, sparse-reward tasks pose a fundamental challenge for reinforcement learning, since single-step TD learning suffers from bootstrapping error accumulation across successive Bellman updates. Actor-critic methods with… 34 arXiv — Machine Learning research 19h ago A Comparative Study of Federated Learning Aggregation Strategies under Homogeneous and Heterogeneous Data Distributions arXiv:2605.11010v1 Announce Type: new Abstract: Federated Learning has emerged as a transformative paradigm for collaborative machine learning across distributed environments. 
However, its performance is strongly influenced by the aggregation strategy used to combine local model… 17 arXiv — Machine Learning research 19h ago LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models arXiv:2605.11011v1 Announce Type: new Abstract: Looped computation shows promise in improving the reasoning-oriented performance of LLMs by scaling test-time compute. However, existing approaches typically require either training recurrent models from scratch or applying… 37 arXiv — Machine Learning research 19h ago Backbone-Equated Diffusion OOD via Sparse Internal Snapshots arXiv:2605.11014v1 Announce Type: new Abstract: Fair comparison between diffusion-based OOD detectors is challenging, as conclusions can vary with backbone choice, corruption parameterization, and test-time budget. We address this issue through a Mutualized Backbone-Equated… 30 arXiv — Machine Learning research 19h ago Simpson's Paradox in Behavioral Curves: How Aggregation Distorts Parametric Models of User Dynamics arXiv:2605.11017v1 Announce Type: new Abstract: Behavioral curve modeling -- fitting parametric functions to engagement-versus-exposure data -- is standard practice in recommendation, advertising, and clinical dosing. We show that aggregation introduces a systematic distortion:… 13 arXiv — Machine Learning research 19h ago Efficient LLM Reasoning via Variational Posterior Guidance with Efficiency Awareness arXiv:2605.11019v1 Announce Type: new Abstract: Although large language models rely on chain-of-thought for complex reasoning, the overthinking phenomenon severely degrades inference efficiency. 
Existing reinforcement learning methods compress reasoning chains by designing… 23 arXiv — Machine Learning research 19h ago Trust Region Inverse Reinforcement Learning: Explicit Dual Ascent using Local Policy Updates arXiv:2605.11020v1 Announce Type: new Abstract: Inverse reinforcement learning (IRL) is typically formulated as maximizing entropy subject to matching the distribution of expert trajectories. Classical (dual-ascent) IRL guarantees monotonic performance improvement but requires… 14 arXiv — Machine Learning research 19h ago A Switching System Theory of Q-Learning with Linear Function Approximation arXiv:2605.11021v1 Announce Type: new Abstract: This paper develops a switching-system interpretation of Q-learning with linear function approximation (LFA) based on the joint spectral radius (JSR). We derive an exact linear switched model for the mean dynamics and relate… 11 arXiv — Machine Learning research 19h ago ASD-Bench: A Four-Axis Comprehensive Benchmark of AI Models for Autism Spectrum Disorder arXiv:2605.11091v1 Announce Type: new Abstract: Automated ASD screening tools remain limited by single-architecture evaluations, axis-restricted assessment, and near-exclusive focus on adult cohorts, obscuring age-specific diagnostic patterns critical for early intervention. We… 4 arXiv — Machine Learning research 19h ago Enabling Performant and Flexible Model-Internal Observability for LLM Inference arXiv:2605.11093v1 Announce Type: new Abstract: Today's inference-time workloads increasingly depend on timely access to a model's internal states. 
We present DMI-Lib, a high-speed deep model inspector that treats internal observability as a first-class systems primitive,… 18 arXiv — Machine Learning research 19h ago Newton's Lantern: A Reinforcement Learning Framework for Finetuning AC Power Flow Warm Start Models arXiv:2605.11102v1 Announce Type: new Abstract: Neural warm starts can sharply reduce the number of Newton-Raphson iterations required to solve the AC power flow problem, but existing supervised approaches generalize poorly on heavily loaded instances near voltage collapse. We… 11 arXiv — Machine Learning research 19h ago GRAFT-ATHENA: Self-Improving Agentic Teams for Autonomous Discovery and Evolutionary Numerical Algorithms arXiv:2605.11117v1 Announce Type: new Abstract: Scientific discovery can be modeled as a sequence of probabilistic decisions that map physical problems to numerical solutions. Recent agentic AI systems automate individual scientific tasks by orchestrating LLM-driven planners,… 22 arXiv — Machine Learning research 19h ago Language Modeling with Hyperspherical Flows arXiv:2605.11125v1 Announce Type: new Abstract: Discrete Diffusion Language Models progressed rapidly as an alternative to autoregressive (AR) models, motivated by their parallel generation abilities. However, for tractability, discrete diffusion models sample from a factorized… 17 arXiv — Machine Learning research 19h ago HEPA: A Self-Supervised Horizon-Conditioned Event Predictive Architecture for Time Series arXiv:2605.11130v1 Announce Type: new Abstract: Critical events in multivariate time series, from turbine failures to cardiac arrhythmias, demand accurate prediction, yet labeled data is scarce because such events are rare and costly to annotate. We introduce HEPA… 16 arXiv — Machine Learning research 19h ago Steerable Neural ODEs on Homogeneous Spaces arXiv:2605.11133v1 Announce Type: new Abstract: We introduce steerable neural ordinary differential equations on homogeneous spaces $M=G/H$. 
These models constitute a novel geometric extension of manifold neural ordinary differential equations (NODEs) that transport associated… 33 arXiv — Machine Learning research 19h ago Spurious Correlation Learning in Preference Optimization: Mechanisms, Consequences, and Mitigation via Tie Training arXiv:2605.11134v1 Announce Type: new Abstract: Preference learning methods such as Direct Preference Optimization (DPO) are known to induce reliance on spurious correlations, leading to sycophancy and length bias in today's language models and potentially severe goal… 13 arXiv — Machine Learning research 19h ago Rank Is Not Capacity: Spectral Occupancy for Latent Graph Models arXiv:2605.11142v1 Announce Type: new Abstract: Graph representation learning has become a standard approach for analyzing networked data, with latent embeddings widely used for link prediction, community detection, and related tasks. Yet a basic design choice, the latent… 36 arXiv — Machine Learning research 19h ago CORE: Cyclic Orthotope Relation Embedding for Knowledge Graph Completion arXiv:2605.11159v1 Announce Type: new Abstract: Knowledge graph completion (KGC) aims to automatically infer missing facts in multi-relational data by mapping entities and relations into continuous representation spaces. Recent region-based embedding models have shown great… 16 arXiv — Machine Learning research 19h ago Interpretability Can Be Actionable arXiv:2605.11161v1 Announce Type: new Abstract: Interpretability aims to explain the behavior of deep neural networks. 
Despite rapid growth, there is mounting concern that much of this work has not translated into practical impact, raising questions about its relevance and… 37 arXiv — Machine Learning research 19h ago COSMOS: Model-Agnostic Personalized Federated Learning with Clustered Server Models and Pseudo-Label-Only Communication arXiv:2605.11165v1 Announce Type: new Abstract: Federated learning (FL) in heterogeneous environments remains challenging because client models often differ in both architecture and data distribution. While recent approaches attempt to address this challenge through client… 36 arXiv — Machine Learning research 19h ago Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data arXiv:2605.11170v1 Announce Type: new Abstract: Noise-based certified machine unlearning currently faces a hard ceiling: the noise magnitude required to certify unlearning typically destroys model utility, particularly for large-scale deletion requests. While leveraging public… 12 arXiv — Machine Learning research 19h ago Optimistic Dual Averaging Unifies Modern Optimizers arXiv:2605.11172v1 Announce Type: new Abstract: We introduce SODA, a generalization of Optimistic Dual Averaging, which provides a common perspective on state-of-the-art optimizers like Muon, Lion, AdEMAMix and NAdam, showing that they can all be viewed as optimistic instances… 31 arXiv — Machine Learning research 19h ago Oversmoothing as Representation Degeneracy in Neural Sheaf Diffusion arXiv:2605.11178v1 Announce Type: new Abstract: Neural Sheaf Diffusion (NSD) generalizes diffusion-based Graph Neural Networks by replacing scalar graph Laplacians with sheaf Laplacians whose learned restriction maps define a task-adapted geometry. 
While the diffusion limit of… 25 arXiv — Machine Learning research 19h ago Muon is Not That Special: Random or Inverted Spectra Work Just as Well arXiv:2605.11181v1 Announce Type: new Abstract: The recent empirical success of the Muon optimizer has renewed interest in non-Euclidean optimization, typically justified by similarities with second-order methods, and linear minimization oracle (LMO) theory. In this paper, we… 8 arXiv — Machine Learning research 19h ago CATS: Cascaded Adaptive Tree Speculation for Memory-Limited LLM Inference Acceleration arXiv:2605.11186v1 Announce Type: new Abstract: Auto-regressive decoding in Large Language Models (LLMs) is inherently memory-bound: every generation step requires loading the model weights and intermediate results from memory (e.g., High-Bandwidth Memory (HBM) for GPU servers),… 19 arXiv — Machine Learning research 19h ago Deep Learning for Protein Complex Prediction and Design arXiv:2605.11189v1 Announce Type: new Abstract: Accurately modeling and designing protein complex structures is a central problem in computational structural biology, with broad implications for understanding cellular function and developing therapeutics. This thesis… 16 arXiv — Machine Learning research 19h ago Variational Linear Attention: Stable Associative Memory for Long-Context Transformers arXiv:2605.11196v1 Announce Type: new Abstract: Linear attention reduces the quadratic cost of softmax attention to $\mathcal{O}(T)$, but its memory state grows as $\mathcal{O}(T)$ in Frobenius norm, causing progressive interference between stored associations. We introduce… 13 arXiv — Machine Learning research 19h ago FeatMap: Understanding image manipulation in the feature space and its implications for feature space geometry arXiv:2605.11203v1 Announce Type: new Abstract: Intermediate feature representations represent the backbone for the expressivity and adaptability of deep neural networks. 
However, their geometric structure remains poorly understood. In this submission, we provide indirect… 20 arXiv — Machine Learning research 19h ago The Scaling Law of Evaluation Failure: Why Simple Averaging Collapses Under Data Sparsity and Item Difficulty Gaps, and How Item Response Theory Recovers Ground Truth Across Domains arXiv:2605.11205v1 Announce Type: new Abstract: Benchmark evaluation across AI and safety-critical domains overwhelmingly relies on simple averaging. We demonstrate that this practice produces substantially misleading rankings when two conditions co-occur: (1) the evaluation… 34 arXiv — Machine Learning research 19h ago Measuring Five-Nines Reliability: Sample-Efficient LLM Evaluation in Saturated Benchmarks arXiv:2605.11209v1 Announce Type: new Abstract: While existing benchmarks demonstrate the near-perfect performance of large language models (LLMs) on various tasks, this apparent saturation often obscures the need for rigorous evaluation of their reliability. In real-world… 36 arXiv — Machine Learning research 19h ago Enforcing Constraints in Generative Sampling via Adaptive Correction Scheduling arXiv:2605.11214v1 Announce Type: new Abstract: Hard constraints in generative sampling are typically enforced by projection, applied either once at the end of sampling or after every update. This binary framing overlooks a fundamental issue: projection changes the distribution… 17 arXiv — Machine Learning research 19h ago Leveraging RAG for Training-Free Alignment of LLMs arXiv:2605.11217v1 Announce Type: new Abstract: Large language model (LLM) alignment algorithms typically consist of post-training over preference pairs. 
While such algorithms are widely used to enable safety guardrails and align LLMs with general human preferences, we show that… 36 arXiv — Machine Learning research 19h ago ADMM-Q: An Improved Hessian-based Weight Quantizer for Post-Training Quantization of Large Language Models arXiv:2605.11222v1 Announce Type: new Abstract: Quantization is an effective strategy to reduce the storage and computation footprint of large language models (LLMs). Post-training quantization (PTQ) is a leading approach for compressing LLMs. Popular weight quantization… 5 arXiv — Machine Learning research 19h ago LiBaGS: Lightweight Boundary Gap Synthesis for Targeted Synthetic Data Selection arXiv:2605.11231v1 Announce Type: new Abstract: Synthetic data is useful only when the added samples fill missing parts of the training distribution that matter for the downstream task. We introduce LiBaGS, a lightweight, generator-agnostic method for targeted synthetic training… 30 arXiv — Machine Learning research 19h ago A Comparative Study of Model Selection Criteria for Symbolic Regression arXiv:2605.11233v1 Announce Type: new Abstract: Effective model selection is critical in symbolic regression (SR) to identify mathematical expressions that balance accuracy and complexity, and have low expected error on unseen data. Many modern implementations of genetic… 38 arXiv — Machine Learning research 19h ago Internalizing Curriculum Judgment for LLM Reinforcement Fine-Tuning arXiv:2605.11235v1 Announce Type: new Abstract: In LLM Reinforcement Fine-Tuning (RFT), curriculum learning drives both efficiency and performance. 
Yet, current methods externalize curriculum judgment via handcrafted heuristics or auxiliary models, risking misalignment with the… 18 arXiv — Machine Learning research 19h ago DeconDTN-Toolkit: A Library for Evaluation and Enhancement of Robustness to Provenance Shift arXiv:2605.11237v1 Announce Type: new Abstract: Despite the burgeoning body of work on distribution shifts, provenance shift, where the relationship between data source and label changes at deployment, remains poorly understood and under-addressed. In this paper, we establish a… 13 arXiv — Machine Learning research 19h ago Extending Kernel Trick to Influence Functions arXiv:2605.11239v1 Announce Type: new Abstract: In this paper, we present a dual representation of the influence functions, whose computational complexity scales with dataset size rather than model size. Both analytically and experimentally, we show that this representation can… 7 arXiv — Machine Learning research 19h ago Support-Proximity Augmented Diffusion Estimation for Offline Black-Box Optimization arXiv:2605.11246v1 Announce Type: new Abstract: Offline black-box optimization aims to discover novel designs with high property scores using only a static dataset, a task fundamentally challenged by the out-of-distribution (OOD) extrapolation problem. 
Existing approaches… 13 arXiv — Machine Learning research 19h ago A Proof-of-Concept Simulation-Driven Digital Twin Framework for Decision-Aware Diabetes Modeling arXiv:2605.11247v1 Announce Type: new Abstract: This paper presents a proof-of-concept digital twin framework for simulation-driven diabetes modeling using benchmark clinical data, synthetic temporal augmentation, and illustrative continuous glucose monitoring (CGM) analysis.… 27 arXiv — Machine Learning research 19h ago Curriculum Learning-Guided Progressive Distillation in Large Language Models arXiv:2605.11260v1 Announce Type: new Abstract: Knowledge distillation is a key technique for transferring the capabilities of large language models (LLMs) into smaller, more efficient student models. Existing distillation approaches often overlook two critical factors: the… 26 arXiv — Machine Learning research 19h ago Latent Chain-of-Thought Improves Structured-Data Transformers arXiv:2605.11262v1 Announce Type: new Abstract: Chain-of-thought and more broadly test-time compute are known to augment the expressive capabilities of language models and have led to major innovations in reasoning. Motivated by this success, this paper explores latent… 24 arXiv — Machine Learning research 19h ago Localization Boosting for Growth Markets: Mitigating Cross-Locale Behavioral Bias in Learning-to-Rank arXiv:2605.11272v1 Announce Type: new Abstract: Adobe Express is expanding internationally, but the US has a disproportionately large content supply and interaction volume. Learning-to-rank (LTR) models trained primarily on behavioral feedback inherit this imbalance: templates… 20 arXiv — Machine Learning research 19h ago Beyond Similarity: Temporal Operator Attention for Time Series Analysis arXiv:2605.11287v1 Announce Type: new Abstract: A persistent paradox in time-series forecasting is that structurally simple MLP and linear models often outperform high-capacity Transformers. 
We argue that this gap arises from a mismatch in the sequence-modeling primitive: while… 18 arXiv — Machine Learning research 19h ago Quotient-Categorical Representations for Bellman-Compatible Average-Reward Distributional Reinforcement Learning arXiv:2605.11289v1 Announce Type: new Abstract: Average-reward reinforcement learning requires estimating the gain and the bias, which is defined only up to an additive constant. This makes direct distributional analogues ill-posed on the real line. We introduce a quotient-space… 27 arXiv — Machine Learning research 19h ago Optimal Representations for Generalized Contrastive Learning with Imbalanced Datasets arXiv:2605.11291v1 Announce Type: new Abstract: In this paper, we provide a computable characterization of the geometry of optimal representations in Contrastive Learning (CL) when the classes are imbalanced. When classes are balanced and the representation dimension is greater… 27 arXiv — Machine Learning research 19h ago Primal Generation, Dual Judgment: Self-Training from Test-Time Scaling arXiv:2605.11299v1 Announce Type: new Abstract: Code generation is typically trained in the primal space of programs: a model produces a candidate solution and receives sparse execution feedback, often a single pass/fail bit. Test-time scaling enriches the inference procedure by… 32 arXiv — Machine Learning research 19h ago A Theory of Time-Sensitive Language Generation: Sparse Hallucination Beats Mode Collapse arXiv:2605.11302v1 Announce Type: new Abstract: We study language generation in the limit under a global preference ordering on strings, as introduced by Kleinberg and Wei. As in [arXiv:2504.14370, arXiv:2511.05295], we aim for \emph{breadth}, but impose an additional… 20 arXiv — Machine Learning research 19h ago Couple to Control: Joint Initial Noise Design in Diffusion Models arXiv:2605.11311v1 Announce Type: new Abstract: Diffusion models typically generate image batches from independent Gaussian initial noises. 
We argue that this independence assumption is only one choice within a broader class of valid joint noise designs. Instead, one can specify… 11 arXiv — Machine Learning research 19h ago Error whitening: Why Gauss-Newton outperforms Newton arXiv:2605.11316v1 Announce Type: new Abstract: The Gauss-Newton matrix is widely viewed as a positive semidefinite approximation of the Hessian, yet mounting empirical evidence shows that Gauss-Newton descent outperforms Newton's method. We adopt a function space perspective to… 5 arXiv — Machine Learning research 19h ago $\varepsilon$-Good Action Identification in Fixed-Budget Monte Carlo Tree Search arXiv:2605.11324v1 Announce Type: new Abstract: We study the fixed-budget max-min action identification problem in depth-2 max-min trees, an important special case of Monte Carlo Tree Search. A learner sequentially allocates $T$ samples to leaves and then recommends a subtree… 17 arXiv — Machine Learning research 19h ago Neural Statistical Functions arXiv:2605.11327v1 Announce Type: new Abstract: Classical deep learning typically operates on individual cases. Despite its success, real-world usage often requires repeated inference to estimate statistical quantities for complex decision-making tasks involving uncertainty or… 24 arXiv — Machine Learning research 19h ago Epistemic Uncertainty for Test-Time Discovery arXiv:2605.11328v1 Announce Type: new Abstract: Automated scientific discovery using large language models relies on identifying genuinely novel solutions. 
Standard reinforcement learning penalizes high-variance mutations, which leads the policy to prioritize familiar patterns.… 31 arXiv — Machine Learning research 19h ago Physics-Informed Teacher-Student Ensemble Learning for Traffic State Estimation with a Varying Speed Limit Scenario arXiv:2605.11346v1 Announce Type: new Abstract: Physics-informed deep learning (PIDL) neural networks have shown their capability as a useful instrument for transportation practitioners in utilizing the underlying relationship between the state variables for traffic state… 11 arXiv — Machine Learning research 19h ago Gradient-Free Noise Optimization for Reward Alignment in Generative Models arXiv:2605.11347v1 Announce Type: new Abstract: Existing reward alignment methods for diffusion and flow models rely on multi-step stochastic trajectories, making them difficult to extend to deterministic generators. A natural alternative is noise-space optimization, but… 38 arXiv — Machine Learning research 19h ago gym-invmgmt: An Open Benchmarking Framework for Inventory Management Methods arXiv:2605.11355v1 Announce Type: new Abstract: Inventory-policy comparisons are often difficult to interpret because performance depends on the evaluation contract as much as on the policy itself. Differences in topology, demand regime, information access, feasibility… 32 arXiv — Machine Learning research 19h ago The tractability landscape of diffusion alignment: regularization, rewards, and computational primitives arXiv:2605.11361v1 Announce Type: new Abstract: Inference-time reward alignment asks how to turn a pre-trained diffusion model with base law $p$ into a sampler that favors a reward $r$ while remaining close to $p$. 
Since there is no canonical distributional distance for this… 27 arXiv — Machine Learning research 19h ago Causal Fairness for Survival Analysis arXiv:2605.11362v1 Announce Type: new Abstract: In the data-driven era, large-scale datasets are routinely collected and analyzed using machine learning (ML) and artificial intelligence (AI) to inform decisions in high-stakes domains such as healthcare, employment, and criminal… 31 arXiv — Machine Learning research 19h ago LPDP: Inference-Time Reward Control for Variable-Length DNA Generation with Edit Flows arXiv:2605.11368v1 Announce Type: new Abstract: We study the application of recent Edit Flows for inference-time reward control for DNA sequence generation. Unlike most reward-guided DNA generation frameworks, which operate on fixed-length sequence spaces, Edit Flows have a… 6 arXiv — Machine Learning research 19h ago TRACE: Temporal Routing with Autoregressive Cross-channel Experts for EEG Representation Learning arXiv:2605.11380v1 Announce Type: new Abstract: Learning transferable representations for electroencephalography (EEG) remains challenging because EEG signals are inherently multi-channel and non-stationary. Channels observed at the same time provide coupled measurements of… 25 arXiv — Machine Learning research 19h ago Behavioral Mode Discovery for Fine-tuning Multimodal Generative Policies arXiv:2605.11387v1 Announce Type: new Abstract: We address the problem of fine-tuning pre-trained generative policies with reinforcement learning (RL) while preserving the multimodality of their action distributions. 
Existing methods for RL fine-tuning of generative policies… 17 arXiv — Machine Learning research 19h ago MuonQ: Enhancing Low-Bit Muon Quantization via Directional Fidelity Optimization arXiv:2605.11396v1 Announce Type: new Abstract: The Muon optimizer has emerged as a compelling alternative to Adam for training large language models, achieving remarkable computational savings through gradient orthogonalization. However, Muon's optimizer state is more sensitive… 21 arXiv — Machine Learning research 19h ago More Than Meets the Eye: A Semantics-Aware Traffic Augmentation Framework for Generalizable Website Fingerprinting arXiv:2605.11402v1 Announce Type: new Abstract: Deep learning-based website fingerprinting has emerged as an effective technique for inferring the websites users visit. Although existing methods achieve strong performance on closed-world datasets, they often fail to generalize… 23 arXiv — Machine Learning research 19h ago 20/20 Vision Language Models: A Prescription for Better VLMs through Data Curation Alone arXiv:2605.11405v1 Announce Type: new Abstract: Data curation has shifted the quality-compute frontier for language-model and contrastive image-text pretraining, but its role for vision-language models (VLMs) is far less established. 
We ask how far data curation alone can take… 33 arXiv — Machine Learning research 19h ago A Boundary-Aware Non-parametric Granular-Ball Classifier Based on Minimum Description Length arXiv:2605.11406v1 Announce Type: new Abstract: Existing granular-ball classification methods are often driven by handcrafted quality measures, neighborhood rules, or heuristic splitting and stopping criteria, which may reduce the transparency of local construction decisions and… 6 arXiv — Machine Learning research 19h ago Generative Diffusion Prior Distillation for Long-Context Knowledge Transfer arXiv:2605.11414v1 Announce Type: new Abstract: While traditional time-series classifiers assume full sequences at inference, practical constraints (latency and cost) often limit inputs to partial prefixes. The absence of class-discriminative patterns in partial data can… 29 arXiv — Machine Learning research 19h ago FastUMAP: Scalable Dimensionality Reduction via Bipartite Landmark Sampling arXiv:2605.11428v1 Announce Type: new Abstract: Exploratory analysis of high-dimensional data rarely stops at a single embedding. In practice, analysts rerun dimensionality reduction after changing preprocessing, subsets, or hyperparameters, and standard nonlinear methods can… 26 arXiv — Machine Learning research 19h ago Deep Minds and Shallow Probes arXiv:2605.11448v1 Announce Type: new Abstract: Neural representations are not unique objects. Even when two systems realize the same downstream computation, their hidden coordinates may differ by reparameterization. A probe family intended to reveal structure already present in… 18 arXiv — Machine Learning research 19h ago Beyond Prediction: Interval Neural Networks for Uncertainty-Aware System Identification arXiv:2605.11460v1 Announce Type: new Abstract: System identification (SysID) is critical for modeling dynamical systems from experimental data, yet traditional approaches often fail to capture nonlinear behaviors. 
While deep learning offers powerful tools for modeling such… 20 arXiv — Machine Learning research 19h ago Drop the Act: Probe-Filtered RL for Faithful Chain-of-Thought Reasoning arXiv:2605.11467v1 Announce Type: new Abstract: Reasoning models post-hoc rationalize answers they have already committed to internally, producing chains of *reasoning theater*: deliberative-looking steps that contribute nothing to correctness. This wastes inference tokens,… 7 arXiv — Machine Learning research 19h ago Robust Multi-Agent Path Finding under Observation Attacks: A Principled Adversarial-Plus-Smoothing Training Recipe arXiv:2605.11469v1 Announce Type: new Abstract: Decentralized multi-agent path finding (MAPF) routes a team of agents on a shared grid, each acting from its own local view. The standard solution trains one shared neural policy with Proximal Policy Optimization (PPO), a popular… 20 arXiv — Machine Learning research 19h ago On the Approximation Complexity of Matrix Product Operator Born Machines arXiv:2605.11471v1 Announce Type: new Abstract: Matrix product operator Born machines (MPO-BMs) are tractable tensor-network models for probabilistic modeling, but their efficient approximation capability remains unclear. We characterize this boundary from both negative and… 35 arXiv — Machine Learning research 19h ago Efficient Adjoint Matching for Fine-tuning Diffusion Models arXiv:2605.11480v1 Announce Type: new Abstract: Reward fine-tuning has become a common approach for aligning pretrained diffusion and flow models with human preferences in text-to-image generation. Among reward-gradient-based methods, Adjoint Matching (AM) provides a principled… 30 arXiv — Machine Learning research 19h ago Adaptive Calibration in Non-Stationary Environments arXiv:2605.11490v1 Announce Type: new Abstract: Making calibrated online predictions is a central challenge in modern AI systems. 
Much of the existing literature focuses on fully adversarial environments where outcomes may be arbitrary, leading to conservative algorithms that… 9 arXiv — Machine Learning research 19h ago Understanding and Preventing Entropy Collapse in RLVR with On-Policy Entropy Flow Optimization arXiv:2605.11491v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has become an effective paradigm for improving the reasoning ability of large language models. However, widely used RLVR algorithms, such as GRPO, often suffer from entropy… 12 arXiv — Machine Learning research 19h ago CTFusion: A CTF-based Benchmark for LLM Agent Evaluation arXiv:2605.11504v1 Announce Type: new Abstract: Recent advances in Large Language Models (LLMs) have enabled agentic systems for complex, multi-step tasks; cybersecurity is emerging as a prominent application. To evaluate such agents, researchers widely adopt Capture The Flag… 23 arXiv — Machine Learning research 19h ago EqOD: Symmetry-Informed Stability Selection for PDE Identification arXiv:2605.11524v1 Announce Type: new Abstract: Data-driven identification of partial differential equations (PDEs) relies on sparse regression over a candidate library of differential operators, where larger libraries inflate false positives under observation noise and smaller… 26 arXiv — NLP / Computation & Language research 19h ago Sampling More, Getting Less: Calibration is the Diversity Bottleneck in LLMs arXiv:2605.11128v1 Announce Type: new Abstract: Diversity is essential for language-model applications ranging from creative generation to scientific discovery, yet modern LLMs often collapse into a narrow subset of plausible outputs. 
While prior work has developed benchmarks… 11 arXiv — NLP / Computation & Language research 19h ago ClinicalBench: Stress-Testing Assertion-Aware Retrieval for Cross-Admission Clinical QA on MIMIC-IV arXiv:2605.11143v1 Announce Type: new Abstract: Reasoning benchmarks measure clinical performance on clean inputs. We evaluate the step before reasoning: retrieval over real EHR notes, where negation, temporality, and family-versus-patient attribution can flip a correct answer… 27 arXiv — NLP / Computation & Language research 19h ago Decomposing Evolutionary Mixture-of-LoRA Architectures: The Routing Lever, the Lifecycle Penalty, and a Substrate-Conditional Boundary arXiv:2605.11153v1 Announce Type: new Abstract: We decompose an evolutionary mixture-of-LoRA system on a from-scratch ~150M-parameter widened-D substrate (D=1536, V=32000; D/V ≈ 0.048; the "widened-1536" substrate) into three factors -- a router rewrite (parallel sigmoid… 19 arXiv — NLP / Computation & Language research 19h ago The Bicameral Model: Bidirectional Hidden-State Coupling Between Parallel Language Models arXiv:2605.11167v1 Announce Type: new Abstract: Existing multi-model and tool-augmented systems communicate by generating text, serializing every exchange through the output vocabulary. Can two pretrained language models instead coordinate through a continuous, concurrent… 16 arXiv — NLP / Computation & Language research 19h ago How Does Differential Privacy Affect Social Bias in LLMs? A Systematic Evaluation arXiv:2605.11195v1 Announce Type: new Abstract: Large language models (LLMs) trained on web-scale corpora can memorize sensitive training data, posing significant privacy risks.
Differential privacy (DP) has emerged as a principled framework that limits the influence of… 32 arXiv — NLP / Computation & Language research 19h ago Instructions shape Production of Language, not Processing arXiv:2605.11206v1 Announce Type: new Abstract: Instructions trigger a production-centered mechanism in language models. Through a cognitively inspired lens that separates language processing and production, we reveal this mechanism as an asymmetry between the two stages by… 14 arXiv — NLP / Computation & Language research 19h ago ReVision: Scaling Computer-Use Agents via Temporal Visual Redundancy Reduction arXiv:2605.11212v1 Announce Type: new Abstract: Computer-use agents (CUAs) rely on visual observations of graphical user interfaces, where each screenshot is encoded into a large number of visual tokens. As interaction trajectories grow, the token cost increases rapidly,… 11 arXiv — NLP / Computation & Language research 19h ago RETUYT-INCO at BEA 2026 Shared Task 2: Meta-prompting in Rubric-based Scoring for German arXiv:2605.11242v1 Announce Type: new Abstract: In this paper, we present the RETUYT-INCO participation at the BEA 2026 shared task "Rubric-based Short Answer Scoring for German". Our team participated in track 1 (Unseen answers three-way), track 3 (Unseen answers two-way) and… 26 arXiv — NLP / Computation & Language research 19h ago HEBATRON: A Hebrew-Specialized Open-Weight Mixture-of-Experts Language Model arXiv:2605.11255v1 Announce Type: new Abstract: We present Hebatron, a Hebrew-specialized open-weight large language model built on the NVIDIA Nemotron-3 sparse Mixture-of-Experts architecture.
Training employs a three-phase easy-to-hard curriculum with continuous… 11 arXiv — NLP / Computation & Language research 19h ago ReAD: Reinforcement-Guided Capability Distillation for Large Language Models arXiv:2605.11290v1 Announce Type: new Abstract: Capability distillation applies knowledge distillation to selected model capabilities, aiming to compress a large language model (LLM) into a smaller one while preserving the abilities needed for a downstream task. However, most… 27 arXiv — NLP / Computation & Language research 19h ago Predicting Psychological Well-Being from Spontaneous Speech using LLMs arXiv:2605.11303v1 Announce Type: new Abstract: We investigate the use of Large Language Models (LLMs) for zero-shot prediction of Ryff Psychological Well-Being (PWB) scores from spontaneous speech. Using a few minutes of voice recordings from 111 participants in the PsyVoiD… 7 arXiv — NLP / Computation & Language research 19h ago SOMA: Efficient Multi-turn LLM Serving via Small Language Model arXiv:2605.11317v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly deployed in multi-turn dialogue settings where preserving conversational context across turns is essential. A standard serving practice concatenates the full dialogue history at every… 33 arXiv — NLP / Computation & Language research 19h ago Large Language Models for Causal Relations Extraction in Social Media: A Validation Framework for Disaster Intelligence arXiv:2605.11348v1 Announce Type: new Abstract: During disasters, extracting causal relations from social media can strengthen situational awareness by identifying factors linked to casualties, physical damage, infrastructure disruption, and cascading impacts. 
However,… 17 arXiv — NLP / Computation & Language research 19h ago An Empirical Study of Automating Agent Evaluation arXiv:2605.11378v1 Announce Type: new Abstract: Agent evaluation requires assessing complex multi-step behaviors involving tool use and intermediate reasoning, making it costly and expertise-intensive. A natural question arises: can frontier coding assistants reliably automate… 5 arXiv — NLP / Computation & Language research 19h ago Deep Reasoning in General Purpose Agents via Structured Meta-Cognition arXiv:2605.11388v1 Announce Type: new Abstract: Humans intuitively solve complex problems by flexibly shifting among reasoning modes: they plan, execute, revise intermediate goals, resolve ambiguity through associative judgment, and apply formal procedures to well-specified… 5 arXiv — NLP / Computation & Language research 19h ago Freeze Deep, Train Shallow: Interpretable Layer Allocation for Continued Pre-Training arXiv:2605.11416v1 Announce Type: new Abstract: Selective layer-wise updates are essential for low-cost continued pre-training of Large Language Models (LLMs), yet determining which layers to freeze or train remains an empirical black-box problem due to the lack of interpretable… 28 arXiv — NLP / Computation & Language research 19h ago Agent-BRACE: Decoupling Beliefs from Actions in Long-Horizon Tasks via Verbalized State Uncertainty arXiv:2605.11436v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly deployed on long-horizon tasks in partially observable environments, where they must act while inferring and tracking a complex environment state over many steps. 
This leads to two… 38 arXiv — NLP / Computation & Language research 19h ago StoicLLM: Preference Optimization for Philosophical Alignment in Small Language Models arXiv:2605.11483v1 Announce Type: new Abstract: While large language models excel at factual adaptation, their ability to internalize nuanced philosophical frameworks under severe data constraints remains underexplored. We investigate this by specializing small LLMs on… 13 arXiv — NLP / Computation & Language research 19h ago Robust Biomedical Publication Type and Study Design Classification with Knowledge-Guided Perturbations arXiv:2605.11502v1 Announce Type: new Abstract: Accurately and consistently indexing biomedical literature by publication type and study design is essential for supporting evidence synthesis and knowledge discovery. Prior work on automated publication type and study design… 23 arXiv — NLP / Computation & Language research 19h ago A Study on Hidden Layer Distillation for Large Language Model Pre-Training arXiv:2605.11513v1 Announce Type: new Abstract: Knowledge Distillation (KD) is a critical tool for training Large Language Models (LLMs), yet the majority of research focuses on approaches that rely solely on output logits, neglecting semantic information in the teacher's… 25 arXiv — NLP / Computation & Language research 19h ago Checkup2Action: A Multimodal Clinical Check-up Report Dataset for Patient-Oriented Action Card Generation arXiv:2605.11533v1 Announce Type: new Abstract: Clinical check-up reports are multimodal documents that combine page layouts, tables, numerical biomarkers, abnormality flags, imaging findings, and domain-specific terminology. 
Such heterogeneous evidence is difficult for… 30 arXiv — NLP / Computation & Language research 19h ago Taming Extreme Tokens: Covariance-Aware GRPO with Gaussian-Kernel Advantage Reweighting arXiv:2605.11538v1 Announce Type: new Abstract: Group Relative Policy Optimization (GRPO) has emerged as a promising approach for improving the reasoning capabilities of large language models. However, it struggles to effectively balance the tradeoff between exploration and… 23 arXiv — NLP / Computation & Language research 19h ago Three Regimes of Context-Parametric Conflict: A Predictive Framework and Empirical Validation arXiv:2605.11574v1 Announce Type: new Abstract: The literature on how large language models handle conflict between their training knowledge and a contradicting document presents a persistent empirical contradiction: some studies find models stubbornly retain their trained… 35 arXiv — NLP / Computation & Language research 19h ago BitLM: Unlocking Multi-Token Language Generation with Bitwise Continuous Diffusion arXiv:2605.11577v1 Announce Type: new Abstract: Autoregressive language models generate text one token at a time, yet natural language is inherently structured in multi-token units, including phrases, n-grams, and collocations that carry meaning jointly. This one-token… 27 arXiv — NLP / Computation & Language research 19h ago Ada-MK: Adaptive MegaKernel Optimization via Automated DAG-based Search for LLM Inference arXiv:2605.11581v1 Announce Type: new Abstract: When large language models (LLMs) serve real-time inference in commercial online advertising systems, end-to-end latency must be strictly bounded to the millisecond range. 
Yet every token generated during the decode phase triggers… 32 arXiv — NLP / Computation & Language research 19h ago Efficient LLM-based Advertising via Model Compression and Parallel Verification arXiv:2605.11582v1 Announce Type: new Abstract: Large language models (LLMs) have shown remarkable potential in advertising scenarios such as ad creative generation and targeted advertising. However, deploying LLMs in real-time advertising systems poses significant challenges… 19 arXiv — NLP / Computation & Language research 19h ago DiffScore: Text Evaluation Beyond Autoregressive Likelihood arXiv:2605.11601v1 Announce Type: new Abstract: Autoregressive language models are widely used for text evaluation; however, their left-to-right factorization introduces positional bias, i.e., early tokens are scored with only leftward context, conflating architectural asymmetry… 38 arXiv — NLP / Computation & Language research 19h ago PRISM: A Geometric Risk Bound that Decomposes Drift into Scale, Shape, and Head arXiv:2605.11608v1 Announce Type: new Abstract: Comparing post-training LLM variants, such as quantized, LoRA-adapted, and distilled models, requires a diagnostic that identifies how a variant has drifted, not only whether it has degraded. Existing similarity scores such as CKA… 31 arXiv — NLP / Computation & Language research 19h ago When Emotion Becomes Trigger: Emotion-style dynamic Backdoor Attack Parasitising Large Language Models arXiv:2605.11612v1 Announce Type: new Abstract: Backdoor vulnerabilities widely exist in the fine-tuning of large language models (LLMs). Most backdoor poisoning methods operate mainly at the token level and lack deeper semantic manipulation, which limits stealthiness.
In… 25 arXiv — NLP / Computation & Language research 19h ago OmniThoughtVis: A Scalable Distillation Pipeline for Deployable Multimodal Reasoning Models arXiv:2605.11629v1 Announce Type: new Abstract: Recent multimodal large language models (MLLMs) have shown strong chain-of-thought (CoT) reasoning ability on vision-language tasks, but their direct deployment in real-world systems is often limited by latency and resource… 38 arXiv — NLP / Computation & Language research 19h ago Enhancing Multilingual Counterfactual Generation through Alignment-as-Preference Optimization arXiv:2605.11632v1 Announce Type: new Abstract: Self-generated counterfactual explanations (SCEs) are minimally modified inputs (minimality) generated by large language models (LLMs) that flip their own predictions (validity), offering a causally grounded approach to unraveling… 37 arXiv — NLP / Computation & Language research 19h ago Human-Grounded Multimodal Benchmark with 900K-Scale Aggregated Student Response Distributions from Japan's National Assessment of Academic Ability arXiv:2605.11663v1 Announce Type: new Abstract: Authentic school examinations provide a high-validity test bed for evaluating multimodal large language models (MLLMs), yet benchmarks grounded in Japanese K-12 assessments remain scarce. We present a multimodal dataset constructed… 13 arXiv — NLP / Computation & Language research 19h ago Robust LLM Unlearning Against Relearning Attacks: The Minor Components in Representations Matter arXiv:2605.11685v1 Announce Type: new Abstract: Large language model (LLM) unlearning aims to remove specific data influences from a pre-trained model without costly retraining, addressing privacy, copyright, and safety concerns.
However, recent studies reveal a critical… 17 arXiv — NLP / Computation & Language research 19h ago Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation arXiv:2605.11739v1 Announce Type: new Abstract: On-policy distillation (OPD) has emerged as an efficient post-training paradigm for large language models. However, existing studies largely attribute this advantage to denser and more stable supervision, while the parameter-level… 6 arXiv — NLP / Computation & Language research 19h ago Training-Inference Consistent Segmented Execution for Long-Context LLMs arXiv:2605.11744v1 Announce Type: new Abstract: Transformer-based large language models face severe scalability challenges in long-context generation due to the computational and memory costs of full-context attention. Under practical computation and memory constraints, many… 7 arXiv — NLP / Computation & Language research 19h ago Safety-Oriented Evaluation of Language Understanding Systems for Air Traffic Control arXiv:2605.11769v1 Announce Type: new Abstract: Air Traffic Control (ATC) is a safety-critical domain in which incorrect interpretation of instructions may lead to severe operational consequences. While large language models (LLMs) demonstrate strong general performance, their… 7 arXiv — NLP / Computation & Language research 19h ago From Token to Token Pair: Efficient Prompt Compression for Large Language Models in Clinical Prediction arXiv:2605.11774v1 Announce Type: new Abstract: By processing electronic health records (EHRs) as natural language sequences, large language models (LLMs) have shown potential in clinical prediction tasks such as mortality prediction and phenotyping. However, longitudinal or… 13 arXiv — NLP / Computation & Language research 19h ago Choosing features for classifying multiword expressions arXiv:2605.11779v1 Announce Type: new Abstract: Multiword expressions (MWEs) are a heterogeneous set with a glaring need for classifications. 
Designing a satisfactory classification involves choosing features. In the case of MWEs, many features are a priori available. Not all… 21 arXiv — NLP / Computation & Language research 19h ago Probabilistic Calibration Is a Trainable Capability in Language Models arXiv:2605.11845v1 Announce Type: new Abstract: Language models are increasingly used in settings where outputs must satisfy user-specified randomness constraints, yet their generation probabilities are often poorly calibrated to those targets. We study whether this capability… 17 arXiv — NLP / Computation & Language research 19h ago Self-Distilled Trajectory-Aware Boltzmann Modeling: Bridging the Training-Inference Discrepancy in Diffusion Language Models arXiv:2605.11854v1 Announce Type: new Abstract: Diffusion Language Models (DLMs) have recently emerged as a promising alternative to autoregressive language models, offering stronger global awareness and highly parallel generation. However, post-training DLMs with standard… 18 arXiv — NLP / Computation & Language research 19h ago Concordance Comparison as a Means of Assembling Local Grammars arXiv:2605.11862v1 Announce Type: new Abstract: Named Entity Recognition for person names is an important but non-trivial task in information extraction. This article uses a tool that compares the concordances obtained from two local grammars (LG) and highlights the differences.… 12 arXiv — NLP / Computation & Language research 19h ago Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models arXiv:2605.11887v1 Announce Type: new Abstract: Large language models have achieved remarkable capabilities across diverse tasks, yet their internal decision-making processes remain largely opaque, limiting our ability to inspect, control, and systematically improve them. 
This… 22 arXiv — NLP / Computation & Language research 19h ago YFPO: A Preliminary Study of Yoked Feature Preference Optimization with Neuron-Guided Rewards for Mathematical Reasoning arXiv:2605.11906v1 Announce Type: new Abstract: Preference optimization has become an important post-training paradigm for improving the reasoning abilities of large language models. Existing methods typically rely on externally constructed preference data, using preferred and… 31 arXiv — NLP / Computation & Language research 19h ago Enhancing Target-Guided Proactive Dialogue Systems via Conversational Scenario Modeling and Intent-Keyword Bridging arXiv:2605.11964v1 Announce Type: new Abstract: A target-guided proactive dialogue system aims to steer conversations proactively toward pre-defined targets, such as designated keywords or specific topics. During guided conversations, dynamically modeling conversational… 37 arXiv — NLP / Computation & Language research 19h ago On Predicting the Post-training Potential of Pre-trained LLMs arXiv:2605.11978v1 Announce Type: new Abstract: The performance of Large Language Models (LLMs) on downstream tasks is fundamentally constrained by the capabilities acquired during pre-training. 
However, traditional benchmarks like MMLU often fail to reflect a base model's… 11 arXiv — NLP / Computation & Language research 19h ago Towards Visually-Guided Movie Subtitle Translation for Indic Languages arXiv:2605.11993v1 Announce Type: new Abstract: Movie subtitle translation is inherently multimodal, yet text-only systems often miss visual cues needed to convey emotion, action, and social nuance, especially for low-resource Indic languages (English to Hindi, Bengali, Telugu,… 13 arXiv — NLP / Computation & Language research 19h ago Learning Agentic Policy from Action Guidance arXiv:2605.12004v1 Announce Type: new Abstract: Agentic reinforcement learning (RL) for Large Language Models (LLMs) critically depends on the exploration capability of the base policy, as training signals emerge only within its in-capability region. For tasks where the base… 12 arXiv — NLP / Computation & Language research 19h ago SAGE: Scalable Automated Robustness Augmentation for LLM Knowledge Evaluation arXiv:2605.12022v1 Announce Type: new Abstract: Large Language Models (LLMs) achieve strong performance on standard knowledge evaluation benchmarks, yet recent work shows that their knowledge capabilities remain brittle under question variants that test the same knowledge in… 26 arXiv — NLP / Computation & Language research 19h ago Caraman at SemEval-2026 Task 8: Three-Stage Multi-Turn Retrieval with Query Rewriting, Hybrid Search, and Cross-Encoder Reranking arXiv:2605.12028v1 Announce Type: new Abstract: We describe our system for SemEval-2026 Task 8 (MTRAGEval), participating in Task A (Retrieval) across four English-language domains. 
Our approach employs a three-stage pipeline: (1) query rewriting via a LoRA-fine-tuned Qwen 2.5… 30 arXiv — NLP / Computation & Language research 19h ago SkillGraph: Skill-Augmented Reinforcement Learning for Agents via Evolving Skill Graphs arXiv:2605.12039v1 Announce Type: new Abstract: Skill libraries enable large language model agents to reuse experience from past interactions, but most existing libraries store skills as isolated entries and retrieve them only by semantic similarity. This leads to two key… 11 arXiv — NLP / Computation & Language research 19h ago Is Child-Directed Language Optimized for Word Learning? A Computational Study of Verb Meaning Acquisition arXiv:2605.12047v1 Announce Type: new Abstract: Is child-directed language (CDL) optimized to support language learning, and which aspects of linguistic development does it facilitate? We investigate this question using neural language models trained on CDL versus adult-directed… 6 arXiv — NLP / Computation & Language research 19h ago Do Language Models Encode Knowledge of Linguistic Constraint Violations? arXiv:2605.12055v1 Announce Type: new Abstract: Large Language Models (LLMs) achieve strong linguistic performance, yet their internal mechanisms for producing these predictions remain unclear. We investigate the hypothesis that LLMs encode representations of linguistic… 31 arXiv — NLP / Computation & Language research 19h ago Sign Language Recognition and Translation for Low-Resource Languages: Challenges and Pathways Forward arXiv:2605.12096v1 Announce Type: new Abstract: Sign languages are natural, visual-gestural languages used by Deaf communities worldwide. 
Over 300 distinct sign languages remain severely low-resource due to limited documentation, sparse datasets, and insufficient computational… 27 arXiv — NLP / Computation & Language research 19h ago Metaphor Is Not All Attention Needs arXiv:2605.12128v1 Announce Type: new Abstract: Large language models are increasingly deployed in safety-critical applications, where their ability to resist harmful instructions is essential. Although post-training aims to make models robust against many jailbreak strategies,… 20 arXiv — NLP / Computation & Language research 19h ago Latent Causal Void: Explicit Missing-Context Reconstruction for Misinformation Detection arXiv:2605.12156v1 Announce Type: new Abstract: Automatic misinformation detection performs well when deception is visible in what an article explicitly states. However, some misinformation articles remain locally coherent and only become misleading once compared with… 27 arXiv — NLP / Computation & Language research 19h ago Correcting Selection Bias in Sparse User Feedback for Large Language Model Quality Estimation: A Multi-Agent Hierarchical Bayesian Approach arXiv:2605.12177v1 Announce Type: new Abstract: [Abridged] Production LLM deployments receive feedback from a non-random fraction of users: thumbs sit mostly in the tails of the satisfaction distribution, and a naive average over them can land 40-50 percentage points away from… 6 arXiv — NLP / Computation & Language research 19h ago Mitigating Context-Memory Conflicts in LLMs through Dynamic Cognitive Reconciliation Decoding arXiv:2605.12185v1 Announce Type: new Abstract: Large language models accumulate extensive parametric knowledge through pre-training. However, knowledge conflicts occur when outdated or incorrect parametric knowledge conflicts with external knowledge in the context. 
Existing… 27 arXiv — NLP / Computation & Language research 19h ago Mechanistic Interpretability of ASR models using Sparse Autoencoders arXiv:2605.12225v1 Announce Type: new Abstract: Understanding the internal workings of deep Transformer-based NLP models is more crucial than ever as these models see widespread use in various domains that affect the public at large, such as industry, academia, finance,… 24 arXiv — NLP / Computation & Language research 19h ago Combining On-Policy Optimization and Distillation for Long-Context Reasoning in Large Language Models arXiv:2605.12227v1 Announce Type: new Abstract: Adapting large language models (LLMs) to long-context tasks requires post-training methods that remain accurate and coherent over thousands of tokens. Existing approaches are limited in several ways: 1) off-policy methods such as… 12 arXiv — NLP / Computation & Language research 19h ago Mind the Pause: Disfluency-Aware Objective Tuning for Multilingual Speech Correction with LLMs arXiv:2605.12242v1 Announce Type: new Abstract: Automatic Speech Recognition (ASR) transcripts often contain disfluencies, such as fillers, repetitions, and false starts, which reduce readability and hinder downstream applications like chatbots and voice assistants. If left… 5 arXiv — NLP / Computation & Language research 19h ago PreScam: A Benchmark for Predicting Scam Progression from Early Conversations arXiv:2605.12243v1 Announce Type: new Abstract: Conversational scams, such as romance and investment scams, are emerging as a major form of online fraud.
Unlike one-shot scam lures such as fake lottery or unpaid toll messages, they unfold through multi-turn conversations in… 33 arXiv — NLP / Computation & Language research 19h ago PRISM: Pareto-Efficient Retrieval over Intent-Aware Structured Memory for Long-Horizon Agents arXiv:2605.12260v1 Announce Type: new Abstract: Long-horizon language agents accumulate conversation history far faster than any fixed context window can hold, making memory management critical to both answer accuracy and serving cost. Existing approaches either expand the… 8 arXiv — NLP / Computation & Language research 19h ago What makes a word hard to learn? Modeling L1 influence on English vocabulary difficulty arXiv:2605.12281v1 Announce Type: new Abstract: What makes a word difficult to learn, and how does the difficulty depend on the learner's native language? We computationally model vocabulary difficulty for English learners whose first language is Spanish, German, or Chinese with… 32 arXiv — NLP / Computation & Language research 19h ago TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching arXiv:2605.12288v1 Announce Type: new Abstract: Direct Preference Optimization (DPO) is a widely used RL-free method for aligning language models from pairwise preferences, but it models preferences over full sequences even though generation is driven by per-token decisions.… 12 arXiv — NLP / Computation & Language research 19h ago GKnow: Measuring the Entanglement of Gender Bias and Factual Gender arXiv:2605.12299v1 Announce Type: new Abstract: Recent works have analyzed the impact of individual components of neural networks on gendered predictions, often with a focus on mitigating gender bias. 
However, mechanistic interpretations of gender tend to (i) focus on a very… 23 arXiv — NLP / Computation & Language research 19h ago Overview of the MedHopQA track at BioCreative IX: track description, participation and evaluation of systems for multi-hop medical question answering arXiv:2605.12313v1 Announce Type: new Abstract: Multi-hop question answering (QA) remains a significant challenge in the biomedical domain, requiring systems to integrate information across multiple sources to answer complex questions. To address this problem, the BioCreative IX… 18 arXiv — NLP / Computation & Language research 19h ago A categorical error sensitivity index (ISEC): A preventive ordinal decision-support measure for irrecoverable errors in manual data entry systems arXiv:2605.12328v1 Announce Type: new Abstract: Data entry systems remain structurally vulnerable to categorical misclassifications, particularly in small and medium sized enterprises (SMEs). When nominal categories exhibit semantic or morphological proximity, human machine… 29 arXiv — NLP / Computation & Language research 19h ago Output Composability of QLoRA PEFT Modules for Plug-and-Play Attribute-Controlled Text Generation arXiv:2605.12345v1 Announce Type: new Abstract: Parameter-efficient fine-tuning (PEFT) techniques offer task-specific fine-tuning at a fraction of the cost of full fine-tuning, but require separate fine-tuning for every new task (combination). In this paper, we explore three… 25 arXiv — NLP / Computation & Language research 19h ago MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering arXiv:2605.12361v1 Announce Type: new Abstract: Evaluating large language models (LLMs) in the biomedical domain requires benchmarks that can distinguish reasoning from pattern matching and remain discriminative as model capabilities improve. 
Existing biomedical question… 6 arXiv — NLP / Computation & Language research 19h ago Context Convergence Improves Answering Inferential Questions arXiv:2605.12370v1 Announce Type: new Abstract: While Large Language Models (LLMs) are widely used in open-domain Question Answering (QA), their ability to handle inferential questions, where answers must be derived rather than directly retrieved, remains underexplored. This… 21 arXiv — NLP / Computation & Language research 19h ago Pretraining Exposure Explains Popularity Judgments in Large Language Models arXiv:2605.12382v1 Announce Type: new Abstract: Large language models (LLMs) exhibit systematic preferences for well-known entities, a phenomenon often attributed to popularity bias. However, the extent to which these preferences reflect real-world popularity versus statistical… 19 arXiv — NLP / Computation & Language research 19h ago Scalable Token-Level Hallucination Detection in Large Language Models arXiv:2605.12384v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated remarkable capabilities, but they still frequently produce hallucinations.
These hallucinations are difficult to detect in reasoning-intensive tasks, where the content appears coherent… 35 arXiv — NLP / Computation & Language research 19h ago A Comparative Study of Controlled Text Generation Systems Using Level-Playing-Field Evaluation Principles arXiv:2605.12395v1 Announce Type: new Abstract: Background: Many different approaches to controlled text generation (CTG) have been proposed over recent years, but it is difficult to get a clear picture of which approach performs best, because different datasets and evaluation… 23 arXiv — NLP / Computation & Language research 19h ago Question Difficulty Estimation for Large Language Models via Answer Plausibility Scoring arXiv:2605.12398v1 Announce Type: new Abstract: Estimating question difficulty is a critical component in evaluating and improving large language models (LLMs) for question answering (QA). Existing approaches often rely on readability formulas, retrieval-based signals, or… 8 arXiv — NLP / Computation & Language research 19h ago Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space arXiv:2605.12412v1 Announce Type: new Abstract: Large Language Models (LLMs) update their behavior in context, which can be viewed as a form of Bayesian inference. However, the structure of the latent hypothesis space over which this inference operates remains unclear. In this… 9 arXiv — NLP / Computation & Language research 19h ago ORBIT: Preserving Foundational Language Capabilities in GenRetrieval via Origin-Regulated Merging arXiv:2605.12419v1 Announce Type: new Abstract: Despite the rapid advancements in large language model (LLM) development, fine-tuning them for specific tasks often results in the catastrophic forgetting of their general, language-based reasoning abilities. 
This work investigates… 24 arXiv — NLP / Computation & Language research 19h ago Predicting Disagreement with Human Raters in LLM-as-a-Judge Difficulty Assessment without Using Generation-Time Probability Signals arXiv:2605.12422v1 Announce Type: new Abstract: Automatic generation of educational materials using large language models (LLMs) is becoming increasingly common, but assigning difficulty levels to such materials still requires substantial human effort. LLM-as-a-Judge has… 16 arXiv — NLP / Computation & Language research 19h ago Geometric Factual Recall in Transformers arXiv:2605.12426v1 Announce Type: new Abstract: How do transformer language models memorize factual associations? A common view casts internal weight matrices as associative memories over pairs of embeddings, requiring parameter counts that scale linearly with the number of… 5 arXiv — NLP / Computation & Language research 19h ago A Causal Language Modeling Detour Improves Encoder Continued Pretraining arXiv:2605.12438v1 Announce Type: new Abstract: When adapting an encoder to a new domain, the standard approach is to continue training with Masked Language Modeling (MLM). We show that temporarily switching to Causal Language Modeling (CLM) followed by a short MLM decay… 38 arXiv — NLP / Computation & Language research 19h ago The Algorithmic Caricature: Auditing LLM-Generated Political Discourse Across Crisis Events arXiv:2605.12452v1 Announce Type: new Abstract: Large Language Models (LLMs) can generate fluent political text at scale, raising concerns about synthetic discourse during crises and social conflict. 
Existing AI-text detection often focuses on sentence-level cues such as… 18 arXiv — NLP / Computation & Language research 19h ago Task-Adaptive Embedding Refinement via Test-time LLM Guidance arXiv:2605.12487v1 Announce Type: new Abstract: We explore the effectiveness of an LLM-guided query refinement paradigm for extending the usability of embedding models to challenging zero-shot search and classification tasks. Our approach refines the embedding representation of… 36 arXiv — NLP / Computation & Language research 19h ago LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues arXiv:2605.12493v1 Announce Type: new Abstract: Long-term memory is crucial for agents in specialized web environments, where success depends on recalling interface affordances, state dynamics, workflows, and recurring failure modes. However, existing memory benchmarks for… 17 arXiv — NLP / Computation & Language research 19h ago AgentShield: Deception-based Compromise Detection for Tool-using LLM Agents arXiv:2605.11026v1 Announce Type: cross Abstract: Defenses against indirect prompt injection (IPI) in tool-using LLM agents share two structural weaknesses. First, they all attempt to prevent attacks rather than detect the compromises that slip through. Second, they have only… 21 arXiv — NLP / Computation & Language research 19h ago On Problems of Implicit Context Compression for Software Engineering Agents arXiv:2605.11051v1 Announce Type: cross Abstract: LLM-based Software Engineering agents face a critical bottleneck: context length limitations cause failures on complex, long-horizon tasks. One promising solution is to encode context as continuous embeddings rather than discrete… 27 arXiv — NLP / Computation & Language research 19h ago Unlocking LLM Creativity in Science through Analogical Reasoning arXiv:2605.11258v1 Announce Type: cross Abstract: Autonomous science promises to augment scientific discovery, particularly in complex fields like biomedicine. 
However, this requires AI systems that can consistently generate novel and diverse solutions to open-ended problems. We… 22 arXiv — NLP / Computation & Language research 19h ago LatentRouter: Can We Choose the Right Multimodal Model Before Seeing Its Answer? arXiv:2605.11301v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) have heterogeneous strengths across OCR, chart understanding, spatial reasoning, visual question answering, cost, and latency. Effective MLLM routing therefore requires more than… 24 arXiv — NLP / Computation & Language research 19h ago VERDI: Single-Call Confidence Estimation for Verification-Based LLM Judges via Decomposed Inference arXiv:2605.11334v1 Announce Type: cross Abstract: LLM-as-Judge systems are widely deployed for automated evaluation, yet practitioners lack reliable methods to know when a judge's verdict should be trusted. Token log-probabilities, the standard post-hoc confidence signal, are… 19 arXiv — NLP / Computation & Language research 19h ago Much of Geospatial Web Search Is Beyond Traditional GIS arXiv:2605.11336v1 Announce Type: cross Abstract: Web search queries concern place far more often than existing labelling schemes suggest, yet the landscape of geospatial web search queries - what people ask of place, and how often - remains poorly characterised at scale. We… 13 arXiv — NLP / Computation & Language research 19h ago PresentAgent-2: Towards Generalist Multimodal Presentation Agents arXiv:2605.11363v1 Announce Type: cross Abstract: Presentation generation is moving beyond static slide creation toward end-to-end presentation video generation with research grounding, multimodal media, and interactive delivery. 
We introduce PresentAgent-2, an agentic framework… 30 arXiv — NLP / Computation & Language research 19h ago Test-Time Compute for Dense Retrieval: Agentic Program Generation with Frozen Embedding Models arXiv:2605.11374v1 Announce Type: cross Abstract: Test-time compute is widely believed to benefit only large reasoning models. We show it also helps small embedding models. Most modern embedding checkpoints are distilled from large LLM backbones and inherit their representation… 21 arXiv — NLP / Computation & Language research 19h ago AcuityBench: Evaluating Clinical Acuity Identification and Uncertainty Alignment arXiv:2605.11398v1 Announce Type: cross Abstract: We introduce AcuityBench, a benchmark for evaluating whether language models identify the appropriate urgency of care from user medical presentations. Existing health benchmarks emphasize medical question answering, broad health… 36 arXiv — NLP / Computation & Language research 19h ago fg-expo: Frontier-guided exploration-prioritized policy optimization via adaptive kl and gaussian curriculum arXiv:2605.11403v1 Announce Type: cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has become the standard paradigm for LLM mathematical reasoning, with Group Relative Policy Optimization (GRPO) serving as the dominant algorithm. We identify two overlooked… 38 arXiv — NLP / Computation & Language research 19h ago MaskTab: Scalable Masked Tabular Pretraining with Scaling Laws and Distillation for Industrial Classification arXiv:2605.11408v1 Announce Type: cross Abstract: Tabular data forms the backbone of high-stakes decision systems in finance, healthcare, and beyond. Yet industrial tabular datasets are inherently difficult: high-dimensional, riddled with missing entries, and rarely labeled at… 5 arXiv — NLP / Computation & Language research 19h ago Can a Single Message Paralyze the AI Infrastructure? 
The Rise of AbO-DDoS Attacks through Targeted Mobius Injection arXiv:2605.11442v1 Announce Type: cross Abstract: Large Language Model (LLM) agents have emerged as key intermediaries, orchestrating complex interactions between human users and a wide range of digital services and LLM infrastructures. While prior research has extensively… 20 arXiv — NLP / Computation & Language research 19h ago Adaptive Teacher Exposure for Self-Distillation in LLM Reasoning arXiv:2605.11458v1 Announce Type: cross Abstract: On-policy self-distillation has become a strong recipe for LLM reasoning, where a privileged teacher supervises the student's own rollouts while conditioning on the reference solution. A design choice shared by nearly all such… 28 arXiv — NLP / Computation & Language research 19h ago AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration -- Learning from Cheap, Optimizing Expensive arXiv:2605.11518v1 Announce Type: cross Abstract: Effectively configuring scalable large language model (LLM) experiments, spanning architecture design, hyperparameter tuning, and beyond, is crucial for advancing LLM research, as poor configuration choices can waste substantial… 13 arXiv — NLP / Computation & Language research 19h ago Controllable User Simulation arXiv:2605.11519v1 Announce Type: cross Abstract: Using offline datasets to evaluate conversational agents often fails to cover rare scenarios or to support testing new policies. This has motivated the use of controllable user simulators for targeted, counterfactual evaluation,… 20 arXiv — Machine Learning research 1d ago Reinforcement learning for inverse structural design and rapid laser cutting of kirigami prototypes arXiv:2605.08098v1 Announce Type: new Abstract: Kirigami is an increasingly useful fabrication method to produce shape-programmable metamaterial structures. 
However, inverse design remains difficult because deployment… 12 arXiv — Machine Learning research 1d ago Path-Based Gradient Boosting for Graph-Level Prediction arXiv:2605.08102v1 Announce Type: new Abstract: We propose PathBoost, a gradient tree boosting method for graph-level classification and regression that learns discriminative path-based features directly from the input… 20 arXiv — Machine Learning research 1d ago Distributional Reinforcement Learning via the Cramér Distance arXiv:2605.08104v1 Announce Type: new Abstract: This paper explores the application of the Soft Actor-Critic (SAC) algorithm within a Distributional Reinforcement Learning setting and introduces an implementation of… 15 arXiv — Machine Learning research 1d ago Geometry-free prediction of inertial lift forces in microfluidic devices using deep learning arXiv:2605.08109v1 Announce Type: new Abstract: Inertial microfluidic devices (IMDs) offer low-cost, high-throughput alternative techniques for many traditional particle- (or cell-) manipulation tasks, but simulating… 19 arXiv — Machine Learning research 1d ago BaLoRA: Bayesian Low-Rank Adaptation of Large Scale Models arXiv:2605.08110v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) has become the standard for fine-tuning large pre-trained models at reduced computational cost. However, its low-rank point-estimate updates… 6 arXiv — Machine Learning research 1d ago TTCD: Transformer Integrated Temporal Causal Discovery from Non-Stationary Time Series Data arXiv:2605.08111v1 Announce Type: new Abstract: The widespread availability of complex time series data in various domains such as environmental science, epidemiology, and economics demands robust causal discovery… 35 arXiv — Machine Learning research 1d ago Do Foundation Model Embeddings Improve Cross-Country Crop Yield Generalisation?
A Leave-One-Country-Out Evaluation in Sub-Saharan Africa arXiv:2605.08113v1 Announce Type: new Abstract: Accurate predictions of smallholder maize yields across national boundaries are critical for food security planning in sub-Saharan Africa, yet most published benchmarks… 17 arXiv — Machine Learning research 1d ago Statistical Inference and Quality Measures of KV Cache Quantisations Inspired by TurboQuant arXiv:2605.08114v1 Announce Type: new Abstract: We analyse three KV cache quantization schemes under a fair bit budget: KV (scalar MSE baseline), KQV (WHT + MSE on K; WHT + MSE + QJL on V), and… 27 arXiv — Machine Learning research 1d ago The Safety-Aware Denoiser for Text Diffusion Models arXiv:2605.08116v1 Announce Type: new Abstract: Recent work on text diffusion models offers a promising alternative to autoregressive generation, but controlling their safety remains underexplored. Existing safety… 9 arXiv — Machine Learning research 1d ago Feature Repulsion and Spectral Lock-in: An Empirical Study of Two-Layer Network Grokking arXiv:2605.08119v1 Announce Type: new Abstract: Tian (2025) proves a repulsion theorem (Theorem 6) for the matrix $B = (\widetilde{F}^\top \widetilde{F} + \eta I)^{-1}$ during the interactive feature-learning stage of… 31 arXiv — Machine Learning research 1d ago Block-Wise Differentiable Sinkhorn Attention: Tail-Refinement Gradients with a Gap-Aware Dustbin Bridge arXiv:2605.08123v1 Announce Type: new Abstract: We study long-context balanced entropic optimal transport (OT) attention on TPU hardware through a stopped-base, fixed-depth tail-refinement surrogate.
After a stopped… 32 arXiv — Machine Learning research 1d ago Towards Universal Gene Regulatory Network Inference: Unlocking Generalizable Regulatory Knowledge in Single-cell Foundation Models arXiv:2605.08128v1 Announce Type: new Abstract: Gene Regulatory Network (GRN) inference is essential for understanding complex cellular mechanisms, rendered tractable through single-cell transcriptomic data. With the… 19 arXiv — Machine Learning research 1d ago Towards Customized Multimodal Role-Play arXiv:2605.08129v1 Announce Type: new Abstract: Unified multimodal understanding and generation models enable richer human-AI interaction. Yet jointly customizing a character's persona, dialogue style, and visual… 26 arXiv — Machine Learning research 1d ago Additive Atomic Forests for Symbolic Function and Antiderivative Discovery arXiv:2605.08130v1 Announce Type: new Abstract: We present a framework for the simultaneous symbolic recovery of a function and its antiderivative from data. The framework rests on three ideas. First, a derivative… 27 arXiv — Machine Learning research 1d ago Interactive Inverse Reinforcement Learning of Interaction Scenarios via Bi-level Optimization arXiv:2605.08131v1 Announce Type: new Abstract: Inverse reinforcement learning (IRL) learns a reward function and a corresponding policy that best fit the demonstration data of an expert. 
However, in the current IRL… 18 arXiv — Machine Learning research 1d ago DARE: Diffusion Language Model Activation Reuse for Efficient Inference arXiv:2605.08134v1 Announce Type: new Abstract: Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to auto-regressive (AR) models, offering greater expressive capacity and potential for… 36 arXiv — Machine Learning research 1d ago Dendritic Neural Networks with Equilibrium Propagation arXiv:2605.08135v1 Announce Type: new Abstract: Equilibrium propagation (EP) is a biologically plausible alternative to backpropagation (BP), but its effectiveness can degrade in deeper and more challenging learning… 26 arXiv — Machine Learning research 1d ago Weight Pruning Amplifies Bias: A Multi-Method Study of Compressed LLMs for Edge AI arXiv:2605.08137v1 Announce Type: new Abstract: Weight pruning is widely advocated for deploying Large Language Models on resource-constrained IoT and edge devices, yet its impact on model fairness remains poorly… 6 arXiv — Machine Learning research 1d ago DataArc-SynData-Toolkit: A Unified Closed-Loop Framework for Multi-Path, Multimodal, and Multilingual Data Synthesis arXiv:2605.08138v1 Announce Type: new Abstract: Synthetic data has emerged as a crucial solution to the data scarcity bottleneck in large language models (LLMs), particularly for specialized domains and low-resource… 10 arXiv — Machine Learning research 1d ago Reasoning emerges from constrained inference manifolds in large language models arXiv:2605.08142v1 Announce Type: new Abstract: Reasoning in large language models is predominantly evaluated through labeled benchmarks, conflating task performance with the quality of internal inference. 
Here we study… 15 NVIDIA Developer Blog official-blog 20d ago Simplify Sparse Deep Learning with Universal Sparse Tensor in nvmath-python In a previous post, we introduced the Universal Sparse Tensor (UST), enabling developers to decouple a tensor’s sparsity from its memory layout for greater... 14 Hugging Face official-blog 6mo ago Voice Cloning with Consent Published October 28, 2025 Margaret Mitchell, Lucie-Aimée Kaffee In this blog post, we introduce the idea of a 'voice consent gate' to support voice cloning with consent. We provide an example… 24 Google DeepMind official-blog 6mo ago VaultGemma: The world's most capable differentially private LLM We introduce VaultGemma, the most capable model trained from scratch with differential privacy. 12 Lil'Log (Lilian Weng) research 106mo ago From GAN to WGAN [Updated on 2018-09-30: thanks to Yoonju, we have this post translated in Korean!] [Updated on 2019-04-18: this post is also available on arXiv.] Generative adversarial network (GAN) has shown great results in many generative tasks to replicate the real-world rich content such… 4