arXiv — Machine Learning

arXiv — Machine Learning research 5d ago

Neural Scaling Universality: If Exponents Are Fixed, Time to Understand Coefficients

arXiv:2606.25008v1 Announce Type: new Abstract: Neural scaling laws describe how pre-training loss decays as power laws with training time, model size, and compute. This position paper argues that the exponents of these power laws are fixed by generic mechanisms: a one-third…

13
arXiv — Machine Learning research 5d ago

Emergent Capabilities Arise Randomly from Learning Sparse Attention Patterns

arXiv:2606.25010v1 Announce Type: new Abstract: Neural scaling laws for transformer language models predict smooth improvements in pretraining loss with increasing parameters, but downstream capabilities such as in-context learning are known to emerge abruptly past a certain…

19
arXiv — Machine Learning research 5d ago

Bias-Controlled Primal-Dual Natural Actor-Critic: Optimal Rates for Constrained Multi-Objective Average-Reward RL

arXiv:2606.25012v1 Announce Type: new Abstract: Many reinforcement learning (RL) problems in the infinite-horizon average-reward setting require optimizing multiple conflicting objectives while satisfying multiple safety constraints. A common approach is concave scalarization,…

27
arXiv — Machine Learning research 5d ago

Do Thinking Tokens Help with Safety?

arXiv:2606.25013v1 Announce Type: new Abstract: Today's reasoning models use thinking tokens to attain stronger performance on benchmarks than their instruction-tuned counterparts. It is also generally believed that this more "deliberative" mode should improve alignment and…

37
arXiv — Machine Learning research 5d ago

LLM-ACES: Closed-Loop Discovery of Dynamical Systems with LLM-Guided Adaptive Search

arXiv:2606.25039v1 Announce Type: new Abstract: Recovering governing Ordinary Differential Equations (ODEs) from data is a central challenge in modeling dynamical systems across scientific domains. Existing approaches cast discovery as a static inference problem over fixed…

35
arXiv — Machine Learning research 5d ago

Adapt Only When It Pays: Budgeted Decision-Loss Priority for Delayed Online Time-Series Adaptation

arXiv:2606.25068v1 Announce Type: new Abstract: Online time-series forecasters receive labels only after horizon-dependent delays, while every adaptation step spends limited compute. We study when an online learner should update, not how to adapt at every opportunity, and…

18
arXiv — Machine Learning research 5d ago

GCT-MARL: Graph-Based Contrastive Transfer for Sample-Efficient Cooperative Multi-Agent Reinforcement Learning

arXiv:2606.25073v1 Announce Type: new Abstract: In cooperative multi-agent reinforcement learning (MARL), from a deployment perspective, it is challenging and expensive to train agents from scratch for each new environment or task. In this work, we propose GCT-MARL, a transfer…

30
arXiv — Machine Learning research 5d ago

Training for the Model You Return: Improving Optimization for Iterate-Averaged Language Models

arXiv:2606.25086v1 Announce Type: new Abstract: Many modern Language Model (LM) pipelines return an averaged model, such as an exponential moving average of the training iterates, rather than the final iterate itself. This raises a fundamental question: given that we will return…

15
arXiv — Machine Learning research 5d ago

How Modular Is a Frontier Mixture-of-Experts? A Pre-registered Causal Test in Which Apparent Expert Modularity Mostly Dissolves

arXiv:2606.25092v1 Announce Type: new Abstract: Sparse Mixture-of-Experts (MoE) models route each token to a few of many experts, inviting the hypothesis that experts form functional modules tied to capabilities or languages. We test this causally on Command A+, a frontier…

5
arXiv — Machine Learning research 5d ago

Speculative Decoding at Temperature Zero: A Scoped Safety-Invariance Screen with a 48,072-Sample Expansion

arXiv:2606.25097v1 Announce Type: new Abstract: Speculative decoding accelerates inference by letting a draft model propose tokens for a target model to verify, raising a concrete safety question: at temperature zero, can draft-side behavior leak into safety-scored outputs? We…

7
arXiv — Machine Learning research 5d ago

A Framework for Directed Hypergraph Signal Processing via tensor t-SVD

arXiv:2606.25112v1 Announce Type: new Abstract: We introduce Directed Hypergraph Signal Processing (DHGSP), a unified framework that extends graph signal processing to accommodate both higher-order (polyadic) and asymmetric (directional) relationships simultaneously. Using the…

11
arXiv — Machine Learning research 5d ago

Forget to Improve: On-Device LLM-Agent Continual Learning via Budget-Curated Memory

arXiv:2606.25115v1 Announce Type: new Abstract: On-device language-model agents improve by accumulating experience in retrieved memory rather than by updating weights. This memory is hard-bounded and exposed: it consumes RAM and energy, reaches peers through a thin uplink, and…

24
arXiv — Machine Learning research 5d ago

Reward-Conditioned Attention: How Reward Design Shapes What Autonomous Driving Agents See

arXiv:2606.25127v1 Announce Type: new Abstract: We investigate how reward design shapes the internal attention patterns of reinforcement learning agents trained for autonomous driving. Using three Perceiver-based agents that share identical architectures and training data but…

33
arXiv — Machine Learning research 5d ago

Silent Failures in Physics-Informed Neural Networks: Parameter Poisoning and the Limits of Loss-Based Validation

arXiv:2606.25151v1 Announce Type: new Abstract: Physics-informed neural networks (PINNs) embed governing equations in their loss function, enabling mesh-free solutions to partial differential equations. Low training loss is treated as evidence that the learned solution is…

20
arXiv — Machine Learning research 5d ago

ATMA: Length-Invariant Language Modeling via Polar Attention and Gated-Delta Compression Memory

arXiv:2606.25156v1 Announce Type: new Abstract: Modern large language models based on softmax scaled-dot-product attention are constrained by their training sequence length: as the key-value sequence grows, softmax probability mass can dilute across a wider distribution,…

26
arXiv — Machine Learning research 5d ago

The Gentle Collapse: Distributional Metrics for Continual Learning

arXiv:2606.25165v1 Announce Type: new Abstract: Accuracy degradation is the standard metric for Catastrophic Forgetting (CF); however, it records only whether forgetting occurred or not. It saturates at the extremes and collapses discretely at task boundaries, hiding the…

35
arXiv — Machine Learning research 5d ago

An iterative energy-based multimodal transformer for joint retrieval of wheat soil moisture, leaf area index, and plant height from Sentinel-1 and Sentinel-2 time series

arXiv:2606.25174v1 Announce Type: new Abstract: Field-scale retrieval of surface soil moisture (SM), leaf area index (LAI), and plant height (PH) is essential for precision agriculture, yet it remains an ill-posed inverse problem. Concurrent variations in soil moisture and…

24
arXiv — Machine Learning research 5d ago

EveLoad: Cognitive Workload Recognition from Event-Based Eye Movements

arXiv:2606.25177v1 Announce Type: new Abstract: Cognitive workload monitoring is important for adaptive rehabilitation and assistive interfaces, where task difficulty, pacing, and feedback should be adjusted according to the user's cognitive state to avoid overload and…

22
arXiv — Machine Learning research 5d ago

Neural operator-based digital twins for modeling amyloid-$\beta$ and tau propagation and treatment optimization in Alzheimer's disease

arXiv:2606.25185v1 Announce Type: new Abstract: Accurately predicting the spatiotemporal evolution of amyloid-$\beta$ and tau proteins at the individual level is critical for improving the diagnosis and treatment of Alzheimer's disease. We consider the problem of constructing…

6
arXiv — Machine Learning research 5d ago

Efficient Analytic Uncertainty Quantification for Multi-Modal Regression

arXiv:2606.25188v1 Announce Type: new Abstract: Efficient uncertainty quantification (UQ) is essential for trustworthy large-scale learning. Existing UQ methods for regression tasks mainly operate under the assumption that the conditional label marginal satisfies single-peak…

16
arXiv — Machine Learning research 5d ago

Efficient Adaptive Data Acquisition via Pretrained Belief Representations

arXiv:2606.25197v1 Announce Type: new Abstract: Learning effective policies for adaptive data acquisition remains challenging: posterior-based methods rely on surrogate models and posterior approximations that can be misspecified or biased, while direct policy-learning methods…

28
arXiv — Machine Learning research 5d ago

A Hybrid CNN-LSTM Intrusion Detection Framework for Cybersecurity in Smart Renewable Energy Grids

arXiv:2606.25200v1 Announce Type: new Abstract: The accelerated digitalization of renewable energy smart grids through IoT sensors, AMI, and SCADA systems has significantly expanded the attack surface for sophisticated cyberattacks, FDI attacks that stealthily distort state…

38
arXiv — Machine Learning research 5d ago

FDN: Interpretable Spatiotemporal Forecasting with Future Decomposition Networks

arXiv:2606.25201v1 Announce Type: new Abstract: Spatiotemporal systems comprise a collection of spatially distributed yet interdependent entities each generating unique dynamic signals. Highly sophisticated methods have been proposed in recent years delivering state-of-the-art…

21
arXiv — Machine Learning research 5d ago

ASAP: Agent-System Co-Design for Wall-Clock-Centered Auto HPO Research for ML Experiments

arXiv:2606.25207v1 Announce Type: new Abstract: Hyperparameter Optimization (HPO) is essential for maximizing machine learning model performance, and its core challenge is sample efficiency: finding strong configurations within a limited budget. Because every HPO tool relies on…

27
arXiv — Machine Learning research 5d ago

Semantic Allocation in Ordered Bottlenecks: Predictive Residual Inference for Visual Representation Learning

arXiv:2606.25232v1 Announce Type: new Abstract: Ordered bottlenecks aim to provide utility at flexible budgets by assigning coarse information to early tokens and task-relevant detail to later ones. Prior work, including tail dropping (TD), typically enforces ordering by means…

32
arXiv — Machine Learning research 5d ago

Variational Inference via Entropic Transport Descent

arXiv:2606.25265v1 Announce Type: new Abstract: Particle-based variational inference (ParVI) methods approximate an intractable target distribution by evolving an ensemble of interacting samples. Existing approaches rely predominantly on kernel-based repulsion (e.g., SVGD),…

14
arXiv — Machine Learning research 5d ago

Inverse Reinforcement Learning for Interpretable Keystroke Biomarkers in Parkinson's Disease

arXiv:2606.25270v1 Announce Type: new Abstract: Keystroke dynamics have been explored extensively as a passive digital biomarker for Parkinson's disease (PD), typically by extracting summary statistics from typing timing and training a classifier to discriminate PD from healthy…

30
arXiv — Machine Learning research 5d ago

UC-Search: Risk-Aware Test-Time Search for Delayed Constrained Time-Series Control

arXiv:2606.25274v1 Announce Type: new Abstract: Time-series models are usually scored as forecasters, yet deployed systems often require delayed decisions under uncertainty and hard feasibility constraints. UC-Search is a model-agnostic test-time wrapper: a backbone emits…

29
arXiv — Machine Learning research 5d ago

EPTS: Elastic Post-Training Sparsity for Efficient Large Language Model Compression

arXiv:2606.25285v1 Announce Type: new Abstract: Post-Training Sparsity (PTS) has emerged as a crucial paradigm for compressing Large Language Models to facilitate efficient deployment on resource-constrained devices. However, existing PTS methodologies are typically confined to…

32
arXiv — Machine Learning research 5d ago

Communicability-Inspired Positional Encoding (CIPE)

arXiv:2606.25293v1 Announce Type: new Abstract: Positional encodings (PEs) are essential for Transformers. Yet designing effective PEs for non-Euclidean graphs remains challenging. Such encodings should ideally induce an Attention-Compatible Geometry for self-attention: not…

15
arXiv — Machine Learning research 5d ago

Stagnant Neuron: Towards Understanding the Plasticity Loss in Multi-Agent Reinforcement Learning Value Factorization Methods

arXiv:2606.25335v1 Announce Type: new Abstract: Multi-Agent Reinforcement Learning (MARL) value factorization methods can suffer from a loss of plasticity, gradually failing to adapt when transferring to new task instances. We trace this issue to stagnant neurons, units whose…

23
arXiv — Machine Learning research 5d ago

Lifelong In-Context Learning with Transformers Requires Parametric Forms of Attention

arXiv:2606.25342v1 Announce Type: new Abstract: Lifelong continual learning remains an obstacle on the path to human-like intelligence. Modern transformers show sparks of intelligence with in-context learning. The quadratic nature of attention, however, prohibits transformers…

36
arXiv — Machine Learning research 5d ago

Geometry-Anchored Transport Framework for Exemplar-Free Class-Incremental Learning

arXiv:2606.25347v1 Announce Type: new Abstract: Exemplar-free class-incremental learning (EFCIL) requires stable decision boundaries within a shifting feature space. While maintaining class-conditional Gaussian statistics provides a principled classification strategy, these…

16
arXiv — Machine Learning research 5d ago

Compositional Behavioral Semantics for State Abstraction in Reinforcement Learning

arXiv:2606.25357v1 Announce Type: new Abstract: State abstraction plays a key role in scaling reinforcement learning to complex but structured systems. In studying such systems, a wide range of behavioral structures have been studied in reinforcement learning, including value…

15
arXiv — Machine Learning research 5d ago

FactorLibrary: From Polynomials to Circuits via Recursive Subgoals

arXiv:2606.25394v1 Announce Type: new Abstract: Finding minimal arithmetic circuits for polynomials over finite fields is a combinatorially hard problem central to algebraic complexity theory. We formulate it as a reinforcement learning problem in two directions, bottom-up and…

30
arXiv — Machine Learning research 5d ago

DFMU: Data-Frugal Machine Unlearning

arXiv:2606.25410v1 Announce Type: new Abstract: Machine unlearning is an emerging domain that ensures the safe removal of elements (includes concepts, attributes, entity and class) from the trained model along with least drop in model performance. The domain of machine…

6
arXiv — Machine Learning research 5d ago

Brevity is the Soul of Inference Efficiency: Inducing Concision in VLMs via Data Curation

arXiv:2606.25432v1 Announce Type: new Abstract: Inference efficiency is typically pursued by shrinking the model: distillation, pruning, quantization, and sparse routing each lower per-token cost while treating token count as fixed. But output length has been inflating, and it…

28
arXiv — Machine Learning research 5d ago

Interpretable Concept-Guided Polynomial Tabular Kolmogorov-Arnold Network for EEG-Based Mild Cognitive Impairment Detection

arXiv:2606.25434v1 Announce Type: new Abstract: Early and scalable detection of mild cognitive impairment (MCI) remains an unresolved clinical challenge. Existing EEG-based screening approaches are constrained by handcrafted feature pipelines that discard neurophysiologically…

10
arXiv — Machine Learning research 5d ago

TopoCast: A Topological Fidelity Framework for Evaluating Transformer-Based Time Series Forecasting

arXiv:2606.25439v1 Announce Type: new Abstract: Deep learning-based models have achieved state-of-the-art performance in Time Series Forecasting (TSF), yet their evaluation remains dominated by pointwise error metrics such as Mean Squared Error (MSE), which quantify numerical…

37
arXiv — Machine Learning research 5d ago

The Interplay of Harness Design and Post-Training in LLM Agents

arXiv:2606.25447v1 Announce Type: new Abstract: Tool-integrated LLM agents are often wrapped within a harness: the scaffolding that determines which tools are exposed, how they are described, and what auxiliary information accompanies each per-step observation. While agents are…

15
arXiv — Machine Learning research 5d ago

The Generalization Spectrum: A Chromatographic Approach to Evaluating Learning Algorithms

arXiv:2606.25450v1 Announce Type: new Abstract: Traditional evaluations measure a learning algorithm's final performance on an i.i.d. test set, reducing learning to a single aggregate score. This approach obscures a fundamental question: to what extent does learning from a…

12
arXiv — Machine Learning research 5d ago

Learning with a Single Rollout via Monte Carlo Pass@k Critic

arXiv:2606.25451v1 Announce Type: new Abstract: Estimating token-level advantages in reinforcement learning (RL) for language models remains challenging because scaling up episodic experience collection is expensive. The difficulty intensifies for baseline advantage estimation…

12
arXiv — Machine Learning research 5d ago

Towards Robust EEG Decoding Based on Riemannian Self-Attention

arXiv:2606.25456v1 Announce Type: new Abstract: Brain-Computer Interface (BCI) based on electroencephalography (EEG) enables direct interaction between the brain and external environments and has significant applications in assistive technologies, medical rehabilitation, and…

32
arXiv — Machine Learning research 5d ago

Distill on a Diet: Efficient Knowledge Distillation via Learnable Data Pruning

arXiv:2606.25488v1 Announce Type: new Abstract: Knowledge Distillation (KD) is widely used to obtain compact models for efficient inference in resource-constrained environments. Yet the computational overhead of the distillation process itself is often overlooked, raising the…

19
arXiv — Machine Learning research 5d ago

Low Variance Trust Region Optimization with Independent Actors and Sequential Updates in Cooperative Multi-agent Reinforcement Learning

arXiv:2606.25526v1 Announce Type: new Abstract: Cooperative multi-agent reinforcement learning assumes each agent shares the same reward function and can be trained effectively using the Trust Region framework of single-agent. Instead of relying on other agents' actions, the…

28
arXiv — Machine Learning research 5d ago

Beyond One-Size-Fits-All: Diagnosis-Driven Online Reinforcement Learning with Offline Priors

arXiv:2606.25527v1 Announce Type: new Abstract: Online reinforcement learning (RL) agents increasingly depend on knowledge acquired offline to achieve practical efficiency. Originally studied in offline-to-online RL, this paradigm now spans foundation model post-training and…

27
arXiv — Machine Learning research 5d ago

Leaking Circuit Secrets: Gradient Leakage Attacks on Graph Neural Networks

arXiv:2606.25589v1 Announce Type: new Abstract: As graph neural networks (GNNs) become standard tools for critical tasks in circuit design and analysis, their security and privacy risks require careful attention. Here, we present the first comprehensive evaluation of gradient…

20
arXiv — Machine Learning research 5d ago

Low-Complexity Policy Tessellations in Structured Markov Decision Processes

arXiv:2606.25593v1 Announce Type: new Abstract: We study optimal-policy geometry in structured Markov decision processes. While approximate dynamic programming and reinforcement learning typically approximate high-dimensional value functions, we show that optimal policies induce…

32
arXiv — Machine Learning research 5d ago

TL++: Accuracy and Privacy Preserving Traversal Learning for Distributed Intelligent Systems

arXiv:2606.25627v1 Announce Type: new Abstract: Distributed intelligent systems increasingly need to train across data silos without centralizing raw data. Federated learning keeps data local but can suffer under heterogeneous partitions and requires repeated full-model…

22
arXiv — Machine Learning research 5d ago

Learning Subset-Shared Invariances for Domain Generalization with Mixture-of-Experts

arXiv:2606.25665v1 Announce Type: new Abstract: Domain generalization (DG) aims to learn a model from one or more source domains that generalizes to an unseen target domain without accessing target data during training. A common approach enforces invariance of representations…

29
