arXiv — Machine Learning
arXiv — Machine Learning research 5d ago
Neural Scaling Universality: If Exponents Are Fixed, Time to Understand Coefficients
arXiv:2606.25008v1 Announce Type: new Abstract: Neural scaling laws describe how pre-training loss decays as power laws with training time, model size, and compute. This position paper argues that the exponents of these power laws are fixed by generic mechanisms: a one-third…
13 -
arXiv — Machine Learning research 5d ago
Emergent Capabilities Arise Randomly from Learning Sparse Attention Patterns
arXiv:2606.25010v1 Announce Type: new Abstract: Neural scaling laws for transformer language models predict smooth improvements in pretraining loss with increasing parameters, but downstream capabilities such as in-context learning are known to emerge abruptly past a certain…
19 -
arXiv — Machine Learning research 5d ago
Bias-Controlled Primal-Dual Natural Actor-Critic: Optimal Rates for Constrained Multi-Objective Average-Reward RL
arXiv:2606.25012v1 Announce Type: new Abstract: Many reinforcement learning (RL) problems in the infinite-horizon average-reward setting require optimizing multiple conflicting objectives while satisfying multiple safety constraints. A common approach is concave scalarization,…
27 -
arXiv — Machine Learning research 5d ago
Do Thinking Tokens Help with Safety?
arXiv:2606.25013v1 Announce Type: new Abstract: Today's reasoning models use thinking tokens to attain stronger performance on benchmarks than their instruction-tuned counterparts. It is also generally believed that this more "deliberative" mode should improve alignment and…
37 -
arXiv — Machine Learning research 5d ago
LLM-ACES: Closed-Loop Discovery of Dynamical Systems with LLM-Guided Adaptive Search
arXiv:2606.25039v1 Announce Type: new Abstract: Recovering governing Ordinary Differential Equations (ODEs) from data is a central challenge in modeling dynamical systems across scientific domains. Existing approaches cast discovery as a static inference problem over fixed…
35 -
arXiv — Machine Learning research 5d ago
Adapt Only When It Pays: Budgeted Decision-Loss Priority for Delayed Online Time-Series Adaptation
arXiv:2606.25068v1 Announce Type: new Abstract: Online time-series forecasters receive labels only after horizon-dependent delays, while every adaptation step spends limited compute. We study when an online learner should update, not how to adapt at every opportunity, and…
18 -
arXiv — Machine Learning research 5d ago
GCT-MARL: Graph-Based Contrastive Transfer for Sample-Efficient Cooperative Multi-Agent Reinforcement Learning
arXiv:2606.25073v1 Announce Type: new Abstract: In cooperative multi-agent reinforcement learning (MARL), from a deployment perspective, it is challenging and expensive to train agents from scratch for each new environment or task. In this work, we propose GCT-MARL, a transfer…
30 -
arXiv — Machine Learning research 5d ago
Training for the Model You Return: Improving Optimization for Iterate-Averaged Language Models
arXiv:2606.25086v1 Announce Type: new Abstract: Many modern Language Model (LM) pipelines return an averaged model, such as an exponential moving average of the training iterates, rather than the final iterate itself. This raises a fundamental question: given that we will return…
15 -
arXiv — Machine Learning research 5d ago
Speculative Decoding at Temperature Zero: A Scoped Safety-Invariance Screen with a 48,072-Sample Expansion
arXiv:2606.25097v1 Announce Type: new Abstract: Speculative decoding accelerates inference by letting a draft model propose tokens for a target model to verify, raising a concrete safety question: at temperature zero, can draft-side behavior leak into safety-scored outputs? We…
7 -
arXiv — Machine Learning research 5d ago
A Framework for Directed Hypergraph Signal Processing via tensor t-SVD
arXiv:2606.25112v1 Announce Type: new Abstract: We introduce Directed Hypergraph Signal Processing (DHGSP), a unified framework that extends graph signal processing to accommodate both higher-order (polyadic) and asymmetric (directional) relationships simultaneously. Using the…
11 -
arXiv — Machine Learning research 5d ago
Forget to Improve: On-Device LLM-Agent Continual Learning via Budget-Curated Memory
arXiv:2606.25115v1 Announce Type: new Abstract: On-device language-model agents improve by accumulating experience in retrieved memory rather than by updating weights. This memory is hard-bounded and exposed: it consumes RAM and energy, reaches peers through a thin uplink, and…
24 -
arXiv — Machine Learning research 5d ago
Reward-Conditioned Attention: How Reward Design Shapes What Autonomous Driving Agents See
arXiv:2606.25127v1 Announce Type: new Abstract: We investigate how reward design shapes the internal attention patterns of reinforcement learning agents trained for autonomous driving. Using three Perceiver-based agents that share identical architectures and training data but…
33 -
arXiv — Machine Learning research 5d ago
Silent Failures in Physics-Informed Neural Networks: Parameter Poisoning and the Limits of Loss-Based Validation
arXiv:2606.25151v1 Announce Type: new Abstract: Physics-informed neural networks (PINNs) embed governing equations in their loss function, enabling mesh-free solutions to partial differential equations. Low training loss is treated as evidence that the learned solution is…
20 -
arXiv — Machine Learning research 5d ago
ATMA: Length-Invariant Language Modeling via Polar Attention and Gated-Delta Compression Memory
arXiv:2606.25156v1 Announce Type: new Abstract: Modern large language models based on softmax scaled-dot-product attention are constrained by their training sequence length: as the key-value sequence grows, softmax probability mass can dilute across a wider distribution,…
26 -
arXiv — Machine Learning research 5d ago
The Gentle Collapse: Distributional Metrics for Continual Learning
arXiv:2606.25165v1 Announce Type: new Abstract: Accuracy degradation is the standard metric for Catastrophic Forgetting (CF); however, it records only whether forgetting occurred. It saturates at the extremes and collapses discretely at task boundaries, hiding the…
35 -
arXiv — Machine Learning research 5d ago
EveLoad: Cognitive Workload Recognition from Event-Based Eye Movements
arXiv:2606.25177v1 Announce Type: new Abstract: Cognitive workload monitoring is important for adaptive rehabilitation and assistive interfaces, where task difficulty, pacing, and feedback should be adjusted according to the user's cognitive state to avoid overload and…
22 -
arXiv — Machine Learning research 5d ago
Efficient Analytic Uncertainty Quantification for Multi-Modal Regression
arXiv:2606.25188v1 Announce Type: new Abstract: Efficient uncertainty quantification (UQ) is essential for trustworthy large-scale learning. Existing UQ methods for regression tasks mainly operate under the assumption that the conditional label marginal satisfies single-peak…
16 -
arXiv — Machine Learning research 5d ago
Efficient Adaptive Data Acquisition via Pretrained Belief Representations
arXiv:2606.25197v1 Announce Type: new Abstract: Learning effective policies for adaptive data acquisition remains challenging: posterior-based methods rely on surrogate models and posterior approximations that can be misspecified or biased, while direct policy-learning methods…
28 -
arXiv — Machine Learning research 5d ago
A Hybrid CNN-LSTM Intrusion Detection Framework for Cybersecurity in Smart Renewable Energy Grids
arXiv:2606.25200v1 Announce Type: new Abstract: The accelerated digitalization of renewable energy smart grids through IoT sensors, AMI, and SCADA systems has significantly expanded the attack surface for sophisticated cyberattacks, FDI attacks that stealthily distort state…
38 -
arXiv — Machine Learning research 5d ago
FDN: Interpretable Spatiotemporal Forecasting with Future Decomposition Networks
arXiv:2606.25201v1 Announce Type: new Abstract: Spatiotemporal systems comprise a collection of spatially distributed yet interdependent entities each generating unique dynamic signals. Highly sophisticated methods have been proposed in recent years delivering state-of-the-art…
21 -
arXiv — Machine Learning research 5d ago
ASAP: Agent-System Co-Design for Wall-Clock-Centered Auto HPO Research for ML Experiments
arXiv:2606.25207v1 Announce Type: new Abstract: Hyperparameter Optimization (HPO) is essential for maximizing machine learning model performance, and its core challenge is sample efficiency: finding strong configurations within a limited budget. Because every HPO tool relies on…
27 -
arXiv — Machine Learning research 5d ago
Semantic Allocation in Ordered Bottlenecks: Predictive Residual Inference for Visual Representation Learning
arXiv:2606.25232v1 Announce Type: new Abstract: Ordered bottlenecks aim to provide utility at flexible budgets by assigning coarse information to early tokens and task-relevant detail to later ones. Prior work, including tail dropping (TD), typically enforces ordering by means…
32 -
arXiv — Machine Learning research 5d ago
Variational Inference via Entropic Transport Descent
arXiv:2606.25265v1 Announce Type: new Abstract: Particle-based variational inference (ParVI) methods approximate an intractable target distribution by evolving an ensemble of interacting samples. Existing approaches rely predominantly on kernel-based repulsion (e.g., SVGD),…
14 -
arXiv — Machine Learning research 5d ago
Inverse Reinforcement Learning for Interpretable Keystroke Biomarkers in Parkinson's Disease
arXiv:2606.25270v1 Announce Type: new Abstract: Keystroke dynamics have been explored extensively as a passive digital biomarker for Parkinson's disease (PD), typically by extracting summary statistics from typing timing and training a classifier to discriminate PD from healthy…
30 -
arXiv — Machine Learning research 5d ago
UC-Search: Risk-Aware Test-Time Search for Delayed Constrained Time-Series Control
arXiv:2606.25274v1 Announce Type: new Abstract: Time-series models are usually scored as forecasters, yet deployed systems often require delayed decisions under uncertainty and hard feasibility constraints. UC-Search is a model-agnostic test-time wrapper: a backbone emits…
29 -
arXiv — Machine Learning research 5d ago
EPTS: Elastic Post-Training Sparsity for Efficient Large Language Model Compression
arXiv:2606.25285v1 Announce Type: new Abstract: Post-Training Sparsity (PTS) has emerged as a crucial paradigm for compressing Large Language Models to facilitate efficient deployment on resource-constrained devices. However, existing PTS methodologies are typically confined to…
32 -
arXiv — Machine Learning research 5d ago
Communicability-Inspired Positional Encoding (CIPE)
arXiv:2606.25293v1 Announce Type: new Abstract: Positional encodings (PEs) are essential for Transformers. Yet designing effective PEs for non-Euclidean graphs remains challenging. Such encodings should ideally induce an Attention-Compatible Geometry for self-attention: not…
15 -
arXiv — Machine Learning research 5d ago
Lifelong In-Context Learning with Transformers Requires Parametric Forms of Attention
arXiv:2606.25342v1 Announce Type: new Abstract: Lifelong continual learning remains an obstacle on the path to human-like intelligence. Modern transformers show sparks of intelligence with in-context learning. The quadratic nature of attention, however, prohibits transformers…
36 -
arXiv — Machine Learning research 5d ago
Geometry-Anchored Transport Framework for Exemplar-Free Class-Incremental Learning
arXiv:2606.25347v1 Announce Type: new Abstract: Exemplar-free class-incremental learning (EFCIL) requires stable decision boundaries within a shifting feature space. While maintaining class-conditional Gaussian statistics provides a principled classification strategy, these…
16 -
arXiv — Machine Learning research 5d ago
Compositional Behavioral Semantics for State Abstraction in Reinforcement Learning
arXiv:2606.25357v1 Announce Type: new Abstract: State abstraction plays a key role in scaling reinforcement learning to complex but structured systems. In studying such systems, a wide range of behavioral structures have been studied in reinforcement learning, including value…
15 -
arXiv — Machine Learning research 5d ago
FactorLibrary: From Polynomials to Circuits via Recursive Subgoals
arXiv:2606.25394v1 Announce Type: new Abstract: Finding minimal arithmetic circuits for polynomials over finite fields is a combinatorially hard problem central to algebraic complexity theory. We formulate it as a reinforcement learning problem in two directions, bottom-up and…
30 -
arXiv — Machine Learning research 5d ago
DFMU: Data-Frugal Machine Unlearning
arXiv:2606.25410v1 Announce Type: new Abstract: Machine unlearning is an emerging domain that ensures the safe removal of elements (including concepts, attributes, entities, and classes) from a trained model with the least drop in model performance. The domain of machine…
6 -
arXiv — Machine Learning research 5d ago
Brevity is the Soul of Inference Efficiency: Inducing Concision in VLMs via Data Curation
arXiv:2606.25432v1 Announce Type: new Abstract: Inference efficiency is typically pursued by shrinking the model: distillation, pruning, quantization, and sparse routing each lower per-token cost while treating token count as fixed. But output length has been inflating, and it…
28 -
arXiv — Machine Learning research 5d ago
TopoCast: A Topological Fidelity Framework for Evaluating Transformer-Based Time Series Forecasting
arXiv:2606.25439v1 Announce Type: new Abstract: Deep learning-based models have achieved state-of-the-art performance in Time Series Forecasting (TSF), yet their evaluation remains dominated by pointwise error metrics such as Mean Squared Error (MSE), which quantify numerical…
37 -
arXiv — Machine Learning research 5d ago
The Interplay of Harness Design and Post-Training in LLM Agents
arXiv:2606.25447v1 Announce Type: new Abstract: Tool-integrated LLM agents are often wrapped within a harness: the scaffolding that determines which tools are exposed, how they are described, and what auxiliary information accompanies each per-step observation. While agents are…
15 -
arXiv — Machine Learning research 5d ago
The Generalization Spectrum: A Chromatographic Approach to Evaluating Learning Algorithms
arXiv:2606.25450v1 Announce Type: new Abstract: Traditional evaluations measure a learning algorithm's final performance on an i.i.d. test set, reducing learning to a single aggregate score. This approach obscures a fundamental question: to what extent does learning from a…
12 -
arXiv — Machine Learning research 5d ago
Learning with a Single Rollout via Monte Carlo Pass@k Critic
arXiv:2606.25451v1 Announce Type: new Abstract: Estimating token-level advantages in reinforcement learning (RL) for language models remains challenging because scaling up episodic experience collection is expensive. The difficulty intensifies for baseline advantage estimation…
12 -
arXiv — Machine Learning research 5d ago
Towards Robust EEG Decoding Based on Riemannian Self-Attention
arXiv:2606.25456v1 Announce Type: new Abstract: Brain-Computer Interfaces (BCIs) based on electroencephalography (EEG) enable direct interaction between the brain and external environments and have significant applications in assistive technologies, medical rehabilitation, and…
32 -
arXiv — Machine Learning research 5d ago
Distill on a Diet: Efficient Knowledge Distillation via Learnable Data Pruning
arXiv:2606.25488v1 Announce Type: new Abstract: Knowledge Distillation (KD) is widely used to obtain compact models for efficient inference in resource-constrained environments. Yet the computational overhead of the distillation process itself is often overlooked, raising the…
19 -
arXiv — Machine Learning research 5d ago
Beyond One-Size-Fits-All: Diagnosis-Driven Online Reinforcement Learning with Offline Priors
arXiv:2606.25527v1 Announce Type: new Abstract: Online reinforcement learning (RL) agents increasingly depend on knowledge acquired offline to achieve practical efficiency. Originally studied in offline-to-online RL, this paradigm now spans foundation model post-training and…
27 -
arXiv — Machine Learning research 5d ago
Leaking Circuit Secrets: Gradient Leakage Attacks on Graph Neural Networks
arXiv:2606.25589v1 Announce Type: new Abstract: As graph neural networks (GNNs) become standard tools for critical tasks in circuit design and analysis, their security and privacy risks require careful attention. Here, we present the first comprehensive evaluation of gradient…
20 -
arXiv — Machine Learning research 5d ago
Low-Complexity Policy Tessellations in Structured Markov Decision Processes
arXiv:2606.25593v1 Announce Type: new Abstract: We study optimal-policy geometry in structured Markov decision processes. While approximate dynamic programming and reinforcement learning typically approximate high-dimensional value functions, we show that optimal policies induce…
32 -
arXiv — Machine Learning research 5d ago
TL++: Accuracy and Privacy Preserving Traversal Learning for Distributed Intelligent Systems
arXiv:2606.25627v1 Announce Type: new Abstract: Distributed intelligent systems increasingly need to train across data silos without centralizing raw data. Federated learning keeps data local but can suffer under heterogeneous partitions and requires repeated full-model…
22 -
arXiv — Machine Learning research 5d ago
Learning Subset-Shared Invariances for Domain Generalization with Mixture-of-Experts
arXiv:2606.25665v1 Announce Type: new Abstract: Domain generalization (DG) aims to learn a model from one or more source domains that generalizes to an unseen target domain without accessing target data during training. A common approach enforces invariance of representations…
29