Tag

Research papers

500 articles archived under #paper

arXiv — Machine Learning research 4d ago

Rethinking Training & Inference for Forecasting: Linking Winner-Take-All back to GMMs

arXiv:2606.26424v1 Announce Type: new Abstract: Trajectory forecasting for autonomous driving has advanced rapidly, yet representative models often produce uninformative posteriors over forecast modes, causing problems for mode pruning. We trace this to a modeling-training…

37
arXiv — NLP / Computation & Language research 4d ago

DualEval: Joint Model-Item Calibration for Unified LLM Evaluation

arXiv:2606.26429v1 Announce Type: cross Abstract: Current LLM evaluation relies on two complementary but often disconnected signals: static benchmarks with objective correctness labels and arena-style preference data that better reflect open-ended user interactions. We introduce…

24
arXiv — Machine Learning research 4d ago

Embedding Foundation Model Predictions in Discrete-Choice Models with Structural Guarantees

arXiv:2606.26432v1 Announce Type: new Abstract: Tabular foundation models achieve strong accuracy on choice prediction tasks, but their predictions often violate the economic logic those tasks require: raising a price can increase predicted demand, implied willingness-to-pay…

36
arXiv — Machine Learning research 4d ago

Optimizing CUDA like a Human: Micro-Profiling Tools as Expert Surrogates for LLM-Based GPU Kernel Optimization

arXiv:2606.26453v1 Announce Type: new Abstract: We present KernelPro, a closed-loop multi-agent system that automatically generates, profiles, and iteratively optimizes GPU kernel code by integrating large language model (LLM) code generation with hardware profiler feedback and…

21
arXiv — Machine Learning research 4d ago

Finding the Time to Think: Learning Planning Budgets in Real-Time RL

arXiv:2606.26463v1 Announce Type: new Abstract: Deliberating takes time. In real-time settings, that time is not free. Standard reinforcement learning (RL) sidesteps this as the environment waits indefinitely for the agent's decision. Instead, we study real-time RL environments…

38
arXiv — Machine Learning research 4d ago

A Causal Foundation Model for Structure and Outcome Prediction

arXiv:2606.26467v1 Announce Type: new Abstract: We introduce TabPFN-CFM, a causal foundation model that can handle multiple causal problems. TabPFN-CFM predicts both causal structure and outcomes from observational data, supports queries on all three levels of Pearl's Causal…

9
arXiv — NLP / Computation & Language research 4d ago

Epiphany-Aware KV Cache Eviction Without the Attention Matrix

arXiv:2606.26472v1 Announce Type: cross Abstract: As reasoning models emit chains of thought tens of thousands of tokens long, KV cache increasingly becomes a deployment bottleneck. Existing cache eviction methods rank tokens by attention weight, which is a noisy importance…

21
arXiv — Machine Learning research 4d ago

When Does Quality-Aware Multimodal Fusion Matter? A Leakage-Safe Diagnostic for Decision-Level Dependence

arXiv:2606.26473v1 Announce Type: new Abstract: Many multimodal systems estimate the reliability of each modality and weight their contributions to the final prediction. However, it remains unclear whether these scores influence model decisions or merely correlate with…

20
arXiv — Machine Learning research 4d ago

Localizing RL-Induced Tool Use to a Single Crosscoder Feature

arXiv:2606.26474v1 Announce Type: new Abstract: Fine-tuning through RL reshapes the internal representations of language models to enable agentic behaviors such as tool use, yet the mechanistic basis of these changes remains poorly understood. While RL substantially improves…

4
arXiv — Machine Learning research 4d ago

Retrieval-Warmed Energy-Based Reasoning: A Five-Arm Ablation Methodology for Diffusion-as-Inference on Structured Reasoning Tasks

arXiv:2606.26476v1 Announce Type: new Abstract: Warm-started diffusion samplers accelerate iterative inference, but it is rarely clear which part of the pipeline carries the gain. We study retrieval-warmed energy-based reasoning (RW-EBR), an IRED energy-based…

9
arXiv — Machine Learning research 4d ago

What Survives When You Compress a Recursive Reasoner for the Edge?

arXiv:2606.26488v1 Announce Type: new Abstract: Recursive reasoning models can solve complex structured tasks with only a few million parameters by repeatedly updating a latent state. Deploying these models on edge hardware requires significant compression, but unlike…

30
arXiv — Machine Learning research 4d ago

Learning Probabilistic Filters with Strictly Proper Scoring Rules

arXiv:2606.26497v1 Announce Type: new Abstract: Bayesian filtering of partially and noisily observed dynamical systems seeks to infer the evolving conditional distribution of the state of a dynamical system, given observations, in an online fashion. This Bayesian filtering…

7
arXiv — Machine Learning research 4d ago

Multipath Adaptive Gated Bottleneck Latent ODE with Raman Data Fusion for Cell Culture Process Forecasting

arXiv:2606.26520v1 Announce Type: new Abstract: Mammalian cell-culture processes underpin the manufacture of many biopharmaceuticals, yet keeping a run on track is hard: critical process parameters drift over days, and an off-specification trend is often confirmed too late to…

4
arXiv — Machine Learning research 4d ago

Theory-Scale Auto-Formalization of Logics for Computer Science

arXiv:2606.26525v1 Announce Type: new Abstract: Auto-formalization is critical for scalable formal verification, but existing progress largely focuses on isolated statements, while theory-scale auto-formalization, which coherently translates hundreds of interdependent…

8
arXiv — Machine Learning research 4d ago

Sample-efficient Transfer Reinforcement Learning via Adaptive Reward Shaping and Policy-Ratio Reweighting Strategy

arXiv:2606.26527v1 Announce Type: new Abstract: Transfer learning improves policy learning efficiency by reusing knowledge from source tasks, providing a feasible paradigm for safe and efficient autonomous highway lane changing decision-making. Existing methods frequently…

25
arXiv — Machine Learning research 4d ago

CascadeFormer: Depth-Tapered Transformers Motivated by Gradient Fan-in Asymmetry

arXiv:2606.26538v1 Announce Type: new Abstract: Deep Transformers are composed of uniformly stacked residual blocks, yet their deepest layers often add little value. We present two efficiency methods that exploit this asymmetry. CascadeFormer tapers width with depth to match the…

31
arXiv — Machine Learning research 4d ago

Can Large Language Models Reliably Code Qualitative Humanitarian Data? A Benchmark Study Against Human Expert Adjudication

arXiv:2606.26541v1 Announce Type: new Abstract: Data from affected populations are crucial for informing humanitarian response, but their value depends on timely and consistent interpretation of nuanced accounts of need. Humanitarian organizations often lack the staff, time, and…

4
arXiv — Machine Learning research 4d ago

Revisiting Action Factorization for Complex Action Spaces

arXiv:2606.26574v1 Announce Type: new Abstract: Many real-world control problems involve hybrid discrete-continuous action spaces. For example, steering and signaling in autonomous driving, and aiming and firing in robotics or video-games. Despite real-world hybrid factorization…

10
arXiv — Machine Learning research 4d ago

SharQ: Bridging Activation Sparsity and FP4 Quantization for LLM Inference

arXiv:2606.26587v1 Announce Type: new Abstract: Low-bit floating-point formats and semi-structured sparsity are increasingly supported by modern accelerators, yet combining them for LLM activation compression remains challenging: activations contain input-dependent outliers that…

29
arXiv — Machine Learning research 4d ago

Empirical Software Engineering TerraProbe: A Layered-Oracle Framework for Detecting Deceptive Fixes in LLM-Assisted Terraform

arXiv:2606.26590v1 Announce Type: new Abstract: Security misconfigurations in Terraform Infrastructure-as-Code are a growing risk in cloud deployments, and large language models are increasingly used as automated repair agents. Existing evaluations often treat a repair as…

5
arXiv — Machine Learning research 4d ago

Sketched Linear Contrastive Learning: Approximation, Optimization, and Statistical Scaling

arXiv:2606.26617v1 Announce Type: new Abstract: Scaling laws describe how learning performance varies with model size, data size, and compute. While recent theoretical work has established scaling laws for sketched linear regression, much less is understood for contrastive…

25
arXiv — Machine Learning research 4d ago

Discovering Millions of Interpretable Features with Sparse Autoencoders

arXiv:2606.26620v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) have emerged as a powerful tool for decomposing superposed language model representations into sparse and interpretable features. However, training SAEs is computationally expensive, and available…

5
arXiv — NLP / Computation & Language research 4d ago

From Weights to Features: SAE-Guided Activation Regularization for LLM Continual Learning

arXiv:2606.26629v1 Announce Type: cross Abstract: Weight-space regularization methods such as Elastic Weight Consolidation (EWC) are the standard approach to catastrophic forgetting in continual learning. However, those methods tend to underperform when applied to large language…

15
arXiv — Machine Learning research 4d ago

Target-Aware Bandit Allocation for Scalable Surrogate Optimization in Chemical Space

arXiv:2606.26657v1 Announce Type: new Abstract: Identifying high-utility candidates from massive discrete spaces under expensive evaluations is a recurring challenge across the sciences, with structure-based drug discovery as a prominent example. While surrogate-based…

20
arXiv — Machine Learning research 4d ago

Zero-Shot Size Transfer for Neural ODEs on Sparse Random Graphs: Graphon Limits and Adjoint Convergence

arXiv:2606.26662v1 Announce Type: new Abstract: Graph Neural Differential Equations (GNDEs) model continuous-time graph dynamics by parameterizing Neural ODE velocity fields with Graph Neural Networks. Their local, size-independent filters suggest a zero-shot size-transfer…

24
arXiv — Machine Learning research 4d ago

PersistentKV: Page-Aware Decode Scheduling for Long-Context LLM Serving on Commodity GPUs

arXiv:2606.26666v1 Announce Type: new Abstract: Autoregressive large language model (LLM) serving is increasingly limited by key-value (KV) cache movement rather than dense matrix multiplication. Modern paged-attention systems reduce KV-cache fragmentation and mature kernels…

20
arXiv — Machine Learning research 4d ago

Algorithmic Foundations of Deep Learning: Complexity-Theoretic Rates and a Characterization of Universal Approximation

arXiv:2606.26705v1 Announce Type: new Abstract: Feedforward neural network (NN) expressivity is typically studied by emulating optimal basis-expansion schemes. While powerful, this perspective is incomplete: it primarily captures complexity through regularity, and therefore does…

37
arXiv — NLP / Computation & Language research 4d ago

HyperDFlash: MHC-Aligned Block Speculative Decoding with Gated Residual Reduction

arXiv:2606.26744v1 Announce Type: cross Abstract: We present HyperDFlash, a block-parallel speculative decoding framework tailored to the novel multi-hyper-connection (MHC) architecture proposed by DeepSeek-V4. Despite the strong initial-token drafting performance of the native…

10
arXiv — NLP / Computation & Language research 4d ago

Structure Before Collapse: Transient semantic geometry in next-token prediction

arXiv:2606.26749v1 Announce Type: cross Abstract: Neural Collapse predicts that balanced one-hot classification pushes model representations to be equally far from each other; a symmetric configuration that depends only on the output label and ignores any semantic similarity in…

29
arXiv — Machine Learning research 4d ago

Batch-Invariant Spectral Intelligence for Robust and Explainable Insect Authentication

arXiv:2606.26757v1 Announce Type: new Abstract: Edible insects offer an efficient source of alternative protein, requiring less land, water and emitting less greenhouse gas than conventional livestock. However, their successful integration into the food supply chain demands…

22
arXiv — Machine Learning research 4d ago

Escaping Iterative Parameter-Space Noise: Differentially Private Learning with a Hypernetwork

arXiv:2606.26772v1 Announce Type: new Abstract: Differentially private (DP) training of neural networks is often hindered by the large amount of noise required by gradient-based methods such as DP-SGD, which repeatedly inject high-dimensional noise in parameter space throughout…

20
arXiv — NLP / Computation & Language research 4d ago

Reproducibility Study of "AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models"

arXiv:2606.26783v1 Announce Type: cross Abstract: Fang et al. (2025) introduced a null-space constrained projection, named AlphaEdit, for locate-then-edit knowledge editing methods, theoretically guaranteeing that edits do not disrupt previously preserved knowledge, and reported…

20
arXiv — NLP / Computation & Language research 4d ago

AIGP: An LLM-Based Framework for Long-Term Value Alignment in E-Commerce Pricing

arXiv:2606.26787v1 Announce Type: cross Abstract: Traditional dynamic pricing models in large-scale e-commerce suffer from limited interpretability, poor utilization of unstructured information, and misalignment with long-term business objectives such as cumulative Gross…

26
arXiv — Machine Learning research 4d ago

Reasoning Quality Emerges Early: Data Curation for Reasoning Models

arXiv:2606.26797v1 Announce Type: new Abstract: Supervised fine-tuning (SFT) on a small, high-quality set of long reasoning traces is an effective approach for eliciting strong reasoning capabilities in Large Language Models (LLMs). However, existing methods for curating…

14
arXiv — Machine Learning research 4d ago

Quantization in Federated Learning: Methods, Challenges and Future Directions

arXiv:2606.26822v1 Announce Type: new Abstract: Federated Learning (FL) has become a foundational paradigm for privacy-preserving distributed intelligence, yet its scalability remains fundamentally constrained by communication bottlenecks, device heterogeneity, and the…

20
arXiv — Machine Learning research 4d ago

Asymptotically Optimal Learning for Parametric Prophet Inequalities

arXiv:2606.26893v1 Announce Type: new Abstract: We study learning in prophet inequalities with i.i.d. rewards drawn from an exponential-type parametric family with an unknown parameter $\theta$, a class that includes exponential, Pareto, and bounded-support power-family…

32
arXiv — Machine Learning research 4d ago

GEOALIGN: Geometric Rollout Curation for Robust LLM Reinforcement Learning

arXiv:2606.26917v1 Announce Type: new Abstract: Online reinforcement learning is widely used to align large language models (LLMs) with reward signals, yet training can be unstable under noisy or misspecified rewards. We identify a failure mode we call directional inconsistency:…

26
arXiv — Machine Learning research 4d ago

Decision-Aligned Evaluation of Uncertainty Quantification

arXiv:2606.26990v1 Announce Type: new Abstract: Uncertainty estimates in machine learning are typically evaluated using generic metrics such as the negative log-likelihood and expected calibration error, yet good performance on such metrics does not necessarily imply high…

13
arXiv — Machine Learning research 4d ago

Uncertainty quantification via conformal prediction in data assimilation

arXiv:2606.27001v1 Announce Type: new Abstract: Quantifying the evolution of uncertainty is critical to both probabilistic forecasting and data assimilation in numerical weather prediction. In this study, we investigate the applicability of conformal prediction (CP), a recent…

30
arXiv — Machine Learning research 4d ago

A Generalization Theory for JEPA-Based World Models

arXiv:2606.27014v1 Announce Type: new Abstract: Joint Embedding Predictive Architectures (JEPAs) have recently emerged as a promising paradigm for world modeling by learning predictive dynamics in a latent space rather than generating future observations at the input level.…

5
arXiv — NLP / Computation & Language research 4d ago

Just how sure are you? Improving Verbalized Uncertainty Calibration in Medical VQA

arXiv:2606.27023v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) applied to Medical Visual Question Answering (VQA) tend to produce overconfident outputs regardless of actual correctness, and existing verbalized confidence calibration methods, developed…

15
arXiv — Machine Learning research 4d ago

Symplectic Neural Networks for learning Generalized Hamiltonians

arXiv:2606.27029v1 Announce Type: new Abstract: Hamiltonian Neural Networks (HNNs) integrate physical priors into neural models by learning a system's Hamiltonian, improving generalization and sample efficiency. Identifying the system Hamiltonian from noisy observations of state…

9
arXiv — Machine Learning research 4d ago

State Representation Matters in Deep Reinforcement Learning: Application to Energy Trading

arXiv:2606.27032v1 Announce Type: new Abstract: Energy trading decisions depend not only on current market prices, but also on expected future market conditions, and operational constraints. This makes the state representation given to a reinforcement learning agent an important…

5
arXiv — Machine Learning research 4d ago

Finding Stationary Points by Comparisons

arXiv:2606.27082v1 Announce Type: new Abstract: We study the problem of finding stationary points of non-convex functions when access to the objective is provided only through a comparison oracle that, given two points, outputs which has the larger function value. For a twice…

17
arXiv — Machine Learning research 4d ago

Data-Free Reservoir Features for Efficient Long-Horizon Cold-Start Continual Learning

arXiv:2606.27095v1 Announce Type: new Abstract: Cold-start exemplar-free class-incremental learning requires learning a growing set of classes without replay, external pretraining, or a large initial task. Existing cold-start methods typically either train the backbone…

19
arXiv — Machine Learning research 4d ago

Transformer-Based Classification of Bacterial Raman Spectra with LOOCV

arXiv:2606.27096v1 Announce Type: new Abstract: Transformer-based models have recently attracted increasing attention for Raman spectral classification. In this study, a transformer-based approach was systematically evaluated using a nested leave-one-replicate-out…

31
arXiv — Machine Learning research 4d ago

Heavy-Ball Q-Learning with Residual Weighting Correction

arXiv:2606.27112v1 Announce Type: new Abstract: This paper proposes a corrected heavy-ball Q-learning method for reinforcement learning (RL) and establishes its convergence. It also identifies conditions under which the method is theoretically guaranteed to converge faster than…

31
arXiv — Machine Learning research 4d ago

Cross-Head Attention Uplift Network with Inverse Propensity Score under Unobserved Confounding

arXiv:2606.27114v1 Announce Type: new Abstract: Uplift modeling, crucial for estimating individual treatment effects (ITE), faces dual challenges: flexibly leveraging inter-group similarity to enhance discriminative power and debiasing under unobserved confounding scenarios. In…

19
arXiv — Machine Learning research 4d ago

Kolmogorov Arnold networks (KAN) for aerodynamic prediction: a comparison with MLPs and GNNs

arXiv:2606.27126v1 Announce Type: new Abstract: Kolmogorov Arnold networks (KAN) have recently been introduced as a (deep) neural network architecture whose trainable parameters adapt the activation functions, instead of the coefficients of the affine transformations at the core…

27
arXiv — Machine Learning research 4d ago

fTNN: a tensor neural network for fractional PDEs

arXiv:2606.27140v1 Announce Type: new Abstract: We develop the fTNN, a deterministic tensor neural network subspace method for problems involving the fractional Laplacian on bounded domains, taking the fractional Poisson equation and time-dependent fractional advection-diffusion…

21
