Research papers 500 articles archived under #paper Hugging Face official-blog 15h ago DiScoFormer: One transformer for density and score, across distributions Enterprise Article Published June 29, 2026 Kyle Wiggers Ai2Comms allenai 📄 Tech report: arxiv.org/abs/2511.05924 Many problems in machine learning and the sciences… 8 arXiv — Machine Learning research 1d ago OverFlowLight: Real-Time Gridlock Prevention and Traffic Signal Optimization for Urban Intersections arXiv:2606.27381v1 Announce Type: new Abstract: Queue overflow, a severe consequence of urban traffic congestion, occurs when vehicle queues exceed intersection capacity, obstructing upstream traffic and triggering cascading gridlocks. Prevailing traffic signal control (TSC)… 16 arXiv — Machine Learning research 1d ago RANSAC Scoring Done Right arXiv:2606.27385v1 Announce Type: new Abstract: The most widely used RANSAC variants score candidate models by counting inliers or summing per-point scores that saturate beyond a residual threshold. Every such score requires a user-supplied parameter that is a function of the… 27 arXiv — Machine Learning research 1d ago Unified Zero-Shot Time Series Forecasting: A Darts Foundation arXiv:2606.27438v1 Announce Type: new Abstract: Since its initial release in 2020, Darts has become a widely used open-source Python library for time series analysis. A series of foundation models have recently claimed accuracy improvements in zero-shot forecasting, promising a… 15 arXiv — Machine Learning research 1d ago PairSAE: Mechanistic Interpretability from Pair Representations in Protein Co-Folding arXiv:2606.27440v1 Announce Type: new Abstract: Foundation models for structural biology have achieved remarkable performance in predicting biomolecular structure and show promise for the design of proteins and small molecules.
Yet understanding which internal features drive… 29 arXiv — Machine Learning research 1d ago Learning in Markovian bandits with non-observable states and constrained decision epochs arXiv:2606.27448v1 Announce Type: new Abstract: This paper studies the problem of regret minimization in Markovian bandits with non-observable states and possibly constrained decision epochs. The focus is restricted to a "pure" regret benchmark, that compares the… 26 arXiv — Machine Learning research 1d ago Prism Transformer: Progressive Head Schedules for Hierarchical Attention Processing arXiv:2606.27449v1 Announce Type: new Abstract: Multi-head attention conventionally partitions the hidden dimension equally across all heads at every layer, enforcing an identical representational subspace dimension ($d_h = d_{\mathrm{model}}/h$) throughout the model's depth. In this work, we… 19 arXiv — Machine Learning research 1d ago Operator Learning for Cubic Nonlinear Schrödinger Equation on Periodic Domains arXiv:2606.27459v1 Announce Type: new Abstract: We consider the cubic nonlinear Schrödinger (NLS) equation on two-dimensional flat tori with varying aspect ratios. In this formulation, the choice of aspect ratio governs the Fourier resonance structure, so rational and… 35 arXiv — NLP / Computation & Language research 1d ago The Curse of Multiple Mediators: Hidden Interaction Effects in Activation Patching arXiv:2606.27510v1 Announce Type: cross Abstract: Activation patching is the primary tool in mechanistic interpretability. It attributes causal responsibility for a model behavior to each of its individual components by estimating its natural indirect effect (NIE).
Re-deriving… 4 arXiv — Machine Learning research 1d ago Boundary condition fidelity for bottom-hole pressure and CO2 plume prediction in geological carbon storage arXiv:2606.27515v1 Announce Type: new Abstract: Accurate prediction of bottom-hole pressure (BHP) and CO2 plume migration is essential for safe geological carbon storage, yet practical simulations often rely on truncated domains where artificial boundaries distort pressure… 32 arXiv — Machine Learning research 1d ago Productionized Fairness Measurement Under Privacy Constraints arXiv:2606.27558v1 Announce Type: new Abstract: Fairness measurements in the form of disaggregated evaluations often rely on demographic signals that are legally constrained or culturally sensitive. Race and ethnicity signals are among the more difficult signals to curate and… 34 arXiv — Machine Learning research 1d ago Quantum Generative Diffusion Model for Real-World Time Series arXiv:2606.27561v1 Announce Type: new Abstract: Generative models have achieved remarkable success in data synthesis, though recent advances driven by increasing model scale have introduced challenges in computational cost and efficiency. 
Quantum machine learning offers a… 10 arXiv — Machine Learning research 1d ago hia-gat: A Heterogeneous Interaction-Aware Graph Attention Network For Frame-Level Traffic Conflict Risk Prediction On Freeways arXiv:2606.27577v1 Announce Type: new Abstract: This paper formulates frame-level freeway risk assessment as a multi-agent scene graph-level binary classification problem, where each video or trajectory frame is labeled risky if any TTC- or PET-based conflict violates a… 16 arXiv — Machine Learning research 1d ago PEBS: Per-rater Empirical-Bayes Shrinkage for RLHF Reward-Model Calibration arXiv:2606.27578v1 Announce Type: new Abstract: Reward models for Reinforcement Learning from Human Feedback (RLHF) pool preferences across thousands of annotators and fit one global affine calibrator, collapsing raters with systematically different rating-scale offsets and… 36 arXiv — Machine Learning research 1d ago Retroactive Advantage Correction: Closed-Form V-Trace Bias Correction for Delay-Aware RLHF arXiv:2606.27580v1 Announce Type: new Abstract: Reinforcement learning from human feedback (RLHF) in production does not always have a synchronous reward signal. Code-execution verifiers, slow judge ensembles, and queued human review can return several gradient steps after the… 14 arXiv — Machine Learning research 1d ago Global Explanations for Multivariate Time Series Forecasting Models via $K$-Order Markov Approximations arXiv:2606.27599v1 Announce Type: new Abstract: While many explainable AI (XAI) methods have been proposed, most are not designed for time-series forecasting models and often rely on the implicit assumption that timestamp features are independent. This assumption ignores the… 6 arXiv — Machine Learning research 1d ago Training Observable Control Policies to Expose Agent State Through Actions arXiv:2606.27609v1 Announce Type: new Abstract: Physical or operational constraints often impose communications limitations on autonomous agents. 
Such limitations complicate monitoring or multiagent coordination. Even when strong communications are absent, some information may… 13 arXiv — Machine Learning research 1d ago COOPA: A Modular LLM Agent Architecture for Operations Research Problems arXiv:2606.27611v1 Announce Type: new Abstract: Operations Research (OR) provides a rigorous framework for high-stakes decision-making, but effective OR modeling requires substantial domain knowledge, mathematical abstraction, and solver expertise. Recent LLM-based systems… 18 arXiv — Machine Learning research 1d ago FoggyTrust: Robust Federated Learning with Hierarchical Trust Networks arXiv:2606.27622v1 Announce Type: new Abstract: Byzantine-robust federated learning seeks to protect distributed model training from malicious or corrupted clients without requiring access to their private data. FLTrust addresses this challenge by introducing a trusted… 33 arXiv — Machine Learning research 1d ago HybridCodec: Modeling Discrete and Continuous Representations for Efficient Speech Language Models arXiv:2606.27627v1 Announce Type: new Abstract: Discrete audio representations have become increasingly popular for building multimodal text-audio systems and integrating audio capabilities into Large Language Models (LLMs). However, numerous studies report performance… 7 arXiv — Machine Learning research 1d ago Continual Learning for Sequential Personalization of Small Language Models: A Stability Monitoring Analysis arXiv:2606.27634v1 Announce Type: new Abstract: Small Language Models (SLMs) are increasingly being considered for deployment on edge devices such as laptops, enabling private, low-latency, and locally personalized applications. 
However, personalization requires models to adapt… 30 arXiv — Machine Learning research 1d ago TeRoR: Decoupled Temporal Rotation with Relational Circular Region for Temporal Knowledge Graph Embedding arXiv:2606.27651v1 Announce Type: new Abstract: In recent years, with the emergence of Temporal Knowledge Graphs (TKGs), research on learning entity and relation representations in TKGs has attracted increasing attention, giving rise to a large number of TKG embedding methods.… 35 arXiv — Machine Learning research 1d ago Are Time-Series Foundation Models Ready for E-Nose Data? An Empirical Assessment of Their Embeddings arXiv:2606.27672v1 Announce Type: new Abstract: Inspired by advances in natural language processing and computer vision, "time-series foundation models" (TSFMs) have recently been introduced with the promise of strong generalization across diverse time-series tasks, including… 5 arXiv — NLP / Computation & Language research 1d ago Textual Belief States for World Models: Identifiable Representation Learning Under Strict Mediation arXiv:2606.27681v1 Announce Type: cross Abstract: World models in partially observed environments rely on latent representations that summarize interaction history, but in many modern LLM-based architectures predictive performance fails to reflect representation quality due to… 11 arXiv — Machine Learning research 1d ago CBD: API-Only LLM Black-Box Unlearning through Controlled Behavioral Divergence arXiv:2606.27683v1 Announce Type: new Abstract: Edge devices increasingly invoke large language models (LLMs) through API services for context aware edge intelligence, while edge generated data may be collected to improve LLMs and may introduce sensitive, copyrighted, harmful,… 11 arXiv — Machine Learning research 1d ago Deployment-Side Adaptiveness in Multi-Horizon Volatility Forecasting arXiv:2606.27688v1 Announce Type: new Abstract: In financial forecasting, predictive performance depends not only on which model is trained, but 
also on how the trained model is deployed. We study this issue in multi-horizon volatility forecasting. Our starting point is that a… 27 arXiv — Machine Learning research 1d ago Halt Fast! Early Stopping for Certified Robustness arXiv:2606.27694v1 Announce Type: new Abstract: Randomized Smoothing (RS) provides rigorous robustness guarantees for neural networks without architectural constraints, yet its adoption is limited by extreme computational costs. Standard RS requires tens of thousands of model… 15 arXiv — Machine Learning research 1d ago Class-frequency Guided Noise Schedule for Diffusion Models arXiv:2606.27696v1 Announce Type: new Abstract: In this paper, we are the first to examine the correlations between class frequency and the multi-scale noise schedule within diffusion models. For score-based generative models, low-density regions often lead to inaccurately… 26 arXiv — Machine Learning research 1d ago What Was That Again? Certified Robustness for Automatic Speech Recognition arXiv:2606.27698v1 Announce Type: new Abstract: Automatic Speech Recognition systems are notoriously sensitive to both adversarial and benign perturbations. While this has been repeatedly demonstrated using reference datasets, detecting such behaviors in deployed systems is… 19 arXiv — Machine Learning research 1d ago The Simulacrum: Decision-Theoretic Pretraining for Near-Optimal Time-Series Forecasting and Inference arXiv:2606.27711v1 Announce Type: new Abstract: We introduce a neural network-based framework for learning time series estimators through a process we term decision-theoretic pretraining. Analysts specify a generative world, a distribution over data-generating processes, and a… 36 arXiv — Machine Learning research 1d ago Aurora: A Leverage-Aware Spectral Optimizer arXiv:2606.27715v1 Announce Type: new Abstract: We show that for tall matrix parameters, like projection matrices in the MLP layers, the Muon update can have row norms that are arbitrarily non-uniform.
This can lead to a self-reinforcing feedback loop whereby neurons receive… 13 arXiv — Machine Learning research 1d ago Learning to Reason with Curriculum II: Compositional Generalization arXiv:2606.27721v1 Announce Type: new Abstract: Compositional generalization, the ability to solve complex problems by combining solutions to simpler sub-problems, is a fundamental capability of both natural and artificial intelligence, and a key mechanism underlying… 25 arXiv — Machine Learning research 1d ago Reduction of Probabilistic Chemical Reaction Networks arXiv:2606.27737v1 Announce Type: new Abstract: Programming adaptive behaviors at the cellular level is a long-standing goal that raises the question of how probabilistic computation can be implemented in biochemical systems. Chemical reaction networks (CRNs) provide such a… 5 arXiv — Machine Learning research 1d ago The Weakest Link Tells It All: Outcome-Supervised Process Reward Modeling via Learnable Credit Assignment arXiv:2606.27739v1 Announce Type: new Abstract: Process reward models (PRMs) enhance the reasoning capabilities of large language models (LLMs) by providing fine-grained feedback, yet training PRMs typically requires expensive stepwise annotations. Outcome-supervised PRMs offer… 18 arXiv — Machine Learning research 1d ago Flexformer: Flexible Linear Transformer with Learnable Attention Kernel arXiv:2606.27748v1 Announce Type: new Abstract: Transformer models rely on the attention mechanism to capture long-range dependencies but suffer from quadratic complexity, limiting their scalability to long sequences. Kernel-based linear attention reduces this complexity but… 8 arXiv — Machine Learning research 1d ago PerturbCellRL: Verifier-Guided Reinforcement Learning for Single-Cell Perturbation Prediction arXiv:2606.27752v1 Announce Type: new Abstract: Single-cell perturbation models can reduce costly wet-lab screening by predicting how cells respond transcriptionally to interventions.
While recent generative models improve population-level prediction, individual generated cells… 33 arXiv — Machine Learning research 1d ago Layerwise Progressive Freezing: A Training Scaffold for Depth-Scalable Binary Networks arXiv:2606.27759v1 Announce Type: new Abstract: Training binary neural networks (BNNs) from scratch is dominated by the straight-through estimator (STE), whose forward/backward mismatch produces severe accuracy degradation as networks deepen. We study an orthogonal axis: when… 12 arXiv — Machine Learning research 1d ago RS-Diffuser: Risk-Sensitive Diffusion Planning with Distributional Value Guidance arXiv:2606.27766v1 Announce Type: new Abstract: Offline reinforcement learning enables policy learning from fixed datasets without additional environment interaction, making it appealing for safety-critical applications where online exploration is costly or unsafe.… 32 arXiv — Machine Learning research 1d ago Difference of Convex Programming in the Wasserstein Space with Applications to MMD Optimization arXiv:2606.27767v1 Announce Type: new Abstract: Optimizing functionals over the space of probability measures is now ubiquitous in machine learning. A widely used approach is to perform the optimization directly over the Wasserstein space, but many objective functionals of… 35 arXiv — Machine Learning research 1d ago NormGuard: Reward-Preserving Norm Constraints in Flow-Matching Reinforcement Learning arXiv:2606.27771v1 Announce Type: new Abstract: Reinforcement learning (RL) post-training improves the reward alignment of flow-based generators, but often degrades perceptual quality in ways that are not captured by the reward proxy. 
We identify a simple structural signature of… 8 arXiv — Machine Learning research 1d ago Accelerating Hierarchical Sparse Predictive Coding with Hybrid Amortized Inference arXiv:2606.27802v1 Announce Type: new Abstract: Hierarchical predictive coding provides an interpretable framework for perception as error-driven inference in multi-layer generative models, while sparse coding imposes parsimonious latent representations through explicit sparsity… 19 arXiv — Machine Learning research 1d ago Pepti-drift: Toxicity-Repulsive Drifting for Antigen-Conditioned Discrete Peptide Generation arXiv:2606.27824v1 Announce Type: new Abstract: Peptides are a promising therapeutic modality that combines the chemical tunability of small molecules with the target specificity of macromolecular therapeutics. However, designing antigen-specific binding peptides while avoiding… 12 arXiv — Machine Learning research 1d ago USAD: Uncertainty-aware Statistical Adversarial Detection arXiv:2606.27832v1 Announce Type: new Abstract: Statistical adversarial detection (SAD) treats detection as a two-sample test.
Given a reference set of clean examples (CEs) and a batch of queries, potentially containing an unknown mixture of CEs and adversarial examples (AEs),… 5 arXiv — Machine Learning research 1d ago WattLayer: Get Layers Right to Estimate Inference Energy of Neural Networks arXiv:2606.27841v1 Announce Type: new Abstract: The widespread adoption of Artificial Intelligence (AI) has led to increasing concerns about energy consumption, yet there is a lack of standardized methodologies to accurately estimate AI inference energy consumption, particularly… 15 arXiv — Machine Learning research 1d ago Applicability of memorization indicators for early spotting of overfitting while recalibrating sEMG-decoders on low sample sizes arXiv:2606.27855v1 Announce Type: new Abstract: Deep learning models for surface electromyography (sEMG) can benefit substantially from subject-specific (re-)calibration, since no sufficiently large and diverse datasets are available to train fully generic decoders. However, for… 32 arXiv — Machine Learning research 1d ago GNBAN: Graph Neural Basis Attention Networks for Long-Horizon Forecasting over Large Entity Sets arXiv:2606.27863v1 Announce Type: new Abstract: Demand forecasting at the bottom of a retail hierarchy requires predicting tens of thousands of correlated long-horizon series across products, stores, and regions. Modern systems must scale across massive catalogs, capture shared… 33 arXiv — Machine Learning research 1d ago FlexMoE: One-for-All Nested Intra-Expert Pruning for MoE Language Models arXiv:2606.27866v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) language models scale model ability with sparsely activated experts, making this architecture a standard recipe for modern large models. 
However, sparse activation does not remove the deployment burden of… 24 arXiv — Machine Learning research 1d ago A Comparison of Fusion Techniques for Multi-Modal Human Activity Recognition on the HARMES Dataset arXiv:2606.27886v1 Announce Type: new Abstract: Recent advances in Human Activity Recognition (HAR) from wearable sensors have shown that multi-modal deep learning models consistently outperform their uni-modal counterparts. Modalities can include IMUs, RGB cameras, audio… 27 arXiv — Machine Learning research 1d ago TA-SparseMG: Trend-Aware Sparse Forecasting via Multi-Scale Gating for Long-Term Time Series arXiv:2606.27908v1 Announce Type: new Abstract: Long-term time series forecasting finds extensive applications in domains such as power demand, traffic flow, meteorological observation, and renewable energy dispatch. Forecasting dynamically varying long-term time series poses… 21 arXiv — Machine Learning research 1d ago Graph Dimensionality Reduction for Contextual Bandits: Structure-Specific Regret Bounds under Approximate Smoothness and Noisy Eigenspaces arXiv:2606.27917v1 Announce Type: new Abstract: Contextual bandits with graph-structured arms arise in recommendation, citation retrieval, and social advertising, where arms connected on a graph tend to share reward signal. Standard dimensionality reduction ignores this… 36