TS-Fault: Benchmarking Time Series Forecasters Against Structural Faults
Mirrored from arXiv — Machine Learning for archival readability. Support the source by reading on the original site.
Computer Science > Machine Learning
Title:TS-Fault: Benchmarking Time Series Forecasters Against Structural Faults
Abstract:Time series forecasting (TSF) underpins consequential decisions in energy, transportation, finance, and healthcare, yet TSF models are almost universally ranked by a single number (e.g., average error) on clean held-out data, under the implicit assumption that it predicts deployed reliability. However, real faults are not i.i.d noise but structured events with temporal shape, broken cross-variable dependencies, regime change coupled with missingness, and causal propagation across a sensing pipeline. Treating TSF robustness as a data-quality problem, we present TS-Fault, a benchmark that evaluates forecasting models under explicit, parameterized fault scenarios with controllable semantic difficulty. TS-Fault organizes recurring failures into four modes along two orthogonal axes (observation- vs mechanism-level; univariate vs multivariate) and injects each fault into the most prediction-critical window via a unified importance score. This design enables robustness to be tested against the structures models actually rely on, rather than reduced to generic noise sensitivity. We evaluate 21 models across 6 datasets, 4 modes, and 5 difficulty levels under a paired clean/corrupt protocol. The results reveal three findings that contradict common leaderboard intuition: (i) clean-data accuracy anti-correlates with robustness; (ii) clean rankings are preserved under observation-level faults but reshuffled under mechanism-level faults; and (iii) all catastrophic failures occur under mechanism-level faults, with foundation models achieving the highest clean-data accuracy yet exhibiting the greatest fragility. The code is publicly available at this https URL.
| Subjects: | Machine Learning (cs.LG); Machine Learning (stat.ML) |
| Cite as: | arXiv:2606.18539 [cs.LG] |
| (or arXiv:2606.18539v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2606.18539
arXiv-issued DOI via DataCite (pending registration)
|
Access Paper:
- View PDF
- HTML (experimental)
- TeX Source
Current browse context:
References & Citations
Bibliographic and Citation Tools
Code, Data and Media Associated with this Article
Demos
Recommenders and Search Tools
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.
More from arXiv — Machine Learning
-
Can AI Draw Science? A Benchmark for Evaluating Scientific Figure Generation by Text-to-Image and Multimodal Models
Jun 30
-
On the Necessity of a Liquid Substrate for Mesh Intelligence
Jun 30
-
Position: RL Researchers Need to Distinguish Between Solving Simulators and Using Simulators as a Proxy
Jun 30
-
Learning to Distributedly Estimate under Partially Known Dynamics: A Covariance-Agnostic Neural Kalman Consensus Filter
Jun 30
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.