Hugging Face Daily Papers · 4 min read

Toto 2.0: Time Series Forecasting Enters the Scaling Era

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arxiv:2605.20119

Published on May 19, 2026 · Submitted by Emaad Khwaja on May 21

Authors: Emaad Khwaja, Chris Lettieri, Gerald Woo, Eden Belouadah, Marc Cenac, Guillaume Jarry, Enguerrand Paquin, Xunyi Zhao, Viktoriya Zhukov, Othmane Abou-Amal, Chenghao Liu, Ameet Talwalkar, David Asker

Organization: Datadog
Project page: https://www.datadoghq.com/blog/ai/toto-2/
Code: https://github.com/DataDog/toto

AI-generated summary

Time series foundation models demonstrate scalable forecasting performance across parameter sizes, with Toto 2.0 achieving state-of-the-art results on multiple benchmarks through a unified training approach.

Abstract

We show that time series foundation models scale: a single training recipe produces reliable forecast-quality improvements from 4M to 2.5B parameters. We release Toto 2.0, a family of five open-weights forecasting models trained under this recipe. The Toto 2.0 family sets a new state of the art on three forecasting benchmarks: BOOM, our observability benchmark; GIFT-Eval, the standard general-purpose benchmark; and the recent contamination-resistant TIME benchmark. This report describes our experimental results and details the design decisions behind Toto 2.0: its architecture and training recipe, training data, and the u-muP hyperparameter transfer pipeline. All five base checkpoints are released under Apache 2.0.

Community

Emaad Khwaja (paper author and submitter):

Toto 2.0 is designed to answer a simple and open question: Can time series foundation models (TSFMs) improve as they scale?

Our results show they can. The highlights:

  • Scaling that works. Every size improves on the one below it, with no sign of saturation at 2.5B.
  • Best in class on every benchmark we tested. Toto 2.0 takes the top spots on BOOM (Datadog's observability forecasting benchmark), GIFT-Eval (the standard general-purpose benchmark), and TIME (a new contamination-resistant zero-shot benchmark).
  • A generational jump from Toto 1.0. Toto 2.0 is 7× more parameter-efficient at matching quality and dramatically faster at inference time.
  • Trained on observability and synthetic data, generalizes broadly. Toto 2.0 does not see any public forecasting data during pretraining, yet leads the field on general-purpose benchmarks.
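The first highlight makes two checkable claims: each model size beats the one below it, and the trend has not flattened at the largest size. The sketch below shows one way such a check could be written, assuming you have one error score per model size (lower is better). It is not the authors' evaluation code. The function names are hypothetical, and the scores are synthetic placeholders generated from a made-up power law. Only the 4M and 2.5B endpoints come from the abstract; the three intermediate sizes are invented for illustration.

```python
import math


def improves_at_every_size(scores):
    """Return True if each larger model has a strictly lower error
    than the next smaller one. `scores` maps parameter count -> error."""
    errors = [err for _, err in sorted(scores.items())]
    return all(later < earlier for earlier, later in zip(errors, errors[1:]))


def loglog_slope(scores):
    """Least-squares slope of log(error) against log(parameter count).
    A clearly negative slope means error keeps falling as size grows."""
    points = sorted(scores.items())
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(e) for _, e in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def last_step_slope(scores):
    """Log-log slope between the two largest models only.
    A value near zero would hint at saturation at the top of the range."""
    (n1, e1), (n2, e2) = sorted(scores.items())[-2:]
    return (math.log(e2) - math.log(e1)) / (math.log(n2) - math.log(n1))


# PLACEHOLDER data, NOT results from the paper: a synthetic power law
# error = 2.0 * N^(-0.05). Intermediate sizes are invented.
placeholder_scores = {
    n: 2.0 * n ** -0.05 for n in (4e6, 2e7, 1e8, 5e8, 2.5e9)
}

if __name__ == "__main__":
    print("improves at every size:", improves_at_every_size(placeholder_scores))
    print("overall log-log slope:", loglog_slope(placeholder_scores))
    print("slope at largest step:", last_step_slope(placeholder_scores))
```

With real benchmark scores in place of the placeholders, comparing the slope at the largest step against the overall slope gives a rough read on whether gains are tapering off.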