Hugging Face Daily Papers · June 3, 2026 · 4 min read

NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

NVIDIA OmniDreams is an action-conditioned foundation generative world model that provides real-time, photorealistic, and reactive simulation environments for training and evaluating autonomous driving policies.

Authors: NVIDIA, Aarti Basant, Amlan Kar, Despoina Paschalidou, Fangyin Wei, Francesco Ferroni, Guillermo Garcia Cobo, Haithem Turki, Huan Ling, Jaewoo Seo, James Lucas, Jay Zhangjie Wu, Jialiang Wang, Jonathan Lorraine, Jun Gao, Kai He, Katarina Tothova, Kevin Xie, Michał Tyszkiewicz, Qi Wu, Riccardo de Lutio, Ruilong Li, Sanja Fidler, Seung Wook Kim, Tianchang Shen, Tianshi Cao, Tobias Pfaff, William Lew, Xindi Wu, Xuanchi Ren, Yifan Lu, Yuxuan Zhang, Zan Gojcic, Zian Wang

arxiv:2606.03159

Published on Jun 2 · Submitted by taesiri on Jun 3 · NVIDIA

Abstract

OmniDreams, a foundation generative world model trained from the Cosmos diffusion model, enables real-time action-conditioned video generation for autonomous driving policy evaluation in complex, unseen scenarios.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

As autonomous vehicle capabilities advance, the safe evaluation of driving policies in long-tail scenarios remains a critical bottleneck. In closed-loop simulation, the driving policy model actively interacts with the environment, where its actions dynamically update the simulator state and directly influence the next set of generated sensor observations. While recent reconstruction-based neural simulators offer photorealism, they are fundamentally constrained by their initial captured data and struggle to generalize to highly dynamic or novel scenes. To overcome these limitations, we introduce OmniDreams, a foundation generative world model mid- and post-trained from the Cosmos diffusion model to autoregressively generate action-conditioned videos in real time. By leveraging the rich visual priors of Cosmos and mid- and post-training on 21k hours of driving scenarios, OmniDreams synthesizes complex, unobserved phenomena that are hard for traditional simulators to capture, such as extreme weather and unpredictable dynamic agent behaviors. Crucially, it autoregressively conditions its photorealistic sensor generation on past frames, the current simulator state, and immediate driving actions. Deployed in a closed-loop system with the Alpamayo 1 policy model and AlpaSim orchestrator, OmniDreams acts as a highly responsive, reactive environment, providing a scalable and comprehensive solution for training and evaluating next-generation autonomous driving policies. We additionally show preliminary results indicating that a world-action model (WAM) post-trained from OmniDreams achieves strong performance on the Physical AI Autonomous Vehicles NuRec dataset, surpassing the VLA-based Alpamayo 1.5 research policy model while using only 1/5 the total parameters. These results highlight the potential for a real-time world model like OmniDreams to also serve as a backbone for policy architectures.
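The closed-loop interaction described in the abstract can be illustrated with a minimal sketch. This is not the OmniDreams, Alpamayo, or AlpaSim API: every name below (`SimState`, `Action`, `toy_world_model`, `toy_policy`, `rollout`) is hypothetical, the "frame" is a small dictionary rather than an image, and the vehicle dynamics are a simple unicycle model chosen only for illustration. What the sketch does mirror is the data flow: the policy acts on the latest observation, the action updates the simulator state, and the next observation is generated from past frames, the current state, and the immediate action.

```python
from collections import deque
from dataclasses import dataclass
import math


@dataclass
class SimState:
    """Ego pose tracked by the toy simulator (hypothetical structure)."""
    x: float = 0.0
    y: float = 0.0
    heading: float = 0.0  # radians
    speed: float = 0.0    # metres per second


@dataclass
class Action:
    """Driving action emitted by the policy (hypothetical structure)."""
    accel: float     # metres per second squared
    yaw_rate: float  # radians per second


def step_state(state: SimState, action: Action, dt: float) -> SimState:
    """Advance the ego pose with a simple unicycle model."""
    speed = max(0.0, state.speed + action.accel * dt)
    heading = state.heading + action.yaw_rate * dt
    return SimState(
        x=state.x + speed * math.cos(heading) * dt,
        y=state.y + speed * math.sin(heading) * dt,
        heading=heading,
        speed=speed,
    )


def toy_world_model(past_frames, state: SimState, action: Action) -> dict:
    """Stand-in for the generative model.

    A real world model would return sensor images; this returns a record
    derived from the same three inputs: past frames, state, and action.
    """
    return {
        "context_len": len(past_frames),
        "ego_x": state.x,
        "ego_speed": state.speed,
        "last_accel": action.accel,
    }


def toy_policy(frame: dict, target_speed: float = 10.0) -> Action:
    """Stand-in for the driving policy: proportional speed control."""
    return Action(accel=0.5 * (target_speed - frame["ego_speed"]), yaw_rate=0.0)


def rollout(num_steps: int, dt: float = 0.1, context: int = 4):
    """Run the closed loop and return the state trajectory and frame context."""
    state = SimState()
    action = Action(accel=0.0, yaw_rate=0.0)
    past_frames = deque(maxlen=context)  # sliding window of generated frames
    frame = toy_world_model(past_frames, state, action)  # initial observation
    past_frames.append(frame)
    trajectory = [state]
    for _ in range(num_steps):
        action = toy_policy(frame)             # policy reads latest observation
        state = step_state(state, action, dt)  # action updates simulator state
        frame = toy_world_model(past_frames, state, action)  # next observation
        past_frames.append(frame)
        trajectory.append(state)
    return trajectory, list(past_frames)


if __name__ == "__main__":
    trajectory, frames = rollout(50)
    print(f"final speed: {trajectory[-1].speed:.2f} m/s")
    print(f"frames kept in context: {len(frames)}")
```

The bounded `deque` stands in for a fixed-length frame context. The abstract does not state how many past frames the model conditions on, so the value of 4 here is arbitrary.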
