Hugging Face Daily Papers · 3 min read

In-Context World Modeling for Robotic Control

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arxiv:2606.26025

Published on Jun 25, 2026 · Submitted by Siyin Wang (SII) on Jun 26 · #2 Paper of the day

Authors: Siyin Wang, Junhao Shi, Senyu Fei, Zhaoyang Fu, Li Ji, Jingjing Gong, Xipeng Qiu (OpenMOSS)

Abstract

ICWM enables robot policies to infer system variables from self-generated interactions, allowing adaptation to novel configurations without parameter updates by treating system identification as an in-context adaptation problem.

Modern Vision-Language-Action (VLA) models often fail to generalize to novel setups, such as altered camera viewpoints or robot morphologies, because they are typically conditioned only on current observations and language instructions. By ignoring the underlying system configuration as a variable, these models implicitly assume a fixed execution context encountered during training, necessitating data-intensive fine-tuning for any new environment. In this work, we introduce In-Context World Modeling (ICWM), a framework that treats system identification as an in-context adaptation problem. ICWM enables robot policies to autonomously infer essential system variables from a short history of self-generated, task-agnostic interactions. Unlike traditional In-Context Learning that uses demonstrations to specify what task to perform, ICWM leverages the context window to understand how the system operates. By processing these interactions before task execution, the model implicitly captures the world dynamics of the current system, enabling adaptation to novel configurations without parameter updates. Extensive experiments in simulation and on real-world robot platforms demonstrate that ICWM significantly outperforms standard VLA baselines on novel camera viewpoints.
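
The idea of identifying the system from a short, self-generated interaction history can be illustrated with a toy sketch. This is not the authors' implementation: the abstract says the model captures the dynamics implicitly inside its context window, whereas the sketch below swaps in an explicit closed-form least-squares estimate so that the mechanism is visible. The planar "rotated camera" system and every name here (`make_system`, `collect_context`, `infer_rotation`, `act`) are hypothetical and chosen only for illustration.

```python
import math
import random


def make_system(theta):
    """Hypothetical system: the camera is rotated by `theta` radians, so a
    commanded planar action (ax, ay) shows up in the image as a rotated
    displacement. The policy is never told `theta`."""
    c, s = math.cos(theta), math.sin(theta)

    def step(action):
        ax, ay = action
        return (c * ax - s * ay, s * ax + c * ay)

    return step


def collect_context(step, num_probes=8, seed=0):
    """Run a few random, task-agnostic probe actions and record
    (action, observed displacement) pairs as the context."""
    rng = random.Random(seed)
    context = []
    for _ in range(num_probes):
        action = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        context.append((action, step(action)))
    return context


def infer_rotation(context):
    """Explicit stand-in for implicit in-context inference: least-squares
    rotation angle from the summed cross and dot products of each pair."""
    cross = sum(ax * oy - ay * ox for (ax, ay), (ox, oy) in context)
    dot = sum(ax * ox + ay * oy for (ax, ay), (ox, oy) in context)
    return math.atan2(cross, dot)


def act(desired_image_delta, context):
    """Choose the action whose observed effect matches the desired image
    displacement, using only the context (no parameter updates)."""
    theta = infer_rotation(context)
    c, s = math.cos(-theta), math.sin(-theta)
    dx, dy = desired_image_delta
    return (c * dx - s * dy, s * dx + c * dy)


# Novel configuration: a viewpoint rotated by 40 degrees.
step = make_system(math.radians(40.0))
context = collect_context(step)
observed = step(act((1.0, 0.0), context))
```

Here `observed` should match the requested image displacement `(1.0, 0.0)` up to floating-point error, even though the rotation was only ever seen through the probe interactions. The probes carry no task information, which mirrors the abstract's distinction between using context to specify what task to perform and using it to understand how the system operates.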


