Hugging Face Daily Papers · 6 min read

MobileEgo Anywhere: Open Infrastructure for long horizon egocentric data on commodity hardware

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arxiv:2605.05945

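Published on May 7, 2026 · Submitted by Satpal Singh Rathore on May 18, 2026
Authors: Senthil Palanisamy, Abhishek Anand, Satpal Singh Rathore, Pratyush Patnaik, Shubhanshu Khatana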

Abstract

AI-generated summary: A mobile-based framework for collecting long-duration egocentric robot data using smartphone sensors, enabling large-scale training of vision-language-action models.

The recent advancement of Vision-Language-Action (VLA) models has driven a critical demand for large-scale egocentric datasets. However, existing datasets are often limited by short episode durations, typically only a few minutes, and so fail to capture the long-horizon temporal dependencies necessary for complex robotic task execution. To bridge this gap, we present MobileEgo Anywhere, a framework designed to facilitate the collection of robust, hour-plus egocentric trajectories using commodity mobile hardware. We leverage the ubiquitous sensor suites of modern smartphones to provide high-fidelity, long-term camera pose tracking, effectively removing the high hardware barriers associated with traditional robotics data collection. Our contributions are threefold: (1) we release a novel dataset comprising 200 hours of diverse, long-form egocentric data with persistent state tracking; (2) we open-source a mobile application that enables any user to record egocentric data; and (3) we provide a comprehensive processing pipeline to convert raw mobile captures into standardized, training-ready formats for Vision-Language-Action model and foundation model research. By democratizing the data collection process, this work enables the massive-scale acquisition of long-horizon data across varied global environments, accelerating the development of generalizable robotic policies.
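
The abstract mentions converting hour-plus captures with camera pose tracking into training-ready formats, but this page does not describe the actual schema or pipeline. As a rough illustration only, the sketch below assumes a hypothetical `PoseSample` record (timestamp, position, orientation quaternion) and shows one common way such a long trajectory could be cut into overlapping fixed-length windows for training. The names `PoseSample` and `split_into_windows`, the field layout, and the windowing choice are all assumptions, not the authors' method.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class PoseSample:
    """One camera pose sample. Hypothetical schema, not the paper's format."""
    t: float                                        # seconds since recording start
    position: Tuple[float, float, float]            # metres, world frame (assumed)
    quaternion: Tuple[float, float, float, float]   # orientation as (x, y, z, w)


def split_into_windows(
    samples: List[PoseSample], window_s: float, stride_s: float
) -> Iterator[List[PoseSample]]:
    """Yield overlapping time windows from a time-sorted pose trajectory.

    Each window covers the half-open interval [start, start + window_s).
    Only windows that fit entirely before the last timestamp are produced.
    """
    if window_s <= 0 or stride_s <= 0:
        raise ValueError("window_s and stride_s must be positive")
    if not samples:
        return
    times = [s.t for s in samples]        # assumed sorted in ascending order
    start, end = times[0], times[-1]
    while start + window_s <= end:
        lo = bisect_left(times, start)               # first sample with t >= start
        hi = bisect_left(times, start + window_s)    # first sample with t >= window end
        if hi > lo:
            yield samples[lo:hi]
        start += stride_s


# Synthetic one-hour trajectory sampled at 1 Hz, standing in for a real capture.
identity = (0.0, 0.0, 0.0, 1.0)
trajectory = [PoseSample(float(i), (0.1 * i, 0.0, 0.0), identity) for i in range(3600)]
windows = list(split_into_windows(trajectory, window_s=60.0, stride_s=30.0))
print(len(windows), len(windows[0]))
```

Overlapping windows are only one option; a real pipeline might instead segment by task boundaries or keep whole episodes intact to preserve long-horizon dependencies.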

Community

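Satpal Singh Rathore (paper author, paper submitter), May 18, 2026:

iPhone + open-source stack to create high-quality data

Abhishek Anand, May 18, 2026:

The future of embodied AI is open source

Librarian Bot, May 19, 2026:

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* EgoLive: A Large-Scale Egocentric Dataset from Real-World Human Tasks (2026), https://huggingface.co/papers/2604.23570
* SABER: A Scalable Action-Based Embodied Dataset for Real-World VLA Adaptation (2026), https://huggingface.co/papers/2605.09613
* RoSHI: A Versatile Robot-oriented Suit for Human Data In-the-Wild (2026), https://huggingface.co/papers/2604.07331
* EgoVerse: An Egocentric Human Dataset for Robot Learning from Around the World (2026), https://huggingface.co/papers/2604.07607
* Phone2Act: A Low-Cost, Hardware-Agnostic Teleoperation System for Scalable VLA Data Collection (2026), https://huggingface.co/papers/2605.01948
* UniDex: A Robot Foundation Suite for Universal Dexterous Hand Control from Egocentric Human Videos (2026), https://huggingface.co/papers/2603.22264
* HRDexDB: A Large-Scale Dataset of Dexterous Human and Robotic Hand Grasps (2026), https://huggingface.co/papers/2604.14944

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space: https://huggingface.co/spaces/librarian-bots/recommend_similar_papers

You can ask Librarian Bot for paper recommendations directly by tagging it in a comment: @librarian-bot recommend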

Get this paper in your agent:

hf papers read 2605.05945
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Datasets citing this paper 1
