Hugging Face Daily Papers · 3 min read

Minimalist Visual Inertial Odometry

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

Authors: Francesco Pasti, Jeremy Klotz, Nicola Bellotto, Shree K. Nayar
Project page: https://cave.cs.columbia.edu/projects/categories/project?cid=Computational%20Imaging&pid=Minimalist%20Visual%20Inertial%20Odometry
Code: https://github.com/pastifra/four-pixel-vio
arxiv:2605.19990

Published on May 19 · Submitted by Francesco Pasti on May 22

Abstract

AI-generated summary: A minimalist visual-inertial odometry approach uses four photodiodes with optical Gabor masks and a temporal convolutional network to achieve accurate planar motion estimation for differential-drive robots.

Visual-Inertial Odometry (VIO), which is critical to mobile robot navigation, uses cameras with a large number of pixels. Capturing and processing camera images requires significant resources. This work presents a minimalist approach to planar odometry, demonstrating that just four visual measurements and an IMU can provide robust motion estimation for differential-drive robots. Our key insight is that four downward-facing photodiodes that sense the world through optical Gabor masks produce signals that encode speed. Based on this, we jointly optimize the mask parameters alongside a Temporal Convolutional Network (TCN) using a physically-grounded simulator. The resulting model decodes speed from just the four measurements produced by the photodiodes. Pairing these estimates with the angular speed from an IMU yields a continuous planar trajectory. We validate our approach with a prototype sensor mounted on a differential-drive robot. Across diverse indoor and outdoor terrains, our system closely tracks the reference ground truth without any real-world fine-tuning. Our work shows that minimalist sensing enables efficient and accurate planar odometry.
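
To make the last step of the abstract concrete, here is a minimal sketch of how a forward speed estimate and an IMU yaw rate could be combined into a planar trajectory. It assumes a standard unicycle (differential-drive) kinematic model with midpoint integration; the function name `integrate_planar`, its parameters, and the sample values are hypothetical and are not taken from the paper or its repository.

```python
import math

def integrate_planar(speeds, yaw_rates, dt, x0=0.0, y0=0.0, theta0=0.0):
    """Dead-reckon a planar pose (x, y, theta) with a unicycle model.

    speeds    -- forward speed samples in m/s (in the paper's setting these
                 would come from the learned decoder; here they are plain numbers)
    yaw_rates -- angular speed samples in rad/s (e.g. from an IMU gyroscope)
    dt        -- sampling period in seconds (assumed constant)
    """
    x, y, theta = x0, y0, theta0
    trajectory = [(x, y, theta)]
    for v, omega in zip(speeds, yaw_rates):
        # Use the heading at the middle of the interval (midpoint rule),
        # which is more accurate than the start-of-step heading on curves.
        theta_mid = theta + 0.5 * omega * dt
        x += v * math.cos(theta_mid) * dt
        y += v * math.sin(theta_mid) * dt
        theta += omega * dt
        trajectory.append((x, y, theta))
    return trajectory

if __name__ == "__main__":
    # Hypothetical input: 1 m/s forward speed, 1 rad/s yaw rate, one full turn.
    n = 1000
    dt = 2.0 * math.pi / n
    path = integrate_planar([1.0] * n, [1.0] * n, dt)
    print("final pose:", path[-1])
```

Because position is obtained by integration, any bias in the speed or yaw-rate estimates accumulates over time, which is why the quality of the decoded speed matters for the final trajectory.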

Community

Francesco Pasti (paper author, paper submitter):

Robot Odometry with a 4-Pixel sensor and an IMU.


Get this paper in your agent:

hf papers read 2605.19990
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
