Hugging Face Daily Papers · 4 min read

PoLAR: Factorizing Extent and Mode in Latent Actions for Robot Policy Learning

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

PoLAR factorizes latent robot actions into radial extent and directional mode, so latent actions can separately capture how far a transition moves and what mode of behavior it follows. This gives a more structured latent space for downstream policy learning and improves performance across simulated benchmarks and real-robot manipulation tasks.

Project page: https://joon-stack.github.io/PoLAR/
Code: https://github.com/joon-stack/PoLAR

arxiv:2606.21139

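Published on Jun 19, 2026 · Submitted by Youngjoon Jeong on Jun 23, 2026
Authors: Youngjoon Jeong, Jihwan Yu, Minsoo Jo, Junha Chun, Taesup Kim
Organization: Seoul National University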

Abstract

PoLAR introduces a geometrically structured latent action representation in hyperbolic space that separates transition extent from transition mode, improving robotic policy learning performance.
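As a reading aid, the radial-direction split can be written down in one standard model of hyperbolic space. The formula below assumes the Poincaré ball with curvature -1. That choice is an assumption made for illustration, since this page does not state which hyperbolic model or curvature PoLAR uses. Here z is a latent action, r is its hyperbolic distance from the origin (the "extent" coordinate), and u is a unit vector (the "mode" coordinate).

```latex
% Polar form of a point z in the unit Poincare ball (curvature -1, assumed)
z = \tanh\!\left(\tfrac{r}{2}\right) u,
\qquad \lVert u \rVert_2 = 1,
\qquad r = d_{\mathbb{B}}(0, z) = 2\,\operatorname{artanh}\lVert z \rVert_2 .
```

Under this parameterization, changing r moves z along a ray from the origin without changing u, which is the sense in which radius and direction can be treated as separate coordinates.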

Latent action pretraining learns representations of visual change from pairs of observations, but existing methods typically encode each transition as a single unstructured representation that entangles transition extent and transition mode. We introduce Polar Latent Actions with Radial structure (PoLAR), which imposes a radial-direction structure on latent actions, encouraging radius to encode transition extent and direction to retain transition mode. PoLAR uses the temporal offset between two observations as a weak proxy for transition extent, encouraging latent actions from observation pairs separated by larger temporal gaps to occupy larger radii. We instantiate this structure in hyperbolic space, whose expanding volume with radius offers a natural fit for more diverse transition modes at larger extents. Across in-task and large-scale pretraining settings, PoLAR improves downstream policy performance in simulation and real-world robot experiments, outperforming latent action baselines and strong pretrained VLAs. These results suggest that the geometry of the latent action space is an important design choice for transferring visual pretraining to downstream robot policy learning.
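The idea of tying radius to temporal offset can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not give the loss, so the pairwise hinge in `radius_ranking_loss`, the `margin` parameter, the function names, and the choice of the Poincaré ball with curvature -1 are all assumptions made here.

```python
import math


def to_polar(z):
    """Split a point inside the unit Poincare ball into (hyperbolic radius, unit direction)."""
    norm = math.sqrt(sum(x * x for x in z))
    if not 0.0 < norm < 1.0:
        raise ValueError("expected a non-zero point strictly inside the unit ball")
    radius = 2.0 * math.atanh(norm)  # distance from the origin at curvature -1
    direction = [x / norm for x in z]
    return radius, direction


def from_polar(radius, direction):
    """Inverse of to_polar: place a unit direction at a given hyperbolic radius."""
    scale = math.tanh(radius / 2.0)
    return [scale * u for u in direction]


def radius_ranking_loss(radii, offsets, margin=0.1):
    """Hypothetical pairwise hinge: a larger temporal offset should map to a larger radius.

    Every ordered pair (i, j) with offsets[i] > offsets[j] is penalized
    when radii[i] does not exceed radii[j] by at least `margin`.
    """
    terms = [
        max(0.0, margin - (r_i - r_j))
        for r_i, k_i in zip(radii, offsets)
        for r_j, k_j in zip(radii, offsets)
        if k_i > k_j
    ]
    return sum(terms) / len(terms) if terms else 0.0


# Three latent actions sharing one direction ("mode") at increasing radii ("extent").
direction = [0.6, 0.8]
latents = [from_polar(r, direction) for r in (0.5, 1.0, 2.0)]
radii = [to_polar(z)[0] for z in latents]

print(radius_ranking_loss(radii, offsets=[1, 4, 8]))  # radii ordered like offsets: no penalty
print(radius_ranking_loss(radii, offsets=[8, 4, 1]))  # order reversed: positive penalty
```

On the remark about expanding volume: in n-dimensional hyperbolic space of curvature -1, the area of a sphere of radius r is proportional to sinh(r)^(n-1), which grows exponentially in r, whereas the Euclidean area grows only as r^(n-1). Larger radii therefore leave far more room for distinct directions.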



Get this paper in your agent:

hf papers read 2606.21139
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
