Overcoming Dynamics-Blindness: Training-Free Pace-and-Path Correction for VLA Models
Abstract
AI-generated summary
Vision-Language-Action models suffer from temporal blindness in dynamic environments, but a training-free correction method using quadratic optimization improves performance by addressing pace and path dynamics simultaneously.
Vision-Language-Action (VLA) models achieve remarkable flexibility and generalization beyond classical control paradigms. However, most prevailing VLAs are trained under a single-frame observation paradigm, which leaves them structurally blind to temporal dynamics. Consequently, these models degrade severely in non-stationary scenarios, even when trained or finetuned on dynamic datasets. Existing approaches either require expensive retraining or suffer from latency bottlenecks and poor temporal consistency across action chunks. We propose Pace-and-Path Correction, a training-free, closed-form inference-time operator that wraps any chunked-action VLA. From a single quadratic cost, joint minimization yields a unified solution that decomposes orthogonally into two distinct channels. The pace channel compresses execution along the planned direction, while the path channel applies an orthogonal spatial offset, jointly absorbing the perceived dynamics within the chunk window. We evaluate our approach on MoveBench, a comprehensive diagnostic benchmark designed to isolate motion as the sole controlled variable. Empirical results demonstrate that our framework consistently outperforms state-of-the-art training-free wrappers and dynamic-adaptive methods, improving success rates by up to 28.8% and 25.9% in absolute terms over foundational VLA models in dynamic-only and static-dynamic mixed environments, respectively.
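The abstract describes the operator only verbally, so here is a minimal NumPy sketch of one way the pace/path decomposition could look. Everything in it is an assumption for illustration: the function name `pace_and_path_correct`, the signed-projection scale rule for the pace channel, and the linear ramp used to spread the path offset across the chunk are not taken from the paper's closed-form solution.

```python
import numpy as np

def pace_and_path_correct(chunk, drift):
    """Illustrative sketch of a pace-and-path style correction (hypothetical API).

    chunk : (H, 3) array of planned end-effector waypoints for one action chunk.
    drift : (3,) perceived target displacement within the chunk window.
    Returns a corrected (H, 3) chunk.
    """
    H = chunk.shape[0]
    # Unit vector along the chunk's net planned direction.
    net = chunk[-1] - chunk[0]
    u = net / (np.linalg.norm(net) + 1e-8)

    # Orthogonal decomposition of the perceived drift:
    # the parallel component feeds the pace channel,
    # the orthogonal remainder feeds the path channel.
    pace_mag = float(drift @ u)          # signed length along u
    path_vec = drift - pace_mag * u      # residual orthogonal to u

    # Pace channel: stretch or compress execution along the planned
    # direction so the chunk endpoint absorbs the parallel drift
    # (assumed scale rule, not the paper's).
    scale = 1.0 + pace_mag / (np.linalg.norm(net) + 1e-8)
    rel = chunk - chunk[0]
    along = (rel @ u)[:, None] * u       # projection of each waypoint onto u
    paced = chunk[0] + (rel - along) + scale * along

    # Path channel: ramp the orthogonal offset in over the chunk window,
    # reaching the full offset at the final waypoint (assumed schedule).
    ramp = np.linspace(0.0, 1.0, H)[:, None]
    return paced + ramp * path_vec
```

In use, an operator of this kind would sit between the VLA's decoded action chunk and the low-level controller, re-applied each time a new chunk is produced; since it is a single closed-form projection per chunk, it adds negligible inference latency.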
Community
This paper proposes Pace-and-Path Correction, a training-free, closed-form inference-time operator that wraps any chunked-action VLA.