Hugging Face Daily Papers · June 9, 2026 · 7 min read

Robotic Policy Adaptation via Weight-Space Meta-Learning

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.



arxiv:2606.07217

Published on Jun 5 · Submitted by Christian Bianchi on Jun 9 · ItalAI

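Authors: Christian Bianchi, Siamak Yousefi, Alessio Sampieri, Andrea Roberti, Luca Rigazio, Fabio Galasso, Luca Franco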

Abstract

WIZARD is a weight-space meta-learning framework that generates task-specific LoRA parameters for frozen VLA policies using language instructions and demonstration videos, enabling efficient task adaptation without fine-tuning.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Vision-Language-Action (VLA) models are emerging as a promising paradigm for robotic manipulation, enabling general-purpose policies trained from large corpora of demonstrations and action labels. However, adapting these models to new tasks still typically requires task-specific demonstrations, action annotations, and additional fine-tuning, making deployment costly and difficult to scale. We propose WIZARD, a weight-space meta-learning framework that sidesteps task-specific fine-tuning by generating task-specific LoRA parameters for a frozen VLA policy. Given only a language instruction and a short demonstration video, WIZARD predicts the corresponding adaptation weights in a single forward pass, without target-task action labels or test-time optimization. During meta-training, WIZARD learns to map task evidence directly to expert LoRA updates, capturing relationships between tasks in weight space. Experiments on LIBERO show that WIZARD improves performance by up to ~2x on unseen dataset collections and up to ~14x on unseen tasks. On a Franka Emika Panda, WIZARD consistently improves over a real-domain adapted baseline, showing that generated adapters provide task-level specialization beyond simulation.
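The meta-training step described in the abstract, mapping task evidence to expert LoRA updates, can be pictured as supervised regression in weight space. What follows is a minimal sketch under loud assumptions, not the authors' implementation: the generator is a single linear map, the loss is a plain squared error, and both the task embeddings and the "expert" LoRA vectors are synthetic stand-ins. The names `make_tasks`, `meta_train`, `predict`, and `mse` are hypothetical.

```python
import random

def predict(W, embedding):
    """One forward pass: linear map from a task embedding to a flat LoRA vector."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in W]

def mse(pred, target):
    """Mean squared error between a predicted and an expert LoRA vector."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def make_tasks(n_tasks, embed_dim, n_lora, seed=0):
    """Synthetic stand-in data (assumption): (task embedding, expert LoRA vector) pairs."""
    rng = random.Random(seed)
    true_map = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(n_lora)]
    tasks = []
    for _ in range(n_tasks):
        e = [rng.uniform(-1, 1) for _ in range(embed_dim)]
        tasks.append((e, predict(true_map, e)))
    return tasks

def meta_train(tasks, embed_dim, n_lora, lr=0.1, epochs=200):
    """Fit the generator by SGD on 0.5 * squared error against expert LoRA vectors."""
    W = [[0.0] * embed_dim for _ in range(n_lora)]
    for _ in range(epochs):
        for e, theta in tasks:
            pred = predict(W, e)
            for i in range(n_lora):
                err = pred[i] - theta[i]  # gradient of 0.5 * err**2 w.r.t. pred[i]
                for j in range(embed_dim):
                    W[i][j] -= lr * err * e[j]
    return W

tasks = make_tasks(n_tasks=21, embed_dim=4, n_lora=6)
train_tasks, (new_embedding, expert_theta) = tasks[:20], tasks[20]
W = meta_train(train_tasks, embed_dim=4, n_lora=6)

# Held-out task: the LoRA vector comes from a single forward pass, no gradient steps.
held_out_error = mse(predict(W, new_embedding), expert_theta)
```

The point of the sketch is the split between the two phases: all gradient steps happen in `meta_train`, while adapting to the held-out task is a single call to `predict`.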


Community

Fascetta

Paper submitter about 3 hours ago

Paper Title

Robotic Policy Adaptation via Weight-Space Meta-Learning

Short Summary (TL;DR)

The authors present WIZARD, a framework for zero-shot adaptation of large Vision-Language-Action (VLA) policies that needs no test-time fine-tuning, online optimization, or target-task action labels. Instead of running gradient updates at deployment, a meta-network predicts task-specific LoRA parameters in a single forward pass from a language instruction and a short demonstration video. On the LIBERO benchmark, WIZARD improves success rates by up to roughly 2x on unseen dataset collections and up to roughly 14x on unseen tasks.
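To make "predicts task-specific LoRA parameters in a single forward pass" concrete, here is a small sketch of how generated low-rank factors could modify a frozen linear layer. It is an illustration under assumptions, not the WIZARD architecture: the generator is one random linear head, the task embedding is a plain list of floats standing in for the fused language and video features, and `LoRAGenerator`, `generate`, and `adapted_forward` are hypothetical names. The only part taken from standard LoRA is the update rule, where the effective weight is the frozen weight plus the product of the two low-rank factors.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def reshape(flat, rows, cols):
    """Cut a flat list into `rows` rows of length `cols`."""
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

class LoRAGenerator:
    """Hypothetical meta-network: a single linear head from embedding to LoRA factors."""

    def __init__(self, embed_dim, d_out, d_in, rank, seed=0):
        rng = random.Random(seed)
        self.d_out, self.d_in, self.rank = d_out, d_in, rank
        n_params = rank * (d_out + d_in)  # entries of B (d_out x r) plus A (r x d_in)
        self.head = [[rng.gauss(0.0, 0.1) for _ in range(n_params)]
                     for _ in range(embed_dim)]

    def generate(self, task_embedding):
        """Single forward pass: embedding -> flat vector -> factors A and B."""
        flat = matmul([task_embedding], self.head)[0]
        split = self.d_out * self.rank
        B = reshape(flat[:split], self.d_out, self.rank)
        A = reshape(flat[split:], self.rank, self.d_in)
        return A, B

def adapted_forward(W_frozen, A, B, x):
    """Compute (W_frozen + B @ A) @ x without modifying W_frozen."""
    delta = matmul(B, A)  # d_out x d_in update of rank at most r
    W_eff = [[w + d for w, d in zip(rw, rd)] for rw, rd in zip(W_frozen, delta)]
    return [row[0] for row in matmul(W_eff, [[v] for v in x])]

gen = LoRAGenerator(embed_dim=3, d_out=4, d_in=5, rank=2)
A, B = gen.generate([0.5, -1.0, 2.0])  # stand-in task embedding
W = [[float(i == j) for j in range(5)] for i in range(4)]  # toy frozen weight
y = adapted_forward(W, A, B, [1.0, 2.0, 3.0, 4.0, 5.0])
```

Because the frozen weight is only read, never written, a different task embedding yields a different adapter for the same base policy.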

Suggested Community Comment

Title: 🚀 Zero-Shot LoRA Parameter Generation for Large VLAs
This paper introduces a clever workaround to the expensive, action-labeled fine-tuning usually required to adapt Vision-Language-Action (VLA) models to new tasks.

Why it's interesting:

No Test-Time Gradients: WIZARD bypasses deployment fine-tuning entirely. It maps multimodal task embeddings directly to specialized LoRA parameters in a single forward pass.
Scale-Aware Architecture: To stabilize weight generation across heterogeneous VLA modules, it introduces instance-wise token normalization and explicitly predicts layer-wise statistics.
Strong Zero-Shot Results: WIZARD reaches an average success rate of 40% on LIBERO-Spatial (vs. 19% for standard multi-task VLAs, about 2.1x) and transfers to a physical 7-DoF Franka arm, nearly doubling the real-world success rate from 0.22 to 0.41.
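The scale-aware point above can be illustrated with a short sketch. This is one possible reading of "instance-wise token normalization" plus predicted "layer-wise statistics", stated as an assumption rather than the paper's exact recipe: each chunk of LoRA weights (a "token") is standardized by its own mean and standard deviation, and those two numbers are kept as separate prediction targets so the original scale can be restored. The function names and the example values are hypothetical.

```python
from statistics import fmean, pstdev

def normalize_token(values, eps=1e-8):
    """Standardize one weight token and return it with its own mean and std."""
    mean = fmean(values)
    std = pstdev(values, mu=mean)
    normalized = [(v - mean) / (std + eps) for v in values]
    return normalized, mean, std

def denormalize_token(normalized, mean, std, eps=1e-8):
    """Invert normalize_token using the (predicted) per-layer statistics."""
    return [n * (std + eps) + mean for n in normalized]

# Hypothetical LoRA weight tokens from two modules with very different scales.
tokens = {
    "attention_proj": [0.001, -0.002, 0.0015, 0.0005],
    "mlp_up": [1.2, -0.8, 0.4, 2.0],
}

restored = {}
stats = {}
for name, values in tokens.items():
    normalized, mean, std = normalize_token(values)
    stats[name] = (mean, std, pstdev(normalized))
    # A generator would predict `normalized`, `mean`, and `std`;
    # this sketch simply reuses the true values to show the round trip.
    restored[name] = denormalize_token(normalized, mean, std)
```

After normalization both tokens have roughly unit spread, so a single generator is not forced to output values that differ by orders of magnitude across modules; the scale is carried by the separately handled statistics.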

Because it generates only low-rank adapters rather than synthesizing full policy weights, the approach stays scalable, and it needs no target-task action labels to adapt.

Definitely worth a read!


Get this paper in your agent:

hf papers read 2606.07217

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash
