Hugging Face Daily Papers · · 3 min read

Implicit Preference Alignment for Human Image Animation

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arXiv: 2605.07545

Implicit Preference Alignment for Human Image Animation

Published on May 8 · Submitted by Yuanzhi Wang on May 13
Authors: Yuanzhi Wang, Xuhua Ren, Jiaxiang Cheng, Bing Ma, Kai Yu, Tianxiang Zheng, Qinglin Lu, Zhen Cui (Tencent Hunyuan)
Abstract

Implicit Preference Alignment (IPA) addresses the challenges of hand motion generation with a data-efficient post-training framework that eliminates the need for paired preference data, using hand-aware local optimization to improve quality.

AI-generated summary

Human image animation has witnessed significant advancements, yet generating high-fidelity hand motions remains a persistent challenge due to their high degrees of freedom and motion complexity. While reinforcement learning from human feedback, particularly direct preference optimization, offers a potential solution, it necessitates the construction of strict preference pairs. However, curating such pairs for dynamic hand regions is prohibitively expensive and often impractical due to frame-wise inconsistencies. In this paper, we propose Implicit Preference Alignment (IPA), a data-efficient post-training framework that eliminates the need for paired preference data. Theoretically grounded in implicit reward maximization, IPA aligns the model by maximizing the likelihood of self-generated high-quality samples while penalizing deviations from the pretrained prior. Furthermore, we introduce a Hand-Aware Local Optimization mechanism to explicitly steer the alignment process toward hand regions. Experiments demonstrate that our method achieves effective preference optimization to enhance hand generation quality, while significantly lowering the barrier for constructing preference data. Codes are released at https://github.com/mdswyz/IPA
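The objective described in the abstract — maximize the likelihood of self-generated high-quality samples, penalize deviation from the pretrained prior, and steer optimization toward hand regions — can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation (see the linked repository): the function name `ipa_loss`, its arguments, the mask-based hand weighting, and the quadratic prior penalty are all assumptions made for this example.

```python
import numpy as np

def ipa_loss(model_logp, prior_logp, hand_mask, beta=0.1, lam=2.0):
    """Toy sketch of an IPA-style objective.

    model_logp : per-pixel log-likelihood of a self-generated high-quality
                 sample under the model being aligned (assumed given).
    prior_logp : per-pixel log-likelihood under the frozen pretrained prior.
    hand_mask  : 1.0 on hand regions, 0.0 elsewhere (sketch of the
                 Hand-Aware Local Optimization idea as a loss weight).
    beta       : strength of the prior-deviation penalty.
    lam        : extra weight applied to hand-region pixels (lam >= 1).
    """
    # Up-weight hand regions so alignment focuses on them.
    weights = 1.0 + (lam - 1.0) * hand_mask
    # Maximize likelihood of the self-generated good sample
    # (negative log-likelihood, weighted toward hands).
    nll = -(weights * model_logp).mean()
    # Penalize drift away from the pretrained prior
    # (a simple quadratic stand-in for the implicit-reward regularizer).
    prior_penalty = beta * np.mean((model_logp - prior_logp) ** 2)
    return nll + prior_penalty
```

In this toy form, samples the model already assigns high likelihood incur a lower loss, and mismatches with the prior or low likelihood inside the hand mask are penalized more heavily — mirroring, at a very coarse level, the trade-off the paper describes.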

Community

Paper author and submitter · about 13 hours ago

Implicit Preference Alignment removes the need for "bad" (dispreferred) samples in preference optimization.



Get this paper in your agent:

hf papers read 2605.07545
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper


Datasets citing this paper 0

No dataset linking this paper


Spaces citing this paper 0

No Space linking this paper


Collections including this paper 0

No Collection including this paper


Discussion (0)

No comments yet.
