Hugging Face Daily Papers · May 26, 2026 · 3 min read

On-Policy Adversarial Flow Distillation for Autoregressive Video Generation

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.




arxiv:2605.26105


Published on May 25

Submitted by Yang Luo on May 26


Authors: Yang Luo, Shengju Qian, Xiaohang Tang, Zirui Zhu, Yong Liu, Xin Wang, Yang You

AI-generated summary

Adversarial Flow Distillation enables efficient distillation of heterogeneous video generation models by using on-policy feedback and forward-process flow-matching updates without requiring teacher scores or detailed trajectory information.

Abstract

Autoregressive video generators are attractive for streaming, long-horizon, and interactive applications, but distilling strong black-box teachers into causal students remains difficult. The student must learn under its own rollout distribution, whereas practical teachers may expose only prompt-conditioned completed videos and may differ in architecture, capacity, temporal design, and sampling schedule. This interface makes supervised fine-tuning off-policy, score-based distillation inapplicable, and direct adversarial imitation too sparse for denoising-time credit assignment. We propose Adversarial Flow Distillation (AFD), an on-policy framework for heterogeneous black-box video distillation. AFD queries the teacher and rolls out the current student on the same prompts, trains a prompt-paired Bradley-Terry discriminator to estimate clean-sample teacher-student discrepancy, and converts the resulting on-policy advantage into forward-process flow-matching updates on the student's own noised states. Thus, AFD provides dense velocity-field supervision while requiring no teacher scores, latents, denoising trajectories, step alignment, or reverse-chain reinforcement learning. Experiments across two causal AR student families show that AFD consistently improves motion- and physics-sensitive generation while preserving general video quality, and ablations validate the importance of adaptive on-policy feedback and forward-process credit assignment. The method requires only clean teacher videos and student rollouts, providing a practical route for distilling proprietary or heterogeneous video generators into efficient autoregressive students.
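The abstract names three ingredients: a prompt-paired Bradley-Terry discriminator, an on-policy advantage, and forward-process flow-matching updates on the student's own noised states. The following is a minimal, illustrative sketch of how such pieces could fit together. It is not the authors' implementation. The function names, the linear noising path x_t = (1 - t) x0 + t eps, the group-mean advantage, and the softmax-style advantage weighting (borrowed from advantage-weighted regression) are all assumptions made here for illustration; the abstract does not specify any of them.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def bt_discriminator_loss(score_teacher, score_student):
    # Bradley-Terry pairwise loss for one prompt: the discriminator is
    # trained so that the teacher video outscores the student rollout.
    return -math.log(sigmoid(score_teacher - score_student))

def advantage_weights(student_scores, temperature=1.0):
    # ASSUMPTION: advantage = score minus the group mean over student
    # rollouts, turned into nonnegative weights with a softmax.
    mean = sum(student_scores) / len(student_scores)
    adv = [s - mean for s in student_scores]
    top = max(adv)
    exps = [math.exp((a - top) / temperature) for a in adv]
    total = sum(exps)
    return [e / total for e in exps]

def noised_state(x0, eps, t):
    # ASSUMPTION: linear (rectified-flow style) forward process applied to
    # the student's own clean sample x0 with Gaussian noise eps.
    return [(1.0 - t) * a + t * b for a, b in zip(x0, eps)]

def flow_matching_loss(v_pred, x0, eps):
    # Mean squared error against the velocity target eps - x0 that
    # corresponds to the linear path above.
    target = [b - a for a, b in zip(x0, eps)]
    return sum((p - q) ** 2 for p, q in zip(v_pred, target)) / len(target)

def weighted_student_loss(v_preds, x0s, epss, student_scores):
    # Advantage-weighted sum of per-rollout flow-matching losses.
    weights = advantage_weights(student_scores)
    return sum(
        w * flow_matching_loss(v, x0, eps)
        for w, v, x0, eps in zip(weights, v_preds, x0s, epss)
    )

if __name__ == "__main__":
    # Toy 2-dimensional "videos"; all numbers are made up.
    x0s = [[1.0, 2.0], [0.5, -1.0]]
    epss = [[0.0, 0.0], [1.0, 1.0]]
    v_preds = [[-0.8, -1.9], [0.4, 2.2]]
    print(bt_discriminator_loss(score_teacher=1.2, score_student=0.3))
    print(noised_state(x0s[0], epss[0], t=0.5))
    print(weighted_student_loss(v_preds, x0s, epss, student_scores=[0.3, -0.1]))
```

In this sketch the only teacher-side input is a scalar discriminator score on a completed teacher video, which is consistent with the abstract's claim that no teacher scores, latents, or denoising trajectories are needed. The supervision on the student is dense because the flow-matching loss can be evaluated at any noise level t of the student's own sample.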


Get this paper in your agent:

hf papers read 2605.26105

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash
