On-Policy Adversarial Flow Distillation for Autoregressive Video Generation
Authors: Yang Luo, Shengju Qian, Xiaohang Tang, Zirui Zhu, Yong Liu, Xin Wang, Yang You
Published: 2026-05-25 (arXiv:2605.26105)
Abstract
AI-generated summary: Adversarial Flow Distillation enables efficient distillation of heterogeneous video generation models by using on-policy feedback and forward-process flow-matching updates without requiring teacher scores or detailed trajectory information.
Autoregressive video generators are attractive for streaming, long-horizon, and interactive applications, but distilling strong black-box teachers into causal students remains difficult. The student must learn under its own rollout distribution, whereas practical teachers may expose only prompt-conditioned completed videos and may differ in architecture, capacity, temporal design, and sampling schedule. This interface makes supervised fine-tuning off-policy, score-based distillation inapplicable, and direct adversarial imitation too sparse for denoising-time credit assignment. We propose Adversarial Flow Distillation (AFD), an on-policy framework for heterogeneous black-box video distillation. AFD queries the teacher and rolls out the current student on the same prompts, trains a prompt-paired Bradley-Terry discriminator to estimate clean-sample teacher-student discrepancy, and converts the resulting on-policy advantage into forward-process flow-matching updates on the student's own noised states. Thus, AFD provides dense velocity-field supervision while requiring no teacher scores, latents, denoising trajectories, step alignment, or reverse-chain reinforcement learning. Experiments across two causal AR student families show that AFD consistently improves motion- and physics-sensitive generation while preserving general video quality, and ablations validate the importance of adaptive on-policy feedback and forward-process credit assignment. The method requires only clean teacher videos and student rollouts, providing a practical route for distilling proprietary or heterogeneous video generators into efficient autoregressive students.
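The abstract names three ingredients without giving formulas: a prompt-paired Bradley-Terry discriminator, an on-policy advantage, and forward-process flow-matching updates on the student's own noised states. The following is a minimal sketch of how these pieces could fit together, not the authors' implementation. Everything specific here is an assumption: the function names are hypothetical, the advantage is taken as a rollout's discriminator score minus the batch mean, the forward process is a linear interpolation between the clean sample and noise, and the advantage is applied as a plain multiplicative weight on the flow-matching loss.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bt_discriminator_loss(score_teacher, score_student):
    """Bradley-Terry loss for one prompt pair.

    Models P(teacher video preferred over student video) as
    sigmoid(score_teacher - score_student) and returns its negative log.
    """
    return -math.log(sigmoid(score_teacher - score_student))


def advantages(student_scores):
    """Hypothetical advantage: each rollout's score minus the batch mean."""
    mean = sum(student_scores) / len(student_scores)
    return [s - mean for s in student_scores]


def noised_state(x0, eps, t):
    """Assumed linear forward process: x_t = (1 - t) * x0 + t * eps."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, eps)]


def weighted_fm_loss(v_pred, x0, eps, advantage):
    """Advantage-weighted flow-matching loss (assumed form).

    The regression target is the velocity of the linear path, eps - x0,
    where x0 is the student's own clean rollout.
    """
    target = [b - a for a, b in zip(x0, eps)]
    mse = sum((p - q) ** 2 for p, q in zip(v_pred, target)) / len(target)
    return advantage * mse


# Toy usage on 2-dimensional "videos".
x0, eps = [0.0, 2.0], [2.0, 0.0]
x_t = noised_state(x0, eps, 0.5)          # student's own noised state
adv = advantages([1.0, 2.0, 3.0])          # one advantage per rollout
loss = weighted_fm_loss([0.0, 0.0], x0, eps, adv[2])
```

Under these assumptions the supervision is dense in the sense the abstract describes: a single clean-sample score per rollout becomes a velocity-field loss at any noise level `t`, and only clean teacher videos and student rollouts are needed to compute it.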