One-Forcing: Towards Stable One-Step Autoregressive Video Generation
Authors: Jiaqi Feng, Justin Cui, Yuanhao Ban, Cho-Jui Hsieh
Published: 2026-05-22 (arXiv: 2605.23458)
Project page: https://aurora-edu.github.io/one-forcing/
Code: https://github.com/Aurora-edu/One-Forcing
Abstract
AI-generated summary: One-Forcing improves one-step video generation quality and efficiency by combining the DMD objective with a GAN loss, achieving state-of-the-art results with reduced training costs.
Recent advances have substantially improved real-time interactive video generation in the autoregressive regime. However, most existing few-step autoregressive video generation methods, which are often distilled from a corresponding many-step teacher, default to a 4-step sampling configuration. This configuration still incurs considerable latency during deployment, and quality degrades severely when the number of sampling steps is reduced further, particularly in the one-step setting. Trajectory-style consistency distillation methods often produce videos with weak dynamics, while approaches based on distribution matching distillation (DMD), such as Self-Forcing, tend to yield blurry frames. To address this challenge, we propose One-Forcing, a simple yet effective approach that augments the DMD objective with an auxiliary GAN loss for high-quality and efficient one-step video generation. Experiments on VBench show that One-Forcing achieves a total score of 83.76, establishing state-of-the-art performance among one-step causal video generation methods and remaining competitive with strong many-step approaches. We further demonstrate that one-step framewise autoregressive generation can be achieved stably at merely one-third of the training cost of the chunkwise model, a setting in which prior methods have failed.
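The abstract says only that the DMD objective is augmented with an auxiliary GAN loss; it does not give the exact form of either term or how they are weighted. The following is a minimal sketch of how such a combined generator objective is commonly written, not the authors' implementation. The function names, the non-saturating GAN loss, the squared-error DMD surrogate, and the `gan_weight` value are all assumptions made for illustration.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dmd_surrogate_loss(sample, real_score, fake_score):
    """Scalar surrogate for a DMD-style update (illustrative assumption).

    DMD pushes generated samples along (fake_score - real_score). A common
    trick is to regress the sample onto a fixed target
    `sample - (fake_score - real_score)`, with the scores held constant.
    The value of that squared-error surrogate is half the mean squared
    score difference, which is what this function returns.
    """
    diffs = [f - r for f, r in zip(fake_score, real_score)]
    return 0.5 * sum(d * d for d in diffs) / len(diffs)


def generator_loss(dmd_loss, fake_logits, gan_weight=0.1):
    """DMD loss plus a weighted non-saturating GAN generator loss.

    `fake_logits` are hypothetical discriminator outputs on generated
    frames; `gan_weight` is a placeholder, not a value from the paper.
    """
    gan_loss = sum(softplus(-logit) for logit in fake_logits) / len(fake_logits)
    return dmd_loss + gan_weight * gan_loss


if __name__ == "__main__":
    # Toy numbers only; a real setup would use model outputs on video latents.
    dmd = dmd_surrogate_loss(
        sample=[1.0, 2.0], real_score=[0.5, 0.5], fake_score=[1.5, -0.5]
    )
    total = generator_loss(dmd, fake_logits=[0.0, 0.0], gan_weight=0.5)
    print(f"dmd={dmd:.4f} total={total:.4f}")
```

In this sketch the DMD term carries the distribution-matching signal, while the GAN term rewards generated frames that the discriminator scores as real. In practice both terms would be computed with an autodiff framework on generated video latents rather than on Python lists.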