Abstract
Asymmetric Flow Modeling enables efficient high-dimensional flow-based generation by restricting noise prediction to low-rank subspaces while maintaining full-dimensional data prediction, achieving superior performance in pixel-space text-to-image generation through effective fine-tuning from latent models.
AI-generated summary
Flow-based generation in high-dimensional spaces is difficult because velocity prediction requires modeling high-dimensional noise, even when data has strong low-rank structure. We present Asymmetric Flow Modeling (AsymFlow), a rank-asymmetric velocity parameterization that restricts noise prediction to a low-rank subspace while keeping data prediction full-dimensional. From this asymmetric prediction, AsymFlow analytically recovers the full-dimensional velocity without changing the network architecture or training/sampling procedures. On ImageNet 256×256, AsymFlow achieves a leading 1.57 FID, outperforming prior DiT/JiT-like pixel diffusion models by a large margin. AsymFlow also provides the first-ever route for finetuning pretrained latent flow models into pixel-space models: aligning the low-rank pixel subspace to the latent space gives a seamless initialization that preserves the latent model's high-level semantics and structure, so finetuning mainly improves low-level mismatches rather than relearning pixel generation. We show that the pixel AsymFlow model finetuned from FLUX.2 klein 9B establishes a new state of the art for pixel-space text-to-image generation, beating its latent base on HPSv3, DPG-Bench, and GenEval while qualitatively showing substantially improved visual realism.
Community
JiT x0-prediction is not enough for pixel generation. AsymFlow introduces rank-asymmetric flow parameterization for scalable pixel generation.
Core Method
Velocity prediction has a data term and a noise term. AsymFlow makes them rank-asymmetric:
- Data term is full-dimensional
- Noise term is in a low-rank subspace
The full-dimensional velocity is then recovered analytically from the asymmetric prediction, so standard flow-matching training and sampling apply unchanged.
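The idea can be illustrated with a minimal numpy sketch. Assume the standard flow-matching interpolation x_t = (1 − t)·x₀ + t·ε, whose velocity target is v = ε − x₀. The basis `U` and the split into a full-rank data prediction plus low-rank noise coefficients are illustrative assumptions here, not the paper's exact formulation (AsymFlow derives its subspace differently, e.g. aligned to a latent space when finetuning):

```python
import numpy as np

rng = np.random.default_rng(0)
D, r = 64, 8  # full dimension, rank of the noise subspace

# Hypothetical orthonormal basis U (D x r) spanning the low-rank
# subspace in which the noise term is predicted.
U, _ = np.linalg.qr(rng.standard_normal((D, r)))

def asym_velocity(x0_hat, c_hat):
    """Analytically recover the full-dimensional velocity from the
    asymmetric prediction: a full-rank data term x0_hat of shape (D,)
    and low-rank noise coefficients c_hat of shape (r,)."""
    eps_hat = U @ c_hat        # lift noise coefficients back to D dims
    return eps_hat - x0_hat    # v = eps - x0 for x_t = (1-t)*x0 + t*eps

# Toy check: when the noise truly lies in the subspace, the recovered
# velocity matches the flow-matching target exactly.
x0 = rng.standard_normal(D)
c = rng.standard_normal(r)
eps = U @ c                    # noise restricted to the subspace
v = asym_velocity(x0, c)
assert np.allclose(v, eps - x0)
```

Because the recovery is a fixed linear map, an ODE sampler can consume `v` exactly as it would a directly predicted velocity, which is why training and sampling procedures need no changes.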
State-of-the-Art Results
- 1.57 FID on ImageNet 256×256 (best among pixel-space flow models)
- Finetuning FLUX.2 klein into pixel space beats the original latent model on HPSv3, DPG-Bench, and GenEval (#1 overall on HPSv3)
