**charchits7:** Amazing! Thanks, team.

**MaziyarPanahi:** Amazing, great work! 👏
Is there support for multi-GPU inference (`device_map="auto"`)?

**sayakpaul:** You should be able to incorporate that in different forms. Check this out:
https://huggingface.co/docs/diffusers/main/en/training/distributed_inference

**MaziyarPanahi:** Beautiful! Thank you, will try it today!
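For later readers, the linked docs cover several strategies. The simplest is pipeline-level device placement; note that diffusers pipelines take `device_map="balanced"` rather than `"auto"`. A minimal, untested sketch under that assumption:

```python
import torch
from diffusers import Flux2Pipeline

# Spread the pipeline components (text encoder, transformer, VAE) across
# all visible GPUs. "balanced" is the strategy diffusers pipelines accept.
pipe = Flux2Pipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev",
    torch_dtype=torch.bfloat16,
    device_map="balanced",
)

image = pipe(
    prompt="A cozy cabin in a snowy forest at dusk",
    num_inference_steps=28,
    guidance_scale=4,
).images[0]
image.save("flux2_multi_gpu.png")
```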
**muntedslunt:** lol... nope

**sayakpaul:** What's that supposed to mean?

**NhuGiap:** Hi, can you tell me a bit about the motivation behind omitting all bias parameters in the network architecture? Thanks!

**sayakpaul:** That's a question for the Black Forest Labs team, not us.
has been hidden","updatedAt":"2025-12-26T23:00:07.845Z"},"numEdits":0,"editors":[],"editorAvatarUrls":[],"reactions":[]}},{"id":"692727cacfcedf38b072c769","author":{"_id":"68c00ad35db4872ee259c823","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/rwnTddduiJMd0c0QUXCb1.png","fullname":"Guilherme Sabino Vaz","name":"guilhermevaz","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2025-11-26T16:16:10.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Amazing work! Can you tell me when the depth-maps model will be released?\nHas anyone already tried giving a depth map as a normal image? How does the model behave?","html":"<p>Amazing work! Can you tell me when the depth-maps model will be released?<br>Has anyone already tried giving a depth map as a normal image? How does the model behave?</p>\n","updatedAt":"2025-11-26T16:16:10.603Z","author":{"_id":"68c00ad35db4872ee259c823","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/rwnTddduiJMd0c0QUXCb1.png","fullname":"Guilherme Sabino Vaz","name":"guilhermevaz","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.9621202945709229},"editors":["guilhermevaz"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/rwnTddduiJMd0c0QUXCb1.png"],"reactions":[],"isReport":false}},{"id":"692746e2cfcedf38b072c77f","author":{"_id":"683d1665e41c42facedbcaf8","avatarUrl":"/avatars/3c1daed6469b74f67acf9606172bf974.svg","fullname":"Sven Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2025-11-26T18:28:50.000Z","type":"comment","data":{"edited":true,"hidden":false,"latest":{"raw":"https://github.com/huggingface/blog/blob/main/flux-2.md?plain=1#L283 causes\n```\n transformer_id, subfolder=\"transformer\", torch_dtype=torch_dtype, device_map=\"cpu\"\n ^^^^^^^^^^^^^^\nNameError: name 'transformer_id' is not defined\n```","html":"<p><a href=\"https://github.com/huggingface/blog/blob/main/flux-2.md?plain=1#L283\" rel=\"nofollow\">https://github.com/huggingface/blog/blob/main/flux-2.md?plain=1#L283</a> causes</p>\n<pre><code> transformer_id, subfolder=\"transformer\", torch_dtype=torch_dtype, device_map=\"cpu\"\n ^^^^^^^^^^^^^^\nNameError: name 'transformer_id' is not defined\n</code></pre>\n","updatedAt":"2025-11-27T03:45:29.131Z","author":{"_id":"683d1665e41c42facedbcaf8","avatarUrl":"/avatars/3c1daed6469b74f67acf9606172bf974.svg","fullname":"Sven Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":1,"identifiedLanguage":{"language":"en","probability":0.47359123826026917},"editors":["sonic74"],"editorAvatarUrls":["/avatars/3c1daed6469b74f67acf9606172bf974.svg"],"reactions":[],"isReport":false},"replies":[{"id":"6927c3288b9bc560603d668d","author":{"_id":"608aabf24955d2bfc3cd99c6","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg","fullname":"Aritra Roy Gosthipaty","name":"ariG23498","type":"user","isPro":true,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":645,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1583856921041-5dd96eb166059660ed1ee413.png","fullname":"Hugging 
Face","name":"huggingface","type":"org","isHf":true,"details":"The AI community building the future.","plan":"team"}},"createdAt":"2025-11-27T03:19:04.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Probably an installation error?\n\n`pip install git+https://github.com/huggingface/diffusers -U` should help you with this.","html":"<p>Probably an installation error?</p>\n<p><code>pip install git+https://github.com/huggingface/diffusers -U</code> should help you with this.</p>\n","updatedAt":"2025-11-27T03:19:04.126Z","author":{"_id":"608aabf24955d2bfc3cd99c6","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg","fullname":"Aritra Roy Gosthipaty","name":"ariG23498","type":"user","isPro":true,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":645,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1583856921041-5dd96eb166059660ed1ee413.png","fullname":"Hugging Face","name":"huggingface","type":"org","isHf":true,"details":"The AI community building the future.","plan":"team"}}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.8660337924957275},"editors":["ariG23498"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg"],"reactions":[],"isReport":false,"parentCommentId":"692746e2cfcedf38b072c77f"}},{"id":"6927c39309f9a13f42877020","author":{"_id":"683d1665e41c42facedbcaf8","avatarUrl":"/avatars/3c1daed6469b74f67acf9606172bf974.svg","fullname":"Sven Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2025-11-27T03:20:51.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"No, it's an undefined variable in line 30 of the `enable_group_offload` example.","html":"<p>No, it's an undefined variable in line 30 of the <code>enable_group_offload</code> example.</p>\n","updatedAt":"2025-11-27T03:20:51.421Z","author":{"_id":"683d1665e41c42facedbcaf8","avatarUrl":"/avatars/3c1daed6469b74f67acf9606172bf974.svg","fullname":"Sven Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.6889880299568176},"editors":["sonic74"],"editorAvatarUrls":["/avatars/3c1daed6469b74f67acf9606172bf974.svg"],"reactions":[],"isReport":false,"parentCommentId":"692746e2cfcedf38b072c77f"}},{"id":"6927c47bf265b7e9efc9e90d","author":{"_id":"608aabf24955d2bfc3cd99c6","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg","fullname":"Aritra Roy Gosthipaty","name":"ariG23498","type":"user","isPro":true,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":645,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1583856921041-5dd96eb166059660ed1ee413.png","fullname":"Hugging Face","name":"huggingface","type":"org","isHf":true,"details":"The AI community building the future.","plan":"team"}},"createdAt":"2025-11-27T03:24:43.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Could you share a colab notebook or a github gist with the code? I could look into it that way.","html":"<p>Could you share a colab notebook or a github gist with the code? 
**sonic74:** It's just a copy of https://github.com/huggingface/blog/blob/main/flux-2.md?plain=1#L283, but here you go: https://gist.github.com/sonic74/423c03483fbc13e7fd99ac97bcec8ff8

*(a comment here was hidden)*
Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":0,"editors":[],"editorAvatarUrls":[],"reactions":[],"parentCommentId":"692746e2cfcedf38b072c77f"}},{"id":"6927cf079200762340f9130a","author":{"_id":"608aabf24955d2bfc3cd99c6","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg","fullname":"Aritra Roy Gosthipaty","name":"ariG23498","type":"user","isPro":true,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":645,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1583856921041-5dd96eb166059660ed1ee413.png","fullname":"Hugging Face","name":"huggingface","type":"org","isHf":true,"details":"The AI community building the future.","plan":"team"}},"createdAt":"2025-11-27T04:09:43.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"I think this would work: https://github.com/ariG23498/custom-inference-endpoint/blob/main/flux.2-with-remote-text-encoder.ipynb","html":"<p>I think this would work: <a href=\"https://github.com/ariG23498/custom-inference-endpoint/blob/main/flux.2-with-remote-text-encoder.ipynb\" rel=\"nofollow\">https://github.com/ariG23498/custom-inference-endpoint/blob/main/flux.2-with-remote-text-encoder.ipynb</a></p>\n","updatedAt":"2025-11-27T04:09:43.090Z","author":{"_id":"608aabf24955d2bfc3cd99c6","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg","fullname":"Aritra Roy Gosthipaty","name":"ariG23498","type":"user","isPro":true,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":645,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1583856921041-5dd96eb166059660ed1ee413.png","fullname":"Hugging Face","name":"huggingface","type":"org","isHf":true,"details":"The AI community building the future.","plan":"team"}}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.8133991956710815},"editors":["ariG23498"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg"],"reactions":[],"isReport":false,"parentCommentId":"692746e2cfcedf38b072c77f"}},{"id":"6927d1cdf265b7e9efc9e912","author":{"_id":"683d1665e41c42facedbcaf8","avatarUrl":"/avatars/3c1daed6469b74f67acf9606172bf974.svg","fullname":"Sven Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2025-11-27T04:21:33.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"I hoped `enable_group_offload` would speed it up a bit more than\n```\n>python3 flux.2-with-remote-text-encoder.py\ntorch.__version__='2.9.1+cu130'\ndiffusers.__version__='0.36.0.dev0'\nUsing GPU: NVIDIA GeForce RTX 5070 Ti\nTotal VRAM: 15 GBs\nRunning remote text encoder ☁️\nDone ✅\n 2%|███▏ | 1/50 [02:29<2:02:25, 149.92s/it]\n```","html":"<p>I hoped <code>enable_group_offload</code> would speed it up a bit more than</p>\n<pre><code>>python3 flux.2-with-remote-text-encoder.py\ntorch.__version__='2.9.1+cu130'\ndiffusers.__version__='0.36.0.dev0'\nUsing GPU: NVIDIA GeForce RTX 5070 Ti\nTotal VRAM: 15 GBs\nRunning remote text encoder ☁️\nDone ✅\n 2%|███▏ | 1/50 [02:29<2:02:25, 
149.92s/it]\n</code></pre>\n","updatedAt":"2025-11-27T04:21:33.019Z","author":{"_id":"683d1665e41c42facedbcaf8","avatarUrl":"/avatars/3c1daed6469b74f67acf9606172bf974.svg","fullname":"Sven Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.5549377799034119},"editors":["sonic74"],"editorAvatarUrls":["/avatars/3c1daed6469b74f67acf9606172bf974.svg"],"reactions":[],"isReport":false,"parentCommentId":"692746e2cfcedf38b072c77f"}},{"id":"6927d2b6d00cb5a9bf99ed5c","author":{"_id":"608aabf24955d2bfc3cd99c6","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg","fullname":"Aritra Roy Gosthipaty","name":"ariG23498","type":"user","isPro":true,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":645,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1583856921041-5dd96eb166059660ed1ee413.png","fullname":"Hugging Face","name":"huggingface","type":"org","isHf":true,"details":"The AI community building the future.","plan":"team"}},"createdAt":"2025-11-27T04:25:26.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"I don't think group offload speeds things up. It is meant to be a trick to reduce memory usage. The trade offs here should be noted.","html":"<p>I don't think group offload speeds things up. It is meant to be a trick to reduce memory usage. The trade offs here should be noted.</p>\n","updatedAt":"2025-11-27T04:25:26.510Z","author":{"_id":"608aabf24955d2bfc3cd99c6","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg","fullname":"Aritra Roy Gosthipaty","name":"ariG23498","type":"user","isPro":true,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":645,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1583856921041-5dd96eb166059660ed1ee413.png","fullname":"Hugging Face","name":"huggingface","type":"org","isHf":true,"details":"The AI community building the future.","plan":"team"}}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.9854326844215393},"editors":["ariG23498"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg"],"reactions":[],"isReport":false,"parentCommentId":"692746e2cfcedf38b072c77f"}},{"id":"6927d3928b9bc560603d6692","author":{"_id":"683d1665e41c42facedbcaf8","avatarUrl":"/avatars/3c1daed6469b74f67acf9606172bf974.svg","fullname":"Sven Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2025-11-27T04:29:06.000Z","type":"comment","data":{"edited":true,"hidden":false,"latest":{"raw":"I wonder what ComfyUI uses. It inferences in about a minute - even with fp8 and a local text encoder.","html":"<p>I wonder what ComfyUI uses. 
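The memory-versus-speed trade-off is easy to put numbers on. A generic instrumentation sketch, assuming a `pipe` and `prompt` like the ones elsewhere in this thread:

```python
import time
import torch

# Reset the peak-memory counter, time one generation, and report both.
torch.cuda.reset_peak_memory_stats()
start = time.perf_counter()

image = pipe(prompt=prompt, num_inference_steps=28, guidance_scale=4).images[0]

elapsed = time.perf_counter() - start
peak_gb = torch.cuda.max_memory_allocated() / 1024**3
print(f"{elapsed:.1f} s per image, peak VRAM {peak_gb:.2f} GB")
```

Running this once with group offloading enabled and once without shows how much VRAM the offload buys and how much wall-clock time it costs.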
**OzzyGT:** This is just a typo: change `transformer_id` to `repo_id`.

**sonic74:** I already did, but wasn't sure, because then I got

```
torch.AcceleratorError: CUDA error: an illegal memory access was encountered
Search for `cudaErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
```
Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":2,"identifiedLanguage":{"language":"en","probability":0.8433528542518616},"editors":["sonic74"],"editorAvatarUrls":["/avatars/3c1daed6469b74f67acf9606172bf974.svg"],"reactions":[],"isReport":false,"parentCommentId":"692746e2cfcedf38b072c77f"}},{"id":"6928717ca4a9a86eb355d157","author":{"_id":"63df091910678851bb0cd0e0","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63df091910678851bb0cd0e0/FUXFt0C-rUFSppIAu5ZDN.png","fullname":"Alvaro Somoza","name":"OzzyGT","type":"user","isPro":false,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":241,"isUserFollowing":false},"createdAt":"2025-11-27T15:42:52.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"do you have 32GB of free RAM? group offloading with cuda streams needs a lot of RAM but it's fast, anyway, here is really hard to give help, if you still have problems, please open an issue in the diffusers repo with the code and the error you're getting. ","html":"<p>do you have 32GB of free RAM? group offloading with cuda streams needs a lot of RAM but it's fast, anyway, here is really hard to give help, if you still have problems, please open an issue in the diffusers repo with the code and the error you're getting. </p>\n","updatedAt":"2025-11-27T15:42:52.158Z","author":{"_id":"63df091910678851bb0cd0e0","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63df091910678851bb0cd0e0/FUXFt0C-rUFSppIAu5ZDN.png","fullname":"Alvaro Somoza","name":"OzzyGT","type":"user","isPro":false,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":241,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.9722614288330078},"editors":["OzzyGT"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/63df091910678851bb0cd0e0/FUXFt0C-rUFSppIAu5ZDN.png"],"reactions":[],"isReport":false,"parentCommentId":"692746e2cfcedf38b072c77f"}},{"id":"6928737d38380707431b32f3","author":{"_id":"683d1665e41c42facedbcaf8","avatarUrl":"/avatars/3c1daed6469b74f67acf9606172bf974.svg","fullname":"Sven Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2025-11-27T15:51:25.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"I have 96 GB RAM, and indeed ComfyUI fills it up when inferencing flux2.\nb.t.w., [there's obvious spam](https://github.com/huggingface/blog/issues/3112) in the issue tracker since 2 months.","html":"<p>I have 96 GB RAM, and indeed ComfyUI fills it up when inferencing flux2.<br>b.t.w., <a href=\"https://github.com/huggingface/blog/issues/3112\" rel=\"nofollow\">there's obvious spam</a> in the issue tracker since 2 months.</p>\n","updatedAt":"2025-11-27T15:51:25.988Z","author":{"_id":"683d1665e41c42facedbcaf8","avatarUrl":"/avatars/3c1daed6469b74f67acf9606172bf974.svg","fullname":"Sven 
Killig","name":"sonic74","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.8643487691879272},"editors":["sonic74"],"editorAvatarUrls":["/avatars/3c1daed6469b74f67acf9606172bf974.svg"],"reactions":[],"isReport":false,"parentCommentId":"692746e2cfcedf38b072c77f"}}]},{"id":"6927f2cacfcedf38b072c796","author":{"_id":"6304c907bad6ce7fc02764d4","avatarUrl":"/avatars/d93fae5d31c8f76e97d8bdfb3e2a0d5e.svg","fullname":"Junjie","name":"Adenialzz","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":2,"isUserFollowing":false},"createdAt":"2025-11-27T06:42:18.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Hi, How can I deploy a text encoder privately?","html":"<p>Hi, How can I deploy a text encoder privately?</p>\n","updatedAt":"2025-11-27T06:42:18.203Z","author":{"_id":"6304c907bad6ce7fc02764d4","avatarUrl":"/avatars/d93fae5d31c8f76e97d8bdfb3e2a0d5e.svg","fullname":"Junjie","name":"Adenialzz","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":2,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.9077998399734497},"editors":["Adenialzz"],"editorAvatarUrls":["/avatars/d93fae5d31c8f76e97d8bdfb3e2a0d5e.svg"],"reactions":[{"reaction":"😎","users":["ariG23498"],"count":1},{"reaction":"🔥","users":["ariG23498"],"count":1},{"reaction":"🚀","users":["ariG23498"],"count":1}],"isReport":false},"replies":[{"id":"6927f929bf5c52ab85acc2ac","author":{"_id":"63df091910678851bb0cd0e0","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63df091910678851bb0cd0e0/FUXFt0C-rUFSppIAu5ZDN.png","fullname":"Alvaro Somoza","name":"OzzyGT","type":"user","isPro":false,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":241,"isUserFollowing":false},"createdAt":"2025-11-27T07:09:29.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Hi, you can read this [nice repo](https://github.com/ariG23498/custom-inference-endpoint) with the process that @ariG23498 made","html":"<p>Hi, you can read this <a href=\"https://github.com/ariG23498/custom-inference-endpoint\" rel=\"nofollow\">nice repo</a> with the process that <span class=\"SVELTE_PARTIAL_HYDRATER contents\" data-target=\"UserMention\" data-props=\"{"user":"ariG23498"}\"><span class=\"inline-block\"><span class=\"contents\"><a href=\"/ariG23498\">@<span class=\"underline\">ariG23498</span></a></span> </span></span> made</p>\n","updatedAt":"2025-11-27T07:09:29.107Z","author":{"_id":"63df091910678851bb0cd0e0","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63df091910678851bb0cd0e0/FUXFt0C-rUFSppIAu5ZDN.png","fullname":"Alvaro 
Somoza","name":"OzzyGT","type":"user","isPro":false,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":241,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.8557339906692505},"editors":["OzzyGT"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/63df091910678851bb0cd0e0/FUXFt0C-rUFSppIAu5ZDN.png"],"reactions":[],"isReport":false,"parentCommentId":"6927f2cacfcedf38b072c796"}},{"id":"6927fa0e8dde7713575455a9","author":{"_id":"6304c907bad6ce7fc02764d4","avatarUrl":"/avatars/d93fae5d31c8f76e97d8bdfb3e2a0d5e.svg","fullname":"Junjie","name":"Adenialzz","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":2,"isUserFollowing":false},"createdAt":"2025-11-27T07:13:18.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Thanks.","html":"<p>Thanks.</p>\n","updatedAt":"2025-11-27T07:13:18.404Z","author":{"_id":"6304c907bad6ce7fc02764d4","avatarUrl":"/avatars/d93fae5d31c8f76e97d8bdfb3e2a0d5e.svg","fullname":"Junjie","name":"Adenialzz","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":2,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.8000723719596863},"editors":["Adenialzz"],"editorAvatarUrls":["/avatars/d93fae5d31c8f76e97d8bdfb3e2a0d5e.svg"],"reactions":[],"isReport":false,"parentCommentId":"6927f2cacfcedf38b072c796"}},{"id":"6927fdc12769abebd2f50121","author":{"_id":"608aabf24955d2bfc3cd99c6","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/608aabf24955d2bfc3cd99c6/-YxmtpzEmf3NKOTktODRP.jpeg","fullname":"Aritra Roy Gosthipaty","name":"ariG23498","type":"user","isPro":true,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":645,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1583856921041-5dd96eb166059660ed1ee413.png","fullname":"Hugging Face","name":"huggingface","type":"org","isHf":true,"details":"The AI community building the future.","plan":"team"}},"createdAt":"2025-11-27T07:29:05.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Also, let me know if you face any issues! You can write an issue directly in the GitHub repo itself. Would love to help 🤗","html":"<p>Also, let me know if you face any issues! You can write an issue directly in the GitHub repo itself. 
**goodasdgood:** How to run FLUX on two GPUs? Code?

**goodasdgood:** https://github.com/ayttop/xflux2gpu/blob/main/xxxxxx%20(1).ipynb

How to run it on two GPUs (2×16 GB)?

https://huggingface.co/docs/diffusers/main/en/api/parallel
https://huggingface.co/docs/diffusers/main/en/training/distributed_inference
href=\"https://huggingface.co/docs/diffusers/main/en/training/distributed_inference\">https://huggingface.co/docs/diffusers/main/en/training/distributed_inference</a></p>\n","updatedAt":"2026-04-12T00:35:14.734Z","author":{"_id":"66c0e1af466dc6770ef31414","avatarUrl":"/avatars/8672f41ed26b0e7140c0117203b0ded5.svg","fullname":"dsa","name":"goodasdgood","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":1,"identifiedLanguage":{"language":"en","probability":0.3185204863548279},"editors":["goodasdgood"],"editorAvatarUrls":["/avatars/8672f41ed26b0e7140c0117203b0ded5.svg"],"reactions":[],"isReport":false}},{"id":"69daf8dd356d7881a860bd04","author":{"_id":"66c0e1af466dc6770ef31414","avatarUrl":"/avatars/8672f41ed26b0e7140c0117203b0ded5.svg","fullname":"dsa","name":"goodasdgood","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2026-04-12T01:43:57.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"import torch\nfrom transformers import Mistral3ForConditionalGeneration\n\nfrom diffusers import Flux2Pipeline, Flux2Transformer2DModel\n\nrepo_id = \"diffusers/FLUX.2-dev-bnb-4bit\"\ndevice = \"cuda:0\"\ntorch_dtype = torch.bfloat16\n\ntransformer = Flux2Transformer2DModel.from_pretrained(\n repo_id, subfolder=\"transformer\", torch_dtype=torch_dtype, device_map=\"cpu\"\n)\ntext_encoder = Mistral3ForConditionalGeneration.from_pretrained(\n repo_id, subfolder=\"text_encoder\", dtype=torch_dtype, device_map=\"cpu\"\n)\n\npipe = Flux2Pipeline.from_pretrained(\n repo_id, transformer=transformer, text_encoder=text_encoder, torch_dtype=torch_dtype\n)\npipe.enable_model_cpu_offload()\n\nprompt = \"Realistic macro photograph of a hermit crab using a soda can as its shell, partially emerging from the can, captured with sharp detail and natural colors, on a sunlit beach with soft shadows and a shallow depth of field, with blurred ocean waves in the background. The can has the text `BFL Diffusers` on it and it has a color gradient that start with #FF5733 at the top and transitions to #33FF57 at the bottom.\"\n\nimage = pipe(\n prompt=prompt,\n generator=torch.Generator(device=device).manual_seed(42),\n num_inference_steps=50, # 28 is a good trade-off\n guidance_scale=4,\n).images[0]\n\nimage.save(\"flux2_t2i_nf4.png\")\n\n\n\n\n\n Flax classes are deprecated and will be removed in Diffusers v1.0.0. We recommend migrating to PyTorch classes or pinning your version of Diffusers.\nFlax classes are deprecated and will be removed in Diffusers v1.0.0. We recommend migrating to PyTorch classes or pinning your version of Diffusers.\n/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_validators.py:206: UserWarning: The `local_dir_use_symlinks` argument is deprecated and ignored in `hf_hub_download`. Downloading to a local directory does not use symlinks anymore.\n warnings.warn(\nDownload complete: 0.00/0.00 [00:00<?, ?B/s]Fetching 2 files: 100% 2/2 [00:00<00:00, 98.80it/s]Loading checkpoint shards: 100% 2/2 [00:01<00:00, 2.03it/s]Download complete: 0.00/0.00 [00:00<?, ?B/s]Fetching 4 files: 100% 4/4 [00:00<00:00, 167.57it/s]Loading weights: 100% 585/585 [00:02<00:00, 310.91it/s, Materializing param=model.vision_tower.transformer.layers.23.ffn_norm.weight]The tied weights mapping and config for this model specifies to tie model.language_model.embed_tokens.weight to lm_head.weight, but both are present in the checkpoints, so we will NOT tie them. 
You should update the config with `tie_word_embeddings=False` to silence this warning\nLoading pipeline components...: 100% 5/5 [00:03<00:00, 1.14it/s]---------------------------------------------------------------------------\nOutOfMemoryError Traceback (most recent call last)\n/tmp/ipykernel_18947/863729753.py in <cell line: 0>()\n 22 prompt = \"Realistic macro photograph of a hermit crab using a soda can as its shell, partially emerging from the can, captured with sharp detail and natural colors, on a sunlit beach with soft shadows and a shallow depth of field, with blurred ocean waves in the background. The can has the text `BFL Diffusers` on it and it has a color gradient that start with #FF5733 at the top and transitions to #33FF57 at the bottom.\"\n 23 \n---> 24 image = pipe(\n 25 prompt=prompt,\n 26 generator=torch.Generator(device=device).manual_seed(42),\n\n36 frames/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py in decorate_context(*args, **kwargs)\n 122 # pyrefly: ignore [bad-context-manager]\n 123 with ctx_factory():\n--> 124 return func(*args, **kwargs)\n 125 \n 126 return decorate_context\n\n/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/flux2/pipeline_flux2.py in __call__(self, image, prompt, height, width, num_inference_steps, sigmas, guidance_scale, num_images_per_prompt, generator, latents, prompt_embeds, output_type, return_dict, attention_kwargs, callback_on_step_end, callback_on_step_end_tensor_inputs, max_sequence_length, text_encoder_out_layers, caption_upsample_temperature)\n 869 prompt, images=image, temperature=caption_upsample_temperature, device=device\n 870 )\n--> 871 prompt_embeds, text_ids = self.encode_prompt(\n 872 prompt=prompt,\n 873 prompt_embeds=prompt_embeds,\n\n/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/flux2/pipeline_flux2.py in encode_prompt(self, prompt, device, num_images_per_prompt, prompt_embeds, max_sequence_length, text_encoder_out_layers)\n 586 \n 587 if prompt_embeds is None:\n--> 588 prompt_embeds = self._get_mistral_3_small_prompt_embeds(\n 589 text_encoder=self.text_encoder,\n 590 tokenizer=self.tokenizer,\n\n/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/flux2/pipeline_flux2.py in _get_mistral_3_small_prompt_embeds(text_encoder, tokenizer, prompt, dtype, device, max_sequence_length, system_message, hidden_states_layers)\n 337 \n 338 # Forward pass through the model\n--> 339 output = text_encoder(\n 340 input_ids=input_ids,\n 341 attention_mask=attention_mask,\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)\n 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]\n 1775 else:\n-> 1776 return self._call_impl(*args, **kwargs)\n 1777 \n 1778 # torchrec tests the code consistency with the following code\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)\n 1785 or _global_backward_pre_hooks or _global_backward_hooks\n 1786 or _global_forward_hooks or _global_forward_pre_hooks):\n-> 1787 return forward_call(*args, **kwargs)\n 1788 \n 1789 result = None\n\n/usr/local/lib/python3.12/dist-packages/accelerate/hooks.py in new_forward(module, *args, **kwargs)\n 190 output = module._old_forward(*args, **kwargs)\n 191 else:\n--> 192 output = module._old_forward(*args, **kwargs)\n 193 return module._hf_hook.post_forward(module, output)\n 194 \n\n/usr/local/lib/python3.12/dist-packages/transformers/utils/generic.py in wrapper(self, *args, **kwargs)\n 
1000 outputs = func(self, *args, **kwargs)\n 1001 else:\n-> 1002 outputs = func(self, *args, **kwargs)\n 1003 except TypeError as original_exception:\n 1004 # If we get a TypeError, it's possible that the model is not receiving the recordable kwargs correctly.\n\n/usr/local/lib/python3.12/dist-packages/transformers/models/mistral3/modeling_mistral3.py in forward(self, input_ids, pixel_values, attention_mask, position_ids, past_key_values, inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict, cache_position, logits_to_keep, image_sizes, **kwargs)\n 444 return_dict = return_dict if return_dict is not None else self.config.use_return_dict\n 445 \n--> 446 outputs = self.model(\n 447 input_ids=input_ids,\n 448 pixel_values=pixel_values,\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)\n 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]\n 1775 else:\n-> 1776 return self._call_impl(*args, **kwargs)\n 1777 \n 1778 # torchrec tests the code consistency with the following code\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)\n 1785 or _global_backward_pre_hooks or _global_backward_hooks\n 1786 or _global_forward_hooks or _global_forward_pre_hooks):\n-> 1787 return forward_call(*args, **kwargs)\n 1788 \n 1789 result = None\n\n/usr/local/lib/python3.12/dist-packages/transformers/utils/generic.py in wrapper(self, *args, **kwargs)\n 1000 outputs = func(self, *args, **kwargs)\n 1001 else:\n-> 1002 outputs = func(self, *args, **kwargs)\n 1003 except TypeError as original_exception:\n 1004 # If we get a TypeError, it's possible that the model is not receiving the recordable kwargs correctly.\n\n/usr/local/lib/python3.12/dist-packages/transformers/models/mistral3/modeling_mistral3.py in forward(self, input_ids, pixel_values, attention_mask, position_ids, past_key_values, inputs_embeds, vision_feature_layer, use_cache, output_attentions, output_hidden_states, return_dict, cache_position, image_sizes, **kwargs)\n 323 inputs_embeds = inputs_embeds.masked_scatter(special_image_mask, image_features)\n 324 \n--> 325 outputs = self.language_model(\n 326 attention_mask=attention_mask,\n 327 position_ids=position_ids,\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)\n 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]\n 1775 else:\n-> 1776 return self._call_impl(*args, **kwargs)\n 1777 \n 1778 # torchrec tests the code consistency with the following code\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)\n 1785 or _global_backward_pre_hooks or _global_backward_hooks\n 1786 or _global_forward_hooks or _global_forward_pre_hooks):\n-> 1787 return forward_call(*args, **kwargs)\n 1788 \n 1789 result = None\n\n/usr/local/lib/python3.12/dist-packages/transformers/utils/generic.py in wrapper(self, *args, **kwargs)\n 1000 outputs = func(self, *args, **kwargs)\n 1001 else:\n-> 1002 outputs = func(self, *args, **kwargs)\n 1003 except TypeError as original_exception:\n 1004 # If we get a TypeError, it's possible that the model is not receiving the recordable kwargs correctly.\n\n/usr/local/lib/python3.12/dist-packages/transformers/models/mistral/modeling_mistral.py in forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, use_cache, cache_position, **kwargs)\n 395 \n 396 
for decoder_layer in self.layers[: self.config.num_hidden_layers]:\n--> 397 hidden_states = decoder_layer(\n 398 hidden_states,\n 399 attention_mask=causal_mask,\n\n/usr/local/lib/python3.12/dist-packages/transformers/modeling_layers.py in __call__(self, *args, **kwargs)\n 91 \n 92 return self._gradient_checkpointing_func(partial(super().__call__, **kwargs), *args)\n---> 93 return super().__call__(*args, **kwargs)\n 94 \n 95 \n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)\n 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]\n 1775 else:\n-> 1776 return self._call_impl(*args, **kwargs)\n 1777 \n 1778 # torchrec tests the code consistency with the following code\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)\n 1785 or _global_backward_pre_hooks or _global_backward_hooks\n 1786 or _global_forward_hooks or _global_forward_pre_hooks):\n-> 1787 return forward_call(*args, **kwargs)\n 1788 \n 1789 result = None\n\n/usr/local/lib/python3.12/dist-packages/transformers/utils/generic.py in wrapped_forward(*args, **kwargs)\n 953 if key == \"hidden_states\" and len(collected_outputs[key]) == 0:\n 954 collected_outputs[key] += (args[0],)\n--> 955 output = orig_forward(*args, **kwargs)\n 956 if not isinstance(output, tuple):\n 957 collected_outputs[key] += (output,)\n\n/usr/local/lib/python3.12/dist-packages/transformers/models/mistral/modeling_mistral.py in forward(self, hidden_states, attention_mask, position_ids, past_key_values, use_cache, cache_position, position_embeddings, **kwargs)\n 228 hidden_states = self.input_layernorm(hidden_states)\n 229 # Self Attention\n--> 230 hidden_states, _ = self.self_attn(\n 231 hidden_states=hidden_states,\n 232 attention_mask=attention_mask,\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)\n 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]\n 1775 else:\n-> 1776 return self._call_impl(*args, **kwargs)\n 1777 \n 1778 # torchrec tests the code consistency with the following code\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)\n 1785 or _global_backward_pre_hooks or _global_backward_hooks\n 1786 or _global_forward_hooks or _global_forward_pre_hooks):\n-> 1787 return forward_call(*args, **kwargs)\n 1788 \n 1789 result = None\n\n/usr/local/lib/python3.12/dist-packages/transformers/models/mistral/modeling_mistral.py in forward(self, hidden_states, position_embeddings, attention_mask, past_key_values, cache_position, **kwargs)\n 151 hidden_shape = (*input_shape, -1, self.head_dim)\n 152 \n--> 153 query_states = self.q_proj(hidden_states).view(hidden_shape).transpose(1, 2)\n 154 key_states = self.k_proj(hidden_states).view(hidden_shape).transpose(1, 2)\n 155 value_states = self.v_proj(hidden_states).view(hidden_shape).transpose(1, 2)\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)\n 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]\n 1775 else:\n-> 1776 return self._call_impl(*args, **kwargs)\n 1777 \n 1778 # torchrec tests the code consistency with the following code\n\n/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)\n 1785 or _global_backward_pre_hooks or _global_backward_hooks\n 1786 or _global_forward_hooks or 
_global_forward_pre_hooks):\n-> 1787 return forward_call(*args, **kwargs)\n 1788 \n 1789 result = None\n\n/usr/local/lib/python3.12/dist-packages/bitsandbytes/nn/modules.py in forward(self, x)\n 554 weight = self.weight if getattr(quant_state, \"packing_format_for_cpu\", False) else self.weight.t()\n 555 \n--> 556 return bnb.matmul_4bit(x, weight, bias=bias, quant_state=quant_state).to(inp_dtype)\n 557 \n 558 \n\n/usr/local/lib/python3.12/dist-packages/bitsandbytes/autograd/_functions.py in matmul_4bit(A, B, quant_state, out, bias)\n 399 return out\n 400 else:\n--> 401 return MatMul4Bit.apply(A, B, out, bias, quant_state)\n\n/usr/local/lib/python3.12/dist-packages/torch/autograd/function.py in apply(cls, *args, **kwargs)\n 581 # See NOTE: [functorch vjp and autograd interaction]\n 582 args = _functorch.utils.unwrap_dead_wrappers(args)\n--> 583 return super().apply(*args, **kwargs) # type: ignore[misc]\n 584 \n 585 if not is_setup_ctx_defined:\n\n/usr/local/lib/python3.12/dist-packages/bitsandbytes/autograd/_functions.py in forward(ctx, A, B, out, bias, quant_state)\n 313 # 1. Dequantize\n 314 # 2. MatmulnN\n--> 315 output = torch.nn.functional.linear(A, F.dequantize_4bit(B, quant_state).to(A.dtype).t(), bias)\n 316 \n 317 # 3. Save state\n\n/usr/local/lib/python3.12/dist-packages/bitsandbytes/functional.py in dequantize_4bit(A, quant_state, absmax, out, blocksize, quant_type)\n 1048 )\n 1049 else:\n-> 1050 out = torch.ops.bitsandbytes.dequantize_4bit.default(\n 1051 A,\n 1052 absmax,\n\n/usr/local/lib/python3.12/dist-packages/torch/_ops.py in __call__(self, *args, **kwargs)\n 817 # that are named \"self\". This way, all the aten ops can be called by kwargs.\n 818 def __call__(self, /, *args: _P.args, **kwargs: _P.kwargs) -> _T:\n--> 819 return self._op(*args, **kwargs)\n 820 \n 821 # Use positional-only argument to avoid naming collision with aten ops arguments\n\n/usr/local/lib/python3.12/dist-packages/torch/_compile.py in inner(*args, **kwargs)\n 52 fn.__dynamo_disable = disable_fn # type: ignore[attr-defined]\n 53 \n---> 54 return disable_fn(*args, **kwargs)\n 55 \n 56 return inner\n\n/usr/local/lib/python3.12/dist-packages/torch/_dynamo/eval_frame.py in _fn(*args, **kwargs)\n 1179 ):\n 1180 return fn(*args, **kwargs)\n-> 1181 return fn(*args, **kwargs)\n 1182 finally:\n 1183 set_eval_frame(None)\n\n/usr/local/lib/python3.12/dist-packages/torch/library.py in func_no_dynamo(*args, **kwargs)\n 740 @torch._disable_dynamo\n 741 def func_no_dynamo(*args, **kwargs):\n--> 742 return func(*args, **kwargs)\n 743 \n 744 for key in keys:\n\n/usr/local/lib/python3.12/dist-packages/bitsandbytes/backends/cuda/ops.py in _(A, absmax, blocksize, quant_type, shape, dtype)\n 361 dtype: torch.dtype,\n 362 ) -> torch.Tensor:\n--> 363 out = torch.empty(shape, dtype=dtype, device=A.device)\n 364 _dequantize_4bit_impl(A, absmax, blocksize, quant_type, dtype, out=out)\n 365 return out\n\nOutOfMemoryError: CUDA out of memory. Tried to allocate 40.00 MiB. GPU 0 has a total capacity of 14.56 GiB of which 17.81 MiB is free. Including non-PyTorch memory, this process has 14.54 GiB memory in use. Of the allocated memory 14.40 GiB is allocated by PyTorch, and 15.19 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_ALLOC_CONF=expandable_segments:True to avoid fragmentation. 
See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)","html":"<p>import torch<br>from transformers import Mistral3ForConditionalGeneration</p>\n<p>from diffusers import Flux2Pipeline, Flux2Transformer2DModel</p>\n<p>repo_id = \"diffusers/FLUX.2-dev-bnb-4bit\"<br>device = \"cuda:0\"<br>torch_dtype = torch.bfloat16</p>\n<p>transformer = Flux2Transformer2DModel.from_pretrained(<br> repo_id, subfolder=\"transformer\", torch_dtype=torch_dtype, device_map=\"cpu\"<br>)<br>text_encoder = Mistral3ForConditionalGeneration.from_pretrained(<br> repo_id, subfolder=\"text_encoder\", dtype=torch_dtype, device_map=\"cpu\"<br>)</p>\n<p>pipe = Flux2Pipeline.from_pretrained(<br> repo_id, transformer=transformer, text_encoder=text_encoder, torch_dtype=torch_dtype<br>)<br>pipe.enable_model_cpu_offload()</p>\n<p>prompt = \"Realistic macro photograph of a hermit crab using a soda can as its shell, partially emerging from the can, captured with sharp detail and natural colors, on a sunlit beach with soft shadows and a shallow depth of field, with blurred ocean waves in the background. The can has the text <code>BFL Diffusers</code> on it and it has a color gradient that start with #FF5733 at the top and transitions to #33FF57 at the bottom.\"</p>\n<p>image = pipe(<br> prompt=prompt,<br> generator=torch.Generator(device=device).manual_seed(42),<br> num_inference_steps=50, # 28 is a good trade-off<br> guidance_scale=4,<br>).images[0]</p>\n<p>image.save(\"flux2_t2i_nf4.png\")</p>\n<p> Flax classes are deprecated and will be removed in Diffusers v1.0.0. We recommend migrating to PyTorch classes or pinning your version of Diffusers.<br>Flax classes are deprecated and will be removed in Diffusers v1.0.0. We recommend migrating to PyTorch classes or pinning your version of Diffusers.<br>/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_validators.py:206: UserWarning: The <code>local_dir_use_symlinks</code> argument is deprecated and ignored in <code>hf_hub_download</code>. Downloading to a local directory does not use symlinks anymore.<br> warnings.warn(<br>Download complete: 0.00/0.00 [00:00<?, ?B/s]Fetching 2 files: 100% 2/2 [00:00<00:00, 98.80it/s]Loading checkpoint shards: 100% 2/2 [00:01<00:00, 2.03it/s]Download complete: 0.00/0.00 [00:00<?, ?B/s]Fetching 4 files: 100% 4/4 [00:00<00:00, 167.57it/s]Loading weights: 100% 585/585 [00:02<00:00, 310.91it/s, Materializing param=model.vision_tower.transformer.layers.23.ffn_norm.weight]The tied weights mapping and config for this model specifies to tie model.language_model.embed_tokens.weight to lm_head.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with <code>tie_word_embeddings=False</code> to silence this warning<br>Loading pipeline components...: 100% 5/5 [00:03<00:00, 1.14it/s]---------------------------------------------------------------------------<br>OutOfMemoryError Traceback (most recent call last)<br>/tmp/ipykernel_18947/863729753.py in <cell line: 0>()<br> 22 prompt = \"Realistic macro photograph of a hermit crab using a soda can as its shell, partially emerging from the can, captured with sharp detail and natural colors, on a sunlit beach with soft shadows and a shallow depth of field, with blurred ocean waves in the background. 
The can has the text <code>BFL Diffusers</code> on it and it has a color gradient that start with #FF5733 at the top and transitions to #33FF57 at the bottom.\"<br> 23<br>---> 24 image = pipe(<br> 25 prompt=prompt,<br> 26 generator=torch.Generator(device=device).manual_seed(42),</p>\n<p>36 frames/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py in decorate_context(*args, **kwargs)<br> 122 # pyrefly: ignore [bad-context-manager]<br> 123 with ctx_factory():<br>--> 124 return func(*args, **kwargs)<br> 125<br> 126 return decorate_context</p>\n<p>/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/flux2/pipeline_flux2.py in <strong>call</strong>(self, image, prompt, height, width, num_inference_steps, sigmas, guidance_scale, num_images_per_prompt, generator, latents, prompt_embeds, output_type, return_dict, attention_kwargs, callback_on_step_end, callback_on_step_end_tensor_inputs, max_sequence_length, text_encoder_out_layers, caption_upsample_temperature)<br> 869 prompt, images=image, temperature=caption_upsample_temperature, device=device<br> 870 )<br>--> 871 prompt_embeds, text_ids = self.encode_prompt(<br> 872 prompt=prompt,<br> 873 prompt_embeds=prompt_embeds,</p>\n<p>/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/flux2/pipeline_flux2.py in encode_prompt(self, prompt, device, num_images_per_prompt, prompt_embeds, max_sequence_length, text_encoder_out_layers)<br> 586<br> 587 if prompt_embeds is None:<br>--> 588 prompt_embeds = self._get_mistral_3_small_prompt_embeds(<br> 589 text_encoder=self.text_encoder,<br> 590 tokenizer=self.tokenizer,</p>\n<p>/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/flux2/pipeline_flux2.py in _get_mistral_3_small_prompt_embeds(text_encoder, tokenizer, prompt, dtype, device, max_sequence_length, system_message, hidden_states_layers)<br> 337<br> 338 # Forward pass through the model<br>--> 339 output = text_encoder(<br> 340 input_ids=input_ids,<br> 341 attention_mask=attention_mask,</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)<br> 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]<br> 1775 else:<br>-> 1776 return self._call_impl(*args, **kwargs)<br> 1777<br> 1778 # torchrec tests the code consistency with the following code</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)<br> 1785 or _global_backward_pre_hooks or _global_backward_hooks<br> 1786 or _global_forward_hooks or _global_forward_pre_hooks):<br>-> 1787 return forward_call(*args, **kwargs)<br> 1788<br> 1789 result = None</p>\n<p>/usr/local/lib/python3.12/dist-packages/accelerate/hooks.py in new_forward(module, *args, **kwargs)<br> 190 output = module._old_forward(*args, **kwargs)<br> 191 else:<br>--> 192 output = module._old_forward(*args, **kwargs)<br> 193 return module._hf_hook.post_forward(module, output)<br> 194 </p>\n<p>/usr/local/lib/python3.12/dist-packages/transformers/utils/generic.py in wrapper(self, *args, **kwargs)<br> 1000 outputs = func(self, *args, **kwargs)<br> 1001 else:<br>-> 1002 outputs = func(self, *args, **kwargs)<br> 1003 except TypeError as original_exception:<br> 1004 # If we get a TypeError, it's possible that the model is not receiving the recordable kwargs correctly.</p>\n<p>/usr/local/lib/python3.12/dist-packages/transformers/models/mistral3/modeling_mistral3.py in forward(self, input_ids, pixel_values, attention_mask, position_ids, past_key_values, inputs_embeds, labels, 
use_cache, output_attentions, output_hidden_states, return_dict, cache_position, logits_to_keep, image_sizes, **kwargs)<br> 444 return_dict = return_dict if return_dict is not None else self.config.use_return_dict<br> 445<br>--> 446 outputs = self.model(<br> 447 input_ids=input_ids,<br> 448 pixel_values=pixel_values,</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)<br> 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]<br> 1775 else:<br>-> 1776 return self._call_impl(*args, **kwargs)<br> 1777<br> 1778 # torchrec tests the code consistency with the following code</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)<br> 1785 or _global_backward_pre_hooks or _global_backward_hooks<br> 1786 or _global_forward_hooks or _global_forward_pre_hooks):<br>-> 1787 return forward_call(*args, **kwargs)<br> 1788<br> 1789 result = None</p>\n<p>/usr/local/lib/python3.12/dist-packages/transformers/utils/generic.py in wrapper(self, *args, **kwargs)<br> 1000 outputs = func(self, *args, **kwargs)<br> 1001 else:<br>-> 1002 outputs = func(self, *args, **kwargs)<br> 1003 except TypeError as original_exception:<br> 1004 # If we get a TypeError, it's possible that the model is not receiving the recordable kwargs correctly.</p>\n<p>/usr/local/lib/python3.12/dist-packages/transformers/models/mistral3/modeling_mistral3.py in forward(self, input_ids, pixel_values, attention_mask, position_ids, past_key_values, inputs_embeds, vision_feature_layer, use_cache, output_attentions, output_hidden_states, return_dict, cache_position, image_sizes, **kwargs)<br> 323 inputs_embeds = inputs_embeds.masked_scatter(special_image_mask, image_features)<br> 324<br>--> 325 outputs = self.language_model(<br> 326 attention_mask=attention_mask,<br> 327 position_ids=position_ids,</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)<br> 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]<br> 1775 else:<br>-> 1776 return self._call_impl(*args, **kwargs)<br> 1777<br> 1778 # torchrec tests the code consistency with the following code</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)<br> 1785 or _global_backward_pre_hooks or _global_backward_hooks<br> 1786 or _global_forward_hooks or _global_forward_pre_hooks):<br>-> 1787 return forward_call(*args, **kwargs)<br> 1788<br> 1789 result = None</p>\n<p>/usr/local/lib/python3.12/dist-packages/transformers/utils/generic.py in wrapper(self, *args, **kwargs)<br> 1000 outputs = func(self, *args, **kwargs)<br> 1001 else:<br>-> 1002 outputs = func(self, *args, **kwargs)<br> 1003 except TypeError as original_exception:<br> 1004 # If we get a TypeError, it's possible that the model is not receiving the recordable kwargs correctly.</p>\n<p>/usr/local/lib/python3.12/dist-packages/transformers/models/mistral/modeling_mistral.py in forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, use_cache, cache_position, **kwargs)<br> 395<br> 396 for decoder_layer in self.layers[: self.config.num_hidden_layers]:<br>--> 397 hidden_states = decoder_layer(<br> 398 hidden_states,<br> 399 attention_mask=causal_mask,</p>\n<p>/usr/local/lib/python3.12/dist-packages/transformers/modeling_layers.py in <strong>call</strong>(self, *args, **kwargs)<br> 91<br> 92 return 
self._gradient_checkpointing_func(partial(super().<strong>call</strong>, **kwargs), *args)<br>---> 93 return super().<strong>call</strong>(*args, **kwargs)<br> 94<br> 95 </p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)<br> 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]<br> 1775 else:<br>-> 1776 return self._call_impl(*args, **kwargs)<br> 1777<br> 1778 # torchrec tests the code consistency with the following code</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)<br> 1785 or _global_backward_pre_hooks or _global_backward_hooks<br> 1786 or _global_forward_hooks or _global_forward_pre_hooks):<br>-> 1787 return forward_call(*args, **kwargs)<br> 1788<br> 1789 result = None</p>\n<p>/usr/local/lib/python3.12/dist-packages/transformers/utils/generic.py in wrapped_forward(*args, **kwargs)<br> 953 if key == \"hidden_states\" and len(collected_outputs[key]) == 0:<br> 954 collected_outputs[key] += (args[0],)<br>--> 955 output = orig_forward(*args, **kwargs)<br> 956 if not isinstance(output, tuple):<br> 957 collected_outputs[key] += (output,)</p>\n<p>/usr/local/lib/python3.12/dist-packages/transformers/models/mistral/modeling_mistral.py in forward(self, hidden_states, attention_mask, position_ids, past_key_values, use_cache, cache_position, position_embeddings, **kwargs)<br> 228 hidden_states = self.input_layernorm(hidden_states)<br> 229 # Self Attention<br>--> 230 hidden_states, _ = self.self_attn(<br> 231 hidden_states=hidden_states,<br> 232 attention_mask=attention_mask,</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)<br> 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]<br> 1775 else:<br>-> 1776 return self._call_impl(*args, **kwargs)<br> 1777<br> 1778 # torchrec tests the code consistency with the following code</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)<br> 1785 or _global_backward_pre_hooks or _global_backward_hooks<br> 1786 or _global_forward_hooks or _global_forward_pre_hooks):<br>-> 1787 return forward_call(*args, **kwargs)<br> 1788<br> 1789 result = None</p>\n<p>/usr/local/lib/python3.12/dist-packages/transformers/models/mistral/modeling_mistral.py in forward(self, hidden_states, position_embeddings, attention_mask, past_key_values, cache_position, **kwargs)<br> 151 hidden_shape = (*input_shape, -1, self.head_dim)<br> 152<br>--> 153 query_states = self.q_proj(hidden_states).view(hidden_shape).transpose(1, 2)<br> 154 key_states = self.k_proj(hidden_states).view(hidden_shape).transpose(1, 2)<br> 155 value_states = self.v_proj(hidden_states).view(hidden_shape).transpose(1, 2)</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)<br> 1774 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]<br> 1775 else:<br>-> 1776 return self._call_impl(*args, **kwargs)<br> 1777<br> 1778 # torchrec tests the code consistency with the following code</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)<br> 1785 or _global_backward_pre_hooks or _global_backward_hooks<br> 1786 or _global_forward_hooks or _global_forward_pre_hooks):<br>-> 1787 return forward_call(*args, **kwargs)<br> 1788<br> 1789 result = 
None</p>\n<p>/usr/local/lib/python3.12/dist-packages/bitsandbytes/nn/modules.py in forward(self, x)<br> 554 weight = self.weight if getattr(quant_state, \"packing_format_for_cpu\", False) else self.weight.t()<br> 555<br>--> 556 return bnb.matmul_4bit(x, weight, bias=bias, quant_state=quant_state).to(inp_dtype)<br> 557<br> 558 </p>\n<p>/usr/local/lib/python3.12/dist-packages/bitsandbytes/autograd/_functions.py in matmul_4bit(A, B, quant_state, out, bias)<br> 399 return out<br> 400 else:<br>--> 401 return MatMul4Bit.apply(A, B, out, bias, quant_state)</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/autograd/function.py in apply(cls, *args, **kwargs)<br> 581 # See NOTE: [functorch vjp and autograd interaction]<br> 582 args = _functorch.utils.unwrap_dead_wrappers(args)<br>--> 583 return super().apply(*args, **kwargs) # type: ignore[misc]<br> 584<br> 585 if not is_setup_ctx_defined:</p>\n<p>/usr/local/lib/python3.12/dist-packages/bitsandbytes/autograd/_functions.py in forward(ctx, A, B, out, bias, quant_state)<br> 313 # 1. Dequantize<br> 314 # 2. MatmulnN<br>--> 315 output = torch.nn.functional.linear(A, F.dequantize_4bit(B, quant_state).to(A.dtype).t(), bias)<br> 316<br> 317 # 3. Save state</p>\n<p>/usr/local/lib/python3.12/dist-packages/bitsandbytes/functional.py in dequantize_4bit(A, quant_state, absmax, out, blocksize, quant_type)<br> 1048 )<br> 1049 else:<br>-> 1050 out = torch.ops.bitsandbytes.dequantize_4bit.default(<br> 1051 A,<br> 1052 absmax,</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/_ops.py in <strong>call</strong>(self, *args, **kwargs)<br> 817 # that are named \"self\". This way, all the aten ops can be called by kwargs.<br> 818 def <strong>call</strong>(self, /, *args: _P.args, **kwargs: _P.kwargs) -> _T:<br>--> 819 return self._op(*args, **kwargs)<br> 820<br> 821 # Use positional-only argument to avoid naming collision with aten ops arguments</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/_compile.py in inner(*args, **kwargs)<br> 52 fn.__dynamo_disable = disable_fn # type: ignore[attr-defined]<br> 53<br>---> 54 return disable_fn(*args, **kwargs)<br> 55<br> 56 return inner</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/_dynamo/eval_frame.py in _fn(*args, **kwargs)<br> 1179 ):<br> 1180 return fn(*args, **kwargs)<br>-> 1181 return fn(*args, **kwargs)<br> 1182 finally:<br> 1183 set_eval_frame(None)</p>\n<p>/usr/local/lib/python3.12/dist-packages/torch/library.py in func_no_dynamo(*args, **kwargs)<br> 740 <span class=\"SVELTE_PARTIAL_HYDRATER contents\" data-target=\"UserMention\" data-props=\"{"user":"torch"}\"><span class=\"inline-block\"><span class=\"contents\"><a href=\"/torch\">@<span class=\"underline\">torch</span></a></span> </span></span>._disable_dynamo<br> 741 def func_no_dynamo(*args, **kwargs):<br>--> 742 return func(*args, **kwargs)<br> 743<br> 744 for key in keys:</p>\n<p>/usr/local/lib/python3.12/dist-packages/bitsandbytes/backends/cuda/ops.py in _(A, absmax, blocksize, quant_type, shape, dtype)<br> 361 dtype: torch.dtype,<br> 362 ) -> torch.Tensor:<br>--> 363 out = torch.empty(shape, dtype=dtype, device=A.device)<br> 364 _dequantize_4bit_impl(A, absmax, blocksize, quant_type, dtype, out=out)<br> 365 return out</p>\n<p>OutOfMemoryError: CUDA out of memory. Tried to allocate 40.00 MiB. GPU 0 has a total capacity of 14.56 GiB of which 17.81 MiB is free. Including non-PyTorch memory, this process has 14.54 GiB memory in use. Of the allocated memory 14.40 GiB is allocated by PyTorch, and 15.19 MiB is reserved by PyTorch but unallocated. 
If reserved but unallocated memory is large try setting PYTORCH_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (<a href=\"https://pytorch.org/docs/stable/notes/cuda.html#environment-variables\" rel=\"nofollow\">https://pytorch.org/docs/stable/notes/cuda.html#environment-variables</a>)</p>\n","updatedAt":"2026-04-12T01:43:57.338Z","author":{"_id":"66c0e1af466dc6770ef31414","avatarUrl":"/avatars/8672f41ed26b0e7140c0117203b0ded5.svg","fullname":"dsa","name":"goodasdgood","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.4948948323726654},"editors":["goodasdgood"],"editorAvatarUrls":["/avatars/8672f41ed26b0e7140c0117203b0ded5.svg"],"reactions":[],"isReport":false}},{"id":"69db010ea455b785a0514fef","author":{"_id":"66c0e1af466dc6770ef31414","avatarUrl":"/avatars/8672f41ed26b0e7140c0117203b0ded5.svg","fullname":"dsa","name":"goodasdgood","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2026-04-12T02:18:54.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"https://github.com/huggingface/diffusers/blob/main/docs/source/en/using-diffusers/loading.md","html":"<p><a href=\"https://github.com/huggingface/diffusers/blob/main/docs/source/en/using-diffusers/loading.md\" rel=\"nofollow\">https://github.com/huggingface/diffusers/blob/main/docs/source/en/using-diffusers/loading.md</a></p>\n","updatedAt":"2026-04-12T02:18:54.570Z","author":{"_id":"66c0e1af466dc6770ef31414","avatarUrl":"/avatars/8672f41ed26b0e7140c0117203b0ded5.svg","fullname":"dsa","name":"goodasdgood","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.5550499558448792},"editors":["goodasdgood"],"editorAvatarUrls":["/avatars/8672f41ed26b0e7140c0117203b0ded5.svg"],"reactions":[],"isReport":false}}],"status":"open","isReport":false,"pinned":false,"locked":false,"collection":"community_blogs"},"contextAuthors":["YiYiXu","dg845","sayakpaul","OzzyGT","dn6","ariG23498","linoyts","multimodalart"],"primaryEmailConfirmed":false,"discussionRole":0,"acceptLanguages":["en"],"withThread":true,"cardDisplay":false,"repoDiscussionsLocked":false}">
amazing! great work! 👏
is there support for multi-GPUs? (device_map=auto)
What's that supposed to mean?
Hi, can you tell me a bit about your motivation for omitting all bias parameters in the network architecture? Thanks!
That's a question for the Black Forest Labs team, not us.
Amazing work! Can you tell me when the depth-maps model will be released?
Has anyone already tried giving a depth map as a normal image? How does the model behave?
Probably an installation error?
pip install git+https://github.com/huggingface/diffusers -U should help you with this.
Hi, how can I deploy the text encoder privately?
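One pattern that may fit here, sketched under assumptions rather than as an official recipe: the Flux2Pipeline call signature accepts precomputed prompt_embeds, so the Mistral text encoder can run on a private host while the generation host loads the pipeline without it. The two-host split and file hand-off below are hypothetical:

import torch
from diffusers import Flux2Pipeline

repo_id = "diffusers/FLUX.2-dev-bnb-4bit"

# --- Private host: load only the text-encoder side of the pipeline ---
encoder_pipe = Flux2Pipeline.from_pretrained(
    repo_id, transformer=None, vae=None, torch_dtype=torch.bfloat16
)
encoder_pipe.enable_model_cpu_offload()
prompt_embeds, text_ids = encoder_pipe.encode_prompt(
    prompt="a photo of a cat", device="cuda"
)
torch.save(prompt_embeds.cpu(), "prompt_embeds.pt")  # hand this file over

# --- Generation host: no text encoder or tokenizer loaded at all ---
pipe = Flux2Pipeline.from_pretrained(
    repo_id, text_encoder=None, tokenizer=None, torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()
image = pipe(
    prompt_embeds=torch.load("prompt_embeds.pt").to("cuda"),
).images[0]
image.save("flux2_private_encoder.png")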
How do I run FLUX on two GPUs?
code????
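For the multi-GPU questions above: diffusers supports device placement at pipeline load time, so passing device_map="balanced" to from_pretrained lets Accelerate spread the components across all visible GPUs (device_map="auto" is the per-model transformers feature). A minimal sketch, assuming the 4-bit checkpoint used in the next post works with this placement:

import torch
from diffusers import Flux2Pipeline

# device_map="balanced" distributes the text encoder, transformer, and VAE
# over the visible GPUs; do not call .to("cuda") afterwards, since placement
# is already handled by Accelerate.
pipe = Flux2Pipeline.from_pretrained(
    "diffusers/FLUX.2-dev-bnb-4bit",
    torch_dtype=torch.bfloat16,
    device_map="balanced",
)

image = pipe(
    "a photo of a forest at dawn",
    num_inference_steps=28,
    guidance_scale=4,
).images[0]
image.save("flux2_two_gpus.png")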
import torch
from transformers import Mistral3ForConditionalGeneration
from diffusers import Flux2Pipeline, Flux2Transformer2DModel

repo_id = "diffusers/FLUX.2-dev-bnb-4bit"
device = "cuda:0"
torch_dtype = torch.bfloat16

# Load the 4-bit quantized transformer and text encoder on CPU first;
# the cpu offload below moves each onto the GPU only while it is needed.
transformer = Flux2Transformer2DModel.from_pretrained(
    repo_id, subfolder="transformer", torch_dtype=torch_dtype, device_map="cpu"
)
text_encoder = Mistral3ForConditionalGeneration.from_pretrained(
    repo_id, subfolder="text_encoder", dtype=torch_dtype, device_map="cpu"
)

pipe = Flux2Pipeline.from_pretrained(
    repo_id, transformer=transformer, text_encoder=text_encoder, torch_dtype=torch_dtype
)
pipe.enable_model_cpu_offload()

prompt = "Realistic macro photograph of a hermit crab using a soda can as its shell, partially emerging from the can, captured with sharp detail and natural colors, on a sunlit beach with soft shadows and a shallow depth of field, with blurred ocean waves in the background. The can has the text BFL Diffusers on it and it has a color gradient that starts with #FF5733 at the top and transitions to #33FF57 at the bottom."

image = pipe(
    prompt=prompt,
    generator=torch.Generator(device=device).manual_seed(42),
    num_inference_steps=50,  # 28 is a good trade-off
    guidance_scale=4,
).images[0]
image.save("flux2_t2i_nf4.png")
Flax classes are deprecated and will be removed in Diffusers v1.0.0. We recommend migrating to PyTorch classes or pinning your version of Diffusers.
/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_validators.py:206: UserWarning: The local_dir_use_symlinks argument is deprecated and ignored in hf_hub_download. Downloading to a local directory does not use symlinks anymore.
  warnings.warn(
Fetching 2 files: 100% 2/2 [00:00<00:00, 98.80it/s]
Loading checkpoint shards: 100% 2/2 [00:01<00:00, 2.03it/s]
Fetching 4 files: 100% 4/4 [00:00<00:00, 167.57it/s]
Loading weights: 100% 585/585 [00:02<00:00, 310.91it/s]
The tied weights mapping and config for this model specifies to tie model.language_model.embed_tokens.weight to lm_head.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with tie_word_embeddings=False to silence this warning
Loading pipeline components...: 100% 5/5 [00:03<00:00, 1.14it/s]
---------------------------------------------------------------------------
OutOfMemoryError                          Traceback (most recent call last)
/tmp/ipykernel_18947/863729753.py in <cell line: 0>()
     22 prompt = "Realistic macro photograph of a hermit crab using a soda can as its shell, partially emerging from the can, captured with sharp detail and natural colors, on a sunlit beach with soft shadows and a shallow depth of field, with blurred ocean waves in the background. The can has the text BFL Diffusers on it and it has a color gradient that start with #FF5733 at the top and transitions to #33FF57 at the bottom."
     23
---> 24 image = pipe(
     25     prompt=prompt,
     26     generator=torch.Generator(device=device).manual_seed(42),

/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/flux2/pipeline_flux2.py in __call__(self, image, prompt, height, width, num_inference_steps, sigmas, guidance_scale, num_images_per_prompt, generator, latents, prompt_embeds, output_type, return_dict, attention_kwargs, callback_on_step_end, callback_on_step_end_tensor_inputs, max_sequence_length, text_encoder_out_layers, caption_upsample_temperature)
    869         prompt, images=image, temperature=caption_upsample_temperature, device=device
    870     )
--> 871 prompt_embeds, text_ids = self.encode_prompt(
    872     prompt=prompt,
    873     prompt_embeds=prompt_embeds,

/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/flux2/pipeline_flux2.py in encode_prompt(self, prompt, device, num_images_per_prompt, prompt_embeds, max_sequence_length, text_encoder_out_layers)
    587 if prompt_embeds is None:
--> 588     prompt_embeds = self._get_mistral_3_small_prompt_embeds(
    589         text_encoder=self.text_encoder,
    590         tokenizer=self.tokenizer,

/usr/local/lib/python3.12/dist-packages/diffusers/pipelines/flux2/pipeline_flux2.py in _get_mistral_3_small_prompt_embeds(text_encoder, tokenizer, prompt, dtype, device, max_sequence_length, system_message, hidden_states_layers)
    338 # Forward pass through the model
--> 339 output = text_encoder(
    340     input_ids=input_ids,
    341     attention_mask=attention_mask,

[36 frames total; the repeated dispatch frames between each model-level frame are elided: torch/utils/_contextlib.py (decorate_context), torch/nn/modules/module.py (_wrapped_call_impl / _call_impl), accelerate/hooks.py (new_forward), transformers/utils/generic.py (wrapper / wrapped_forward), transformers/modeling_layers.py (__call__)]

/usr/local/lib/python3.12/dist-packages/transformers/models/mistral3/modeling_mistral3.py in forward(self, input_ids, pixel_values, attention_mask, position_ids, past_key_values, inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict, cache_position, logits_to_keep, image_sizes, **kwargs)
--> 446 outputs = self.model(

/usr/local/lib/python3.12/dist-packages/transformers/models/mistral3/modeling_mistral3.py in forward(self, input_ids, pixel_values, attention_mask, position_ids, past_key_values, inputs_embeds, vision_feature_layer, use_cache, output_attentions, output_hidden_states, return_dict, cache_position, image_sizes, **kwargs)
--> 325 outputs = self.language_model(

/usr/local/lib/python3.12/dist-packages/transformers/models/mistral/modeling_mistral.py in forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, use_cache, cache_position, **kwargs)
--> 397 hidden_states = decoder_layer(

/usr/local/lib/python3.12/dist-packages/transformers/models/mistral/modeling_mistral.py in forward(self, hidden_states, attention_mask, position_ids, past_key_values, use_cache, cache_position, position_embeddings, **kwargs)
--> 230 hidden_states, _ = self.self_attn(

/usr/local/lib/python3.12/dist-packages/transformers/models/mistral/modeling_mistral.py in forward(self, hidden_states, position_embeddings, attention_mask, past_key_values, cache_position, **kwargs)
--> 153 query_states = self.q_proj(hidden_states).view(hidden_shape).transpose(1, 2)

/usr/local/lib/python3.12/dist-packages/bitsandbytes/nn/modules.py in forward(self, x)
    554 weight = self.weight if getattr(quant_state, "packing_format_for_cpu", False) else self.weight.t()
--> 556 return bnb.matmul_4bit(x, weight, bias=bias, quant_state=quant_state).to(inp_dtype)

/usr/local/lib/python3.12/dist-packages/bitsandbytes/autograd/_functions.py in matmul_4bit(A, B, quant_state, out, bias)
--> 401 return MatMul4Bit.apply(A, B, out, bias, quant_state)

/usr/local/lib/python3.12/dist-packages/bitsandbytes/autograd/_functions.py in forward(ctx, A, B, out, bias, quant_state)
    313 # 1. Dequantize
    314 # 2. MatmulnN
--> 315 output = torch.nn.functional.linear(A, F.dequantize_4bit(B, quant_state).to(A.dtype).t(), bias)

/usr/local/lib/python3.12/dist-packages/bitsandbytes/functional.py in dequantize_4bit(A, quant_state, absmax, out, blocksize, quant_type)
-> 1050 out = torch.ops.bitsandbytes.dequantize_4bit.default(
   1051     A,
   1052     absmax,

/usr/local/lib/python3.12/dist-packages/bitsandbytes/backends/cuda/ops.py in _(A, absmax, blocksize, quant_type, shape, dtype)
--> 363 out = torch.empty(shape, dtype=dtype, device=A.device)
    364 _dequantize_4bit_impl(A, absmax, blocksize, quant_type, dtype, out=out)
    365 return out

OutOfMemoryError: CUDA out of memory. Tried to allocate 40.00 MiB. GPU 0 has a total capacity of 14.56 GiB of which 17.81 MiB is free. Including non-PyTorch memory, this process has 14.54 GiB memory in use. Of the allocated memory 14.40 GiB is allocated by PyTorch, and 15.19 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
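About the OutOfMemoryError above: enable_model_cpu_offload still keeps one whole component resident on the GPU at a time, and here the text-encoder forward pass alone fills the 14.56 GiB card. Two mitigations worth trying, sketched below with no guarantee they fit on this exact GPU: sequential offload, which streams weights layer by layer (much slower, but with a far smaller peak footprint), plus the allocator flag the error message itself recommends.

import os
# Must be set before torch initializes CUDA; reduces fragmentation of
# reserved-but-unallocated memory, as the error message suggests.
os.environ["PYTORCH_ALLOC_CONF"] = "expandable_segments:True"

import torch
from diffusers import Flux2Pipeline

pipe = Flux2Pipeline.from_pretrained(
    "diffusers/FLUX.2-dev-bnb-4bit", torch_dtype=torch.bfloat16
)
# Streams weights onto the GPU one submodule at a time instead of holding a
# whole component resident: much lower peak VRAM, much slower inference.
pipe.enable_sequential_cpu_offload()

image = pipe(
    "a photo of a cat",
    num_inference_steps=28,
    guidance_scale=4,
).images[0]
image.save("flux2_t2i_lowvram.png")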
https://github.com/huggingface/diffusers/blob/main/docs/source/en/using-diffusers/loading.md