Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts
Boxuan Zhang, Jianing Zhu, Qifan Wang, Jiang Liu, Ruixiang Tang (Rutgers University)
Abstract
Recent generative models can produce images that appear highly realistic, raising challenges in distinguishing real and AI-generated images. Yet existing detectors based on pre-trained feature extractors tend to over-rely on global semantics, limiting sensitivity to the critical micro-defects. In this work, we propose Micro-Defects expose Macro-Fakes (MDMF), a local distribution-aware detection framework that amplifies micro-scale statistical irregularities into macro-level distributional discrepancies. To avoid localized forensic cues being diluted by plain aggregation, we introduce a learnable Patch Forensic Signature that projects semantic patch embeddings into a compact forensic latent space. We then use Maximum Mean Discrepancy (MMD) to quantify distributional discrepancies between generated and real images. Our theory-grounded analysis shows that patch-wise modeling yields provably larger discrepancies when localized forensic signals are present in generated images, enabling more reliable separation from real images. Extensive experiments demonstrate that MDMF consistently outperforms baseline detectors across multiple benchmarks, validating its general effectiveness. Project page: https://zbox1005.github.io/MDMF-project/
Community
Overview
MDMF (Micro-Defects expose Macro-Fakes) reframes AI-generated image detection as a local distributional problem rather than an image-level classification one. Instead of compressing each image into a single representation that tends to over-rely on global semantics, MDMF treats every image as a collection of patches, projects each patch into a learnable forensic latent space — the Patch Forensic Signature (PFS) — and measures the distributional discrepancy between the test image and a small reference bank of clean real images via a Maximum Mean Discrepancy (MMD) score.
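The patches → PFS → MMD scoring pipeline described above can be sketched in a few lines. This is an illustrative toy, not the released implementation: the patch features and the PFS projection below are random stand-ins (in MDMF the tokens come from a frozen DINOv2 backbone and the projection is learned), and the names `rbf_mmd2`, `W_pfs`, the bandwidth, and the bank sizes are all assumptions.

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=8.0):
    """Squared MMD between patch sets X (n, d) and Y (m, d) under an
    RBF kernel. Biased V-statistic form for simplicity; it is always >= 0."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

# Hypothetical stand-ins: in MDMF the patch features are frozen DINOv2
# tokens and `pfs` is a learned forensic projection; here both are random.
rng = np.random.default_rng(0)
d_feat, d_pfs = 768, 64
W_pfs = rng.normal(size=(d_feat, d_pfs)) / np.sqrt(d_feat)  # untrained "PFS"

def pfs(patch_tokens):
    return patch_tokens @ W_pfs

ref_bank = pfs(rng.normal(size=(256, d_feat)))  # patches from clean real images
test_img = pfs(rng.normal(size=(196, d_feat)))  # patch tokens of one test image

score = rbf_mmd2(test_img, ref_bank)  # higher score => more likely generated
```

In practice the kernel bandwidth would be tuned (e.g. by a median heuristic over pairwise distances) and the score thresholded or ranked to produce AUROC, but the shape of the computation is the same: one distributional distance per image, with no per-patch decision boundary.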
Key Highlights
- Patch Forensic Signature (PFS) — A learnable forensic reparameterization of frozen DINOv2 patch tokens that suppresses semantic invariances and amplifies generation-induced statistical irregularities.
- MDMF detector — A distribution-aware detection framework that aggregates patch-level evidence into a stable image-level score via MMD between PFS distributions, avoiding the per-patch decision boundary that destabilizes hard-voting baselines.
- Theory-grounded — We prove that patch-wise PFS modeling yields a provably larger MMD signal than global pooling whenever localized forensic cues are present, and we derive a finite-sample separation guarantee with a finite optimal patch count $K^\star$.
- Strong cross-generator generalization — Trained on a single 4-class ProGAN split, MDMF reaches 95.65 average AUROC on the ImageNet benchmark (9 generators spanning diffusion, GAN, and AR families) and remains state-of-the-art on LSUN-Bedroom, GenImage, WildRF, LDMFakeDetect, and an OpenSora video-frame stress test, with markedly gentler degradation under JPEG, blur, and noise post-processing than the strongest training-based baseline.
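The dilution effect behind the theory highlight can be seen in a toy simulation (illustrative only, not the paper's construction; the shift size `delta` and affected fraction `frac` are made-up numbers): when only a small fraction of patches carries a forensic shift, pooling all patches into one global vector shrinks the observable gap by roughly that fraction, while a patch-level view keeps the affected patches fully separated.

```python
import numpy as np

rng = np.random.default_rng(1)
n_patch, d = 196, 8
delta = 4.0   # strength of the localized forensic cue (assumed)
frac = 0.05   # fraction of patches that carry it (assumed)

real = rng.normal(size=(n_patch, d))
fake = rng.normal(size=(n_patch, d))
k = int(frac * n_patch)
fake[:k, 0] += delta  # the cue lives in only a few patches

# Global pooling: clean patches average the cue away (gap ~ frac * delta).
pooled_gap = abs(fake.mean(axis=0)[0] - real.mean(axis=0)[0])

# Patch-wise view: the affected patches keep the full shift (gap ~ delta).
patch_gap = abs(fake[:k, 0].mean())

print(pooled_gap, patch_gap)
```

A patch-wise distributional statistic such as MMD sees the second, undiluted gap, which is the intuition behind the provable advantage over global pooling.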