Hugging Face Daily Papers · 5 min read

BrainG3N: A Dual-Purpose Tokenizer for Controllable 3D Brain MRI Generation

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arxiv:2606.19651

Published on Jun 17, 2026 · Submitted by Max Van Puyvelde on Jun 22, 2026 · #2 Paper of the day
Authors: Max Van Puyvelde, Ibrahim Gulluk, Wim Van Criekinge, Olivier Gevaert

Abstract

A 3D brain MRI generative model uses a masked-autoencoder tokenizer to create clinically informative embeddings that support both medical task performance and controlled image generation.

Three-dimensional (3D) brain MRI is central to clinical neurology and neuro-oncology, where generative models could augment under-represented cohorts, simulate disease trajectories, and support privacy-preserving data sharing. Latent diffusion has been the go-to solution for modeling imaging data, but it places two competing demands on the tokenizer: encoder embeddings must retain the clinical information that downstream tasks act on, and the decoder must reconstruct anatomically faithful volumes. Existing reconstruction-driven tokenizers achieve the second at the expense of the first. To address this, we introduce a fully volumetric masked-autoencoder (MAE) based tokenizer for 3D brain MRI latent diffusion, decoupling encoder and decoder: a frozen 3D MAE encoder produces clinically informative embeddings, while a dedicated CNN decoder reconstructs voxels from a linear projection of those embeddings. We pretrain the encoder on 35,309 volumes from 18 public cohorts spanning four modalities, ten disease categories, and 200+ acquisition sites, and demonstrate its dual utility in two settings. First, on a 23-task linear-probing benchmark, the encoder outperforms or matches SOTA models (i.e., BrainIAC, BrainSegFounder, and MedicalNet) on 21 of 23 tasks. Second, a conditional diffusion transformer (DiT) trained on these clinically informative embeddings supports both conditional generation across six variables and patient-specific longitudinal forecasting. Together these results establish a single 3D brain-MRI embedding space capable of both downstream clinical tasks and controllable generation.
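The linear-probing evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a least-squares probe trained by gradient descent on fixed embeddings; the page does not describe the paper's actual probing protocol. The embeddings, the age-like target, and the names `fit_linear_probe`, `predict`, and `make_split` are all hypothetical stand-ins, not outputs or APIs of the BrainG3N encoder.

```python
import random

def fit_linear_probe(embeddings, targets, lr=0.1, epochs=500):
    """Least-squares linear probe trained by full-batch gradient descent.

    Only the probe weights and bias are learned; the embeddings are treated
    as fixed inputs, mirroring a frozen encoder.
    """
    n, dim = len(embeddings), len(embeddings[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for z, y in zip(embeddings, targets):
            err = sum(w * zi for w, zi in zip(weights, z)) + bias - y
            for i in range(dim):
                grad_w[i] += err * z[i]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(weights, bias, embedding):
    """Apply the linear probe to a single embedding."""
    return sum(w * zi for w, zi in zip(weights, embedding)) + bias

def mean_absolute_error(weights, bias, embeddings, targets):
    errors = [abs(predict(weights, bias, z) - y)
              for z, y in zip(embeddings, targets)]
    return sum(errors) / len(errors)

# Synthetic stand-ins (hypothetical): random 4-D "embeddings" and an
# age-like target that depends linearly on them, plus a little noise.
random.seed(0)
TRUE_W, TRUE_B = [3.0, -2.0, 0.5, 1.5], 60.0

def make_split(size):
    embs = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(size)]
    ages = [predict(TRUE_W, TRUE_B, z) + random.gauss(0.0, 0.1) for z in embs]
    return embs, ages

train_emb, train_age = make_split(200)
test_emb, test_age = make_split(50)
weights, bias = fit_linear_probe(train_emb, train_age)
test_mae = mean_absolute_error(weights, bias, test_emb, test_age)
```

Because the encoder stays frozen, the probe's score reflects how much of the target is linearly readable from the embedding itself, which is why linear probing is a common way to compare foundation encoders.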

Community

Paper submitter about 5 hours ago

BrainG3N is a controllable generator for 3D brain MRI built on top of a strong self-supervised foundation encoder. A frozen 3D MAE encoder (pretrained on 35,309 volumes across 18 cohorts, 4 modalities, 200+ sites) produces clinically informative embeddings; a conditional flow-matching DiT then generates new scans directly in that space, and a fine-tuned CNN decoder maps them back to voxels.
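As a rough illustration of conditional flow matching in a latent space, the sketch below assumes the common straight-line path between a noise sample and a data latent; the post does not say which path or sampler BrainG3N uses. The velocity network is replaced by a hand-written function (`toy_velocity`), and the condition labels and target latents are hypothetical placeholders for the real conditioning variables.

```python
def interpolate(noise, latent, t):
    """Point on the straight path from noise (t=0) to the data latent (t=1)."""
    return [(1.0 - t) * n + t * z for n, z in zip(noise, latent)]

def target_velocity(noise, latent):
    """Regression target for the velocity model on a straight path."""
    return [z - n for n, z in zip(noise, latent)]

def flow_matching_loss(predict_velocity, noise, latent, t, cond):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(noise, latent, t)
    pred = predict_velocity(x_t, t, cond)
    target = target_velocity(noise, latent)
    return sum((p - v) ** 2 for p, v in zip(pred, target)) / len(target)

def euler_sample(predict_velocity, noise, cond, steps=10):
    """Integrate dx/dt = v(x, t, cond) from t=0 to t=1 with fixed Euler steps."""
    x = list(noise)
    dt = 1.0 / steps
    for k in range(steps):
        v = predict_velocity(x, k * dt, cond)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Toy stand-in for a trained conditional model (hypothetical): each
# condition label maps to one fixed target latent.
TARGETS = {"condition_a": [1.0, -2.0, 0.5], "condition_b": [-1.0, 0.0, 3.0]}

def toy_velocity(x, t, cond):
    # Velocity that heads for the condition's latent and arrives at t=1.
    # Only evaluated for t < 1, so the division is safe.
    target = TARGETS[cond]
    return [(z - xi) / (1.0 - t) for xi, z in zip(x, target)]

noise = [0.3, 0.7, -1.2]
loss = flow_matching_loss(toy_velocity, noise, TARGETS["condition_a"], 0.3,
                          "condition_a")
sample = euler_sample(toy_velocity, noise, "condition_b")
```

In this toy setting the hand-written velocity matches the straight-path target, so the loss is zero up to rounding and the sampler lands on the latent for the requested condition. In the pipeline described above, a decoder would then map such a latent back to voxels.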

Because generation happens in a clinically grounded latent space, the synthetic scans actually carry the conditioned attributes:

  • Controllable along disease, age, sex, modality, site, and IDH1 status, plus patient-specific longitudinal forecasting.
  • Real-data clinical probes recover the requested attribute from generated scans (age Pearson r=0.93).
  • The same frozen encoder is a strong foundation model in its own right: with no fine-tuning, it outperforms or matches BrainIAC, BrainSegFounder, and MedicalNet on 21 of 23 linear-probing tasks (IDH1 AUC 0.937, brain-age MAE 4.43 years).
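The probe-recovery check in the second bullet amounts to correlating the attribute that was requested at generation time with the attribute a probe reads back from the generated scan. A minimal sketch of that computation follows, with simulated numbers: the "probed" ages here are just the requested ages plus Gaussian noise (a hypothetical stand-in), not results from the paper.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

random.seed(0)
# Ages requested as conditioning when generating each synthetic scan.
requested_age = [random.uniform(20.0, 90.0) for _ in range(200)]
# Hypothetical stand-in for "age predicted by a real-data probe on the
# generated scan": the requested age plus measurement noise.
probed_age = [age + random.gauss(0.0, 5.0) for age in requested_age]

r = pearson_r(requested_age, probed_age)
```

A value of r close to 1 means the generated scans carry the requested attribute in a form the probe can read back; a value near 0 would mean the conditioning is being ignored.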

Useful for augmenting under-represented cohorts, generating counterfactual disease trajectories ("what would this patient look like under disease X?"), and sharing privacy-preserving synthetic data.

📄 https://arxiv.org/abs/2606.19651 — model, code, and synthetic dataset coming soon.

Feel free to reach out! 🤗
