Hugging Face Daily Papers · June 24, 2026 · 4 min read

QG-MIL: A Gated Transformer Aggregator for Domain-Agnostic Multiple Instance Learning in Medical Imaging

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.


arxiv:2606.20027

Published on Jun 18

· Submitted by Luca Zedda on Jun 24

Authors: Luca Zedda, Davide Antonio Mura, Cecilia Di Ruberto, Maurizio Atzori, Muhammed Furkan Dasdelen, Carsten Marr, Andrea Loddo

Abstract

QG-MIL introduces a gated transformer aggregator for multiple instance learning in medical imaging that stabilizes attention distribution and improves prediction consistency across different medical domains.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Attention-based Multiple Instance Learning aggregators in medical imaging are prone to attention concentration, producing overconfident and unstable predictions. We introduce QG-MIL, a gated transformer aggregator that addresses this through four synergistic architectural components: RMSNorm-based pre-normalization, per-head QK normalization, fine-grained attention output gating, and SwiGLU-style feed-forward modules. Together, these design choices stabilize training and distribute attention more uniformly across instances without auxiliary losses, masking, or multi-stage regularization. We evaluate QG-MIL across six benchmarks spanning whole-slide pathology and cell-level hematology, covering two fundamentally different MIL scales. The best-performing QG-MIL variants outperform leading baselines on all six benchmarks, with an average improvement of +6.1 mean macro F1 points. Attention overlays and attention mass analysis confirm more distributed instance weighting. Ablation studies show that while individual components can match the full model on specific datasets, the QG-MIL design provides the most consistent cross-domain performance and tightest variance when compared to selected baselines. We release a configurable implementation to support reproducibility at: https://github.com/unica-visual-intelligence-lab/QG-MIL
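To make the four components named in the abstract concrete, here is a minimal NumPy sketch of one pre-norm transformer block applied to a bag of instance features. It is not the authors' implementation (see the linked repository for that). Everything beyond the component names is an assumption for illustration: the function and parameter names, the use of RMSNorm for the per-head QK normalization, an elementwise sigmoid gate computed from the normalized input, random untrained weights, and mean pooling to obtain a bag embedding.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # Rescale by the root-mean-square over the last axis (no mean-centering).
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def silu(x):
    return x * sigmoid(x)

def init_params(dim, heads, hidden, seed=0):
    # Hypothetical parameter layout with random (untrained) weights.
    rng = np.random.default_rng(seed)
    def w(*shape):
        return rng.normal(0.0, 0.02, size=shape)
    head_dim = dim // heads
    return {
        "heads": heads,
        "norm1": np.ones(dim), "norm2": np.ones(dim),
        "q_norm": np.ones(head_dim), "k_norm": np.ones(head_dim),
        "wq": w(dim, dim), "wk": w(dim, dim), "wv": w(dim, dim),
        "wg": w(dim, dim), "wo": w(dim, dim),
        "w_gate": w(dim, hidden), "w_up": w(dim, hidden), "w_down": w(hidden, dim),
    }

def gated_block(x, p):
    n, dim = x.shape
    heads = p["heads"]
    head_dim = dim // heads

    def split(t):
        # (n, dim) -> (heads, n, head_dim)
        return t.reshape(n, heads, head_dim).transpose(1, 0, 2)

    # 1) RMSNorm pre-normalization before attention.
    y = rms_norm(x, p["norm1"])
    q, k, v = split(y @ p["wq"]), split(y @ p["wk"]), split(y @ p["wv"])
    # 2) Per-head QK normalization (assumed here to be RMSNorm over head_dim).
    q = rms_norm(q, p["q_norm"])
    k = rms_norm(k, p["k_norm"])
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(head_dim))  # (heads, n, n)
    out = (attn @ v).transpose(1, 0, 2).reshape(n, dim)
    # 3) Attention-output gating (assumed: elementwise sigmoid gate from y).
    out = out * sigmoid(y @ p["wg"])
    x = x + out @ p["wo"]
    # 4) SwiGLU-style feed-forward with its own pre-normalization.
    z = rms_norm(x, p["norm2"])
    x = x + (silu(z @ p["w_gate"]) * (z @ p["w_up"])) @ p["w_down"]
    return x, attn

# Toy bag: 12 instances with 32-dimensional features.
bag = np.random.default_rng(1).normal(size=(12, 32))
out, attn = gated_block(bag, init_params(dim=32, heads=4, hidden=64))
bag_embedding = out.mean(axis=0)  # assumed pooling, for illustration only
```

The gate is applied per feature rather than per head in this sketch, which is one reading of "fine-grained" gating; the paper's exact formulation may differ.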


Community

Snarcy

Paper submitter about 16 hours ago

Hi everyone, thanks for checking out our paper!

In QG-MIL, we study the attention concentration problem in medical Multiple Instance Learning, where attention-based aggregators can collapse onto a small subset of instances and produce unstable or overconfident predictions.

Our goal was to design a simple drop-in MIL aggregator that mitigates this behavior architecturally, without auxiliary losses, masking strategies, or multi-stage training. QG-MIL combines RMSNorm pre-normalization, per-head QK normalization, attention-output gating, and SwiGLU-style feed-forward layers, and we evaluate it across pathology and hematology benchmarks with different bag sizes and feature extractors.

Happy to discuss the method, limitations, ablations, or possible extensions!
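The attention concentration problem described in the comment above can be quantified in several ways. The sketch below shows two generic diagnostics, top-k attention mass and normalized entropy, computed over one bag's attention weights. These are assumptions chosen for illustration; the page does not specify which metric the paper's attention mass analysis uses, and the function names and toy weights are hypothetical.

```python
import math

def top_k_mass(weights, k):
    """Fraction of the total attention held by the k largest weights."""
    total = sum(weights)
    return sum(sorted(weights, reverse=True)[:k]) / total

def normalized_entropy(weights):
    """Shannon entropy divided by log(n): 1.0 is uniform, near 0 is collapsed."""
    if len(weights) < 2:
        raise ValueError("need at least two instances")
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(weights))

# Toy attention weights over a bag of five instances (made-up values).
collapsed = [0.94, 0.02, 0.02, 0.01, 0.01]  # almost all mass on one instance
uniform = [0.2] * 5                         # evenly spread attention

print(top_k_mass(collapsed, 1), normalized_entropy(collapsed))
print(top_k_mass(uniform, 1), normalized_entropy(uniform))
```

A high top-1 mass together with a low normalized entropy indicates that the aggregator has collapsed onto very few instances; a more evenly distributed aggregator moves both numbers toward the uniform case.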


