Hugging Face Daily Papers · 4 min read

When, Where, and How: Adaptive Binning for Tabular Self-Supervised Learning

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arxiv:2606.19827

Published on Jun 18, 2026 · Submitted by DaehwanKim on Jun 22, 2026

Authors: Daehwan Kim, Haejun Chung, Ikbeom Jang

Abstract

Adaptive Binning introduces a training-adaptive discretization method for self-supervised learning on medical tabular data, improving representation learning through feature-wise refinement and heterogeneous feature handling.

Medical tabular data are ubiquitous in clinical research, but deep learning for tables remains underexplored because reliable labels often require costly expert adjudication, even though structured clinical variables are routinely available in tabular form. Self-supervised learning can leverage these unlabeled tables, and recent binning-based pretexts offer a promising inductive bias, but existing objectives fix a single global quantile discretization and apply feature-agnostic supervision. We propose Adaptive Binning, a training-adaptive discretization pretext for tabular SSL that couples discretization to learning through a feature-wise coarse-to-fine curriculum. Motivated by the spectral bias of neural networks and the principles of curriculum learning, our method progressively refines discretization per feature upon plateau detection and selects representation-aware splits to jointly improve value-space concentration and representation-space coherence. A heterogeneity-aware objective unifies categorical reconstruction with ordinal supervision for numerical features, and experiments on public medical tabular datasets under unified evaluation protocols show consistent gains for linear probing and fine-tuning without dataset-specific discretization tuning. We further introduce a medical tabular SSL benchmark with standardized protocols to support reproducible progress in this underexplored domain. Our code is available at https://github.com/labhai/Adaptive-Binning.
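The "when" part of the curriculum described in the abstract (refine a feature's discretization once its loss plateaus) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `FeatureBinSchedule`, the helper `has_plateaued`, the window and tolerance values, and the choice to double the number of quantile bins are all assumptions made for illustration. In particular, the paper selects representation-aware splits, whereas this sketch simply recomputes quantile edges.

```python
import numpy as np

def quantile_edges(values, n_bins):
    """Interior quantile cut points that split `values` into up to n_bins bins."""
    qs = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    return np.unique(np.quantile(values, qs))

def has_plateaued(losses, window=3, tol=1e-3):
    """True when the mean loss over the last `window` steps improved by less
    than `tol` relative to the window before it (hypothetical criterion)."""
    if len(losses) < 2 * window:
        return False
    prev = np.mean(losses[-2 * window:-window])
    last = np.mean(losses[-window:])
    return (prev - last) < tol

class FeatureBinSchedule:
    """Coarse-to-fine bin schedule for ONE numerical feature (illustrative)."""

    def __init__(self, values, start_bins=2, max_bins=16):
        self.values = np.asarray(values, dtype=float)
        self.n_bins = start_bins
        self.max_bins = max_bins
        self.losses = []
        self.edges = quantile_edges(self.values, self.n_bins)

    def bin_ids(self):
        """Current discrete target (bin index) for every row of this feature."""
        return np.digitize(self.values, self.edges)

    def step(self, loss):
        """Record this feature's pretext loss; refine the bins on plateau.
        Returns True if the discretization was refined at this step."""
        self.losses.append(float(loss))
        if self.n_bins < self.max_bins and has_plateaued(self.losses):
            self.n_bins *= 2  # assumption: doubling; the paper picks splits adaptively
            self.edges = quantile_edges(self.values, self.n_bins)
            self.losses.clear()  # restart plateau detection at the new resolution
            return True
        return False
```

Keeping one schedule per feature means an easy feature can move to finer bins early while a harder one stays coarse, which is the feature-wise behaviour the abstract contrasts with a single global quantile discretization.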

Community

Comment from the paper author and submitter:

[Figure: main_figure]

This paper proposes Adaptive Binning for medical tabular self-supervised learning. The core idea is to replace fixed global quantile binning with a learning-coupled, feature-wise coarse-to-fine curriculum that determines when to refine each feature, where to split its bins, and how to supervise mixed categorical–numerical schemas through type-aware ordinal reconstruction.

We show that adaptive discretization yields stronger representations across diverse public medical tabular datasets in both linear probing and fine-tuning evaluations. We also establish a unified benchmark for reproducible medical tabular self-supervised learning.
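The "how" part mentioned above (type-aware supervision for mixed categorical and numerical columns) can be sketched as follows. The exact objective used in the paper is not given on this page, so this example assumes a common cumulative (threshold) encoding for ordinal targets and plain one-hot targets for categorical columns. The function names are hypothetical.

```python
import numpy as np

def ordinal_targets(bin_ids, n_bins):
    """Cumulative encoding for ordered bins:
    target[i, k] = 1 if bin_ids[i] > k, for thresholds k = 0 .. n_bins - 2."""
    bin_ids = np.asarray(bin_ids)
    thresholds = np.arange(n_bins - 1)
    return (bin_ids[:, None] > thresholds[None, :]).astype(np.float32)

def one_hot_targets(cat_ids, n_categories):
    """Unordered categories get a standard one-hot target."""
    cat_ids = np.asarray(cat_ids)
    return np.eye(n_categories, dtype=np.float32)[cat_ids]

def build_targets(column_ids, n_levels, is_numerical):
    """Pick the target encoding by column type (illustrative dispatch)."""
    if is_numerical:
        return ordinal_targets(column_ids, n_levels)
    return one_hot_targets(column_ids, n_levels)

# A numerical feature with 4 bins: nearby bins share most target bits.
print(build_targets([0, 2, 3], 4, is_numerical=True))
# A categorical feature with 3 levels: no ordering is implied.
print(build_targets([1, 0], 3, is_numerical=False))
```

With the cumulative encoding, predicting bin 2 when the true bin is 3 is wrong in only one threshold, while predicting bin 0 is wrong in three, so the loss reflects the distance between bins. One-hot targets carry no such ordering, which is why they suit categorical columns.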
