Hugging Face Daily Papers · June 9, 2026 · 6 min read

Trust Functions: Near-Lossless Weak-to-Strong Generalization by Learning When to Trust the Weak Teacher

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.


Weak-to-strong generalization studies how to improve a strong student using supervision from a weaker teacher when reliable labels are scarce. We view this primarily as a data selection problem, where the key challenge is to identify which weak labels are reliable enough to serve as a training signal. To address this, we introduce trust functions that assign each weak label a scalar trust score and use these scores to filter weak supervision. Across several domains, including world knowledge, quantitative reasoning, and strategy games, trust filtering yields students that match and sometimes surpass ground-truth supervision, achieving near-lossless weak-to-strong generalization. Moreover, trust functions enable an iterative weak-to-strong chain that compounds gains by training a student and reusing it as the next teacher. The advantage of trust functions can be attributed to several mechanisms.
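The filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how trust scores are computed, so the sketch assumes a placeholder trust function that simply reuses the weak teacher's own confidence. The names `WeakExample`, `confidence_trust`, `filter_by_trust`, the `teacher_confidence` field, and the 0.75 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class WeakExample:
    """One weakly labeled training example (hypothetical structure)."""
    prompt: str
    weak_label: str            # label produced by the weak teacher
    teacher_confidence: float  # assumed signal in [0, 1], not from the paper


# A trust function maps a weakly labeled example to a scalar trust score.
TrustFn = Callable[[WeakExample], float]


def confidence_trust(example: WeakExample) -> float:
    """Placeholder trust function: reuse the teacher's confidence as the score."""
    return example.teacher_confidence


def filter_by_trust(
    examples: Sequence[WeakExample],
    trust_fn: TrustFn,
    threshold: float,
) -> List[WeakExample]:
    """Keep only the examples whose trust score reaches the threshold."""
    return [ex for ex in examples if trust_fn(ex) >= threshold]


# Toy data, invented for illustration only.
weak_data = [
    WeakExample("2 + 2 = ?", "4", 0.97),
    WeakExample("Capital of Australia?", "Sydney", 0.41),
    WeakExample("Best opening move?", "e4", 0.80),
]

# The threshold is arbitrary here; in practice it would need tuning.
kept = filter_by_trust(weak_data, confidence_trust, threshold=0.75)
print([ex.weak_label for ex in kept])  # ['4', 'e4']
```

The strong student would then be trained only on `kept`. In the iterative chain the abstract mentions, the trained student would presumably relabel the data and take the teacher's place for the next round, with the same filter applied again.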

arxiv:2606.01000

Published on May 31

Submitted by Arda Uzunoğlu on Jun 9

Center for Language and Speech Processing @ JHU

Authors: Arda Uzunoğlu, Alvin Zhang, Daniel Khashabi

Abstract

Trust functions enable effective weak-to-strong generalization by identifying reliable weak labels for training, achieving performance comparable to ground-truth supervision across multiple domains.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

arXiv: https://arxiv.org/abs/2606.01000
Project page: https://ardauzunoglu.github.io/trust-functions/
GitHub: https://github.com/ardauzunoglu/trust-functions

Get this paper in your agent:

hf papers read 2606.01000

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash
