Hugging Face Daily Papers · 3 min read

TuneJury: An Open Metric for Improving Music Generation Preference Alignment

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

TuneJury is an open reward model for music generation preference alignment. A lightweight head sits on top of frozen music encoders and maps an audio clip and an optional text prompt to a single preference score. We train it on human pairwise judgments from open music-preference datasets. We demonstrate three applications: inference-time best-of-N selection, DITTO-style latent optimization, and expert-iteration post-training. We release the checkpoints, evaluation code, live demo, and score files over seven open music collections.

arxiv:2606.17006

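Published on Jun 15, 2026 · Submitted by Yonghyun Kim on Jun 16, 2026
Authors: Yonghyun Kim, Junwon Lee, Haiwen Xia, Yinghao Ma, Junghyun Koo, Koichi Saito, Yuki Mitsufuji, Chris Donahue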

Abstract

A novel open-source pairwise reward model for text-to-music generation that provides calibrated preference scoring and generalizes across multiple downstream applications through a frozen reward mechanism.

We introduce TuneJury, an open, instance-level pairwise reward model for text-to-music that predicts a music preference score from a text prompt and an audio clip. The released checkpoint is trained on publicly available human-preference labels covering arena-style (A vs. B) votes, metric-alignment preference pairs, crowdsourced pairwise comparisons, and expert aesthetic ratings. The predicted score margin between two clips is well calibrated on our held-out test split, supporting data filtering via a simple score threshold. TuneJury generalizes to both held-out test pairs and out-of-distribution benchmarks, remaining competitive with prior baselines on the latter. For generators released after training, we introduce anchor calibration, a post-hoc, per-system Bradley-Terry calibration that recovers agreement at substantially better data efficiency than from-scratch retraining. The same frozen reward drives consistent reward-axis gains across three downstream applications: inference-time best-of-N selection, DITTO-style latent optimization, and expert-iteration post-training. TuneJury is available at https://github.com/yonghyunk1m/TuneJury.
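As a rough illustration of how a scalar preference score might be used downstream, the sketch below assumes a Bradley-Terry reading of the score margin: the probability that clip A is preferred over clip B is taken to be the logistic function of `score_a - score_b`. The function names (`preference_probability`, `best_of_n`, `filter_by_threshold`) and the toy scores are hypothetical and are not taken from the TuneJury repository; in practice the scores would come from the released checkpoint.

```python
import math
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def preference_probability(score_a: float, score_b: float) -> float:
    """Assumed Bradley-Terry form: P(A preferred over B) = sigmoid(score_a - score_b)."""
    margin = score_a - score_b
    # Numerically stable logistic function of the margin.
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    e = math.exp(margin)
    return e / (1.0 + e)


def best_of_n(candidates: Sequence[T], score_fn: Callable[[T], float]) -> T:
    """Inference-time best-of-N: keep the candidate with the highest score."""
    return max(candidates, key=score_fn)


def filter_by_threshold(
    items: Sequence[T], score_fn: Callable[[T], float], threshold: float
) -> List[T]:
    """Data filtering: keep items whose score reaches the threshold."""
    return [item for item in items if score_fn(item) >= threshold]


# Toy stand-in for reward-model outputs (hypothetical values, not real scores).
toy_scores = {"clip_a": 1.2, "clip_b": -0.3, "clip_c": 0.4}

best = best_of_n(list(toy_scores), toy_scores.get)
kept = filter_by_threshold(list(toy_scores), toy_scores.get, threshold=0.0)
p = preference_probability(toy_scores["clip_a"], toy_scores["clip_b"])
```

Because only the difference between two scores enters the probability, adding a constant to every score leaves pairwise predictions unchanged. A fixed threshold, by contrast, depends on the absolute scale of the scores.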

Community

Yonghyun Kim (paper author, paper submitter) · about 10 hours ago

TuneJury is an open reward model for music generation preference alignment. A lightweight head sits on top of frozen music encoders and maps an audio clip and an optional text prompt to a single preference score. We train it on human pairwise judgments from open music-preference datasets. We demonstrate three applications: inference-time best-of-N selection, DITTO-style latent optimization, and expert-iteration post-training. We release the checkpoints, evaluation code, live demo, and score files over seven open music collections.
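The idea of a lightweight head over frozen encoders can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the released architecture: it assumes the frozen encoders have already produced fixed embedding vectors, uses a hypothetical linear head, and trains it with a Bradley-Terry style pairwise logistic loss. The names `head_score` and `train_head` and the toy embeddings are invented for this example.

```python
import math
from typing import List, Tuple

Vector = List[float]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def head_score(weights: Vector, embedding: Vector) -> float:
    """Hypothetical linear head: frozen-encoder embedding -> scalar score."""
    return sum(w * x for w, x in zip(weights, embedding))


def train_head(
    pairs: List[Tuple[Vector, Vector]], dim: int, lr: float = 0.1, epochs: int = 200
) -> Vector:
    """Fit head weights on (preferred, rejected) embedding pairs.

    Only the head is updated; the embeddings are treated as fixed inputs,
    mirroring the frozen-encoder setup.
    """
    weights = [0.0] * dim
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = head_score(weights, preferred) - head_score(weights, rejected)
            # The derivative of -log(sigmoid(margin)) w.r.t. margin is
            # -(1 - sigmoid(margin)), so gradient descent moves the weights
            # along (preferred - rejected).
            step = lr * (1.0 - sigmoid(margin))
            for i in range(dim):
                weights[i] += step * (preferred[i] - rejected[i])
    return weights


# Toy embeddings (hypothetical): the first element of each pair was preferred.
toy_pairs = [
    ([1.0, 0.2], [0.0, 0.3]),
    ([0.8, 0.9], [0.1, 0.7]),
    ([0.9, 0.1], [0.2, 0.5]),
]
weights = train_head(toy_pairs, dim=2)
margins = [head_score(weights, p) - head_score(weights, r) for p, r in toy_pairs]
```

The sketch omits a bias term because it cancels in the pairwise margin. A head meant to produce absolute scores for thresholding would need some additional anchoring of the score scale.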

Get this paper in your agent:

hf papers read 2606.17006
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
