Libretto: Giving LLM Agents a Sense of Musical Structure
Author: Yichen Xu
Organization: University of California, Berkeley
Published: 2026-06-21 (arXiv 2606.22708)
Project page: https://libretto.site/
Code: https://github.com/Xyc-arch/Libretto
Abstract
Libretto provides a structured framework for symbolic music generation and revision, using an LLM-native grammar and statistical evaluation across musical dimensions.
Generative music systems can now produce impressive audio from text prompts, but audio outputs are difficult to inspect, edit, and diagnose as musical structure. We introduce Libretto, an agent-facing framework for symbolic music generation and revision. Libretto uses an LLM-native grammar with explicit onset slots, voices, and bar-level organization, then evaluates each piece in a corpus-calibrated statistical space over rhythm, harmony, melody, texture, form, and variation. The same structural axes support retrieval, diagnosis, copy-risk control, and iterative self-revision. Across gap filling, reference-guided full-piece generation, gradual morphing, and educational music generation, Libretto turns symbolic music from a raw token sequence into a measurable and editable object for language-model agents.
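To make the idea of a "corpus-calibrated statistical space" concrete, here is a minimal sketch of scoring one piece against a reference corpus. It is not Libretto's grammar or metric set: the `Note` layout, the 16-slot grid, the two toy features, and the z-score calibration are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

SLOTS_PER_BAR = 16  # assumption: a sixteenth-note onset grid in 4/4


@dataclass(frozen=True)
class Note:
    """Hypothetical symbolic note: bar-level position, voice, and pitch."""
    bar: int    # 0-based bar index
    slot: int   # onset slot within the bar, 0..SLOTS_PER_BAR-1
    voice: str  # voice or instrument label
    pitch: int  # MIDI pitch number


def features(piece):
    """Compute two toy structural features for a non-empty list of Notes."""
    bars = {n.bar for n in piece}
    onsets = {(n.bar, n.slot) for n in piece}
    pitches = [n.pitch for n in piece]
    return {
        # fraction of grid slots that carry at least one onset (a rhythm proxy)
        "onset_density": len(onsets) / (len(bars) * SLOTS_PER_BAR),
        # distance between lowest and highest pitch (a melody proxy)
        "pitch_range": max(pitches) - min(pitches),
    }


def z_scores(piece, corpus):
    """Place a piece in the corpus distribution, one z-score per feature."""
    target = features(piece)
    scores = {}
    for name, value in target.items():
        reference = [features(p)[name] for p in corpus]
        mu, sigma = mean(reference), pstdev(reference)
        # a zero-variance feature carries no information, so report 0.0
        scores[name] = 0.0 if sigma == 0 else (value - mu) / sigma
    return scores


# Two tiny reference pieces and one candidate (here, identical to piece_a).
piece_a = [Note(0, s, "piano", p) for s, p in [(0, 60), (4, 62), (8, 64), (12, 65)]]
piece_b = [Note(0, s, "piano", p) for s, p in [(0, 60), (8, 72)]]
scores = z_scores(piece_a, [piece_a, piece_b])
```

A large absolute z-score on some axis would flag that axis for diagnosis or revision. A real system would presumably use many more features per axis and a far larger corpus.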
Community
This project aims to enable LLMs to generate 100-bar, multi-instrument music that is validated against the statistical cloud.