MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization
Authors: Md Mehrab Tanjim, Jayakumar Subramanian, Xiang Chen, Branislav Kveton, Subhojyoti Mukherjee, Anlan Zhang, Sungchul Kim, Somdeb Sarkhel, Sunav Choudhury
Organization: Adobe Research
Published: 2026-05-19 (submitted to Daily Papers on 2026-05-21 by Mehrab-Tanjim)
arXiv: 2605.19330
Abstract
LLM agents organize behavior through skills: structured natural-language specifications that govern how an agent reasons, retrieves, and responds. Unlike monolithic prompts, skills are multi-field artifacts subject to hard platform constraints: description fields are truncated for routing, instruction bodies are compacted via progressive disclosure, and co-resident skills compete for limited context windows. These constraints make skill optimization inherently multi-objective: a skill must simultaneously maximize task performance and satisfy platform limits. Yet existing prompt optimizers either ignore these trade-offs or collapse them into a weighted sum, which misses Pareto-optimal variants in non-convex objective regions. We introduce MOCHA (Multi-Objective Chebyshev Annealing), which replaces single-objective selection with Chebyshev scalarization, covering the full Pareto front (including non-convex regions), and combines it with exponential annealing that transitions from exploration to exploitation. In our experiments across six diverse agent skills (where all methods share the same multi-objective mutation operator and baselines receive identical per-objective textual feedback), existing optimizers fail to improve the seed skill on 4 of 6 tasks: on those tasks, 1000 rollouts yield zero progress. MOCHA breaks through on every task, achieving a 7.5% relative improvement in mean correctness over the strongest baseline (up to 14.9% on FEVER and 10.4% on TheoremQA) while discovering twice as many Pareto-optimal skill variants.
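To make the contrast between a weighted sum and Chebyshev scalarization concrete, here is a minimal Python sketch. It is not the paper's implementation: the three candidates, their scores, the ideal point `(1.0, 1.0)`, the weight sweep, and the `temperature` and `selection_probs` helpers are all hypothetical, and the abstract does not say how MOCHA applies its annealing schedule. Candidate `C` is Pareto-optimal but sits in a non-convex dent of the front, so no weighted sum ever prefers it, while a balanced Chebyshev weighting does.

```python
import math

# Hypothetical skill variants scored on two objectives to maximize, both in [0, 1]:
# (task correctness, compactness). These numbers are illustrative only.
candidates = {
    "A": (0.90, 0.10),
    "B": (0.10, 0.90),
    "C": (0.45, 0.45),  # lies below the line joining A and B (non-convex region)
}

def is_pareto_optimal(name):
    """True if no other candidate is at least as good on every objective."""
    f = candidates[name]
    return not any(
        other != f and all(g >= x for g, x in zip(other, f))
        for other in candidates.values()
    )

def weighted_sum(f, w):
    """Linear scalarization; larger is better."""
    return sum(wi * fi for wi, fi in zip(w, f))

def chebyshev(f, w, ideal=(1.0, 1.0)):
    """Weighted Chebyshev distance to an assumed ideal point; smaller is better."""
    return max(wi * (zi - fi) for wi, zi, fi in zip(w, ideal, f))

# Sweep weight vectors (w1, 1 - w1) and record which candidate each rule selects.
weights = [(i / 10, 1 - i / 10) for i in range(11)]
weighted_winners = {
    max(candidates, key=lambda n: weighted_sum(candidates[n], w)) for w in weights
}
chebyshev_winners = {
    min(candidates, key=lambda n: chebyshev(candidates[n], w)) for w in weights
}

def temperature(used, budget, t_start=1.0, t_end=0.01):
    """Assumed exponential schedule: decays from t_start to t_end over the budget."""
    return t_start * (t_end / t_start) ** (used / budget)

def selection_probs(costs, temp):
    """Boltzmann selection over costs: lower cost gets higher probability."""
    best = min(costs.values())
    raw = {k: math.exp(-(c - best) / temp) for k, c in costs.items()}
    total = sum(raw.values())
    return {k: v / total for k, v in raw.items()}

# Chebyshev costs under balanced weights; C has the lowest cost.
scores = {n: chebyshev(f, (0.5, 0.5)) for n, f in candidates.items()}

print(sorted(weighted_winners))   # ['A', 'B']
print(sorted(chebyshev_winners))  # ['A', 'B', 'C']
```

In this sketch a high temperature early in the budget spreads selection probability across all three candidates (exploration), and a low temperature late in the budget concentrates it on the lowest-cost candidate (exploitation). Calling `selection_probs(scores, temperature(0, 1000))` and `selection_probs(scores, temperature(1000, 1000))` shows the shift.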
Community
Skill optimization is inherently multi-objective: a skill must maximize task correctness and satisfy hard platform limits (truncated descriptions, compacted instruction bodies, finite shared context). Prior prompt optimizers either ignore these trade-offs or collapse them into a single scalar, missing Pareto-optimal variants in non-convex regions. MOCHA replaces single-objective selection with Chebyshev scalarization, which provably covers the full Pareto front, combined with exponential annealing that transitions from exploration to exploitation as the rollout budget is consumed. Across six diverse skills, MOCHA beats the strongest baseline by 7.5% on average (up to +14.9%) and finds 2× more Pareto-optimal variants, while existing optimizers plateau at the seed on 4 of 6 tasks.

