MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning
Abstract
MetaAgent-X presents an end-to-end reinforcement learning framework that jointly optimizes automatic multi-agent system design and execution through hierarchical rollout and stagewise co-evolution techniques.
AI-generated summary
Automatic multi-agent systems (MAS) aim to instantiate agent workflows without relying on manually designed or fixed orchestration. However, existing automatic MAS approaches remain only partially adaptive: they either perform training-free test-time search or optimize the meta-level designer while keeping downstream execution agents frozen. This creates a frozen-executor ceiling and leaves the end-to-end training of self-designing and self-executing agentic models unexplored. To address this, we introduce MetaAgent-X, an end-to-end reinforcement learning framework that jointly optimizes automatic MAS design and execution. MetaAgent-X enables script-based MAS generation, execution rollout collection, and credit assignment for both designer and executor trajectories. To support stable and scalable optimization, we propose Executor Designer Hierarchical Rollout and Stagewise Co-evolution, which improve training stability and expose the dynamics of designer-executor co-evolution. MetaAgent-X consistently outperforms existing automatic MAS baselines, with gains of up to 21.7%. Comprehensive ablations show that both designer and executor improve throughout training, and that effective automatic MAS learning follows a stagewise co-evolution process. These results establish end-to-end trainable automatic MAS as a practical paradigm for building self-designing and self-executing agentic models.
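To make the idea of collecting rollouts and assigning credit at two levels more concrete, here is a minimal sketch. It is not the paper's algorithm: the abstract does not specify how rollouts are grouped or how advantages are computed, so the group-mean baseline used below is an assumption, and every name (`hierarchical_rollout`, `assign_credit`, `toy_design`, `toy_execute`) is hypothetical. The designer samples several designs, each design is executed several times, and each level is scored against the mean of its own group.

```python
import random
from statistics import mean


def hierarchical_rollout(design_fn, execute_fn, task, n_designs=4, n_execs=3, seed=0):
    """Sample n_designs designs; run each one n_execs times (hypothetical scheme)."""
    rng = random.Random(seed)
    groups = []
    for _ in range(n_designs):
        design = design_fn(task, rng)
        rewards = [execute_fn(design, task, rng) for _ in range(n_execs)]
        groups.append((design, rewards))
    return groups


def assign_credit(groups):
    """Group-mean baselines at both levels (an assumption, not the paper's rule).

    Designer advantage: a design's mean reward minus the mean over all designs.
    Executor advantage: a rollout's reward minus the mean reward of its design.
    """
    design_scores = [mean(rewards) for _, rewards in groups]
    overall = mean(design_scores)
    designer_adv = [score - overall for score in design_scores]
    executor_adv = [
        [r - score for r in rewards]
        for (_, rewards), score in zip(groups, design_scores)
    ]
    return designer_adv, executor_adv


# Toy stand-ins for the designer and executor policies (purely illustrative).
def toy_design(task, rng):
    return {"n_agents": rng.randint(1, 3)}


def toy_execute(design, task, rng):
    # More agents succeed more often in this toy reward model.
    return 1.0 if rng.random() < 0.3 * design["n_agents"] else 0.0


if __name__ == "__main__":
    groups = hierarchical_rollout(toy_design, toy_execute, task="demo")
    designer_adv, executor_adv = assign_credit(groups)
    for (design, rewards), d_adv in zip(groups, designer_adv):
        print(design, rewards, round(d_adv, 3))
```

In this sketch the executor's advantages are centered within each design, so an executor rollout is not penalized for a poor design, while the designer is judged on the average outcome of its design. Whether MetaAgent-X separates credit in this way is not stated in the abstract.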
Cite arxiv.org/abs/2605.14212 in a dataset README.md to link it from this page.