Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints
Authors: Alexi Canesse, Benoît Goupil, Jesse Read, Sonia Vanier
Published: 2026-05-20 · arXiv: 2605.21085
Code: https://github.com/alexicanesse/Decoupling-Communication-from-Policy-Robust-MARL-under-Bandwidth-Constraints
Abstract
Researchers propose a novel communication architecture for multi-agent reinforcement learning that decouples policy representation from communication pathways, enabling better performance under bandwidth constraints.
AI-generated summary
Communication enables coordination in multi-agent reinforcement learning (MARL), but many real-world applications, e.g., search-and-rescue with drone swarms, operate under severe bandwidth constraints. Many communication architectures still expose a coupled bottleneck in which a shared latent representation is used for both policy execution and inter-agent communication. Consequently, reducing message size directly limits the policy's latent space, often leading to significant performance degradation. We address this with two contributions. First, we introduce β, a normalised per-agent bandwidth budget that unifies sparsity, rounds, and message dimension into a single comparable constraint. Second, we provide SLIM, a minimal architecture that decouples the communication pathway from the policy's latent representation, allowing us to isolate the effect of bandwidth from the effect of policy capacity while benefiting from in-step communication. We evaluate our method on several partially-observable MARL benchmarks, where communication is essential. Our approach achieves state-of-the-art performance and exhibits scalability and robustness under limited communication, with only marginal degradation as bandwidth is reduced.
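The abstract says β folds sparsity, rounds, and message dimension into one normalised per-agent budget, but it does not give the formula. The sketch below is only an illustration of how such a quantity could be computed. It assumes the budget is the expected number of scalars an agent sends per environment step, divided by the same quantity for a reference configuration. The names `CommConfig`, `send_prob`, and `normalised_budget` are hypothetical and do not come from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommConfig:
    """Hypothetical description of one agent's communication settings."""
    send_prob: float   # fraction of steps on which the agent transmits (sparsity)
    rounds: int        # message-passing rounds per environment step
    message_dim: int   # number of scalars per message


def normalised_budget(cfg: CommConfig, reference: CommConfig) -> float:
    """Expected scalars sent per step, relative to a reference configuration.

    Illustrative assumption only; this is not the paper's definition of beta.
    """
    if not 0.0 <= cfg.send_prob <= 1.0:
        raise ValueError("send_prob must lie in [0, 1]")
    used = cfg.send_prob * cfg.rounds * cfg.message_dim
    full = reference.send_prob * reference.rounds * reference.message_dim
    if full <= 0:
        raise ValueError("reference budget must be positive")
    return used / full


# Reference: send every step, one round, 64 scalars per message (assumed values).
reference = CommConfig(send_prob=1.0, rounds=1, message_dim=64)

# Two different ways of spending the same budget.
sparse_wide = CommConfig(send_prob=0.25, rounds=1, message_dim=64)
dense_narrow = CommConfig(send_prob=1.0, rounds=1, message_dim=16)

print(normalised_budget(sparse_wide, reference))
print(normalised_budget(dense_narrow, reference))
```

Under this assumed definition, sending a wide message on a quarter of the steps and sending a narrow message on every step cost the same budget, which is the kind of like-for-like comparison a single normalised constraint makes possible.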