Hugging Face Daily Papers · May 26, 2026 · 5 min read

Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

Communication enables coordination in multi-agent reinforcement learning (MARL), but many real-world applications, e.g., search-and-rescue with drone swarms, operate under severe bandwidth constraints. Many communication architectures still expose a coupled bottleneck in which a shared latent representation is used for both policy execution and inter-agent communication. Consequently, reducing message size directly limits the policy’s latent space, often leading to significant performance degradation. We address this with two contributions. First, we introduce β, a normalised per-agent bandwidth budget that unifies sparsity, rounds, and message dimension into a single comparable constraint. Second, we provide SLIM, a minimal architecture that decouples the communication pathway from the policy’s latent representation, allowing us to isolate the effect of bandwidth from the effect of policy capacity while benefiting from in-step communication. We evaluate our method on several partially-observable MARL benchmarks, where communication is essential. Our approach achieves state-of-the-art performance and exhibits scalability and robustness under limited communication, with only marginal degradation as bandwidth is reduced.

Code: https://github.com/alexicanesse/Decoupling-Communication-from-Policy-Robust-MARL-under-Bandwidth-Constraints
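The abstract says that β folds sparsity, rounds, and message dimension into one normalised per-agent budget, but it does not give the formula. The sketch below is therefore only an illustration under an assumed definition: β is taken to be the expected number of scalars an agent sends per environment step, divided by the same quantity for a reference configuration. The names `CommConfig`, `send_prob`, `scalars_per_step`, and `normalised_budget` are hypothetical and do not come from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommConfig:
    """Hypothetical description of one agent's communication settings."""
    send_prob: float   # sparsity: fraction of steps on which the agent transmits, in [0, 1]
    rounds: int        # communication rounds per environment step
    message_dim: int   # scalars per message


def scalars_per_step(cfg: CommConfig) -> float:
    # Expected number of scalars sent per environment step (assumed cost model).
    return cfg.send_prob * cfg.rounds * cfg.message_dim


def normalised_budget(cfg: CommConfig, reference: CommConfig) -> float:
    # Assumed definition of beta: cost relative to a reference configuration.
    return scalars_per_step(cfg) / scalars_per_step(reference)


reference = CommConfig(send_prob=1.0, rounds=2, message_dim=64)
sparse = CommConfig(send_prob=0.25, rounds=2, message_dim=64)   # talks rarely
narrow = CommConfig(send_prob=1.0, rounds=1, message_dim=32)    # talks always, but less

beta_sparse = normalised_budget(sparse, reference)
beta_narrow = normalised_budget(narrow, reference)
print(beta_sparse, beta_narrow)
```

Under this assumed cost model, a configuration that transmits rarely and one that transmits short messages in a single round land on the same budget, which is the kind of comparison a single unified constraint is meant to allow.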

arxiv:2605.21085

Published on May 20 · Submitted by Alexi Canesse on May 26 · Orailix

Authors: Alexi Canesse, Benoît Goupil, Jesse Read, Sonia Vanier

Abstract

AI-generated summary: Researchers propose a novel communication architecture for multi-agent reinforcement learning that decouples policy representation from communication pathways, enabling better performance under bandwidth constraints.
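To make the decoupling idea concrete, here is a minimal sketch, not the authors' SLIM implementation. It assumes untrained random linear layers, mean aggregation of incoming messages, and a single in-step communication round; `DecoupledAgent`, `step`, `run`, and all dimensions are hypothetical. The point it illustrates is only that the message head has its own width, so shrinking the message does not shrink the latent the policy uses.

```python
import random


def linear(in_dim, out_dim, rng):
    # Random weight matrix stored as out_dim rows of in_dim weights.
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply(weights, x):
    # Matrix-vector product on plain Python lists.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def mean_messages(messages):
    # Element-wise mean over equally sized messages (assumed aggregation).
    return [sum(vals) / len(messages) for vals in zip(*messages)]


class DecoupledAgent:
    def __init__(self, obs_dim, latent_dim, message_dim, n_actions, seed):
        rng = random.Random(seed)
        self.encoder = linear(obs_dim, latent_dim, rng)
        # Separate communication pathway: its width is independent of latent_dim.
        self.msg_head = linear(latent_dim, message_dim, rng)
        self.policy_head = linear(latent_dim + message_dim, n_actions, rng)

    def encode(self, obs):
        return apply(self.encoder, obs)

    def outgoing_message(self, latent):
        return apply(self.msg_head, latent)

    def action_logits(self, latent, incoming):
        # The policy sees its full latent plus the aggregated messages.
        return apply(self.policy_head, latent + mean_messages(incoming))


def step(agents, observations):
    # In-step communication: encode, exchange messages, then act in the same step.
    latents = [a.encode(o) for a, o in zip(agents, observations)]
    messages = [a.outgoing_message(z) for a, z in zip(agents, latents)]
    logits = []
    for i, (agent, z) in enumerate(zip(agents, latents)):
        incoming = [m for j, m in enumerate(messages) if j != i]
        logits.append(agent.action_logits(z, incoming))
    return latents, messages, logits


def run(message_dim):
    agents = [DecoupledAgent(4, 16, message_dim, 3, seed=k) for k in range(3)]
    observations = [[0.1 * (k + 1)] * 4 for k in range(3)]
    return step(agents, observations)


wide_latents, wide_msgs, wide_logits = run(message_dim=8)
slim_latents, slim_msgs, slim_logits = run(message_dim=2)
print(len(wide_latents[0]), len(wide_msgs[0]), len(slim_latents[0]), len(slim_msgs[0]))
```

In a coupled design the message would be the latent itself, so cutting the message from 8 to 2 scalars would also cut the policy's representation. Here the latent stays at 16 dimensions in both runs and only the message width changes.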