Large Language Models over Networks: Collaborative Intelligence under Resource Constraints
Liangqi Yuan, Wenzhi Fang, Shiqiang Wang, H. Vincent Poor, Christopher G. Brinton
arXiv:2605.08626
AI-generated summary
Collaborative intelligence enables multiple distributed LLMs to work together across devices and clouds to provide high-quality responses under diverse resource constraints.
Abstract
Large language models (LLMs) are transforming society, powering applications from smartphone assistants to autonomous driving. Yet cloud-based LLM services alone cannot serve a growing class of applications, including those operating under intermittent connectivity, sub-second latency budgets, data-residency constraints, or sustained high-volume inference. On-device deployment is in turn constrained by limited computation and memory. No single endpoint can deliver high-quality service across this spectrum. This article focuses on collaborative intelligence, a paradigm in which multiple independent LLMs distributed across device and cloud endpoints collaborate at the task level through natural language or structured messages. Such collaboration strives for superior response quality under heterogeneous resource constraints spanning computation, memory, communication, and cost across network tiers. We present collaborative inference along two complementary and composable dimensions: vertical device-cloud collaboration and horizontal multi-agent collaboration, which can be combined into hybrid topologies in practice. We then examine learning to collaborate, addressing the training of routing policies and the development of cooperative capabilities among LLMs. Finally, we identify open research challenges including scaling under resource heterogeneity and trustworthy collaborative intelligence.
Community
🌐 What's this about?
Cloud APIs alone can't serve every LLM workload: UAVs hit connectivity gaps, closed-loop control can't tolerate round-trips, and per-token pricing makes sustained agentic deployments prohibitively expensive. On-device LLMs hit the opposite wall: limited compute, memory, and capability. This survey argues that the answer isn't picking a side. It's collaborative intelligence, where multiple independent LLMs distributed across device and cloud endpoints exchange natural-language or structured messages at the task level.
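The paradigm's basic unit, task-level message exchange between independent endpoints, can be sketched in a few lines. Everything here is illustrative: `planner_agent` and `solver_agent` are hypothetical stand-ins for real device- and cloud-side LLM calls, and the JSON schema is invented for the example.

```python
# Toy illustration of task-level exchange of structured messages between two
# independent LLM endpoints (all names and the schema are hypothetical).
import json

def planner_agent(task: str) -> str:
    # A device-side agent decomposes the task into a structured message.
    return json.dumps({"task": task, "subtasks": ["retrieve facts", "draft answer"]})

def solver_agent(message: str) -> str:
    # A cloud-side agent consumes the structured message and replies.
    plan = json.loads(message)
    return f"completed {len(plan['subtasks'])} subtasks for: {plan['task']}"

msg = planner_agent("summarize sensor logs")
print(solver_agent(msg))  # completed 2 subtasks for: summarize sensor logs
```

The point is that the two models never share weights or activations; they cooperate purely through messages, which is what lets heterogeneous endpoints (a quantized on-device model, a black-box cloud API) interoperate.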
🧭 Two axes, one taxonomy
We organize the design space along two composable dimensions:
- 📡 Vertical device-cloud collaboration: heuristic routers, classifier-based routers, RL routers for multi-turn settings, and self-routing, where the on-device LLM decides when to escalate based on its own chain-of-thought signals
- 🤝 Horizontal multi-agent collaboration: prompt-driven coordination, cooperative policy optimization (co-training agents in authentic scenarios), and inter-agent network optimization
These compose into hybrid topologies in practice.
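As a deliberately toy instance of the vertical dimension, here is a self-routing sketch: the on-device model answers and reports a confidence score, and low-confidence queries escalate to the cloud. `device_llm` and `cloud_llm` are stubs, and the length-based confidence is a placeholder for the token log-prob or chain-of-thought signals a real system would use.

```python
# Minimal sketch of vertical device-cloud self-routing (all names hypothetical).

def device_llm(query: str) -> tuple[str, float]:
    """Stub for an on-device LLM returning (answer, confidence)."""
    # A real deployment would derive confidence from token log-probs or
    # chain-of-thought signals rather than query length.
    conf = 0.9 if len(query.split()) < 8 else 0.4
    return f"[device] answer to: {query}", conf

def cloud_llm(query: str) -> str:
    """Stub for a cloud LLM API call."""
    return f"[cloud] answer to: {query}"

def route(query: str, threshold: float = 0.7) -> str:
    answer, confidence = device_llm(query)
    if confidence >= threshold:
        return answer          # serve locally: no round-trip, no API cost
    return cloud_llm(query)    # escalate when the device model is unsure

print(route("short query"))
print(route("a much longer and harder query that needs escalation"))
```

The threshold is the knob that trades response quality against latency and per-token cost; the survey's router families differ mainly in how this decision is learned rather than hand-set.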
🎓 Learning to collaborate
Two training threads we trace through the literature:
- 🧩 Routing policy learning: a progression from heuristic rules through LLM-based selection, classifier-based routing, and RL-based routing to self-routing
- 🛠️ Cooperative capability learning: agents trained in isolation against fixed partners fail to generalize; genuine cooperation emerges only under simultaneous co-training
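To make the first thread concrete, here is a hedged sketch of classifier-based routing policy learning: fit a tiny logistic model predicting whether the on-device LLM will answer a query correctly, then route on its decision. The features, training log, and labels below are synthetic placeholders, not from the paper.

```python
# Sketch of classifier-based routing policy learning: logistic regression
# via plain SGD over hand-crafted query features (all data synthetic).
import math

def features(query: str) -> list[float]:
    # Toy features: bias term, normalized length, arithmetic-ish marker.
    return [1.0, min(len(query.split()) / 20.0, 1.0),
            1.0 if any(c.isdigit() for c in query) else 0.0]

def train_router(data, lr=0.5, epochs=200):
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for query, device_ok in data:
            x = features(query)
            p = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
            g = p - device_ok  # gradient of the logistic loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
    return w

def should_stay_on_device(query, w) -> bool:
    return sum(wi * xi for wi, xi in zip(w, features(query))) > 0.0

# Synthetic log: short chit-chat succeeds locally, numeric reasoning fails.
data = [("hi there", 1), ("how are you", 1), ("tell me a joke", 1),
        ("compute 1934 * 882", 0), ("solve for x in 3x + 7 = 19", 0)]
w = train_router(data)
print(should_stay_on_device("good morning", w))
print(should_stay_on_device("what is 57 * 91", w))
```

The second thread is harder to compress into a stub: the survey's claim is that a policy like this (or the agents themselves) must be optimized jointly with its partners, since a router or agent trained against a frozen counterpart tends not to generalize once the counterpart changes.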
🔬 Open challenges
Three stand out: scaling under resource heterogeneity; trustworthy collaborative intelligence (privacy, robustness, and verifiability across endpoints); and the gap between black-box cloud API endpoints and white-box on-device models that any real deployment has to bridge.