Share More, Search Less: Collaborative Parallel Thinking for Efficient Test-Time Scaling
Published on May 26
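Submitted by bitwxl on May 27
Authors: Xinglin Wang, Hao Lin, Shaoxiong Feng, Peiwen Yuan, Yiwei Li, Jiayi Shi, Yueqi Zhang, Chuyi Tan, Ji Zhang, Boyuan Pan, Yao Hu, Kan Li
Code: https://github.com/WangXinglin/CPT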
Abstract
AI-generated summary: Collaborative Parallel Thinking (CPT) enables information sharing across parallel search branches during inference to reduce redundant exploration and improve efficiency in test-time scaling for language models.
Test-Time Scaling (TTS) enhances the reasoning capabilities of large language models by allocating additional inference compute to explore the solution space. However, existing parallel TTS methods typically keep branches isolated during search: intermediate discoveries remain branch-private and cannot guide other branches in a timely manner. This information isolation causes substantial redundant exploration, because branches repeatedly rediscover information already found elsewhere and need more search steps to collect the complete decision information required to reach correct answers. To address this, we propose Collaborative Parallel Thinking (CPT), a training-free inference framework that enables search-time information sharing across parallel branches. CPT extracts compact intermediate information from ongoing branches, maintains a deduplicated query-level information pool, and broadcasts pool entries through the input context, so that each branch in subsequent search steps can reuse discoveries made by other branches rather than rediscover them. Experiments on the HMMT and AIME benchmarks show that CPT establishes a stronger accuracy–latency Pareto frontier than strong baselines across rollout budgets and model scales, highlighting search-time collaboration as an effective direction for efficient parallel TTS.
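The extract, pool, and broadcast loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `InfoPool`, `build_prompt`, `collaborative_search`, the prompt wording, and the exact-match deduplication rule are all hypothetical, and `step_fn` and `extract_fn` stand in for the model call and the information extractor, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


def normalize(text: str) -> str:
    """Canonical form for duplicate detection (assumption: case- and whitespace-insensitive match)."""
    return " ".join(text.lower().split())


@dataclass
class InfoPool:
    """Hypothetical query-level pool of deduplicated intermediate findings."""
    entries: List[str] = field(default_factory=list)
    _seen: Set[str] = field(default_factory=set)

    def add(self, item: str) -> bool:
        """Insert a finding; return False if it is empty or already pooled."""
        key = normalize(item)
        if not key or key in self._seen:
            return False
        self._seen.add(key)
        self.entries.append(item.strip())
        return True

    def render(self) -> str:
        """Format pool entries for inclusion in a branch's input context."""
        return "\n".join(f"- {e}" for e in self.entries)


def build_prompt(query: str, shared: str, history: List[str]) -> str:
    """Assemble a branch's input: problem, shared findings, then its own trace."""
    parts = [f"Problem: {query}"]
    if shared:
        parts.append("Shared findings from other branches:\n" + shared)
    if history:
        parts.append("Your reasoning so far:\n" + "\n".join(history))
    return "\n\n".join(parts)


def collaborative_search(
    query: str,
    n_branches: int,
    n_steps: int,
    step_fn: Callable[[str], str],            # placeholder for one model search step
    extract_fn: Callable[[str], List[str]],   # placeholder for the information extractor
) -> Tuple[InfoPool, List[List[str]]]:
    pool = InfoPool()
    histories: List[List[str]] = [[] for _ in range(n_branches)]
    for _ in range(n_steps):
        # Every branch in a step sees the same pool snapshot, as it would
        # if the branches were actually executed in parallel.
        shared = pool.render()
        outputs = [step_fn(build_prompt(query, shared, h)) for h in histories]
        for h, out in zip(histories, outputs):
            h.append(out)
            for item in extract_fn(out):
                pool.add(item)  # duplicates are dropped here
    return pool, histories
```

In this sketch the branches run sequentially inside each step for simplicity; a real system would presumably batch them, and would likely need a softer notion of "duplicate" than normalized string equality.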
Community
We identify and validate an information-isolation bottleneck in parallel test-time scaling, where independent reasoning branches repeatedly rediscover information already found elsewhere, causing redundant exploration and inefficient latency scaling.
To address this, we propose Collaborative Parallel Thinking (CPT), a training-free inference framework that turns branch-private discoveries into shared decision information during search and establishes a stronger accuracy–latency Pareto frontier on challenging mathematical reasoning benchmarks.
arXiv: arxiv.org/abs/2605.27030