Reliable Chain-of-Thought via Prefix Consistency
Abstract
Prefix consistency uses answer reproduction rates under trace regeneration to weight candidate responses, achieving high accuracy with significantly fewer tokens than standard majority voting.
AI-generated summary
Large Language Models often improve accuracy on reasoning tasks by sampling multiple Chain-of-Thought (CoT) traces and aggregating them with majority voting (MV), a test-time technique called self-consistency. When we truncate a CoT partway through and regenerate the remainder, we observe that traces with correct answers reproduce their original answer more often than traces with wrong answers. We use this difference as a reliability signal, prefix consistency, that weights each candidate answer by how often it reappears under regeneration. It requires no access to token log-probabilities or self-rating prompts. Across five reasoning models and four math and science benchmarks, prefix consistency is the best correctness predictor in most settings, and reweighting votes by it reaches Standard MV plateau accuracy at up to 21x fewer tokens (median 4.6x). Our code is available at https://github.com/naoto-iwase/prefix-consistency.
Community
TL;DR: Correct Chain-of-Thought traces reproduce their answer under prefix regeneration more often than wrong ones; weighting majority voting by this prefix consistency reaches standard MV's plateau accuracy with up to 21x fewer tokens (median 4.6x).
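For concreteness, here is a minimal sketch of prefix-consistency-weighted voting, based only on the description in the abstract. The `generate` helper (sample a continuation from a truncated trace), the `extract_answer` helper, the truncation fraction, and the number of regenerations are all hypothetical placeholders, not the authors' implementation; see the linked repository for the actual code.

```python
from collections import Counter

def prefix_consistency(trace, question, generate, extract_answer,
                       num_regens=4, truncate_frac=0.5):
    """Score one CoT trace: how often does regenerating from a
    truncated prefix reproduce the trace's original answer?"""
    original = extract_answer(trace)
    prefix = trace[: int(len(trace) * truncate_frac)]  # keep the leading part
    matches = 0
    for _ in range(num_regens):
        continuation = generate(question, prefix)  # sample a fresh suffix
        if extract_answer(prefix + continuation) == original:
            matches += 1
    return matches / num_regens  # reliability score in [0, 1]

def weighted_vote(traces, question, generate, extract_answer):
    """Aggregate candidate answers, weighting each trace's vote by its
    prefix-consistency score instead of counting every vote as 1."""
    weights = Counter()
    for trace in traces:
        answer = extract_answer(trace)
        weights[answer] += prefix_consistency(
            trace, question, generate, extract_answer
        )
    return weights.most_common(1)[0][0]
```

The score acts as a per-trace reliability weight: since correct traces tend to reproduce their answer under regeneration, their votes count for more, and no token log-probabilities or self-rating prompts are needed.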