Hugging Face Daily Papers · 5 min read

When Should Models Change Their Minds? Contextual Belief Management in Large Language Models

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arxiv:2605.30219

Published on May 28, 2026 · Submitted by Ningyu Zhang on May 29, 2026

Authors: Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang, Yunzhi Yao, Chiyu Wu, Jin Shang, Yu Gong, Shumin Deng

Organization: Zhejiang University

AI-generated summary

Language models struggle with contextual belief management over long interactions, that is, deciding when to update, preserve, or ignore accumulated information. Reinforcement learning and representation-level steering can improve this ability.

Abstract

Long-horizon interactions require language models to manage accumulating information: when to update their state, when to preserve their state, and what to ignore. We study this challenge as Contextual Belief Management (CBM): maintaining a predicted belief state aligned with formal evidence while isolating task-irrelevant noise. To make CBM measurable, we introduce BeliefTrack, a closed-world benchmark spanning Rule Discovery and Circuit Diagnosis, where a finite belief space and symbolic verifiers enable exact turn-level evaluation. BeliefTrack diagnoses three failures: Failed Stay, Failed Update, and Failed Isolation. Across multiple LLMs, vanilla models exhibit severe CBM failures, while explicit belief-tracking prompts provide limited gains. In contrast, reinforcement learning with belief-state rewards reduces failure rates by 70.9% on average. Further probing reveals latent belief-state dynamics behind these failures, and representation-level steering reduces failure rates by 46.1% across two tasks. Code is coming soon at https://github.com/zjunlp/CBM.
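To make the three failure types concrete, here is a minimal Python sketch of how turn-level failures might be tallied in a closed-world setting. It is not the BeliefTrack implementation, which the abstract does not describe in detail. The `Turn` record, the `classify_turn` rules, and the `rule_A`/`rule_B` labels are all hypothetical, and the failure definitions are one plausible reading of the names: Failed Update when the evidence changes the correct belief and the model does not follow, Failed Stay when the model drifts although the evidence leaves the correct belief unchanged, and Failed Isolation when the drift happens on a task-irrelevant turn.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Failure(Enum):
    FAILED_STAY = "failed_stay"
    FAILED_UPDATE = "failed_update"
    FAILED_ISOLATION = "failed_isolation"


@dataclass(frozen=True)
class Turn:
    """Hypothetical per-turn record; field names are assumptions."""
    gold_belief: str       # belief licensed by the evidence so far (e.g. from a verifier)
    predicted_belief: str  # belief the model reports after this turn
    is_noise: bool         # True if the turn carried only task-irrelevant content


def classify_turn(prev: Turn, cur: Turn) -> Optional[Failure]:
    """Label the transition prev -> cur under the assumed definitions."""
    model_moved = cur.predicted_belief != prev.predicted_belief
    gold_moved = cur.gold_belief != prev.gold_belief
    correct = cur.predicted_belief == cur.gold_belief

    if gold_moved:
        # The evidence changed the right answer; the model must land on it.
        return None if correct else Failure.FAILED_UPDATE
    if model_moved and not correct:
        # The right answer did not change, yet the model drifted to a wrong one.
        return Failure.FAILED_ISOLATION if cur.is_noise else Failure.FAILED_STAY
    return None


def failure_counts(turns: List[Turn]) -> Counter:
    """Count failures over consecutive turn pairs."""
    counts: Counter = Counter()
    for prev, cur in zip(turns, turns[1:]):
        failure = classify_turn(prev, cur)
        if failure is not None:
            counts[failure] += 1
    return counts


# Toy trace with made-up hypothesis labels.
trace = [
    Turn("rule_A", "rule_A", False),  # starting point, model is correct
    Turn("rule_A", "rule_B", False),  # consistent evidence, model drifts
    Turn("rule_C", "rule_B", False),  # evidence refutes rule_A, model does not follow
    Turn("rule_C", "rule_C", False),  # late self-correction, not counted as a failure
    Turn("rule_C", "rule_D", True),   # noise turn pulls the model away
    Turn("rule_C", "rule_D", True),   # model stays wrong, no new failure recorded
]

counts = failure_counts(trace)
failure_rate = sum(counts.values()) / (len(trace) - 1)
print(dict(counts), failure_rate)
```

Under these assumed rules a persistent error is charged only at the turn where it is introduced, so a model that stays wrong is not penalized again on every later turn. A real benchmark may count such turns differently.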

Community

Paper submitter 1 day ago

We show that long-horizon reasoning in LLMs fundamentally depends on contextual belief management — knowing when to update, preserve, or ignore information — and that explicit belief-state optimization dramatically improves this ability.

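This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:

* Rewarding Beliefs, Not Actions: Consistency-Guided Credit Assignment for Long-Horizon Agents (2026), https://huggingface.co/papers/2605.20061
* Seeing Isn't Believing: Mitigating Belief Inertia via Active Intervention in Embodied Agents (2026), https://huggingface.co/papers/2604.17252
* Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents (2026), https://huggingface.co/papers/2605.30159
* Same Evidence, Different Answers: Canonical-Context On-Policy Distillation for Multi-Turn Language Models (2026), https://huggingface.co/papers/2605.30251
* PAIR: Prefix-Aware Internal Reward Model for Multi-Turn Agent Optimization (2026), https://huggingface.co/papers/2605.17877
* Think Through Uncertainty: Improving Long-Form Generation Factuality via Reasoning Calibration (2026), https://huggingface.co/papers/2604.12046
* Evaluating Temporal Consistency in Multi-Turn Language Models (2026), https://huggingface.co/papers/2604.23051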


