LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents
Authors: Md Nayem Uddin, Amir Saeidi, Eduardo Blanco, Chitta Baral
Organization: Arizona State University
Published on Jun 18, 2026
Submitted by Amir on Jun 19, 2026
Abstract
AI-generated summary: LedgerAgent is a method for customer-service agents that maintains task state in a separate ledger, improving policy adherence and state management during tool calling.
Policy-adherent tool-calling agents in customer-service domains must maintain task states across turns while calling tools and obeying domain policies. Task states consist of relevant facts, identifiers, constraints, and conditions observed through user interaction and tool calls. In standard agents, task states are not represented separately. Observations, tool returns, and policy instructions are placed in the prompt, leaving agents to reconstruct the relevant states from the prompt each time they decide what to do next. This design makes state management implicit, creating two common failure modes. First, an agent may retrieve the right facts but later ground its decision in stale, missing, or incorrect information. Second, a syntactically valid tool call may still violate a domain policy that depends on the current task state. We introduce LedgerAgent, an inference-time method for tool-calling agents that maintains observed task states in a separate ledger and renders the states into the prompt. The ledger is also used to check state-dependent policy constraints before environment-changing tool calls are executed, blocking policy violations. Across four customer-service domains and a mixed panel of open- and closed-weight models, LedgerAgent improves average pass^k over a standard prompt-based tool-calling approach, with the largest gains under stricter multi-trial consistency metrics.
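The two mechanisms the abstract describes (a ledger rendered into the prompt, and a state-dependent check that runs before environment-changing tool calls) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `Ledger`, `GuardedExecutor`, the order-cancellation policy, and the toy tools are all hypothetical names and rules invented for this example, and the rule for folding tool returns into the ledger is deliberately simplistic.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Set


@dataclass
class Ledger:
    """Explicit task state: facts observed from the user or from tool returns."""
    facts: Dict[str, Any] = field(default_factory=dict)

    def record(self, key: str, value: Any) -> None:
        # A newer observation overwrites the older (possibly stale) value.
        self.facts[key] = value

    def render(self) -> str:
        # Text block that would be placed into the prompt on every turn.
        lines = [f"- {key}: {value}" for key, value in sorted(self.facts.items())]
        return "KNOWN TASK STATE:\n" + "\n".join(lines)


# A check returns None when the call is allowed, else a reason for blocking it.
PolicyCheck = Callable[[Ledger, Dict[str, Any]], Optional[str]]


def only_pending_orders_can_be_cancelled(
    ledger: Ledger, args: Dict[str, Any]
) -> Optional[str]:
    # Hypothetical policy, invented for this example.
    status = ledger.facts.get(f"order:{args['order_id']}:status")
    if status is None:
        return "order status has not been observed yet"
    if status != "pending":
        return f"order is {status}; only pending orders can be cancelled"
    return None


class GuardedExecutor:
    """Runs tools, guarding environment-changing ones with ledger checks."""

    def __init__(
        self,
        ledger: Ledger,
        tools: Dict[str, Callable[..., Dict[str, Any]]],
        write_tools: Set[str],
        checks: Dict[str, List[PolicyCheck]],
    ) -> None:
        self.ledger = ledger
        self.tools = tools
        self.write_tools = write_tools
        self.checks = checks

    def call(self, name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        if name in self.write_tools:
            for check in self.checks.get(name, []):
                reason = check(self.ledger, args)
                if reason is not None:
                    # Blocked before execution: the environment is untouched.
                    return {"blocked": True, "reason": reason}
        result = self.tools[name](**args)
        # Fold the tool return into the ledger (toy rule: order status only).
        if "order_id" in result and "status" in result:
            self.ledger.record(f"order:{result['order_id']}:status", result["status"])
        return {"blocked": False, "result": result}


def run_demo():
    orders = {"A1": "shipped", "B2": "pending"}  # toy environment

    def get_order(order_id: str) -> Dict[str, Any]:
        return {"order_id": order_id, "status": orders[order_id]}

    def cancel_order(order_id: str) -> Dict[str, Any]:
        orders[order_id] = "cancelled"
        return {"order_id": order_id, "status": "cancelled"}

    ledger = Ledger()
    executor = GuardedExecutor(
        ledger,
        tools={"get_order": get_order, "cancel_order": cancel_order},
        write_tools={"cancel_order"},
        checks={"cancel_order": [only_pending_orders_can_be_cancelled]},
    )
    executor.call("get_order", {"order_id": "A1"})  # read: observed as shipped
    executor.call("get_order", {"order_id": "B2"})  # read: observed as pending
    blocked = executor.call("cancel_order", {"order_id": "A1"})
    allowed = executor.call("cancel_order", {"order_id": "B2"})
    return ledger, orders, blocked, allowed
```

In `run_demo`, the cancellation of order `A1` is syntactically valid but is rejected because the ledger records it as shipped, while the cancellation of `B2` goes through and the ledger is updated from the tool return. In an actual agent loop, `ledger.render()` would be inserted into the prompt each turn so the model does not have to reconstruct state from the raw transcript.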
arXiv: arxiv.org/abs/2606.20529