When Lower Privileges Suffice: Investigating Over-Privileged Tool Selection in LLM Agents
Authors: Kaiyue Yang, Yuyan Bu, Jingwei Yi, Yuchi Wang, Biyu Zhou, Juntao Dai, Songlin Hu, Yaodong Yang
Organization: Beijing Academy of Artificial Intelligence (BAAI)
Published: 2026-06-18
arXiv: 2606.20023
Code: https://github.com/AISafetyHub/agent-tool-selection-bias
Abstract
LLM agents frequently select higher-privilege tools unnecessarily, and while safety alignment doesn't ensure least-privilege choices, a post-training defense can reduce excessive privilege use without sacrificing performance.
As LLM agents increasingly select tools autonomously, their choices among tools with different privileges become safety-relevant. However, prior tool-selection studies focus on safety-agnostic metadata preferences, leaving privilege-sensitive choices underexplored. To address this gap, we study over-privileged tool selection, in which an agent selects or escalates to a higher-privilege tool despite a sufficient lower-privilege alternative. We introduce ToolPrivBench to evaluate this behavior, measuring both initial selection and escalation after transient tool failures. Across eight domains and five recurring risk patterns, we find that over-privileged tool selection is common among mainstream LLM agents and is further amplified by transient failures. We further find that general safety alignment does not reliably transfer to least-privilege tool choice, while prompt-level controls provide only limited mitigation under transient failures. We therefore introduce a privilege-aware post-training defense that teaches agents to prefer sufficient lower-privilege tools and escalate only when necessary. Our mitigation experiments show that this defense substantially reduces unnecessary high-privilege tool use while preserving general capabilities.
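To make the definition of over-privileged tool selection concrete, the following is a minimal sketch of how one might flag it in logged agent episodes. It is not ToolPrivBench's actual data format or metric: the `Tool` and `Episode` classes, the integer privilege scale, the tool names, and the `over_privilege_rate` function are all hypothetical names introduced here for illustration. The sketch covers only the initial-selection case, not escalation after transient failures.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    privilege: int  # assumption: larger number means broader permissions


@dataclass(frozen=True)
class Episode:
    candidates: tuple[Tool, ...]  # tools offered to the agent
    sufficient: frozenset[str]    # names of tools that can complete the task
    chosen: str                   # name of the tool the agent actually called


def is_over_privileged(ep: Episode) -> bool:
    """True if some sufficient tool had strictly lower privilege than the chosen one."""
    by_name = {t.name: t for t in ep.candidates}
    chosen = by_name[ep.chosen]
    return any(
        t.name in ep.sufficient and t.privilege < chosen.privilege
        for t in ep.candidates
    )


def over_privilege_rate(episodes: list[Episode]) -> float:
    """Fraction of episodes in which the agent's choice was over-privileged."""
    if not episodes:
        return 0.0
    return sum(is_over_privileged(ep) for ep in episodes) / len(episodes)


# Hypothetical tools and episodes, for illustration only.
read_only = Tool("read_file", privilege=1)
admin = Tool("admin_shell", privilege=3)
both = frozenset({"read_file", "admin_shell"})

episodes = [
    Episode((read_only, admin), both, chosen="admin_shell"),  # over-privileged
    Episode((read_only, admin), both, chosen="read_file"),    # least privilege
    # Only the admin tool suffices here, so choosing it is not over-privileged.
    Episode((read_only, admin), frozenset({"admin_shell"}), chosen="admin_shell"),
]

rate = over_privilege_rate(episodes)
print(f"over-privileged selections: {rate:.2f}")
```

The key design choice in this sketch is that a high-privilege call counts against the agent only when a sufficient lower-privilege tool was available, which mirrors the "escalate only when necessary" criterion stated in the abstract.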