POISE: Position-Aware Undetectable Skill Injection on LLM Agents
Authors: Haochang Hao, Dehai Min, Zhifang Zhang, Yunbei Zhang, Miao Xu, Yingqiang Ge, Lu Cheng
Organization: University of Illinois Chicago (UIC-AI-lab)
Published: 2026-06-06
arXiv: 2606.07943
Code: https://github.com/liofoil/SkillSafety
Abstract
POISE is a stealthy skill-poisoning attack that embeds malicious triggers within benign-looking instructions, achieving high attack success rates while avoiding detection by LLM scanners that are overly sensitive to privileged tool operations.
Agent skills provide a lightweight mechanism for extending general-purpose agents, but their open format exposes them to skill-poisoning attacks. A practically dangerous injection must stay invisible: if executing the payload derails the user's legitimate task, the resulting failure signal invites inspection of the skill. We therefore evaluate attacks by Attack Success Rate, which requires the injected payload to execute and the user's task to still pass its verifier in the same trial. Prior skill-poisoning attacks face a reliability-stealth trade-off under this lens: YAML-header injections are reliably loaded but easily inspected, whereas stealthier body injections that place explicit malicious commands in the skill prose are less reliable because out-of-context commands invite the agent's own suspicion. We introduce POISE, a position-aware attack that compresses the trigger into a single, benign-looking body instruction, placing it at a feasible position and using a context-aware generator to blend it with nearby setup or prerequisite steps. On Skill-Inject with codex+gpt-5.2, POISE achieves an 89.3% ASR, 28.0 points above a random-placement body baseline and 2.6 points above a YAML-only baseline, while retaining the stealth advantage of body placement. That stealth is the decisive margin: because legitimate skill bodies naturally require privileged tool operations, LLM scanners are hyper-sensitive, falsely flagging 74.6% of clean skills on average across four judges and both benchmarks. Blending into these false alarms, POISE causes only 5.6% of poisoned variants to gain a new high-risk alert over their clean baselines, rendering current static defenses ineffective.
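The strict Attack Success Rate described above can be illustrated with a minimal sketch. It assumes each trial is logged as two booleans (payload executed, task verifier passed); the `Trial` record and the function names are hypothetical and are not taken from the paper's code. The point it shows is that a trial counts as a success only when both conditions hold in the same trial, so the strict rate can never exceed the payload-only execution rate.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Trial:
    """One agent run on a poisoned skill (hypothetical record layout)."""
    payload_executed: bool  # the injected payload ran during this trial
    task_passed: bool       # the user's task passed its verifier in this trial


def strict_asr(trials: Sequence[Trial]) -> float:
    """Fraction of trials where the payload ran AND the user's task still passed."""
    if not trials:
        raise ValueError("need at least one trial")
    successes = sum(1 for t in trials if t.payload_executed and t.task_passed)
    return successes / len(trials)


def payload_only_rate(trials: Sequence[Trial]) -> float:
    """Looser rate that ignores whether the user's task survived."""
    if not trials:
        raise ValueError("need at least one trial")
    return sum(1 for t in trials if t.payload_executed) / len(trials)


# Toy data for illustration only; these are not results from the paper.
trials = [
    Trial(payload_executed=True, task_passed=True),    # counts as a success
    Trial(payload_executed=True, task_passed=False),   # payload derailed the task
    Trial(payload_executed=False, task_passed=True),   # agent skipped the payload
    Trial(payload_executed=True, task_passed=True),    # counts as a success
]

print(f"payload-only rate: {payload_only_rate(trials):.1%}")  # 75.0%
print(f"strict ASR:        {strict_asr(trials):.1%}")         # 50.0%
```

In the toy data the second trial executes the payload but fails the verifier, which is exactly the visible failure signal the abstract says would invite inspection of the skill, so it is excluded from the strict count.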
Community
POISE is a position-aware skill-poisoning attack on LLM agents. We evaluate under a strict Attack Success Rate metric that requires the injected payload to execute and the user's task to still pass its verifier in the same trial. On Skill-Inject with codex+gpt-5.2, POISE achieves 89.3% ASR (+28.0 pts over a random-placement body baseline, +2.6 over a YAML-only baseline), while staying stealthy enough that only 5.6% of poisoned variants raise a new high-risk alert — exposing the limits of current static scanners.
We demonstrate this supply-chain threat in agent skill ecosystems to help the community develop stronger detection and safer skill distribution.
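The stealth figure quoted above compares each poisoned variant against its own clean baseline. A minimal sketch of that comparison follows, assuming one scanner that returns a binary high-risk verdict per skill and assuming the denominator is all poisoned variants; the function names and the pair layout are hypothetical and do not come from the paper's code.

```python
from typing import Sequence, Tuple

# Each pair is (clean_flagged, poisoned_flagged) for one skill:
# the scanner's high-risk verdict on the clean skill and on its poisoned variant.
VerdictPair = Tuple[bool, bool]


def clean_false_positive_rate(pairs: Sequence[VerdictPair]) -> float:
    """Share of clean skills that the scanner already flags as high risk."""
    if not pairs:
        raise ValueError("need at least one skill")
    return sum(1 for clean, _ in pairs if clean) / len(pairs)


def new_high_risk_alert_rate(pairs: Sequence[VerdictPair]) -> float:
    """Share of poisoned variants flagged where the clean baseline was not."""
    if not pairs:
        raise ValueError("need at least one skill")
    new_alerts = sum(1 for clean, poisoned in pairs if poisoned and not clean)
    return new_alerts / len(pairs)


# Toy verdicts for illustration only; these are not results from the paper.
pairs = [
    (True, True),    # already flagged when clean: poisoning adds no new signal
    (True, True),
    (True, True),
    (False, True),   # the only variant that gains a new alert
    (False, False),  # clean and poisoned both pass the scanner
]

print(f"clean false-positive rate: {clean_false_positive_rate(pairs):.1%}")
print(f"new high-risk alert rate:  {new_high_risk_alert_rate(pairs):.1%}")
```

The sketch makes the dependency explicit: the more clean skills a scanner already flags, the fewer poisoned variants are left that can gain a new alert, which is why a high false-positive rate weakens the scanner as a defense.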