ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree
Authors: Vincent Koc, Patrick Erichsen, Jacob Tomlinson, Agustin Rivera, Michael Appel, Nir Paz
Published: 2026-05-31 (arXiv: 2606.01494)
Organization: OpenClaw
Project page: https://huggingface.co/datasets/OpenClaw/clawhub-security-signals
Abstract
Summary: Scanners disagree about which agent skills are risky, and detection rates vary by scanner type and attack surface, so agent-skill security calls for layered governance rather than a single scanner's verdict.
Agent skills extend AI agents with reusable instructions, tools, scripts, references, and workflows, establishing a security boundary distinct from both model safety and traditional package-malware detection. ClawHub Security Signals is a sanitized dataset of 67,453 latest public OpenClaw skill versions. Each row pairs redacted SKILL.md content and sanitized bundled files where present with a final ClawScan registry verdict and evidence from three scanner families: VirusTotal, static heuristic analysis, and NVIDIA SkillSpector.
Rather than estimating malicious-skill prevalence, we study scanner disagreement. The three scanners rarely flag the same skills: any pair overlaps on at most 10.4% of their combined positives, only 0.69% of skills are flagged by all three, and 81.9% of flagged skills are identified by a single scanner. The disagreement is structured by attack surface. SkillSpector, which raises semantic agentic-risk advisories rather than malware-reputation signals, is positive for 19,209 of 25,504 suspicious rows (75.3%) but only 14 of 206 malicious rows (6.8%). The malicious-verdict region shows the inverse profile: 150 of 206 malicious rows (72.8%) are VirusTotal-positive, consistent with bundled-code malware evidence.
These results show that agent-skill security requires layered governance, not single-scanner allow/block decisions. The corpus is released as a sanitized silver-standard dataset: labels are the registry's automated verdicts, not human-annotated ground truth, and the release represents an early, versioned snapshot intended to support the community while a human-annotated subset is developed. Further research is encouraged, including models tailored for skill-security triage.
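To make the disagreement metrics in the abstract concrete, here is a minimal sketch of how pairwise overlap, all-scanner agreement, and single-scanner share can be computed from per-scanner sets of flagged skills. The skill IDs and the `flags` mapping are hypothetical toy data, not values from the dataset. Reading "overlap on combined positives" as intersection over union (the Jaccard index) is an assumption about the paper's definition.

```python
from itertools import combinations

# Hypothetical toy data: IDs of skills flagged by each scanner family.
flags = {
    "virustotal": {"s1", "s2", "s3"},
    "static": {"s3", "s4", "s5", "s6"},
    "skillspector": {"s3", "s6", "s7", "s8", "s9"},
}


def pairwise_overlap(a, b):
    """Share of a pair's combined positives flagged by both (Jaccard index)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disagreement_summary(flags):
    """Summarize how often scanners agree on the skills they flag."""
    all_flagged = set().union(*flags.values())
    in_all = set.intersection(*flags.values())
    # Skills flagged by exactly one scanner.
    single = {s for s in all_flagged if sum(s in f for f in flags.values()) == 1}
    return {
        "pairwise": {
            f"{x}&{y}": pairwise_overlap(flags[x], flags[y])
            for x, y in combinations(flags, 2)
        },
        "flagged_by_all": len(in_all),
        "single_scanner_share": len(single) / len(all_flagged) if all_flagged else 0.0,
    }


summary = disagreement_summary(flags)
```

In the toy data only `s3` is flagged by all three scanners, and seven of the nine flagged skills come from a single scanner. That is the same qualitative shape the abstract reports at much larger scale.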
Community
Why Agent Skills Are a Different Security Problem
Most security tooling starts with a familiar question: does this artifact contain malware? That question matters for agent skills too, but it underdetermines the risk.
An agent skill can be a Markdown instruction file, a Python script, a workflow definition, references, or a bundle that combines all of these. When an agent loads a skill, it may gain new ways to invoke tools, access context, issue subtasks, install dependencies, or interact with external services.
That surface introduces failure modes that classic malware scanners are not designed to catch:
- skills that request authority far beyond what their stated purpose requires,
- instructions designed to steer or hijack an agent's behavior when processed,
- code paths that can leak data passed through context,
- workflows with dangerous side effects despite a benign description,
- hardcoded credentials, insecure TLS settings, dynamic execution, or destructive shell patterns.
Some of these look like normal software-security findings. Others are specific to agentic systems, where a document can become operational instruction and a workflow can change what an autonomous assistant is allowed to do. ClawHub Security Signals is designed to expose that boundary.
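As a rough illustration of the "normal software-security" end of that list, the sketch below checks text for the four pattern families named in the last bullet. The regular expressions and the `scan_text` helper are hypothetical and deliberately coarse. They are not ClawScan's or any scanner's actual rules, and they will produce both false positives and false negatives.

```python
import re

# Hypothetical patterns for illustration only; not any real scanner's rules.
PATTERNS = {
    "hardcoded_credential": re.compile(
        r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
    "insecure_tls": re.compile(r"verify\s*=\s*False|--insecure\b"),
    "dynamic_execution": re.compile(r"\b(eval|exec)\s*\("),
    # Coarse: matches any recursive force-delete of an absolute or home path.
    "destructive_shell": re.compile(r"\brm\s+-(?:rf|fr)\s+(/|~|\$HOME)"),
}


def scan_text(text):
    """Return the sorted names of all patterns that match anywhere in text."""
    return sorted(name for name, rx in PATTERNS.items() if rx.search(text))


sample = (
    'API_KEY = "abcd1234efgh"\n'
    "requests.get(url, verify=False)\n"
    "result = eval(user_input)\n"
)
findings = scan_text(sample)
```

A pattern scan like this has nothing to say about the agent-specific items in the list, such as an instruction file that steers an agent or a workflow whose side effects do not match its description. That gap is the reason a single scanner family is not enough.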
What's in the Dataset
The dataset covers 67,453 latest public ClawHub skill versions across four deterministic splits: train (47,262), validation (10,076), test (6,747), and eval_holdout (3,368). The eval_holdout split is reserved for model evaluation and should not be used for training.
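The split sizes can be sanity-checked with a few lines of arithmetic. The sketch below uses only the counts stated above. The `check_training_splits` guard is a hypothetical helper showing one way to keep eval_holdout out of a training run, not part of the dataset's tooling.

```python
# Split sizes as stated in the text.
SPLITS = {"train": 47_262, "validation": 10_076, "test": 6_747, "eval_holdout": 3_368}
TOTAL = 67_453


def split_shares(splits):
    """Percentage of all rows in each split, rounded to one decimal place."""
    total = sum(splits.values())
    return {name: round(100 * n / total, 1) for name, n in splits.items()}


def check_training_splits(names):
    """Hypothetical guard: refuse to train on the reserved evaluation split."""
    if "eval_holdout" in names:
        raise ValueError("eval_holdout is reserved for model evaluation")
    return list(names)


assert sum(SPLITS.values()) == TOTAL  # the four splits account for every row
shares = split_shares(SPLITS)
```

The counts sum to 67,453, and the shares work out to roughly 70% train, 15% validation, 10% test, and 5% eval_holdout.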
Each row includes redacted SKILL.md content, sanitized bundled files where present, the final ClawScan verdict, and summarized scanner evidence. During preparation, 387 secret-like values were redacted from exported bundle content. A TruffleHog verified-secret pass found 0 verified secrets after validation.
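The source does not describe the redaction rules themselves, so the following is only a sketch of what pattern-based redaction of secret-like values can look like. The two patterns follow the publicly documented shapes of AWS access key IDs and classic GitHub personal access tokens. `SECRET_PATTERNS`, `PLACEHOLDER`, and `redact` are hypothetical names, and TruffleHog is not invoked here.

```python
import re

# Hypothetical secret-like patterns; the dataset's real rules are not documented here.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),     # shape of an AWS access key ID
    re.compile(r"ghp_[A-Za-z0-9]{36}"),  # shape of a classic GitHub personal access token
]
PLACEHOLDER = "[REDACTED]"


def redact(text):
    """Replace secret-like values with a placeholder; return (new_text, count)."""
    count = 0
    for rx in SECRET_PATTERNS:
        text, n = rx.subn(PLACEHOLDER, text)
        count += n
    return text, count


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
clean_text, n_redacted = redact("aws_key=AKIAIOSFODNN7EXAMPLE and nothing else")
```

Counting replacements per file is what makes an aggregate figure like "387 secret-like values" reportable. A verified-secret pass is a separate step that tests whether any remaining candidate is actually live.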
ClawScan assigns each skill version a registry verdict:
- clean: 41,743 rows (61.9%)
- suspicious: 25,504 rows (37.8%)
- malicious: 206 rows (0.3%)
A suspicious verdict means the skill warrants review before trust is extended. It is not a confirmed-harmful label. A malicious verdict is still a silver-standard registry verdict, not human-verified ground truth at this stage.
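The verdict shares, and the scanner-positive rates quoted in the abstract, can be reproduced from the reported counts. This sketch uses only figures stated on this page; the `pct` helper is a hypothetical name.

```python
# Verdict counts as stated in the text.
VERDICTS = {"clean": 41_743, "suspicious": 25_504, "malicious": 206}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total = sum(VERDICTS.values())
verdict_shares = {name: pct(n, total) for name, n in VERDICTS.items()}

# Scanner-positive counts within each verdict, as reported in the abstract.
skillspector_rate = {
    "suspicious": pct(19_209, VERDICTS["suspicious"]),
    "malicious": pct(14, VERDICTS["malicious"]),
}
virustotal_rate_malicious = pct(150, VERDICTS["malicious"])
```

Because the labels are automated registry verdicts, these rates describe how scanners and verdicts co-occur. They are not measurements of scanner accuracy against ground truth.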