Metadata-Version: 2.4
Name: openclaw-skill-vetter-mcp
Version: 1.1.0
Summary: MCP server for security-vetting third-party AI agent extensions before installation — Claude skills, ClawHub plugins, agent tool packs. v1.1: agent-config trust-boundary scanner (AGENTS.md / .cursor/rules.md / .claude/CLAUDE.md / .gemini/config / git hooks) addressing CVE-2026-26268. Outputs 0-100 risk score + BLOCK/REVIEW/CAUTION/CLEAN buckets.
Project-URL: Homepage, https://github.com/temurkhan13/openclaw-skill-vetter-mcp
Project-URL: Documentation, https://github.com/temurkhan13/openclaw-skill-vetter-mcp/blob/main/SPEC.md
Project-URL: Bug Tracker, https://github.com/temurkhan13/openclaw-skill-vetter-mcp/issues
Project-URL: Custom MCP Build, https://github.com/temurkhan13/openclaw-skill-vetter-mcp#need-this-adapted-to-your-stack
Project-URL: Changelog, https://github.com/temurkhan13/openclaw-skill-vetter-mcp/blob/main/CHANGELOG.md
Author-email: Temur Khan <temur@pixelette.tech>
License: MIT License
        
        Copyright (c) 2026 Temur Khan
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: ai-agent,claude,clawhub,exfiltration-detection,mcp,model-context-protocol,openclaw,plugin-vetting,production-ai,prompt-injection,security,skill-security,static-analysis,supply-chain
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: System :: Monitoring
Requires-Python: >=3.11
Requires-Dist: mcp>=1.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Requires-Dist: types-pyyaml>=6.0; extra == 'dev'
Description-Content-Type: text/markdown

# openclaw-skill-vetter-mcp

<!-- mcp-name: io.github.temurkhan13/openclaw-skill-vetter-mcp -->

> **MCP server for security-vetting third-party AI agent extensions before installation**: Claude skills, ClawHub plugins, agent tool packs, and any other code-shaped artifact that runs in your agent environment with your API keys. **41 detection rules** across prompt-injection patterns, hardcoded exfiltration channels (Discord/Slack/Telegram webhooks, SSH-key reads, AWS-creds reads), dangerous dynamic execution (`eval`, `exec`, `subprocess` with `shell=True`, `pickle.loads`), manifest/permission drift, and known typosquat dependencies. Outputs a 0-100 risk score, a BLOCK/REVIEW/CAUTION/CLEAN bucket, and per-finding evidence. **Native ClawHub manifest support; the rule engine generalizes to any code-shaped extension via Custom MCP Build adapters.** Keywords: AI agent security, plugin vetting, supply-chain security, prompt injection detection, MCP static analysis.

[![Status: v1.0.0](https://img.shields.io/badge/status-v1.0.0-brightgreen)](https://github.com/temurkhan13/openclaw-skill-vetter-mcp) [![License: MIT](https://img.shields.io/badge/license-MIT-blue)](./LICENSE) [![MCP](https://img.shields.io/badge/protocol-MCP-purple)](https://modelcontextprotocol.io/) [![PyPI](https://img.shields.io/pypi/v/openclaw-skill-vetter-mcp)](https://pypi.org/project/openclaw-skill-vetter-mcp/)

---

## What it does

Third-party AI agent extensions (Claude skills, ClawHub plugins, MCP servers themselves, agent tool packs, npm-distributed agent code) are code that runs inside your environment with your API keys, your filesystem access, and your network egress. The supply-chain attack surface is now broadly recognized and actively exploited:

- The [OWASP MCP Top 10](https://owasp.org/www-project-mcp-top-10/) catalogues prompt injection, command injection, and "rug pull" attacks where compromised MCP servers update with malicious tool definitions *after* user approval. Microsoft, Datadog, Atlassian, Palo Alto Unit 42, and Prompt Security have all published detailed threat analyses in 2026: see [Datadog's MCP risks blog](https://www.datadoghq.com/blog/monitor-mcp-servers/), [Atlassian's MCP Clients risk awareness](https://www.atlassian.com/blog/artificial-intelligence/mcp-risk-awareness), and [Unit 42's MCP sampling attack vectors](https://unit42.paloaltonetworks.com/model-context-protocol-attack-vectors/).
- The 2026 **ClawHavoc** campaign poisoned the OpenClaw skill registry (ClawHub) at scale. **Latest public counts: 824 confirmed malicious skills (~7.7% of a 10,700+ registry as of mid-Feb 2026)** per [Koi Security](https://www.koi.ai/blog/clawhavoc-341-malicious-clawedbot-skills-found-by-the-bot-they-were-targeting) and [The Hacker News](https://thehackernews.com/2026/02/researchers-find-341-malicious-clawhub.html). [Snyk's ToxicSkills study](https://snyk.io/blog/toxicskills-malicious-ai-agent-skills-clawhub/) flagged **prompt injection in 36% of skills and 1,467 malicious payloads**. [Bitdefender's independent analysis](https://www.bitdefender.com) puts the figure at ~900 skills, or ~20% of the ecosystem. [Antiy Labs catalogued 1,184 historically published](https://cyberpress.org/clawhavoc-poisons-openclaws-clawhub-with-1184-malicious-skills/) malicious skills. [Zscaler ThreatLabz documented the **DeepSeek-Claw skill** distributing Remcos RAT + GhostLoader](https://www.zscaler.com/blogs/product-insights/taming-agentic-threats-zscaler-visibility-and-guardrails-mitigate-openclaw), which exfiltrate macOS keychain data, SSH keys, crypto wallets, and cloud API tokens.
- The OpenClaw runtime itself accumulated **138+ CVEs in 2026** ([tracker](https://www.betterclaw.io/blog/openclaw-security-2026)) — one-click RCE [(CVE-2026-25253, CVSS 8.8)](https://www.armosec.io/blog/cve-2026-32922-openclaw-privilege-escalation-cloud-security/), browser-snapshot RCE ([CVE-2026-42436](https://www.redpacketsecurity.com/cve-alert-cve-2026-42436-openclaw-openclaw/)), privilege escalation, cross-site WebSocket hijacking, and more.

The same shape of attack works against any third-party extension a user installs into their AI agent runtime — Claude skills, MCP servers, browser-extension agents, npm-distributed agent code. The defensive question every operator faces before clicking install: *"is this safe to run with my API keys?"*

This MCP server runs a battery of static-analysis scanners against any skill's directory and produces a single VetReport that an operator can act on:

```
> claude: vet the data-extractor skill before I install it.
[MCP tool: vet_skill]

Skill 'data-extractor': BLOCK — do not install.
Risk score: 100/100. Findings: 1 critical, 4 high, 1 info.

Critical:
  EXFIL.WEBHOOK_DISCORD (extract.py:5) —
    Hardcoded Discord webhook URL: 'https://discord.com/api/webhooks/...'
    Recommendation: Refuse install unless explicitly justified.

High:
  AST.OS_SYSTEM (extract.py:14) — os.system('curl ... | bash')
  EXFIL.ENV_DUMP (extract.py:9) — dumps full os.environ
  MANIFEST.WILDCARD_PERMISSION — `network.http: *`
  ...

Vet result for data-extractor: REFUSE INSTALL.
```

```
> claude: any flagged skills currently installed?
[MCP tool: flagged_skills_report]

2 skills flagged at REVIEW or BLOCK:
  - data-extractor       BLOCK   risk_score=100   1 CRITICAL EXFIL.WEBHOOK_DISCORD
  - markdown-formatter   REVIEW  risk_score=35    1 HIGH AST.EVAL_CALL on user input
```

---

## Why `openclaw-skill-vetter-mcp`

Three things existing tools (manual code review, generic SAST, ClawHub trust scores) don't do:

1. **Skill-aware scanning.** Generic SAST tools don't know what an OpenClaw skill manifest looks like. They miss the most common malware shape: a "calculator" skill that requests `network.http: *`. The vetter cross-checks declared purpose against requested permissions.

2. **Risk score the operator can paste into a ticket.** Not "high cyclomatic complexity" — `BLOCK — Discord webhook at extract.py:5`. Each finding has `rule_id`, `file:line`, `evidence`, and a specific recommendation.

3. **Built for review-before-install, not after-the-fact audit.** Run it from inside Claude on a skill you're about to add. Get a verdict in seconds. Refuse the install if it's BLOCK; sandbox-test if REVIEW; install if CLEAN.

Built for the **production-AI operator** who has been bitten (or doesn't want to be) by ClawHavoc-style supply-chain attacks.

### How this fits in the OpenClaw security ecosystem

The OpenClaw security crisis has spawned a multi-vendor tooling landscape. This server's place in it:

| Layer | Vendor / project | Posture |
|-------|------------------|---------|
| Enterprise SaaS / SOC | [Cisco DefenseClaw](https://blogs.cisco.com/ai/cisco-announces-defenseclaw), [ClawSecure Watchtower](https://www.clawsecure.ai/), [Zscaler ThreatLabz](https://www.zscaler.com/blogs/product-insights/taming-agentic-threats-zscaler-visibility-and-guardrails-mitigate-openclaw), NemoClaw | Server-side, paid, integration-heavy, SIEM-aimed. Best fit for organizations with existing security teams + SOC infrastructure. |
| Best-practices guidance | [Microsoft Security Blog](https://www.microsoft.com/en-us/security/blog/2026/02/19/running-openclaw-safely-identity-isolation-runtime-risk/), [CrowdStrike](https://www.crowdstrike.com/en-us/blog/what-security-teams-need-to-know-about-openclaw-ai-super-agent/), [Conscia](https://conscia.com/blog/the-openclaw-security-crisis/) | Educational. No tooling. |
| Open-source / community | [SecureClaw](https://www.securityweek.com/openclaw-security-issues-continue-as-secureclaw-open-source-tool-debuts/), [openclaw-security-monitor](https://github.com/adibirzu/openclaw-security-monitor), [openclaw-dashboard](https://github.com/tugcantopaloglu/openclaw-dashboard), [slowmist's hardening guide](https://github.com/slowmist/openclaw-security-practice-guide) | Self-hosted runtime + dashboard tooling. Generally separate process / web UI. |
| **MCP-native (this layer)** | **`openclaw-skill-vetter-mcp` (this server)** + [`openclaw-output-vetter-mcp`](https://github.com/temurkhan13/openclaw-output-vetter-mcp) (claim verification) + [`openclaw-upgrade-orchestrator-mcp`](https://github.com/temurkhan13/openclaw-upgrade-orchestrator-mcp) (regression catalog + provider-fingerprint) | **Inline in the agent's own conversation**: Claude Desktop / Cursor / Cline calls these tools directly during a turn. Sub-second, free, MIT, local, read-only. Operator tooling that sits one step closer to the agent than enterprise SIEM reaches. |

**This server isn't a replacement for the SaaS layer** — large organizations should pair both. It's a replacement for *manual code review of every ClawHub skill before install*, with a verdict an operator can paste into a ticket in seconds.

---

## Tool surface

| Tool | What it returns |
|------|-----------------|
| `vet_skill` | Full VetReport for one skill: risk_score, risk_level, sorted findings, summary |
| `vet_skill_directory` | Aggregate report across every skill in the directory + per-bucket counts |
| `installed_skills_overview` | Lightweight: just bucket counts + flagged skill IDs |
| `flagged_skills_report` | Just REVIEW + BLOCK skills with their findings |
| `scan_for_prompt_injection` | Focused: only prompt-injection findings on one skill |
| `scan_for_exfiltration` | Focused: only exfiltration findings on one skill |
| `list_detection_rules` | Catalog of every rule the server applies (transparency) |

Resources:
- `skill-vetter://overview` — installed-skills risk overview
- `skill-vetter://flagged` — currently-flagged skills
- `skill-vetter://rules` — detection rules catalog

Prompts:
- `pre-install-skill-check` — vet a specific skill before installation
- `weekly-skill-audit` — compose a 200-word weekly audit of all installed skills

---

## Quickstart

### Install

```bash
pip install openclaw-skill-vetter-mcp
```

### Configure for Claude Desktop

Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):

```json
{
  "mcpServers": {
    "openclaw-skill-vetter": {
      "command": "python",
      "args": ["-m", "openclaw_skill_vetter_mcp"],
      "env": {
        "OPENCLAW_SKILL_VETTER_BACKEND": "mock"
      }
    }
  }
}
```

### Backends

| Backend | Status | Description |
|---------|--------|-------------|
| `mock` | ✅ v1.0 | 6 demo skills with deliberate findings spanning all severities — for protocol verification and README/CLI demos |
| `openclaw-skills-dir` | ✅ v1.0 | Reads `~/.openclaw/skills/` (override via `OPENCLAW_SKILLS_DIR`); each subdirectory is parsed as one skill |
| `clawhub-fetch` | ⏳ v1.1 | Fetches a candidate skill from the ClawHub registry directly for vet-before-install workflows |

### Skill manifest format

Each skill directory contains a `skill.yaml` (or `skill.json`):

```yaml
id: weather-fetch
name: Weather Fetch
version: 1.0.0
author: verified-publisher@openclaw.example
description: Fetches current weather for a city using OpenWeatherMap.
purpose: Live weather data lookup
runtime: python3.11
entry_point: main.py
permissions:
  - network.http: api.openweathermap.org
dependencies:
  - requests>=2.31
  - pydantic>=2.0
signature: ed25519:abcd1234efgh5678
```

Plus the actual code files (`*.py`, `*.js`, `*.ts`, `*.sh`, `*.rb`, `*.go`, `*.rs`) and any prompt files (`*.prompt`, `*.md`, `*.txt`).

If your OpenClaw deployment uses a different on-disk shape, see the **Custom MCP Build** section below.

---

## Detection rules (v1.0)

Four scanner modules cover the v1.0 ruleset:

**Manifest** — `MANIFEST.MISSING`, `MANIFEST.PURPOSE_NETWORK_DRIFT`, `MANIFEST.WILDCARD_PERMISSION`, `MANIFEST.BROAD_FILESYSTEM_WRITE`, `MANIFEST.EMPTY_DESCRIPTION`, `MANIFEST.NO_AUTHOR`, `MANIFEST.UNSIGNED`

**Static patterns** (text regex over code + prompts) —
- *Prompt-injection*: `PROMPT_INJ.IGNORE_PRIOR`, `PROMPT_INJ.ROLE_OVERRIDE`, `PROMPT_INJ.EXTRACT_SYSTEM`, `PROMPT_INJ.JAILBREAK_DAN`, `PROMPT_INJ.NEW_USER_MARKER`
- *Exfiltration*: `EXFIL.WEBHOOK_DISCORD`, `EXFIL.WEBHOOK_SLACK`, `EXFIL.WEBHOOK_TELEGRAM`, `EXFIL.PASTEBIN_LITERAL`, `EXFIL.SSH_KEY_READ`, `EXFIL.AWS_CREDS_READ`, `EXFIL.ENV_DUMP`, `EXFIL.SUBPROCESS_CURL`
- *Dynamic execution*: `DYN_EXEC.SHELL_TRUE`, `DYN_EXEC.OS_SYSTEM`, `DYN_EXEC.EVAL_LITERAL`, `DYN_EXEC.EXEC_LITERAL`, `DYN_EXEC.PICKLE_LOADS`, `DYN_EXEC.DYNAMIC_IMPORT`
- *Obfuscation*: `OBFUSCATION.LARGE_BASE64`, `OBFUSCATION.LARGE_HEX`
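
As an illustration of the static-pattern approach, the sketch below runs a few line-by-line regexes and reports `rule_id`, `file:line`, and the matched evidence. The rule IDs come from the list above, but the regular expressions are illustrative assumptions, not the patterns this server ships.

```python
import re

# Illustrative patterns only; the server's real regexes may differ.
PATTERNS = {
    "EXFIL.WEBHOOK_DISCORD": re.compile(
        r"https://discord(?:app)?\.com/api/webhooks/\d+/[\w-]+"
    ),
    "EXFIL.WEBHOOK_SLACK": re.compile(r"https://hooks\.slack\.com/services/[\w/]+"),
    "PROMPT_INJ.IGNORE_PRIOR": re.compile(
        r"ignore\s+(?:all\s+)?(?:previous|prior)\s+instructions", re.IGNORECASE
    ),
}


def scan_text(path: str, text: str) -> list[tuple[str, str, str]]:
    """Return (rule_id, 'file:line', evidence) for every pattern hit."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule_id, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                hits.append((rule_id, f"{path}:{line_no}", match.group(0)))
    return hits
```

Because the same function runs over code and prompt files alike, a webhook URL buried in a `.md` prompt is caught the same way as one in `extract.py`.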

**Python AST** (catches what regex misses) — `AST.EVAL_CALL`, `AST.EXEC_CALL`, `AST.COMPILE_CALL`, `AST.OS_SYSTEM`, `AST.OS_POPEN`, `AST.OS_EXECV`, `AST.SUBPROCESS_RUN_SHELL_TRUE`, `AST.SUBPROCESS_POPEN_SHELL_TRUE`, `AST.DYNAMIC_IMPORT`
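
A short sketch of why an AST pass complements the regexes: it sees real call nodes, so a call split across lines is still found and a mention inside a comment is not. This is an assumed, simplified implementation built on the standard `ast` module and covers only a subset of the rule IDs above.

```python
import ast

BUILTIN_RULES = {"eval": "AST.EVAL_CALL", "exec": "AST.EXEC_CALL", "compile": "AST.COMPILE_CALL"}
SUBPROCESS_RULES = {
    "run": "AST.SUBPROCESS_RUN_SHELL_TRUE",
    "Popen": "AST.SUBPROCESS_POPEN_SHELL_TRUE",
}


def scan_python_source(source: str) -> list[tuple[str, int]]:
    """Return (rule_id, line) pairs for dangerous calls, ordered by line."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id in BUILTIN_RULES:
            findings.append((BUILTIN_RULES[func.id], node.lineno))
        elif isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            module, attr = func.value.id, func.attr
            if module == "os" and attr == "system":
                findings.append(("AST.OS_SYSTEM", node.lineno))
            elif module == "subprocess" and attr in SUBPROCESS_RULES:
                # Only flag calls that pass the literal keyword shell=True.
                shell_true = any(
                    kw.arg == "shell"
                    and isinstance(kw.value, ast.Constant)
                    and kw.value.value is True
                    for kw in node.keywords
                )
                if shell_true:
                    findings.append((SUBPROCESS_RULES[attr], node.lineno))
    return sorted(findings, key=lambda item: item[1])
```

This sketch only matches direct names such as `os.system`; aliased imports (`import os as o`) would need extra name tracking.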

**Dependencies** — `DEP.TYPOSQUAT`, `DEP.HOMOGLYPH`, `DEP.UNTRUSTED_GIT_SOURCE`, `DEP.LOCAL_PATH`

Use `list_detection_rules` to query the live catalog.

---

## Risk scoring

Each finding contributes by severity:

| Severity | Weight |
|----------|-------:|
| CRITICAL | 40 |
| HIGH | 15 |
| MEDIUM | 5 |
| LOW | 1 |
| INFO | 0 |

Final `risk_score = min(sum, 100)`. Bucketing (first match wins):

| Bucket | Trigger |
|--------|---------|
| BLOCK | ≥1 CRITICAL or score ≥ 80 |
| REVIEW | ≥1 HIGH or score ≥ 50 |
| CAUTION | ≥1 MEDIUM or score ≥ 20 |
| CLEAN | no findings or only INFO |

Conservative by design: false positives are acceptable, missed criticals are not. If your operator workflow disagrees with a specific rule, you can filter by `category` on the client side, or fork and customize.
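
The weights and bucket rules above can be expressed in a few lines. This is a sketch derived from the two tables rather than the server's code; in particular, treating LOW-only results that score under 20 as CLEAN is an assumption, since the table does not spell out that case.

```python
from collections import Counter

WEIGHTS = {"CRITICAL": 40, "HIGH": 15, "MEDIUM": 5, "LOW": 1, "INFO": 0}


def risk_score(severities: list[str]) -> int:
    """Sum the per-finding weights and cap the total at 100."""
    return min(sum(WEIGHTS[s] for s in severities), 100)


def bucket(severities: list[str]) -> str:
    """Apply the bucket triggers in order; the first match wins."""
    counts = Counter(severities)
    score = risk_score(severities)
    if counts["CRITICAL"] >= 1 or score >= 80:
        return "BLOCK"
    if counts["HIGH"] >= 1 or score >= 50:
        return "REVIEW"
    if counts["MEDIUM"] >= 1 or score >= 20:
        return "CAUTION"
    # Assumption: anything left (no findings, INFO only, a few LOWs) is CLEAN.
    return "CLEAN"
```

Applied to the `data-extractor` example (1 critical, 4 high, 1 info), the sum is 40 + 4 × 15 + 0 = 100, which lands in BLOCK on both the CRITICAL trigger and the score trigger.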

---

## Roadmap

| Version | Scope | Status |
|---------|-------|--------|
| v1.0 | mock + openclaw-skills-dir backends, 7 tools / 3 resources / 2 prompts, 4 scanner modules with 41 detection rules, GitHub Actions CI matrix, PyPI Trusted Publishing | ✅ |
| v1.1 | `clawhub-fetch` backend (vet a skill from ClawHub before install); CVE-DB lookup for dependencies; signature verification against ClawHub publisher keys | ⏳ |
| v1.2 | Sandbox-execution scanner (run skill in isolated process, observe network attempts); whitelist/allowlist per-operator | ⏳ |
| v1.x | Custom rule packs; integration with existing SAST tools; per-rule severity overrides | ⏳ |

---

## Need this adapted to your stack?

If your AI deployment doesn't use the OpenClaw skill format — different agent harness, custom skill schema, monolithic skill files, internal-registry distribution — and you want the same vet-before-install discipline, that's a **Custom MCP Build** engagement.

| Tier | Scope | Investment | Timeline |
|------|-------|------------|----------|
| Simple | Single backend adapter for your existing skill format | **$8,000–$12,000** | 1–2 weeks |
| Standard | Custom backend + custom rule pack tuned to your ecosystem + CI integration | **$15,000–$25,000** | 2–4 weeks |
| Complex | Multi-format ingestion + sandbox-execution + signed-publisher allowlist + rule-tuning workshop | **$30,000–$45,000** | 4–8 weeks |

**To engage:**
1. Email **temur@pixelette.tech** with subject `Custom MCP Build inquiry — skill vetting`
2. Include: 1-paragraph description of your skill ecosystem + which tier you're considering
3. Expect a reply within 2 business days with a 30-minute discovery call slot

This server is part of a **production-AI infrastructure MCP suite** — companion to [silentwatch-mcp](https://github.com/temurkhan13/silentwatch-mcp), [openclaw-health-mcp](https://github.com/temurkhan13/openclaw-health-mcp), and [openclaw-cost-tracker-mcp](https://github.com/temurkhan13/openclaw-cost-tracker-mcp). Install all four for full operational visibility.

---

## Production AI audits

If you're running production AI and want an outside practitioner to score readiness, find the failure patterns already present (ClawHavoc-style skill malware being one of the most damaging), and write the corrective-action plan:

| Tier | Scope | Investment | Timeline |
|------|-------|------------|----------|
| Audit Lite | One system, top-5 findings, written report | **$1,500** | 1 week |
| Audit Standard | Full audit, all 14 patterns, 5 Cs findings, 90-day follow-up | **$3,000** | 2–3 weeks |
| Audit + Workshop | Standard audit + 2-day team workshop + first monthly audit included | **$7,500** | 3–4 weeks |

Same email channel: **temur@pixelette.tech** with subject `AI audit inquiry`.

---

## Contributing

PRs welcome. Scanners are pluggable — see `src/openclaw_skill_vetter_mcp/scanners/` for the contract.

To add a new scanner:

1. Create `scanners/<your_scanner>.py` exporting `SCANNER_NAME: str` and `def scan(skill: Skill) -> list[Finding]`
2. Optionally export `def all_rules() -> list[tuple[...]]` for the rules catalog
3. Register in `analysis.vet_skill` (the orchestrator iterates over a fixed tuple of scanner modules)
4. Add tests in `tests/test_scanners.py`
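
A scanner module following the steps above might look roughly like this. The `Skill` and `Finding` dataclasses here are simplified stand-ins so the example runs on its own; the real models live in the package and will have different fields. The rule ID `SECRETS.GITHUB_TOKEN` and the three-element tuple returned by `all_rules` are likewise hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Stand-in for the package's Skill model (assumed shape)."""
    id: str
    files: dict[str, str] = field(default_factory=dict)  # relative path -> text


@dataclass
class Finding:
    """Stand-in for the package's Finding model (assumed shape)."""
    rule_id: str
    severity: str
    location: str
    evidence: str


SCANNER_NAME: str = "hardcoded_tokens"
_RULE_ID = "SECRETS.GITHUB_TOKEN"  # hypothetical rule, not in the v1.0 catalog
_GITHUB_TOKEN = re.compile(r"\bghp_[A-Za-z0-9]{36}\b")


def scan(skill: Skill) -> list[Finding]:
    """Flag hardcoded GitHub personal access tokens in any skill file."""
    findings = []
    for path, text in sorted(skill.files.items()):
        for line_no, line in enumerate(text.splitlines(), start=1):
            match = _GITHUB_TOKEN.search(line)
            if match:
                # Truncate the evidence so the report does not leak the full token.
                evidence = match.group(0)[:8] + "..."
                findings.append(Finding(_RULE_ID, "HIGH", f"{path}:{line_no}", evidence))
    return findings


def all_rules() -> list[tuple[str, str, str]]:
    """Optional catalog hook: (rule_id, severity, description)."""
    return [(_RULE_ID, "HIGH", "Hardcoded GitHub personal access token")]
```

Check the existing modules under `scanners/` for the actual `Skill` and `Finding` fields before adapting this.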

To add a new backend:

1. Subclass `SkillBackend` in `backends/<your_backend>.py`
2. Implement `get_skills`, `get_skill_by_id`, `get_directory`
3. Register in `backends/__init__.py`
4. Add tests in `tests/test_backend_<your_backend>.py`

Bug reports and feature requests: open a GitHub issue. False-positive reports: include the skill snippet that triggered the rule and we'll tune it.

---

## License

MIT — see [LICENSE](./LICENSE).

---

## Related

- [Production-AI MCP Suite (Gumroad bundle)](https://temurah.gumroad.com/l/production-ai-mcp-suite) — this server plus 5 others in one curated 6-pack bundle with a decision tree, day-one drill, and Custom MCP Build CTA. $99, or $49 with `LAUNCH50` for the first 30 days.
- [silentwatch-mcp](https://github.com/temurkhan13/silentwatch-mcp) — cron silent-failure detection
- [openclaw-health-mcp](https://github.com/temurkhan13/openclaw-health-mcp) — deployment health
- [openclaw-cost-tracker-mcp](https://github.com/temurkhan13/openclaw-cost-tracker-mcp) — token-cost telemetry + 429 prediction (v1.1+)
- [openclaw-upgrade-orchestrator-mcp](https://github.com/temurkhan13/openclaw-upgrade-orchestrator-mcp) — read-only upgrade advisor + provider-side regression detection (v1.2+)
- [openclaw-output-vetter-mcp](https://github.com/temurkhan13/openclaw-output-vetter-mcp) — agent claim verification (inline grounding-check + swallowed-exception scanner + multi-turn transcript review)
- [AI Production Discipline Framework](https://temurah.gumroad.com/l/ai-production-discipline-framework) — Notion template, $29 — methodology these MCPs implement
- [SPEC.md](./SPEC.md) — full server design

---

Built by [Temur Khan](https://www.notion.so/@temurkhan) — independent practitioner on production AI systems.
Contact: **temur@pixelette.tech**
