Metadata-Version: 2.4
Name: skillscan-fuzzer
Version: 1.0.0
Summary: LLM-powered adversarial skill file fuzzer for skillscan-security
Project-URL: Homepage, https://github.com/kurtpayne/skillscan-fuzzer
Project-URL: Repository, https://github.com/kurtpayne/skillscan-fuzzer
Project-URL: Issues, https://github.com/kurtpayne/skillscan-fuzzer/issues
License: Apache-2.0
License-File: LICENSE
Keywords: ai,fuzzing,llm,mcp,red-team,security,skill
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.11
Requires-Dist: click>=8.1.0
Requires-Dist: openai>=1.30.0
Requires-Dist: requests>=2.28.0
Provides-Extra: dev
Requires-Dist: mypy>=1.10.0; extra == 'dev'
Requires-Dist: pre-commit>=3.0.0; extra == 'dev'
Requires-Dist: pytest-cov>=5.0.0; extra == 'dev'
Requires-Dist: pytest>=8.2.0; extra == 'dev'
Requires-Dist: ruff>=0.4.4; extra == 'dev'
Description-Content-Type: text/markdown

# skillscan-fuzzer

[![CI](https://github.com/kurtpayne/skillscan-fuzzer/actions/workflows/ci.yml/badge.svg)](https://github.com/kurtpayne/skillscan-fuzzer/actions/workflows/ci.yml)
[![PyPI version](https://img.shields.io/pypi/v/skillscan-fuzzer.svg)](https://pypi.org/project/skillscan-fuzzer/)
[![Python](https://img.shields.io/pypi/pyversions/skillscan-fuzzer.svg)](https://pypi.org/project/skillscan-fuzzer/)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)

**M19 — LLM-powered adversarial SKILL.md variant generator**

`skillscan-fuzzer` generates adversarial variants of AI skill files to probe the detection boundaries of `skillscan`. It uses an OpenAI-compatible LLM API (GPT-4.1-mini by default, or any local Ollama model) to apply semantic mutations to seed skill files, producing variants with unified diffs and optional scan results.

The fuzzer is the controlled-input complement to the public scan feed (M14): instead of scanning skills found in the wild, it generates skills designed to stress-test the scanner's static rules and ML model. The output feeds directly into `skillscan-trace` (M18) for behavioral verification.

---

## Installation

```bash
# From PyPI
pip install skillscan-fuzzer

# Or as an extra from the main package
pip install skillscan-security[fuzzer]

# Development install from source
git clone https://github.com/kurtpayne/skillscan-fuzzer
cd skillscan-fuzzer
pip install -e ".[dev]"
```

---

## Quick Start

```bash
# Fuzz 5 variants of each malicious seed using evasion strategy
skillscan-fuzzer --strategy evasion --variants 5 --seed-dir path/to/corpus/adversarial

# Fuzz a specific seed with injection strategy and run skillscan on each variant
skillscan-fuzzer --strategy injection \
             --seed-file path/to/corpus/benign/gh_example.md \
             --scan

# Use a local Ollama endpoint (no API key needed)
skillscan-fuzzer --strategy obfuscation \
             --model llama3.1:8b \
             --base-url http://localhost:11434/v1 \
             --api-key ollama \
             --seed-dir path/to/seeds

# Dry run to verify output structure without making LLM calls
skillscan-fuzzer --strategy evasion --dry-run --variants 2 --seed-dir path/to/seeds
```

Seeds are not bundled — use [skillscan-corpus](https://github.com/kurtpayne/skillscan-corpus) or provide your own.

---

## Mutation Strategies

| Strategy | Seed Type | Goal |
|---|---|---|
| `evasion` | malicious | Preserve malicious intent while changing surface patterns to evade rule matching |
| `injection` | benign | Add one subtle attack vector to an otherwise clean skill |
| `benign_drift` | benign | Add security-adjacent vocabulary without actual malice (false-positive stress test) |
| `obfuscation` | malicious | Apply encoding/homoglyph/zero-width obfuscation to hide attack patterns |
| `authority` | any | Inject fake system headers, vendor impersonation, or role-escalation claims |

---

## Output Structure

```
fuzz-output/
  evasion/
    ah01_goal_substitution_calendar/
      variant_001.md          ← complete mutated SKILL.md
      variant_001.diff        ← unified diff against original
      variant_001.scan.json   ← skillscan SARIF result (if --scan)
      variant_002.md
      variant_002.diff
      ...
    summary.json              ← evasion rate, detection rate, per-seed results
```

The `summary.json` contains:

```json
{
  "strategy": "evasion",
  "total_seeds": 10,
  "total_variants": 50,
  "errors": 0,
  "scanned": 50,
  "evasion_rate": 0.34,
  "false_positive_rate": null,
  "per_seed": [...]
}
```

---

## CLI Reference

```
Usage: skillscan-fuzzer [OPTIONS]

Options:
  -s, --strategy [evasion|injection|benign_drift|obfuscation|authority]
                                  Mutation strategy  [default: evasion]
  -n, --variants INTEGER          Variants per seed  [default: 5]
  --seed-dir DIRECTORY            Directory of seed SKILL.md files
  --seed-file PATH                Specific seed file (repeatable)
  -o, --output-dir DIRECTORY      Output root  [default: fuzz-output]
  -m, --model TEXT                LLM model  [default: gpt-4.1-mini]
  --base-url TEXT                 OpenAI-compatible API base URL
  --api-key TEXT                  API key (reads OPENAI_API_KEY env var)
  --temperature FLOAT             Sampling temperature  [default: 0.9]
  --max-tokens INTEGER            Max response tokens  [default: 4096]
  --scan / --no-scan              Run skillscan on each variant  [default: no-scan]
  --dry-run                       Skip LLM calls; write placeholder variants
  --max-seeds INTEGER             Limit number of seeds processed
  -v, --verbose                   Enable debug logging
  -h, --help                      Show this message and exit
```

---

## API Key Configuration

The fuzzer reads `OPENAI_API_KEY` from (in priority order):

1. `--api-key` CLI flag
2. `OPENAI_API_KEY` environment variable
3. `~/.skillscan-secrets` (the shared credential store used by all skillscan tools)

For Ollama, pass `--base-url http://localhost:11434/v1 --api-key ollama`.

---

## Corpus Integration

Generated variants are intended to be reviewed and, if they represent genuine evasion gaps or new attack patterns, committed to `skillscan-corpus` as labeled training examples. The recommended workflow:

1. Run the fuzzer with `--scan` to identify variants that evade detection.
2. Manually review evading variants for quality (does the attack intent survive?).
3. Commit confirmed evasion variants to `skillscan-corpus/adversarial/` with label `malicious`.
4. Update the static rules or retrain the ML model to close the gap.
5. Repeat.

For behavioral verification (rather than static rule verification), pipe variants into `skillscan-trace` (M18) using the bundled pipeline script:

```bash
python scripts/fuzzer_tracer_pipeline.py \
  --seeds path/to/skillscan-corpus/adversarial/agent_hijacker \
  --dry-run
```

---

## Running Tests

```bash
pip install -e ".[dev]"
pytest tests/ -v
```

All tests use `--dry-run` mode and do not require an API key or network access.
