Metadata-Version: 2.4
Name: mcp-airlock-crunchtools
Version: 0.2.1
Summary: Secure MCP server for quarantined web content extraction — two-layer defense against prompt injection
Author: crunchtools.com
License-Expression: AGPL-3.0-or-later
License-File: LICENSE
Keywords: mcp,prompt-injection,quarantine,sanitization,security
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Requires-Dist: beautifulsoup4>=4.12
Requires-Dist: dbus-fast>=2.0
Requires-Dist: fastmcp>=2.0
Requires-Dist: httpx>=0.28
Requires-Dist: markdownify>=0.14
Requires-Dist: numpy>=1.24
Requires-Dist: onnxruntime>=1.17
Requires-Dist: pydantic>=2.0
Requires-Dist: transformers>=4.40
Provides-Extra: dev
Requires-Dist: mypy>=1.13; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.8; extra == 'dev'
Description-Content-Type: text/markdown

# mcp-airlock-crunchtools

<!-- mcp-name: io.github.crunchtools/airlock -->

Secure MCP server for quarantined web content extraction with two-layer prompt injection defense.

## Tools (6)

| Tool | Layers | Description |
|------|--------|-------------|
| `safe_fetch` | L1 | Fetch URL, sanitize, return markdown. Fails on injection. |
| `safe_read` | L1 | Read local file, sanitize, return markdown. Fails on injection. |
| `quarantine_fetch` | L1+L2 | Fetch URL, sanitize, extract via Q-Agent. Warns on injection. |
| `quarantine_read` | L1+L2 | Read local file, sanitize, extract via Q-Agent. Warns on injection. |
| `quarantine_scan` | L1+L2 | Pre-flight scan: detect injection vectors WITHOUT returning content. |
| `quarantine_stats` | — | Session stats, config, and blocklist summary. |

## Architecture

- **Layer 1 (Deterministic):** 7-stage sanitization pipeline strips hidden HTML, invisible unicode, encoded payloads, exfiltration URLs, and LLM delimiters.
- **Layer 2 (Q-Agent):** Quarantined Gemini Flash-Lite LLM for semantic content extraction — NO tools, NO memory, NO SDK.

## Install

```bash
# PyPI
pip install mcp-airlock-crunchtools

# uvx (zero-install)
uvx mcp-airlock-crunchtools

# Container
podman run quay.io/crunchtools/mcp-airlock
```

## Configuration

```bash
# Required for Layer 2 (Q-Agent)
export GEMINI_API_KEY=your-key

# Optional
export QUARANTINE_MODEL=gemini-2.0-flash-lite  # default
export QUARANTINE_FALLBACK=layer1              # or "fail"
export QUARANTINE_MAX_CONTENT=100000           # max chars to Q-Agent
export QUARANTINE_DB=/data/airlock.db          # SQLite blocklist path
export QUARANTINE_TRUST_CONFIG=~/.config/mcp-env/mcp-airlock-trust.json
```

## Claude Code

```json
{
  "mcpServers": {
    "mcp-airlock-crunchtools": {
      "command": "uvx",
      "args": ["mcp-airlock-crunchtools"]
    }
  }
}
```

## License

AGPL-3.0-or-later
