Metadata-Version: 2.4
Name: warrantd-gateway
Version: 0.2.0
Summary: MCP gateway for warrantd — earned autonomy in front of any MCP server.
Project-URL: Homepage, https://github.com/moritzkazooba-wq/warrantd
Project-URL: Repository, https://github.com/moritzkazooba-wq/warrantd
Author: Moritz
License: MIT
License-File: LICENSE
Keywords: agents,approval,audit,autonomy,gateway,mcp,trust
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: httpx>=0.27
Requires-Dist: mcp<2,>=1.27
Requires-Dist: pyyaml>=6.0
Requires-Dist: starlette>=0.40
Requires-Dist: uvicorn>=0.30
Requires-Dist: warrantd-core<0.4.0,>=0.3.0
Description-Content-Type: text/markdown

# warrantd-gateway

The warrantd MCP gateway: a proxy that sits between any agent and any MCP
server and wraps every downstream tool with risk classes, caps, approval
routing, and earned autonomy — **zero changes to the agent**.

> Put earned autonomy in front of any MCP server in five minutes.

## Five-minute quickstart

```bash
pip install warrantd          # meta-package: core + gateway
warrantd init                 # writes a commented warrantd.yaml skeleton
$EDITOR warrantd.yaml         # add your MCP servers under 'servers:'
warrantd init                 # re-run: discovers tools; each lands consequential/manual
warrantd run --shadow         # observe-only; point your agent at the gateway
# ... let the agent work for a while ...
warrantd report               # "warrantd would have gated N of M actions"
warrantd verify-audit         # the evidence chain is intact
```

Point your MCP client at the gateway instead of the downstream server, e.g.
in a Claude Desktop-style config:

```json
{
  "mcpServers": {
    "gated-tools": { "command": "warrantd", "args": ["run", "--shadow"] }
  }
}
```

When the would-have report looks right, set `mode: enforce` in
`warrantd.yaml` (or run with `--enforce`).

## Safety invariants (non-negotiable)

- **Safe-by-default tool surface.** Any unclassified tool — *including tools
  that appear downstream mid-session* — is treated as `CONSEQUENTIAL` +
  `MANUAL`. A compromised or updated MCP server expanding its tool surface can
  never grant un-vetted capability. (This is the gateway's answer to the
  OWASP Agentic AI tool-misuse class: the tool surface an agent can exercise
  is pinned by your policy, not by whatever the downstream server announces.)
- **Ceilings never move.** `hard_cap`, `auto_cap`, and `max_state` are
  enforced inside `warrantd-core` on every call; no metric, signal, or score
  can raise them.
- **Shadow mode never blocks.** `mode: shadow` forwards *everything* — even
  calls warrantd would BLOCK and calls under a tripped kill switch — and only
  records what it would have done. The gateway earns its own place in your
  stack before it enforces anything.
- **Earned autonomy stays earned, verifiably.** Approval evidence and
  decisions live in one hash-chained, append-only SQLite log;
  `warrantd run --enforce` refuses to start on a broken chain.
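
As an illustration of the safe-by-default rule, here is a minimal sketch of a policy lookup that falls back to `consequential` + `manual` for anything not explicitly classified. `ToolPolicy`, `SAFE_DEFAULT`, and `resolve_policy` are hypothetical names chosen for this example, not the gateway's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    # Hypothetical shape; defaults mirror the documented fallback.
    risk: str = "consequential"
    max_state: str = "manual"


SAFE_DEFAULT = ToolPolicy()


def resolve_policy(
    classified: dict[str, dict[str, ToolPolicy]], server: str, tool: str
) -> ToolPolicy:
    """Return the configured policy, or the safe default for unknown tools."""
    return classified.get(server, {}).get(tool, SAFE_DEFAULT)


policies = {"github": {"create_issue": ToolPolicy("reversible_write", "supervised")}}

# A tool that shows up mid-session has no entry, so it gets the safe default.
print(resolve_policy(policies, "github", "brand_new_tool"))
```

The point of the design is that the fallback lives in the lookup itself, so a missing entry can never resolve to a more permissive policy.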

## warrantd.yaml reference

```yaml
version: 1                  # required
mode: shadow                # shadow | enforce
store: .warrantd/state.db   # SQLite path, relative to this file
tenant_id: default          # optional; stamped on every request

servers:                    # downstream MCP servers (stdio in this release)
  github:
    command: uvx
    args: [mcp-server-github]
    env:
      GITHUB_TOKEN: "${GITHUB_TOKEN}"   # ${VAR} reads your environment
    prefix: ""              # optional; prepended to exposed tool names
                            # (required if two servers expose the same name)

graduation:                 # optional; defaults shown
  supervised:  { pass_rate: 0.95, adversarial_pass_rate: 0.0,  min_samples: 10 }
  autonomous:  { pass_rate: 0.99, adversarial_pass_rate: 0.90, min_samples: 50 }
  approval_window: 20       # trailing approvals considered as evidence

tools:                      # per-server classification
  github:
    create_issue:
      risk: reversible_write    # read | reversible_write | consequential
      max_state: supervised     # manual | supervised | autonomous
      auto_cap: "100"           # quoted decimals; auto-approve at/below
      hard_cap: "1000"          # never auto-approve above
      value_param: amount       # argument that carries the request's value
    delete_repo:
      risk: consequential
      max_state: manual
# Anything not listed here is consequential/manual. Always.
```
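
One plausible reason the caps are quoted decimals is exact comparison: binary floats can misjudge values that sit right at a cap. The sketch below assumes caps are parsed with Python's `decimal.Decimal` and only answers whether a value is eligible for auto-approval; `auto_approvable` is a hypothetical helper, and what happens to values above `auto_cap` or `hard_cap` is decided by `warrantd-core`, not modelled here.

```python
from decimal import Decimal


def auto_approvable(value: str, auto_cap: str, hard_cap: str) -> bool:
    """True when value is at/below auto_cap and never above hard_cap.

    All three arguments are decimal strings, as in warrantd.yaml.
    """
    v = Decimal(value)
    return v <= Decimal(auto_cap) and v <= Decimal(hard_cap)


print(auto_approvable("100", "100", "1000"))     # at the cap
print(auto_approvable("100.01", "100", "1000"))  # just above the cap

# Why not floats: 0.1 + 0.2 overshoots 0.3 in binary floating point.
print(0.1 + 0.2 <= 0.3)
print(Decimal("0.1") + Decimal("0.2") <= Decimal("0.3"))
```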

## How a call flows

`tools/call` → resolve against the registry → `TrustLayer.evaluate()`:

- **ALLOW** → forwarded downstream, result returned unchanged.
- **BLOCK** → an in-band `isError` tool result carrying the reason (your
  agent's model sees *why* and can adapt) — never a protocol error.
- **REQUIRE_APPROVAL** → routed to the configured approval gate (see
  *Approval surfaces* below). If no approval surface is configured, the
  request is declined with an explanatory result and recorded as
  `pending_no_surface`.
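
The three outcomes can be sketched as a small dispatcher. This is an illustration under assumptions: `Decision`, `error_result`, and `handle_call` are hypothetical names, and the only detail taken from the MCP specification is the shape of an in-band tool error (a result with `content` and `isError: true`).

```python
from enum import Enum
from typing import Any, Callable

ToolResult = dict[str, Any]


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REQUIRE_APPROVAL = "require_approval"


def error_result(text: str) -> ToolResult:
    """An in-band MCP tool error: the model sees the text, the protocol call succeeds."""
    return {"content": [{"type": "text", "text": text}], "isError": True}


def handle_call(
    decision: Decision,
    reason: str,
    forward: Callable[[], ToolResult],
    request_approval: Callable[[], ToolResult],
) -> ToolResult:
    if decision is Decision.ALLOW:
        return forward()  # downstream result returned unchanged
    if decision is Decision.BLOCK:
        return error_result(f"Blocked by policy: {reason}")
    return request_approval()  # hand off to whatever approval gate is configured


downstream = {"content": [{"type": "text", "text": "issue created"}], "isError": False}
print(handle_call(Decision.BLOCK, "value exceeds hard_cap", lambda: downstream, lambda: {}))
```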

Every human approval recorded against an action class is graduation evidence:
clean approvals inside the window move the class toward `SUPERVISED` per your
thresholds, and one override or error drops it back. `AUTONOMOUS` on a
`CONSEQUENTIAL` class additionally requires an adversarial eval signal —
approval history alone can never unlock the top tier.
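
To make the thresholds concrete, here is a rough sketch of a qualification check using the default values from the `graduation:` block. It rests on assumptions not confirmed here: that `min_samples` counts all recorded outcomes while `pass_rate` is computed over the trailing `approval_window`, and that the adversarial signal arrives as a single rate. The drop-back on an override or error is not modelled. `Threshold` and `qualifies` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    pass_rate: float
    adversarial_pass_rate: float
    min_samples: int


SUPERVISED = Threshold(pass_rate=0.95, adversarial_pass_rate=0.0, min_samples=10)
AUTONOMOUS = Threshold(pass_rate=0.99, adversarial_pass_rate=0.90, min_samples=50)


def qualifies(
    outcomes: list[bool],
    threshold: Threshold,
    window: int = 20,
    adversarial_rate: float = 0.0,
) -> bool:
    """outcomes: True for a clean approval, False for a denial, override, or error."""
    if len(outcomes) < threshold.min_samples:  # assumption: counts all samples
        return False
    recent = outcomes[-window:]  # assumption: rate uses the trailing window
    rate = sum(recent) / len(recent)
    return (
        rate >= threshold.pass_rate
        and adversarial_rate >= threshold.adversarial_pass_rate
    )


print(qualifies([True] * 10, SUPERVISED))
print(qualifies([True] * 50, AUTONOMOUS))  # no adversarial signal supplied
```

Under these assumptions a spotless approval history still fails the `AUTONOMOUS` check until an adversarial rate of at least 0.90 is supplied, which matches the rule that approval history alone cannot unlock the top tier.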

## Approval surfaces

When a `dashboard:` section is configured, `warrantd run` serves the approval
inbox next to the proxy and **holds approval-required calls open** (default
300 s, `approval.timeout_seconds`) while approvers decide — in the dashboard,
from a Slack ping, or via a webhook integration. Every request shows its
graduation context: *"this class is 2/5 approvals from SUPERVISED — your
decision feeds its trust record."* Approval is investment, not interruption —
that is this project's answer to consent fatigue.

- A **timeout** returns an error inviting the agent to retry; a human decision
  arriving *after* the timeout still becomes chained trust evidence, so
  nothing is lost. Set your MCP client's tool timeout (e.g. Claude Code's
  `MCP_TOOL_TIMEOUT`) at or above the approval timeout for the smoothest UX.
- A **denial** is evidence too: it drags the class's pass rate down.
- **Graduation quorum**: a CONSEQUENTIAL class graduates only when its
  evidence qualifies *and* `approval.graduation_quorum` distinct approvers
  (default 2) ratify the unlock — the two-person rule at the exact
  high-stakes choke point. Ratification only ever releases headroom already
  configured in `max_state`; nothing can raise a ceiling above the YAML.
- `warrantd dashboard` serves a standalone **read-only** view of autonomy
  states, graduations, and the audit log.
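
The hold-with-timeout and quorum behaviour described above can be sketched with the standard library alone. This is not the gateway's implementation: `hold_for_approval` and `quorum_met` are hypothetical names, and the sketch assumes each pending decision is represented by an `asyncio.Future`. `asyncio.shield` keeps the future alive after a timeout, so a late human decision can still be recorded.

```python
import asyncio


async def hold_for_approval(decision: asyncio.Future, timeout_seconds: float = 300.0) -> str:
    """Wait for a human decision; report 'timeout' without cancelling the future."""
    try:
        return await asyncio.wait_for(asyncio.shield(decision), timeout_seconds)
    except asyncio.TimeoutError:
        return "timeout"


def quorum_met(votes: list[tuple[str, bool]], quorum: int = 2) -> bool:
    """Count distinct approvers; repeat votes by one person count once."""
    approvers = {name for name, approved in votes if approved}
    return len(approvers) >= quorum


async def demo() -> tuple[str, str, str]:
    loop = asyncio.get_running_loop()
    fast, slow = loop.create_future(), loop.create_future()
    loop.call_later(0.01, fast.set_result, "approved")
    first = await hold_for_approval(fast, timeout_seconds=1.0)
    second = await hold_for_approval(slow, timeout_seconds=0.05)
    slow.set_result("approved")  # late decision still lands on the future
    return first, second, slow.result()


print(asyncio.run(demo()))
print(quorum_met([("alice", True), ("alice", True)]))
```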

## Security model, honestly

- **Named approvers**: every approval, denial, and ratification vote is
  chained under the authenticated approver's name. Tokens live in env vars
  (`token_env`) or as committed sha256 digests (`token_sha256`) — never
  plaintext in config.
- **Bearer tokens, no built-in rotation or expiry in this release**: to rotate
  a token, edit the config and restart. The login cookie carries the raw token
  (HttpOnly, SameSite=Lax), so keep the dashboard on localhost or behind TLS
  termination.
- **Identities are attributed, not signed**: the gateway process writes the
  chain after authenticating the token; entries are not cryptographically
  signed by the approver. Signed entries remain roadmapped.
- **Dev mode** (`dashboard.auth: none`, localhost only) attributes every
  decision to the synthetic approver `dev` — unattributed evidence, dev only.
- Slack resolution happens **in the dashboard** (the message links to it), so
  approval identity never depends on trusting Slack's sender. Webhook
  deliveries are HMAC-signed when `secret_env` is set; callbacks authenticate
  with a registered approver's bearer token.
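
For readers wiring up an integration, here is a sketch of the two checks mentioned above: comparing a presented bearer token against a committed sha256 digest, and signing or verifying a webhook body with HMAC. The function names are hypothetical, and HMAC-SHA256 with a hex signature is an assumption; this README does not state which hash or encoding the gateway uses.

```python
import hashlib
import hmac


def token_matches(presented: str, token_sha256: str) -> bool:
    """Compare sha256(presented) to the committed digest in constant time."""
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, token_sha256.lower())


def sign_webhook(secret: str, body: bytes) -> str:
    # Assumption: HMAC-SHA256, hex-encoded.
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_webhook(secret: str, body: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign_webhook(secret, body), signature)


committed = hashlib.sha256(b"example-token").hexdigest()  # what would sit in config
print(token_matches("example-token", committed))

signature = sign_webhook("example-secret", b'{"request_id": 1}')
print(verify_webhook("example-secret", b'{"request_id": 2}', signature))
```

`hmac.compare_digest` is used for both comparisons so that the check does not leak how many leading characters matched.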

## Audit integrity, honestly

Each audit row hashes its payload together with the previous row's hash;
`warrantd verify-audit` walks the chain and reports the first edit, deletion,
insertion, or reordering. The chain proves **sequence integrity, not writer
authenticity** — it cannot detect the writer itself truncating the tail.
Signed entries (non-repudiation) are on the roadmap. Decision rows store the
full tool-call arguments locally for the report; argument redaction hooks are
likewise roadmapped.

Full documentation lives in the
[repository README](https://github.com/moritzkazooba-wq/warrantd).
