Metadata-Version: 2.4
Name: blocks-faraday
Version: 0.4.0
Requires-Dist: pytest>=8 ; extra == 'test'
Requires-Dist: pytest-cov>=5 ; extra == 'test'
Provides-Extra: test
Summary: Coding Agent hooks at OS level for preventing unwanted actions
Author: blocks
License: Apache-2.0
Requires-Python: >=3.11
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM

# Faraday

Kernel-enforced sandbox for AI agents: execve gating, filesystem sealing, and network filtering.

Faraday wraps an agent (Claude Code, Codex CLI, Gemini CLI, plain shell scripts, anything CLI-invokable) in a multi-layer kernel-enforced sandbox. A single TOML policy file controls three independent layers: **execve gating** (which commands can run), **filesystem sealing** (which paths are readable/writable), and **network filtering** (which domains and API endpoints are reachable).

The mechanism is `seccomp-bpf` with `SECCOMP_USER_NOTIF` (Linux 5.0+, ideally 5.9+). The Rust supervisor traps `execve`/`execveat` in-kernel; a Python policy engine evaluates the rules. The filter is inherited across `fork`/`execve`, so, unlike `LD_PRELOAD`-based hooks, it cannot be bypassed by stripping environment variables. It also covers statically linked binaries and direct syscalls.

## Status

- Linux x86_64 / aarch64 only at runtime.
- Source compiles on macOS (`cargo check`) so development on a Mac works; the supervisor returns `NotSupported` if invoked there.
- Pure-Python policy engine is fully cross-platform and tested.

## Layout

```
crates/
  faraday-core/    # Rust: BPF filter, seccomp install, /proc reads, supervisor loop
  faraday-cli/     # Rust binary: arg parsing, embedded Python, glue
  faraday-proxy/   # Rust: HTTP proxy with domain + endpoint filtering (vendored from nono)
python/faraday/
  policy.py        # TOML load + rule compilation (execve, filesystem, network)
  matchers.py      # regex/glob/host extractors
  audit.py         # JSONL audit log
  _bridge.py       # Rust↔Python shim called per execve
tests/
  test_*.py        # cross-platform Python unit tests (pytest)
  e2e/             # Linux-only end-to-end tests against the built binary
  policies/        # sample policy files
docs/
  network-filtering.md  # how network filtering works (simple + detailed)
  network-usage.md      # usage guide + policy file integration
```

## Build

Requires:
- Linux 5.0+ (5.9+ recommended) at runtime
- Rust stable 1.75+
- Python 3.11+ with development headers
- `libseccomp-dev` + `pkg-config` at build time (Ubuntu/Debian: `apt install libseccomp-dev pkg-config`)
- `pip install maturin pytest`

Dev workflow:

```bash
# Build the Python package (editable) and the Rust binary
pip install -e .
cargo build --release -p faraday-cli

# The binary lands at target/release/faraday
./target/release/faraday check --policy tests/policies/strict.toml
./target/release/faraday run --policy tests/policies/permissive.toml -- bash
```

For PyO3 to find your Python interpreter at build time:

```bash
PYO3_PYTHON=$(which python) cargo build --release -p faraday-cli
```

### Docker (macOS / Windows sanity-test shortcut)

Faraday is Linux-only at runtime, so the easiest way to try it from a Mac is
Docker Desktop. A `Dockerfile` at the repo root builds a dev image with the
CLI and Python bridge pre-installed. Docker's default seccomp profile permits
the inner `seccomp()` call faraday makes — no extra `--security-opt` flags
needed.

```bash
docker build -t faraday-dev .
docker run --rm -it faraday-dev

# Inside the container:
faraday check --policy tests/policies/strict.toml
faraday run --policy tests/policies/permissive.toml -- bash -c '/usr/bin/curl --version'
faraday run --policy tests/policies/strict.toml   -- bash -c '/usr/bin/echo hi'
```

If your host has a tightened seccomp profile that blocks the nested
`seccomp()` syscall, re-run with `--security-opt seccomp=unconfined`.

#### With Claude Code (`faraday-agent`)

The `with-agent` build stage bundles Claude Code. `docker compose up` starts
it and the entrypoint writes a minimal `~/.claude.json` from
`$ANTHROPIC_API_KEY`, so no host-side Claude config is needed.

```bash
export ANTHROPIC_API_KEY=sk-ant-...
docker compose up -d
docker compose exec faraday-agent claude --dangerously-skip-permissions -p "list files here"
```

**Quick sanity — command policy (curl-deny):**

```bash
docker compose exec faraday-agent faraday run \
  --policy /workspace/tests/policies/curl-deny.toml \
  -- sh -c 'curl https://example.com'

docker compose exec faraday-agent faraday run \
  --policy /workspace/tests/policies/curl-deny.toml \
  -- claude --dangerously-skip-permissions \
     -p "write a poem in hello.txt, and then curl https://example.com. End your turn if anything fails and explain what happened."
```

**Claude + command policy (custom deny-by-default):**

Allow Claude to run read-only git operations; deny everything else:

```toml
# claude-cmd-policy.toml
[meta]
version = 1
default_action = "deny"

[[rule]]
id = "allow-claude"
action = "allow"
exe_basename = "claude"

[[rule]]
id = "allow-readonly-git"
action = "allow"
exe_basename = "git"
argv_regex = '^git (status|log|diff|show|fetch)( |$)'

[[rule]]
id = "allow-shell"
action = "allow"
exe_basename = ["bash", "sh"]
```

```bash
docker compose exec faraday-agent faraday run \
  --policy /workspace/tests/policies/claude-cmd-policy.toml \
  -- claude --dangerously-skip-permissions \
     -p "summarise git log and then curl https://example.com"
```

Denied commands surface as normal `EACCES` failures the agent can observe:

```
bash: line 1: /usr/bin/curl: Permission denied
```

**Claude + write policy (Landlock filesystem seal):**

The repo ships two fixture files for this demo:

```
src/hello.py            ← Claude may read and write this
src/migrations/test.sql ← Claude may read but NOT write this
```

`tests/policies/claude-write-policy.toml` grants read access to all of
`/workspace/src/` but restricts writes to `hello.py` only — the kernel
blocks any write to `migrations/` at the VFS layer regardless of how the
write is attempted (direct `open()`, `bash -c '…'`, interpreter, etc.):

```toml
# tests/policies/claude-write-policy.toml
[meta]
version = 1
default_action = "allow"

[filesystem]
read_globs  = ["/workspace/src/**"]
write_globs = ["/workspace/src/hello.py", "/tmp/**"]
require_enforced = true
```

```bash
docker compose exec faraday-agent faraday run \
  --policy /workspace/tests/policies/claude-write-policy.toml \
  -- claude --dangerously-skip-permissions \
     -p "Add a farewell() function to /workspace/src/hello.py, then add an email column to the users table in /workspace/src/migrations/test.sql. Report what succeeded and what failed."
```

Expected outcome: Claude adds `farewell()` to `hello.py` (write allowed),
then hits `EACCES` trying to edit `test.sql` (write blocked by Landlock) and
reports the failure — without Faraday ever needing to inspect `argv`.

**Claude + command policy + write policy (combined):**

Both layers compose independently. Embed the `[filesystem]` block in the same
policy file as the `[[rule]]` blocks:

```toml
# claude-combined-policy.toml
[meta]
version = 1
default_action = "deny"

[filesystem]
read_globs  = ["/workspace/src/**"]
write_globs = ["/workspace/src/hello.py", "/tmp/**"]
require_enforced = true

[[rule]]
id = "allow-claude"
action = "allow"
exe_basename = "claude"

[[rule]]
id = "allow-readonly-git"
action = "allow"
exe_basename = "git"
argv_regex = '^git (status|log|diff|show|fetch)( |$)'

[[rule]]
id = "allow-shell"
action = "allow"
exe_basename = ["bash", "sh"]
```

```bash
docker compose exec faraday-agent faraday run \
  --policy /workspace/tests/policies/claude-combined-policy.toml \
  --audit  /workspace/audit.jsonl \
  -- claude --dangerously-skip-permissions

# tail denials live in another terminal
docker compose exec faraday-agent \
  sh -c 'tail -f /workspace/audit.jsonl | jq -c "select(.action==\"deny\") | {rule_id, exe, argv}"'
```

## Usage

The CLI has two subcommands: `check` (validate a policy) and `run` (launch an agent under the gate). Everything after `--` is the agent's argv.

```bash
# Validate a policy file without running anything
faraday check --policy tests/policies/strict.toml
# → policy ok: tests/policies/strict.toml

# Wrap an interactive shell under a permissive policy (default-allow + small denylist)
faraday run --policy tests/policies/permissive.toml -- bash

# Wrap an AI agent under a strict policy and capture every verdict to a log
faraday run \
    --policy tests/policies/strict.toml \
    --audit ./audit.jsonl \
    -- claude

# Wrap any command line — the part after `--` is just argv
faraday run --policy tests/policies/permissive.toml -- bash -c 'git status && ls'

# Verbose supervisor tracing (per-execve allow/deny logged to stderr)
FARADAY_LOG=debug faraday run --policy tests/policies/strict.toml -- bash
```

When a command is denied, the agent's `execve` returns `EACCES` and the process sees a normal "permission denied" failure:

```bash
$ faraday run --policy tests/policies/strict.toml -- bash -c 'curl https://example.com'
bash: line 1: /usr/bin/curl: Permission denied
$ echo $?
126
```

Audit log entries are JSONL, one per `execve`:

```json
{"ts":1714000000.123,"pid":4711,"exe":"/usr/bin/git","argv":["git","status"],"cwd":"/home/me/repo","ppid":4710,"action":"allow","rule_id":"allow-readonly-git","reason":""}
{"ts":1714000000.456,"pid":4712,"exe":"/usr/bin/curl","argv":["curl","https://evil.example/x"],"cwd":"/home/me/repo","ppid":4710,"action":"deny","rule_id":"deny-network-binaries-default","reason":"network egress not in allowlist"}
```
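
Because each entry is a standalone JSON object, the log is easy to post-process outside `jq`. The sketch below counts denials per rule in Python. It relies only on the `action` and `rule_id` fields shown in the sample entries; `summarize_denials` is a hypothetical helper, not part of Faraday.

```python
import json
from collections import Counter

def summarize_denials(lines):
    """Count denied execve events per rule_id from JSONL audit lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between entries
        entry = json.loads(line)
        if entry.get("action") == "deny":
            counts[entry.get("rule_id", "")] += 1
    return counts

# Two abbreviated entries modelled on the sample log above.
sample = [
    '{"pid":4711,"exe":"/usr/bin/git","argv":["git","status"],'
    '"action":"allow","rule_id":"allow-readonly-git","reason":""}',
    '{"pid":4712,"exe":"/usr/bin/curl","argv":["curl","https://evil.example/x"],'
    '"action":"deny","rule_id":"deny-network-binaries-default",'
    '"reason":"network egress not in allowlist"}',
]
denials = summarize_denials(sample)
```

For a real log, pass an open file object instead of `sample`, since iterating a file yields one line at a time.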

Common patterns:

```bash
# Develop a policy interactively against a recorded audit log
faraday run --policy draft.toml --audit /tmp/a.jsonl -- bash -i
jq -r 'select(.action=="deny") | "\(.rule_id)\t\(.exe) \(.argv|join(" "))"' /tmp/a.jsonl

# Tail allows + denies live in another terminal
tail -f /tmp/a.jsonl | jq -c '{action, rule_id, exe, argv}'

# Forward the agent's exit code (faraday exits with whatever the agent exited with)
faraday run --policy p.toml -- pytest tests/ ; echo "agent exited $?"
```

## Test

```bash
# Cross-platform unit tests (run on macOS or Linux)
pytest tests/ --ignore=tests/e2e

# End-to-end tests (Linux only)
cargo build --release -p faraday-cli
pytest tests/e2e
```

## Policy

Policies are written in TOML. Rules are evaluated top to bottom and the first match wins. Match keys within a rule are ANDed, and a `_not` suffix negates a key.

```toml
[meta]
version = 1
default_action = "deny"          # fail-closed
on_policy_error = "deny"
on_supervisor_timeout_ms = 250

[[rule]]
id = "allow-readonly-git"
action = "allow"
exe_basename = "git"
argv_regex = '^git (status|log|diff)( |$)'

[[rule]]
id = "deny-rm-rf-root"
action = "deny"
exe_basename = "rm"
argv_regex = '(^|\s)-[a-zA-Z]*[rRf][a-zA-Z]*\s+/(\s|$)'
```

Match keys: `exe`, `exe_basename`, `exe_glob`, `argv_regex`, `argv_contains`, `argv_host_in`, `cwd_glob`, `uid`, `parent_exe`, `parent_argv`. Scalar or list. `_not` suffix negates.
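
The evaluation order described above can be sketched in a few lines of Python. This is an illustration, not the code in `policy.py`: it implements only `exe_basename` and `argv_regex`, assumes list values mean "any of", and assumes `argv_regex` runs against argv joined by single spaces. The names `rule_matches` and `evaluate` are hypothetical.

```python
import os
import re

def _as_list(value):
    return value if isinstance(value, list) else [value]

# Matchers for two of the documented keys; the others are omitted for brevity.
MATCHERS = {
    # Assumption: a list value matches if any element matches.
    "exe_basename": lambda want, ev: os.path.basename(ev["exe"]) in _as_list(want),
    # Assumption: the regex is searched in argv joined by single spaces.
    "argv_regex": lambda want, ev: any(
        re.search(p, " ".join(ev["argv"])) for p in _as_list(want)
    ),
}

def rule_matches(rule, event):
    """Every match key must hold (AND); a `_not` suffix inverts that key."""
    for key, want in rule.items():
        if key in ("id", "action"):
            continue
        negate = key.endswith("_not")
        base = key[:-4] if negate else key
        hit = bool(MATCHERS[base](want, event))
        if hit == negate:
            return False
    return True

def evaluate(rules, event, default_action="deny"):
    """Walk rules top to bottom; the first matching rule decides."""
    for rule in rules:
        if rule_matches(rule, event):
            return rule["action"], rule["id"]
    return default_action, None

rules = [
    {"id": "allow-readonly-git", "action": "allow",
     "exe_basename": "git", "argv_regex": r"^git (status|log|diff)( |$)"},
    {"id": "deny-rm-rf-root", "action": "deny",
     "exe_basename": "rm",
     "argv_regex": r"(^|\s)-[a-zA-Z]*[rRf][a-zA-Z]*\s+/(\s|$)"},
]
```

With these rules, `git status` matches the first rule and is allowed, while `git push` matches no rule and falls through to `default_action`.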

`exe_glob` uses shell-style globs (`*` does NOT cross `/`, `**` does), with `${VAR}` env-var expansion at load time.
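
One way to get these semantics is to translate the glob into a regular expression. The sketch below is illustrative, not the implementation in `matchers.py`. It handles only `*`, `**`, and `${VAR}`; raising on an unknown variable is an assumption. `expand_vars`, `glob_to_regex`, and `exe_glob_match` are hypothetical names.

```python
import re

def expand_vars(pattern, env):
    """Replace ${VAR} with env[VAR]. Assumption: unknown variables raise KeyError."""
    return re.sub(r"\$\{(\w+)\}", lambda m: env[m.group(1)], pattern)

def glob_to_regex(pattern):
    """Translate a glob where `*` stays inside one path segment and `**` does not."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")       # may cross '/'
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")    # stops at '/'
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))

def exe_glob_match(pattern, exe, env):
    """Expand variables once, then require the whole path to match."""
    return glob_to_regex(expand_vars(pattern, env)).fullmatch(exe) is not None
```

For example, `/usr/*` does not match `/usr/bin/git`, but `/usr/**` does.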

`argv_host_in` extracts hostnames from `https?://` URLs in any argv element and matches against the allowlist.
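
A rough Python sketch of that extraction follows. It is not the code in `matchers.py`, and two choices are assumptions: hosts are compared exactly (no wildcards), and the key matches only when at least one URL is present and every extracted host is in the list. `extract_hosts` and `argv_host_in` are hypothetical function names.

```python
import re
from urllib.parse import urlsplit

# http:// or https:// followed by everything up to whitespace or a quote.
URL_RE = re.compile(r"https?://[^\s'\"]+", re.IGNORECASE)

def extract_hosts(argv):
    """Return lowercase hostnames from every http(s) URL in any argv element."""
    hosts = []
    for arg in argv:
        for url in URL_RE.findall(arg):
            host = urlsplit(url).hostname  # already lowercased, port stripped
            if host:
                hosts.append(host)
    return hosts

def argv_host_in(argv, allowlist):
    """Assumption: match only if there is a URL and all hosts are allowlisted."""
    hosts = extract_hosts(argv)
    allowed = {h.lower() for h in allowlist}
    return bool(hosts) and all(h in allowed for h in hosts)
```

Scanning every argv element means URLs embedded in a `sh -c '...'` string are found as well.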

See `tests/policies/example.toml` for a comprehensive sample.

## Filesystem sealing (Landlock)

Faraday's execve gate stops disallowed *programs* from starting, but a
program that's already allowed can still use its own `open()`/`read()`/
`write()` syscalls to touch anything its UID can reach — argv inspection
alone is advisory-grade against:

- `bash -c 'cat /etc/shadow'` (argv matching sees `bash`, not `cat`)
- interpreter library-level `open()` (`python3 -c "open('/etc/passwd').read()"`)
- statically linked binaries using raw syscalls

To close this gap, Faraday can apply a **Landlock** filesystem allow-list
in the child process before `execvp`. The kernel denies any filesystem
access outside the grant set with `EACCES`. This is a Linux 5.13+
feature.

Configure via the `[filesystem]` block in the TOML policy:

```toml
[filesystem]
read_globs   = ["${HOME}/repo/**", "/etc/ssl/**"]
write_globs  = ["/tmp/**"]
allow_globs  = ["${HOME}/repo/build/**"]     # read+write shorthand
require_enforced = true                       # fail if partially enforced
```

…or via repeatable CLI flags (nono-aligned — short forms included):

```
--read  <GLOB>   -r <GLOB>      # read-only subtree
--write <GLOB>   -w <GLOB>      # write-only subtree
--allow <GLOB>   -a <GLOB>      # read+write subtree
```

TOML and CLI grants are **merged** — CLI flags add to whatever the TOML
block specifies. If neither the TOML block nor any CLI flag is provided,
Faraday behaves exactly as before: no Landlock is applied.

### Glob → directory widening

Landlock operates on subtree prefixes (`PathBeneath`). Globs are compiled
down to the longest static path prefix:

| Glob | Grant | Widened? |
|------|-------|----------|
| `/tmp/**` | `/tmp` subtree | no |
| `${HOME}/repo/**` | `$HOME/repo` subtree | no |
| `/etc/ssl/cert.pem` | single file | no |
| `${HOME}/src/**/*.py` | `$HOME/src` subtree | **yes** — broader than the glob |
| `/etc/*.conf` | `/etc` subtree | **yes** |
| pattern containing `..` | error at load time | n/a |

Widened grants trigger a `RuntimeWarning` at startup. For file-level
precision, split the glob or use literal paths.
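
The widening rule in the table can be approximated as "keep path components until the first one that contains a wildcard". The Python sketch below follows that reading. It assumes `${VAR}` has already been expanded and rejects `..` only as a whole path component. `compile_grant` is a hypothetical name, not Faraday's API.

```python
import warnings

WILDCARDS = set("*?[")

def compile_grant(glob):
    """Reduce a glob to its longest wildcard-free path prefix.

    Returns (prefix, widened). widened is True when the prefix grants
    more than the glob itself would match.
    """
    parts = glob.split("/")
    if ".." in parts:
        raise ValueError(f"'..' is not allowed in a grant: {glob!r}")
    static = []
    rest = []
    for idx, part in enumerate(parts):
        if WILDCARDS & set(part):
            rest = parts[idx:]   # first wildcard component and everything after
            break
        static.append(part)
    prefix = "/".join(static) or "/"
    # A literal path or a trailing `**` is exactly what a subtree grant covers.
    widened = rest not in ([], ["**"])
    if widened:
        warnings.warn(f"{glob!r} widened to subtree {prefix!r}", RuntimeWarning)
    return prefix, widened
```

Under these assumptions, `/tmp/**` compiles to `("/tmp", False)` and `/etc/*.conf` to `("/etc", True)`, matching the first and fifth table rows.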

### Composition with the execve gate

The Landlock seal and the execve rule engine are **independent**. An
execve `deny` rule can block a binary even though Landlock would have
allowed reads in its scope, and vice versa — both must allow an operation
for it to succeed.

### Bootstrap reads

When Landlock is applied, Faraday auto-adds read-only grants for the
dynamic linker + system libraries (`/usr`, `/lib`, `/lib64`, `/bin`,
`/sbin`, `/etc/ld.so.cache`, `/proc/self`, `/dev/null`, `/dev/urandom`).
Opt out with `[filesystem] no_bootstrap_reads = true` if you want a
completely tight seal and are willing to whitelist the loader paths
manually.

### Kernel version / enforcement mode

- `require_enforced = true` (default) — partial or zero Landlock
  enforcement causes Faraday to abort with exit code 126. Use this in
  production so "kernel too old" is loud, not silent.
- `require_enforced = false` — best-effort; on kernels lacking Landlock
  support, the seal is skipped and a warning is logged. Suitable for
  development environments on older kernels.

## Network filtering

Faraday's execve gate and Landlock seal control *programs* and *files*,
but an allowed program can still make arbitrary HTTP requests to any
host. To close this gap, Faraday can start an HTTP proxy that filters
outbound traffic by domain and API endpoint.

When `--allow-domain` or `[network]` is configured, Faraday:

1. Starts a filtering HTTP proxy in the parent process on `127.0.0.1`
2. Injects `HTTP_PROXY`/`HTTPS_PROXY` into the child environment before `execvp`
3. Every HTTPS connection goes through a `CONNECT` tunnel — the proxy checks the target domain against the allowlist
4. Cloud metadata endpoints (`169.254.169.254`, `metadata.google.internal`, `metadata.azure.internal`) are always denied (SSRF protection)
5. DNS resolution is checked for rebinding attacks (link-local IPs blocked)
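
Steps 4 and 5 amount to a check on the requested host and on the addresses it resolves to. The Python sketch below illustrates the idea. It is not the proxy's Rust code, and it takes the resolved addresses as an argument rather than performing DNS lookups itself. `resolution_is_safe` is a hypothetical name.

```python
import ipaddress

# Hosts the README lists as always denied.
METADATA_HOSTS = frozenset({
    "169.254.169.254",
    "metadata.google.internal",
    "metadata.azure.internal",
})

def resolution_is_safe(host, resolved_ips):
    """Reject metadata hosts and any resolution that lands on a link-local address."""
    if host.lower() in METADATA_HOSTS:
        return False
    # 169.254.0.0/16 and fe80::/10 are link-local; a public name resolving
    # there is treated as a rebinding attempt.
    return not any(ipaddress.ip_address(ip).is_link_local for ip in resolved_ips)
```

A caller would typically obtain `resolved_ips` from something like `socket.getaddrinfo` and connect only to the addresses that were checked.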

Configure via the `[network]` block in the TOML policy:

```toml
[network]
allowed_hosts = [
    "api.openai.com",
    "*.github.com",
    "pypi.org",
]

[network.routes.openai]
upstream = "https://api.openai.com"
credential_key = "openai"
inject_mode = "header"
endpoint_rules = [
    { method = "POST", path = "/v1/chat/completions" },
    { method = "GET",  path = "/v1/models" },
]
```

…or via repeatable CLI flags:

```
--block-net                        # deny all network
--allow-domain <DOMAIN>            # allow CONNECT to this domain
--deny-domain <DOMAIN>             # block CONNECT to this domain (overrides --allow-domain)
--allow-endpoint <SVC:METHOD:PATH> # restrict reverse proxy endpoints
```

TOML and CLI settings are **merged**; CLI flags take precedence over the
TOML block. If neither is provided, Faraday does not start the proxy and
network access is unrestricted.

### Deny list (`denied_hosts` / `--deny-domain`)

`denied_hosts` lets you carve specific hosts out of the allow list — the
host is blocked even if it would otherwise be permitted by an exact or
wildcard `allowed_hosts` entry. Same matching rules as the allow list
(exact host or `*.suffix`, case-insensitive).

```toml
[network]
allowed_hosts = [
    "api.openai.com",
    "*.github.com",        # any github subdomain
]
denied_hosts = [
    "gist.github.com",     # carve-out: blocked even though *.github.com would allow it
    "*.tracking.example",  # block whole subtree
]
```

Equivalent CLI:

```bash
faraday run --policy p.toml \
  --allow-domain api.openai.com \
  --allow-domain "*.github.com" \
  --deny-domain gist.github.com \
  --deny-domain "*.tracking.example" \
  -- claude
```

**Precedence:** the user deny list is checked *before* the allow list,
so deny always wins when a host matches both. The hardcoded
cloud-metadata deny list (`169.254.169.254`,
`metadata.google.internal`, `metadata.azure.internal`) is separate and
non-overridable — you don't need to (and can't) re-deny it via
`denied_hosts`.
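
The precedence order can be written down compactly. The following Python sketch mirrors the rules stated in this README (hardcoded metadata deny, then user deny list, then the empty-allowlist short-circuit, then the allow list). It is an illustration, not the proxy's actual Rust code; `host_matches` and `host_allowed` are hypothetical names.

```python
METADATA_HOSTS = frozenset({
    "169.254.169.254",
    "metadata.google.internal",
    "metadata.azure.internal",
})

def host_matches(pattern, host):
    """Exact match, or `*.suffix` matching subdomains only (not the bare suffix)."""
    pattern, host = pattern.lower(), host.lower()
    if pattern.startswith("*."):
        suffix = pattern[1:]                      # ".github.com"
        return host.endswith(suffix) and len(host) > len(suffix)
    return host == pattern

def host_allowed(host, allowed_hosts, denied_hosts=()):
    if host.lower() in METADATA_HOSTS:
        return False    # hardcoded and non-overridable
    if any(host_matches(p, host) for p in denied_hosts):
        return False    # user deny list is checked before the allow list
    if not allowed_hosts:
        return True     # empty allowlist: everything else is allowed
    return any(host_matches(p, host) for p in allowed_hosts)
```

With `allowed_hosts = ["*.github.com"]` and `denied_hosts = ["gist.github.com"]`, `api.github.com` passes, while `gist.github.com` and the bare `github.com` do not.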

### Worked configuration examples

A few common shapes:

**1) Allowlist only — strict default-deny.**

```toml
[network]
allowed_hosts = ["api.openai.com", "*.github.com"]
```

```bash
faraday run --policy p.toml \
  --allow-domain api.openai.com --allow-domain "*.github.com" \
  -- my-agent
```

**2) Pure deny list — "allow everything except these hosts".**

Leaving `allowed_hosts` empty means the allow-all short-circuit fires
(except for cloud metadata). Add `denied_hosts` to carve out specific
hosts.

```toml
[network]
denied_hosts = ["*.doubleclick.net", "*.facebook.com"]
```

```bash
faraday run --policy p.toml \
  --deny-domain "*.doubleclick.net" --deny-domain "*.facebook.com" \
  -- my-agent
```

**3) Combined — broad allow with targeted deny.**

```toml
[network]
allowed_hosts = ["*.github.com", "api.openai.com"]
denied_hosts  = ["gist.github.com"]   # blocked even though *.github.com matches
```

**4) Block all egress — no proxy filtering needed.**

```bash
faraday run --policy p.toml --block-net -- my-agent
```

### Kernel-pinned egress (`kernel_pin = true`)

The proxy alone is bypassable: any client that ignores `HTTP_PROXY`
(Node's built-in `fetch`, `aiohttp` without `trust_env`, raw socket
code, an in-process agent tool) reaches the network directly. Setting
`kernel_pin = true` adds a Landlock NetPort + seccomp connect-trap
layer so the kernel only allows TCP `connect()` to the proxy port and
to ports listed under `allowed_local_ports`:

```toml
[network]
allowed_hosts = ["api.openai.com"]
kernel_pin = true                    # opt-in, default false
allowed_local_ports = [5432, 6379]   # local DB / Redis sidecars
```

When `kernel_pin = true`, Faraday installs both layers where the kernel
supports them: on Linux ≥ 6.7 Landlock NetPort enforces at the LSM layer,
and on older kernels the seccomp `USER_NOTIF` trap is the sole enforcer.
See `docs/network-filtering.md` for the full mechanism, audit-log fields,
and threat model.

### Domain matching

- **Exact**: `api.openai.com` matches only `api.openai.com`
- **Wildcard subdomain**: `*.github.com` matches `api.github.com` but NOT `github.com` itself
- **Case insensitive**: matching is case-insensitive
- **Empty allowlist**: all hosts allowed (except cloud metadata) — used when the proxy is started only for credential injection

### Endpoint filtering

Endpoint rules restrict which HTTP method+path combinations are allowed
through reverse proxy routes. Format: `SERVICE:METHOD:PATH`

- `*` matches one path segment: `/repos/*/issues`
- `**` matches zero or more: `/api/**`
- `*` as METHOD matches any method
- Percent-encoded paths are normalized before matching
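
The wildcard rules above can be illustrated with a small segment-wise matcher in Python. This is a sketch, not the proxy's implementation. It assumes "normalized" means percent-decoding each path segment, and that the path is passed without a query string. `endpoint_allowed` is a hypothetical name.

```python
from urllib.parse import unquote

def _segments(path, decode=False):
    """Split on '/', dropping empty segments; optionally percent-decode each one."""
    segs = [s for s in path.split("/") if s]
    return [unquote(s) for s in segs] if decode else segs

def _match(pat, segs):
    """Match pattern segments against path segments."""
    if not pat:
        return not segs
    head, rest = pat[0], pat[1:]
    if head == "**":
        # `**` absorbs zero or more segments
        return any(_match(rest, segs[i:]) for i in range(len(segs) + 1))
    if not segs:
        return False
    if head == "*" or head == segs[0]:
        return _match(rest, segs[1:])   # `*` consumes exactly one segment
    return False

def endpoint_allowed(rules, method, path):
    """rules: iterable of (METHOD, PATH) pairs; '*' as METHOD matches any method."""
    # Assumption: decoding happens per segment, so an encoded slash (%2F)
    # cannot create extra segments.
    segs = _segments(path, decode=True)
    for rule_method, rule_path in rules:
        if rule_method != "*" and rule_method.upper() != method.upper():
            continue
        if _match(_segments(rule_path), segs):
            return True
    return False

rules = [
    ("POST", "/v1/chat/completions"),
    ("GET", "/repos/*/issues"),
    ("*", "/api/**"),
]
```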

### All three layers in one policy

A single TOML file can configure all three sandbox layers:

```toml
[meta]
default_action = "deny"

[filesystem]
read_globs  = ["${HOME}/project/**", "/etc/ssl/**"]
write_globs = ["/tmp/**"]

[network]
allowed_hosts = ["api.openai.com", "*.github.com"]

[network.routes.openai]
upstream = "https://api.openai.com"
endpoint_rules = [{ method = "POST", path = "/v1/chat/completions" }]

[[rule]]
id = "allow-claude"
action = "allow"
exe_basename = "claude"

[[rule]]
id = "allow-shell"
action = "allow"
exe_basename = ["bash", "sh"]
```

```bash
faraday run --policy combined.toml -- claude
```

The three layers are independent — all must allow an operation for it to
succeed. See `tests/policies/combined.toml` for a full example and
`docs/network-filtering.md` for architectural details.

## Coverage gaps (acknowledged)

- **`io_uring`-submitted execve** is not seccomp-trapped (kernel limitation). Faraday's BPF filter does not currently deny `io_uring_setup`; that is deferred to v2.
- **Adversarial argv mutation race** between the seccomp trap and `process_vm_readv` is not closed. The threat model is honest-but-buggy agents, not malware.
- **macOS** is not supported as a runtime target. As of 2026, `DYLD_INSERT_LIBRARIES` is stripped by SIP / Hardened Runtime, and Endpoint Security extensions need a notarized entitlement. Future work: a `PATH`-shimmed shell wrapper for partial macOS coverage.

## Architecture

```
faraday run --policy p.toml --allow-domain api.openai.com -- claude
  │
  ├─ Rust: parse args, embed Python via PyO3, call faraday._bridge.init(p.toml)
  │        → returns {landlock_grants, require_enforced, network}
  │
  ├─ if network config: start faraday-proxy on 127.0.0.1:<ephemeral>
  │        → ProxyHandle { port, session_token }
  │        → build child_env: HTTP_PROXY, HTTPS_PROXY, NO_PROXY
  │
  ├─ socketpair()
  ├─ fork()
  │     ├─ child:  prctl(NO_NEW_PRIVS) → install seccomp filter →
  │     │         apply Landlock VFS seal → inject proxy env vars →
  │     │         send listener fd via SCM_RIGHTS → execvp(claude)
  │     │         (every execve henceforth traps to user_notif)
  │     │         (every HTTP request routes through the proxy)
  │     │
  │     └─ parent: recv listener fd → poll() loop:
  │                 ioctl(NOTIF_RECV) → read /proc/<pid>/{cwd,status,...}
  │                 → process_vm_readv argv from target → call Python:
  │                     faraday._bridge.evaluate(event_dict) → Verdict
  │                 → ioctl(NOTIF_SEND) with allow (val=0) or deny (error=-EACCES)
  │
  └─ on child exit: shutdown proxy, forward exit code
```
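
The `socketpair()` plus `SCM_RIGHTS` hand-off in the diagram is ordinary Unix descriptor passing. The Python sketch below demonstrates the mechanism in a single process, with a pipe standing in for the seccomp listener fd. It is an illustration of the technique only: the real supervisor is Rust and passes the descriptor from the forked child to the parent. `pass_fd_demo` is a hypothetical name.

```python
import os
import socket

def pass_fd_demo():
    """Send a pipe's read end across a socketpair and read via the received copy."""
    parent_sock, child_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_fd, write_fd = os.pipe()
    try:
        # "Child" side: ship the descriptor as SCM_RIGHTS ancillary data.
        socket.send_fds(child_sock, [b"fd"], [read_fd])
        # "Parent" side: receive a new descriptor for the same open file.
        msg, fds, _flags, _addr = socket.recv_fds(parent_sock, 16, 1)
        received_fd = fds[0]
        os.write(write_fd, b"notif")
        data = os.read(received_fd, 16)
        os.close(received_fd)
        return msg, data
    finally:
        for fd in (read_fd, write_fd):
            os.close(fd)
        parent_sock.close()
        child_sock.close()
```

`socket.send_fds` and `socket.recv_fds` are Unix-only and were added in Python 3.9. The received descriptor has a different number but refers to the same kernel object, which is what lets the parent poll a listener that was created in the child.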

## License

Apache-2.0

