Metadata-Version: 2.4
Name: agent-sensorium
Version: 0.1.0
Summary: Deterministic safety hooks for Claude Code
Project-URL: Homepage, https://github.com/kroq86/agent-sensorium
Project-URL: Repository, https://github.com/kroq86/agent-sensorium
License-Expression: AGPL-3.0-or-later
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: click>=8.0
Requires-Dist: pyyaml>=6.0
Description-Content-Type: text/markdown

# Sensorium 🧠🛡️

A few months ago, a developer asked Claude to clean up an old repo.

Claude ran `rm -rf tests/ patches/ plan/ ~/`.

That trailing `~/` expands to the home directory. It wiped years of files on their Mac. The post hit 1,500+ upvotes on r/ClaudeAI within hours — because everyone building with agents recognized exactly how it happened: not malice, just confidence with no checkpoint.

It's not an isolated story:

- A founder watched a Cursor agent find an unrelated API token, decide it had permission, and delete an entire production database **and its backups** in 9 seconds. (Railway's CEO personally helped restore it.)
- Developers on GitHub have documented Claude Code running `git reset --hard`, wiping hours of uncommitted work, right after telling the user the operation was "safe."
- A benchmark on 45 failing test suites found agents reporting "45/45 pass" when only 26 actually did — the other 19 quietly never ran the tests that would've said otherwise.

Same root cause every time: **the model's confidence and the actual safety of the action are two different variables, and nothing was checking the second one.**

So I built the boundary myself. Excited to share **Sensorium** — deterministic safety hooks for Claude Code. 👇

---

## The problem nobody puts in the demo video

LLM agents are incredible at writing code. They are also, occasionally, incredible at:

- 🔥 running `rm -rf` with a trailing `~/` nobody meant to include
- 🔥 running `git reset --hard` on your uncommitted work, confidently
- 🔥 `curl -X DELETE`-ing a resource because the docs made it sound safe
- 🔥 declaring "all tests pass" without running the ones that don't
- 🔥 reapplying a stale cached manifest as if it were current state

None of this is malice. It's confidence without a checkpoint. And "just review every diff" doesn't scale when the agent is running fifty tool calls a session.

## The insight

You don't need a second LLM watching the first one. You need **sensors** — small, deterministic, boring rules that wake up on a specific kind of change, check exactly what they care about, and say yes/no/wait.

No vibes. No judgment calls. Pattern matching and policy, all the way down.

```
Claude wants to use a tool
  → PreToolUse hook
  → Sensorium reads the tool input
  → sensors match
  → allow / block / warn
  → tool executes (or doesn't)
  → PostToolUse hook
  → Sensorium checks the resulting file content
  → writes an audit ledger
```
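To make the flow above concrete, here is a minimal sketch of the decision step in Python. It is not Sensorium's actual code: the `Sensor` dataclass, the `decide` function, and the two `demo.*` sensors are hypothetical names invented for illustration, and the "block beats warn" ordering is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensor:
    name: str
    tools: tuple[str, ...]  # tool names this sensor listens to
    action: str             # "block" or "warn"
    pattern: str            # regex matched against the tool input


# Hypothetical demo sensors, not the built-in ones.
SENSORS = [
    Sensor("demo.home_wipe", ("Bash",), "block", r"\brm\s+-rf\b.*\s~/?(\s|$)"),
    Sensor("demo.api_delete", ("Bash",), "warn", r"\bcurl\s+-X\s+DELETE\b"),
]


def decide(tool_name, tool_input, sensors):
    """Return (decision, names of the sensors that fired)."""
    text = tool_input.get("command") or tool_input.get("file_path") or ""
    fired = [s for s in sensors if tool_name in s.tools and re.search(s.pattern, text)]
    if any(s.action == "block" for s in fired):
        decision = "block"  # assumption: a single block outranks any warn
    elif fired:
        decision = "warn"
    else:
        decision = "allow"
    return decision, [s.name for s in fired]
```

Calling `decide("Bash", {"command": "rm -rf tests/ patches/ plan/ ~/"}, SENSORS)` yields a `block` decision, while a plain `ls -la` falls through to `allow`. The point of the sketch is that the whole decision is a pure function of the tool input, with no model in the loop.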

## What this actually catches, out of the box

| Protects against | How | Gate |
|---|---|---|
| `rm -rf /`, `rm -rf ~`, `dd ... of=/dev/sd*` | `filesystem.wipe` | unconditional block |
| Bulk delete without a backup | `filesystem.bulk_delete` | needs `backup_exists` + `dry_run_passed` |
| `git reset --hard`, `git clean -fxd` | `git.destructive` | needs `backup_exists` |
| `curl -X POST/PUT/DELETE/PATCH` | `external_api.write` | needs a snapshot + rollback plan |
| `kubectl apply/delete`, `terraform apply/destroy`, `aws ... delete` | `infra.mutation` | needs snapshot + dry-run + rollback |
| Direct `psql`/`mysql` writes | `db.write` | needs snapshot + rollback plan |
| Reapplying a stale archive/cache as truth | `data.apply_from_archive` | unconditional block |
| Skipped tests slipped in quietly | `test_skip_introduced` | warn, shows up in the audit report |

A command that carries `--dry-run` passes the gate, because sensors check for that flag explicitly.
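As a rough illustration of how a proof-based gate could behave, the sketch below checks a sensor's required proofs against the proofs registered so far. The proof names `backup_exists` and `dry_run_passed` come from the table; the `GATES` mapping, the `gate` function, and the way proofs are passed in are assumptions, not Sensorium's real interface.

```python
import re

# Required proofs per sensor, taken from the table above.
GATES = {
    "git.destructive": {"backup_exists"},
    "filesystem.bulk_delete": {"backup_exists", "dry_run_passed"},
}

# Match --dry-run only as a standalone argument.
DRY_RUN = re.compile(r"(?<!\S)--dry-run(?!\S)")


def gate(sensor, command, proofs):
    """Return (decision, missing_proofs) for a gated sensor."""
    if DRY_RUN.search(command):
        return "allow", set()  # a dry run changes nothing, so it passes
    missing = GATES[sensor] - set(proofs)
    return ("block" if missing else "allow"), missing
```

With no proofs registered, `gate("git.destructive", "git reset --hard", set())` blocks and reports `backup_exists` as missing. Once that proof is present, the same command is allowed.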

## Bring your own rules 🔧

This is the part I actually wanted to ship. Your project has opinions Sensorium can't guess — so tell it:

```yaml
# .sensorium/sensors.yaml
sensors:
  - name: no_force_push
    description: Block git push --force (plain push still allowed)
    tools: [Bash]
    action: block
    patterns:
      # (?![-\w]) keeps --force-with-lease from matching as --force
      - 'git\s+push\b.*(--force(?![-\w])|-f\b)'
    unless:
      - '--dry-run'
    message: |
      Blocked: force-push detected. Use --force-with-lease and confirm
      with a human first.
```

Drop it in, no restart, no config reload — the next tool call picks it up.
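For intuition, here is a small sketch of how `patterns` and `unless` might combine for a rule like this one. It assumes Python `re.search` semantics and that `unless` is checked first, neither of which is confirmed here. The `triggers` helper is hypothetical, and the pattern is written with a negative lookahead so that `--force-with-lease` is not caught.

```python
import re

RULE = {
    "name": "no_force_push",
    "patterns": [r"git\s+push\b.*(--force(?![-\w])|-f\b)"],
    "unless": [r"--dry-run"],
}


def triggers(rule, command):
    """True if any pattern matches and no `unless` pattern does."""
    if any(re.search(p, command) for p in rule.get("unless", [])):
        return False
    return any(re.search(p, command) for p in rule["patterns"])


# Quick check over a few commands.
for cmd in ("git push --force origin main",
            "git push --force-with-lease origin main",
            "git push --force --dry-run"):
    print(f"{cmd!r}: {'block' if triggers(RULE, cmd) else 'allow'}")
```

Only the first command trips the rule; the lease variant and the dry run both pass.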

**Sensor fields:**

- `tools` — which Claude Code tools trigger this sensor (`Bash`, `Edit`, `Write`, `MultiEdit`)
- `on_file_change` — glob patterns for file paths (PostToolUse, checks content)
- `action` — `block` (exit 2, Claude sees the message) or `warn` (logged, shown in report)
- `patterns` — regex list matched against the Bash command or file path
- `unless` — if any of these match, the sensor does not trigger
- `block_if_contains` — regex matched against file content after an edit
- `require_contains` — regex that must be present in file content (absence triggers the sensor)
- `message` — shown to Claude when blocked or warned
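The file-content fields can be pictured with a short sketch like the one below. It assumes `on_file_change` uses `fnmatch`-style globs and that the regexes are applied with `re.search` to the file content after the edit. The `check_file` helper and the `eval` rule are hypothetical; the two `require_contains` regexes are borrowed from the sample audit report.

```python
import re
from fnmatch import fnmatch

RULE = {
    "on_file_change": ["src/*.js"],
    "block_if_contains": [r"\beval\("],  # hypothetical forbidden pattern
    "require_contains": [r"delta|changed_fields", r"precondition|current_hash"],
}


def check_file(rule, path, content):
    """Return a list of violations; an empty list means the file is clean."""
    if not any(fnmatch(path, glob) for glob in rule.get("on_file_change", [])):
        return []  # this sensor does not listen to this path
    violations = []
    for rx in rule.get("block_if_contains", []):
        if re.search(rx, content):
            violations.append(f"forbidden: {rx}")
    for rx in rule.get("require_contains", []):
        if not re.search(rx, content):
            violations.append(f"required: {rx}")
    return violations
```

A file such as `src/apply.js` containing only `store.put(obj)` would report both required patterns as missing, while files outside `src/*.js` are never inspected.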

## Install (2 minutes, I promise)

**Step 1 — the CLI**

```bash
pipx install agent-sensorium
# or: pip install agent-sensorium
```

**Step 2 — wire up Claude Code**

```bash
cd my-project
sensorium init claude-code            # this project only
sensorium init claude-code --global   # every project
```

Writes `.claude/settings.json` with the absolute path to the `sensorium` binary. Claude Code picks it up immediately.

**Step 3 (optional) — your own rules**

```bash
mkdir -p .sensorium
# add .sensorium/sensors.yaml, see above
```

That's it. Claude Code works exactly as before — Sensorium just quietly rides along on every tool call.

## Receipts, not vibes

```bash
sensorium report          # show session audit log
sensorium report --clear  # show and reset
```

```
=== Sensorium Audit Report ===

Tools used:         12
Sensors triggered:  3
Blocked actions:    2
File violations:    1

--- Blocked Actions ---
  2026-07-03T10:14:22  [filesystem.wipe]  Bash: rm -rf /tmp/old-data
  2026-07-03T10:17:05  [git.destructive]  Bash: git reset --hard origin/main

--- File Sensor Violations ---
  2026-07-03T10:19:11  [full_object_overwrite]  src/apply.js
    required: ['delta|changed_fields', 'precondition|current_hash']

--- Sensor Activity ---
  external_api.write: 3x
  filesystem.wipe: 2x
```

Every block, every warning, every proof registered — append-only, in `.sensorium/state.jsonl`. Nothing silently disappears.
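An append-only JSONL ledger is simple enough to sketch in a few lines. The event fields used here (`ts`, `sensor`, `decision`) are assumptions for illustration and may not match the real `state.jsonl` schema; `append_event` and `summarize` are hypothetical helpers.

```python
import json
import tempfile
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def append_event(ledger: Path, sensor: str, decision: str) -> None:
    """Append one event as a JSON line; existing lines are never rewritten."""
    ledger.parent.mkdir(parents=True, exist_ok=True)
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "sensor": sensor,
        "decision": decision,
    }
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")


def summarize(ledger: Path) -> dict:
    """Fold the ledger into the counts a report would show."""
    lines = ledger.read_text(encoding="utf-8").splitlines()
    events = [json.loads(line) for line in lines if line.strip()]
    return {
        "blocked": sum(1 for e in events if e["decision"] == "block"),
        "by_sensor": dict(Counter(e["sensor"] for e in events)),
    }


with tempfile.TemporaryDirectory() as tmp:
    ledger = Path(tmp) / ".sensorium" / "state.jsonl"
    append_event(ledger, "filesystem.wipe", "block")
    append_event(ledger, "external_api.write", "warn")
    append_event(ledger, "filesystem.wipe", "block")
    summary = summarize(ledger)
    print(summary)
```

Because the report is derived by replaying the file, any summary can be recomputed from the raw events at any time.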

## The architecture, for the nerds (me too)

Sensorium follows the [State-Delta pattern](https://github.com/kroq86/cursor-global-rules/blob/main/rules/global/State-Delta.mdc):

```
world change (Claude tool use)
  → typed delta/event (PreToolUse / PostToolUse)
  → matching sensors wake up
  → each sensor selects the narrow context it needs
  → deterministic policy evaluates
  → allow / block / warn
  → ledger records the fact
```

No broad rescans. No LLM judge. No polling. A sensor declares exactly what it listens for and what invariant it protects — that's the whole contract.

```
.sensorium/
  state.jsonl      # append-only event ledger
  snapshots/       # file snapshots before edits (content-addressed)
  sensors.yaml     # your project-specific rules (optional)
```
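"Content-addressed" here means a snapshot's file name is derived from its bytes, so identical content is stored once. The sketch below uses SHA-256 and a flat directory. Both are assumptions, since the hash function and on-disk layout Sensorium uses are not specified in this README, and `snapshot` is a hypothetical helper.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(snap_dir: Path, content: bytes) -> Path:
    """Store content under its own hash and return the snapshot path."""
    digest = hashlib.sha256(content).hexdigest()
    target = snap_dir / digest
    if not target.exists():  # identical content is written only once
        snap_dir.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
    return target


with tempfile.TemporaryDirectory() as tmp:
    snaps = Path(tmp) / ".sensorium" / "snapshots"
    first = snapshot(snaps, b"print('v1')\n")
    again = snapshot(snaps, b"print('v1')\n")   # same bytes, same path
    second = snapshot(snaps, b"print('v2')\n")  # new bytes, new path
    stored = sorted(p.name for p in snaps.iterdir())
```

Snapshotting the same file twice before two edits costs nothing extra, and a ledger entry can point at a snapshot by hash alone.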

## The honest part

This is regex and policy, not a sandbox. It catches the direct, literal case — an agent typing a dangerous command in the open. It is not a defense against something actively trying to route around it (a script file, an interpreter, a clever quote). Self-reported proofs are exactly that — self-reported. Treat it as a seatbelt, not a cage, and you'll use it correctly.

## Hook format reference

`sensorium init claude-code` writes this to `.claude/settings.json`:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Edit|Write|MultiEdit",
        "hooks": [{ "type": "command", "command": "/path/to/sensorium hook pretool" }]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash|Edit|Write|MultiEdit",
        "hooks": [{ "type": "command", "command": "/path/to/sensorium hook posttool" }]
      }
    ],
    "Stop": [
      {
        "hooks": [{ "type": "command", "command": "/path/to/sensorium hook stop" }]
      }
    ]
  }
}
```

Exit code 2 from `pretool` blocks the tool. Exit 0 allows it.
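The exit-code contract can be sketched as a tiny hook entry point. Claude Code passes the hook a JSON event on stdin that includes `tool_name` and `tool_input`, and on exit code 2 it feeds the hook's stderr back to Claude. Everything else here, including `run_pretool` and the `is_dangerous` callback, is a hypothetical stand-in for the real sensor evaluation.

```python
import io
import json


def run_pretool(stdin, stderr, is_dangerous) -> int:
    """Read one hook event and return the process exit code."""
    event = json.load(stdin)
    command = event.get("tool_input", {}).get("command", "")
    if event.get("tool_name") == "Bash" and is_dangerous(command):
        # On exit 2, this stderr message is what Claude gets to read.
        print(f"Blocked by sensor: {command}", file=stderr)
        return 2
    return 0


# Simulate a hook invocation without touching the real stdin/stderr.
event = {"tool_name": "Bash", "tool_input": {"command": "rm -rf ~"}}
err = io.StringIO()
code = run_pretool(io.StringIO(json.dumps(event)), err, lambda c: "rm -rf ~" in c)
```

In a real hook script, the last step would be `sys.exit(run_pretool(sys.stdin, sys.stderr, ...))`. Keeping the function free of global I/O makes the allow/block behavior easy to test.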

## License

AGPL-3.0-or-later. See [LICENSE](LICENSE).

---

## The incidents this is built around

Not hypotheticals — these happened, and each one maps directly to a sensor above:

- [Coding Agent Horror Stories: The `rm -rf ~/` Incident](https://www.docker.com/blog/coding-agent-horror-stories-the-rm-rf-incident/) — Docker's write-up of the r/ClaudeAI post
- [Cursor AI coding agent deletes entire production database and backups in nine seconds](https://www.techradar.com/pro/it-took-9-seconds-tech-founder-outlines-how-rogue-claude-powered-ai-tool-wiped-entire-company-database-and-backups-but-says-theres-no-such-thing-as-bad-publicity) — TechRadar
- [Claude Code's Silent Git Reset](https://dev.to/shuicici/claude-codes-silent-git-reset-what-actually-happened-and-what-it-means-for-ai-dev-tools-3449) — dev.to
- [Why AI Coding Agents Say All Tests Pass When They Actually Fail](https://docs.bswen.com/blog/2026-06-25-ai-coding-agent-false-positive-failure/) — the 45-task benchmark

If you're shipping agents with real filesystem/shell/API access and you're not doing this yet — you're one confidently-wrong tool call away from a bad afternoon.

Would love thoughts from anyone else building guardrails for agentic coding tools. 🙏

