## Codebase Patterns
- Use `StrEnum` for all enum classes (Python 3.11+). `ruff` UP042 flags `(str, Enum)` style.
- All Pydantic config models use `ConfigDict(extra="allow")` so unknown YAML fields land in `model_extra` for warning.
- When adding new nested config models, register them in `_NESTED` dict in `config.py` so `_collect_extra_paths()` can recurse into them.
- **All config fields are optional with defaults** — `classifier.model` defaults to `MULTILINGUAL`, `ClawStrikeConfig()` with no args works. `_RootConfig.clawstrike` has a `default_factory` so empty/absent YAML files produce all-defaults.
- `McpConfig.enabled: bool = True` — replaces old `transport: TransportMode` (which was removed). Set `false` to skip MCP server startup.
- Always use `uv run ruff format` to format; never manually wrap lines.
- Quality check commands: `uv run pytest`, `uv run ruff check`, `uv run ruff format`.
- MCP server lives in `src/clawstrike/mcpserver.py`. Module-level globals: `_config`, `_classifier`, `_elevated_sessions: set[str]`. `init_server(cfg)` injects config, loads classifier, and resets `_elevated_sessions` to a fresh empty set. Tests mock `create_classifier` in autouse fixture, which **yields the mock_clf** so individual tests can configure `mock_clf.classify.return_value`. Teardown resets ALL THREE globals (`_config = None`, `_classifier = None`, `_elevated_sessions.clear()`).
- FastMCP v3: `@mcp.tool` decorator (no parentheses). Test tools with `await mcp.call_tool(name, args)`, read result via `result.structured_content`. `RuntimeError` inside tools surfaces as `fastmcp.exceptions.ToolError` — assert `ToolError` in tests.
- Test isolation for module globals: use an `autouse` fixture that resets the global back to `None` after each test.
- Classifier module: `src/clawstrike/classifier.py` — `BaseClassifier` ABC, `PromptGuardClassifier`, `ClassifierResult` dataclass, `create_classifier(model)` factory. HF models require `HF_TOKEN` + Meta license acceptance. Never load real models in tests — always mock `create_classifier`. Use `PromptGuardClassifier.__new__()` to construct instances without `__init__` when testing `classify()` method directly.
- CLI `start` command catches `RuntimeError` from `init_server()` (classifier load failure) and exits with code 1.
- DB layer uses function-scoped connections via `open_db(path)` async context manager — idempotent `CREATE TABLE IF NOT EXISTS` DDL runs on every open (cheap for SQLite); no module-level persistent connection
- `init_server()` in `mcpserver.py` is sync; DB path is stored as `_db_path: str | None` module global and opened lazily inside async tools
- First-contact trust override: `classify` always calls `get_or_create_contact()` — if `is_first_contact=True`, trust is forced to UNTRUSTED regardless of channel; tests must pre-register contacts (seed call) before testing channel-specific threshold behavior
- Trust engine is pure functions in `src/clawstrike/trust.py` — no side effects, no mocking needed in unit tests
- `classify` tool response always includes `trust_level` and `threshold_applied` since US-011/015; tests that assert on block/flag/pass decisions must account for the channel_type modifier (e.g., `email_body` is LOW trust → eff_block = 0.87, not 0.92)
- `gate` tool's `trust_level` field is now dynamically resolved from `channel_type`, not hardcoded
- Interaction tracking: `classify` captures full `ContactRecord` (not just `is_first_contact`) from phase-1 DB read; phase-2 DB block performs increment + conditional auto-promotion. Auto-promo guard: `contact.trust_level == "auto"` (pre-increment) ensures manual overrides ('trusted'/'blocked') are never overwritten.
- DB write phase in `classify` is a single `open_db` block that can contain: `increment_interaction`, `set_contact_trust_level`, `insert_audit_event` (trust_update), and `insert_audit_event` (classify). All within one SQLite connection per request.
- Gating engine is in `src/clawstrike/gating.py` — pure functions only: `classify_action(action_type) -> (risk_level, reason)` and `apply_decision_matrix(risk_level, trust_level) -> recommendation`. `gate` tool does NOT call `get_or_create_contact` — trust resolved from channel_type only (no first-contact override for gate).
- `_MATRIX` in `gating.py` keys by `TrustLevel` enum (not string); works because `TrustLevel` is StrEnum and `resolve_trust_level()` returns `TrustLevel` instances.
- Audit DB has two init paths: `setup_audit_db(path)` (sync, stdlib sqlite3) for CLI startup; `open_db(path)` (async, aiosqlite) for request-time. Both run `_apply_migrations` to add any missing columns from `_NEW_AUDIT_COLS` list.
- `raw_input_hash` (SHA-256) always written to classify audit event; `raw_input_snippet` only when `log_raw_input=True`. `elevated_scrutiny` stored as bool in `details_json`, derived from `decision == "flag"` at write time.
- Content-source mismatch (US-016): `_mismatch_sessions` global in `mcpserver.py` works like `_elevated_sessions`. Mismatch condition: `trust_level in (HIGH, MEDIUM) AND score >= base_flag` — fires independently of the effective decision (pass/flag/block). In `gate`, mismatch forces effective trust to LOW (applied BEFORE elevated_scrutiny downgrade). Teardown must clear both sets.
- Allowlist check in `gate` runs BEFORE the decision matrix; if matched, `recommendation` is forced to `"allow"`. Response and audit event include `allowlisted: bool`, `allowlist_rule_id`, and `allowlist_source_scope`.
- `confirm` tool uses `_DECISION_MAP` for case-insensitive normalization. `always_allow`/`always_allow_global` create allowlist rules only when `cfg.action_gating.allowlist_learning` is `True`; otherwise silently downgraded to `approve`.
- Static allowlist rules (`ActionGatingConfig.static_rules`) are checked in `gate` after the DB allowlist; `allowlist_source: "db" | "config" | None` in the response and audit event identifies the rule origin. `list_allowlist_rules(path)` (sync, stdlib sqlite3) in `db.py` lists all DB rules for CLI display.
- Typer sub-apps for grouped commands: `_sub = typer.Typer(help="...")` + `app.add_typer(_sub, name="cmd")` + `@_sub.command("subcommand")` → `clawstrike cmd subcommand`.
- New CLI commands follow the same pattern: `--json` input, `init_server`, `asyncio.run(srv.<tool>(**params))`, `json.dumps` output.
- Config-based contact trust overrides: `TrustConfig.contacts` maps `source_id` → `ContactOverrideLevel` ("trusted"|"blocked"). Checked at the TOP of `classify` before classifier invocation. Blocked → early return (no classifier call, still creates DB record). Trusted → `TrustLevel.HIGH` overrides first-contact and channel defaults. Both write `trust_update` audit events with `reason="config_override"`. `ContactOverrideLevel` is separate from `TrustLevel` — Pydantic rejects invalid values like "high"/"medium" at config load time.
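
The ordering rule for `gate` (mismatch override first, then the elevated-scrutiny downgrade) can be sketched as below. This is a minimal illustration, not the code in `mcpserver.py`: the function names and plain-string trust levels are hypothetical, the one-step downgrade below `low` is an assumption, and so is the choice that the mismatch override never raises an `untrusted` level.

```python
# Trust levels ordered from most to least trusted (plain strings for brevity).
_ORDER = ["high", "medium", "low", "untrusted"]


def is_mismatch(trust_level: str, score: float, base_flag: float) -> bool:
    """Mismatch fires on HIGH/MEDIUM channels whose score reaches the base flag threshold."""
    return trust_level in ("high", "medium") and score >= base_flag


def effective_gate_trust(trust_level: str, mismatch: bool, elevated: bool) -> str:
    """Apply the mismatch override first, then the elevated-scrutiny downgrade."""
    if mismatch and _ORDER.index(trust_level) < _ORDER.index("low"):
        # Assumption: the override only lowers trust; "untrusted" stays as-is.
        trust_level = "low"
    if elevated:
        # Assumption: one step down per the high->medium, medium->low rule,
        # floored at "untrusted".
        idx = min(_ORDER.index(trust_level) + 1, len(_ORDER) - 1)
        trust_level = _ORDER[idx]
    return trust_level
```

Because the mismatch override runs first, a session that is both mismatched and elevated ends one step below LOW in this sketch, rather than one step below its channel default.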

---

## [2026-02-22] - US-001
- Implemented YAML configuration loading (`src/clawstrike/config.py`)
- Set up the full project scaffold: `pyproject.toml` (hatchling build, src layout), `src/clawstrike/__init__.py`, `tests/__init__.py`
- Added dependencies via `uv add`: pydantic>=2, pyyaml, structlog; dev: pytest, pytest-asyncio, ruff
- Files changed:
  - `pyproject.toml` — added deps, build system, pytest/ruff config
  - `src/clawstrike/__init__.py` — created (empty)
  - `src/clawstrike/config.py` — Pydantic v2 config models + `load_config()` loader
  - `tests/__init__.py` — created (empty)
  - `tests/test_config.py` — 26 tests covering all US-001 acceptance criteria
  - `CLAUDE.md` — added dev commands + codebase patterns section
- All 26 tests pass; `ruff check` and `ruff format` clean
- **Learnings for future iterations:**
  - Pydantic v2 `ConfigDict(extra="allow")` stores unknown fields in `model_extra`; use this pattern + a recursive `_collect_extra_paths()` function + a `_NESTED` registry to emit stderr warnings for unknown YAML fields without breaking validation.
  - `StrEnum` (Python 3.11+) is preferred over `(str, Enum)` — ruff UP042 will flag the latter as an unsafe fix candidate; just use `StrEnum` from the start.
  - ruff's `--fix` handles import sorting and unused imports safely. `--unsafe-fixes` handles the StrEnum migration — use `StrEnum` from the start to avoid it.
  - `uv run ruff format` must be used for formatting, never manual line-wrapping.
  - `_RootConfig` wrapper model (with `clawstrike:` as the only key) is needed to validate the top-level YAML structure before extracting the inner `ClawStrikeConfig`.
---

## [2026-02-22] - US-002

- What was implemented:
  - Added `fastmcp>=3.0.1` dependency via `uv add fastmcp`
  - Created `src/clawstrike/server.py` with module-level `FastMCP("ClawStrike")` instance and three MCP tools:
    - `health` → returns `{"status": "ok", "mode": "skill", "classifier": "<model>"}`
    - `classify` → accepts `text`, `source_id`, `channel_type`; returns stub classification result (full inference in US-005/006)
    - `gate` → accepts `action_description`, `action_type`, `session_id`, `source_id`, `channel_type`; returns stub gating recommendation (full engine in US-017/018)
  - `init_server(cfg)` injects `ClawStrikeConfig` into the module-level `_config` global; `_require_config()` guards all tool handlers
  - Auto-initialization via `CLAWSTRIKE_CONFIG` env var for `fastmcp run src/clawstrike/server.py`
  - Updated `src/clawstrike/cli.py` `start` command: loads config, calls `init_server()`, logs startup banner to stderr, calls `mcp.run(transport="stdio")`
  - Created `tests/test_server.py` with 12 tests covering all three tools, uninitialized-server error paths, config injection, and tool registration
- Files changed:
  - `pyproject.toml` — added `fastmcp>=3.0.1`
  - `src/clawstrike/server.py` — new file
  - `src/clawstrike/cli.py` — updated `start` command
  - `tests/test_server.py` — new file
  - `CLAUDE.md` — added fastmcp patterns to Codebase Patterns
  - `tasks/clawstrike-user-stories.md` — marked US-002 complete
- **Learnings for future iterations:**
  - FastMCP v3 is already at 3.0.1; the `@mcp.tool` decorator (no parentheses) is the current API.
  - Direct tool testing: `result = await mcp.call_tool("name", {…})` works in-process without a Client. `result.structured_content` returns the typed dict. `result.data` also available.
  - `RuntimeError` inside a tool handler is caught and re-raised as `fastmcp.exceptions.ToolError` — tests must match `ToolError`, not `RuntimeError`.
  - Module-level `mcp = FastMCP(...)` must exist for `fastmcp run server.py` to find the server; don't use a factory-only pattern if you want `fastmcp run` support.
  - Test isolation for the `_config` global requires an `autouse` fixture that resets it to `None` after each test.
  - `mcp.run(transport="stdio")` is the correct call; the `transport` kwarg is a string literal.
  - Startup log must go to stderr (stdout is reserved for MCP protocol messages in stdio mode).
---

## [2026-02-22] - US-005 / US-006
- Implemented real classifier inference using Llama Prompt Guard 2 models
- Added `transformers>=4.40` and `torch>=2.0` runtime dependencies via `uv add`
- Files changed:
  - `pyproject.toml` — added `transformers`, `torch` dependencies
  - `src/clawstrike/classifier.py` — new: `ClassifierResult` dataclass, `BaseClassifier` ABC, `PromptGuardClassifier`, `_MODEL_IDS` mapping, `create_classifier()` factory
  - `src/clawstrike/mcpserver.py` — added `_classifier` global, `_require_classifier()`, updated `init_server()` to call `create_classifier()`, replaced `classify` stub with real inference + threshold-based decision (pass/flag/block)
  - `src/clawstrike/cli.py` — `start` now catches `RuntimeError` from `init_server()` and exits with code 1
  - `tests/test_classifier.py` — new: 11 tests covering model ID mapping, factory behavior, load failure, score/label mapping via mocked logits
  - `tests/test_server.py` — updated autouse fixture to mock `create_classifier` and reset `_classifier`
  - `tests/test_cli.py` — updated banner/banner-model tests to mock `create_classifier`; added `test_start_classifier_load_failure_exits_1`
  - `CLAUDE.md` — updated MCP server pattern, added classifier and CLI error-handling patterns
  - `tasks/clawstrike-user-stories.md` — marked US-005 and US-006 complete
- All 57 tests pass; `ruff check` and `ruff format` clean
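
The score/label mapping exercised by the mocked-logits tests can be illustrated in plain Python. This log records elsewhere that the class 1 probability is the score and that scores above 0.5 map to `"injection"`. The sketch below uses `math` instead of `torch`, and the function names are hypothetical.

```python
import math


def score_from_logits(logits: list[float], temperature: float = 1.0) -> float:
    """Return the softmax probability of class 1 (MALICIOUS) for a two-class logit pair."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    return exps[1] / sum(exps)


def label_for(score: float) -> str:
    """Binary mapping: strictly above 0.5 is an injection, otherwise benign."""
    return "injection" if score > 0.5 else "benign"
```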
---

## [2026-02-22] - Classifier chunking (sliding-window fix)

- What was implemented:
  - `_MAX_TOKENS: int = 512` module-level constant in `classifier.py`
  - `PromptGuardClassifier._classify_chunks(texts, temperature)` — batch-tokenizes a list of chunk strings and runs a single batched forward pass; returns `max(probs[:, 1])`
  - `PromptGuardClassifier.classify()` refactored: tokenizes the full text without truncation to count tokens; if ≤ 512 takes a fast path (single chunk = original text); if > 512 splits token IDs into non-overlapping 512-token windows, decodes each back to text via `tokenizer.decode(skip_special_tokens=True)`, then calls `_classify_chunks` on all chunks at once
  - Score aggregation: `max` across chunks (pessimistic — one malicious chunk marks the whole prompt)
  - Mirrors the official Meta `example.py` reference (`get_scores_for_texts`)
- Files changed:
  - `src/clawstrike/classifier.py` — added `_MAX_TOKENS`, `_classify_chunks`, refactored `classify`
  - `tests/test_classifier.py` — updated `_make_classifier_with_logits` helper (added `body_token_count`, `side_effect` tokenizer mock, `decode` stub); added 7 new tests covering fast/chunked paths, boundary chunk counts, max aggregation, and buried middle-chunk detection
  - `CLAUDE.md` — documented chunking pattern and updated test helper notes
- **Learnings:**
  - Official approach: decode token chunks back to text, re-tokenize as a batch — avoids manual CLS/SEP reconstruction and works with any tokenizer
  - "Parallel" in the docs means batched inference, not threads — single forward pass over all chunks is the right interpretation
  - Non-overlapping chunks (stride = window = 512) is the official recommendation; no overlap needed
  - Mock tokenizer needs `side_effect` (list) not `return_value` when `classify()` calls it twice with different signatures
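
The windowing and aggregation logic above can be sketched on bare token ID lists. This is a simplification: the tokenizer, the decode step, and the batched forward pass are left out, and the function names are hypothetical.

```python
from collections.abc import Sequence

MAX_TOKENS = 512


def split_windows(token_ids: Sequence[int], size: int = MAX_TOKENS) -> list[list[int]]:
    """Split token IDs into non-overlapping windows (stride == window size)."""
    if len(token_ids) <= size:
        return [list(token_ids)]  # fast path: the whole input is a single chunk
    return [list(token_ids[i : i + size]) for i in range(0, len(token_ids), size)]


def aggregate(chunk_scores: Sequence[float]) -> float:
    """Pessimistic aggregation: one malicious chunk marks the whole prompt."""
    return max(chunk_scores)
```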
---

## [2026-02-23] - US-008 / US-009 / US-010

- What was implemented:
  - **US-008 (Block):** `classify` now adds `"reason": "prompt_injection_detected"` when `decision == "block"` (score ≥ threshold.block)
  - **US-009 (Flag):** `classify` now adds `"elevated_scrutiny": True` when `decision == "flag"` (score ≥ flag, < block). Accepts optional `session_id: str = ""` — if non-empty, inserts the session into `_elevated_sessions` module global. `gate` tool reads `_elevated_sessions` and returns `"elevated_scrutiny": bool` in its response.
  - **US-010 (Pass):** No extra fields added for pass decisions; existing fields satisfy the AC. Verified no `reason`/`elevated_scrutiny` leak into pass responses.
  - Added `_elevated_sessions: set[str]` module global to `mcpserver.py`; `init_server()` resets it to a fresh empty set on (re)startup.
  - Updated autouse fixture in `tests/test_server.py` to yield the mock classifier and call `srv._elevated_sessions.clear()` in teardown.
  - Added 9 new tests covering block/flag/pass response shapes, session tagging, and gate elevation surfacing.
- Files changed:
  - `src/clawstrike/mcpserver.py` — added `_elevated_sessions`, updated `init_server`, updated `classify` signature and response, updated `gate` response
  - `tests/test_server.py` — updated autouse fixture (yield mock_clf + clear _elevated_sessions), added US-008/009/010 test section (9 tests)
  - `CLAUDE.md` — updated MCP server pattern + added session elevation and response shape patterns
  - `tasks/clawstrike-user-stories.md` — marked completed ACs for US-008/009/010
- All 76 tests pass; `ruff check` and `ruff format` clean
- **Learnings for future iterations:**
  - The autouse fixture should **yield the mock** (not a bare `yield`) so individual tests can override `mock_clf.classify.return_value` per-scenario. Tests that don't need a specific score just ignore the yielded value.
  - `_elevated_sessions` is an in-memory set — it survives for the lifetime of the process. `init_server()` must reset it (not just `_config`/`_classifier`) to guarantee clean state between tests and across real restarts.
  - Session ID `""` (empty string) is used as the "no session tracking" sentinel — truthy check guards the `_elevated_sessions.add()` call.
  - The full trust-downgrade logic for elevated sessions (high→medium, medium→low, etc.) is deferred to US-022 and the full gating engine (US-017/018). US-009 only requires the classification pipeline to tag sessions and surface the flag in the gate response.
  - The Llama Prompt Guard 2 models are binary (BENIGN/MALICIOUS). Class 1 probability = score. Map score > 0.5 → label "injection"; ≤ 0.5 → "benign". The "jailbreak" label is reserved for finer-grained future models.
  - HF models require Meta license acceptance + `HF_TOKEN` env var before first download. Cache at `~/.cache/huggingface/hub/`. 86M model ≈ 330 MB; 22M ≈ 90 MB.
  - Both `_config` and `_classifier` module globals must be reset in `test_server.py` autouse fixture. Forgetting `_classifier` causes cross-test contamination.
  - When testing a method on a class that has a complex `__init__`, use `MyClass.__new__(MyClass)` and assign attributes manually to bypass the constructor entirely.
  - Import `torch` and transformers lazily inside methods (not at module top-level) to keep import times fast in tests that don't need them.
  - CLI tests that exercise `init_server()` must also mock `create_classifier` — not just `mcp.run`.
---

## [2026-02-23] - US-011 + US-015

### What was implemented

**US-011 (Channel Trust Level Resolution):**
- `classify` and `gate` MCP tools now resolve `channel_type` against `trust.channel_defaults` config
- Unknown channel types default to `TrustLevel.UNTRUSTED` (fail-closed)
- Both tools include `trust_level` (string) in their responses
- `gate` tool replaced hardcoded `"trust_level": "medium"` with the dynamically resolved value

**US-015 (Trust-Modulated Classifier Thresholds):**
- `classify` tool applies `trust.threshold_modifiers` to base thresholds before making the block/flag/pass decision
- Effective thresholds are clamped to [0.0, 1.0]
- Response includes `threshold_applied: {block: <eff>, flag: <eff>}` for transparency/audit

### Files changed

- **Created:** `src/clawstrike/trust.py` — `resolve_trust_level()` and `compute_effective_thresholds()` pure functions
- **Modified:** `src/clawstrike/mcpserver.py` — wired trust resolution + threshold modulation into `classify`; resolved trust level in `gate`
- **Created:** `tests/test_trust.py` — 13 unit tests for trust engine (resolution + modulation + clamping)
- **Modified:** `tests/test_server.py` — 8 new integration tests for US-011/015 response fields and decision behavior
- **Modified:** `tasks/clawstrike-user-stories.md` — marked US-011 and US-015 as DONE
- **Modified:** `CLAUDE.md` — added trust engine patterns and classify response field notes
- **Created:** `progress.txt` — this file

### Test count: 76 → 99 (23 new tests)

### Learnings for future iterations:

- **Trust modulation affects existing decision tests:** when adding tests for `classify` decisions, always check what trust level the `channel_type` resolves to, since modifiers shift the effective thresholds. The existing tests use `email_body` (LOW, -0.05/-0.10) so scores like 0.95/0.80/0.30 were chosen safely outside boundary zones.
- **High trust raises thresholds, not lowers them:** A score of 0.93 with `owner_dm` (HIGH, eff_block=0.97) results in `flag`, not `block`: it exceeds the base block threshold (0.92) but not the effective one, so it falls into the flag range (eff_flag=0.80). This is correct behavior: high trust raises the block threshold (fewer false-positive blocks), but scores are still flagged if they reach eff_flag.
- **`gate` tool now uses `cfg`:** After US-011, `_require_config()` return value is captured and used. Pre-US-011 it was called but the return value was discarded.
- **Pure function tests need no mocking:** `test_trust.py` tests pure functions with default `TrustConfig()` — no `patch`, no `AsyncMock`, no fixtures needed. Fast and simple.
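
The threshold arithmetic behind these learnings can be worked through in a short sketch. The base block threshold (0.92) and the LOW modifiers (-0.05/-0.10) come from the notes above. The base flag threshold (0.75) and the HIGH modifiers (+0.05/+0.05) are assumptions, chosen only so that the results match the effective values quoted here. The function names and signatures are hypothetical and may differ from `trust.py`.

```python
def effective_thresholds(
    base_block: float, base_flag: float, block_mod: float, flag_mod: float
) -> tuple[float, float]:
    """Shift the base thresholds by the trust modifiers and clamp to [0.0, 1.0]."""

    def clamp(value: float) -> float:
        return max(0.0, min(1.0, value))

    return clamp(base_block + block_mod), clamp(base_flag + flag_mod)


def decide(score: float, eff_block: float, eff_flag: float) -> str:
    """Compare a score against the effective (not base) thresholds."""
    if score >= eff_block:
        return "block"
    if score >= eff_flag:
        return "flag"
    return "pass"


# HIGH trust: 0.93 is above the base block (0.92) but below eff_block (~0.97).
high_block, high_flag = effective_thresholds(0.92, 0.75, 0.05, 0.05)
high_decision = decide(0.93, high_block, high_flag)

# LOW trust: the same score is above eff_block (~0.87).
low_block, low_flag = effective_thresholds(0.92, 0.75, -0.05, -0.10)
low_decision = decide(0.93, low_block, low_flag)
```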
---

## [2026-02-23] - US-012

### What was implemented

**US-012 (Contact Registry — First Contact Detection):**
- Created `src/clawstrike/db.py`: `open_db()` async context manager (creates schema + parent dirs), `get_or_create_contact()` (INSERT on first seen, SELECT on return), `insert_audit_event()` (writes to `audit_events` table)
- Two SQLite tables: `contacts` (source_id PK, channel_type, trust_level, first_seen, last_seen, interaction_count) and `audit_events` (id, timestamp, event_type, session_id, source_id, channel_type, decision, score, is_first_contact, trust_level, details_json)
- `mcpserver.py`: added `_db_path: str | None` module global set by `init_server()`; `classify` tool calls `get_or_create_contact()` → if first contact, forces `TrustLevel.UNTRUSTED` regardless of channel defaults; writes audit event after decision; includes `is_first_contact: bool` in every response
- Added `aiosqlite>=0.19` to `pyproject.toml` dependencies

### Files changed

- **Created:** `src/clawstrike/db.py`
- **Created:** `tests/test_db.py` — 11 unit tests for DB layer
- **Modified:** `src/clawstrike/mcpserver.py` — `_db_path` global, `init_server` sets it, `classify` integrates contact registry + audit write
- **Modified:** `tests/test_server.py` — `cfg` fixture injects temp DB path; autouse teardown resets `_db_path`; fixed 2 breaking tests; 6 new US-012 integration tests
- **Modified:** `pyproject.toml` — added aiosqlite dependency
- **Modified:** `CLAUDE.md` — added DB layer patterns and first-contact override notes

### Test count: 99 → 116 (17 new tests)

### Learnings for future iterations:
- **`open_db` is idempotent:** `CREATE TABLE IF NOT EXISTS` in DDL means opening the same DB path multiple times is safe. Two `open_db` calls per `classify` (one for contact registry, one for audit write) is fine for SQLite.
- **`init_server` must stay sync:** `asyncio.run()` would conflict with already-running event loop in tests (pytest-asyncio mode=auto). Store the path, open lazily inside async tools.
- **Per-test DB isolation:** The `cfg` fixture now injects `audit.db_path = tmp_path / "test.db"`. The autouse fixture resets `srv._db_path = None` in teardown. This prevents state leakage between tests.
- **First-contact changes trust level for ALL existing tests:** Any test that asserts on `trust_level` in a `classify` response must account for the first-contact UNTRUSTED override. Two tests were updated: `test_classify_response_includes_trust_level` (assertion changed to "untrusted") and `test_classify_high_trust_raises_block_threshold` (added seed call to pre-register contact).
- **`aiosqlite.Row` factory:** Set `conn.row_factory = aiosqlite.Row` inside the `async with aiosqlite.connect(...)` block. Enables dict-style column access (`row["source_id"]`) which is more robust than positional indexing.
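
The open-per-request and get-or-create pattern can be sketched with the standard library. This is a synchronous simplification that uses `sqlite3` in place of `aiosqlite`. The column types, the `'auto'` default, and the initial count of 1 are assumptions inferred from these notes, and the real `get_or_create_contact()` signature may differ.

```python
import sqlite3
from collections.abc import Iterator
from contextlib import contextmanager
from datetime import datetime, timezone

_DDL = """
CREATE TABLE IF NOT EXISTS contacts (
    source_id TEXT PRIMARY KEY,
    channel_type TEXT NOT NULL,
    trust_level TEXT NOT NULL DEFAULT 'auto',
    first_seen TEXT NOT NULL,
    last_seen TEXT NOT NULL,
    interaction_count INTEGER NOT NULL DEFAULT 1
)
"""


@contextmanager
def open_db(path: str) -> Iterator[sqlite3.Connection]:
    """Open a short-lived connection; the DDL is idempotent, so every open is safe."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # enables row["source_id"] access
    try:
        conn.execute(_DDL)
        yield conn
        conn.commit()
    finally:
        conn.close()


def get_or_create_contact(
    conn: sqlite3.Connection, source_id: str, channel_type: str
) -> tuple[sqlite3.Row, bool]:
    """Return (record, is_first_contact): SELECT for known contacts, INSERT otherwise."""
    query = "SELECT * FROM contacts WHERE source_id = ?"
    row = conn.execute(query, (source_id,)).fetchone()
    if row is not None:
        return row, False
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO contacts (source_id, channel_type, first_seen, last_seen) "
        "VALUES (?, ?, ?, ?)",
        (source_id, channel_type, now, now),
    )
    return conn.execute(query, (source_id,)).fetchone(), True
```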
---

## [2026-02-24] - US-013

### What was implemented

**US-013 (Contact Registry — Interaction Tracking & Auto-Promotion):**
- `db.py`: added `increment_interaction(conn, source_id) -> ContactRecord` — UPDATE + SELECT to atomically increment count and return the updated record; added `set_contact_trust_level(conn, source_id, trust_level) -> None` — simple UPDATE for manual/auto trust changes
- `mcpserver.py`: `classify` now captures the full `ContactRecord` (not just `is_first_contact`) from phase-1 `get_or_create_contact`. Phase-2 DB block: if NOT first-contact AND decision != "block", calls `increment_interaction`; then checks `contact.trust_level == "auto"` (pre-increment value) AND `updated.interaction_count >= cfg.trust.auto_promote_after` to trigger promotion: calls `set_contact_trust_level` + writes a `trust_update` audit event with `details.reason = "auto_promote"`. All four DB operations (increment, set_trust, audit trust_update, audit classify) happen in one `open_db` context.
- DB import list in `mcpserver.py` expanded to multi-line for readability.

### Files changed

- **Modified:** `src/clawstrike/db.py` — added `increment_interaction()` and `set_contact_trust_level()`
- **Modified:** `src/clawstrike/mcpserver.py` — updated `classify` with interaction tracking + auto-promotion logic; updated import
- **Modified:** `tests/test_db.py` — 7 new unit tests for the two new DB functions
- **Modified:** `tests/test_server.py` — 8 new integration tests (increment on pass/flag, no-increment on block, auto-promote at threshold, trust_update audit event, no-promote before threshold, no-promote for manual overrides, promote only once); added `_get_contact_from_db` and `_get_audit_events` local helpers
- **Modified:** `tasks/clawstrike-user-stories.md` — marked US-013 as DONE
- **Modified:** `CLAUDE.md` — added US-013 patterns
- **Modified:** `progress.txt` — this entry

### Test count: 116 → 131 (15 new tests)

### Learnings for future iterations:
- **Use pre-increment `contact.trust_level` for the promotion gate.** The check `contact.trust_level == "auto"` uses the value captured BEFORE `increment_interaction` is called. This is correct: if the contact had already been promoted (trust_level='low'), the check correctly skips re-promotion. Using the post-increment value from `updated` would also work since `increment_interaction` doesn't change trust_level, but semantically the pre-increment record is cleaner.
- **Run the entire phase-2 in one `open_db` context.** Consolidating increment + optional trust update + audit events into a single connection avoids 3 separate `open_db` calls and is still safe since `open_db` is idempotent.
- **`_get_contact_from_db` helper re-uses `get_or_create_contact`.** This works because a known contact just returns `is_first_contact=False` with the stored record. Slightly indirect but avoids duplicating the SELECT query.
- **`auto_promote_after` is checked AFTER the increment.** The first-contact call creates the record with count=1 and does not increment. Each later non-blocked call increments once: the 1st gives count=2, ..., the 4th gives count=5. So with `auto_promote_after=5`, promotion fires on the 5th `classify` call overall. The test uses 5 total `call_tool` calls (1 first-contact + 4 increments = count=5 → promote).
---

## [2026-02-24] - CLI Integration Mode & MCP Config Toggle

- What was implemented:
  - **CLI commands:** `clawstrike classify --json '...'`, `clawstrike gate --json '...'`, `clawstrike health` — one-shot JSON-in/JSON-out commands that work without MCP support in the agent
  - **`mcp.enabled: bool = True`** added to `McpConfig`; removed dead `TransportMode` enum and `transport` field. `clawstrike start` with `mcp.enabled: false` exits 0 with an informational message.
  - **No-config fallback:** `clawstrike start` (and all CLI commands) fall back to `ClawStrikeConfig()` all-defaults when the default `clawstrike.yaml` is absent. Explicit `--config path` to a missing file still exits 1.
  - **`classifier.model` now has default** (`MULTILINGUAL`). `ClassifierConfig()` and `ClawStrikeConfig()` work with no args. `_RootConfig.clawstrike` gained `default_factory=ClawStrikeConfig` so empty/absent YAML is valid.
  - `_load_cfg_or_defaults(path)` helper in `cli.py` centralizes the fallback logic.
  - `health` CLI command is config-only — no `init_server` call, no model load.
  - Updated `clawstrike.example.yaml`: `mcp.transport` → `mcp.enabled: true`
  - Added US-041 (CLI Integration Mode); updated US-002, US-003 in user stories
  - Updated PRD Section 3.1 (dual invocation table) and Section 7 (config reference)
- Files changed:
  - `src/clawstrike/config.py` — removed `TransportMode`; `McpConfig.enabled`; `ClassifierConfig.model` default; `ClawStrikeConfig.classifier` default_factory; `_RootConfig.clawstrike` default_factory
  - `src/clawstrike/cli.py` — added `asyncio`, `json`, `ClawStrikeConfig` imports; `_load_cfg_or_defaults` helper; updated `start`; added `classify`, `gate`, `health` commands
  - `clawstrike.example.yaml` — replaced `transport: "stdio"` with `enabled: true`
  - `tests/test_config.py` — removed `TransportMode` import; updated 3 tests that expected errors for missing classifier (now use defaults); updated `test_defaults_mode_and_mcp`; added 4 new tests for `mcp.enabled`, old `transport` key warning, `ClawStrikeConfig()` no-args
  - `tests/test_cli.py` — updated `test_start_invalid_config_exits_1` (needed an actually invalid value); added 10 new tests covering no-config start, mcp-disabled start, `health`, `classify`, `gate`
  - `tasks/clawstrike-user-stories.md` — updated US-002/US-003; added US-041
  - `tasks/clawstrike-prd.md` — Section 3.1 dual-invocation table; Section 7 mcp.enabled
- All 192 tests pass; `ruff check` and `ruff format` clean
- **Learnings for future iterations:**
  - `_RootConfig` must have `clawstrike: ClawStrikeConfig = Field(default_factory=ClawStrikeConfig)` (not just required) for empty YAML files to work. Without this, `{}` fails Pydantic validation even though all nested fields have defaults.
  - Typer distinguishes "user didn't pass --config" from "user passed --config X" only at parse time. The workaround: compare `config != Path("clawstrike.yaml")` (the default) to detect an explicit vs implicit path. This is the idiomatic Typer approach since Typer doesn't expose a "was this explicitly set?" flag.
  - When a Typer test (`CliRunner.invoke`) previously caught a `ValueError`/`FileNotFoundError` from config loading, those tests break silently when the config becomes valid. Always verify that "invalid config" tests use a value that is *still* invalid (e.g., a bad enum value like `"model": "bad-model"`, not an empty section that now has a default).
  - `asyncio.run(srv.classify(...))` works correctly from CLI commands — the `@mcp.tool` decorated functions are plain async callables; FastMCP v3 doesn't wrap them in a way that prevents direct invocation.
  - `monkeypatch.chdir(tmp_path)` is the right way to test "no config file in working directory" scenarios — avoids accidentally picking up the real project's `clawstrike.yaml` during test runs.
  - `elevated_scrutiny` is absent in CLI responses by design (documented limitation). In-memory session state in `_elevated_sessions` requires a persistent process; one-shot CLI invocations have no shared memory.
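
The explicit-versus-implicit path check can be sketched without Typer or YAML. This is a hypothetical stand-in for `_load_cfg_or_defaults`: the `load` and `defaults` callables replace `load_config()` and `ClawStrikeConfig()`, and the sketch raises where the real CLI exits with code 1.

```python
from collections.abc import Callable
from pathlib import Path

DEFAULT_CONFIG_PATH = Path("clawstrike.yaml")


def load_cfg_or_defaults(
    config: Path,
    load: Callable[[Path], dict],
    defaults: Callable[[], dict],
) -> dict:
    """Load `config` if present; fall back to defaults only for the implicit default path."""
    # A user who types `--config clawstrike.yaml` is indistinguishable from the
    # default here; that is the known limit of comparing against the default value.
    explicit = config != DEFAULT_CONFIG_PATH
    if config.exists():
        return load(config)
    if explicit:
        raise FileNotFoundError(f"config file not found: {config}")
    return defaults()
```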
---

## [2026-02-24] - US-017 + US-018

### What was implemented

**US-017 (Advisory Action Classification via API):**
- Created `src/clawstrike/gating.py`: `classify_action(action_type) -> (risk_level, reason)` pure function that matches action_type against `_TAXONOMY` dict (27 entries covering all PRD 4.3.1 categories). Unknown types default to `"high"` (fail-safe) with reason `"unknown_action_type_defaulted_to_high"`. Known types return `"taxonomy_match"`.
- Full taxonomy: Critical (exec/spawn/system/child_process/shell_exec, outbound_network_unknown/curl/wget/fetch, skill_install/skill_modify, cron_create/cron_modify), High (send_email/send_message, file_write, calendar_modify/contact_modify), Medium (file_read_sensitive, web_browse/web_navigate/form_submit), Low (file_read, calendar_read, list_directory)

**US-018 (Gating Recommendation Matrix):**
- `apply_decision_matrix(risk_level, trust_level) -> str` pure function implementing the full 4×4 PRD matrix.
- `gate` tool in `mcpserver.py` now calls both functions; replaced stub with real taxonomy + matrix logic.
- Writes `action_gate` audit event via `insert_audit_event` with `decision=recommendation`, `trust_level`, and `details={action_type, action_description, risk_level, recommendation}`.

### Files changed

- **Created:** `src/clawstrike/gating.py` — taxonomy dict + `classify_action()` + `_MATRIX` + `apply_decision_matrix()`
- **Modified:** `src/clawstrike/mcpserver.py` — added `from clawstrike.gating import ...`, replaced stub `gate` with taxonomy + matrix + audit write
- **Created:** `tests/test_gating.py` — 32 unit tests (16 taxonomy, 16 matrix via parametrize)
- **Modified:** `tests/test_server.py` — 13 new integration tests for US-017/018
- **Modified:** `tasks/clawstrike-user-stories.md` — marked US-017 and US-018 as DONE
- **Modified:** `CLAUDE.md` — added gating engine patterns and gate audit event notes

### Test count: 131 → 178 (47 new tests)

### Learnings for future iterations:
- **`_MATRIX` uses `TrustLevel` enum keys directly.** The dict is keyed by `TrustLevel` instances (not strings), which works because `TrustLevel` is a `StrEnum`. `resolve_trust_level()` returns a `TrustLevel`, so direct lookup works: `_MATRIX[risk_level][trust_level]`.
- **`gate` tool does NOT call `get_or_create_contact`.** Unlike `classify`, it resolves trust purely from `channel_type` (no DB lookup), so the gate's `trust_level` does not reflect first-contact overrides. That is intentional for the MVP, since gate is called per action, not per message source.
- **Pure function tests for gating need no imports from fastmcp.** `test_gating.py` has no async, no fixtures — just direct calls. Follows the same pattern as `test_trust.py`.
- **Matrix parametrize covers all 16 cells.** Using `@pytest.mark.parametrize` for the 4×4 matrix is clean and ensures full coverage without writing 16 separate test functions.
- **Unknown `action_type` defaults to `"high"` risk, not to an automatic block.** The fail-safe is `"high"` (not `"critical"`), which means a high-trust session can still auto-allow an unknown action. This is intentional: over-blocking legitimate unknown actions would harm usability more than under-blocking unknown actions from trusted sources.
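
A minimal sketch of the fail-safe lookup described above, assuming the taxonomy is a plain `dict` from action type to risk level. Only a handful of the entries listed in this entry are reproduced, and the body is illustrative rather than the actual contents of `gating.py`.

```python
# Hypothetical excerpt: a few of the taxonomy entries listed in this entry.
_TAXONOMY: dict[str, str] = {
    "exec": "critical",
    "curl": "critical",
    "skill_install": "critical",
    "send_email": "high",
    "file_write": "high",
    "file_read_sensitive": "medium",
    "web_browse": "medium",
    "file_read": "low",
    "calendar_read": "low",
}


def classify_action(action_type: str) -> tuple[str, str]:
    """Return (risk_level, reason) for an action type."""
    risk_level = _TAXONOMY.get(action_type)
    if risk_level is None:
        # Fail-safe: anything not in the taxonomy is treated as high risk.
        return "high", "unknown_action_type_defaulted_to_high"
    return risk_level, "taxonomy_match"
```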
---

## [2026-02-24] - US-023 / US-024 (+ open ACs in US-008/009/010/015/018)
- What was implemented:
  - **US-023: Audit Log DB Initialization** — `setup_audit_db(path) -> (was_created, event_count)` sync function in `db.py` uses stdlib `sqlite3` to eagerly initialize the DB at startup. `clawstrike start` calls it after `init_server` and logs `"Audit log: <path> (created)"` or `"(ready, N events)"`. `open_db()` now calls `_apply_migrations()` after DDL to add missing columns on existing DBs (`PRAGMA table_info` check → `ALTER TABLE ADD COLUMN`).
  - **US-023: Schema additions** — `audit_events` table gains `label TEXT`, `raw_input_hash TEXT`, `raw_input_snippet TEXT` columns. Both async (`_apply_migrations`) and sync (`_apply_migrations_sync`) migration helpers implemented.
  - **US-024: Classify audit events enriched** — `insert_audit_event` accepts `label`, `raw_input_hash`, `raw_input_snippet`. The `classify` tool: always writes `raw_input_hash = SHA-256(text)`; writes `raw_input_snippet = text[:max_chars]` when `log_raw_input=True` (default) else `None`; writes `details = {model, threshold_applied: {block, flag}, elevated_scrutiny: bool}`.
  - **US-008 AC closed** — block events now include `raw_input_hash`, `label`, and full details in audit log.
  - **US-009 AC closed** — flag events now have `details.elevated_scrutiny=True` in audit log.
  - **US-010 AC closed** — pass events written (already was, now with full fields).
  - **US-015 AC closed** — effective thresholds now recorded in classify audit event `details.threshold_applied`.
- Files changed:
  - `src/clawstrike/db.py` — added `_NEW_AUDIT_COLS`, `_apply_migrations`, `_apply_migrations_sync`, `setup_audit_db`; updated `_DDL` and `insert_audit_event` signature
  - `src/clawstrike/mcpserver.py` — added `hashlib` import; classify tool now computes hash/snippet and writes enriched audit event with `details`
  - `src/clawstrike/cli.py` — `start` command calls `setup_audit_db` and logs DB status after `init_server`
  - `.gitignore` — added `data/` (default audit DB directory)
  - `tests/test_db.py` — 8 new tests: schema columns, migration, new insert_audit_event fields, `setup_audit_db` (new/existing/idempotent/parent-dirs)
  - `tests/test_server.py` — 8 new tests: full classify audit fields (label/model/thresholds), pass/flag/block decisions written, elevated_scrutiny in details, raw_input_snippet and hash
  - `tests/test_cli.py` — 2 new tests: startup log "(created)" and "(ready, N events)"
  - `tasks/clawstrike-user-stories.md` — marked US-023/024 ✅ DONE; closed open ACs in US-008/009/010/015
  - `CLAUDE.md` — added audit schema and classify audit event patterns
- **Learnings for future iterations:**
  - `setup_audit_db` uses stdlib `sqlite3` (sync) so it can be called from non-async startup code; `open_db` uses `aiosqlite` (async) for request-time operations. Two separate code paths but the same migration logic.
  - Migration uses `PRAGMA table_info(table_name)` to get existing columns; compare against `_NEW_AUDIT_COLS` list and `ALTER TABLE ADD COLUMN` for each missing column. This is safe to run on every open (no-op when columns already exist).
  - `raw_input_hash` is ALWAYS written (regardless of `log_raw_input`) so forensic identification is always possible. `raw_input_snippet` is only written when `log_raw_input=True`.
  - `elevated_scrutiny` in audit details is a bool derived from `decision == "flag"` at write time — not from the in-memory `_elevated_sessions` set (which only tracks sessions by ID).
  - Tests for new audit fields that vary by `log_raw_input` config need a separate `cfg` with `log_raw_input: False` — create it inline with `load_config(write_yaml(tmp_path, data))` rather than using the shared `cfg` fixture.
---

## [2026-02-25] - US-022
- What was implemented:
  - Added `downgrade_trust(trust_level) -> TrustLevel` pure function to `src/clawstrike/gating.py`
    - Maps HIGH→MEDIUM, MEDIUM→LOW, LOW→UNTRUSTED, UNTRUSTED stays UNTRUSTED (floor)
    - Designed to be called multiple times for stacked downgrades (US-016 will call it again)
  - Updated `gate` tool in `src/clawstrike/mcpserver.py`:
    - Checks `session_id in _elevated_sessions` to determine `elevated` flag
    - Applies `downgrade_trust(base_trust_level)` when elevated to get `effective_trust_level`
    - Uses `effective_trust_level` (not `base_trust_level`) for `apply_decision_matrix`
    - Response now includes both `trust_level` (original) and `effective_trust_level` (post-downgrade)
    - Audit event stores `effective_trust_level` as the top-level `trust_level` column; `details_json` includes `original_trust_level` and `elevated_scrutiny: bool`
- Files changed:
  - `src/clawstrike/gating.py` — added `_TRUST_DOWNGRADE` mapping and `downgrade_trust()` function
  - `src/clawstrike/mcpserver.py` — imported `downgrade_trust`; updated `gate` tool logic + response + audit
  - `tests/test_gating.py` — 5 unit tests for `downgrade_trust` (all 4 tiers + double-downgrade stack)
  - `tests/test_server.py` — 7 integration tests for elevated scrutiny gating behavior
  - `tasks/clawstrike-user-stories.md` — marked US-022 as DONE
- **Learnings for future iterations:**
  - `_elevated_sessions` is already populated by `classify` when decision=="flag" and session_id is non-empty; `gate` just reads from this set — no new global needed
  - The gate response now has BOTH `trust_level` (original channel-resolved) AND `effective_trust_level` (after downgrade); the audit log stores effective as the top-level column and original in details
  - `downgrade_trust` is designed to be called once per downgrade source — US-016 (content-source mismatch) will simply call it again on the already-downgraded result
  - `_make_cfg_with_trust()` helper in test_server.py creates a config with a specific channel→trust mapping for scenario testing; use this pattern for future trust-related tests
  - Avoid string annotations in test helpers (e.g., `-> "ClawStrikeConfig"`) — ruff UP037 flags them as fixable
---

## [2026-02-25] - US-016

- What was implemented:
  - Content-source mismatch detection in `classify`: when a known HIGH or MEDIUM trust contact sends content scoring ≥ the base `flag` threshold (before trust modulation), a mismatch is detected
  - Session tracking via `_mismatch_sessions: set[str]` module global in `mcpserver.py` (parallel to `_elevated_sessions`)
  - `init_server()` resets `_mismatch_sessions` alongside `_elevated_sessions`
  - In `gate`: mismatch forces effective trust to `LOW` (applied before elevated-scrutiny downgrade). Stacking: mismatch→LOW, then elevated_scrutiny→UNTRUSTED
  - Mismatch `trust_update` audit event with `reason: "content_source_mismatch"`, written in the classify DB block
  - `content_source_mismatch: bool` added to both `classify` and `gate` responses
  - `content_source_mismatch: bool` added to `gate` audit event `details_json`
- Files changed:
  - `src/clawstrike/mcpserver.py` — `_mismatch_sessions` global, `init_server` reset, mismatch detection in `classify`, mismatch downgrade in `gate`
  - `tests/test_server.py` — 15 new tests covering all US-016 ACs; teardown resets `_mismatch_sessions`
  - `CLAUDE.md` — documented mismatch pattern, updated MCP server globals list and gate response fields
  - `tasks/clawstrike-user-stories.md` — marked US-016 complete
- **Learnings for future iterations:**
  - Mismatch fires independently of the effective classify decision: a HIGH trust contact (effective_flag=0.80) scoring 0.75 gets a "pass" decision but still triggers mismatch (score ≥ base_flag=0.70). Test this explicitly — it's the key security insight of the story.
  - First-contact forces UNTRUSTED, so mismatch can never fire for first-time contacts (they won't be in HIGH/MEDIUM). No special guard needed.
  - Gate downgrade order matters: apply mismatch (→ LOW) BEFORE elevated_scrutiny (`downgrade_trust`). Wrong order would give different results for stacked sessions.
  - The `_make_cfg_with_trust()` helper already present in test_server.py is ideal for US-016 tests — reuse it.
  - Always seed a contact (benign classify call) before testing trust-sensitive behavior — otherwise first-contact override masks the intended trust level.
---

## [2026-02-26] - US-019 + US-020

### What was implemented

**US-019 (User Confirmation Prompt for Gated Actions):**
- `confirm` MCP tool in `mcpserver.py` — stateless tool that records the user's confirmation decision for gated actions
- Decision normalization via `_DECISION_MAP`: accepts `approve`/`a`, `deny`/`d`, `always_allow`/`aa`, `always_allow_global`/`aag` (case-insensitive, whitespace-stripped)
- Invalid decisions raise `RuntimeError` (surfaced as `ToolError` via FastMCP)
- Returns `{status, decision (allow/deny), user_decision (normalized), allowlist_created, allowlist_rule_id, ...}`
- Writes `action_confirm` audit event for every decision
- CLI equivalent: `clawstrike confirm --json '{...}'`

**US-020 (Action Allowlist Creation from Approval):**
- `action_allowlist` table in `db.py`: `id`, `action_type`, `action_pattern` (always NULL for now), `source_scope`, `created_at`, `created_by`
- `check_allowlist(conn, action_type, source_id)` — matches exact action_type AND (global OR source_id match)
- `insert_allowlist_rule(conn, action_type, source_scope, created_by)` — returns the new row ID
- In `gate` tool: allowlist check runs before the decision matrix; if matched, recommendation forced to `"allow"`, response includes `allowlisted: True` and `allowlist_rule_id`
- In `confirm` tool: `always_allow` creates source-scoped rule, `always_allow_global` creates global rule; only when `cfg.action_gating.allowlist_learning` is `True` (silently downgrades to `approve` when `False`)
- Writes `allowlist_creation` audit event when a rule is created
- Gate audit event `details_json` now includes `allowlisted`, `allowlist_rule_id`, `allowlist_source_scope`
- Updated MCP server instructions to mention the confirm tool

### Files changed

- **Modified:** `src/clawstrike/db.py` — added `action_allowlist` table DDL, `check_allowlist()`, `insert_allowlist_rule()`
- **Modified:** `src/clawstrike/mcpserver.py` — added `confirm` tool, `_DECISION_MAP`, allowlist bypass in `gate`, updated MCP instructions, new DB imports
- **Modified:** `src/clawstrike/cli.py` — added `confirm` CLI command
- **Modified:** `tests/test_db.py` — 5 new tests for allowlist CRUD
- **Modified:** `tests/test_server.py` — 22 new tests for confirm tool, allowlist creation, allowlist bypass in gate, E2E flow
- **Modified:** `tests/test_cli.py` — 2 new tests for confirm CLI command
- **Modified:** `CLAUDE.md` — added allowlist and confirm tool patterns
- **Modified:** `tasks/clawstrike-user-stories.md` — marked US-019 and US-020 as DONE
- **Modified:** `tasks/progress.txt` — this entry

### Test count: 233 → 262 (29 new tests)

### Learnings for future iterations:
- The `confirm` tool is fully stateless — it does not look up prior `gate` calls. The skill re-sends the full action context. This simplifies implementation (no session state for pending confirmations) at the cost of requiring the skill to relay all context.
- `always_allow`/`always_allow_global` with `allowlist_learning: false` silently downgrade to `approve` — no error, no rule. The `user_decision` in the response reflects the downgraded `"approve"`, not the original `"always_allow"`. This lets the skill know no rule was created.
- Allowlist check in `gate` uses a separate `open_db` call before the trust resolution pipeline. This keeps the allowlist check isolated and doesn't interfere with the existing trust/downgrade logic.
- The E2E test pattern (gate→confirm→gate) is effective for verifying the full flow. Use `_make_cfg_with_trust` to set up channels that produce `prompt_user` for the initial gate call.
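
A rough sketch of the decision normalization and the silent downgrade described above. The map contents follow the aliases listed in this entry; the function name `normalize_decision` and its `allowlist_learning` parameter are illustrative assumptions, not the actual `confirm` implementation.

```python
_DECISION_MAP = {
    "approve": "approve",
    "a": "approve",
    "deny": "deny",
    "d": "deny",
    "always_allow": "always_allow",
    "aa": "always_allow",
    "always_allow_global": "always_allow_global",
    "aag": "always_allow_global",
}


def normalize_decision(raw: str, allowlist_learning: bool = True) -> str:
    """Normalize user input; downgrade always_allow* when learning is disabled."""
    key = raw.strip().lower()
    if key not in _DECISION_MAP:
        # In the MCP tool this RuntimeError would surface as a ToolError.
        raise RuntimeError(f"invalid decision: {raw!r}")
    decision = _DECISION_MAP[key]
    if decision.startswith("always_allow") and not allowlist_learning:
        return "approve"  # no error and no rule, just a plain approval
    return decision
```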
---

## [2026-02-26] - US-028
- Implemented `clawstrike logs --export csv --output <path>` with full filter support
- Added `query_audit_events()` sync function to `db.py` using stdlib `sqlite3` (no event loop needed)
- Exported `AUDIT_EVENT_FIELDS` constant from `db.py` so CLI and tests share the same column order
- Replaced the placeholder `logs` command in `cli.py` with a full implementation
- Added `_parse_last_duration()` helper in `cli.py` to parse `24h`, `7d`, `30m` style strings into `timedelta`
- Files changed:
  - `src/clawstrike/db.py` — added `AUDIT_EVENT_FIELDS` list and `query_audit_events()` sync function
  - `src/clawstrike/cli.py` — replaced `logs` placeholder; added imports (`csv`, `re`, `datetime`, `timedelta`, `UTC`)
  - `tests/test_cli.py` — added 12 tests covering all acceptance criteria
  - `tasks/clawstrike-user-stories.md` — marked US-028 complete
- All 277 tests pass; ruff check and ruff format clean
- **Learnings for future iterations:**
  - For sync CLI read access to the audit DB, use `query_audit_events()` (stdlib sqlite3) — no async event loop needed; reserve `open_db` / `aiosqlite` for tool handlers that write within an async context
  - Export `AUDIT_EVENT_FIELDS` from `db.py` as a module-level constant so both the CLI writer and test assertions reference the same authoritative column list — avoids drift
  - `csv.DictWriter` writes `None` as an empty field, not as the string "None" (documented `csv.writer` behavior), so replacing `None` with `""` before writing is optional and only makes the intent explicit
  - Typer's `CliRunner.invoke(..., input="y\n")` simulates stdin for `typer.confirm()` prompts — use this in tests for overwrite-confirmation flows
  - Duration filter uses ISO string comparison in SQLite (`timestamp >= ?`) — works correctly because `datetime.now(UTC).isoformat()` and stored timestamps share the same `+00:00` UTC suffix, so lexicographic order equals chronological order
---

## [2026-02-26] - Remove migration helpers (db.py cleanup)
- Removed `_NEW_AUDIT_COLS`, `_apply_migrations()` (async), and `_apply_migrations_sync()` (sync) from `db.py` — the project is pre-release so no existing databases need migration; all columns (`label`, `raw_input_hash`, `raw_input_snippet`) were already present in the `_DDL` `CREATE TABLE` statement
- Removed the `_apply_migrations` call from `open_db()` and the `_apply_migrations_sync` call from `setup_audit_db()`
- Removed unused `sqlite3 as _sqlite3` import from `tests/test_db.py`
- Removed `test_open_db_migrates_old_schema` (migration-specific test, no longer needed)
- Renamed `test_audit_events_schema_has_new_columns` → `test_audit_events_schema_has_required_columns` to reflect that these are base DDL columns, not migration additions
- Files changed:
  - `src/clawstrike/db.py` — removed `_NEW_AUDIT_COLS`, `_apply_migrations`, `_apply_migrations_sync`, and their call-sites; simplified docstrings
  - `tests/test_db.py` — removed migration import and test; renamed schema column test
- 276 tests pass (net −1 from removed migration test); ruff check and ruff format clean
---

## [2026-02-26] - US-003 (+ open skill ACs in US-008 / US-019)

- Created `skills/clawstrike/` skill directory (no source code modified; no tests added — skill files are documentation only)
- **SKILL.md** — YAML frontmatter (`name`, `description`), mode detection (MCP tool list check → reads `references/mcp.md` or `references/cli.md`), session initialisation (UUID `session_id`), Step 1 classify workflow (block/flag/pass decision table including owner_dm notification wording for US-008), Step 2 gate workflow (allow/block/prompt_user decision table), Step 3 confirm workflow (full owner confirmation prompt spec including required context fields from US-019, deny → abandon instruction), action type reference table, channel type reference table
- **references/mcp.md** — Full parameter/response schemas for `classify`, `gate`, `confirm`, `health` MCP tools with JSON examples; notes that session elevation tracking is fully functional in MCP mode
- **references/cli.md** — Shell command syntax, JSON body schemas, and examples for all three CLI commands; notes that `elevated_scrutiny` is unavailable in CLI mode (no persistent process) and `session_id` is still required for audit correlation; expected ~1–2s cold start per call
- **README.md** — Prerequisites (`pip install clawstrike` / `uv add clawstrike`), config setup with `mcp.enabled: false` recommendation for OpenClaw, installation via manual file copy or ClawHub, MCP-mode setup instructions, troubleshooting section
- Files changed:
  - `skills/clawstrike/SKILL.md` — new
  - `skills/clawstrike/references/mcp.md` — new
  - `skills/clawstrike/references/cli.md` — new
  - `skills/clawstrike/README.md` — new
  - `tasks/clawstrike-user-stories.md` — marked US-003 ✅ DONE; closed remaining ACs in US-008 and US-019
  - `CLAUDE.md` — added skill file pattern note
  - `tasks/progress.txt` — this entry
- 276 tests pass (no Python changes; ruff check and ruff format clean)
- **Key design decisions:**
  - Unified skill covers both MCP and CLI modes via tool-list detection — no separate skill files needed
  - Block notifications go to owner_dm only, not the originating channel, to avoid creating a secondary injection vector (deviates from original US-008 AC wording; wording updated in user stories)
  - `session_id` is always passed (generated as UUID at session start) for audit correlation and elevation tracking; pass `""` explicitly to disable session tagging
  - Mode reference files (`mcp.md`, `cli.md`) are kept separate so SKILL.md stays under the recommended 500-line limit and agents load only the relevant invocation syntax
---

## [2026-02-26] - Make classify session_id required
- Removed the `= ""` default from `session_id` in the `classify` MCP tool — it is now a required parameter, consistent with `gate` and `confirm`
- Callers wishing to disable session tagging should pass `session_id: ""` explicitly
- Updated docstring for `session_id` param to reflect the new required status
- Files changed:
  - `src/clawstrike/mcpserver.py` — removed `= ""` default from `classify`'s `session_id` parameter; updated docstring
  - `tests/test_server.py` — added `"session_id"` to `_CLASSIFY_ARGS` and all explicit classify call dicts; renamed `test_classify_session_not_tagged_without_session_id` → `test_classify_elevated_scrutiny_scoped_to_session`; renamed `test_classify_mismatch_no_session_id_no_tag` → `test_classify_mismatch_empty_session_id_no_tag` (both updated to pass `session_id: ""` when testing the no-tagging sentinel)
  - `tests/test_cli.py` — added `"session_id"` to `_CLASSIFY_PARAMS`
  - `CLAUDE.md` — updated classify response shapes pattern
- 276 tests pass; ruff check and ruff format clean
---

## [2026-03-01] - US-043
- What was implemented:
  - `guard_allowlist_on_flag: bool = True` added to `ActionGatingConfig` in `config.py`
  - `confirm` tool updated with guard logic: when `guard_allowlist_on_flag=True` and `session_id` is in `_elevated_sessions` OR `_mismatch_sessions`, `always_allow`/`always_allow_global` decisions are silently downgraded to `approve` — no allowlist rule created
  - Guard reason logic: prefer `"elevated_scrutiny"` over `"content_source_mismatch"` when both apply; use `"allowlist_learning_disabled"` when the guard fires AND `allowlist_learning=False`
  - `confirm` response always includes `guard_applied: bool`; includes `guard_reason: str` only when `guard_applied=True`
  - Audit event `details` includes `guard_applied` and `guard_reason` when guard fires
  - 12 new tests covering all guard scenarios (parametrized for guard reasons, variants, stacking, disabled guard, empty session ID)
- Files changed:
  - `src/clawstrike/config.py` — added `guard_allowlist_on_flag: bool = True` to `ActionGatingConfig`
  - `src/clawstrike/mcpserver.py` — updated `confirm` tool with guard check + response/audit fields
  - `tests/test_server/test_confirm.py` — 12 new guard tests
  - `tasks/clawstrike-user-stories.md` — marked US-043 as DONE
- 270 tests pass; ruff check and ruff format clean
- **Learnings for future iterations:**
  - Guard logic runs BEFORE allowlist_learning check so that the stacking case (guard + learning=False → `allowlist_learning_disabled`) can be detected by checking `not cfg.action_gating.allowlist_learning` inside the guard branch
  - `guard_applied: bool` is always in the response (False by default); `guard_reason` is conditionally added only when guard fires — keeps response clean while remaining inspectable
  - The guard never fires in CLI mode because `_elevated_sessions` and `_mismatch_sessions` are module-level sets that are always empty in one-shot CLI invocations; no special CLI handling needed
  - Empty `session_id = ""` is never added to `_elevated_sessions`/`_mismatch_sessions` (guarded by `if session_id:` in classify), so guard naturally doesn't fire for empty session IDs
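
One way to picture the guard's reason selection is the sketch below. The function name `evaluate_guard` and its signature are hypothetical, and it assumes the caller only invokes it for `always_allow`/`always_allow_global` decisions.

```python
def evaluate_guard(
    session_id: str,
    elevated_sessions: set[str],
    mismatch_sessions: set[str],
    guard_allowlist_on_flag: bool = True,
    allowlist_learning: bool = True,
) -> dict[str, object]:
    """Decide whether an always_allow* decision must be downgraded to approve."""
    elevated = session_id in elevated_sessions
    mismatch = session_id in mismatch_sessions
    if not (guard_allowlist_on_flag and (elevated or mismatch)):
        # guard_reason is omitted entirely when the guard does not fire.
        return {"guard_applied": False}
    if not allowlist_learning:
        reason = "allowlist_learning_disabled"  # stacking case
    elif elevated:
        reason = "elevated_scrutiny"  # preferred when both signals apply
    else:
        reason = "content_source_mismatch"
    return {"guard_applied": True, "guard_reason": reason}
```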
---

## [2026-03-01] - US-014

- What was implemented:
  - Added `ContactOverrideLevel` StrEnum (`"trusted"` | `"blocked"`) to `config.py`
  - Added `contacts: dict[str, ContactOverrideLevel]` field to `TrustConfig` — maps `source_id` to a static trust override
  - In `classify` tool: config override is checked before classifier invocation:
    - `"blocked"` contacts: early return with `decision: "block"`, classifier is never called; contact record is still created in DB; trust_update + classify audit events written
    - `"trusted"` contacts: trust overridden to HIGH regardless of channel default or first-contact status; classifier runs normally; trust_update audit event written
  - Added `make_cfg_with_contacts()` helper to `tests/test_server/helpers.py`
  - Added 10 new tests in `tests/test_server/test_classify.py`
- Files changed:
  - `src/clawstrike/config.py` — added `ContactOverrideLevel` enum, added `contacts` field to `TrustConfig`
  - `src/clawstrike/mcpserver.py` — classify tool updated with config-override check and early-return for blocked contacts
  - `tests/test_server/helpers.py` — added `make_cfg_with_contacts()` helper
  - `tests/test_server/test_classify.py` — added US-014 test section (10 tests); imported `make_cfg_with_contacts`
  - `tasks/clawstrike-user-stories.md` — marked US-014 as DONE
- 281 tests pass; ruff check and ruff format clean
- **Learnings for future iterations:**
  - Config-override check runs at the TOP of `classify`, before `clf.classify()`, so the classifier is never invoked for blocked contacts. The `clf = _require_classifier()` call still runs (needed for non-blocked contacts), but `clf.classify(text)` does not.
  - Blocked contacts still call `get_or_create_contact()` (to create DB record per AC), so the contact record exists even if they never pass classification.
  - "Trusted" override uses HIGH trust (`TrustLevel.HIGH`) and this takes precedence over both first-contact UNTRUSTED and channel-default trust. The existing mismatch detection logic fires correctly for trusted contacts since they are HIGH trust.
  - `ContactOverrideLevel` is a separate StrEnum from `TrustLevel` — values "trusted"/"blocked" are NOT in `TrustLevel`. This correctly prevents using "high"/"medium" etc. as contact overrides (Pydantic rejects them at config load time).
  - The DB write block for "trusted" contacts inserts the trust_update event at the start of the same `open_db` context as the classify event and interaction tracking — all writes are atomic per request.
  - `make_cfg_with_contacts()` helper pattern: `tmp_path` + contacts dict + optional channel/channel_trust/db_name — reusable for all future tests needing contact overrides.
---

## [2026-03-01] - US-021

- What was implemented:
  - Added `StaticAllowlistRule` Pydantic model (`action_type: str`, `source_scope: str = "global"`) to `config.py`
  - Added `static_rules: list[StaticAllowlistRule]` field to `ActionGatingConfig` (default empty list)
  - In `gate` tool: DB allowlist is checked first; if no match, static config rules are checked with the same matching logic: exact `action_type` AND (`source_scope == "global"` OR `source_scope == source_id`). New `allowlist_source: "db" | "config" | None` field in gate response and audit event details distinguishes rule origins.
  - Added `list_allowlist_rules(path)` sync function to `db.py` (stdlib sqlite3, returns empty list if DB doesn't exist)
  - Replaced placeholder `allowlist` command in `cli.py` with a Typer sub-app (`_allowlist_app`) and `allowlist list` subcommand. Output is a left-aligned table showing Source, ID, Action Type, Source Scope, Created columns. Config rules show `"-"` for ID and `"(static)"` for Created.
  - Added `make_cfg_with_static_rules()` helper to `tests/test_server/helpers.py`
  - 15 new tests (11 gate, 4 CLI)
- Files changed:
  - `src/clawstrike/config.py` — added `StaticAllowlistRule`, updated `ActionGatingConfig`
  - `src/clawstrike/db.py` — added `list_allowlist_rules()`
  - `src/clawstrike/mcpserver.py` — gate tool: static rule check, `allowlist_source` field in response and audit
  - `src/clawstrike/cli.py` — replaced placeholder `allowlist` command with sub-app + `allowlist list`
  - `tests/test_server/helpers.py` — added `make_cfg_with_static_rules()` helper
  - `tests/test_server/test_gate.py` — 11 new tests (allowlist_source field, static rule matching, priority, audit)
  - `tests/test_cli.py` — 4 new tests for `allowlist list`
  - `tasks/clawstrike-user-stories.md` — marked US-021 as DONE
  - `CLAUDE.md` — updated allowlist patterns
- 292 tests pass; ruff check and ruff format clean
- **Learnings for future iterations:**
  - Typer sub-apps: `_sub = typer.Typer(help="...")` + `app.add_typer(_sub, name="subname")` + `@_sub.command("cmd")` enables `clawstrike subname cmd` syntax. The `name` in `add_typer` is what appears in CLI output.
  - DB rules are checked before config rules in gate — this ensures dynamic rules (created by the user at runtime) take precedence over static config rules when both match the same action.
  - Static config rules never get allowlist_rule_id (None) since they have no DB row — the `allowlist_source` field is the only way to distinguish them.
  - `list_allowlist_rules` follows the same sync/stdlib pattern as `query_audit_events` — no event loop needed for read-only CLI access.
  - `make_cfg_with_static_rules()` helper follows the same pattern as `make_cfg_with_trust()` and `make_cfg_with_contacts()` — minimal_config + extra dict + write_yaml + load_config.
---

## [2026-03-01] - US-042
- Implemented `clawstrike init` command in `src/clawstrike/cli.py`
- Command creates `clawstrike.yaml` in the CWD with secure defaults and descriptive inline comments
- Flags: `--force` (overwrite existing), `--mcp` (set `mcp.enabled: true` for MCP deployments)
- File permissions: `0o600` for `clawstrike.yaml`, `0o700` for `data/` directory
- Generated config includes commented-out `trust.contacts` and `action_gating.static_rules` examples
- Added 14 tests covering all acceptance criteria in `tests/test_cli.py`
- Files changed:
  - `src/clawstrike/cli.py` — added `init` command
  - `tests/test_cli.py` — added 14 tests for `init`
  - `tasks/clawstrike-user-stories.md` — marked US-042 complete
- **Learnings for future iterations:**
  - `clawstrike init` uses CWD via `Path("clawstrike.yaml")` — no `--config` flag since we're creating, not reading
  - `monkeypatch.chdir(tmp_path)` is essential in init tests (command creates files relative to CWD)
  - An f-string template for YAML generation keeps comments intact, whereas `yaml.dump()` strips them; generate the file from a string template rather than serializing a dict
  - `Path.chmod(0o600)` must be called after `write_text()` — write_text uses the process umask, chmod corrects it
  - `mkdir(mode=0o700)` sets the directory permissions reliably on Linux/WSL2
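
A small sketch of the write-then-chmod pattern from these learnings, assuming a POSIX filesystem. The template contents and the `root` parameter are illustrative (the real command writes to the CWD and generates a much fuller file).

```python
import stat
import tempfile
from pathlib import Path


def render_config(mcp: bool) -> str:
    # Hypothetical minimal template; comments survive because this is plain text.
    return f"""\
# ClawStrike configuration
clawstrike:
  mcp:
    enabled: {str(mcp).lower()}  # true for MCP deployments
"""


def init_config(root: Path, mcp: bool = False, force: bool = False) -> Path:
    cfg = root / "clawstrike.yaml"
    if cfg.exists() and not force:
        raise FileExistsError(cfg)
    cfg.write_text(render_config(mcp), encoding="utf-8")
    cfg.chmod(0o600)  # write_text honors the umask, so fix the mode afterwards
    (root / "data").mkdir(mode=0o700, exist_ok=True)
    return cfg


demo_root = Path(tempfile.mkdtemp())
demo_cfg = init_config(demo_root, mcp=True)
print(oct(stat.S_IMODE(demo_cfg.stat().st_mode)))
```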
---

## [2026-03-02] - US-032
- What was implemented:
  - Created `tests/test_server/test_e2e.py` with one E2E test covering the full classify → gate pipeline for a benign owner DM message
  - Test pre-seeds the owner contact in the DB (via `get_or_create_contact` directly) so classify sees them as a known contact with HIGH trust (owner_dm channel default), not first-contact UNTRUSTED
  - Verifies: classify decision="pass", trust_level="high", is_first_contact=False, content_source_mismatch=False
  - Verifies: gate recommendation="allow", risk_level="low" (calendar_read), trust_level="high"
  - Verifies: classify + gate round trip < 110ms (mock classifier eliminates inference time; measures pipeline + DB + MCP overhead)
  - Verifies: audit log contains exactly 1 "classify" event (decision: pass) and 1 "action_gate" event (decision: allow)
- Files changed:
  - `tests/test_server/test_e2e.py` — new file, 1 test
  - `tasks/clawstrike-user-stories.md` — marked US-032 as DONE
- 306 tests pass; ruff check and ruff format clean
- **Learnings for future iterations:**
  - E2E tests live in `tests/test_server/test_e2e.py` and inherit the `reset_server_config` autouse fixture (mock classifier + global reset) and `cfg` fixture (isolated DB path) from `tests/test_server/conftest.py`
  - Pre-seeding the contact via `open_db` + `get_or_create_contact` before `init_server(cfg)` is the correct pattern for tests that need non-first-contact HIGH trust. Direct DB seeding avoids extra audit events that would break "exactly N events" assertions
  - The gate tool resolves trust from channel_type directly (no contact DB lookup), so gate always sees HIGH trust for owner_dm regardless of first-contact status. Only classify has first-contact override behavior
  - The "input_classification" audit event type in the PRD/user stories corresponds to "classify" in the actual implementation
  - `dict(aiosqlite.Row)` produces a plain dict keyed by column names — safe for assertion
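
The row-to-dict conversion used for audit assertions can be tried with the stdlib alone, assuming `aiosqlite.Row` behaves like `sqlite3.Row`. The two-column table below is a simplified stand-in for `audit_events`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name
conn.execute("CREATE TABLE audit_events (event_type TEXT, decision TEXT)")
conn.execute("INSERT INTO audit_events VALUES ('classify', 'pass')")
conn.execute("INSERT INTO audit_events VALUES ('action_gate', 'allow')")

# dict(row) yields a plain dict keyed by column name, convenient for asserts.
rows = [
    dict(row)
    for row in conn.execute(
        "SELECT event_type, decision FROM audit_events ORDER BY rowid"
    )
]
conn.close()
```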
---

## [2026-03-02] - US-033
- What was implemented:
  - E2E test for prompt injection detection from an untrusted email sender
  - Test verifies the full classify pipeline: injection scoring → UNTRUSTED first-contact trust → lowered block threshold → block decision → audit log
- Files changed:
  - `tests/e2e/test_e2e.py` — added `test_e2e_prompt_injection_from_untrusted_email`; added imports for `json`, `pytest`, `ClassifierResult`
  - `tasks/clawstrike-user-stories.md` — marked US-033 as DONE
- **Learnings for future iterations:**
  - To override the mock classifier score in an e2e test, add `reset_server_config` as an explicit parameter to the test function (the autouse fixture yields the mock; adding it as a parameter gives access to the yielded value). Then set `reset_server_config.classify.return_value = ClassifierResult(...)` before or after `init_server(cfg)` — both work since init stores the mock object, and changing its return_value affects future calls.
  - UNTRUSTED effective block threshold = base 0.92 + UNTRUSTED modifier (-0.10) = 0.82. LOW = 0.87. These are the thresholds to use when asserting `threshold_applied.block` in tests.
  - First-contact senders always resolve to UNTRUSTED regardless of channel_type default. Do NOT pre-register the contact when the e2e scenario intentionally tests first-contact behavior.
  - Content-source mismatch only fires for HIGH or MEDIUM trust senders (not UNTRUSTED), so `content_source_mismatch` is always False for untrusted first-contact scenarios.
  - Block decisions set `elevated_scrutiny: False` in the audit `details_json` (only flag decisions set it True). Verify this in tests to confirm the audit path is correct.
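
The threshold arithmetic above can be written out as a small sketch. The base value and the UNTRUSTED modifier come from the notes. The LOW modifier of -0.05 is derived from the stated 0.87. The function names, the lowercase trust keys, and the `>=` comparison for block decisions are assumptions for illustration, not the project's actual code:

```python
BASE_BLOCK_THRESHOLD = 0.92

# Modifiers implied by the figures above (0.82 for UNTRUSTED, 0.87 for LOW).
# Other trust levels are omitted because the notes do not state them.
BLOCK_MODIFIERS = {"untrusted": -0.10, "low": -0.05}


def effective_block_threshold(trust_level: str) -> float:
    """Return the base threshold plus the trust modifier, rounded to 2 places."""
    return round(BASE_BLOCK_THRESHOLD + BLOCK_MODIFIERS[trust_level], 2)


def is_blocked(score: float, trust_level: str) -> bool:
    """Assumed rule: block when the score reaches the effective threshold."""
    return score >= effective_block_threshold(trust_level)
```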
---

## [2026-03-02] - US-034
- What was implemented:
  - Added `test_e2e_flagged_session_escalates_gate_recommendation` to `tests/e2e/test_e2e.py`
  - Tests the full classify → gate pipeline where a borderline-suspicious message from a known medium-trust contact (trusted_group channel) triggers elevated scrutiny
  - Verifies that the subsequent gate call for a high-risk action escalates from `prompt_user` to `block` due to active downgrade signals
- Files changed:
  - `tests/e2e/test_e2e.py` — added US-034 e2e test
  - `tasks/clawstrike-user-stories.md` — marked US-034 complete
- **Learnings for future iterations:**
  - For MEDIUM trust flag decisions, content-source mismatch ALWAYS co-fires with elevated_scrutiny: mismatch condition is `trust_level in (HIGH, MEDIUM) AND score >= base_flag (0.70)`, and for MEDIUM trust, eff_flag == base_flag == 0.70. So any score that triggers a flag decision also triggers mismatch.
  - When both mismatch and elevated_scrutiny are active, the gate effective trust degrades two tiers: MEDIUM → LOW (mismatch) → UNTRUSTED (elevated_scrutiny). The final recommendation for high+UNTRUSTED is block, satisfying the AC's "prompt_user → block" escalation.
  - Pre-registering a contact (via `get_or_create_contact` before `init_server`) is required when the scenario needs channel-default trust resolution (non-first-contact), because first-contact behavior always overrides to UNTRUSTED.
  - Gate resolves trust on its own, from `channel_type` via `resolve_trust_level` (channel defaults only), and never applies first-contact logic. This means the base_trust in gate can differ from the classify trust if the contact happened to be first-contact during classify.
  - The audit `action_gate` event stores `trust_level` as the **effective** (post-downgrade) trust, and `original_trust_level` in `details_json` as the base channel-resolved trust — check both in tests.
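
The co-firing and two-tier downgrade described above can be modeled in a few lines. This is a sketch under stated assumptions: the `Trust` enum here is an `IntEnum` only so tiers can be subtracted, one tier is removed per active signal, and the result is floored at UNTRUSTED. None of these names are taken from the real gate implementation:

```python
from enum import IntEnum


class Trust(IntEnum):
    """Hypothetical ordered trust tiers, lowest to highest."""

    UNTRUSTED = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


BASE_FLAG = 0.70  # base flag threshold quoted in the notes


def content_source_mismatch(trust: Trust, score: float) -> bool:
    """Mismatch condition as stated: HIGH or MEDIUM trust and score >= base flag."""
    return trust in (Trust.HIGH, Trust.MEDIUM) and score >= BASE_FLAG


def effective_gate_trust(base: Trust, mismatch: bool, elevated: bool) -> Trust:
    """Drop one tier per active downgrade signal, never below UNTRUSTED (assumed)."""
    tiers_down = int(mismatch) + int(elevated)
    return Trust(max(base - tiers_down, Trust.UNTRUSTED))
```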
---

## [2026-03-02] - US-035
- What was implemented:
  - E2E test `test_e2e_first_contact_to_auto_promotion` in `tests/e2e/test_e2e.py`
  - Verifies the full lifecycle: first-contact UNTRUSTED → 5 benign interactions → auto-promotion to MEDIUM → gate returns allow for low-risk action
- Files changed:
  - `tests/e2e/test_e2e.py` — added US-035 E2E test
  - `tasks/clawstrike-user-stories.md` — marked US-035 complete
- **Learnings for future iterations:**
  - The auto-promote loop: contact is created with `interaction_count=1`; first-contact path skips `increment_interaction`; calls 2–N increment the count; auto-promote fires when `contact.trust_level == "auto"` AND `updated.interaction_count >= auto_promote_after` (default 5). So 5 total classify calls trigger promotion on the 5th.
  - The `is_first_contact` column in `audit_events` is stored as INTEGER (0/1) in SQLite; compare with `== 1` not `is True` when reading raw rows back.
  - E2E tests that exercise the default classifier score (0.0 benign) don't need to explicitly reference the `reset_server_config` fixture as a parameter; the `autouse=True` fixture in `tests/e2e/conftest.py` applies it automatically. Only reference it as a parameter when you need to override `mock_clf.classify.return_value`.
  - No pre-registration of the contact is needed when the test intentionally exercises first-contact behavior (unlike `test_e2e_benign_owner_dm_passthrough`, which pre-registers to skip the first-contact override).
---

## [2026-03-04] - Skill refactor: split unified skill into mode-specific variants
- Replaced `skills/clawstrike/` (unified skill with runtime mode detection) with two self-contained skills:
  - `skills/clawstrike-mcp/SKILL.md` — for agents with ClawStrike connected as an MCP server; includes inline MCP invocation syntax and JSON examples
  - `skills/clawstrike-cli/SKILL.md` — for agents invoking ClawStrike as a one-shot shell command; includes inline CLI invocation syntax and bash examples
- Both skills have `name: clawstrike` in frontmatter; directory names (`clawstrike-mcp/`, `clawstrike-cli/`) disambiguate at the filesystem level
- Eliminated Step 0 (tool-list mode detection) entirely — no wasted tokens at session start; operator installs the appropriate variant for their deployment
- Dropped `references/` subdirectories — each SKILL.md is fully self-contained (~190–200 lines, well under 500-line limit)
- Updated `CLAUDE.md` to reflect the new two-skill structure
---

## [2026-03-02] - US-036
- What was implemented:
  - E2E test `test_e2e_allowlist_reduces_prompt_fatigue` in `tests/e2e/test_e2e.py`
  - Tests the full gate → confirm(always_allow) → gate allowlist-bypass flow
  - Verifies: initial gate for send_email (high-risk) from trusted_group (MEDIUM) → prompt_user; confirm with always_allow creates source-scoped DB rule; subsequent gate returns allow with allowlisted=True and the rule ID; audit event details reference the rule ID; deleting the rule from the DB directly restores prompt_user
  - Updated US-036 last AC: replaced the non-existent `clawstrike allowlist remove` CLI reference with "delete row directly from action_allowlist table"
- Files changed:
  - `tests/e2e/test_e2e.py` — added US-036 E2E test; updated module docstring
  - `tasks/clawstrike-user-stories.md` — marked US-036 ✅ DONE; updated last AC to reflect no CLI removal command
  - `tasks/progress.txt` — this entry
- 310 tests pass; ruff check and ruff format clean
- **Learnings for future iterations:**
  - `ActionGatingConfig.allowlist_learning` defaults to `True` in `config.py` (line 133) — the default `cfg` fixture therefore already enables allowlist rule creation; no extra config setup needed for tests that exercise `always_allow`.
  - The gate tool does NOT perform a contact registry lookup (no `get_or_create_contact`); trust is resolved purely from `channel_type` → no pre-seeding required for gate-focused E2E tests.
  - Deleting rows from SQLite inside an async test: use `async with open_db(db_path) as conn: await conn.execute("DELETE ...")` followed by `await conn.commit()`. The commit is required because `open_db` does not auto-commit; the WAL mode it sets implicitly does not change that.
  - The `gate` response `allowlist_source` field is `"db"` for rules created via `confirm` and `"config"` for static config rules — verify this in tests that exercise allowlist bypass to confirm the correct code path ran.
  - User story ACs that reference CLI commands should be audited against the actual CLI surface when adapting them. The `allowlist remove` command was never implemented (PRD 4.5 deliberately omits it); the AC was updated to reflect the direct-DB approach.
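
The commit requirement noted above can be reproduced with the standard-library `sqlite3` module as a stand-in for `aiosqlite` and `open_db`. This is a sketch only: the single-column `action_allowlist` schema and the helper names are assumptions, and a second connection plays the role of the server reading the allowlist:

```python
import sqlite3
import tempfile
from pathlib import Path


def count_rules(db_path: Path) -> int:
    """Count rows through a fresh connection, as a separate reader would."""
    reader = sqlite3.connect(db_path)
    try:
        return reader.execute("SELECT COUNT(*) FROM action_allowlist").fetchone()[0]
    finally:
        reader.close()


def delete_visibility() -> tuple[int, int]:
    """Return the reader-visible row count before and after the commit."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "demo.db"
        writer = sqlite3.connect(db_path)
        try:
            writer.execute("PRAGMA journal_mode=WAL")
            # Hypothetical minimal schema; the real table has more columns.
            writer.execute("CREATE TABLE action_allowlist (id INTEGER PRIMARY KEY)")
            writer.execute("INSERT INTO action_allowlist DEFAULT VALUES")
            writer.commit()

            # The DELETE opens an implicit transaction on the writer.
            writer.execute("DELETE FROM action_allowlist")
            before_commit = count_rules(db_path)  # reader still sees the row
            writer.commit()
            after_commit = count_rules(db_path)  # deletion is now visible
        finally:
            writer.close()
    return before_commit, after_commit
```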
---
