## Code Review: Pipeline State & Resume Support

### expert_build/cli.py
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Simple addition of `--resume` flag with `store_true` action. Consistent with existing arg patterns in the parser.
---

### expert_build/pipeline.py:_init_state
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Clean initialization. Uses `getattr` defensively for optional args (`url`, `domain`). Saves immediately, so even a crash after init leaves a state file. Stage keys like `"1_ingest"` are constructed consistently from `STAGE_NAMES`.
---

### expert_build/pipeline.py:_load_state / _save_state
VERDICT: CONCERN
CORRECTNESS: QUESTIONABLE
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: PARTIAL
INTEGRATION: WIRED
REASONING: `_save_state` does a non-atomic write (`Path.write_text`). If the process is killed mid-write (e.g., OOM, `kill -9`), the state file could be truncated or corrupt, and `_load_state` would raise `json.JSONDecodeError` on the next `--resume` instead of a user-friendly error. For a CLI tool this is low-probability but worth a defensive `try/except` in `_load_state`, or write-to-temp-then-rename for atomicity.
---

### expert_build/pipeline.py:cmd_pipeline (resume path)
VERDICT: CONCERN
CORRECTNESS: QUESTIONABLE
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: PARTIAL
INTEGRATION: WIRED
REASONING: Two issues:

1. **Resuming a completed pipeline re-runs work.** If the prior run succeeded (`status: "completed"`), `--resume` loads that state, sets it to `"running"`, skips stages 1-3 (all completed), but stage 8 (export) has no `_stage_completed` guard — it always re-runs. The convergence loop also restarts from `current_cycle`, potentially re-running cycles. A guard like `if state["status"] == "completed": print("Pipeline already completed"); return` would prevent this.

2. **No args-mismatch detection on resume.** The state file saves `args.model`, `args.rounds`, etc., but resume ignores them entirely, using the new CLI args instead. If someone resumes with `--rounds 1` after starting with `--rounds 5`, or switches models, the behavior could be confusing. At minimum, a warning when saved args differ from current args would help.
---

### expert_build/pipeline.py:cmd_pipeline (convergence loop resume)
VERDICT: CONCERN
CORRECTNESS: QUESTIONABLE
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: UNTESTED
INTEGRATION: WIRED
REASONING: Stages 4-7 within the convergence loop have no per-stage `_stage_completed` guards. If the pipeline crashes at stage 5 (review) in cycle 2, resume starts at cycle 2 but re-runs stage 4 (derive) even though it already completed for that cycle. Derive adds beliefs to `reasons.db` — re-running it could produce duplicates or different results depending on idempotency of `_stage_derive`. This mid-cycle resume behavior is untested. Consider either: (a) adding per-stage guards within the loop, or (b) documenting that resume replays the entire crashed cycle, and adding a test confirming this.
---

### expert_build/pipeline.py:_mark_stage
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Correctly updates stage status, sets `current_stage` on "running", adds `completed_at` timestamp on "completed", merges extra metadata. Each call persists to disk via `_save_state`, which is important for crash recovery.
---

### tests/test_pipeline.py:TestPipelineState
VERDICT: CONCERN
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: PARTIAL
INTEGRATION: WIRED
REASONING: Five tests cover the happy path well. Missing coverage for:
- **Mid-convergence-loop crash and resume** (e.g., crash at cycle 2 review, resume re-runs from cycle 2 derive) — this is the most complex resume scenario and the one most likely to have bugs.
- **Resume of an already-completed pipeline** — currently silently re-runs parts.
- **Corrupt state file** — `_load_state` would crash with an unhelpful `JSONDecodeError`.
- `STATE_FILE` is imported but never used in assertions — tests use `_load_state()` instead, which is fine but means the import is unnecessary.
---

### SELF_REVIEW
LIMITATIONS: Could not verify idempotency of `_stage_derive`, `_stage_review`, etc. to assess the real-world impact of re-running stages on resume. Did not see the `caffeinate.hold` implementation to know if it interacts with state. The observation data was thorough and covered the key areas.
---

### FEATURE_REQUESTS
- Include idempotency annotations or analysis for functions called during resume (does re-running `_stage_derive` produce duplicates?)
- Show the full test file to verify existing tests still pass with the new `resume=False` default in `make_pipeline_args`
---
