## Code Review: Pipeline State & Resume Support

### expert_build/cli.py
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Clean addition of `--resume` flag. Properly wired to `cmd_pipeline` via argparse. The `store_true` default of `False` matches the `getattr(args, "resume", False)` fallback in `pipeline.py`.

---

### expert_build/pipeline.py:_init_state
VERDICT: CONCERN
CORRECTNESS: QUESTIONABLE
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: The stored `args` dict captures `url`, `model`, `rounds`, `domain` but omits `pdf`, `no_auto_accept`, `max_derive_rounds`, `timeout`, and `batch_size`. While the stored args aren't used functionally on resume (the live `args` object is used), a user inspecting the state file to understand what ran won't have the full picture. More importantly, there's no validation on resume that the current args are compatible with the original run — e.g., resuming with `--rounds 2` after starting with `--rounds 5` could produce unexpected behavior in the convergence loop.

---

### expert_build/pipeline.py:_load_state
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Correctly handles missing file (returns None) and corrupt JSON (catches `JSONDecodeError`/`ValueError`, warns to stderr, returns None). The `test_corrupt_state_file_handled` test verifies this path.

---

### expert_build/pipeline.py:cmd_pipeline (resume + state tracking)
VERDICT: CONCERN
CORRECTNESS: QUESTIONABLE
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: PARTIAL
INTEGRATION: WIRED
REASONING: Two issues:

**1. Convergence loop re-runs on resume after loop completion (bug).** If all convergence cycles finish (or converge early) and then stage 8 crashes, resume sets `start_cycle = state.get("current_cycle") or 1`, which is the *last* cycle number. The loop then runs `range(last_cycle, args.rounds + 1)`, re-executing at least one cycle. Worse, if the loop converged early at cycle 2 of 5, resume re-runs cycles 2–5, undoing the convergence. The derive/review stages are likely idempotent enough that this produces waste rather than corruption, but it's still incorrect behavior. Fix: persist a `"loop_completed": True` flag after the `for` loop exits and skip the loop on resume when it's set.

**2. Unnecessary f-string** at line ~339: `print(f"Resuming pipeline from state file", ...)` — no interpolation.

---

### expert_build/pipeline.py:_mark_stage
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Correctly updates stage status, sets `completed_at` timestamp on completion, merges extra metadata via `**meta`, and persists to disk after every transition. The `current_stage` tracking on "running" is informational-only but useful for debugging.

---

### expert_build/pipeline.py:_stage_completed
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Simple status check. Used correctly for stages 1–3 to gate re-execution. Not used for stages 4–7 (loop body), which is the correct design — those stages run per-cycle and get their status overwritten each iteration.

---

### expert_build/pipeline.py:exception handler
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Catches `Exception` (not `BaseException`), so `KeyboardInterrupt` and `SystemExit` propagate without being recorded as failures. Stores both `error` string and full traceback for diagnostics. Re-raises after saving state, preserving the original exception behavior. Tested by `test_state_records_failure`.

---

### tests/test_pipeline.py:TestPipelineState
VERDICT: CONCERN
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: PARTIAL
INTEGRATION: WIRED
REASONING: Seven test cases covering the main scenarios: success, failure recording, resume skip, resume-without-state, pause, resume-completed, and corrupt state. All well-structured with proper mock isolation. However, missing:
- **Mid-convergence-loop crash + resume**: no test verifies that crashing during cycle 2's review and resuming re-runs from cycle 2 (and doesn't re-run cycle 1).
- **Resume after loop completion + stage 8 failure**: would expose the loop re-run bug described above.
- **Multi-round convergence with resume**: no test with `rounds > 1` and resume.

The `json` import added at line 1 is unused in tests (no test calls `json.loads`/`json.dumps` directly — they use `_load_state`/`_save_state`). Minor.

---

## Summary

The implementation is solid for the common cases (crash-and-resume for stages 1–3, recording failures, handling corrupt state). The main correctness issue is the convergence loop resume logic: if the loop finishes and stage 8 fails, resume will re-execute convergence cycles unnecessarily. This is likely benign (derive and deduplicate should be approximately idempotent) but is a latent bug worth fixing before the loop logic gets more complex.

### SELF_REVIEW
LIMITATIONS: Could not verify whether `_stage_derive` is truly idempotent when re-run for an already-completed cycle (depends on `validate_proposals` dedup logic in `reasons_lib`). Could not verify whether the `work_dir` fixture's `monkeypatch.chdir` correctly isolates `STATE_FILE` across parallel pytest execution (xdist). Did not have access to the full test file to check whether existing tests in `TestPipeline` (pre-existing class) still pass with the added `resume=False` default in `make_pipeline_args`.
---

### FEATURE_REQUESTS
- Show the full state of modified module-level constants (like `STATE_FILE`, `STAGE_NAMES`) alongside the diff to verify they're consistent with their usage sites
- Include `git log --oneline` for the branch to understand the evolution of the feature across commits
---
