## Code Review: `chunk-docs` command

### expert_build/chunk_docs.py:cmd_chunk_docs
VERDICT: BLOCK
CORRECTNESS: BROKEN
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: PARTIAL
INTEGRATION: WIRED
REASONING: **`--dry-run` writes to the manifest (line 155-157), poisoning subsequent real runs.** After `cmd_chunk_docs(args)` with `dry_run=True`, the source path is appended to `.chunked-docs`. A follow-up non-dry-run invocation will see the file in `done` and skip it (line 131-133), producing zero entries. Dry run should be side-effect-free. The test `test_dry_run_no_files_created` only asserts no entry files are written — it doesn't assert the manifest is clean, so it misses this bug.

Fix: move the manifest write (lines 155-157) out of the `if args.dry_run` block — it's already handled correctly in the non-dry-run path at lines 176-178.

---

### expert_build/chunk_docs.py:chunk_python
VERDICT: CONCERN
CORRECTNESS: QUESTIONABLE
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: PARTIAL
INTEGRATION: WIRED
REASONING: The boundary regex `^(class |def |@)` (line 46) treats decorators as independent boundaries. For code like `@decorator\ndef foo(): ...`, this produces two separate sections: one for `@decorator` and one for `def foo`. The merge step will recombine them if they're small, but at a chunk boundary a decorator could end up in a different chunk than its function. No test exercises the decorator case. Low practical impact for documentation chunking, but worth noting.

---

### expert_build/chunk_docs.py:chunk_markdown
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Clean heading-based split with merge for small sections and fallback to `chunk_fixed`. Regex `(?=^#{1,2} )` correctly uses lookahead to preserve heading markers. Tests cover heading split, merge, h2, and no-heading fallback.

---

### expert_build/chunk_docs.py:chunk_fixed
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Overlap is capped at 25% of `max_chars`, and `step` is floored at 1 to prevent infinite loops. Tests verify single-chunk passthrough and overlap correctness.

---

### expert_build/cli.py
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Argparse subcommand and lazy dispatch added consistently with existing commands (`chunk-pdf`, `summarize`). Default `--input-dir` of `sources` and `--threshold` of 30000 match the summarize truncation limit.

---

### expert_build/summarize.py
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Warning message updated from generic "Large documents may lose tail content" to actionable "Consider: expert-build chunk-docs". Test in `test_summarize.py` updated to match the new string.

---

### tests/test_chunk_docs.py
VERDICT: CONCERN
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: PARTIAL
INTEGRATION: WIRED
REASONING: Good coverage of the happy paths. Missing: (1) test that dry-run does NOT write the manifest (would have caught the bug above), (2) test for decorator handling in `chunk_python`, (3) test for files with frontmatter but no `source_url`/`source_id` to verify the minimal frontmatter path.

---

## Summary

One **BLOCK**: `--dry-run` corrupts the manifest, causing subsequent real runs to silently skip files. Fix by removing lines 155-157 (the manifest write inside the dry-run branch). The non-dry-run path already handles manifest writing correctly at lines 176-178.

One **CONCERN** on `chunk_python` decorator handling — low severity but worth a follow-up test.

### SELF_REVIEW
LIMITATIONS: Did not run the tests to confirm the dry-run manifest bug empirically — assessed from code reading. Did not see the full `chunk_pdf` implementation to compare manifest handling patterns beyond the observation excerpt.

---

### FEATURE_REQUESTS
- Include the existing sibling commands (`chunk_pdf`, `summarize`) in full so patterns can be compared for consistency without needing observations
- Auto-detect when `--dry-run` paths have side effects and flag them

---
