## Code Review: `chunk-docs` command

### expert_build/chunk_docs.py:chunk_markdown
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Greedy merge by heading boundaries with fallback to `chunk_fixed` when no headings found. The lookahead regex `(?=^#{1,2} )` correctly preserves headings in their respective chunks. Only splits on h1/h2, which is a reasonable default. Tests cover heading splits, small-section merging, h2-level headings, and the no-heading fallback.

---

### expert_build/chunk_docs.py:chunk_python
VERDICT: CONCERN
CORRECTNESS: QUESTIONABLE
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: PARTIAL
INTEGRATION: WIRED
REASONING: Two minor issues. (1) Line 36 — the preamble-end detection checks whether a `@`-line's successor matches `^(class |def |@)`, but this only looks one line ahead. A standalone decorator not followed by another decorator or def (e.g., a decorator above a blank line) would mark preamble_end prematurely. This is unlikely in practice but worth noting. (2) Lines 63-66 — when starting a new chunk, `current = preamble + section` omits the trailing `"\n"` that the else branch adds (`current += section + "\n"`), creating minor inconsistency in trailing whitespace between the first section of a chunk and subsequent ones. Neither is a blocker. Test coverage is missing the stacked-decorators case (`@d1\n@d2\ndef f`).

---

### expert_build/chunk_docs.py:chunk_fixed
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Standard sliding-window chunker. Overlap capped at `max_chars // 4` to prevent degenerate cases. Step floored at 1 to prevent infinite loops. Short-circuit for text under `max_chars`. Tests verify single-chunk case and overlap correctness.

---

### expert_build/chunk_docs.py:cmd_chunk_docs
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Follows the same pattern as `cmd_summarize` — manifest-based idempotency, frontmatter propagation, date-based entry directories. Dry-run correctly `continue`s before both file writes and manifest updates, preventing manifest poisoning (tested explicitly). The `source_url` validation rejects non-HTTP values. One design note: the manifest path is hardcoded to `.chunked-docs` (a single file for all source dirs), whereas `chunk-pdf` uses `.chunked-{prefix}` per PDF — if two different `--input-dir` values share a filename, the manifest would incorrectly skip the second. This matches the `cmd_summarize` pattern though (single `.summarized` file), so it's consistent with the codebase.

---

### expert_build/cli.py
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Subparser and `_lazy` dispatch follow the exact pattern of every other command. Argument defaults (`--input-dir sources`, `--threshold 30000`, `--dry-run`) are sensible and consistent with `summarize` and `chunk-pdf`.

---

### expert_build/summarize.py
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: One-line change replacing a generic warning with an actionable suggestion pointing to the new `chunk-docs` command. Test updated to match. The `chunk-pdf` path for `.pdf` files is preserved above this line.

---

### tests/test_chunk_docs.py
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Comprehensive test suite covering unit tests for all three chunkers and integration tests for the command. Key scenarios: small-file skip, large markdown/python chunking, dry-run safety, manifest idempotency, frontmatter propagation, and chunk naming. The `test_dry_run_does_not_poison_manifest` test is a good regression guard. The `monkeypatch.chdir(wd)` in `work_dir` fixture correctly isolates the manifest and entries from other tests.

---

### tests/test_summarize.py
VERDICT: PASS
CORRECTNESS: VALID
SPEC_COMPLIANCE: N/A
ISSUE_COMPLIANCE: N/A
BELIEF_COMPLIANCE: N/A
TEST_COVERAGE: COVERED
INTEGRATION: WIRED
REASONING: Assertion updated to match the new warning message. Straightforward.

---

## Summary

Clean, well-tested addition that follows existing codebase patterns. No blockers. The only concern is minor: `chunk_python`'s trailing-newline inconsistency between the two merge branches and missing test coverage for stacked decorators. Neither would cause user-visible problems in practice — the chunks are consumed as text blobs, so an extra or missing newline is immaterial.

---

### SELF_REVIEW
LIMITATIONS: Could not run the test suite to verify all tests pass. Did not see the full `cmd_chunk_pdf` to compare manifest strategies in detail (only the observation excerpt). Could not verify whether `re.split` with the heading regex handles edge cases like headings inside fenced code blocks (it doesn't — this is a known limitation of regex-based markdown splitting, not specific to this PR).

---

### FEATURE_REQUESTS
- Include the ability to run the test suite as part of the review to verify tests actually pass
- Flag when new code duplicates logic from existing functions (e.g., `_strip_frontmatter` vs the inline frontmatter parsing in `cmd_summarize`) as a potential refactoring opportunity
