Cross-Domain Parameter Sweep

5 corpora × 3 parameters × 108 queries each — validating parameter robustness across domains

Chunk Size Sweep (256, 512, 768, 1024)

Finding: MRR varies by <2.5% across all chunk sizes for every corpus. The original software engineering corpus shows the most sensitivity (0.54–0.56 MRR range); all cross-domain corpora are flatter (<1% spread for legal, medical). Hit Rate@5 is nearly identical across all values. Chunk size 512 is confirmed as a safe default across domains.

Chunk Overlap Sweep (0, 25, 50, 100)

Finding: MRR spread is <1% for all corpora across all overlap values. Hit Rate@5 is virtually identical. Overlap has no measurable impact on retrieval quality when Arcane Recall expansion is active. Overlap 50 retained as standard — no corpus benefits from changing it.

Expansion Similarity Threshold Sweep (0.85–0.95)

Finding: Ranking metrics are identical across all thresholds for all corpora. The threshold controls only token volume: 0.85 produces ~40% more tokens than 0.95. The current default of 0.92 sits in the sweet spot — moderate token savings with zero quality cost. Threshold 0.92 confirmed cross-domain.

Summary: Parameter Sensitivity by Corpus

CorpusDomainDocsChunk Size MRR RangeOverlap MRR RangeThreshold MRR Range
OriginalSoftware Engineering890.0250.0080.001
LegalContract Law, Regulation900.0050.0090.000
MedicalClinical Medicine910.0050.0000.000
API ReferenceREST API Docs950.0080.0090.005
NarrativeHistory, Nature, Food890.0220.0090.000
Verdict: All three parameters show <2.5% MRR variation across all tested values and all five corpora. The current defaults (chunk_size=512, overlap=50, threshold=0.92) are cross-domain validated. No parameter change is needed for deploying against legal, medical, API reference, or narrative corpora.
Caveat: These corpora are LLM-generated (Claude 3 Haiku). Real-world documents may have more structural variation. The eval queries (108 per corpus) target 55–57 of ~90 docs. Relevance Ward thresholds remain corpus-dependent and should be recalibrated per Threshold Calibration.