sigilant-runner · Phi-3.5-mini-instruct · L4 · vllm · 2 configs

Config | TPS (tok/s) | TPS p95 | TTFT (ms) | TTFT p95 | ITL (ms) | PPL | TPS% | TTFT% | PPL% | Score
BF16_FP16 · ctx:32768 · kv:k8v8 · long  <- best | 57.5 | n/a | 56.2 | n/a | 17.47 | 3.02 | 100.0 | 100.0 | 100.0 | 100
BF16_FP16 · ctx:16384 · kv:k8v8 · default | 49.2 | n/a | 751.7 | n/a | 20.40 | 3.02 | 85.6 | 7.5 | 100.0 | 77

Best config: BF16_FP16 · ctx:32768 · kv:k8v8 · long
Auto baseline compare: score Δ=0.00 TPS Δ=0.00 TTFT Δ=0.0ms PPL Δ=0.00
Confidence: target=medium gap_before=23.00% var_before=n/a replay=False(disabled) gap_after=23.00%
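
The relative columns and the confidence gap appear to follow from the raw numbers in the table. The sketch below reproduces them under stated assumptions: TPS% is taken as each config's TPS divided by the highest TPS, TTFT% and PPL% as the lowest value divided by each config's value (lower is better), and gap_before as the score difference between the top two configs. These formulas are inferred from the reported figures, not taken from sigilant-runner itself. The names `relative_columns` and `score_gap` are hypothetical, and the Score weighting is not reconstructed here because the report does not state it.

```python
# Raw values copied from the report table. Scores are taken as reported,
# since the report does not say how Score is weighted.
configs = {
    "ctx:32768 long": {"tps": 57.5, "ttft": 56.2, "ppl": 3.02, "score": 100},
    "ctx:16384 default": {"tps": 49.2, "ttft": 751.7, "ppl": 3.02, "score": 77},
}


def relative_columns(configs):
    """Normalize each metric against the best observed value (assumed formula)."""
    best_tps = max(c["tps"] for c in configs.values())    # higher is better
    best_ttft = min(c["ttft"] for c in configs.values())  # lower is better
    best_ppl = min(c["ppl"] for c in configs.values())    # lower is better
    result = {}
    for name, c in configs.items():
        result[name] = {
            "tps_pct": round(100 * c["tps"] / best_tps, 1),
            "ttft_pct": round(100 * best_ttft / c["ttft"], 1),
            "ppl_pct": round(100 * best_ppl / c["ppl"], 1),
        }
    return result


def score_gap(configs):
    """Score difference between the best and second-best config (assumed meaning of gap_before)."""
    scores = sorted((c["score"] for c in configs.values()), reverse=True)
    return scores[0] - scores[1]


if __name__ == "__main__":
    for name, cols in relative_columns(configs).items():
        print(name, cols)
    print("gap:", score_gap(configs))
```

With the values above, the default config comes out at 85.6 for TPS% and 7.5 for TTFT%, and the gap is 23, which matches the table and the Confidence line. That agreement supports the assumed formulas but does not confirm them.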

PPL is a quality proxy, not production validation.
Full production safety and long-context certification require Sigilant Optimizer.
