sigilant-runner · Phi-3.5-mini-instruct · L4 · vllm · 2 configs

Config | TPS | TPS p95 | TTFT (ms) | TTFT p95 (ms) | ITL | PPL | TPS% | TTFT% | PPL% | Score
FP16 · ctx:16384 · kv:k16v16 · long  <- best | 57.1 | 57.1 | 896.8 | 897.2 | 14.07 | 3.06 | 100.0 | 100.0 | 99.3 | 100
FP16 · ctx:8192 · kv:k16v16 · default | 57.0 | 57.1 | 897.8 | 897.9 | 14.08 | 3.04 | 99.9 | 99.9 | 100.0 | 100
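
The TPS%, TTFT% and PPL% columns look like each metric expressed relative to the best value among the listed configs. The report does not state the formula, so the following is only a minimal sketch under that assumption: higher is treated as better for TPS, lower as better for TTFT and PPL. The helper name `relative_percent` is hypothetical and not part of sigilant-runner. Because it works from the rounded values shown in the table, its results can differ from the reported columns in the last digit.

```python
# Sketch only: assumes each percent column is "value relative to best config".
# relative_percent is a hypothetical helper, not a sigilant-runner function.

def relative_percent(values, higher_is_better):
    """Return each value as a percentage of the best value in the list."""
    best = max(values) if higher_is_better else min(values)
    if higher_is_better:
        return [round(100.0 * v / best, 1) for v in values]
    return [round(100.0 * best / v, 1) for v in values]

# Rounded values copied from the table (ctx:16384 row first, ctx:8192 second).
tps = [57.1, 57.0]
ttft_ms = [896.8, 897.8]
ppl = [3.06, 3.04]

tps_pct = relative_percent(tps, higher_is_better=True)
ttft_pct = relative_percent(ttft_ms, higher_is_better=False)
ppl_pct = relative_percent(ppl, higher_is_better=False)

print("TPS%: ", tps_pct)
print("TTFT%:", ttft_pct)
print("PPL%: ", ppl_pct)
```

Under this assumption the TTFT% and PPL% values computed from the table agree with the reported columns. How the Score column combines the three percentages is not shown in the report, so it is left out of the sketch.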

Best config: FP16 · ctx:16384 · kv:k16v16 · long
Auto baseline compare: score Δ=0.00 TPS Δ=0.00 TTFT Δ=0.0ms PPL Δ=0.00
                     TPS p95 Δ=0.00 TTFT p95 Δ=0.0ms
Confidence: target=medium gap_before=0.00% var_before=0.06% replay=False(disabled) gap_after=0.00%

PPL is a quality proxy, not production validation.
Full production safety and long-context certification require Sigilant Optimizer.
