sigilant-runner · Phi-3.5-mini-instruct · A10G · vllm · 2 configs

Config | TPS (tok/s) | TPS p95 | TTFT (ms) | TTFT p95 | ITL (ms) | PPL | TPS% | TTFT% | PPL% | Score
INT8_W8A8 · ctx:32768 · kv:k8v8 · long  <- best | 27.3 | n/a | 5550.5 | n/a | 36.83 | 2.84 | 100.0 | 100.0 | 100.0 | 100
FP16_BASELINE · ctx:32768 · kv:k8v8 · long | 23.0 | n/a | 7291.2 | n/a | 43.59 | 3.01 | 84.2 | 76.1 | 94.4 | 87
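
The TPS%, TTFT% and PPL% columns appear to express each config relative to the best value in its column: higher is better for TPS, lower is better for TTFT and PPL. The sketch below reproduces those three columns from the raw values under that assumption. The function and key names are hypothetical and are not part of sigilant-runner. The Score column is not reproduced, because its weighting is not stated in this report.

```python
# Raw values copied from the table above (TPS in tok/s, TTFT in ms).
configs = {
    "INT8_W8A8": {"tps": 27.3, "ttft_ms": 5550.5, "ppl": 2.84},
    "FP16_BASELINE": {"tps": 23.0, "ttft_ms": 7291.2, "ppl": 3.01},
}


def relative_percentages(configs):
    """Express each metric as a percentage of the best value in its column.

    Assumption: TPS is higher-is-better; TTFT and PPL are lower-is-better.
    """
    best_tps = max(c["tps"] for c in configs.values())
    best_ttft = min(c["ttft_ms"] for c in configs.values())
    best_ppl = min(c["ppl"] for c in configs.values())
    return {
        name: {
            "tps_pct": round(100 * c["tps"] / best_tps, 1),
            "ttft_pct": round(100 * best_ttft / c["ttft_ms"], 1),
            "ppl_pct": round(100 * best_ppl / c["ppl"], 1),
        }
        for name, c in configs.items()
    }


percentages = relative_percentages(configs)
for name, row in percentages.items():
    print(name, row)
```

With the two rows above, this yields 100.0 across the board for INT8_W8A8 and 84.2, 76.1 and 94.4 for FP16_BASELINE, matching the table.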

Best config: INT8_W8A8 · ctx:32768 · kv:k8v8 · long
Auto baseline compare: score Δ=13.00 TPS Δ=4.30 TTFT Δ=-1740.7ms PPL Δ=-0.17
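
The deltas on the line above are consistent with a simple "best minus baseline" subtraction per metric, so negative TTFT and PPL deltas are improvements. A minimal sketch of that arithmetic, with the sign convention inferred from the numbers rather than taken from documentation, and with hypothetical variable names:

```python
# Values copied from the table rows for the best config and the FP16 baseline.
best = {"score": 100, "tps": 27.3, "ttft_ms": 5550.5, "ppl": 2.84}
baseline = {"score": 87, "tps": 23.0, "ttft_ms": 7291.2, "ppl": 3.01}

# Assumed convention: delta = best - baseline, rounded to two decimals.
deltas = {key: round(best[key] - baseline[key], 2) for key in best}

print(deltas)
```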
Confidence: target=medium gap_before=13.00% var_before=n/a replay=False(disabled) gap_after=13.00%

PPL (perplexity, lower is better) is a quality proxy, not production validation.
Full production safety and long-context certification require Sigilant Optimizer.
