Simulation Dashboard
Concurrency:
Prefill TPS:
Decode TPS:
Run
Total KV Cache (GB):
KV Bytes/Token:
Block Size (tokens):
Cache Hit Rate Over Time
Active Requests Over Time
Input Tokens In-Flight
KV Cache Memory Pressure
Unique KV Cache Blocks
Total Evicted Blocks (cumulative)
Eviction Misses by Layer (cumulative re-fetched blocks)
Session Gantt (by concurrency slot)
Wait / Delay
Prefill
Decode
Session boundary
Restart continuation