Real-time vLLM server metrics with auto-discovery
Start a vLLM server to see live metrics, or click Demo to preview with simulated data.
The Prometheus /metrics endpoint on the remote vLLM server is not reachable from this machine.
Ensure the endpoint is accessible, or use subprocess/container mode for full observability.
History will appear after the vLLM server has been running for a few seconds.
Latency histograms require request traffic on the vLLM server.