GPU Util · % utilisation
GPU Temp · °C, die
Unified memory · GB of 128, 8 GB guard
Throughput · tok/s
TTFT · ms to first token
Throughput and TTFT are read from the active lane.
Active lane · idle, no warm brain

resident brain idle · waiting for the sidecar

Artifacts · 22 manifests in roster
Articles · 67 published deep-dives
Benches · 4 cached evidence sources
Runs scored · 78 (bench + live)
Envelope · 128 GB unified, 8 GB guard
Top runs · last cut M6 mirror
#1 · cyber · hermes-vertical-router-on-spark:vertical_router · 1 run · 100.0%
#2 · finance · hermes-vertical-router-on-spark:vertical_router · 1 run · 100.0%
#3 · medical · hermes-vertical-router-on-spark:vertical_router · 1 run · 100.0%
#4 · frontier-only · hermes-cost-routing-local-and-openrouter:cost_router · 1 run · 100.0%
#5 · cost-routed · hermes-cost-routing-local-and-openrouter:cost_router · 1 run · 91.7%
#6 · qwen3-30b-moe-llamacpp-q4km · picking-the-hermes-brain-on-spark:hermes_brain · 1 run · 90.0% · 84 tok/s
#7 · 4b-sft-v0.2::curveball-v0.1 · the-refusal-floor-is-trainable:advisor_contract · 1 run · 90.0% · 42 tok/s
#8 · qwen3-30b-moe-vllm-fp8 · picking-the-hermes-brain-on-spark:hermes_brain · 1 run · 87.5% · 55 tok/s
Resident lane configured
This run vs the bar live
What is Spark Arena?

Spark Arena is the operator-driven alternative to public cloud model arenas. It provides private eval leaderboards; efficiency as a metric (quality alongside tok/s, peak unified memory, TTFT, and $/M); a closed eval → fine-tune → re-rank loop; tool-call replay; custom rubrics; and a cost-per-quality Pareto frontier anchored to the hardware the votes ran on. Here, the operator is the hardware.
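
To make the Pareto-frontier idea concrete, here is a minimal sketch, not Spark Arena's actual implementation. It uses the three leaderboard rows above that report both a score and a throughput, and it treats tok/s as a stand-in for the cost axis, since the leaderboard shows no $/M figures. The names `Run` and `pareto_frontier` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    name: str
    quality: float  # leaderboard score, in percent
    tok_s: float    # throughput in tokens per second (assumed proxy for cost)


def dominates(a: Run, b: Run) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    return (
        a.quality >= b.quality
        and a.tok_s >= b.tok_s
        and (a.quality > b.quality or a.tok_s > b.tok_s)
    )


def pareto_frontier(runs: list[Run]) -> list[Run]:
    """Keep every run that no other run dominates."""
    return [r for r in runs if not any(dominates(o, r) for o in runs)]


# Rows #6 to #8 from the leaderboard above.
runs = [
    Run("qwen3-30b-moe-llamacpp-q4km", 90.0, 84.0),
    Run("4b-sft-v0.2::curveball-v0.1", 90.0, 42.0),
    Run("qwen3-30b-moe-vllm-fp8", 87.5, 55.0),
]

frontier = pareto_frontier(runs)
print([r.name for r in frontier])
```

On these three rows the frontier contains only `qwen3-30b-moe-llamacpp-q4km`, because it matches or beats the other two on both score and throughput. With a real cost figure, the second axis would be minimised rather than maximised, and the comparison in `dominates` would flip accordingly.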