{# T55 admin evals index — terminal-brutalist re-skin. #}
{# Lists scans with at least one llm_evals row. #}
{% extends "admin_layout.html" %}
{% block page_title %}admin :: evals{% endblock %}
{% block breadcrumb %} admin/ evals {% endblock %}
{% block content %}
Fixture-based regression evals: an eval campaign runs one prompt against a fixture set across every current model, so you can compare cost, quality, and drift over time. Each row below is one scan that already has llm_evals results.
Use this page for prompt-regression questions that span many scans.
For a one-off model comparison on a single scan, you no longer need the eval matrix. Open the scan's trace, click into any LLM step, and use the quick-compare buttons in the per-step Explorer.
→ browse scans
| scan | target | owner | evals | total cost | last eval | action |
|---|---|---|---|---|---|---|
| {{ row.scan_id_short }} | | | {{ row.eval_count }} | ${{ '%.4f'|format(row.total_cost) }} | {{ row.last_eval_at }} | open scan |