{% extends "admin/base.html" %} {% block title %}Model routing — cmdop admin{% endblock %} {% block head %} {# Spend tab pulls Chart.js + cockpit JS just like the legacy spend page. #} {% endblock %} {% block content %} {# Plan 61/2b W2 — consolidated LLM Gateway page. The four sibling pages (keys / models / aliases / spend) used to be four top-level routes; sidebar + cmd-K both showed them separately. A1 IA audit flagged this as the #1 LLM-area discoverability gap: the four pages serve a single mental model ("when an agent asks for a model, who gets the bill, which key is used, and which fallback fires?") and reading them needed three sidebar clicks. New shape: one route `/admin/llm`, four tabs. URL preserves the tab so bookmarks + deep links still work (`?tab=models` etc.); the old routes redirect with 301. #} {% from 'admin/components/ui.html' import card, page_header, tooltip, collapsible_section %} {# Spend leads: it's the everyday monitoring view. Provider keys / models / aliases remain one click away. #} {% set active_tab = active_tab|default('spend') %} {% set _kebab = [ {'label': 'Add LLM key (wizard)', 'href': '/admin/wizards/add-llm-key', 'icon': 'wand-2'}, {'label': 'gRPC monitoring', 'href': '/admin/monitoring', 'icon': 'activity'}, {'label': 'Help — Model routing', 'href': '/admin/help/architecture', 'icon': 'book-open', 'target': '_blank'}, ] %}
{{ page_header( title='Model routing', subtitle='Spend · provider keys · models · aliases', eyebrow='LLM Gateway', help_anchor='architecture', help_label='Open the LLM Gateway docs', kebab_items=_kebab, actions=[ {'label': 'Add LLM key', 'kind': 'primary', 'icon': 'plus', 'href': '/admin/wizards/add-llm-key'}, ] ) }} {# ────────────────── Resolver-trace card (always visible) ─────────────── #} {# A1 §3 — the four-tab page begs the question "what _actually_ happens when an agent asks for a model?". The trace card is the legend. Static copy for now; a future plan can wire it to a live "trace this request" drawer. #} {# Decluttered: the resolver explainer is reference material — collapsed by default so the keys/models/spend tabs lead. Open state persists. #} {% call collapsible_section('Resolver trace', key='llm.resolver-trace', subtitle='what happens when an agent asks for a model', icon='route') %}
  1. Look up alias

    Agent asks for alias:chat or a raw gpt-4o. Aliases resolve to a tier + ordered fallback list.

  2. Pick provider

    The resolver walks the alias's fallback list in order, then falls back to the cheapest model in the tier. The first model whose provider has an active key wins.

  3. Check key

    The provider key is decrypted and its base_url is applied. corrupt rows are skipped; the default key is preferred if several exist.

  4. Dispatch & bill

    Call goes out; on response the per-fleet spend counter is INCRBY'd and a row lands in llm_spend_record.
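
{# Editor's sketch of steps 1-3 of the resolver trace above, in Python. It is illustrative only and is not the gateway's actual code: the `Model`, `Alias`, and `ProviderKey` types, the `cost` field, the "active" / "corrupt" status strings, and the `resolve` / `pick_key` helpers are all hypothetical names chosen for this example. It also assumes a raw model name resolves to that model alone. Dispatch and billing (step 4) are left out.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    provider: str
    tier: str
    cost: float  # hypothetical relative cost, used for cheapest-in-tier ordering


@dataclass
class Alias:
    tier: str
    fallback: list[str]  # ordered model names, tried first


@dataclass
class ProviderKey:
    provider: str
    status: str = "active"  # assumed values: "active" or "corrupt"
    is_default: bool = False


def pick_key(keys, provider):
    """Step 3: skip unusable rows, prefer the default key if several exist."""
    usable = [k for k in keys if k.provider == provider and k.status == "active"]
    usable.sort(key=lambda k: not k.is_default)  # default keys sort first
    return usable[0] if usable else None


def resolve(requested, aliases, models, keys):
    """Steps 1-2: expand the request into candidates, return the first usable one."""
    by_name = {m.name: m for m in models}
    if requested.startswith("alias:"):
        alias = aliases[requested.removeprefix("alias:")]
        ordered = [by_name[n] for n in alias.fallback if n in by_name]
        # After the explicit fallback list, try the rest of the tier, cheapest first.
        rest = sorted(
            (m for m in models if m.tier == alias.tier and m.name not in alias.fallback),
            key=lambda m: m.cost,
        )
        candidates = ordered + rest
    else:
        candidates = [by_name[requested]] if requested in by_name else []

    for model in candidates:
        key = pick_key(keys, model.provider)
        if key is not None:
            return model, key
    return None  # no candidate has a usable key
```

With a fallback list of `gpt-4o` then a second model, and only a corrupt key on file for the first model's provider, `resolve("alias:chat", ...)` would skip `gpt-4o` and return the second model together with that provider's default key. #}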

{% endcall %} {# ─────────────────────────────── Tabs ───────────────────────────────── #}
{# ──────────────────────────── Tab panels ─────────────────────────────── #}
{% include 'admin/components/llm/_tab_keys.html' %}
{% include 'admin/components/llm/_tab_models.html' %}
{% include 'admin/components/llm/_tab_aliases.html' %}
{% include 'admin/components/llm/_tab_spend.html' %}
{% endblock %}