{% extends "admin/base.html" %}
{% block title %}Model routing — cmdop admin{% endblock %}
{% block head %}
{# Spend tab pulls Chart.js + cockpit JS just like the legacy spend page. #}
{% endblock %}
{% block content %}
{# Plan 61/2b W2 — consolidated LLM Gateway page.

   The four sibling pages (keys / models / aliases / spend) used to be four
   top-level routes; sidebar + cmd-K both showed them separately. A1 IA audit
   flagged this as the #1 LLM-area discoverability gap: the four pages serve a
   single mental model ("when an agent asks for a model, who gets the bill,
   which key is used, and which fallback fires?") and reading them needed
   three sidebar clicks.

   New shape: one route `/admin/llm`, four tabs. URL preserves the tab so
   bookmarks + deep links still work (`?tab=models` etc.); the old routes
   redirect with 301. #}
{% from 'admin/components/ui.html' import card, page_header, tooltip, collapsible_section %}
{# Spend leads: it's the everyday monitoring view. Provider keys / models /
   aliases remain one click away. #}
{% set active_tab = active_tab|default('spend') %}
{% set _kebab = [
  {'label': 'Add LLM key (wizard)', 'href': '/admin/wizards/add-llm-key', 'icon': 'wand-2'},
  {'label': 'gRPC monitoring', 'href': '/admin/monitoring', 'icon': 'activity'},
  {'label': 'Help — Model routing', 'href': '/admin/help/architecture', 'icon': 'book-open', 'target': '_blank'},
] %}
1. Look up alias
The agent asks for an alias such as alias:chat or a raw model name such as gpt-4o. An alias resolves to a tier plus an ordered fallback list.
2. Pick provider
The resolver walks the alias's fallback list in order, then falls back to the cheapest model in the tier. The first model whose provider has an active key wins.
3. Check key
The provider key is decrypted and base_url is applied. Rows marked corrupt are skipped; the default key is preferred if multiple exist.
4. Dispatch & bill
The call goes out; on response, the per-fleet spend counter is incremented with INCRBY and a row is written to llm_spend_record.