Active Profile
Determines which per-mode dashboard panel is shown and which cost-attribution bucket requests land in.
Vault Injection
Controls whether TokenPak injects relevant vault context blocks into each request, and sets the retrieval parameters used to select them.
Budget Enforcement
Enforces a monthly token budget. Requests that would exceed the limit are rejected with an HTTP 429 response.
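The budget check described above can be sketched roughly as follows. This is a minimal illustration, not TokenPak's actual implementation: the names `BudgetDecision` and `check_budget` are hypothetical, and it assumes a request that lands exactly on the limit is still allowed.

```python
from dataclasses import dataclass


@dataclass
class BudgetDecision:
    status: int    # HTTP status the proxy would return (hypothetical shape)
    allowed: bool  # True if the request may proceed


def check_budget(used_tokens: int, request_tokens: int, monthly_limit: int) -> BudgetDecision:
    """Reject a request whose tokens would push usage past the monthly limit."""
    # Assumption: reaching the limit exactly is allowed; only exceeding it is rejected.
    if used_tokens + request_tokens > monthly_limit:
        return BudgetDecision(status=429, allowed=False)
    return BudgetDecision(status=200, allowed=True)
```

For example, with 900 of 1,000 tokens already used, a 200-token request would be rejected with 429 under this sketch, while a 100-token request would pass.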
Cache Invalidation Alerts
Fires a webhook or Slack notification when the cache hit rate drops below the configured threshold.
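As a rough sketch of the alert condition, assuming the hit rate is computed as hits divided by total lookups: the function names below are hypothetical, and treating an empty window as "no alert" is an assumption rather than documented behavior.

```python
def cache_hit_rate(hits: int, misses: int) -> float:
    """Fraction of cache lookups that were hits."""
    total = hits + misses
    # Assumption: with no traffic there is nothing to alert on, so report 1.0.
    return hits / total if total else 1.0


def should_alert(hits: int, misses: int, threshold: float) -> bool:
    """True when the hit rate has dropped strictly below the threshold."""
    return cache_hit_rate(hits, misses) < threshold
```

Under these assumptions, a window with 40 hits and 60 misses triggers an alert at a threshold of 0.5, while a window sitting exactly at the threshold does not.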
Provider Routing
The failover chain is configured in ~/.tokenpak/config.yaml under failover.chain.
Edit that file directly to change provider order, model maps, or credential env vars.
View current usage →
Local-First Routing
Routes requests to a local Ollama instance first; falls back to cloud providers on error.
{{ ollama_upstream }} by default.
If Ollama is not running, requests fall back to the cloud provider chain, but latency will spike.
Only enable this if you have Ollama running locally.
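The local-first behavior can be pictured with a sketch like the one below. It is illustrative only: `route_local_first` and the `Handler` type are hypothetical names, providers are modeled as plain callables, and treating any exception as a reason to fall back is an assumption.

```python
from typing import Callable, Optional, Sequence

# Hypothetical provider shape: takes a prompt, returns a completion string.
Handler = Callable[[str], str]


def route_local_first(prompt: str, local: Handler, cloud_chain: Sequence[Handler]) -> str:
    """Try the local handler first; on error, walk the cloud chain in order."""
    try:
        return local(prompt)
    except Exception:
        # Assumption: any local error (e.g. connection refused) triggers fallback.
        pass

    last_error: Optional[Exception] = None
    for provider in cloud_chain:
        try:
            return provider(prompt)
        except Exception as exc:
            last_error = exc  # remember the failure and try the next provider

    raise RuntimeError("all providers failed") from last_error
```

In this sketch the failed local attempt always happens before any cloud call, which is why leaving the option enabled without a running Ollama instance adds latency to every request.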
Compliance Routing
Pin all requests to a specific provider endpoint for compliance or data-residency requirements.
{% if compliance_provider %}