{% extends "base.html" %} {% from "components/help_macros.html" import tooltip, help_panel, help_step, help_tip, privacy_note %} {% set active_page = 'metrics' %} {% block title %}Cost Analytics - Deep Research System{% endblock %} {% block extra_head %} {% endblock %} {% block content %}

Cost Analytics (Experimental)

Back to Metrics
Time Period:
{% call help_panel('cost-how', 'Understanding Cost Estimates', icon='dollar-sign', collapsed=true, dismissible=true) %}
{{ help_step(1, "Estimation Only", "Costs are calculated from token counts and public pricing. Refer to your provider's billing for exact charges.") }} {{ help_step(2, "Local = Free", "Models running locally (Ollama, LM Studio) show $0.00 - they use your hardware, not API calls.") }} {{ help_step(3, "Input vs Output", "Input tokens = your query + context. Output tokens = AI response. Output tokens typically cost 2-4x more.") }}
{{ help_tip("Save money: Use local models for exploratory research, and cloud models for final reports needing maximum quality.") }} {% endcall %}

Loading Cost Analytics...

Calculating costs and generating insights

Understanding Cost Analytics

What We Track

  • Token Usage: Input and output tokens for each LLM call
  • Provider Information: Which provider (OpenAI, Ollama, etc.) handled each call
  • Model Pricing: Current market rates for each LLM provider
  • Research Sessions: Individual costs per research query
  • Local Model Savings: Estimated savings from using free local models
  • Provider Breakdown: Usage distribution across different providers
  • Time-based Analysis: Cost trends over different periods
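
The fields above suggest a per-call usage record along these lines. This is a hypothetical sketch; the names and types are illustrative, not the application's actual schema.

```python
# Hypothetical shape of a tracked LLM usage record; field names are
# illustrative, not this application's actual schema.

from dataclasses import dataclass
from datetime import datetime

@dataclass
class LLMUsageRecord:
    session_id: str      # ties the call to a research session
    provider: str        # e.g. "openai", "ollama"
    model: str           # e.g. "gpt-4o", "llama3"
    input_tokens: int    # prompt + context tokens
    output_tokens: int   # completion (response) tokens
    timestamp: datetime  # enables time-based cost trends
```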

How Costs Are Calculated

  • Input Tokens: Charged at prompt pricing rate (usually lower)
  • Output Tokens: Charged at completion pricing rate (usually higher)
  • Total Cost: (Input tokens ÷ 1000) × Input rate + (Output tokens ÷ 1000) × Output rate (see the sketch after this list)
  • Provider-Aware Pricing: Uses both model name and provider for accurate pricing
  • Local Models: Ollama, LM Studio, and self-hosted models are free ($0.00)
  • Savings Calculation: Estimates money saved by comparing local usage to commercial equivalents
  • Currency Format: Displays more decimal places for small amounts, so sub-cent costs remain visible
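
To make the formula concrete, here is a minimal Python sketch of a provider-aware cost calculation. The pricing table, rates, and function names are illustrative assumptions, not this application's actual code or current provider pricing.

```python
# Illustrative sketch of the cost formula above. The (provider, model) keyed
# table shows why provider-aware lookup matters; all rates are examples only.

PRICING = {
    # (provider, model) -> (input rate, output rate), USD per 1K tokens
    ("openai", "gpt-4o"): (0.005, 0.015),              # example rates only
    ("anthropic", "claude-3-haiku"): (0.00025, 0.00125),
    ("ollama", "llama3"): (0.0, 0.0),                  # local models are free
}

def estimate_cost(provider: str, model: str,
                  input_tokens: int, output_tokens: int) -> float:
    """(input tokens / 1000) * input rate + (output tokens / 1000) * output rate."""
    input_rate, output_rate = PRICING.get((provider, model), (0.0, 0.0))
    return (input_tokens / 1000) * input_rate + (output_tokens / 1000) * output_rate

def format_currency(amount: float) -> str:
    """Show extra decimal places for sub-cent amounts, as described above."""
    return f"${amount:.6f}" if amount < 0.01 else f"${amount:.2f}"

# Example: estimate_cost("openai", "gpt-4o", 1200, 800) -> 0.018 -> "$0.02"
```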

Pricing Sources

  • OpenAI: GPT-4, GPT-4 Turbo, GPT-4o, GPT-3.5 Turbo pricing
  • Anthropic: Claude-3 Opus, Sonnet, Haiku, Claude-3.5 pricing
  • Google: Gemini Pro, Gemini 1.5 Pro/Flash pricing
  • Local/Open Source: Ollama, LM Studio, LlamaCpp (all free)
  • Updates: Pricing data is updated regularly from official sources

Cost Optimization Strategies

  • Model Selection: Use smaller models for simple tasks
  • Research Mode: Choose "Quick" mode for basic queries
  • Local Models: Consider Ollama for privacy and zero cost
  • Batch Queries: Combine related questions in one session
  • Monitor Trends: Track spending patterns to identify opportunities

Understanding the Data

  • Estimates Only: Costs are estimates based on current pricing
  • Real Usage: Data reflects your actual token consumption
  • Time Filtering: All metrics update based on selected time period
  • Research Links: Click session IDs to view detailed results
  • Model Efficiency: Compare cost per token across models to see which delivers results most economically

Local Model Pricing Estimates

  • Zero API Costs: Local models (Ollama, LM Studio) have no API fees
  • Savings Calculation: Compares your usage against equivalent commercial models
  • Baseline Pricing: Uses conservative estimates (GPT-3.5/Claude Haiku rates)
  • Rough Estimates: Actual commercial costs could be higher or lower
  • Quality Differences: Local vs commercial models have different capabilities
  • Hidden Costs: Does not include hardware, electricity, or setup time

Privacy & Data

  • Local Storage: All cost data is stored locally on your device
  • No Sharing: Cost information is never shared or transmitted
  • Actual Charges: These figures are estimates; check your provider bills for exact amounts
  • Accuracy: Pricing data is kept up-to-date but may vary
  • Control: You have full control over your usage and spending

Important: These cost estimates are based on current public pricing from LLM providers and your actual token usage. Actual costs may vary due to pricing changes, promotional rates, or billing adjustments. Always refer to your official provider bills for exact charges. Local models (Ollama, self-hosted) show $0.00 as they do not incur API costs.

Local Model Savings Disclaimer: The "Local Models Savings" calculation is a rough estimate comparing your local model usage to hypothetical commercial API costs. This estimate uses conservative baseline pricing (~$0.0015 per 1K tokens) and should be viewed as an approximation only; a sketch of the calculation follows below.
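
As a rough illustration, here is a minimal Python sketch of this estimate. The function name and constant are hypothetical; only the ~$0.0015 per 1K tokens baseline comes from the text above, and this is not the application's actual code.

```python
# Hypothetical sketch of the savings estimate described above: local token
# usage is priced at a conservative commercial baseline rate. Names are
# illustrative; only the ~$0.0015/1K figure comes from the disclaimer text.

BASELINE_RATE_PER_1K = 0.0015  # USD per 1K tokens (GPT-3.5/Claude Haiku-class)

def estimate_local_savings(local_tokens: int) -> float:
    """Approximate what the same token volume might have cost on a commercial API."""
    return (local_tokens / 1000) * BASELINE_RATE_PER_1K

# Example: 2,000,000 tokens through Ollama
# estimate_local_savings(2_000_000) == 3.0  -> roughly $3.00 of avoided API spend
```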

Important Caveats:

  • Quality Differences: Commercial models may differ in output quality, accuracy, and capabilities
  • Hidden Costs: Local models require hardware investment, electricity, maintenance, and setup time
  • Performance Variations: Speed, reliability, and availability differ between local and commercial models
  • Scale Considerations: Commercial APIs may be more cost-effective at very high or very low usage volumes
  • Feature Differences: Commercial services often include additional features, support, and guarantees

Use these estimates as a general guide only. The true value of local models includes privacy, control, and independence benefits beyond just cost savings.

{% endblock %}