Open source · Self-hosted · v0.1.0

AI infrastructure costs,
under control.

Full observability for every AI API call — track cost, latency, and token usage across all your providers in real time. Instrument with one line of code, get instant visibility into your AI infrastructure, and scale up to enterprise-grade routing, budget enforcement, and governance when you need it. Free, open source, and fully self-hosted.

stacksense · live
Today's AI spend
$3.84
Total tokens today
1.24M
Avg latency (p95)
312ms
Providers active
4
Cost this month
$47.20
Budget remaining
82%
# one line to start
client = ss.monitor(client)
# that's it.
🟢 Open source · No vendor lock-in
⚡ <50ms instrumentation overhead · Negligible performance impact
🔌 4 providers live · OpenAI, Anthropic, ElevenLabs, Pinecone
🐋 Docker Compose included · Deploy in minutes
🔐 Self-hosted, your data · Full control
📊 Real-time metrics · Live dashboard updates

Visibility first. Optimization when you need it.

Start with complete observability for free. Add enterprise-grade routing, budget enforcement, and governance as you scale.

Open Source
Visibility Layer
Complete observability, free and self-hosted. The foundation every AI team needs.
  • Unified dashboard
    Real-time cost, latency, and token metrics across all providers. Drill down by model, timeframe, or environment with interactive charts and tables.
  • Zero-code instrumentation
    Wrap any client in one line, no SDK changes required. Automatic capture of all metrics including errors, retries, and streaming responses.
  • Multi-provider support
    OpenAI (GPT-4, GPT-3.5), Anthropic (Claude), ElevenLabs (voice), Pinecone (vectors). Unified interface across all providers with automatic cost calculation.
  • SQLite + PostgreSQL
    Local dev with SQLite for zero-config testing. Production-ready PostgreSQL support with automatic schema migrations and connection pooling.
  • Google OAuth + key vault
    Encrypted API key storage with AES-256. Per-user accounts with Google OAuth, role-based access control, and audit logging. Fully self-hosted.
  • Docker + Kubernetes
    Production Docker Compose setup included. Kubernetes Helm charts in beta with horizontal scaling, health checks, and Prometheus metrics export.
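To make the "wrap any client in one line" idea concrete, here is a minimal sketch of how such a wrapper *could* work — a proxy that intercepts method calls and records latency. This is illustrative only, not StackSense's actual implementation; `MonitorProxy` and the `records` list are hypothetical names invented for this example.

```python
import time
from functools import wraps

class MonitorProxy:
    """Illustrative sketch (not StackSense internals): proxy an API
    client so every method call is timed and recorded."""

    def __init__(self, client, records):
        self._client = client
        self._records = records

    def __getattr__(self, name):
        attr = getattr(self._client, name)
        if not callable(attr):
            # Recurse into nested namespaces like client.chat.completions.
            # (Simplification: non-callable leaf values also get proxied.)
            return MonitorProxy(attr, self._records)

        @wraps(attr)
        def timed(*args, **kwargs):
            start = time.perf_counter()
            result = attr(*args, **kwargs)
            self._records.append({
                "method": name,
                "latency_ms": (time.perf_counter() - start) * 1000,
            })
            return result

        return timed

def monitor(client, records=None):
    """Wrap any client object; calls pass through unchanged but get timed."""
    return MonitorProxy(client, records if records is not None else [])
```

The wrapped client behaves exactly like the original — callers notice nothing — which is why a one-line `client = ss.monitor(client)` can capture metrics without SDK changes.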
Enterprise
Optimization Engine
Active cost control — route, enforce, govern. The CFO layer for your AI infrastructure.
  • Dynamic model routing
    Automatically route requests to the most cost-effective model based on prompt complexity, required quality tier, and latency SLAs. Save 40-60% on AI costs without sacrificing performance.
  • Budget circuit breakers
    Set per-team, per-feature, or global spending limits. Auto-downgrade to cheaper models or rate-limit when budgets are hit. Zero service disruption, full spend control.
  • Token waste detection
    ML-powered prompt scoring identifies inefficient patterns, redundant context, and bloated system prompts. Get specific recommendations to reduce token usage by 20-50%.
  • Cross-vendor arbitrage
    Real-time price monitoring across OpenAI, Anthropic, and others. Automatically shift traffic to maximize quality-per-dollar. Handle provider outages with instant failover.
  • Governance + audit logs
    Tamper-evident blockchain-style audit trail. Model allowlists, PII detection with automatic redaction, data residency enforcement. Full SOC 2 compliance support.
  • Agent tracking
    Track multi-step agentic workflows end-to-end. Total cost per run, infinite loop detection, per-task token budgets. Full step-by-step audit trail with tool usage breakdown.
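A budget circuit breaker with auto-downgrade can be sketched in a few lines. This is a conceptual illustration only — the class name, model names, and thresholds are assumptions for the example, not StackSense's API.

```python
class BudgetBreaker:
    """Illustrative circuit breaker: track spend against a limit and
    downgrade the model tier once the limit is hit, instead of failing."""

    def __init__(self, limit_usd, primary="gpt-4", fallback="gpt-3.5-turbo"):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0
        self.primary = primary
        self.fallback = fallback

    def record(self, cost_usd):
        """Accumulate the cost of a completed call."""
        self.spent_usd += cost_usd

    def choose_model(self):
        # Requests keep flowing with zero service disruption --
        # they just land on the cheaper tier once the budget is spent.
        if self.spent_usd < self.limit_usd:
            return self.primary
        return self.fallback
```

Scoping one breaker per team or per feature, rather than globally, is what makes "per-team, per-feature, or global limits" a composition of the same primitive.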
The philosophy

Start with visibility — free, forever. Add optimization when your AI spend is worth controlling. Two tiers, one clean upgrade path.

Everything you need out of the box.

StackSense gives you complete control over your AI infrastructure from day one. No partial solutions, no "coming soon" for critical features.

📊
Real-Time Cost Tracking
Automatic cost calculation per model and provider. Live updates every second with sub-cent precision. Historical trends, cost breakdowns by feature, and instant alerts when spending patterns change.
⚡
Latency & Performance Monitoring
Track p50, p95, p99 latencies across all requests. Automatic slow query detection, streaming vs batch performance analysis, and provider comparison dashboards to identify the fastest endpoints.
🔢
Token Usage Analytics
Granular input/output token breakdown. Track token efficiency over time, identify expensive patterns, and get automatic recommendations for prompt optimization and context reduction.
🔍
Error & Retry Analysis
Comprehensive error tracking with automatic categorization. Monitor rate limits, failed requests, and timeout patterns. Intelligent retry analysis helps identify wasteful retry loops costing you money.
📈
Multi-Timeframe Views
1H, 24H, 7D, and 30D views with intelligent data aggregation. Compare across periods, spot trends early, and understand seasonal usage patterns. Custom date ranges for detailed analysis.
🌍
Multi-Environment Support
Separate tracking for development, staging, and production. Environment-specific budgets, isolated metrics, and the ability to test cost impact before deploying to production.
🔐
Encrypted Key Management
AES-256 encrypted storage for all API keys. Per-user key vaults with role-based access control. Automatic key rotation reminders and audit logs of who accessed what, when.
📡
Live Monitoring Dashboard
WebSocket-powered real-time updates with zero polling overhead. Live alerts for budget thresholds, error spikes, and latency degradation. System health checks with Prometheus metrics export.
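The core of per-call cost tracking is a price-table lookup: tokens in, dollars out. A minimal sketch, assuming a hand-written price table — the per-million-token prices below are illustrative only (real provider prices change often), and `call_cost` is a hypothetical helper, not StackSense's API.

```python
# Illustrative per-1M-token prices in USD -- NOT current provider pricing.
PRICES = {
    "gpt-4": {"input": 30.00, "output": 60.00},
    "claude-3-haiku": {"input": 0.25, "output": 1.25},
}

def call_cost(model, input_tokens, output_tokens):
    """USD cost of one call, from token counts and the price table."""
    p = PRICES[model]
    return (input_tokens * p["input"]
            + output_tokens * p["output"]) / 1_000_000
```

Summing `call_cost` over every tracked request, grouped by model, feature, or environment, yields the live spend figures and breakdowns shown on the dashboard.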

Plug into your entire stack.

🤖
OpenAI
Live
🧠
Anthropic
Live
🔊
ElevenLabs
Live
🌲
Pinecone
Live
☁️
AWS Bedrock
Building
🧬
Google AI
Building
☸️
Kubernetes
Beta
📊
Grafana
Planned
📈
Datadog
Planned
🐋
Docker
Live
🗄️
PostgreSQL
Live
🪶
SQLite
Live

From zero to full visibility in 2 minutes.

Three simple steps to instrument your entire AI stack. No configuration files, no complex setup, just wrap and go.

STEP 1 · INSTALL
# Install StackSense
pip install stacksense

# Or with Docker
docker compose up -d
PyPI package or Docker Compose. Zero dependencies, works with Python 3.8+.
STEP 2 · WRAP CLIENT
# Wrap any AI client
import stacksense as ss

client = OpenAI()
client = ss.monitor(client)

# That's it!
One line wraps your client. All methods automatically tracked with zero code changes.
STEP 3 · OPEN DASHBOARD
# Start dashboard
make dashboard

# Open browser
http://localhost:5000

# See all metrics live!
Dashboard auto-starts on port 5000. Real-time metrics, zero configuration required.
Full example
from openai import OpenAI
import stacksense as ss

# Initialize and wrap your client
client = ss.monitor(OpenAI())

# Use normally - every call is automatically tracked
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

# ✅ Cost, latency, tokens → tracked automatically
# ✅ Visible in dashboard at http://localhost:5000
# ✅ No configuration needed

Built for teams shipping AI products.

From indie hackers to enterprise engineering teams, StackSense helps you understand and control AI costs from day one.

🚀
Startups & Indie Hackers
Track every dollar spent on AI from day one. Know exactly which features cost what. Catch runaway costs before they hit your runway. Free tier means zero overhead.
🏢
Enterprise Teams
Multi-team visibility with per-project budgets. Compliance-ready audit trails. Dynamic routing saves 40-60% on AI spend. SOC 2 ready governance engine included.
🤖
AI Agent Developers
Track entire agentic workflows end-to-end. Catch infinite loops before they drain your budget. Per-task token limits. Full audit trail of every tool call and decision.
📊
Data Science Teams
Compare model performance against cost in real time. A/B test prompts with instant ROI analysis. Track embeddings, vector searches, and inference costs in one place.
💼
FinOps & Platform Teams
Centralized AI cost management across all teams. Chargeback by department. Budget forecasting with trend analysis. Prometheus metrics for existing monitoring stack.
🎓
Researchers & Academics
Track research experiment costs. Compare LLM providers for your specific workload. Self-hosted means your data stays private. Open source means full control and customization.

From observability to active control.

Six capabilities that transform StackSense from a dashboard into the economic layer of your AI infrastructure.

Dynamic Model Routing
Route each prompt to the right model based on task complexity, cost thresholds, and latency requirements. Automatic fallback when quality permits.
Routing
🔬
Token Waste Detection
Score prompts for efficiency. Detect redundant context, bloated system prompts, and high retry rates. Get specific recommendations to cut spend.
Cost Intelligence
🛑
Budget Circuit Breakers
Per-team, per-feature, or global limits. When hit, StackSense auto-downgrades model tiers or rate-limits requests with no service disruption.
Budget Enforcement
⚖️
Cross-Vendor Arbitrage
Monitor real-time pricing and latency across providers. Shift traffic automatically to maximize value without degrading user experience.
Multi-Provider
📋
Governance & Audit
Tamper-evident logs of every AI call. Model allowlists, PII detection, data residency enforcement, compliance reporting for SOC 2.
Governance
🤖
Agent Tracking
Track agentic workflows end-to-end. Total cost per run, loop detection, task-level token budgets, full step-by-step audit trail.
Agent Intelligence
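In its simplest form, complexity- and SLA-aware routing is a cascade of checks before each call. The sketch below is a toy heuristic to show the shape of the decision, not StackSense's actual routing policy — the model names, the 500 ms SLA cutoff, and the 400-word threshold are all assumptions made up for this example.

```python
def route_model(prompt, latency_sla_ms=None):
    """Illustrative routing heuristic: tight latency SLAs pin the fast
    tier; long prompts (a crude complexity proxy) get the strong model;
    everything else defaults to the cheap tier."""
    approx_tokens = len(prompt.split())  # rough stand-in for a tokenizer
    if latency_sla_ms is not None and latency_sla_ms < 500:
        return "gpt-3.5-turbo"  # SLA wins: fastest tier regardless of size
    if approx_tokens > 400:
        return "gpt-4"          # long context -> stronger model
    return "gpt-3.5-turbo"      # default: cheapest adequate tier
```

A production router would score prompts with a learned model and consult live price and latency data rather than fixed thresholds, but the decision order — SLA first, then quality tier, then cost — is the essence of the technique.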

OSS vs Enterprise

| Feature | 🟢 Open Source | 🔵 Enterprise |
|---|---|---|
| Cost & token tracking | ✓ | ✓ |
| Latency & error monitoring | ✓ | ✓ |
| Multi-provider dashboard | ✓ | ✓ |
| SQLite + PostgreSQL | ✓ | ✓ |
| Docker / self-hosted | ✓ | ✓ |
| Kubernetes integration | Beta | ✓ Full |
| Google OAuth + accounts | ✓ | ✓ + SSO/SAML |
| Dynamic model routing | | ✓ |
| Budget circuit breakers | | ✓ |
| Token waste detection | | ✓ |
| Cross-vendor arbitrage | | ✓ |
| SLA-aware routing | | ✓ |
| Agent workflow tracking | | ✓ |
| Enterprise policy engine | | ✓ |
| Audit logs & governance | | ✓ |
| AI unit economics | | ✓ |
| Support | Community | Enterprise SLA |
| License | Open Source | Proprietary |
| Price | Free | Custom |

Choose your path.

Open Source
Start monitoring in 5 minutes.

pip install stacksense — wrap your clients, see your costs. No signup, no credit card, no limits. Fully open source and self-hosted.

Enterprise
Ready to optimize at scale?

Talk to us about dynamic routing, budget enforcement, and governance. We'll show you exactly how much you're leaving on the table.