Metadata-Version: 2.4
Name: costpilot
Version: 0.1.1
Summary: Privacy-first LLM and infrastructure cost tracking SDK.
License: Proprietary
Requires-Python: >=3.10
Requires-Dist: aiosqlite>=0.20.0
Requires-Dist: anthropic>=0.45.0
Requires-Dist: click>=8.1.0
Requires-Dist: cryptography>=42.0.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: openai>=1.55.0
Requires-Dist: pyyaml>=6.0.0
Requires-Dist: rich>=13.7.0
Requires-Dist: sqlalchemy>=2.0.0
Requires-Dist: sqlcipher3>=0.5.3
Requires-Dist: tiktoken>=0.8.0
Provides-Extra: dev
Requires-Dist: pytest-cov>=5.0.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Description-Content-Type: text/markdown

# CostPilot Python SDK

Privacy-first LLM and infrastructure cost tracking SDK. Instruments your AI clients, stores sanitized cost records locally in SQLite, and optionally syncs to CostPilot cloud. **Never sees your prompts.**

```bash
pip install costpilot
```

Requires Python 3.10+.

---

## Quick Start

### 1. Initialize config

```bash
costpilot init
```

Creates `.costpilot.yaml` in the current directory.

### 2. Wrap your LLM client

```python
from costpilot import CostPilotClient
import anthropic

cp = CostPilotClient(project="my-app", api_key="cp_live_sk_...")
client = cp.wrap(anthropic.Anthropic())

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)
```

Cost is recorded automatically. No changes to your existing LLM call code.

---

## Configuration

`.costpilot.yaml` (auto-generated by `costpilot init`):

```yaml
project: my-app
environment: local           # local | azure | aws | gcp

# Exactly one auth key required:
api_key: cp_live_sk_...      # CostPilot cloud key (43 chars)
# license_key: cpl_lic_...  # Self-hosted enterprise key (40 chars)

privacy:
  hash_user_ids: true        # Fixed — cannot be disabled
  salt: <random-hex>         # Rotated salt invalidates historical user-ID hashes
  capture_prompts: false     # Fixed — prompts are never stored

llm:
  providers: [anthropic, openai, azure-openai]
  azure_openai:
    deployment_type: payg    # payg | ptu
    region: eastus

services:
  redis:
    enabled: true
    mode: docker             # docker | upstash | azure | aws | gcp
    host: localhost
    port: 6379
  qdrant:
    enabled: true
    mode: docker
    host: localhost
    port: 6333
  appwrite:
    enabled: true
    mode: docker
    endpoint: http://localhost/v1

storage:
  path: ./.costpilot/data.db

health:
  enabled: true
  poll_interval_sec: 60
  alerts:
    redis_memory_pct: 80
    qdrant_latency_ms: 500
    cost_spike_pct: 200
    llm_error_rate_pct: 5
```

Override config path via env var: `COSTPILOT_CONFIG=/path/to/.costpilot.yaml`

Override DB path: `COSTPILOT_DB=sqlite:///custom.db`
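A minimal sketch of the precedence this implies (env var wins over the default location); `resolve_config_path` is a hypothetical helper, not part of the SDK's public API:

```python
import os

DEFAULT_CONFIG = ".costpilot.yaml"

def resolve_config_path() -> str:
    # COSTPILOT_CONFIG, when set, overrides the default path.
    return os.environ.get("COSTPILOT_CONFIG", DEFAULT_CONFIG)

os.environ["COSTPILOT_CONFIG"] = "/etc/costpilot/.costpilot.yaml"
print(resolve_config_path())  # /etc/costpilot/.costpilot.yaml
```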

---

## Supported LLM Providers

All wrapped via `cp.wrap(client)`:

| Provider | Client |
|---|---|
| Anthropic | `anthropic.Anthropic()` / `anthropic.AsyncAnthropic()` |
| OpenAI | `openai.OpenAI()` / `openai.AsyncOpenAI()` |
| Azure OpenAI (PAYG + PTU) | `openai.AzureOpenAI()` / `openai.AsyncAzureOpenAI()` |
| GCP Gemini | `google.genai.Client()` |
| Azure AI Foundry | `azure.ai.projects.AIProjectClient()` |

```python
import openai
from google import genai

# OpenAI
oai = cp.wrap(openai.OpenAI())

# Azure OpenAI
azoai = cp.wrap(openai.AzureOpenAI(...))

# Gemini
gem = cp.wrap(genai.Client())
```

---

## Supported Models

| Provider | Models |
|---|---|
| Anthropic | claude-opus-4-6, claude-sonnet-4-5, claude-haiku-4-5 |
| OpenAI | gpt-4o, gpt-4o-mini, text-embedding-3-small/large |
| GCP Gemini | gemini-2.5-pro, gemini-2.0-flash, gemini-2.0-flash-lite, gemini-1.5-flash |
| Azure OpenAI | Same model aliases as OpenAI |

Pricing from live LiteLLM pricing registry (24h cache). Falls back to bundled `prices.json`.
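The cache-then-fallback behavior described above can be sketched with the stdlib; `load_prices` and its arguments are hypothetical names, assuming a 24h TTL on the cache file:

```python
import json
import time
from pathlib import Path

CACHE_TTL_SEC = 24 * 3600  # 24h cache, per the description above

def load_prices(cache: Path, bundled: Path, fetch) -> dict:
    """Return cached prices if fresh, else fetch live prices;
    fall back to the bundled prices.json if the fetch fails."""
    if cache.exists() and time.time() - cache.stat().st_mtime < CACHE_TTL_SEC:
        return json.loads(cache.read_text())
    try:
        prices = fetch()
        cache.write_text(json.dumps(prices))
        return prices
    except OSError:
        return json.loads(bundled.read_text())
```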

---

## CostPilotClient API

```python
cp = CostPilotClient(
    project="my-app",              # required
    api_key="cp_live_sk_...",      # required
    feature="chat",                # optional — tag cost by feature/route
    user_id="user_123",            # optional — hashed before storage
    session_id="sess_abc",         # optional — hashed before storage
    config_path=".costpilot.yaml", # optional — merged if file exists; constructor args always win
)
```

### `cp.wrap(client)`

Returns an instrumented client with identical API. Intercepts every LLM call, records cost + token counts, never touches prompt content.

### `cp.record_llm_call(...)`

Manual recording when you manage raw token counts yourself:

```python
record = cp.record_llm_call(
    provider="anthropic",
    model="claude-sonnet-4-5",
    input_tokens=500,
    output_tokens=150,
    cache_read_tokens=0,
    cache_write_tokens=0,
    latency_ms=320,
)
# returns dict with cost_usd and safe storage fields
```
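The cost arithmetic behind a record like this is straightforward; the per-token prices below are illustrative placeholders (real values come from the pricing registry), and `estimate_cost_usd` is a hypothetical helper, not an SDK function:

```python
# Hypothetical USD prices per 1M tokens -- real values come from the registry.
PRICES = {"claude-sonnet-4-5": {"input": 3.00, "output": 15.00}}

def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    p = PRICES[model]
    # Linear cost: tokens times per-million-token rate.
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

cost = estimate_cost_usd("claude-sonnet-4-5", 500, 150)
print(f"{cost:.6f}")  # 0.003750
```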

### Service trackers

Track infrastructure costs alongside LLM costs:

```python
# Redis — use as context manager
with cp.track_redis(operation="get", key_pattern="session:*"):
    value = redis.get("session:abc123")

# Qdrant
with cp.track_qdrant(operation="search", collection="embeddings"):
    results = qdrant.search(...)

# Appwrite
with cp.track_appwrite(service="databases", operation="listDocuments"):
    docs = appwrite.databases.list_documents(...)
```
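Internally, a tracker like these is essentially a timing context manager. A minimal sketch of the pattern, assuming the SDK records latency per operation (the `track` helper and `records` list are illustrative, not the SDK's internals):

```python
import time
from contextlib import contextmanager

records = []  # stand-in for the SDK's local cost store

@contextmanager
def track(service: str, operation: str):
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record latency even if the wrapped block raises.
        latency_ms = (time.perf_counter() - start) * 1000
        records.append({"service": service, "operation": operation,
                        "latency_ms": latency_ms})

with track("redis", "get"):
    time.sleep(0.01)  # stand-in for redis.get(...)
```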

---

## Modes

Determined at startup from your auth key:

| Mode | Trigger | History limit | Cloud sync |
|---|---|---|---|
| `cloud` | Valid `api_key` | Per plan | Yes |
| `self-host` | Valid `license_key` | Unlimited | No |
| `trial` | No key, trial period (per machine) not expired | 14 days | No |
| `trial_expired` | Trial past 14 days | 24h only | No |

Check current mode: `cp.mode`, `cp.tier`, `cp.limits`

---

## Plans

| Plan | Trial | LLM Providers | Services | Projects |
|---|---|---|---|---|
| Starter | 14 days, no CC | 1 (OpenAI) | 1 (Redis) | 1 |
| Pro | — | 3 | 3 (Redis, Qdrant, Appwrite) | 3 |
| Enterprise | — | Unlimited | Unlimited | Unlimited |

Trial expires → `trial_expired` mode (24h history only, no new data ingested to cloud).

---

## Privacy & Security

- Prompts and completions are **never** read, stored, or transmitted
- Only token counts, costs, model names, and timestamps are stored
- `user_id` and `session_id` are one-way SHA-256 hashed with a per-account salt before storage
- Local SQLite: two-layer encryption — SQLCipher (file) + AES-256-GCM (field level)
- Encryption keys derived from machine ID at runtime; never written to disk
- `privacy.hash_user_ids` and `privacy.capture_prompts` are hardcoded — config values ignored
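The salted one-way hashing can be sketched with `hashlib`; the exact concatenation scheme the SDK uses is an assumption here:

```python
import hashlib

def hash_user_id(user_id: str, salt: str) -> str:
    """One-way salted SHA-256, as described above (scheme assumed)."""
    return hashlib.sha256((salt + user_id).encode()).hexdigest()

h = hash_user_id("user_123", "a1b2c3")
# Same input + same salt -> same hash, so records for one user still join up.
# A rotated salt yields different hashes, which is why rotation invalidates
# historical user-ID hashes.
```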

---

## Health Monitoring

Background daemon polls Redis, Qdrant, Appwrite, and LLM endpoints every 60s (configurable). Fires alerts with 30-min dedup window when thresholds are breached.

Runs automatically when `health.enabled: true` in config.

---

## Pricing Registry

Live pricing fetched from LiteLLM JSON on startup, cached 24h at `~/.costpilot/pricing_cache.json`. Exchange rates from Frankfurter API for multi-currency support.

```python
# Cost in any currency
cost_eur = cp.pricing.convert_currency(cost_usd, "EUR")

# Available currencies
currencies = cp.pricing.get_available_currencies()
```

---

## Scenario Engine

Generates cost projections and what-if analysis. Requires at least 500 recorded queries and 3+ days of data.

```python
from costpilot.exceptions import InsufficientDataError

try:
    projection = cp.scenario.project(days=30)
except InsufficientDataError:
    print("Need more data")
```

---

## CLI

```bash
costpilot init           # create .costpilot.yaml, register account
costpilot status         # show mode, tier, limits
costpilot report         # print cost summary to terminal
costpilot serve          # open cloud dashboard in browser
costpilot export         # export data as JSON (GDPR portability)
costpilot migrate        # migration cost analysis
costpilot update_prices  # force refresh pricing cache
```

---

## Development

```bash
pip install -e ".[dev]"
pytest tests/
```

Tests use in-memory SQLite — no config file or network required.

---

## Exceptions

| Exception | Raised when |
|---|---|
| `ConfigError` | Missing or invalid `.costpilot.yaml` |
| `InsufficientDataError` | Scenario engine called with too little data |
