Metadata-Version: 2.4
Name: aicostguard-dev
Version: 0.3.0b2
Summary: One-line HTTP-level auto-instrumentation for AI provider cost tracking. Catches every SDK, framework, and custom wrapper via httpx/requests interception. Supports OpenAI, Anthropic, Gemini, Cohere, Mistral, and 8 OpenAI-compatible providers (Groq, xAI, Together, Fireworks, Perplexity, DeepSeek, OpenRouter, Vercel AI Gateway).
Project-URL: Homepage, https://aicostguard.dev
Project-URL: Documentation, https://github.com/Vinodmurkute/ai-cost-guard/tree/main/packages/aicostguard-py#readme
Project-URL: Repository, https://github.com/Vinodmurkute/ai-cost-guard
Project-URL: Issues, https://github.com/Vinodmurkute/ai-cost-guard/issues
Author-email: AI Cost Guard <support@aicostguard.dev>
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: ai,anthropic,auto-instrumentation,cost,finops,gemini,llm,observability,openai,tracking
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.9
Provides-Extra: dev
Requires-Dist: anthropic<1.0,>=0.40; extra == 'dev'
Requires-Dist: google-generativeai<1.0,>=0.8; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: openai<2.0,>=1.30; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=4.1; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Description-Content-Type: text/markdown

# aicostguard

**One line. Zero secrets. Real-time AI cost tracking for OpenAI, Anthropic, and Gemini.**

Drop-in observability for AI provider costs. No proxy, no shared keys, no per-call code.

[![PyPI](https://img.shields.io/pypi/v/aicostguard-dev.svg)](https://pypi.org/project/aicostguard-dev/)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)

> **Status: Beta.** v0.1 is feature-complete for the supported configurations listed below. Not yet recommended for production-critical workloads without your own validation.

## Install

```bash
pip install aicostguard-dev
```

## Activate

Add **one line** at the top of your application entry file (e.g. `app.py`, `main.py`, `manage.py`):

```python
import aicostguard.auto  # done.
```

Then export your AI Cost Guard ingestion key and backend URL:

```bash
export AICG_KEY=aicg_xxxxxxxxxxxxxxxxxxxx
export AICG_URL=https://your-aicg-instance.example.com   # or our hosted URL
```

That's the entire integration. Every OpenAI / Anthropic / Gemini call your application makes is now automatically tracked.

## What gets sent

Only this, per AI call:

```json
{
  "provider": "openai",
  "model": "gpt-4o",
  "input_tokens": 1240,
  "output_tokens": 312,
  "latency_ms": 842,
  "feature": "generate_rag_answer"
}
```

**Never sent:**

- Your AI provider API keys
- Your prompts
- The AI's responses
- Any user data
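
The allowlist implied by the example above can be sketched as a small validator. This is a minimal illustration, not the package's own code: `validate_receipt`, `REQUIRED_KEYS`, and `OPTIONAL_KEYS` are hypothetical names, and the key set is copied from the JSON example.

```python
from typing import Any, Dict

# Keys copied from the receipt example in this README; `feature` is optional.
# The names below are hypothetical and not part of the aicostguard API.
REQUIRED_KEYS = {"provider", "model", "input_tokens", "output_tokens", "latency_ms"}
OPTIONAL_KEYS = {"feature"}


def validate_receipt(receipt: Dict[str, Any]) -> Dict[str, Any]:
    """Return the receipt unchanged, or raise ValueError if its keys are off-contract."""
    keys = set(receipt)
    missing = REQUIRED_KEYS - keys
    unexpected = keys - REQUIRED_KEYS - OPTIONAL_KEYS
    if missing or unexpected:
        raise ValueError(f"missing={sorted(missing)} unexpected={sorted(unexpected)}")
    return receipt
```

A receipt carrying an extra key such as `"prompt"` would be rejected before it could leave the process.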

**The `feature` field** is inferred automatically from the calling function name. You can override it explicitly:

```python
import aicostguard as aicg

with aicg.feature("doc-parse"):
    completion = openai_client.chat.completions.create(...)
```
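
For readers curious how such an override could coexist with automatic inference, here is a rough sketch built on `contextvars`. It is an assumption about one possible design, not the package's implementation: `resolve_feature` and `_current_feature` are hypothetical names, and `sys._getframe` is a CPython-specific call.

```python
import contextvars
import sys
from contextlib import contextmanager
from typing import Iterator

# Hypothetical module-level state; not the package's real internals.
_current_feature = contextvars.ContextVar("feature", default=None)


@contextmanager
def feature(name: str) -> Iterator[None]:
    """Tag every call made inside the block with an explicit feature name."""
    token = _current_feature.set(name)
    try:
        yield
    finally:
        _current_feature.reset(token)  # restore the previous tag on exit


def resolve_feature() -> str:
    """Prefer the explicit tag; otherwise fall back to the caller's function name."""
    explicit = _current_feature.get()
    if explicit is not None:
        return explicit
    # Frame 0 is resolve_feature itself; frame 1 is whoever called it.
    return sys._getframe(1).f_code.co_name
```

A context variable keeps the tag scoped to the current thread or asyncio task, so concurrent requests would not overwrite each other's feature names.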

## Supported configurations (v0.1)

| Provider | SDK | Sync | Async | Non-streaming | Streaming (with usage opt-in) |
|---|---|---|---|---|---|
| OpenAI | `openai` ≥1.30, <2.0 | ✅ | ✅ | ✅ | ✅ |
| Anthropic | `anthropic` ≥0.40, <1.0 | ✅ | ✅ | ✅ | ✅ |
| Google Gemini | `google-generativeai` ≥0.8 | ✅ | ✅ | ✅ | ✅ |

**Not in v0.1** (planned for v0.2):
- LangChain / LlamaIndex auto-tagging
- Azure OpenAI, AWS Bedrock SDK shapes
- Cohere, Mistral SDKs
- Streaming WITHOUT usage opt-in (we warn loudly today; `tiktoken` fallback in v0.2)

If your stack isn't listed yet, use [Manual POST](https://github.com/Vinodmurkute/ai-cost-guard#manual-integration) — fully supported and language-agnostic.

## Runtime support

| Runtime | Supported | Sender mode | Notes |
|---|---|---|---|
| CPython 3.9+ long-running server (Flask, FastAPI, Django, Gunicorn, Celery, containers, local dev) | ✅ | `background` | Daemon thread + bounded queue + `atexit` flush |
| Vercel Python functions | ✅ | `inline` | Each receipt POSTs synchronously before the handler returns (Python has no `waitUntil` equivalent we can use globally). Adds ~50–200 ms to AI-route response. |
| AWS Lambda Python | ✅ | `inline` | Same as Vercel — synchronous send before handler return |
| GCP Cloud Functions Python | ✅ | `inline` | Detected via `K_SERVICE` / `FUNCTION_NAME` |
| Azure Functions Python | ✅ | `inline` | Detected via `AZURE_FUNCTIONS_ENVIRONMENT` |
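
The `background` mode in the first row (daemon thread, bounded queue, `atexit` flush) can be pictured with the following sketch. It illustrates the pattern only: `BackgroundSender`, its `shutdown` method, and the `send` callable are hypothetical and do not describe the package's actual classes.

```python
import atexit
import queue
import threading
from typing import Any, Callable, Dict

_STOP = object()  # sentinel that tells the worker to exit


class BackgroundSender:
    """Hypothetical sender: a bounded queue drained by one daemon thread."""

    def __init__(self, send: Callable[[Dict[str, Any]], None], maxsize: int = 1000) -> None:
        self._send = send
        self._queue: "queue.Queue" = queue.Queue(maxsize=maxsize)
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()
        atexit.register(self.shutdown)  # best-effort flush at interpreter exit

    def submit(self, receipt: Dict[str, Any]) -> bool:
        """Enqueue without blocking; drop the receipt if the queue is full."""
        try:
            self._queue.put_nowait(receipt)
            return True
        except queue.Full:
            return False

    def _drain(self) -> None:
        while True:
            item = self._queue.get()
            if item is _STOP:
                return
            try:
                self._send(item)
            except Exception:
                pass  # observer errors must never reach the application

    def shutdown(self, timeout: float = 2.0) -> None:
        """Let the worker finish what is queued, then stop it."""
        if not self._thread.is_alive():
            return
        try:
            self._queue.put(_STOP, timeout=timeout)
        except queue.Full:
            return
        self._thread.join(timeout)
```

Because `submit` never blocks, a slow or unreachable backend costs the caller at most a dropped receipt, never a stalled request.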

**Why the modes exist.** On serverless platforms, the host freezes the container the moment the function returns. A long-running background drain thread does not survive that freeze — receipts queued *after* the response is sent are silently dropped. The package detects serverless and switches to **`inline`** mode: every `submit()` POSTs synchronously before returning, so by the time the host function returns, the receipt has hit the wire. There is no Python equivalent of Vercel's `waitUntil()` we can call from a module-scoped sender, so synchronous send is the only safe strategy.
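
As a rough illustration of the detection step, a mode selector could look like the sketch below. `pick_sender_mode` and `SERVERLESS_ENV_VARS` are hypothetical names. `VERCEL`, `K_SERVICE`, `FUNCTION_NAME`, and `AZURE_FUNCTIONS_ENVIRONMENT` come from this README; `AWS_LAMBDA_FUNCTION_NAME` is an assumption, since the README does not say which variable is checked for Lambda.

```python
import os
from typing import Mapping, Optional

# Marker variables named in this README, plus AWS_LAMBDA_FUNCTION_NAME, which
# Lambda sets but which is only an assumption about what the package checks.
SERVERLESS_ENV_VARS = (
    "VERCEL",
    "AWS_LAMBDA_FUNCTION_NAME",
    "K_SERVICE",
    "FUNCTION_NAME",
    "AZURE_FUNCTIONS_ENVIRONMENT",
)


def pick_sender_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "inline" when a serverless marker is set, else "background"."""
    env = os.environ if env is None else env
    if any(env.get(name) for name in SERVERLESS_ENV_VARS):
        return "inline"
    return "background"
```

Passing the environment in as a mapping keeps the function easy to test without mutating `os.environ`.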

> **Async-context caveat.** Inline mode uses `urllib.request.urlopen` (synchronous). Inside an asyncio event loop (FastAPI, async Flask) this briefly blocks the loop while the receipt POSTs. A native async sender based on `httpx.AsyncClient` is planned for a future release. Tracking accuracy is unaffected.

## Trust contract (CI-enforced)

These properties are **runtime-asserted in CI**. No release ships without them passing. The relevant test files are linked.

1. **Cannot break your AI calls.** [`test_safety_cannot_break_calls.py`](tests/test_safety_cannot_break_calls.py) — observer exceptions are caught and swallowed; the SDK's original return value is always passed through.
2. **Cannot send prompts or responses.** [`test_safety_payload_keys_only.py`](tests/test_safety_payload_keys_only.py) — receipts only contain `{provider, model, input_tokens, output_tokens, latency_ms, feature?}`. Any other key fails the build.
3. **Cannot leak your AI provider API key.** [`test_safety_no_api_key_leak.py`](tests/test_safety_no_api_key_leak.py) — receipt payloads are scanned for the literal API key values during every test run.
4. **Cannot block your call thread.** [`test_safety_overhead_under_2ms.py`](tests/test_safety_overhead_under_2ms.py) — observer overhead is asserted <2ms p99 across 1,000 iterations.
5. **No silent failure modes.** [`test_safety_warnings_fire.py`](tests/test_safety_warnings_fire.py) — known issues (unsupported SDK version, streaming-without-usage, wrapper-before-import) each emit a clear warning.
6. **No silent receipt loss on serverless.** [`test_safety_serverless_flush.py`](tests/test_safety_serverless_flush.py) — with `VERCEL=1` (or other serverless env), `submit()` POSTs synchronously before returning; no background thread is started; errors during the inline POST do not propagate.

These are the entire trust story. Read the source. Run the tests.

## Diagnostics

Check what's instrumented and confirm receipts are flowing:

```bash
aicg-diagnose
# or:
python -m aicostguard.diagnostics
```

Output:

```
AI Cost Guard auto-instrumentation v0.1.0b1
─────────────────────────────────────────────
Instruments:
  ✅ openai 1.50.0          patched
  ✅ anthropic 0.42.0       patched
  ❌ google.generativeai    not installed
─────────────────────────────────────────────
Config:
  AICG_URL  https://your-aicg.example.com (reachable, 142 ms)
  AICG_KEY  aicg_xxxx••••••••••••••••  (valid format)
─────────────────────────────────────────────
Last receipt: 14:32:01 UTC  (200 OK)
```

## How it works (in one paragraph)

When you `import aicostguard.auto`, the package scans `sys.modules` for known AI provider SDKs and monkey-patches their response-returning methods. Each patched method delegates to the original SDK call (so your code's behaviour is unchanged), then reads the `usage` field from the response object and submits a fire-and-forget receipt to AI Cost Guard. Everything is wrapped in `try/except` at multiple layers so observer errors can never propagate to your application. This is the same monkey-patching technique that Sentry, Datadog APM, and OpenTelemetry use for application monitoring, and it has been running in production at large scale for years.
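
The wrap-and-delegate pattern described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the package's source: `instrument`, `receipts`, and `FakeCompletions` are hypothetical, and the fake response only mimics the general shape of an SDK object with a `usage` attribute.

```python
import functools
import time
from types import SimpleNamespace
from typing import Any, Callable, Dict, List

receipts: List[Dict[str, Any]] = []  # stand-in for the real sender


def instrument(owner: Any, method_name: str, provider: str) -> None:
    """Replace owner.method_name with a wrapper that records token usage."""
    original: Callable[..., Any] = getattr(owner, method_name)

    @functools.wraps(original)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        response = original(*args, **kwargs)  # SDK errors propagate untouched
        try:
            usage = response.usage
            receipts.append({
                "provider": provider,
                "model": response.model,
                "input_tokens": usage.prompt_tokens,
                "output_tokens": usage.completion_tokens,
                "latency_ms": int((time.perf_counter() - start) * 1000),
            })
        except Exception:
            pass  # an observer failure must never break the caller
        return response  # always the SDK's original return value

    setattr(owner, method_name, wrapper)


class FakeCompletions:
    """Hypothetical stand-in for an SDK resource class."""

    def create(self, model: str) -> SimpleNamespace:
        usage = SimpleNamespace(prompt_tokens=1240, completion_tokens=312)
        return SimpleNamespace(model=model, usage=usage)


instrument(FakeCompletions, "create", provider="openai")
result = FakeCompletions().create(model="gpt-4o")
```

Only the observer logic sits inside `try/except`; the delegated call stays outside it so that genuine SDK exceptions still reach your code unchanged.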

## Configuration reference

| Env var | Default | Description |
|---|---|---|
| `AICG_KEY` | _(none — package is no-op)_ | Your ingestion key from the AI Cost Guard dashboard. |
| `AICG_URL` | _(none — package is no-op)_ | Base URL of your AI Cost Guard backend. |
| `AICG_FEATURE_DEFAULT` | inferred from caller frame | Fallback feature tag when neither inference nor `aicg.feature(...)` applies. |
| `AICG_DISABLED` | unset | Set to `1` to short-circuit all tracking without removing the package. |
| `AICG_DEBUG` | unset | Set to `1` to emit verbose debug logs to stderr (do not use in production). |
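
To make the no-op rules in the table concrete, the sketch below shows one way the variables could be read. `Config` and `load_config` are hypothetical names, and the exact precedence (for example, `AICG_DISABLED` being checked first) is an assumption rather than documented behaviour.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    """Hypothetical container for the settings in the table above."""
    key: str
    url: str
    feature_default: Optional[str]
    debug: bool


def load_config(env: Optional[Mapping[str, str]] = None) -> Optional[Config]:
    """Return a Config, or None when tracking should be a no-op."""
    env = os.environ if env is None else env
    if env.get("AICG_DISABLED") == "1":
        return None  # explicit kill switch
    key, url = env.get("AICG_KEY"), env.get("AICG_URL")
    if not key or not url:
        return None  # the table lists both as "package is no-op" when unset
    return Config(
        key=key,
        url=url,
        feature_default=env.get("AICG_FEATURE_DEFAULT"),
        debug=env.get("AICG_DEBUG") == "1",
    )
```

Returning `None` rather than raising keeps a missing key from ever breaking application startup.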

## License

Apache 2.0. See [LICENSE](LICENSE).
