Metadata-Version: 2.4
Name: tollgateai
Version: 0.1.1
Summary: Track real LLM model usage and compute live gross margin with Tollgate.
Project-URL: Homepage, https://tollgateai.vercel.app
Author: Tollgate
License: Proprietary
Keywords: anthropic,cost,llm,margin,observability,openai,tokens,tollgate
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# tollgateai (Python SDK)

Track **real** LLM model usage and compute live gross margin with
[Tollgate](https://tollgateai.vercel.app). The SDK reads the actual usage off
each provider response — you never hand-count tokens. Zero dependencies.

Published on PyPI: [tollgateai](https://pypi.org/project/tollgateai/) (v0.1.1).

```bash
pip install tollgateai
```

Create an API key in **Tollgate → Integrations**, then set:

```bash
export TOLLGATE_API_KEY=tg_live_xxx
# optional, defaults to the hosted app:
export TOLLGATE_BASE_URL=https://tollgateai.vercel.app
```

## Auto-instrumentation (recommended)

Wrap your provider client once; every call reports real usage in the background.

### Anthropic

```python
from anthropic import Anthropic
from tollgate import create_tollgate_client, wrap_anthropic

tollgate = create_tollgate_client()  # reads TOLLGATE_API_KEY
anthropic = wrap_anthropic(
    Anthropic(), tollgate,
    customer_id="cust_A",     # your end customer
    revenue_unit_cents=50,    # what you charge for this unit ($0.50)
)

# Use the client normally — usage is tracked automatically.
anthropic.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    messages=[{"role": "user", "content": "Summarize this ticket…"}],
)
```

### OpenAI

```python
from openai import OpenAI
from tollgate import create_tollgate_client, wrap_openai

tollgate = create_tollgate_client()
openai = wrap_openai(OpenAI(), tollgate, customer_id="cust_A")

openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
```

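`revenue_unit_cents` can also be a callable that receives the provider response
and returns the amount in cents, e.g.
`revenue_unit_cents=lambda res: 50 if res.something else 0`, where
`res.something` is a placeholder for whichever response attribute you bill on.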
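
As a slightly fuller illustration of the callable form, the sketch below bills
only for turns that finished normally. The function name
`bill_completed_turns` is hypothetical, and it assumes the response exposes an
Anthropic-style `stop_reason` attribute; adapt the condition to whatever your
provider returns.

```python
from types import SimpleNamespace


def bill_completed_turns(res):
    """Return the price in cents for one response (hypothetical pricing rule)."""
    # Assumption: `res` has an Anthropic-style `stop_reason` attribute.
    # Charge 50 cents when the model ended its turn, nothing otherwise
    # (for example when the reply was cut off by `max_tokens`).
    return 50 if getattr(res, "stop_reason", None) == "end_turn" else 0


# Stand-in objects so the rule can be checked without calling a provider.
finished = SimpleNamespace(stop_reason="end_turn")
truncated = SimpleNamespace(stop_reason="max_tokens")
print(bill_completed_turns(finished), bill_completed_turns(truncated))
```

The function would then be passed as `revenue_unit_cents=bill_completed_turns`
when wrapping the client.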

## Manual tracking

For providers without a wrapper (Bedrock, custom gateways), or when you want
full control over what is reported:

```python
from tollgate import create_tollgate_client

tollgate = create_tollgate_client()

tollgate.track({
    "customerId": "cust_A",
    "runId": "run_12345",
    "provider": "anthropic",
    "model": "claude-sonnet-4-6",
    "tokensIn": 1200,
    "tokensOut": 450,
    "reasoningTokens": 0,
    "cachedTokens": 0,
    "revenueUnitCents": 50,
    "idempotencyKey": "run_12345#step_1",  # exactly-once: safe to retry
})
```
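
For Bedrock, one way to build that payload is to map the `usage` block of a
Converse API response onto the fields shown above. This is a minimal sketch,
not part of the SDK: the helper name `bedrock_usage_to_event`, the `step`
argument, and the `"bedrock"` provider value are assumptions.

```python
def bedrock_usage_to_event(response, *, customer_id, run_id, step, model,
                           revenue_unit_cents=0):
    """Map a Bedrock Converse response dict onto a Tollgate event payload."""
    # Converse responses report token counts under response["usage"].
    usage = response.get("usage", {})
    return {
        "customerId": customer_id,
        "runId": run_id,
        "provider": "bedrock",  # assumption: provider label accepted by Tollgate
        "model": model,
        "tokensIn": usage.get("inputTokens", 0),
        "tokensOut": usage.get("outputTokens", 0),
        "reasoningTokens": 0,
        "cachedTokens": 0,
        "revenueUnitCents": revenue_unit_cents,
        # Deterministic key, so retrying the same step does not double-count.
        "idempotencyKey": f"{run_id}#{step}",
    }


# Trimmed-down example of what a Converse response contains.
sample = {"usage": {"inputTokens": 1200, "outputTokens": 450, "totalTokens": 1650}}
event = bedrock_usage_to_event(
    sample, customer_id="cust_A", run_id="run_12345", step="step_1",
    model="example-bedrock-model-id", revenue_unit_cents=50,
)
print(event["tokensIn"], event["tokensOut"], event["idempotencyKey"])
```

The resulting dict can be handed to `tollgate.track(event)`.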

## Notes

- **Idempotent.** Events dedupe on `idempotencyKey` (auto-set to the provider
  response id by the wrappers), so retries never double-count.
- **No prompt content is ever sent** — only token counts and metadata.
- **Streaming** responses are not auto-tracked yet (the wrappers only report when
  a non-streaming `usage` is present). Track those manually for now.
- **Non-blocking.** Auto-instrumented tracking runs on a background thread;
  failures go to `on_error` (default: log a warning) and never break your call.
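
Until streaming is auto-tracked, a streamed OpenAI call can be reported by hand.
The sketch below assumes the stream was requested with
`stream_options={"include_usage": True}`, so the final chunk carries a `usage`
object, and that the chunks were collected while streaming. The helper name
`event_from_openai_chunks` and the `"openai"` provider value are assumptions,
not part of the SDK.

```python
from types import SimpleNamespace


def event_from_openai_chunks(chunks, *, customer_id, run_id, model):
    """Build a Tollgate event payload from collected OpenAI stream chunks."""
    usage, response_id = None, None
    for chunk in chunks:
        # Every chunk of one completion shares the same id.
        response_id = response_id or getattr(chunk, "id", None)
        # Only the final chunk has a non-None usage when include_usage is set.
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
    if usage is None:
        return None  # nothing to report: the stream carried no usage
    return {
        "customerId": customer_id,
        "runId": run_id,
        "provider": "openai",  # assumption: provider label accepted by Tollgate
        "model": model,
        "tokensIn": usage.prompt_tokens,
        "tokensOut": usage.completion_tokens,
        # Mirrors the wrappers' behaviour of keying on the provider response id.
        "idempotencyKey": response_id,
    }


# Stand-in chunks so the helper can be exercised without a network call.
chunks = [
    SimpleNamespace(id="chatcmpl-1", usage=None),
    SimpleNamespace(id="chatcmpl-1",
                    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=34)),
]
print(event_from_openai_chunks(chunks, customer_id="cust_A",
                               run_id="run_1", model="gpt-4o"))
```

If the helper returns an event, pass it to `tollgate.track(...)`; if it returns
`None`, the stream did not include usage and there is nothing to record.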

Licensed for use with Tollgate. Not open source.
