Metadata-Version: 2.4
Name: tokenfence
Version: 0.3.2
Summary: Cost circuit breaker for AI agents — guard your OpenAI spend with automatic downgrade and kill switch.
Author: TokenFence Team
License: MIT
Project-URL: Homepage, https://tokenfence.dev
Project-URL: Repository, https://github.com/u4ma-kev/tokenfence-python
Project-URL: Examples, https://github.com/u4ma-kev/tokenfence-examples
Project-URL: Issues, https://github.com/u4ma-kev/tokenfence-python/issues
Keywords: openai,cost,budget,ai,llm,guardrail
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == "openai"
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.30.0; extra == "anthropic"
Provides-Extra: google
Requires-Dist: google-generativeai>=0.7.0; extra == "google"
Provides-Extra: all
Requires-Dist: openai>=1.0.0; extra == "all"
Requires-Dist: anthropic>=0.30.0; extra == "all"
Requires-Dist: google-generativeai>=0.7.0; extra == "all"
Dynamic: license-file

# TokenFence

Cost circuit breaker for AI agents. Guard your LLM spend with automatic model downgrade and kill switch. Supports OpenAI, Anthropic Claude, Google Gemini, and DeepSeek.

## Install

```bash
pip install 'tokenfence[openai]'
```

## Quick Start

```python
import openai
from tokenfence import guard

client = guard(
    openai.OpenAI(),
    budget='$0.50',
    fallback='gpt-4o-mini',
    on_limit='stop',
)

# Use exactly like a normal OpenAI client
response = client.chat.completions.create(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Hello'}],
)

# Check spend
print(client.tokenfence.spent)      # 0.0023
print(client.tokenfence.remaining)  # 0.4977
print(client.tokenfence.calls)      # 1
```

## Anthropic Claude

```python
import anthropic
from tokenfence import guard

client = guard(
    anthropic.Anthropic(),
    budget='$1.00',
    fallback='claude-3-haiku-20240307',
    on_limit='stop',
)

# Use exactly like a normal Anthropic client
response = client.messages.create(
    model='claude-3-5-sonnet-20241022',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Hello'}],
)

# Check spend
print(client.tokenfence.spent)      # 0.00105
print(client.tokenfence.remaining)  # 0.99895
```

## Async Support

For async applications (common in production agent pipelines), use `async_guard`:

```python
import openai
from tokenfence import async_guard

client = async_guard(
    openai.AsyncOpenAI(),
    budget='$0.50',
    fallback='gpt-4o-mini',
    on_limit='stop',
)

# Use exactly like a normal async OpenAI client
response = await client.chat.completions.create(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Hello'}],
)

print(client.tokenfence.spent)
```

Works with `anthropic.AsyncAnthropic` too:

```python
import anthropic
from tokenfence import async_guard

client = async_guard(
    anthropic.AsyncAnthropic(),
    budget='$1.00',
    fallback='claude-3-haiku-20240307',
    on_limit='raise',
)

response = await client.messages.create(
    model='claude-3-5-sonnet-20241022',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Hello'}],
)
```

## How It Works

1. **Track** — every tracked call (e.g. `chat.completions.create()` or `messages.create()`) records token usage and computes its cost.
2. **Downgrade** — when cumulative spend hits the threshold (default 80% of budget), the model is transparently swapped to your fallback.
3. **Kill switch** — when the budget is fully consumed:
   - `on_limit='stop'` — returns a synthetic response explaining the budget was exceeded.
   - `on_limit='warn'` — logs a warning but allows the call through.
   - `on_limit='raise'` — raises `BudgetExceeded`.
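The three phases above can be sketched in plain Python. This is an illustrative model of the described behavior, not the library's internals; `BudgetTracker` and its method names are hypothetical:

```python
# Simplified model of the circuit-breaker logic: track, downgrade, kill switch.
# Hypothetical names for illustration; not tokenfence's actual implementation.

class BudgetTracker:
    def __init__(self, budget, fallback=None, on_limit='stop', threshold=0.8):
        self.budget = budget
        self.fallback = fallback
        self.on_limit = on_limit
        self.threshold = threshold
        self.spent = 0.0

    def record(self, cost):
        """Track: accumulate the cost of a completed call."""
        self.spent += cost

    def choose_model(self, requested):
        """Downgrade: swap in the fallback once spend crosses the threshold."""
        if self.fallback and self.spent >= self.budget * self.threshold:
            return self.fallback
        return requested

    def check(self):
        """Kill switch: decide what happens once the budget is exhausted."""
        if self.spent < self.budget:
            return 'allow'
        return {'stop': 'synthetic-response',
                'warn': 'allow-with-warning',
                'raise': 'raise-BudgetExceeded'}[self.on_limit]

tracker = BudgetTracker(budget=0.50, fallback='gpt-4o-mini')
tracker.record(0.45)                   # 90% of budget spent
print(tracker.choose_model('gpt-4o'))  # → gpt-4o-mini (past the 80% threshold)
```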

## API

### `guard(client, *, budget, fallback=None, on_limit='stop', threshold=0.8)`

| Parameter | Type | Description |
|-----------|------|-------------|
| `client` | `openai.OpenAI` | An OpenAI client instance |
| `budget` | `str \| float` | Max spend — `'$0.50'` or `0.50` |
| `fallback` | `str \| None` | Model to downgrade to when threshold is hit |
| `on_limit` | `str` | `'stop'`, `'warn'`, or `'raise'` |
| `threshold` | `float` | Fraction of budget at which downgrade kicks in (0.0–1.0) |
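As a worked example of `threshold`: with the defaults shown in the Quick Start, the downgrade point is 80% of the budget.

```python
# Downgrade point for budget='$0.50' with the default threshold of 0.8.
budget = 0.50
threshold = 0.8
downgrade_at = budget * threshold
print(f"downgrade begins at ${downgrade_at:.2f}")  # downgrade begins at $0.40
```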

### `async_guard(client, *, budget, fallback=None, on_limit='stop', threshold=0.8)`

Same parameters as `guard()`, but for async clients (`openai.AsyncOpenAI`, `anthropic.AsyncAnthropic`).

### `client.tokenfence`

| Attribute | Description |
|-----------|-------------|
| `.spent` | Total USD spent so far |
| `.remaining` | USD remaining in budget |
| `.calls` | Number of tracked API calls |
| `.budget` | The configured budget |
| `.reset()` | Reset spend tracking to zero |
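`reset()` makes it possible to reuse one guarded client across tasks with a fresh budget each time. A minimal sketch of the accounting behind these attributes (illustrative only; `FenceStats` is a hypothetical name, not the library's implementation):

```python
# Illustrative accounting for the attributes in the table above.
class FenceStats:
    def __init__(self, budget):
        self.budget = budget
        self.spent = 0.0
        self.calls = 0

    @property
    def remaining(self):
        """USD left in the budget."""
        return self.budget - self.spent

    def track(self, cost):
        """Record one API call's cost."""
        self.spent += cost
        self.calls += 1

    def reset(self):
        """Zero out spend tracking, keeping the configured budget."""
        self.spent = 0.0
        self.calls = 0

stats = FenceStats(budget=0.50)
stats.track(0.0023)
print(round(stats.remaining, 4))  # 0.4977
stats.reset()
print(stats.calls)                # 0
```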

## Limits on Free Tier

The free Hobby tier includes 50K tracked requests/month. For production workloads:

| Tier | Requests | Price |
|------|----------|-------|
| Hobby | 50K/mo | Free |
| Pro | 500K/mo | $49/mo |
| Team | 2M/mo | $149/mo |

→ **[Upgrade to Pro at tokenfence.dev](https://tokenfence.dev/#pricing)** — 7-day free trial, no credit card required to start.

## License

MIT
