Metadata-Version: 2.4
Name: spendguard
Version: 0.1.0
Summary: A 2-line wrapper around your OpenAI or Anthropic client that blocks an over-budget API call before it happens.
Project-URL: Homepage, https://github.com/Rahul-git23/spendguard
Project-URL: Repository, https://github.com/Rahul-git23/spendguard
Author-email: Rahul Vichare <rahulvichare@gmail.com>
License: MIT
Keywords: ai,anthropic,budget,cost,guardrail,llm,openai
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Requires-Dist: anthropic>=0.25.0
Requires-Dist: openai>=1.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Provides-Extra: tiktoken
Requires-Dist: tiktoken>=0.5.0; extra == 'tiktoken'
Description-Content-Type: text/markdown

# SpendGuard

A 2-line wrapper around your OpenAI or Anthropic client that blocks an over-budget API call **before it happens** — no surprises at the end of the month.

```python
from openai import OpenAI
from spendguard import SpendGuard

guard = SpendGuard(workspace="my-app", ceiling_usd=20.0)
client = guard.wrap_openai(OpenAI())          # or wrap_anthropic(Anthropic())

# Call the client exactly as normal — SpendGuard intercepts transparently.
# If the estimated cost would push cumulative spend past 25% of the $20 ceiling,
# it raises BudgetExceededError before the API call is made.
response = client.chat.completions.create(model="gpt-4o", messages=[...])
```

## Install

```bash
pip install spendguard
```

For more accurate pre-call token counting on OpenAI models:

```bash
# Quotes keep shells such as zsh from expanding the square brackets.
pip install "spendguard[tiktoken]"
```
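To illustrate what the extra changes, the sketch below counts tokens with `tiktoken` when it is available and otherwise falls back to a rough four-characters-per-token guess. `count_tokens` is a hypothetical helper, not a SpendGuard function, and the fallback heuristic is an assumption; SpendGuard's own estimator may work differently.

```python
def count_tokens(text: str, model: str = "gpt-4o") -> int:
    """Count tokens with tiktoken if possible, else use a rough heuristic."""
    try:
        import tiktoken

        try:
            encoding = tiktoken.encoding_for_model(model)
        except KeyError:
            # Model name unknown to this tiktoken version: use a general encoding.
            encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text))
    except Exception:
        # tiktoken is missing or its encoding data could not be loaded.
        # Assumed heuristic: about four characters per token, rounded up.
        return (len(text) + 3) // 4


print(count_tokens("Hello, how are you today?"))
```

The heuristic tends to drift on code, non-English text, and long numbers, which is where an exact tokenizer count matters most for a cost estimate.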

## How it works

SpendGuard wraps your existing client object. Every call goes through two steps:

1. **Pre-call estimate**: approximates the input token count, assumes the response will use the full maximum output tokens, and prices both at the model's per-token rates. If `cumulative_spend + estimate > ceiling × threshold_pct`, it raises `BudgetExceededError` before the network call.
2. **Post-call commit**: reads the provider's actual usage numbers from the response and records the real cost, which replaces the estimate in the running total.
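The two steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not SpendGuard's actual implementation: `ToyBudget`, `ToyBudgetExceeded`, `RATES_PER_1K`, the prices in it, and the four-characters-per-token heuristic are all made up for the example.

```python
from dataclasses import dataclass

# Hypothetical USD prices per 1,000 tokens; real prices come from the provider.
RATES_PER_1K = {"toy-model": {"input": 0.0025, "output": 0.01}}


class ToyBudgetExceeded(Exception):
    """Raised when an estimate would push spend past the threshold."""


@dataclass
class ToyBudget:
    ceiling_usd: float
    threshold_pct: float = 0.25
    spent_usd: float = 0.0

    def cost(self, model: str, input_tokens: int, output_tokens: int) -> float:
        rate = RATES_PER_1K[model]
        return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1000

    def pre_call(self, model: str, prompt: str, max_tokens: int) -> float:
        # Step 1: approximate input tokens (assumed ~4 characters each) and
        # price the worst case in which all max_tokens of output are generated.
        input_tokens = (len(prompt) + 3) // 4
        estimate = self.cost(model, input_tokens, max_tokens)
        limit = self.ceiling_usd * self.threshold_pct
        if self.spent_usd + estimate > limit:
            raise ToyBudgetExceeded(
                f"estimate {estimate:.6f} USD would exceed the {limit:.2f} USD limit"
            )
        return estimate

    def post_call(self, model: str, input_tokens: int, output_tokens: int) -> float:
        # Step 2: record the cost of the usage the provider actually reported.
        actual = self.cost(model, input_tokens, output_tokens)
        self.spent_usd += actual
        return actual


budget = ToyBudget(ceiling_usd=20.0)
estimate = budget.pre_call("toy-model", "Hello", max_tokens=512)
actual = budget.post_call("toy-model", input_tokens=2, output_tokens=40)
print(f"estimated {estimate:.6f} USD, recorded {actual:.6f} USD")
```

Because the estimate prices the full output allowance, it is normally higher than the recorded cost, so the check errs on the side of blocking.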

The default threshold is 25% of the ceiling (`threshold_pct=0.25`). With the check in step 1, a call is blocked once cumulative spend plus its estimate would exceed `ceiling × threshold_pct`, which is $5.00 for a $20 ceiling. It is an early guardrail against runaway spend, not a hard cap at 100% of the ceiling.

## Supported providers and models

| Provider   | Client wrapper       | Models gated by default |
| ---------- | -------------------- | ----------------------- |
| OpenAI     | `wrap_openai()`      | gpt-4o, gpt-4o-mini, and all models in the pricing config |
| Anthropic  | `wrap_anthropic()`   | claude-3-5-sonnet, claude-3-opus, haiku, and all models in the pricing config |

## Usage

### Basic setup

```python
from openai import OpenAI
# Assumes BudgetExceededError is exported from the top-level package.
from spendguard import BudgetExceededError, SpendGuard

guard = SpendGuard(workspace="my-product", ceiling_usd=20.0)
client = guard.wrap_openai(OpenAI())

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
        max_tokens=512,
    )
except BudgetExceededError as e:
    print(f"Blocked: {e}")
```

### Anthropic

```python
from anthropic import Anthropic
from spendguard import SpendGuard

guard = SpendGuard(workspace="my-product", ceiling_usd=20.0)
client = guard.wrap_anthropic(Anthropic())

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)
```

### Overriding a block on purpose

When you explicitly want to allow a call that would be blocked (e.g., a one-time large batch job), use `track()` with `override=True`:

```python
with guard.track(override=True):
    response = client.chat.completions.create(...)   # never blocked
```

The override only applies inside the `with` block and does not persist.
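One common way to get this scoped behaviour is a context manager that sets a flag on entry and restores it on exit, even if the body raises. The sketch below shows that pattern with a hypothetical `ToyGuard` class; it is not SpendGuard's source, and the `override_active` attribute is an illustrative name.

```python
from contextlib import contextmanager


class ToyGuard:
    """Hypothetical stand-in that only models the override flag."""

    def __init__(self) -> None:
        self.override_active = False

    @contextmanager
    def track(self, override: bool = False):
        previous = self.override_active
        self.override_active = override
        try:
            yield self
        finally:
            # Restore the earlier state so the override cannot leak out,
            # including when the body raises an exception.
            self.override_active = previous


guard = ToyGuard()
with guard.track(override=True):
    print("inside:", guard.override_active)
print("after:", guard.override_active)
```

Saving and restoring the previous value, rather than resetting to `False`, keeps nested `track()` blocks from cancelling an outer override.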

### Inspecting current spend

```python
summary = guard.get_summary()
# {"ceiling_usd": 20.0, "spent_usd": 1.23, "reserved_usd": 0.0, "threshold_pct": 0.25}
```
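The summary fields can be combined to see how much headroom is left before calls start being blocked. The helper below is a hypothetical sketch rather than part of SpendGuard: `headroom_usd` is an illustrative name, and counting `reserved_usd` against the threshold is an assumption.

```python
def headroom_usd(summary: dict) -> float:
    """USD left before spend reaches ceiling_usd * threshold_pct (never negative)."""
    limit = summary["ceiling_usd"] * summary["threshold_pct"]
    used = summary["spent_usd"] + summary["reserved_usd"]
    return max(limit - used, 0.0)


# Sample values copied from the summary shown above; in real code,
# pass the dict returned by guard.get_summary() instead.
summary = {
    "ceiling_usd": 20.0,
    "spent_usd": 1.23,
    "reserved_usd": 0.0,
    "threshold_pct": 0.25,
}
print(f"{headroom_usd(summary):.2f} USD of headroom")
```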

## Workspace isolation

Each `SpendGuard` instance is scoped to a `workspace` string. If you run several products or features, give each one its own workspace so that their budgets are tracked independently.

## Out of scope for v0.1

- Streaming calls (`stream=True`) — explicitly rejected with a clear error.
- Embeddings, images, audio, and other non-chat/messages endpoints.
- Persistent spend across process restarts (resets on `SpendGuard()` construction).

Persistence and streaming support are planned for v1.0.

## Feedback

Found a bug or have a feature request? [Open an issue](https://github.com/Rahul-git23/spendguard/issues) — all feedback welcome.

## License

MIT
