Metadata-Version: 2.4
Name: tokenwise-observe
Version: 0.1.1
Summary: See your LLM calls in Tokenwise: a tiny, never-blocking wrapper for the OpenAI and Anthropic SDKs.
Project-URL: Homepage, https://tokenwisehq.com
Author: Tokenwise
License-Expression: MIT
Keywords: anthropic,cost,llm,observability,openai,tokenwise
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# tokenwise-observe

See your LLM calls in [Tokenwise](https://tokenwisehq.com) without rerouting
production. It wraps your existing OpenAI or Anthropic client and logs each call
in the background. Your requests are never slowed, broken, or changed. It has no
dependencies.

> Observe mode shows your calls on every dashboard, with quality scores and
> evals. To also cut cost automatically (caching and model routing), route
> through the gateway with `tokenwise-observe init --gateway`.

## Quick start

Set it up with one command. It finds your package manager and SDK, installs the
package, and prints the two lines to add. It does not touch your code or config
files.

```bash
pipx run tokenwise-observe init
```

Prefer to do it by hand? Install the package:

```bash
pip install tokenwise-observe
```

Get a key (it starts with `tw_api_`) from Settings > API Keys in Tokenwise, then
add it to your environment. The SDK reads it automatically:

```bash
export TOKENWISE_OBSERVE_KEY=tw_api_...
```
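
If you want to fail fast on a missing or mistyped key, a small startup check can help. This is a sketch and not part of the package: `observe_key_status` is a hypothetical helper, and the only facts it relies on are the variable name and the `tw_api_` prefix described above.

```python
import os

KEY_PREFIX = "tw_api_"  # prefix stated in this README


def observe_key_status(env=None):
    """Return "ok", "missing", or "bad_prefix" for TOKENWISE_OBSERVE_KEY."""
    env = os.environ if env is None else env
    key = env.get("TOKENWISE_OBSERVE_KEY", "")
    if not key:
        return "missing"
    if not key.startswith(KEY_PREFIX):
        return "bad_prefix"
    return "ok"
```

Call `observe_key_status()` once at startup and log anything other than `"ok"`.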

## OpenAI

Also works with OpenAI-compatible providers: Groq, DeepSeek, Mistral, xAI,
OpenRouter, and more.

```python
from openai import OpenAI
from tokenwise_observe import observe_openai

# reads TOKENWISE_OBSERVE_KEY from your environment
client = observe_openai(OpenAI())

# Use it exactly as before. Every call is logged to Tokenwise.
res = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
```

Streaming is handled for you, both sync and async: usage is added up as you read
the stream and sent once it finishes.
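
To make the streaming behavior concrete, here is a simplified sketch of the general pattern, not the package's actual code. It assumes chunks are plain dicts with an optional `usage` field; real SDK chunks are objects with provider-specific shapes, and `observe_stream` and `on_done` are hypothetical names.

```python
def observe_stream(chunks, on_done):
    """Yield chunks unchanged and report summed usage when the stream ends."""
    totals = {"input_tokens": 0, "output_tokens": 0}
    for chunk in chunks:
        usage = chunk.get("usage") or {}
        for name in totals:
            totals[name] += usage.get(name, 0)
        yield chunk  # the caller sees exactly what the provider sent
    try:
        on_done(totals)
    except Exception:
        pass  # reporting must never break the caller


# Demo with a fake stream (hypothetical chunk shape).
fake_stream = [
    {"text": "Hel", "usage": {"input_tokens": 5}},
    {"text": "lo", "usage": {"output_tokens": 2}},
    {"text": "!"},
]
reported = []
received = list(observe_stream(iter(fake_stream), reported.append))
```

The wrapper is a generator, so nothing is reported until the caller has read the last chunk.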

## Anthropic

```python
from anthropic import Anthropic
from tokenwise_observe import observe_anthropic

# reads TOKENWISE_OBSERVE_KEY from your environment
client = observe_anthropic(Anthropic())
client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello"}],
)
```

Async clients (`AsyncOpenAI`, `AsyncAnthropic`) work the same way. Wrap them and
`await` as usual.

## Options

Pass these as keyword arguments to `observe_openai` or `observe_anthropic`:

| Option | Default | What it does |
| --- | --- | --- |
| `api_key` | `TOKENWISE_OBSERVE_KEY` env var | Your Tokenwise key (starts with `tw_api_`). Optional when the env var is set. |
| `endpoint` | `https://tokenwisehq.com/api/v1/observe` | Where events are sent. Set this only if you self-host. |
| `tag` | none | A label added to every call, e.g. a feature name. |
| `capture_content` | `True` | Set to `False` to send only metrics, never prompt or response content. |
| `redact` | none | A function to scrub or drop an event before it is sent. |
| `timeout` | `3.0` | How long to wait when sending an event, in seconds. |
| `on_error` | none | Called if sending an event fails. It never affects your request. |
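
As an illustration of `on_error`, the sketch below records failures and logs a warning. It assumes the callback receives a single argument describing the failure; the table above does not specify the signature, so check it before relying on this. `record_observe_error` and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("observe_errors")  # hypothetical logger name
failures = []


def record_observe_error(error):
    """Keep a record of a failed event send and log it as a warning."""
    failures.append(repr(error))
    logger.warning("Tokenwise event was not sent: %s", error)


# Assumed wiring, using the option names from the table above:
# client = observe_openai(
#     OpenAI(), tag="checkout", timeout=1.0, on_error=record_observe_error
# )
```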

## Privacy

By default the SDK includes your prompt and response so you can see them in the
dashboard. You can keep all content on your side:

- `capture_content=False` sends only metrics (model, tokens, cost, latency,
  status). Your prompts and responses never leave your process, and cost
  tracking still works.
- `redact` runs right before an event is sent. Return a changed event, or
  `None` to drop it.

```python
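from openai import OpenAI
from tokenwise_observe import observe_openai

client = observe_openai(
    OpenAI(),
    capture_content=False,  # metrics only
    # or scrub selectively (mask_pii is your own function):
    # redact=lambda e: {**e, "input": mask_pii(e["input"])},
)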
```
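
The commented-out `redact` line above calls `mask_pii`, which the package does not provide. Below is one possible version that masks email addresses only. It assumes the event is a dict whose `"input"` value is a string or a nested list or dict of strings. The `[no-log]` marker used to drop events is a hypothetical convention, not a Tokenwise feature.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def mask_pii(value):
    """Replace email addresses in strings, recursing into lists and dicts."""
    if isinstance(value, str):
        return EMAIL.sub("[email]", value)
    if isinstance(value, list):
        return [mask_pii(item) for item in value]
    if isinstance(value, dict):
        return {key: mask_pii(item) for key, item in value.items()}
    return value


def redact(event):
    """Return a scrubbed copy of the event, or None to drop it."""
    if "[no-log]" in str(event.get("input")):  # hypothetical opt-out marker
        return None
    return {**event, "input": mask_pii(event.get("input"))}
```

Pass it as `redact=redact`. The function builds new containers, so it does not modify the event it is given.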

Any content you do send is encrypted at rest, sent over TLS, and governed by
your workspace's payload-storage setting. Keys are stored as hashes, never in
plain text.

## What it promises

- Your requests are never slowed. Events are sent on a background thread.
- Your app never breaks because of logging. Logging failures are never raised
  to your code.
- Your responses come back unchanged.
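
For readers curious how guarantees like these are commonly achieved, here is a generic sketch of the pattern: a bounded queue drained by a daemon thread, with every failure caught. It is not the package's implementation. `BackgroundSender`, `send`, and `max_pending` are hypothetical names.

```python
import queue
import threading


class BackgroundSender:
    """Queue events and deliver them on a daemon thread, never raising."""

    def __init__(self, send, on_error=None, max_pending=1000):
        self._send = send
        self._on_error = on_error
        self._events = queue.Queue(maxsize=max_pending)
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, event):
        """Return immediately; drop the event if the queue is full."""
        try:
            self._events.put_nowait(event)
        except queue.Full:
            pass

    def _run(self):
        while True:
            event = self._events.get()
            try:
                self._send(event)
            except Exception as error:
                if self._on_error is not None:
                    try:
                        self._on_error(error)
                    except Exception:
                        pass  # a faulty callback must not stop the worker
            finally:
                self._events.task_done()

    def flush(self):
        """Block until every queued event has been handled."""
        self._events.join()
```

`submit` never blocks and never raises, so the request path is unaffected even when the send function fails or the queue is full.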

## License

MIT
