Metadata-Version: 2.4
Name: sentrinode-llm
Version: 0.1.0
Summary: Drop-in observability for LLM calls — cost, tokens, latency, errors via OpenTelemetry GenAI conventions.
Author: SentriNode
License: MIT
Project-URL: Homepage, https://sentrinode.com
Project-URL: Repository, https://github.com/rg309/sentrinode
Keywords: observability,llm,openai,anthropic,opentelemetry,cost,tracing
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: System :: Monitoring
Classifier: Intended Audience :: Developers
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: opentelemetry-sdk>=1.20
Requires-Dist: opentelemetry-exporter-otlp-proto-http>=1.20
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.30; extra == "anthropic"
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == "openai"
Provides-Extra: dev
Requires-Dist: anthropic>=0.30; extra == "dev"
Requires-Dist: openai>=1.0; extra == "dev"

# sentrinode-llm

Drop-in observability for your LLM calls. Two lines, and every Anthropic /
OpenAI request is traced with **model, token usage, latency, finish reason, and
computed USD cost** — using the OpenTelemetry GenAI semantic conventions.

## Install

```bash
pip install sentrinode-llm                      # core package only
pip install "sentrinode-llm[anthropic,openai]"  # with one or both client extras
```

## Use

```python
import sentrinode_llm
sentrinode_llm.instrument()   # reads SENTRINODE_API_KEY + SENTRINODE_TENANT

import anthropic
client = anthropic.Anthropic()
client.messages.create(            # ← automatically traced
    model="claude-haiku-4-5-20251001",
    max_tokens=256,
    messages=[{"role": "user", "content": "hello"}],
)
```

That's it — no wrapping your calls, no decorators. The same works for OpenAI:

```python
from openai import OpenAI
client = OpenAI()
client.chat.completions.create(  # traced once instrument() has been called
    model="gpt-4o-mini", messages=[{"role": "user", "content": "hello"}])
```

## Configure

| Env var | Meaning | Default |
|---------|---------|---------|
| `SENTRINODE_API_KEY` | your tenant API key | — (required) |
| `SENTRINODE_TENANT` | your tenant slug | — (required) |
| `SENTRINODE_LLM_ENDPOINT` | ingest base URL | `https://api.sentrinode.com` |
| `SENTRINODE_SERVICE_NAME` | label for this app | `llm-app` |

Or pass them directly: `sentrinode_llm.instrument(api_key=..., tenant=...)`.
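
To catch a missing key at startup rather than at the first LLM call, the table above can be mirrored in a small pre-flight check. This is only a sketch and not part of `sentrinode-llm`: `resolve_config` and `DEFAULTS` are hypothetical names, and the library's own handling of unset or empty variables may differ.

```python
import os

# Defaults copied from the table above; None marks a required variable.
DEFAULTS = {
    "SENTRINODE_API_KEY": None,
    "SENTRINODE_TENANT": None,
    "SENTRINODE_LLM_ENDPOINT": "https://api.sentrinode.com",
    "SENTRINODE_SERVICE_NAME": "llm-app",
}


def resolve_config(env=None):
    """Hypothetical helper: return effective settings, or raise if a required one is unset."""
    env = os.environ if env is None else env
    resolved, missing = {}, []
    for name, default in DEFAULTS.items():
        # Treat an empty string the same as an unset variable (an assumption).
        value = env.get(name) or default
        if value is None:
            missing.append(name)
        resolved[name] = value
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return resolved


if __name__ == "__main__":
    demo = {"SENTRINODE_API_KEY": "sk-demo", "SENTRINODE_TENANT": "acme"}
    for name, value in resolve_config(demo).items():
        print(f"{name}={value}")
```

Calling `resolve_config()` with no argument reads the real process environment, so it can run just before `sentrinode_llm.instrument()`.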

## What gets recorded

Each call becomes one OpenTelemetry span:

| Attribute | Example |
|-----------|---------|
| `gen_ai.system` | `anthropic` |
| `gen_ai.request.model` | `claude-haiku-4-5-20251001` |
| `gen_ai.usage.input_tokens` | `1432` |
| `gen_ai.usage.output_tokens` | `211` |
| `gen_ai.usage.cost_usd` | `0.002486` |
| `gen_ai.response.finish_reasons` | `["end_turn"]` |
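
Because every span carries the same attribute keys, downstream aggregation is straightforward. The sketch below assumes the span attributes have already been exported and loaded as plain dictionaries; that representation, the `cost_by_model` helper, and the second sample record are illustrative assumptions, not something the library prescribes.

```python
from collections import defaultdict


def cost_by_model(spans):
    """Sum gen_ai.usage.cost_usd per gen_ai.request.model over attribute dicts."""
    totals = defaultdict(float)
    for attrs in spans:
        model = attrs.get("gen_ai.request.model", "unknown")
        totals[model] += attrs.get("gen_ai.usage.cost_usd", 0.0)
    return dict(totals)


# Sample records: the first mirrors the table above, the second is made up.
spans = [
    {"gen_ai.request.model": "claude-haiku-4-5-20251001", "gen_ai.usage.cost_usd": 0.002486},
    {"gen_ai.request.model": "gpt-4o-mini", "gen_ai.usage.cost_usd": 0.000150},
]
print(cost_by_model(spans))
```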

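Cost is computed from the per-model prices in `pricing.py`. You can override the
price for any model; as the parameter names indicate, rates are per 1M tokens: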

```python
sentrinode_llm.set_price("my-model", input_per_1m=2.0, output_per_1m=8.0)
```
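
Assuming the `input_per_1m` and `output_per_1m` rates are in USD and that cost is linear in the token counts, a per-call cost can be estimated as follows. `estimate_cost_usd` is an illustrative helper, not a function exported by the library, and the actual logic in `pricing.py` may differ.

```python
def estimate_cost_usd(input_tokens, output_tokens, input_per_1m, output_per_1m):
    """Illustrative only: linear cost with rates quoted per 1M tokens."""
    return (input_tokens * input_per_1m + output_tokens * output_per_1m) / 1_000_000


# Rates from the set_price example above, token counts from the attribute table.
cost = estimate_cost_usd(1432, 211, input_per_1m=2.0, output_per_1m=8.0)
print(f"{cost:.6f}")  # prints 0.004552
```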

## Not yet

- Streaming responses pass through untraced, because token usage isn't known
  until the stream ends. Support is coming next.
- Async clients and embeddings are not instrumented yet; both are planned for
  the next iteration.
