Metadata-Version: 2.4
Name: pennsieve-llm
Version: 0.4.1
Summary: Thin configuration helper for the Pennsieve LLM Governor — returns a pre-configured anthropic.Anthropic client.
License-Expression: MIT
Requires-Python: >=3.10
Requires-Dist: anthropic>=0.40.0
Requires-Dist: boto3>=1.35.0
Requires-Dist: httpx>=0.27.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

# pennsieve-llm

Thin Python configuration helper for the Pennsieve LLM Governor.

This library returns a pre-configured `anthropic.Anthropic` client pointed at the Pennsieve LLM Governor with SigV4 auth wired up. Streaming, tool use, prompt caching, extended thinking — every Anthropic SDK feature works because you're using the real Anthropic SDK.

## Installation

```bash
pip install pennsieve-llm
```

Requires Python 3.10+. `anthropic`, `httpx`, and `boto3` are installed automatically as dependencies.

## Quick Start

```python
from pennsieve_llm import Governor, MODEL_SONNET_45

gov = Governor()  # auto-configures from $LLM_GOVERNOR_URL + AWS creds

resp = gov.client().messages.create(
    model=MODEL_SONNET_45,
    messages=[{"role": "user", "content": "Hello, world!"}],
    max_tokens=1024,
)
print(resp.content[0].text)
```

The object returned by `gov.client()` **is** `anthropic.Anthropic`. Everything in [the Anthropic Python SDK docs](https://github.com/anthropics/anthropic-sdk-python) applies.
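Note that `resp.content[0].text` assumes the first content block is a text block. With tool use or extended thinking enabled, other block types can appear first. A minimal sketch of a more defensive helper follows; `extract_text` is a hypothetical name, not part of `pennsieve-llm` or the Anthropic SDK, and the blocks below are stand-ins shaped like SDK content blocks:

```python
from types import SimpleNamespace


def extract_text(content) -> str:
    """Join the text of every block whose type is "text", skipping other block types."""
    return "".join(
        block.text for block in content if getattr(block, "type", None) == "text"
    )


# Stand-in blocks for illustration only; real code would pass resp.content.
blocks = [
    SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather", input={}),
    SimpleNamespace(type="text", text="Hello, "),
    SimpleNamespace(type="text", text="world!"),
]
print(extract_text(blocks))
```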

## Streaming

```python
from pennsieve_llm import Governor, MODEL_SONNET_45

gov = Governor()

with gov.client().messages.stream(
    model=MODEL_SONNET_45,
    messages=[{"role": "user", "content": "Write a poem"}],
    max_tokens=1024,
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```

## Tool use

```python
resp = gov.client().messages.create(
    model=MODEL_SONNET_45,
    max_tokens=1024,
    tools=[{
        "name": "get_weather",
        "description": "Get current weather",
        "input_schema": {...},  # placeholder: supply the JSON Schema for the tool's arguments
    }],
    messages=[{"role": "user", "content": "What's the weather in SF?"}],
)
```
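When the model decides to call a tool, the response contains `tool_use` blocks (each with an `id`, a `name`, and an `input` dict), and the caller answers with `tool_result` blocks in a follow-up user message. The sketch below shows one way to dispatch those blocks. The schema, the `get_weather` implementation, and the helper names are all hypothetical, since the example above leaves `input_schema` elided:

```python
from types import SimpleNamespace

# Hypothetical schema for illustration only.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Get current weather",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> str:
    """Placeholder implementation; a real one would call a weather service."""
    return f"Sunny in {city}"


HANDLERS = {"get_weather": get_weather}


def build_tool_results(content) -> list[dict]:
    """Run the handler for each tool_use block and build matching tool_result blocks."""
    results = []
    for block in content:
        if getattr(block, "type", None) != "tool_use":
            continue  # text and other block types need no tool call
        output = HANDLERS[block.name](**block.input)
        results.append(
            {"type": "tool_result", "tool_use_id": block.id, "content": output}
        )
    return results


# Stand-in for resp.content, shaped like the SDK's tool_use block.
fake_content = [
    SimpleNamespace(type="tool_use", id="toolu_01", name="get_weather", input={"city": "SF"})
]
tool_results = build_tool_results(fake_content)
```

To continue the conversation, append `{"role": "assistant", "content": resp.content}` and `{"role": "user", "content": tool_results}` to `messages` and call `messages.create` again with the same `tools` list.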

## Files on EFS

The governor supports referencing files on EFS without base64-encoding them into the request — use the `efs_document` content block builder:

```python
from pennsieve_llm import Governor, MODEL_SONNET_45

gov = Governor()
resp = gov.client().messages.create(
    model=MODEL_SONNET_45,
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "Summarize this PDF:"},
        gov.efs_document("workdir/paper.pdf"),
    ]}],
    max_tokens=1024,
)
```

Files are read server-side by the governor with execution-scoped access controls.

## Backend selection

| Env var set | Behavior |
|---|---|
| `LLM_GOVERNOR_URL` | Returns `anthropic.Anthropic` configured for the Pennsieve governor (HTTPS + SigV4) |
| (none) | Returns `MockClient` for tests / offline use |

For local development against `api.anthropic.com`, use `anthropic.Anthropic()` directly with your API key. This library deliberately does not wrap that path.

### Environment variables

| Variable | Source | Purpose |
|---|---|---|
| `LLM_GOVERNOR_URL` | Platform-injected (compute node) | Governor Function URL |
| `EXECUTION_RUN_ID` | Platform-injected (workflow) | Cost attribution; attached as `x-execution-run-id` header on every request |
| `AWS_REGION` | Default or override | SigV4 signing region (default `us-east-1`) |

Constructor arguments override env vars:

```python
gov = Governor(
    url="https://abc.lambda-url.us-east-1.on.aws",
    execution_run_id="run-123",
    region="us-east-1",
)
```
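The precedence described above (constructor argument, then environment variable, then default) can be pictured with a small stand-alone helper. This is a sketch of the rule only, not the library's actual code. `resolve_setting` and the example URLs are hypothetical:

```python
import os


def resolve_setting(explicit, env_name, default=None, env=None):
    """Return the explicit argument if given, else the env var, else the default."""
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit
    return env.get(env_name, default)


# An injected mapping stands in for os.environ so the result is deterministic.
fake_env = {"LLM_GOVERNOR_URL": "https://from-env.example", "AWS_REGION": "us-west-2"}

url = resolve_setting("https://from-arg.example", "LLM_GOVERNOR_URL", env=fake_env)
region = resolve_setting(None, "AWS_REGION", "us-east-1", env=fake_env)
run_id = resolve_setting(None, "EXECUTION_RUN_ID", env=fake_env)
```

Here `url` comes from the argument, `region` from the environment, and `run_id` is `None` because neither source provides it.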

## Governor-specific operations

`Governor.check_budget()` and `Governor.list_models()` query Pennsieve-specific endpoints (`GET /v1/budget`, `GET /v1/models`) — these aren't part of the Anthropic API.

```python
budget = gov.check_budget()
print(f"${budget['periodRemainingUsd']:.2f} remaining this {budget['budgetPeriod']}")

models = gov.list_models()
for m in models["models"]:
    print(m["modelId"], m["status"])
```
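One possible use of the budget payload is a guard that runs before an expensive call. The sketch below relies only on the `periodRemainingUsd` field shown above. `has_budget`, the threshold, and the sample values are assumptions for illustration:

```python
def has_budget(budget: dict, min_remaining_usd: float = 0.01) -> bool:
    """True when the remaining budget for the period is at least the threshold."""
    return float(budget.get("periodRemainingUsd", 0.0)) >= min_remaining_usd


# Made-up sample payload; a real response may carry more fields.
sample = {"periodRemainingUsd": 12.5, "budgetPeriod": "month"}

if not has_budget(sample, min_remaining_usd=1.00):
    raise RuntimeError("LLM budget exhausted for this period")

# With a real governor, the same check would read:
#   if not has_budget(gov.check_budget(), min_remaining_usd=1.00): ...
```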

## Testing

Without `LLM_GOVERNOR_URL` set, `Governor()` auto-selects `MockClient`. You can also inject one explicitly:

```python
from pennsieve_llm import Governor, MockClient

mock = MockClient()
mock.set_response(text="42", input_tokens=5, output_tokens=1)

gov = Governor(client=mock)
resp = gov.client().messages.create(
    model="test", messages=[{"role": "user", "content": "what is 6*7?"}], max_tokens=10
)
assert resp.content[0].text == "42"

# Inspect what was sent
assert mock.calls[0]["model"] == "test"
```

The `MockClient` mimics the parts of `anthropic.Anthropic`'s surface that typical caller code uses — enough for unit tests without making real network calls.

## Available model constants

| Constant | Bedrock inference profile ID |
|---|---|
| `MODEL_HAIKU_45` | `us.anthropic.claude-haiku-4-5-20251001-v1:0` |
| `MODEL_SONNET_4` | `us.anthropic.claude-sonnet-4-20250514-v1:0` |
| `MODEL_SONNET_45` | `us.anthropic.claude-sonnet-4-5-20250929-v1:0` |
| `MODEL_SONNET_46` | `us.anthropic.claude-sonnet-4-6` |
| `MODEL_OPUS_47` | `us.anthropic.claude-opus-4-7` |

The `us.` prefix selects a US inference profile, which keeps inference in US AWS regions. This is a HIPAA-friendly default for Pennsieve customers.

## Error handling

Errors from the chat path (`messages.create`) come back as Anthropic SDK exceptions — use [the Anthropic SDK's exception hierarchy](https://github.com/anthropics/anthropic-sdk-python#error-handling):

```python
import anthropic
from pennsieve_llm import Governor

gov = Governor()
try:
    resp = gov.client().messages.create(...)
except anthropic.BadRequestError as e:
    # 400 — bad request shape, model not allowed, etc.
    print(e.body)
except anthropic.RateLimitError as e:
    # 429
    pass
```
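The Anthropic SDK already retries some failures on its own (see its `max_retries` client option), so an application-level retry is only needed when a workflow should wait longer than that. A minimal sketch of exponential backoff with jitter follows. `call_with_backoff` and its parameters are hypothetical names, and the exception type is passed in so the helper does not depend on any SDK:

```python
import random
import time


def call_with_backoff(fn, retry_on, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on an exception in retry_on, wait with exponential backoff and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delays grow as 1x, 2x, 4x ... base_delay, plus up to 10% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))


# Intended usage with a real governor (not executed here):
#   resp = call_with_backoff(
#       lambda: gov.client().messages.create(...),
#       retry_on=anthropic.RateLimitError,
#   )
```

Passing `sleep` as a parameter keeps the helper easy to unit test without real waiting.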

For governor-specific endpoints (`check_budget`, `list_models`), the SDK raises `GovernorError`:

```python
from pennsieve_llm import Governor, GovernorError

gov = Governor()

try:
    budget = gov.check_budget()
except GovernorError as e:
    print(f"{e.code}: {e.msg}")
```

## Migration from v0.3.x

v0.4.0 dropped the parallel type system (`InvokeRequest`, `InvokeResponse`, `Backend`, `LambdaBackend`, `AnthropicBackend`) in favor of returning the real `anthropic.Anthropic` client. **The convenience methods (`gov.ask`, `gov.ask_with_system`, `gov.ask_about_file`) are gone** — call `gov.client().messages.create(...)` directly. Net effect: less SDK code, more Anthropic features available.

Before (v0.3.x):

```python
text = gov.ask(MODEL_SONNET_46, "Hello")
```

After (v0.4.0+):

```python
resp = gov.client().messages.create(
    model=MODEL_SONNET_46,
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=1024,
)
text = resp.content[0].text
```
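Codebases with many `gov.ask(...)` call sites may prefer a small local shim over rewriting each call. The sketch below is one way to write it. The `(model, prompt)` argument order is taken from the "Before" snippet, while the `max_tokens` default of 1024 mirrors the "After" snippet and is an assumption about the old behavior. The fake governor exists only to make the example self-contained:

```python
from types import SimpleNamespace


def ask(gov, model: str, prompt: str, max_tokens: int = 1024) -> str:
    """Send a single user message and return the text of the first content block."""
    resp = gov.client().messages.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    return resp.content[0].text


# Stand-in governor for illustration only; real code would pass a Governor instance.
def _fake_create(**kwargs):
    return SimpleNamespace(content=[SimpleNamespace(type="text", text="Hi there")])


fake_gov = SimpleNamespace(
    client=lambda: SimpleNamespace(messages=SimpleNamespace(create=_fake_create))
)
print(ask(fake_gov, "test-model", "Hello"))
```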

## Development

```bash
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest
```

## License

MIT
