Metadata-Version: 2.4
Name: octoryn-llm
Version: 0.1.0
Summary: Official Python SDK for Octoryn LLM (Octopus Core Pty Ltd)
Project-URL: Homepage, https://octopusos.dev
Author: Octopus Core Pty Ltd
License: Proprietary
Keywords: audit,byok,llm,octoryn,openai-compatible
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: httpx>=0.27
Requires-Dist: pydantic>=2.9
Requires-Dist: typing-extensions>=4.12
Provides-Extra: dev
Requires-Dist: openai>=1.50; extra == 'dev'
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: pytest-asyncio; extra == 'dev'
Requires-Dist: respx; extra == 'dev'
Description-Content-Type: text/markdown

# octoryn-llm

**Octoryn LLM Python SDK — OpenAI-compatible client for Octoryn Gateway.**

`octoryn-llm` ships an `Octoryn` / `AsyncOctoryn` client that talks to the
Octoryn Gateway (default: `https://api.octopusos.dev/v1`). The chat / images /
audio / embeddings / moderations surface is byte-for-byte OpenAI-compatible,
plus first-class extensions for audit, usage, and BYOK.

## Install

```bash
pip install octoryn-llm
```

Requires Python 3.10 or newer.

## Quickstart

```python
from octoryn import Octoryn

client = Octoryn(api_key="oct_live_...")  # or set OCTORYN_API_KEY
out = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, Octoryn"}],
)
print(out.choices[0].message.content)
```

## Authentication

### API key (recommended)

Get a key in the web console at <https://app.octopusos.dev>, then set
`OCTORYN_API_KEY` in your environment:

```python
import os
from octoryn import Octoryn

client = Octoryn(api_key=os.environ["OCTORYN_API_KEY"])
```

### OIDC (refresh-token)

For human / SSO flows, the SDK ships an OAuth2 refresh-token credential. It
auto-refreshes the short-lived access token (single-flight; thread- and
asyncio-safe) and lets you persist the new token set with `on_refresh`:

```python
import json, pathlib
from octoryn import Octoryn, OctorynOidcCredential

TOKEN_PATH = pathlib.Path.home() / ".octoryn" / "tokens.json"

def persist(token_set: dict) -> None:
    TOKEN_PATH.parent.mkdir(parents=True, exist_ok=True)
    TOKEN_PATH.write_text(json.dumps(token_set))

cached = json.loads(TOKEN_PATH.read_text()) if TOKEN_PATH.exists() else {}

cred = OctorynOidcCredential(
    identity_url="https://identity.octopusos.dev",
    client_id="cli",
    refresh_token=cached.get("refresh_token", "rt_..."),
    access_token=cached.get("access_token"),
    expires_at=cached.get("expires_at"),
    on_refresh=persist,
)
client = Octoryn(credential=cred)
```

Pass either `api_key=` *or* `credential=`, never both.

## Use the official `openai` SDK (drop-in)

The Octoryn Gateway is byte-for-byte OpenAI-compatible for chat, images, audio,
embeddings, and moderations. Point the official `openai` SDK at the Octoryn
base URL and your existing code keeps working:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.octopusos.dev/v1",
    api_key="oct_live_...",
)
client.chat.completions.create(model="gpt-4o-mini", messages=[...])
```

If you'd rather import from `octoryn` while keeping the OpenAI surface, use
the bundled alias:

```python
from octoryn.openai_compat import OpenAI  # subclass of Octoryn
```

## Modality cookbook

### Chat

```python
out = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize the OSI model in one sentence."}],
)
print(out.choices[0].message.content)

# Stream
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Count 1..5"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
```

### Images

```python
res = client.images.generate(
    model="dall-e-3",
    prompt="a small octopus drawing a vector logo, flat illustration",
    size="1024x1024",
)
print(res.data[0].url)
```

### Speech (TTS)

```python
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Octoryn is online.",
    response_format="mp3",
)
with open("out.mp3", "wb") as f:
    f.write(speech.content)
```

### Transcription (STT)

```python
tx = client.audio.transcriptions.create(
    file="meeting.mp3",      # path, bytes, or file-like
    model="whisper-1",
    language="en",
)
print(tx.text)
```

### Video

```python
job = client.videos.generate_and_wait(
    model="veo-3",
    prompt="a slow-motion octopus opening a jar underwater",
    duration_seconds=5,
    resolution="720p",
    aspect_ratio="16:9",
)
print(job.status, job.video_url)
```

### Realtime

```python
session = client.realtime.sessions.create(
    provider="openai",
    model="gpt-4o-realtime-preview",
    voice="verse",
)
print(session.ws_url, session.audio_sample_rate)
```

The SDK does not bundle a WebRTC/WebSocket client. Connect to `ws_url` with
your transport of choice (e.g. `websockets`, `aiortc`) and exchange events
following the underlying provider's realtime protocol:

```python
import asyncio, json
import websockets  # third-party transport, not bundled with the SDK

async def run(ws_url: str) -> None:
    async with websockets.connect(ws_url) as ws:
        # Event shapes follow the provider's realtime protocol.
        await ws.send(json.dumps({"type": "input_audio_buffer.append", "audio": "..."}))
        async for msg in ws:
            print(msg)

asyncio.run(run(session.ws_url))
```

### Embeddings

```python
emb = client.embeddings.create(
    model="text-embedding-3-small",
    input=["hello", "world"],
)
print(len(emb.data), len(emb.data[0].embedding))
```
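Embedding vectors are typically compared with cosine similarity. A minimal
pure-Python helper (no SDK required; the vectors below are illustrative, not
real model output):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

print(cosine([1.0, 0.0], [1.0, 0.0]))  # 1.0 (identical direction)
print(cosine([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

Feed it two entries from `emb.data[i].embedding` to rank texts by similarity.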

### Moderations

```python
mod = client.moderations.create(input="I love clean code.")
print(mod.results[0].flagged)

```

### Models

```python
print(client.models.list())
```

## Async

```python
import asyncio
from octoryn import AsyncOctoryn

async def main() -> None:
    async with AsyncOctoryn(api_key="oct_live_...") as client:
        out = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "ping"}],
        )
        print(out.choices[0].message.content)

asyncio.run(main())
```

## Errors & retries

| Class                          | When                                               |
| ------------------------------ | -------------------------------------------------- |
| `OctorynAuthError`             | `401` / `403` — bad / missing / unauthorized key   |
| `OctorynRateLimitedError`      | `429 rate_limited` — exposes `retry_after_s`       |
| `OctorynQuotaExceededError`    | `429 quota_exceeded` — exposes `retry_at`          |
| `OctorynKsiBlockedError`       | `422 ksi_blocked` — exposes `reasons`, `channel`   |
| `OctorynUpstreamError`         | `503 upstream_*` — exposes `upstream`              |
| `OctorynAPIStatusError`        | Any other non-2xx HTTP                             |
| `OctorynStreamInterrupted`     | SSE stream terminated mid-flight                   |

```python
import time
from octoryn import OctorynRateLimitedError

try:
    client.chat.completions.create(model="gpt-4o-mini", messages=[...])
except OctorynRateLimitedError as e:
    time.sleep(e.retry_after_s or 1.0)
```

The client retries `5xx` and transport errors up to `max_retries=3` with
exponential backoff (factor `0.5s`). Override per-client with
`Octoryn(api_key=..., max_retries=5)`.
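
The resulting delays can be sketched as follows, assuming the common
`factor * 2**attempt` shape (an assumption for illustration; jitter and the
exact formula are implementation details of the client):

```python
# Default: max_retries=3, backoff factor 0.5s.
factor = 0.5
delays = [factor * 2 ** attempt for attempt in range(3)]  # retries 0, 1, 2
print(delays)  # [0.5, 1.0, 2.0]
```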

## BYOK

Register an upstream credential the gateway should use on your behalf:

```python
created = client.byok.create(
    upstream="openai",       # openai | anthropic | openrouter | ...
    api_key="sk-...",        # never re-displayed by the gateway
    label="prod",
)
print(created.id)
```

## Audit & Usage

```python
runs = client.audit.list(limit=20)
detail = client.audit.get(run_id=runs.items[0].run_id)
verified = client.audit.verify(run_id=detail.run_id)
```

```python
summary = client.usage.get()
records = client.usage.records(model="gpt-4o-mini", limit=50)
```

## Versioning & support

Octoryn follows semver from `1.0.0`; pre-1.0 minor releases may include
breaking changes. See `CHANGELOG.md` and <https://octopusos.dev>.

## License

Proprietary © Octopus Core Pty Ltd (ACN 696 931 236). Octoryn™ is a trademark
of Octopus Core Pty Ltd.
