Metadata-Version: 2.4
Name: sunwaee
Version: 1.3.1
Summary: SUNWÆE gen — multi-provider LLM engine library.
Author: David NAISSE
Maintainer: David NAISSE
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.27.0
Provides-Extra: files
Requires-Dist: pypdf>=4.0.0; extra == "files"
Requires-Dist: python-docx>=1.1.0; extra == "files"
Requires-Dist: openpyxl>=3.1.0; extra == "files"
Requires-Dist: python-pptx>=1.0.0; extra == "files"
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=1.0.0; extra == "dev"
Requires-Dist: pytest-cov>=5.0.0; extra == "dev"
Requires-Dist: setuptools_scm>=8; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: build; extra == "dev"
Dynamic: license-file

![Coverage](https://img.shields.io/badge/coverage-100%25-brightgreen) ![Python](https://img.shields.io/badge/python-3.11%2B-blue) ![PyPI](https://img.shields.io/pypi/v/sunwaee) ![License](https://img.shields.io/badge/license-MIT-blue)

All LLMs, one response format, one dependency (httpx). Supports switching models mid-conversation (e.g. draft with GPT, refine with Claude).

Handles streaming, tool calls, file attachments, prompt caching, extended thinking, and cost tracking across Anthropic, OpenAI, Google, DeepSeek, xAI, and Moonshot.

---

## Install

```bash
pip install sunwaee
pip install "sunwaee[files]"   # pdf, docx, xlsx, pptx extraction
pip install -e ".[dev,files]"  # development
```

---

## Quick start

```python
import asyncio
from sunwaee.modules.gen.engine import get_engine
from sunwaee.modules.gen.engine.types import Message, Role

engine = get_engine("anthropic", "claude-sonnet-4-6")  # reads ANTHROPIC_API_KEY

async def main():
    messages = [Message(role=Role.USER, content="Hello")]

    response = await engine.chat(messages)
    print(response.content, response.cost.total)

    async for chunk in engine.stream(messages):
        if chunk.content:
            print(chunk.content, end="", flush=True)

asyncio.run(main())
```

---

## Providers

| Provider  | `provider=`   | Env var             |
| --------- | ------------- | ------------------- |
| Anthropic | `"anthropic"` | `ANTHROPIC_API_KEY` |
| OpenAI    | `"openai"`    | `OPENAI_API_KEY`    |
| Google    | `"google"`    | `GOOGLE_API_KEY`    |
| DeepSeek  | `"deepseek"`  | `DEEPSEEK_API_KEY`  |
| xAI       | `"xai"`       | `XAI_API_KEY`       |
| Moonshot  | `"moonshot"`  | `MOONSHOT_API_KEY`  |

---

## Directory structure

```
sunwaee/
├── core/
│   ├── logger.py                 # get_logger(name) — scoped under "sunwaee.*"
│   └── tools.py                  # @tool decorator, ok(), err()
└── modules/gen/
    ├── __init__.py               # public re-exports (get_engine, run, stream_run, …)
    ├── agent.py                  # ReAct loop — run() + stream_run()
    ├── tools.py                  # TOOLS list
    └── engine/
        ├── __init__.py           # get_engine, Message, Response, Tool, …
        ├── base.py               # BaseEngine ABC
        ├── factory.py            # get_engine() — provider routing + connection pooling
        ├── model.py              # Model dataclass + compute_cost()
        ├── types.py              # Message, Response, ToolCall, Usage, Cost, Performance, …
        ├── models/               # model registry per provider
        │   ├── __init__.py       # get_model(), list_models()
        │   └── anthropic.py / openai.py / google.py / deepseek.py / xai.py / moonshot.py
        └── providers/
            ├── anthropic.py      # AnthropicEngine
            ├── openai.py         # OpenAIEngine (also used by DeepSeek, xAI, Moonshot)
            └── google.py         # GoogleEngine

tests/gen/
├── test_agent.py / test_stream_agent.py / test_tools.py
└── engine/
    ├── test_types.py / test_factory.py / test_model.py
    ├── providers/
    │   └── test_anthropic.py / test_openai.py / test_google.py
    └── live/
        ├── test_providers.py     # real API calls, all providers × all scenarios
        └── run/                  # JSON snapshots (gitignored)
```

---

## Core types (`engine/types.py`)

```python
class Role(Enum):       SYSTEM, USER, ASSISTANT, TOOL, CONTEXT
class StopReason(Enum): END_TURN, TOOL_USE, MAX_TOKENS

@dataclass class Message:
    role: Role
    content: str | None
    reasoning_content: str | None       # thinking for models that support it
    reasoning_signature: str | None     # opaque blob — echo back verbatim
    tool_call_id: str | None            # set on Role.TOOL messages
    tool_calls: list[ToolCall] | None
    attachments: list[FileAttachment] | None   # Role.USER only

@dataclass class Error:
    message: str = ""; status_code: int = 0    # defined but never populated — errors are raised

@dataclass class Response:
    provider: str; model: str; streaming: bool; synthetic: bool
    content: str | None; reasoning_content: str | None; reasoning_signature: str | None
    tool_calls: list[ToolCall] | None; stop_reason: StopReason | None; error: Error | None
    usage: Usage | None; cost: Cost | None; performance: Performance | None

@dataclass class ToolCall:
    id: str; name: str; arguments: dict
    thought_signature: str | None    # Google only — echo back every subsequent turn
    error: str | None; duration: float; results: list[dict]

@dataclass class Usage:
    input_tokens: int; output_tokens: int; total_tokens: int
    cache_read_tokens: int; cache_write_tokens: int

@dataclass class Cost:
    input: float; output: float; cache_read: float; cache_write: float; total: float

@dataclass class Performance:
    latency: float            # seconds to first chunk
    reasoning_duration: float; content_duration: float; total_duration: float
    throughput: int           # output tokens / second

@dataclass class FileAttachment:
    data: bytes; filename: str; media_type: str = ""
    # text/* → <file name="…">…</file> block
    # image/jpeg|png|gif|webp → base64 inline
    # application/pdf|json + OOXML (docx/xlsx/pptx) → extracted text
```

---

## Usage

### With tools

```python
from sunwaee.core.tools import tool, ok, err
from sunwaee.modules.gen.engine.types import Tool

@tool("Return the current UTC time.")
def get_time() -> str:
    from datetime import datetime, timezone
    return ok({"time": datetime.now(timezone.utc).isoformat()})

response = await engine.chat(messages, tools=[get_time._tool])
```

### File and image attachments

```python
from sunwaee.modules.gen.engine.types import FileAttachment, Message, Role

with open("report.pdf", "rb") as f:
    att = FileAttachment(data=f.read(), filename="report.pdf")

response = await engine.chat([Message(role=Role.USER, content="Summarise.", attachments=[att])])
```

Supported: `text/*`, `application/json`, `image/jpeg|png|gif|webp`, `application/pdf`, `.docx`, `.xlsx`, `.pptx`
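
`media_type` defaults to empty above; if you want to set it explicitly, the stdlib `mimetypes` module covers the common extensions. This snippet is plain stdlib, not part of sunwaee:

```python
import mimetypes

# Infer a MIME type from the filename; returns None for unknown extensions.
media_type, _ = mimetypes.guess_type("report.pdf")
print(media_type)  # application/pdf

image_type, _ = mimetypes.guess_type("photo.png")
print(image_type)  # image/png
```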

### ReAct agent loop

```python
from sunwaee.modules.gen.agent import stream_run

new_messages = []
async for chunk in stream_run(messages, tools, engine, new_messages=new_messages):
    if chunk.content:
        print(chunk.content, end="", flush=True)
# new_messages has all assistant + tool turns appended during the run
```

Up to 10 iterations by default. Concurrent tool calls via `asyncio.gather`. Sync tools dispatched via `run_in_executor`.
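
The dispatch rule above can be sketched in plain `asyncio`; the names here (`dispatch`, the two tools) are illustrative, not the agent's internals:

```python
import asyncio
import inspect

def slow_sync_tool(x: int) -> int:
    return x * 2

async def fast_async_tool(x: int) -> int:
    return x + 1

async def dispatch(call, *args):
    # Async tools run on the event loop; sync tools go to a thread
    # executor so they don't block the concurrent calls.
    if inspect.iscoroutinefunction(call):
        return await call(*args)
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, call, *args)

async def main():
    # All tool calls from one assistant turn run concurrently.
    return await asyncio.gather(
        dispatch(slow_sync_tool, 21),
        dispatch(fast_async_tool, 41),
    )

print(asyncio.run(main()))  # [42, 42]
```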

### Listing models

```python
from sunwaee.modules.gen.engine.models import list_models, get_model

all_models = list_models()              # list[Model]
model = get_model("claude-sonnet-4-6")  # Model | None
```

---

## Testing

```bash
pytest tests/gen/ -m "not live"                                        # unit (no keys needed)
pytest tests/gen/ -m live                                              # live (real API calls)
pytest tests/gen/ -m "not live" --cov=sunwaee --cov-report=term-missing
```

Unit test conventions:

- Mock `httpx.AsyncClient` — never make real HTTP calls
- Assert `response.cost`, `response.usage`, `response.performance` populated on final chunk
- For streaming, use an async generator as mock transport
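
A minimal shape for that async-generator mock, standing in for `aiter_lines()` on a streaming response; the payload format here is illustrative, not any provider's actual SSE schema:

```python
import asyncio
import json

async def fake_sse_lines():
    # Stands in for the streamed lines of a mocked httpx response.
    yield 'data: {"content": "Hel"}'
    yield 'data: {"content": "lo"}'
    yield "data: [DONE]"

async def consume(lines):
    parts = []
    async for line in lines:
        payload = line.removeprefix("data: ")
        if payload == "[DONE]":
            break
        parts.append(json.loads(payload)["content"])
    return "".join(parts)

print(asyncio.run(consume(fake_sse_lines())))  # Hello
```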

Live test scenarios (all providers × chat + stream):

| Scenario           | What it tests                                   |
| ------------------ | ----------------------------------------------- |
| `ONLY_SYSTEM`      | System-only input edge case; lenient assertions |
| `ONLY_USER`        | Single user message                             |
| `SYSTEM_AND_USER`  | System prompt respected in response             |
| `TOOL_CALL`        | Model must issue at least one tool call         |
| `TOOL_CALL_RESULT` | Full multi-turn with real tool IDs/signatures   |
| `FILE_ATTACHMENT`  | Text file attached; asserts content populated   |
| `CONTEXT_ROLE`     | `Role.CONTEXT` message handled without errors   |

Image attachments tested separately, parametrized over vision-capable engines only.

---

## How to add a model

**File:** `sunwaee/modules/gen/engine/models/<provider>.py`

```python
Model(
    name="provider-model-name",
    display_name="Human Readable Name",
    provider="anthropic",
    context_window=200_000,
    max_output_tokens=64_000,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
    cache_read_price_per_mtok=0.3,
    cache_write_price_per_mtok=3.75,
    input_price_per_mtok_200k=6.0,     # omit if no >200k tier
    output_price_per_mtok_200k=22.5,
    supports_vision=True, supports_tools=True,
    supports_thinking=True, supports_reasoning_tokens=True,
    cache_min_tokens=1_024,            # omit (None) if caching is undocumented
    release_date="2025-01-01",
)
```

**Pricing tiers** (`engine/model.py`): base required; `_128k` when `input_tokens > 128_000` (xAI only); `_200k` when `> 200_000`; `_272k` when `> 272_000` (OpenAI only). Thresholds are strict `>` — exactly at the boundary uses the lower tier.

**`cache_min_tokens`** — minimum tokens required at a cache breakpoint for prompt caching to activate. `None` = no caching. `0` = no minimum (caches everything). Known values:

| Provider  | Minimum | Models                                  |
| --------- | ------- | --------------------------------------- |
| Anthropic | 4,096   | Opus 4.6, Opus 4.5, Haiku 4.5          |
| Anthropic | 2,048   | Sonnet 4.6                              |
| Anthropic | 1,024   | Sonnet 4.5                              |
| OpenAI    | 1,024   | All models (automatic prefix caching)   |
| Google    | 1,024   | All models (explicit context caching)   |
| xAI       | 0       | All models (automatic, no minimum)      |
| DeepSeek  | 64      | All models (automatic prefix caching)   |
| Moonshot  | 0       | All models (automatic, no minimum)      |

**Tests:** Add assertions in `engine/test_model.py` for non-standard pricing tiers.

---

## How to add an OpenAI-compatible provider

1. `engine/models/<provider>.py` — `MODELS` list
2. `engine/models/__init__.py` — import + add to `_ALL`
3. `engine/factory.py` — add to `_OPENAI_COMPATIBLE: dict[str, str]` (env var auto-derived as `PROVIDER_API_KEY`)
4. `tests/gen/engine/live/test_providers.py` — add `("provider", "cheapest-model")` to `ENGINES`

---

## How to add a provider with a custom API

1. `engine/models/<provider>.py` + register in `__init__.py`
2. `engine/providers/<provider>.py` — implement `BaseEngine`:
   - `async def chat(self, messages, tools=None) -> Response`
   - `async def stream(self, messages, tools=None) -> AsyncIterator[Response]`
   - Accept `client: httpx.AsyncClient | None = None`
   - Call `resolve_tokens(usage)` before `compute_cost`
   - Strip `reasoning_content`/`reasoning_signature` from all but the last assistant turn
   - Handle system-only input: promote to `Role.USER`
   - On 4xx/5xx in streaming: read full body before raising
   - Buffer tool call JSON across SSE chunks; parse only on stop
3. `engine/factory.py` — wire into `get_engine()`
4. Tests: unit (`providers/test_<provider>.py`) + live entry
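
The buffering rule in step 2 ("parse only on stop") in miniature; the fragment shapes are illustrative:

```python
import json

# Argument JSON for a tool call arrives split across SSE chunks; parsing
# any single fragment would fail, so accumulate until the stop event.
fragments = ['{"quer', 'y": "weath', 'er in Paris"}']

buffer = ""
for fragment in fragments:
    buffer += fragment  # never json.loads() mid-stream

arguments = json.loads(buffer)  # parse once, on stop
print(arguments)  # {'query': 'weather in Paris'}
```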

---

## How to add a tool to the agent

```python
from typing import Annotated
from sunwaee.core.tools import tool, ok, err

@tool("Search the web for current information.")
def web_search(
    query: Annotated[str, "The search query"],
    num_results: Annotated[int, "Number of results"] = 5,
) -> str:
    try:
        return ok(_do_search(query, num_results))
    except Exception as e:
        return err(str(e))
```

Register: add `web_search._tool` to `TOOLS` in `sunwaee/modules/gen/tools.py`.

Tests: `tests/gen/test_<tool_name>.py` — call directly, assert JSON output shape, test error path. Never call real external APIs.

---

## `@tool` decorator

Introspects signature to build JSON Schema `parameters` automatically.

Supports: `str`, `int`, `float`, `bool`, `list[T]`, `Literal[...]`, `Optional[T]`, `Annotated[T, "description"]`

- Parameters with defaults → not `required`
- Both sync and async supported
- Must return JSON string: `ok(data)` / `err(message)` / `json.dumps(...)`

```python
ok({"id": "123"})   # '{"ok": true, "data": {"id": "123"}}'
err("Not found")    # '{"ok": false, "error": "Not found"}'
```
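
For illustration, one way a decorator like this could derive the schema from `Annotated` hints; this is a sketch of the technique, not sunwaee's implementation:

```python
import inspect
import typing

PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_for(fn) -> dict:
    # Build a JSON Schema "parameters" object from Annotated[T, "description"].
    hints = typing.get_type_hints(fn, include_extras=True)
    sig = inspect.signature(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        base, *meta = typing.get_args(hints[name]) or (hints[name],)
        props[name] = {"type": PY_TO_JSON[base], "description": meta[0] if meta else ""}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # parameters with defaults are optional
    return {"type": "object", "properties": props, "required": required}

def web_search(
    query: typing.Annotated[str, "The search query"],
    num_results: typing.Annotated[int, "Number of results"] = 5,
) -> str: ...

print(schema_for(web_search))
```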

---

## Provider-specific quirks

| #   | Rule                                                                                                                                                                      |
| --- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 1   | **`resolve_tokens()` before `compute_cost()`** — xAI/Google exclude reasoning tokens from `output_tokens`; `resolve_tokens` back-calculates from `total_tokens`           |
| 2   | **Strip reasoning from all but last assistant turn** — stale `reasoning_signature` breaks APIs                                                                            |
| 3   | **OpenAI uses `max_completion_tokens`**, not `max_tokens`                                                                                                                 |
| 4   | **OpenAI reasoning models: yield synthetic chunk immediately** — stream is silent during thinking; `Response(reasoning_content="Reasoning in progress…", synthetic=True)` |
| 5   | **Google: `thoughtSignature` on `functionCall` part** → `ToolCall.thought_signature`; echo every subsequent turn                                                          |
| 6   | **Google: no tool call IDs** — use function name as correlation ID                                                                                                        |
| 7   | **Google streaming: `?alt=sse` required** on `streamGenerateContent`                                                                                                      |
| 8   | **System-only input** — promote system message to `Role.USER` (Anthropic + Google)                                                                                        |
| 9   | **Anthropic thinking budget**: `1024 ≤ thinking_budget < max_tokens`; default `max(1024, max_tokens - 1024)`                                                              |
| 10  | **Connection pooling**: one `httpx.AsyncClient` per `(event_loop_id, base_url)` in `factory.py`                                                                           |
| 11  | **`Role.CONTEXT` mapping**: all providers wrap content in `<context>` tags automatically — Anthropic → `{"role":"user","content":"<context>…</context>"}`; OpenAI → `{"role":"system","content":"<context>…</context>"}`; Google → `{"role":"user","parts":[{"text":"<context>…</context>"}]}` |
| 12  | **Anthropic cache tokens suppressed when thinking is enabled** — `cache_read_tokens`/`cache_write_tokens` are 0 in the API response even when caching is active; caching still happens transparently |
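
Quirk 1's back-calculation can be sketched as below; the field names mirror `Usage`, and the sketch assumes `total_tokens` covers input plus output plus reasoning:

```python
def resolve_output_tokens(input_tokens: int, output_tokens: int, total_tokens: int) -> int:
    # Some providers (xAI, Google) report output_tokens without reasoning
    # tokens, while total_tokens includes them; back-calculate in that case.
    derived = total_tokens - input_tokens
    return max(output_tokens, derived)

# 500 reported output tokens, but 700 implied by the total: bill for 700.
print(resolve_output_tokens(1_000, 500, 1_700))  # 700
```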
