Metadata-Version: 2.4
Name: nodus-llm
Version: 0.1.0
Summary: Multi-credential LLM client with automatic failover and provider abstraction
Author: Shawn Knight
License: MIT
Project-URL: Homepage, https://github.com/Masterplanner25/nodus-llm
Project-URL: Repository, https://github.com/Masterplanner25/nodus-llm
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: openai
Requires-Dist: openai>=2.0.0; extra == "openai"
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.20.0; extra == "anthropic"
Provides-Extra: all
Requires-Dist: nodus-llm[anthropic,openai]; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-mock>=3.0; extra == "dev"
Dynamic: license-file

# nodus-llm

**Multi-credential LLM client with automatic failover and provider abstraction.**

`FailoverClient` rotates through an ordered list of credential profiles when a
call to an LLM provider fails. It handles rate limits, auth errors, billing
limits, and context overflow, and puts each failed credential on its own
exponential cooldown (5m→10m→20m→40m→1h). The core has no required external
dependencies; provider SDKs are optional extras.

> **Status:** v0.1.0 — prepared, not yet published.

---

## Install

```bash
pip install nodus-llm

# With OpenAI support:
pip install "nodus-llm[openai]"

# With Anthropic support:
pip install "nodus-llm[anthropic]"

# Both:
pip install "nodus-llm[all]"
```

---

## What it provides

| Component | Purpose |
|---|---|
| `CredentialProfile` | One API key + provider + model with cooldown state |
| `CredentialStore` | Ordered profile list with availability tracking |
| `FailoverClient` | Rotates profiles on failure with exponential backoff |
| `FailoverError` | Raised when all profiles are exhausted |
| `FailoverReason` | `RATE_LIMIT` \| `AUTH` \| `BILLING` \| `CONTEXT_OVERFLOW` \| `EXHAUSTED` |
| `context_window_for(model)` | Token limit for a known model |
| `would_overflow(messages, model)` | Context window check before sending |
| `CONTEXT_WINDOWS` | Dict of model name → token limit |

---

## Quick start

```python
from nodus_llm import CredentialProfile, CredentialStore, FailoverClient
from nodus_llm.providers.openai import OpenAIProvider

profiles = [
    CredentialProfile(api_key="sk-primary", provider="openai", model="gpt-4o"),
    CredentialProfile(api_key="sk-backup",  provider="openai", model="gpt-4o-mini"),
]
store = CredentialStore(profiles=profiles)

client = FailoverClient(store, provider_fn=OpenAIProvider)
reply = client.chat([{"role": "user", "content": "Hello!"}])
```

---

## CredentialProfile and CredentialStore

```python
from nodus_llm import CredentialProfile, CredentialStore

profile = CredentialProfile(
    api_key="sk-abc123",
    provider="openai",     # "openai" | "anthropic" | custom string
    model="gpt-4o",
    max_tokens=4096,       # optional per-profile override
)

store = CredentialStore(profiles=[profile])

available = store.available_profiles()   # profiles not in cooldown
store.mark_failed(profile)               # starts exponential cooldown
store.mark_success(profile)              # clears cooldown
```

Cooldown schedule: `5m → 10m → 20m → 40m → 1h` (exponential, capped at 1h).
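
The schedule above amounts to doubling a 5-minute base delay on each consecutive failure and capping the result at one hour. A minimal sketch of that arithmetic follows. The function `cooldown_seconds` and both constants are illustrative names chosen for this example, not part of the nodus-llm API, and the library's internal bookkeeping may differ.

```python
# Illustrative constants (assumed names, not nodus-llm identifiers).
BASE_COOLDOWN_S = 5 * 60    # first failure: 5 minutes
MAX_COOLDOWN_S = 60 * 60    # cap: 1 hour


def cooldown_seconds(consecutive_failures: int) -> int:
    """Return the cooldown in seconds after N consecutive failures."""
    if consecutive_failures < 1:
        return 0  # no failures, no cooldown
    # Double the base delay for every failure after the first, then cap it.
    delay = BASE_COOLDOWN_S * 2 ** (consecutive_failures - 1)
    return min(delay, MAX_COOLDOWN_S)


# Cooldowns in minutes for failures 1 through 6.
schedule = [cooldown_seconds(n) // 60 for n in range(1, 7)]
print(schedule)  # [5, 10, 20, 40, 60, 60]
```

The fifth step would be 80 minutes under pure doubling, which is why the documented schedule ends at 1h.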

---

## FailoverClient

```python
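from nodus_llm import FailoverClient, FailoverError, FailoverReason
from nodus_llm.providers.openai import OpenAIProvider

# `store` is a CredentialStore, built as in the previous section.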

client = FailoverClient(
    store,
    provider_fn=OpenAIProvider,   # factory: (profile) → LLMClient
)

try:
    reply = client.chat(
        messages=[{"role": "user", "content": "Hello!"}],
        max_tokens=256,
    )
except FailoverError as exc:
    print(exc.reason)   # FailoverReason.EXHAUSTED
```

`FailoverClient` tries profiles in order. On failure it:
1. Classifies the error → `FailoverReason`
2. Marks the profile failed (starts cooldown)
3. Tries the next available profile
4. Raises `FailoverError` if all profiles are exhausted

---

## Providers

### OpenAI

```python
from nodus_llm import FailoverClient
from nodus_llm.providers.openai import OpenAIProvider

client = FailoverClient(store, provider_fn=OpenAIProvider)
```

Requires `pip install "nodus-llm[openai]"`.

### Anthropic

```python
from nodus_llm import FailoverClient
from nodus_llm.providers.anthropic import AnthropicProvider

client = FailoverClient(store, provider_fn=AnthropicProvider)
```

Requires `pip install "nodus-llm[anthropic]"`.

### OpenAI-compatible endpoints

```python
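from nodus_llm import CredentialProfile, CredentialStore, FailoverClient
from nodus_llm.providers.compat import OpenAICompatProvider

profile = CredentialProfile(
    api_key="local-key",
    provider="local",
    model="llama-3.1-8b",
    base_url="http://localhost:11434/v1",
)
store = CredentialStore(profiles=[profile])   # the store must contain the local profile
client = FailoverClient(store, provider_fn=OpenAICompatProvider)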
```

---

## Context window utilities

```python
from nodus_llm import context_window_for, would_overflow, CONTEXT_WINDOWS

limit = context_window_for("gpt-4o")   # 128000; None for an unknown model

messages = [{"role": "user", "content": "..." * 10000}]
if would_overflow(messages, "gpt-4o-mini"):
    # truncate or summarise before sending
    ...

# All known limits
print(CONTEXT_WINDOWS)  # {"gpt-4o": 128000, "claude-3-5-sonnet-20241022": 200000, ...}
```
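
When the overflow check fires, one common response is to drop the oldest non-system messages until the conversation fits. The sketch below illustrates that idea in isolation. Both helpers are hypothetical: `estimate_tokens` uses a rough 4-characters-per-token heuristic, which is an assumption and not how nodus-llm counts tokens, and `trim_to_fit` is not a library function.

```python
def estimate_tokens(messages):
    """Very rough estimate: about 4 characters per token (assumption)."""
    return sum(len(m["content"]) for m in messages) // 4


def trim_to_fit(messages, limit):
    """Drop the oldest non-system messages until the estimate fits the limit.

    The most recent message is never dropped, so the result can still
    exceed the limit if that message alone is too long.
    """
    trimmed = list(messages)  # work on a copy, leave the input untouched
    while estimate_tokens(trimmed) > limit:
        # Index of the oldest message that is not a system prompt.
        idx = next(
            (i for i, m in enumerate(trimmed) if m["role"] != "system"), None
        )
        if idx is None or idx == len(trimmed) - 1:
            break  # nothing left that is safe to drop
        del trimmed[idx]
    return trimmed


history = [
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "a" * 400},
    {"role": "assistant", "content": "b" * 400},
    {"role": "user", "content": "c" * 40},
]
short = trim_to_fit(history, limit=20)
print([m["role"] for m in short])  # ['system', 'user']
```

In real use the limit would come from `context_window_for(model)`, minus whatever room is reserved for the reply.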

---

## Custom provider

Any callable `(CredentialProfile) -> LLMClient` works as `provider_fn`.
`LLMClient` is a structural protocol — any object with
`chat(messages, **kwargs) -> str` satisfies it.

```python
class MyProvider:
    def __init__(self, profile):
        # Receives the CredentialProfile selected by the failover client.
        self.profile = profile

    def chat(self, messages, **kwargs):
        return "custom response"

client = FailoverClient(store, provider_fn=MyProvider)
```
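
"Structural protocol" means a provider never has to inherit from anything; it only needs a matching `chat` method. The sketch below shows the idea with `typing.Protocol`. `ChatClient` is a hypothetical stand-in written for this example, not the actual `LLMClient` definition, and `EchoProvider` is likewise illustrative.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ChatClient(Protocol):  # hypothetical stand-in for LLMClient
    def chat(self, messages: list[dict[str, str]], **kwargs: Any) -> str: ...


class EchoProvider:
    """Does not inherit from ChatClient; it only has a matching chat method."""

    def __init__(self, profile: Any) -> None:
        self.profile = profile

    def chat(self, messages: list[dict[str, str]], **kwargs: Any) -> str:
        # Echo the content of the last message back to the caller.
        return messages[-1]["content"]


provider = EchoProvider(profile=None)
print(isinstance(provider, ChatClient))  # True
print(provider.chat([{"role": "user", "content": "ping"}]))  # ping
```

Note that `isinstance` against a `runtime_checkable` protocol only checks that the method exists, not its signature; a static type checker performs the stricter check.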

---

## Design

- **No required dependencies.** Core credential management and failover are
  pure stdlib. Provider SDKs are opt-in extras.
- **`nodus-circuit-breaker` is type-check only.** The `LLMClient` protocol
  is referenced only under `typing.TYPE_CHECKING`, so the package is never
  imported at runtime.
- **Exponential backoff per credential.** Each profile has independent
  cooldown state; a failed profile is skipped until its cooldown expires.

---

## Development

```bash
pip install -e ".[dev]"
pytest tests/ -q
```

---

## License

MIT — see [LICENSE](LICENSE).
