Metadata-Version: 2.4
Name: keiro
Version: 0.12.13
Summary: Keiro client — call the EB1 multi-model ensemble API.
Author: Keiro Engineering
License-Expression: LicenseRef-Proprietary
Project-URL: Homepage, https://pypi.org/project/keiro/
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: <3.14,>=3.11
Description-Content-Type: text/markdown
Requires-Dist: requests>=2.32.2
Requires-Dist: PyYAML>=6.0.1
Requires-Dist: rich>=13.0.0
Provides-Extra: dev
Requires-Dist: pytest>=8.3.2; extra == "dev"
Requires-Dist: ruff>=0.12.0; extra == "dev"

# Keiro

EB1 multi-model ensemble inference. Run multiple frontier models in parallel
and synthesize the best response.

## Quick start

```bash
pip install keiro
keiro setup
```

```python
from keiro import models

print(models("eb1-preview", "What is machine learning?"))
```

Or from the command line:

```bash
keiro "What is machine learning?"
```

## How it works

EB1 sends your prompt to multiple frontier models (Claude, GPT, Gemini) in
parallel, then a judge synthesizes the strongest elements of their answers
into a single response. The goal is a response that is more accurate and more
complete than any individual model's.
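
The fan-out-then-judge pattern described above can be sketched in a few lines.
This is an illustration only, not the EB1 implementation: the three backends
and the `longest_answer_judge` function are hypothetical stand-ins, and a real
judge would synthesize a new answer rather than pick one.

```python
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for model backends: each maps a prompt to a reply.
def model_a(prompt: str) -> str:
    return f"A: {prompt} is a field of AI."

def model_b(prompt: str) -> str:
    return f"B: {prompt} lets systems learn patterns from data."

def model_c(prompt: str) -> str:
    return f"C: {prompt}."

def longest_answer_judge(candidates: list[str]) -> str:
    """Toy judge (assumption): keep the longest candidate."""
    return max(candidates, key=len)

def ensemble(
    prompt: str,
    backends: list[Callable[[str], str]],
    judge: Callable[[list[str]], str],
) -> str:
    """Query every backend concurrently, then let the judge pick or merge."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        # pool.map returns results in the same order as `backends`.
        candidates = list(pool.map(lambda backend: backend(prompt), backends))
    return judge(candidates)

result = ensemble(
    "Machine learning", [model_a, model_b, model_c], longest_answer_judge
)
print(result)
```

Threads suit this shape because each backend call would spend most of its time
waiting on the network, so total latency is close to that of the slowest
backend rather than the sum of all of them.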

## Models

| Model | Description |
|-------|-------------|
| `eb1-preview` (default) | Adaptive GNN-routed ensemble |
| `eb1-delta-preview` | Adaptive ensemble with orchestration |
| `eb1` | Standard 5-model ensemble |
| `eb1-pro` | Extended 6-model ensemble |
| `eb1-frontier` | Highest quality, max reasoning |
| `eb1-codex` | Optimized for code and SWE tasks |
| `eb1-fast` | Low latency, lighter models |
| `eb1-fast-preview` | Adaptive routing, low latency |
| `eb1-frontier-preview` | Adaptive routing, max quality |
| `claude-opus-4-6` | Direct passthrough (no ensemble) |
| `gpt-5.2` | Direct passthrough (no ensemble) |

```python
from keiro import models

# Default adaptive ensemble
answer = models("eb1-preview", "Solve this step by step: what is 23 * 47?")

# Max quality
answer = models("eb1-frontier", "Prove that sqrt(2) is irrational.")

# Low latency
answer = models("eb1-fast", "Summarize this in one sentence.")

# Direct passthrough to a single model
answer = models("claude-opus-4-6", "Write a haiku")
```

## Prompt-first API

```python
from keiro import models

# Structured response with usage metadata
reply = models.response("eb1-preview", "Explain quantum computing.")
print(reply.text)
print(reply.usage)

# Reusable model binding with fixed parameters
creative = models.instance("eb1-preview", temperature=0.8)
print(creative("Write a limerick about debugging."))

# Streaming
for chunk in models.stream("eb1-preview", "Draft a launch email."):
    print(chunk, end="")
```

## Full client

```python
from keiro import Client

client = Client()
try:
    # Chat completions API
    response = client.chat(
        messages=[{"role": "user", "content": "Explain quantum computing."}],
        model="eb1-preview",
    )
    print(response["choices"][0]["message"]["content"])

    # Rate limit visibility
    print(client.rate_limits)
    # RateLimitInfo(limit_requests=1000, remaining_requests=999, ...)
finally:
    # Release the client even if a request raises
    client.close()
```
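
One possible use of the rate limit information is to slow down before the
quota runs out. The sketch below does not import `keiro`: `RateLimitSnapshot`
is a hypothetical stand-in that mirrors only the two fields visible in the
`RateLimitInfo` repr above, and the 10% threshold is an arbitrary assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RateLimitSnapshot:
    """Hypothetical stand-in with the two fields shown in the repr above."""
    limit_requests: int
    remaining_requests: int

def remaining_fraction(info: RateLimitSnapshot) -> float:
    """Share of the request quota that is still available (0.0 to 1.0)."""
    if info.limit_requests <= 0:
        return 0.0
    return info.remaining_requests / info.limit_requests

def should_throttle(info: RateLimitSnapshot, threshold: float = 0.1) -> bool:
    """True when less than `threshold` of the quota remains."""
    return remaining_fraction(info) < threshold

healthy = RateLimitSnapshot(limit_requests=1000, remaining_requests=999)
nearly_spent = RateLimitSnapshot(limit_requests=1000, remaining_requests=50)
print(should_throttle(healthy), should_throttle(nearly_spent))
```

Because the helpers only read two attributes, any object exposing
`limit_requests` and `remaining_requests` could be passed in their place.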

## CLI

```bash
keiro "What is ML?"                 # one-shot response
keiro                               # interactive REPL
keiro gui                           # local browser chat UI
keiro -m eb1-fast "Quick answer"    # specific model
echo context | keiro "Summarize"    # pipe context as input
keiro setup                         # configure credentials
keiro models                        # list available models
```

In the interactive REPL, streamed code fences render as numbered code blocks.
Use `/copy [n]` to copy a block from the last assistant reply.
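
To make the idea of numbered code blocks concrete, here is a minimal sketch of
pulling fenced blocks out of a Markdown reply and selecting one by number. It
is not the REPL's actual code: the function names are hypothetical, and
1-based numbering is an assumption.

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this example tidy

# An opening fence with an optional language tag, the body, a closing fence.
FENCE_RE = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)

def extract_code_blocks(reply: str) -> list[str]:
    """Return the body of every fenced block, in order of appearance."""
    return [match.group(1).rstrip("\n") for match in FENCE_RE.finditer(reply)]

def get_block(reply: str, n: int) -> str:
    """Return block number `n`, counting from 1 (assumed numbering)."""
    blocks = extract_code_blocks(reply)
    if not 1 <= n <= len(blocks):
        raise IndexError(f"no code block {n}; reply has {len(blocks)}")
    return blocks[n - 1]

sample_reply = "\n".join([
    "Here is the script:",
    FENCE + "python",
    "print(1)",
    FENCE,
    "Run it with:",
    FENCE + "bash",
    "python script.py",
    FENCE,
])
print(get_block(sample_reply, 2))
```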

`keiro gui` opens the local browser chat UI in Chrome when available. If
startup takes longer than expected, the CLI prints a manual URL and log path.

## Configuration

**Interactive setup** (recommended):

```bash
keiro setup
```

This validates your API key against the gateway and saves credential metadata
to `~/.keiro/credentials`. Secret bytes are stored in owner-only sidecar files
under `~/.keiro/secrets/`, and the metadata file stores `file://` references.
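
The sidecar pattern can be illustrated with a short sketch that resolves a
`file://` reference and refuses to read a file that is accessible to anyone
other than its owner. The function name and the exact permission check are
assumptions for illustration; this is not Keiro's loader, and the check is
only meaningful on POSIX systems.

```python
import os
import stat
import tempfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname

def resolve_secret_ref(ref: str) -> str:
    """Read the secret a file:// reference points to (hypothetical helper)."""
    parsed = urlparse(ref)
    if parsed.scheme != "file":
        raise ValueError(f"unsupported secret reference: {ref!r}")
    path = Path(url2pathname(parsed.path))
    mode = stat.S_IMODE(path.stat().st_mode)
    # Reject files with any group or other permission bits set.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} is not owner-only (mode {mode:o})")
    return path.read_text(encoding="utf-8").strip()

# Demo against a throwaway directory instead of the real ~/.keiro/secrets/.
with tempfile.TemporaryDirectory() as tmp:
    secret_path = Path(tmp) / "api_key"
    secret_path.write_text("demo-key\n", encoding="utf-8")
    os.chmod(secret_path, 0o600)  # owner read/write only
    loaded = resolve_secret_ref(secret_path.as_uri())
print(loaded)
```

Keeping the secret in a separate file means the metadata file can be inspected
or shared without exposing the key itself.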

**Explicit arguments**:

```python
from keiro import Client

client = Client(api_key="your-key", base_url="https://your-keiro-gateway.example")
```

The API key and endpoint are resolved in this order: explicit arguments first,
then the credentials file. Environment variables for credentials and the
gateway URL are ignored at runtime; run `keiro setup` or `keiro endpoint local`
to update saved credentials.

## Requirements

- Python 3.11, 3.12, or 3.13 (`>=3.11,<3.14`)
- No GPU required (inference runs on hosted infrastructure)
