Metadata-Version: 2.4
Name: olive-compute
Version: 0.1.1
Summary: Python client for the Olive distributed AI compute platform
Project-URL: Homepage, https://olivecompute.com
Project-URL: Documentation, https://olivecompute.com/docs
Project-URL: Repository, https://github.com/yotammos/olive
License: MIT
Keywords: ai,compute,distributed,embeddings,inference
Requires-Python: >=3.9
Requires-Dist: httpx>=0.27
Description-Content-Type: text/markdown

# Olive Python SDK

Distributed AI compute — embeddings and inference — with one import.

```bash
pip install olive-compute
```

## Quickstart

```python
from olive import OliveClient

client = OliveClient(api_key="olv_...")

# Embed text — uses the default embeddings model
vectors = client.embeddings(["hello world", "olive compute"])
print(vectors[0][:4])  # [0.0521, -0.1234, ...]

# Run inference — uses the default chat model
reply = client.inference("What is a neural network?", max_tokens=128)
print(reply)
```

## Choosing a model

Olive offers a curated catalog of open-source models. Browse it at
[olivecompute.com/models](https://olivecompute.com/models) or query it programmatically:

```python
# List all available chat models
for m in client.list_models(modality="chat"):
    print(m["id"], "-", m["pricing"]["input_per_1m_tokens_usd"], "USD per 1M input tokens")

# Get one model's full record
m = client.get_model("meta/llama-3.1-8b-instruct")
print(m["description"])
```

Pass `model=` to any inference or embeddings call to pin a specific model:

```python
reply = client.inference(
    "Write a Python function to reverse a list.",
    model="qwen/qwen-2.5-coder-7b",
)

vectors = client.embeddings(
    ["semantic search query"],
    model="baai/bge-large-en-v1.5",
)
```

If `model=` is omitted, Olive picks the default (featured) model for the workload.
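
To pin a model by price rather than take the default, the records returned by `list_models` can be ranked locally. The sketch below is a hypothetical helper, not part of the SDK. It assumes each record is a dict with the `id` and `pricing.input_per_1m_tokens_usd` fields used in the listing example above, and that the price is numeric. The sample ids and prices are invented for illustration.

```python
def cheapest_model(models):
    """Return the record with the lowest input price per 1M tokens.

    Assumes every record carries a numeric
    m["pricing"]["input_per_1m_tokens_usd"] field.
    """
    return min(models, key=lambda m: m["pricing"]["input_per_1m_tokens_usd"])


# Illustrative records only; these ids and prices are made up.
sample = [
    {"id": "example/model-a", "pricing": {"input_per_1m_tokens_usd": 0.20}},
    {"id": "example/model-b", "pricing": {"input_per_1m_tokens_usd": 0.05}},
]

print(cheapest_model(sample)["id"])  # example/model-b
```

With a live client this would presumably be called as `cheapest_model(client.list_models(modality="chat"))`, and the resulting `id` passed as `model=`.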

## Authentication

Get an API key from [provider.olivecompute.com](https://provider.olivecompute.com) → Settings → API Keys.

```python
# API key (recommended)
client = OliveClient(api_key="olv_...")

# Email + password (issues a short-lived token automatically)
client = OliveClient(email="you@example.com", password="...")
```
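
Hard-coding a key in source is easy to leak, so one common pattern is to read it from the environment. The sketch below assumes a variable named `OLIVE_API_KEY`; that name and the `load_api_key` helper are hypothetical, and nothing here implies the SDK reads the variable on its own.

```python
import os

# OLIVE_API_KEY is a hypothetical variable name chosen for this sketch.
ENV_VAR = "OLIVE_API_KEY"


def load_api_key(env=None):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {ENV_VAR} before creating the client")
    return key


# Usage (requires olive-compute to be installed):
# from olive import OliveClient
# client = OliveClient(api_key=load_api_key())
```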

## Compute tiers

The `compute=` argument, as passed to `submit_job` under Async jobs, selects one of three tiers:

| Tier | CPU | RAM | Use case |
|------|-----|-----|----------|
| `"light"` | 1 core | 2 GB | Embeddings, small inputs |
| `"medium"` | 2 cores | 4 GB | Standard inference (default) |
| `"heavy"` | 4 cores | 8 GB | Long context, large batches |
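
If you already have a rough estimate of what a job needs, the table can be turned into a small lookup that returns the smallest tier meeting it. This is a sketch only: `pick_tier` is a hypothetical helper, the tier specs are copied from the table above, and estimating the RAM or core requirement is left to the caller.

```python
# Tier specs copied from the table above, smallest first: (name, cores, RAM in GB).
TIERS = [
    ("light", 1, 2),
    ("medium", 2, 4),
    ("heavy", 4, 8),
]


def pick_tier(min_ram_gb, min_cores=1):
    """Return the name of the smallest tier meeting both requirements."""
    for name, cores, ram_gb in TIERS:
        if cores >= min_cores and ram_gb >= min_ram_gb:
            return name
    raise ValueError("no tier satisfies the requested resources")


print(pick_tier(3))  # medium
```

The returned name can then be passed as `compute=` when submitting a job.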

## Async jobs

For long-running workloads, submit the job first and collect the result later:

```python
job = client.submit_job(
    workload_type="inference",
    input_data='{"prompt": "Write a haiku", "max_tokens": 64}',
    model="meta/llama-3.1-8b-instruct",   # optional — default chat model otherwise
    compute="medium",
)
print(job.id)       # e3b2a1c0-...
print(job.status)   # "running"

result = job.wait(timeout=120)
print(result["output_data"])
```
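
The example above passes `input_data` as a hand-written JSON string, which becomes fragile once prompts contain quotes or newlines. A minimal sketch of building the same payload with the standard `json` module follows. `build_inference_input` is a hypothetical helper, and the `prompt` and `max_tokens` keys are taken from the example rather than from a documented schema.

```python
import json


def build_inference_input(prompt, max_tokens=64):
    """Serialize an inference payload to a JSON string.

    json.dumps handles the escaping of quotes, backslashes, and newlines.
    """
    return json.dumps({"prompt": prompt, "max_tokens": max_tokens})


payload = build_inference_input('Write a haiku about "olives"')
print(payload)
```

The resulting string would be passed as `input_data=payload` to `submit_job`.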

## Error handling

```python
from olive import OliveClient, AuthError, JobError

try:
    client = OliveClient(api_key="bad_key")
    vectors = client.embeddings(["test"])
except AuthError:
    print("Check your API key")
except JobError as e:
    print(f"Job failed: {e}")
```

## Context manager

```python
with OliveClient(api_key="olv_...") as client:
    vectors = client.embeddings(["hello"])
```
