Metadata-Version: 2.4
Name: fallbakit
Version: 0.1.0
Summary: Python SDK for Fallbakit local-first chat completions.
Author: Fallbakit
License: Apache-2.0
License-File: LICENSE
Keywords: fallbakit,llm,ollama,openai,sdk,vllm
Requires-Python: >=3.9
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == 'openai'
Provides-Extra: test
Requires-Dist: pytest>=8.0; extra == 'test'
Description-Content-Type: text/markdown

# Fallbakit Python SDK

Python SDK for Fallbakit's OpenAI-compatible, local-first chat completions API.

Fallbakit routes each request to the customer's local Ollama, oMLX, or vLLM runner through the open-source tunnel agent first. It falls back to a configured BYOK (bring-your-own-key) cloud provider only when local inference is unavailable and fallback is allowed.

## Install

```sh
pip install fallbakit
```

For local development from this repository:

```sh
python -m pip install --upgrade pip setuptools wheel
python -m pip install -e ".[test,openai]"
```

## Configuration

```sh
export FALLBAKIT_API_KEY=or_your_application_key
export FALLBAKIT_BASE_URL=https://api.fallbakit.com
```

Store `FALLBAKIT_API_KEY` in your environment or a secret manager rather than in source code, and pass it explicitly when constructing the client. `base_url` defaults to `https://api.fallbakit.com`; for local development, pass `base_url="http://localhost:8080"`.
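One way to keep this configuration in one place is a small helper that reads both variables and returns keyword arguments for the client. This is a minimal sketch: `client_kwargs_from_env` is a hypothetical helper, not part of the SDK, and it relies only on the `api_key` and `base_url` constructor arguments mentioned above.

```python
import os

DEFAULT_BASE_URL = "https://api.fallbakit.com"  # documented client default


def client_kwargs_from_env(env=None):
    """Build keyword arguments for Fallbakit(...) from environment variables.

    Hypothetical helper for illustration; not part of the SDK.
    """
    env = os.environ if env is None else env
    api_key = env.get("FALLBAKIT_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("FALLBAKIT_API_KEY is not set")
    # Fall back to the documented default when no override is exported.
    base_url = env.get("FALLBAKIT_BASE_URL", "").strip() or DEFAULT_BASE_URL
    # Drop a trailing slash so paths can be appended predictably.
    return {"api_key": api_key, "base_url": base_url.rstrip("/")}
```

With the SDK installed, `Fallbakit(**client_kwargs_from_env())` would then construct the client from whatever is exported in the current shell.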

For local examples in this repository:

```sh
cp .env.example .env.local
```

`.env.example` is set up for the local developer stack:

```dotenv
FALLBAKIT_API_KEY=or_your_generated_api_key
FALLBAKIT_BASE_URL=http://localhost:8080
FALLBAKIT_OPENAI_BASE_URL=http://localhost:8080/v1
FALLBAKIT_MODEL=llama3.2
FALLBAKIT_FALLBACK_PROVIDER=openai
FALLBAKIT_FALLBACK_MODEL=gpt-4o-mini
```

## Basic Chat

```python
import os

from fallbakit import Fallbakit

client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"])

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Write a tiny launch checklist."}],
)

print(response["choices"][0]["message"]["content"])
```

## Official OpenAI SDK

Fallbakit also works with the official OpenAI Python SDK because the router exposes `POST /v1/chat/completions`. Use a Fallbakit application API key and include `/v1` in the OpenAI SDK base URL.

```sh
python -m pip install -e ".[openai]"
python examples/openai_sdk_chat.py
python examples/openai_sdk_streaming.py
```

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["FALLBAKIT_API_KEY"],
    base_url=os.environ.get("FALLBAKIT_OPENAI_BASE_URL", "http://localhost:8080/v1"),
)

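completion = client.chat.completions.create(
    model=os.environ.get("FALLBAKIT_MODEL", "llama3.2"),
    messages=[{"role": "user", "content": "Write a tiny launch checklist."}],
)

# The OpenAI SDK returns typed objects, so use attribute access here.
print(completion.choices[0].message.content)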
```
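The OpenAI SDK expects a base URL ending in `/v1`, while the Fallbakit client and the tunnel agent take the bare origin, so it can help to derive one form from the other. A minimal sketch with two hypothetical helpers (neither is part of the SDK):

```python
def agent_origin(url):
    """Return the bare origin by dropping a trailing slash and a trailing /v1."""
    trimmed = url.rstrip("/")
    if trimmed.endswith("/v1"):
        trimmed = trimmed[: -len("/v1")]
    return trimmed


def openai_base_url(url):
    """Return a base URL for an OpenAI client that ends in exactly one /v1."""
    return agent_origin(url) + "/v1"


print(openai_base_url("http://localhost:8080"))  # prints "http://localhost:8080/v1"
print(agent_origin("http://localhost:8000/v1"))  # prints "http://localhost:8000"
```

Normalizing in both directions avoids the two easy mistakes: a doubled `/v1/v1` path in the OpenAI client, or an agent pointed at a URL that already includes `/v1`.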

## Streaming

```python
import os

from fallbakit import Fallbakit

client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"])

for chunk in client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Stream a short answer."}],
    stream=True,
):
    delta = chunk["choices"][0].get("delta", {})
    # `content` can be missing or null on some chunks, so fall back to "".
    print(delta.get("content") or "", end="", flush=True)
```
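When the full reply is needed as a string rather than printed output, the chunks can be accumulated instead. The sketch below assumes only the chunk shape used in the loop above (`choices[0]` with a `delta` that may carry `content`). `collect_stream` is a hypothetical helper, and the sample chunks are hand-written for the demonstration.

```python
def collect_stream(chunks):
    """Join the text deltas of a streamed completion into one string."""
    parts = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # skip chunks that carry no choices
        delta = choices[0].get("delta") or {}
        # Treat a missing or null `content` as an empty string.
        parts.append(delta.get("content") or "")
    return "".join(parts)


# Hand-written stand-ins for streamed chunks (illustrative only).
sample_chunks = [
    {"choices": [{"delta": {"role": "assistant", "content": ""}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": ", world"}}]},
    {"choices": [{"delta": {}}]},
]
print(collect_stream(sample_chunks))  # prints "Hello, world"
```

In real use, the iterable returned by `client.chat.completions.create(..., stream=True)` would take the place of `sample_chunks`.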

## Cloud Fallback Controls

```python
# Reuses the `client` constructed in Basic Chat.
response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Summarize hybrid inference."}],
    fallback_provider="openai",
    fallback_model="gpt-4o-mini",
)
```

Supported Fallbakit controls:

- `fallback`: allow cloud fallback when the local runner cannot serve the request. Defaults to `True`.
- `force_local`: require local routing.
- `fallback_provider`: request a specific configured BYOK fallback provider.
- `fallback_model`: request a specific cloud model for fallback.
- `cloud_model_only`: skip local routing and call the selected cloud provider.
- `local_model_only`: prevent cloud fallback.
- `extra_body`: pass additional OpenAI-compatible or Fallbakit-specific request body fields.
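Some of these controls pull in opposite directions, so a quick client-side check before sending a request can catch mistakes early. The sketch below is based only on the descriptions in the list above: `routing_control_conflicts` is a hypothetical helper, treating `force_local` and `local_model_only` as both disabling cloud fallback is an assumption, and the router itself may resolve such combinations differently.

```python
def routing_control_conflicts(**controls):
    """Return human-readable problems for control combinations that look contradictory.

    Hypothetical check for illustration; not part of the SDK.
    """
    problems = []
    cloud_only = bool(controls.get("cloud_model_only"))
    # Assumption: both of these controls rule out cloud fallback.
    local_only = bool(controls.get("force_local") or controls.get("local_model_only"))
    no_fallback = local_only or controls.get("fallback") is False

    if cloud_only and local_only:
        problems.append("cloud_model_only skips local routing, but a local-only control is set")
    if no_fallback and not cloud_only:
        for name in ("fallback_provider", "fallback_model"):
            if controls.get(name):
                problems.append(f"{name} is set, but cloud fallback is disabled")
    return problems


print(routing_control_conflicts(fallback_provider="openai", fallback_model="gpt-4o-mini"))
# prints []
print(routing_control_conflicts(local_model_only=True, fallback_model="gpt-4o-mini"))
# prints ['fallback_model is set, but cloud fallback is disabled']
```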

## Timeouts

```python
import os

from fallbakit import Fallbakit

client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"], timeout=30)

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Hello"}],
    timeout=10,
)
```

Per-request `timeout` overrides the client default.

## Errors

API failures raise `FallbakitAPIError` with:

- `status_code`
- `code`
- `response`

```python
import os

from fallbakit import Fallbakit, FallbakitAPIError

try:
    Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"]).chat.completions.create(
        model="llama3.2",
        messages=[{"role": "user", "content": "Hello"}],
    )
except FallbakitAPIError as error:
    print(error.status_code, error.code, error)
```
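Because the error exposes `status_code`, callers can decide which failures are worth retrying. The following is a minimal sketch of retry with exponential backoff. `call_with_retries` is a hypothetical helper, the set of retryable status codes is an assumption rather than documented router behavior, and the SDK may or may not retry on its own. To stay self-contained it catches any exception that carries a `status_code` attribute; with the SDK installed, catching `FallbakitAPIError` would be the narrower choice.

```python
import time


def call_with_retries(call, attempts=3, base_delay=0.5,
                      retry_statuses=(429, 502, 503, 504), sleep=time.sleep):
    """Run `call()` and retry when the raised error has a retryable status_code."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as error:
            status = getattr(error, "status_code", None)
            # Re-raise errors that are not retryable, and the final failure.
            if status not in retry_statuses or attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A request would be wrapped as `call_with_retries(lambda: client.chat.completions.create(...))`. The `sleep` parameter exists so tests can substitute a recorder for the real delay.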

## Development

```sh
pytest
python examples/minimal_chat.py
python examples/streaming_chat.py
python examples/openai_sdk_chat.py
python examples/openai_sdk_streaming.py
```

## Local Verification

Use this flow to confirm the package installs, the unit tests pass, and the examples can talk to a local Fallbakit router.

1. Start a local model runtime in a separate terminal:

```sh
ollama serve
ollama pull llama3.2
```

2. Start the local Fallbakit stack from the repository root:

```sh
cp configs/local-infra.env.example configs/local-infra.env
cp configs/api.env.example configs/api.env
scripts/dev-up.sh
# In the dashboard, create a runner and export its generated RUNNER_* values.
(
  cd open-source/tunnel-agent
  go run ./cmd/fallbakit-agent \
    -api-key="$FALLBAKIT_RUNNER_API_KEY" \
    -runner-id="$FALLBAKIT_RUNNER_ID" \
    -base-url=http://localhost:8080 \
    -local-provider=ollama \
    -local-base-url=http://localhost:11434
)
```

For vLLM local verification, start vLLM on `localhost:8000`, create a vLLM runner in the dashboard, then use `-local-provider=vllm -local-base-url=http://localhost:8000`. Direct OpenAI clients use `http://localhost:8000/v1`; the Fallbakit agent should use the origin without `/v1`.

3. In a new terminal, install the SDK in editable mode and load the example environment:

```sh
cd open-source/python-sdk
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install --upgrade setuptools wheel
python -m pip install -e ".[test,openai]"
cp .env.example .env.local
set -a
source .env.local
set +a
```

Replace `FALLBAKIT_API_KEY` in `.env.local` with the application API key you created in the dashboard.

4. Run the package tests:

```sh
pytest
```

5. Run the local smoke tests against `http://localhost:8080`:

```sh
python examples/minimal_chat.py
python examples/streaming_chat.py
```

6. If your local stack has a fallback provider configured, run the fallback example too:

```sh
python examples/local_first_with_fallback.py
```

When database-backed auth is enabled, application allowlist rules are enforced per `application_id`, so use an application API key created from an enabled dashboard application when running the examples.

If those commands return model output, the local package setup is working correctly. When you are done, stop the dev stack with `scripts/dev-down.sh` from the repository root.
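If the examples fail with an authentication error, a common cause is an `.env.local` that still contains the placeholder key. The sketch below checks for that before running the smoke tests. Both functions are hypothetical helpers, the parser handles only simple `KEY=VALUE` lines, and the placeholder values are the ones shown earlier in this README.

```python
# Placeholder values that appear in this README and in .env.example.
PLACEHOLDER_KEYS = {"or_your_generated_api_key", "or_your_application_key"}


def parse_env_lines(lines):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight_problems(values):
    """Return a list of configuration problems found in the parsed values."""
    problems = []
    key = values.get("FALLBAKIT_API_KEY", "")
    if not key or key in PLACEHOLDER_KEYS:
        problems.append("FALLBAKIT_API_KEY is missing or still a placeholder")
    if not values.get("FALLBAKIT_BASE_URL"):
        problems.append("FALLBAKIT_BASE_URL is not set")
    return problems


sample = [
    "FALLBAKIT_API_KEY=or_your_generated_api_key",
    "FALLBAKIT_BASE_URL=http://localhost:8080",
]
print(preflight_problems(parse_env_lines(sample)))
# prints ['FALLBAKIT_API_KEY is missing or still a placeholder']
```

To check the real file, pass an open file handle for `.env.local` to `parse_env_lines` in place of `sample`.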

## Publishing

1. Update `version` in `pyproject.toml`.
2. Run `pytest`.
3. Build artifacts:

```sh
python -m pip install build twine
python -m build
twine check dist/*
```

4. Publish:

```sh
twine upload dist/*
```
