Metadata-Version: 2.4
Name: routelayer
Version: 0.1.0
Summary: Unified LLM routing with automatic fallback, cost-aware model selection, and semantic caching.
Author-email: Freedom Engineers <hello@freedomengineers.tech>
License-Expression: MIT
Project-URL: Homepage, https://routelayer.io
Project-URL: Repository, https://github.com/carsonlabs/routelayer-python
Project-URL: Issues, https://github.com/carsonlabs/routelayer-python/issues
Keywords: llm,ai,routing,openai,anthropic,gemini,fallback,caching
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == "openai"
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.20; extra == "anthropic"
Provides-Extra: gemini
Requires-Dist: google-generativeai>=0.5; extra == "gemini"
Provides-Extra: all
Requires-Dist: openai>=1.0; extra == "all"
Requires-Dist: anthropic>=0.20; extra == "all"
Requires-Dist: google-generativeai>=0.5; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: openai>=1.0; extra == "dev"
Requires-Dist: anthropic>=0.20; extra == "dev"
Dynamic: license-file

# RouteLayer (Python)

**Unified LLM routing with automatic fallback, cost-aware model selection, and semantic caching.**

RouteLayer is a lightweight, zero-dependency (other than the provider SDKs) middleware that sits between your application and your LLM providers. It solves the "Inference Cost Crisis" by intelligently routing requests based on priority, cost, and semantic similarity.

## Why RouteLayer?

LiteLLM is the 800-pound gorilla in this space, but it requires running a proxy server and comes with a mountain of configuration. RouteLayer takes a different approach:

*   **Zero config** -- add your providers and call `generate()`. No proxy, no YAML files, no Docker containers.
*   **Simpler mental model** -- RouteLayer is a library, not a platform. Import it, use it, done.
*   **Built-in semantic caching** -- LiteLLM charges extra for caching via their proxy. RouteLayer includes it for free, in-process, with zero dependencies.

If you need 100+ provider adapters and team management, use LiteLLM. If you need a lightweight router that just works, use RouteLayer.

## Features

*   **Unified API:** Call OpenAI, Anthropic, and Gemini through a single, standardized interface.
*   **Zero-Downtime Fallback:** Automatically failover to a secondary provider if the primary provider times out or goes down.
*   **Cost-Aware Routing:** Automatically select the cheapest available model for a given request.
*   **Built-in Semantic Caching:** Stop paying for the same answer twice. RouteLayer includes a lightweight, local, zero-dependency semantic cache that returns cached responses for semantically similar prompts.

## Installation

Install RouteLayer with the providers you intend to use:

```bash
# Install with all providers
pip install "routelayer[all]"

# Or install specific providers
pip install "routelayer[openai,anthropic]"
```

## Quick Start

```python
import os
from routelayer import RouteLayer

# 1. Initialize the router with your preferred models
rl = RouteLayer(
    providers=[
        {
            "name": "openai",
            "model": "gpt-4o-mini",
            "api_key": os.environ.get("OPENAI_API_KEY"),
            "priority": 0, # Try first
            "cost_per_1k_tokens": 0.0006,
        },
        {
            "name": "anthropic",
            "model": "claude-haiku-4-5-20251001",
            "api_key": os.environ.get("ANTHROPIC_API_KEY"),
            "priority": 1, # Fallback
            "cost_per_1k_tokens": 0.001,
        }
    ],
    strategy="priority", # Or "cheapest"
    verbose=True
)

# 2. Generate a response
response = rl.generate("Explain quantum computing in one sentence.")

# 3. Inspect the standardized response
print(response.text)
print(f"Served by: {response.provider} ({response.model})")
print(f"Cost: ${response.cost_usd:.6f}")
print(f"Cached: {response.cached}")
```

## Routing Strategies

*   **`priority` (Default):** Tries providers in ascending order of their `priority` value. Great for setting up a primary model and a reliable fallback.
*   **`cheapest`:** Ignores priority and always attempts the provider with the lowest `cost_per_1k_tokens`.

## Semantic Caching

RouteLayer includes a built-in semantic cache. If a user asks "What is your refund policy?" and later asks "How do I get a refund?", the cache will intercept the second request and return the cached answer without hitting the LLM API, saving you money and reducing latency to zero.

```python
from routelayer import RouteLayer, SemanticCache

# Configure the cache (defaults are usually fine)
cache = SemanticCache(
    threshold=0.92,   # Similarity threshold (0.0 to 1.0)
    ttl_seconds=3600, # Time to live (1 hour)
    max_size=1000     # Max entries in memory
)

rl = RouteLayer(providers=[...], cache=cache)
```

## License

MIT License. Built by Freedom Engineers -- routelayer.io
