Metadata-Version: 2.4
Name: smart-llm-router
Version: 0.1.1
Summary: Provider-agnostic LLM router. Pick the cheapest capable model per prompt with rule-based scoring. Wraps LiteLLM for format conversion + streaming.
Author-email: Huanzhou Huang <huanzhou.huang@netmind.ai>
License: MIT
Project-URL: Homepage, https://github.com/protagolabs/smart-llm-router
Project-URL: Issues, https://github.com/protagolabs/smart-llm-router/issues
Project-URL: Repository, https://github.com/protagolabs/smart-llm-router
Keywords: llm,router,litellm,openrouter,openai,anthropic,gemini,smart-routing,cost-optimization
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: litellm[proxy]>=1.50.0
Requires-Dist: typer>=0.12.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: pydantic>=2.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: ruff>=0.6; extra == "dev"
Requires-Dist: mypy>=1.10; extra == "dev"
Dynamic: license-file

# smart-llm-router

Provider-agnostic LLM router. Pick the cheapest capable model per prompt with rule-based scoring. Wraps [LiteLLM](https://github.com/BerriAI/litellm) for format conversion, streaming, tool calls, and 100+ provider integrations.

## Why

Most LLM proxies route to whichever model name you pass. This one **picks the model for you**: locally, in <1ms, with no ML model in the loop. It scores the prompt across 14 dimensions (code presence, reasoning markers, multi-step patterns, multilingual keywords, etc.) and maps the result to one of four tiers (SIMPLE / MEDIUM / COMPLEX / REASONING).

You bring an upstream (OpenRouter, Together, Fireworks, Groq, Anthropic direct, vLLM, Ollama — anything OpenAI-compatible). It does the rest.
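To make the scoring idea concrete, here is a minimal sketch of rule-based tiering. It is not the package's actual scorer: the three dimensions, their weights, and the thresholds are all hypothetical, chosen only to show how pattern matches can add up to a tier.

```python
import re

# Hypothetical dimensions and weights; the real router scores 14 dimensions.
DIMENSIONS = {
    "code": (re.compile(r"\b(def|class|function|import)\b"), 0.3),
    "reasoning": (re.compile(r"\b(prove|derive|step by step)\b", re.I), 0.6),
    "multi_step": (re.compile(r"\b(first|then|finally)\b", re.I), 0.2),
}

# Hypothetical score thresholds, checked from the highest tier down.
TIER_THRESHOLDS = [(0.6, "REASONING"), (0.4, "COMPLEX"), (0.2, "MEDIUM")]


def score_prompt(prompt: str) -> tuple[float, list[str]]:
    """Sum the weights of every dimension whose pattern matches the prompt."""
    signals = [name for name, (pattern, _) in DIMENSIONS.items() if pattern.search(prompt)]
    score = sum(DIMENSIONS[name][1] for name in signals)
    return score, signals


def pick_tier(prompt: str) -> str:
    """Return the first tier whose threshold the score reaches, else SIMPLE."""
    score, _ = score_prompt(prompt)
    for threshold, tier in TIER_THRESHOLDS:
        if score >= threshold:
            return tier
    return "SIMPLE"


print(pick_tier("what is the capital of france"))  # SIMPLE
print(pick_tier("Prove that sqrt(2) is irrational step by step"))  # REASONING
```

Because the rules are plain regex matches and a sum, the whole decision runs locally and needs no model call.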

## Install

```bash
pip install smart-llm-router
```

Two console scripts ship with the package: `smart-llm-router` (full name) and `slr` (short alias).

## Quick start with OpenRouter (default upstream)

```bash
# 1. Get an OpenRouter key at https://openrouter.ai/keys
export OPENROUTER_API_KEY=sk-or-v1-...
export LITELLM_MASTER_KEY=sk-anything    # gates the proxy itself

# 2. Start the proxy on :4000 (uses bundled OpenRouter config by default)
smart-llm-router start
```

In another terminal — any OpenAI-compatible client works:

```python
from openai import OpenAI

# api_key is the LITELLM_MASTER_KEY exported above
client = OpenAI(base_url="http://127.0.0.1:4000/v1", api_key="sk-anything")

# Smart routing: the rule-based scorer picks the cheapest capable model
resp = client.chat.completions.create(
    model="smart/auto",
    messages=[{"role": "user", "content": "prove that sqrt(2) is irrational step by step"}],
)
# → routed to REASONING tier (e.g. deepseek/deepseek-r1)
print(resp.choices[0].message.content)
```

Or curl:

```bash
curl http://127.0.0.1:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-anything" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart/auto","messages":[{"role":"user","content":"hi"}]}'
```

## Inspect routing without dispatching

```bash
slr test "what is the capital of france"
# → SIMPLE / google/gemini-2.5-flash-lite / 100% savings vs claude-sonnet-4.6

slr test "Prove that sqrt(2) is irrational step by step"
# → REASONING / deepseek/deepseek-r1 / 90% savings

slr test "design a high-availability microservices architecture" --profile premium
# → COMPLEX / anthropic/claude-opus-4.7

slr models --profile auto    # show the tier→model table
```

## Pointing at a different upstream

The bundled config targets OpenRouter, but anything OpenAI-compatible works (Together, Fireworks, Groq, DeepInfra, vLLM, Ollama, OpenAI direct). Copy the bundled YAML and edit `api_base` / `api_key`:

```bash
# Copy the bundled config to your working directory
python -c "from importlib.resources import files; import shutil; shutil.copy(files('smart_llm_router') / 'default_config.yaml', './smart-llm-router.yaml')"

# Edit smart-llm-router.yaml — swap api_base / api_key per model_list entry
# Then start with --config
smart-llm-router start --config smart-llm-router.yaml
```
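If you prefer to script the edit, the swap can be sketched in Python. This assumes the copied YAML follows LiteLLM's usual `model_list` / `litellm_params` layout; the entry below is hypothetical and the bundled `default_config.yaml` may differ. `retarget_upstream` and `VLLM_API_KEY` are illustrative names, not part of the package.

```python
import copy


def retarget_upstream(config: dict, api_base: str, api_key_ref: str) -> dict:
    """Return a copy of config with every model_list entry pointed at a new upstream."""
    updated = copy.deepcopy(config)
    for entry in updated.get("model_list", []):
        params = entry.setdefault("litellm_params", {})
        params["api_base"] = api_base
        params["api_key"] = api_key_ref
    return updated


# Hypothetical entry; in practice, load the dict from smart-llm-router.yaml
# (for example with yaml.safe_load) instead of writing it inline.
config = {
    "model_list": [
        {
            "model_name": "deepseek/deepseek-chat",
            "litellm_params": {
                "model": "openrouter/deepseek/deepseek-chat",
                "api_base": "https://openrouter.ai/api/v1",
                "api_key": "os.environ/OPENROUTER_API_KEY",
            },
        }
    ]
}

local = retarget_upstream(config, "http://127.0.0.1:8000/v1", "os.environ/VLLM_API_KEY")
print(local["model_list"][0]["litellm_params"]["api_base"])  # http://127.0.0.1:8000/v1
```

The sketch leaves each entry's `model` value alone. In a LiteLLM config the provider prefix on `model` usually has to change along with the upstream, so check those by hand.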

## Available routing profiles

| `model` value | Behavior |
|---|---|
| `smart/auto` | Rule-based scoring → cheapest capable model |
| `smart/eco` | Rule-based scoring → cheapest tier table (free + lite models) |
| `smart/premium` | Rule-based scoring → quality-first tier table |
| `smart/free` | Restricts routing to free/local models |
| `<provider>/<model>` | Bypasses routing, dispatches directly |
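
As a rough illustration of the table above, resolving a `model` value could look like the following. The function name `resolve_model` and the table layout are assumptions for this sketch. The model IDs are the ones shown in this README's example outputs, and tiers with no example are left out rather than guessed.

```python
# Illustrative tier tables built from the example outputs in this README.
TIER_TABLES = {
    "auto": {
        "SIMPLE": "google/gemini-2.5-flash-lite",
        "MEDIUM": "deepseek/deepseek-chat",
        "REASONING": "deepseek/deepseek-r1",
    },
    "premium": {"COMPLEX": "anthropic/claude-opus-4.7"},
}


def resolve_model(model: str, tier: str) -> str:
    """Map a smart/<profile> value plus a tier to a concrete model ID."""
    prefix, _, profile = model.partition("/")
    if prefix != "smart":
        return model  # <provider>/<model>: routing is bypassed
    if profile not in TIER_TABLES:
        raise ValueError(f"unknown profile: {profile}")
    table = TIER_TABLES[profile]
    if tier not in table:
        raise KeyError(f"no model listed for tier {tier} in profile {profile}")
    return table[tier]


print(resolve_model("smart/auto", "SIMPLE"))  # google/gemini-2.5-flash-lite
print(resolve_model("openai/gpt-4o", "SIMPLE"))  # openai/gpt-4o
```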

## Inspect scoring signals

```bash
smart-llm-router test "write a python function to compute fibonacci"
# tier: MEDIUM | model: deepseek/deepseek-chat | confidence: 0.82
# signals: code (function, python), imperative (write)
```

## How it works

1. The client sends an OpenAI-, Anthropic-, or Gemini-format request to `localhost:4000`.
2. LiteLLM Proxy parses the request, and `SmartRouterHook.async_pre_call_hook` intercepts it.
3. If `model` is a `smart/*` profile, the rule-based router scores the prompt and picks a concrete upstream model ID.
4. LiteLLM dispatches to the configured upstream and handles format conversion, streaming, tool calls, retries, etc.
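
Steps 2 and 3 can be sketched as a standalone async function. This is a simplified stand-in, not `SmartRouterHook` itself: the real hook receives additional arguments defined by LiteLLM, and `classify` here is a hypothetical one-rule placeholder for the full scorer. The sketch also assumes message `content` is a plain string.

```python
import asyncio

# Illustrative tier table; entries taken from the example outputs in this README.
AUTO_TIERS = {
    "SIMPLE": "google/gemini-2.5-flash-lite",
    "REASONING": "deepseek/deepseek-r1",
}


def classify(prompt: str) -> str:
    """Stand-in for the rule-based scorer (hypothetical, one rule only)."""
    return "REASONING" if "step by step" in prompt.lower() else "SIMPLE"


async def pre_call_hook(data: dict) -> dict:
    """Rewrite data['model'] when it names a smart/* profile; pass through otherwise."""
    model = data.get("model", "")
    if not model.startswith("smart/"):
        return data  # <provider>/<model>: bypass routing
    # Score the most recent user message.
    prompt = next(
        (m["content"] for m in reversed(data.get("messages", [])) if m.get("role") == "user"),
        "",
    )
    data["model"] = AUTO_TIERS[classify(prompt)]
    return data


request = {"model": "smart/auto", "messages": [{"role": "user", "content": "hi"}]}
routed = asyncio.run(pre_call_hook(request))
print(routed["model"])  # google/gemini-2.5-flash-lite
```

Rewriting `model` before dispatch means everything downstream (step 4) sees an ordinary request for a concrete model.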

## Attribution

The 14-dimension rule-based router in `smart_llm_router/router/` is ported from [ClawRouter](https://github.com/BlockRunAI/ClawRouter) (MIT). Format conversion and streaming come from [LiteLLM](https://github.com/BerriAI/litellm) (MIT).

## License

MIT
