Metadata-Version: 2.4
Name: langchain-router
Version: 0.1.0
Summary: Cost-aware model routing for LangChain agents based on task phase
Project-URL: Repository, https://github.com/johanity/langchain-router
Project-URL: Issues, https://github.com/johanity/langchain-router/issues
Project-URL: Twitter, https://x.com/LangChain
Project-URL: Slack, https://www.langchain.com/join-community
Author: Johan
License: MIT
License-File: LICENSE
Keywords: agent,cost-optimization,langchain,middleware,model-routing
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: <4.0.0,>=3.10.0
Requires-Dist: langchain-core<2.0.0,>=1.2.21
Requires-Dist: langchain<2.0.0,>=1.2
Provides-Extra: lint
Requires-Dist: ruff<0.16.0,>=0.15.0; extra == 'lint'
Provides-Extra: test
Requires-Dist: hypothesis<7,>=6.100; extra == 'test'
Requires-Dist: pytest-asyncio<1,>=0.24; extra == 'test'
Requires-Dist: pytest<9,>=8; extra == 'test'
Provides-Extra: typing
Requires-Dist: mypy<2.0.0,>=1.10.0; extra == 'typing'
Description-Content-Type: text/markdown

# langchain-router

[![PyPI](https://img.shields.io/pypi/v/langchain-router?label=%20)](https://pypi.org/project/langchain-router/)
[![License](https://img.shields.io/pypi/l/langchain-router)](https://opensource.org/licenses/MIT)
[![CI](https://github.com/johanity/langchain-router/actions/workflows/ci.yml/badge.svg)](https://github.com/johanity/langchain-router/actions)

Your agent doesn't need the expensive model for every call.

Most calls are just the model picking which file to read next or which pattern to search for. A smaller model does that fine. This middleware detects when the agent is doing that kind of work and routes to a fast model automatically.

<p align="center">
  <img src="assets/diagram.svg" alt="Phase-based routing" width="700"/>
</p>

## Quick Install

```bash
pip install langchain-router
```

## 🤔 What is this?

Agent sessions have a pattern. The user says something, and the agent thinks about it (planning). Then it reads files, searches code, and runs commands (execution). Sometimes something breaks (recovery). Then the user says something again.

Planning and recovery need the primary model. Execution doesn't. RouterMiddleware detects which phase the agent is in and routes accordingly.

| What just happened | Phase | Model |
|---|---|---|
| User spoke | planning | primary |
| Tool call succeeded | execution | **fast** |
| Tool call failed | recovery | primary |
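
The table above can be read as a small decision function. The sketch below is illustrative only and is not the library's actual source: `LastEvent`, `detect_phase`, and `pick_model` are hypothetical names, and it assumes the phase depends only on the most recent event in the session.

```python
from dataclasses import dataclass
from typing import Literal

Phase = Literal["planning", "execution", "recovery"]


@dataclass
class LastEvent:
    """Hypothetical stand-in for the most recent message in the session."""

    source: Literal["user", "tool"]
    failed: bool = False  # only meaningful when source == "tool"


def detect_phase(event: LastEvent) -> Phase:
    """Map the most recent event to a phase, following the table above."""
    if event.source == "user":
        return "planning"
    # A tool result: failure means recovery, success means execution.
    return "recovery" if event.failed else "execution"


def pick_model(phase: Phase, primary: str, fast: str) -> str:
    """Only the execution phase is routed to the fast model."""
    return fast if phase == "execution" else primary


if __name__ == "__main__":
    for event in (
        LastEvent("user"),
        LastEvent("tool"),
        LastEvent("tool", failed=True),
    ):
        phase = detect_phase(event)
        print(event, phase, pick_model(phase, primary="primary", fast="fast"))
```

Keeping the rule this small means the routing decision itself costs nothing compared with a model call.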

On a [simulated 18-call session](examples/benchmark.py), 83% of calls route to the fast model.

```python
from langchain.agents import create_agent
from langchain_router import RouterMiddleware

agent = create_agent(
    model="anthropic:claude-sonnet-4-6",
    tools=[...],
    middleware=[RouterMiddleware(fast="anthropic:claude-haiku-4-5-20251001")],
)
```
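
To get a rough feel for what an 83% fast-model share could mean for cost, here is a back-of-the-envelope sketch. It is not taken from the benchmark script: `blended_relative_cost` is a hypothetical helper, the `price_ratio` value is a placeholder rather than a real price, and it assumes every call uses about the same number of tokens. On an 18-call session, 83% corresponds to 15 calls.

```python
def blended_relative_cost(
    fast_calls: int, total_calls: int, price_ratio: float
) -> float:
    """Session cost relative to sending every call to the primary model.

    price_ratio is (fast model price) / (primary model price).
    Assumes all calls consume roughly the same number of tokens.
    """
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    if not 0 <= fast_calls <= total_calls:
        raise ValueError("fast_calls must be between 0 and total_calls")
    fast_share = fast_calls / total_calls
    # Fast calls cost price_ratio each; the rest cost 1.0 each.
    return fast_share * price_ratio + (1.0 - fast_share)


if __name__ == "__main__":
    # 0.2 is a placeholder ratio, not a quoted price.
    relative = blended_relative_cost(fast_calls=15, total_calls=18, price_ratio=0.2)
    print(f"relative cost: {relative:.2f}")
```

Real savings depend on actual token counts per call and current provider pricing, so treat the result as an estimate only.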

### With CollapseMiddleware

```python
from langchain_collapse import CollapseMiddleware
from langchain_router import RouterMiddleware

middleware = [
    CollapseMiddleware(),
    RouterMiddleware(fast="anthropic:claude-haiku-4-5-20251001"),
]
```

```mermaid
flowchart TB
    A["📥 37 messages"] --> B["CollapseMiddleware"]
    B --> C["📥 9 messages"]
    C --> D["RouterMiddleware"]
    D --> E{"phase?"}
    E --> |"execution  ·  83%"| F["⚡ Haiku"]
    E --> |"planning"| G["🧠 Sonnet"]
    E --> |"recovery"| G

    style A fill:#ff6b6b,stroke:#e03131,color:#fff
    style B fill:#339af0,stroke:#1c7ed6,color:#fff
    style C fill:#339af0,stroke:#1c7ed6,color:#fff
    style D fill:#51cf66,stroke:#2f9e44,color:#fff
    style E fill:#fff3bf,stroke:#f59f00,color:#333
    style F fill:#20c997,stroke:#099268,color:#fff
    style G fill:#845ef7,stroke:#7048e8,color:#fff
```

### On false positives

The error heuristic looks for `error`, `traceback`, `exception`, or `failed` in tool output. Output that merely contains one of those words, such as source code with `def handle_error`, is treated as a failure and routes to the primary model. That's the safe direction: more capability than needed, never less.
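
A minimal sketch of a keyword check like the one described above. This is an illustration, not the library's code: `ERROR_MARKERS` and `looks_like_failure` are hypothetical names, and case-insensitive substring matching is an assumption.

```python
# Keywords listed in the paragraph above.
ERROR_MARKERS = ("error", "traceback", "exception", "failed")


def looks_like_failure(tool_output: str) -> bool:
    """Return True if the tool output contains any error marker.

    Assumption: matching is a case-insensitive substring search.
    """
    lowered = tool_output.lower()
    return any(marker in lowered for marker in ERROR_MARKERS)


if __name__ == "__main__":
    samples = [
        "Traceback (most recent call last):",
        "def handle_error(exc): ...",  # harmless code, still flagged
        "3 files matched",
    ]
    for sample in samples:
        print(repr(sample), looks_like_failure(sample))
```

The second sample is the false positive: the tool call succeeded, but the output mentions `error`, so the call would go to the primary model anyway.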

## 📖 Documentation

- [Source](langchain_router/__init__.py) (single file, ~170 lines)
- [Benchmark](examples/benchmark.py) (simulated session with cost breakdown)
- [Tests](tests/) (unit tests + property-based invariant tests)

## 💁 Contributing

```bash
git clone https://github.com/johanity/langchain-router.git
cd langchain-router
pip install -e ".[test]"
pytest
```

## 📕 License

MIT
