Metadata-Version: 2.4
Name: colador
Version: 0.0.1
Summary: Strain your LLM context. Cut costs 70-95% by routing to cheaper models.
Author: Your Name
License: AGPL-3.0
Project-URL: Homepage, https://github.com/TheVibeMachine/colador
Keywords: llm,router,proxy,context-compression,coding-agent
Classifier: Development Status :: 1 - Planning
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.11
Description-Content-Type: text/markdown

# Colador

> *Spend less. Send less. Get smarter results.*

Colador is a local proxy that sits between your coding tools and your LLM backends. It strains context down to what matters, routes each task to the cheapest model that can handle it, and escalates to frontier models only when justified.

The name comes from Spanish: *colador* means strainer. It lets only the important parts through.

## The Problem

AI coding tools are expensive because they're wasteful. Cursor, Claude Code, Copilot, Aider — they all send bloated context to expensive models for every task. A function rename gets the same $75/M-token model as a concurrency bug. The result: $80–300/month in API costs, most of it on tasks a local model could handle in seconds.

## How Colador Fixes It

```
┌──────────┐     ┌──────────────┐     ┌──────────┐
│  Your    │     │              │     │  Local   │
│  Coding  │────▶│   Colador    │────▶│  Models  │
│  Tool    │     │  :5757       │     │  (free)  │
│          │     │              │     └──────────┘
└──────────┘     │              │
                 │  Strains     │     ┌──────────┐
                 │  Routes      │────▶│  Cloud   │
                 │  Escalates   │     │  APIs    │
                 │              │     │  (paid)  │
                 └──────────────┘     └──────────┘
```

**Strain.** Before any request leaves your machine, Colador compresses the context — extracting only relevant files, diffs, error output, and summaries. A typical 20K-token request becomes 2K tokens.

**Route.** A classifier assigns each task to a tier. Simple work (rename, find, test writing) stays local. Complex work (architecture, debugging, design) goes to frontier models. Most tasks are simple.

**Escalate.** For medium tasks, the local model does the work and a frontier model reviews a compressed diff. You pay frontier prices only for the review, not the execution.

## Quick Start

```bash
pip install colador
colador init          # generates config with detected local models
colador start         # launches proxy on localhost:5757
```

Point your tool at `http://localhost:5757/v1` and use it normally. No plugins, no IDE changes.

## How It Works With Your Tools

Colador implements the OpenAI-compatible API. Any tool that can set a custom base URL works out of the box:

```bash
# Aider
aider --openai-api-base http://localhost:5757/v1 --openai-api-key colador

# Claude Code
export ANTHROPIC_BASE_URL="http://localhost:5757"

# Continue.dev / Cursor / Copilot
# Set base URL to http://localhost:5757/v1 in settings
```

## Routing Policy

You can override routing with prefixes, or let Colador decide automatically:

```
@local rename getUserName to fetchUserName     → stays local
@review add pagination to /users endpoint      → local + supervisor review
@hard we're seeing race conditions in the queue → frontier model plans
```

Without a prefix, the classifier picks the tier based on the prompt.

| Task | Tier | What Happens |
|---|---|---|
| Find where X is implemented | Local | Local model only |
| Rename this method | Local | Local model only |
| Write tests for this module | Local | Local model only |
| Add a feature with clear spec | Review | Local writes, frontier reviews |
| Refactor across a few modules | Review | Local writes, frontier reviews |
| Architecture decision | Hard | Frontier plans, local executes |
| Debug a concurrency issue | Hard | Frontier plans, local executes |

## Configuration

```yaml
# ~/.colador/config.yaml

backends:
  local:
    url: "http://localhost:11434/v1"
    model: "qwen3-coder-next"
    api_key: "ollama"

  supervisor:
    provider: "anthropic"
    model: "claude-sonnet-4-20250514"
    api_key_env: "ANTHROPIC_API_KEY"

routing:
  default_tier: "auto"
  classifier: "hybrid"

  tier1:
    backend: "local"
  tier2:
    worker: "local"
    reviewer: "supervisor"
    max_review_tokens: 3000
  tier3:
    planner: "supervisor"
    executor: "local"
```

## No Local Models? Still Saves Money

You don't need Ollama or local hardware. Colador works with any combination of backends:

```yaml
backends:
  cheap:
    provider: "openrouter"
    model: "google/gemma-4"      # ~$0.10/M tokens

  smart:
    provider: "anthropic"
    model: "claude-sonnet-4"     # ~$15/M tokens
```

Context compression alone cuts your spend significantly, even when both backends are cloud APIs.

## Transparency

Every routing decision is logged locally. See what was sent, why it was sent, and what it cost:

```bash
colador logs              # recent routing decisions
colador stats             # aggregate savings
```

```json
{
  "timestamp": "2026-04-12T14:30:00Z",
  "tier": "TIER_1",
  "reason": "prompt matched rule: 'rename'",
  "backend": "local",
  "tokens_in": 342,
  "tokens_out": 128,
  "estimated_cost_usd": 0.0,
  "latency_ms": 890
}
```

## Project Docs

| Document | Purpose |
|---|---|
| [product.md](product.md) | Vision, positioning, differentiators |
| [architecture.md](architecture.md) | System design, modules, schemas, data flow |
| [plan.md](plan.md) | Phased build plan with deliverables |
| [agents.md](agents.md) | Rules for contributors (human and AI) |

## Contributing

Colador is open source. Before contributing, read `agents.md` — it covers the file header rule, module boundaries, and coding conventions.

Every file must start with a comment answering: **WHY** does this file exist, **WHAT** does it do, **HOW** is it used. If you can't write the header, the file probably shouldn't exist.

## License

TBD

## Status

Pre-release. Under active development. See `plan.md` for the current phase.
