Metadata-Version: 2.4
Name: tool-rot
Version: 0.1.0
Summary: Semantic tool routing for MCP. Stop injecting all schemas. Inject only what matters.
Author: tool-rot contributors
License-Expression: Apache-2.0
Keywords: mcp,llm,agents,tools,routing,claude
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Requires-Python: <3.13,>=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: mcp>=1.0.0
Requires-Dist: faiss-cpu>=1.7.0
Requires-Dist: onnxruntime>=1.16.0
Requires-Dist: tokenizers>=0.15.0
Requires-Dist: httpx>=0.25.0
Requires-Dist: aiohttp>=3.9.0
Requires-Dist: click>=8.1.0
Requires-Dist: rich>=13.0.0
Requires-Dist: anyio>=4.0.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: scikit-learn>=1.3.0
Provides-Extra: dev
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Provides-Extra: sse
Requires-Dist: starlette>=0.37.0; extra == "sse"
Requires-Dist: uvicorn>=0.29.0; extra == "sse"
Dynamic: license-file

# tool-rot

<p align="center">
  <img src="assets/tool-rot-logo.svg" alt="tool-rot logo" width="220">
</p>

**Semantic tool routing for MCP. Keep every tool callable. Stop sending every schema.**

`tool-rot` is a transparent MCP proxy for teams running many MCP servers. It indexes all upstream tools locally, returns only the most relevant tool schemas on each turn, and forwards tool calls to the original server unchanged.

```bash
$ tool-rot bench --k 3

  Queries tested       100
  K (tools injected)   3
  Recall@K             91.3%   (correct tool was in top-3)
  Avg token reduction  94.6%
  Avg routing latency  8.7ms
  Index build time     420ms
```

## Why This Exists

MCP tool schemas are useful context, but they get expensive fast. A few servers can add thousands of schema tokens before the model sees the user's actual request. Most of those tools are irrelevant on any given turn.

That creates three problems:

- **Cost:** repeated schema tokens are paid for on every turn.
- **Attention:** the model has to scan unused tools before solving the task.
- **Accuracy:** large tool menus make wrong tool selection more likely.

`tool-rot` treats tool schemas like a retrieval problem. The full toolset stays available behind the proxy, while the model only sees the schemas that are likely to matter right now.

## What It Guarantees

- **Execution is never blocked by routing.** Filtering only affects `tools/list`; `tools/call` still forwards to the upstream server.
- **Tool names are namespaced.** Exposed tools use MCP-compliant `server.tool` names, so duplicate names across servers do not collide.
- **Resources and prompts pass through.** Tool schemas are routed; resources and prompts are aggregated and forwarded without semantic filtering.
- **Routing is local.** MiniLM ONNX embeddings and FAISS run locally; no routing API key is required.
- **Schemas are slimmed.** Nonessential display metadata is removed before indexing and injection.
- **Misses are observable.** Tool-call hits and misses are logged, and `status` can recommend a higher `k`.
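
The namespacing guarantee can be pictured with a small routing table. This is a minimal sketch, not the project's implementation: the helper names (`namespaced`, `split_namespaced`, `build_route_table`) are hypothetical, and it assumes server names contain no dots so that splitting on the first dot is unambiguous.

```python
from typing import Dict, Iterable, Tuple


def namespaced(server: str, tool: str) -> str:
    """Join a server name and a tool name into the exposed `server.tool` form."""
    return f"{server}.{tool}"


def split_namespaced(name: str) -> Tuple[str, str]:
    """Split an exposed name back into (server, tool).

    Assumption: server names contain no dots, so only the first dot separates.
    """
    server, sep, tool = name.partition(".")
    if not sep or not server or not tool:
        raise ValueError(f"not a namespaced tool name: {name!r}")
    return server, tool


def build_route_table(
    tools_by_server: Dict[str, Iterable[str]],
) -> Dict[str, Tuple[str, str]]:
    """Map each exposed name to the upstream (server, tool) pair it forwards to."""
    table: Dict[str, Tuple[str, str]] = {}
    for server, tools in tools_by_server.items():
        for tool in tools:
            table[namespaced(server, tool)] = (server, tool)
    return table
```

With two servers that both expose `search`, the table holds `filesystem.search` and `github.search` as separate entries, so a call can be forwarded to the right upstream server.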

## Install

```bash
pip install tool-rot
```

For SSE transport support:

```bash
pip install "tool-rot[sse]"
```

## Quick Start

One-line Cursor integration:

```bash
pip install tool-rot && tool-rot init cursor --filesystem .
```

This creates `tool-rot.toml` and `.cursor/mcp.json` with local filesystem MCP routing enabled.

To configure manually instead, create `tool-rot.toml`:

```toml
[proxy]
transport     = "stdio"
k             = 3
max_k         = 8
auto_tune_k   = false
always_inject = ["filesystem.read_file"]

[routing]
embedder     = "all-MiniLM-L6-v2"
index_type   = "flat"
ranking_mode = "hybrid"
cache_dir    = ".tool-rot"

[logging]
session_log = ".tool-rot/session.jsonl"
verbose     = false

[[server]]
name    = "filesystem"
command = "npx"
args    = ["-y", "@modelcontextprotocol/server-filesystem", "."]

[[server]]
name    = "github"
command = "npx"
args    = ["-y", "@modelcontextprotocol/server-github"]
env     = { GITHUB_TOKEN = "${GITHUB_TOKEN}" }
```

Generate client config:

```bash
# Claude Code
tool-rot cc-snippet

# Cursor
tool-rot cursor-snippet
```

Then start using your MCP client normally. The client connects to `tool-rot`; `tool-rot` connects to your real MCP servers.

## Architecture

```text
MCP client
  |
  | tools/list, tools/call
  v
tool-rot proxy
  |
  | list_tools / call_tool
  +--> filesystem MCP
  +--> github MCP
  +--> slack MCP
  +--> ...

Routing sidecar:
  tool schemas -> schema slimming -> embeddings -> FAISS index
  user query   -> hybrid semantic + lexical retrieval -> top-K schemas
```
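
The schema slimming step in the sidecar can be sketched as a recursive filter. This is illustrative only: the set of keys treated as display metadata (`DISPLAY_KEYS`) is an assumption, and the keys `tool-rot` actually removes may differ.

```python
from typing import Any

# Hypothetical set of display-only keys; the real list may differ.
DISPLAY_KEYS = {"title", "examples", "$comment", "annotations"}


def slim(node: Any) -> Any:
    """Return a copy of a tool definition with display-only keys removed."""
    if isinstance(node, list):
        return [slim(item) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key in DISPLAY_KEYS:
            continue
        if key == "properties" and isinstance(value, dict):
            # Children of "properties" are parameter names, not schema keywords,
            # so a parameter called "title" must be kept.
            out[key] = {name: slim(sub) for name, sub in value.items()}
        else:
            out[key] = slim(value)
    return out
```

The special case for `properties` matters: a tool such as `create_pull_request` has a real `title` parameter, which must survive even though a `title` annotation on the same schema is dropped.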

Turn flow:

1. A client asks for `tools/list`.
2. `tool-rot` routes against the current query context.
3. It returns the top-K namespaced schemas plus `always_inject` tools.
4. If the model calls a tool, `tool-rot` forwards the call to the matching upstream server.
5. Hits, misses, token estimates, and latency are logged to `.tool-rot/session.jsonl`.
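
Step 3 of the flow above can be sketched as a merge of the ranked results with the pinned tools. The function name `select_tools` and the output order (top-K first, then any remaining `always_inject` entries) are assumptions for illustration.

```python
from typing import Iterable, List, Sequence, Tuple


def select_tools(
    ranked: Sequence[Tuple[str, float]],
    k: int,
    always_inject: Iterable[str],
) -> List[str]:
    """Return the top-k names by score, plus pinned tools, without duplicates."""
    by_score = sorted(ranked, key=lambda pair: pair[1], reverse=True)
    top = [name for name, _score in by_score[:k]]
    selected: List[str] = []
    seen = set()
    for name in top + list(always_inject):
        if name not in seen:
            seen.add(name)
            selected.append(name)
    return selected
```

In this sketch pinned tools are added on top of K rather than counted against it, which matches the "top-K plus `always_inject`" wording; a pinned tool that already ranks in the top K appears only once.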

Claude Code can provide per-turn query context through the packaged `tool-rot-hook`. Other hosts can POST `{"query": "..."}` to `http://127.0.0.1:4748/query` before `tools/list`. Without query context, `tool-rot` falls back to default routing.
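
For hosts without the hook, posting query context might look like the following standard-library sketch. The URL and JSON body come from the paragraph above; the `Content-Type` header, the bearer-style `Authorization` header, and the function names are assumptions.

```python
import json
import urllib.request
from typing import Optional

CONTEXT_URL = "http://127.0.0.1:4748/query"


def build_query_request(
    query: str,
    token: Optional[str] = None,
    url: str = CONTEXT_URL,
) -> urllib.request.Request:
    """Build the POST request carrying the latest user query."""
    body = json.dumps({"query": query}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the optional token is sent as a standard bearer header.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send_query(query: str, token: Optional[str] = None, timeout: float = 2.0) -> int:
    """Send the query context before the client asks for tools/list."""
    request = build_query_request(query, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `send_query("open a pull request for this branch")` requires the proxy to be running; `build_query_request` can be inspected without any network access.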

## CLI

```bash
# Start the proxy
tool-rot serve --config tool-rot.toml

# Show session stats and K recommendations
tool-rot status
tool-rot status --json

# Benchmark Recall@K and token reduction
tool-rot bench --k 3
tool-rot bench --tools tools.json --queries queries.jsonl --k 3 --json

# Compare routing against all-tools and compression baselines
tool-rot eval-report --k 3
tool-rot eval-report --k 3 --output eval.json

# Regenerate real open-source MCP savings reports
tool-rot mcp-smoke-report --output-dir reports

# Inspect or rebuild the local index
tool-rot index show
tool-rot index show --server github
tool-rot index rebuild

# Generate client snippets
tool-rot cc-snippet
tool-rot cursor-snippet
```

See `docs/CLIENT_SETUP.md` for exact Cursor and Claude Code configuration examples.

## Benchmark And Eval

`bench` answers: "If I inject K tools, how often is the correct tool included?"

```bash
tool-rot bench --k 3
```
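
Recall@K itself is a simple ratio. The sketch below assumes each evaluated query is a dict with a `correct_tool` name and a `ranked` list of tool names in routing order; that record shape is hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def recall_at_k(results: List[Dict], k: int) -> float:
    """Fraction of queries whose correct tool appears in the first k ranked names."""
    if not results:
        return 0.0
    hits = sum(1 for row in results if row["correct_tool"] in row["ranked"][:k])
    return hits / len(results)
```

A query counts as a hit only if the expected tool is inside the injected window, so raising `k` can never lower the score.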

Custom eval files are passed with `--tools` and `--queries`. The tools file is a JSON array of tool definitions:

```json
[
  {
    "server_name": "github",
    "name": "create_pull_request",
    "description": "Create a pull request in a GitHub repository.",
    "inputSchema": {
      "type": "object",
      "properties": {
        "repo": { "type": "string" },
        "title": { "type": "string" }
      }
    }
  }
]
```

The queries file is JSONL, with one query and its expected namespaced tool per line:

```jsonl
{"query":"open a pull request for this branch","correct_tool":"github.create_pull_request"}
```

`eval-report` compares:

- `all_tools`: sends every schema, 100% tool visibility, no token reduction.
- `compression`: slims every schema, still sends all tools.
- `semantic`: vector retrieval only.
- `hybrid`: vector retrieval plus lexical/BM25-like scoring.

Use this before rolling out a large MCP setup. Real tool names, descriptions, and query patterns matter.
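
To illustrate why `hybrid` can beat `semantic` alone, here is a rough sketch that blends a precomputed vector score with a lexical score. It is not the project's ranking code: the lexical part is plain token overlap rather than a real BM25 variant, and the blend weight `alpha` is a made-up parameter.

```python
import re
from typing import Dict, List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens; underscores and dots act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_score(query: str, document: str) -> float:
    """Share of query tokens that also appear in the document."""
    query_tokens = tokens(query)
    if not query_tokens:
        return 0.0
    return len(query_tokens & tokens(document)) / len(query_tokens)


def hybrid_rank(
    query: str,
    semantic: Dict[str, float],
    descriptions: Dict[str, str],
    alpha: float = 0.7,  # hypothetical weight on the vector score
) -> List[Tuple[str, float]]:
    """Rank tools by a weighted sum of semantic and lexical scores."""
    scored = []
    for name, vector_score in semantic.items():
        text = f"{name} {descriptions.get(name, '')}"
        score = alpha * vector_score + (1 - alpha) * lexical_score(query, text)
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

When two tools have close embedding scores, exact word matches in the tool name or description can break the tie in favor of the intended tool.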

## Real MCP Results

These results were measured against official open-source MCP servers from npm. Token counts use `len(minified_json) // 4`, so treat them as comparable estimates rather than provider billing numbers.

| MCP setup | Direct tools | tool-rot tools | Tokens saved | Reduction |
| --- | ---: | ---: | ---: | ---: |
| Filesystem | 14 | 3 | 2,094 | 79.2% |
| Memory | 9 | 2 | 1,736 | 75.5% |
| Sequential Thinking | 1 | 1 | 10 | 0.9% |
| Everything | 13 | 2 | 1,267 | 87.3% |
| Combined official stack | 37 | 7 | 5,436 | 71.6% |

Full reports live in `reports/`. The single-tool Sequential Thinking server is a useful negative control: routing cannot save much when a server only exposes one tool.
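
The token estimate described above is easy to reproduce. This sketch assumes the estimate is taken over the whole serialized tool list; the function names are illustrative, and the exact serialization options used for the reports may differ.

```python
import json
from typing import Any, Tuple


def estimate_tokens(payload: Any) -> int:
    """Approximate tokens as len(minified_json) // 4."""
    minified = json.dumps(payload, separators=(",", ":"))
    return len(minified) // 4


def reduction(direct_tools: Any, routed_tools: Any) -> Tuple[int, float]:
    """Return (tokens saved, percent reduction) for a routed tool list."""
    direct = estimate_tokens(direct_tools)
    routed = estimate_tokens(routed_tools)
    saved = direct - routed
    percent = 100.0 * saved / direct if direct else 0.0
    return saved, percent
```

Because the divisor is a fixed four characters per token, the numbers are useful for comparing setups against each other, not for predicting a provider's bill.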

To reproduce these results, run:

```bash
tool-rot mcp-smoke-report --output-dir reports
```

See `docs/REPRODUCING_RESULTS.md` for details.

## Configuration Reference

```toml
[proxy]
transport     = "stdio"          # stdio | sse
port          = 4747             # SSE listener port
context_api_port = 4748          # Local query-context API
context_api_token = ""           # Optional bearer token; env TOOL_ROT_CONTEXT_TOKEN also works
k             = 3                # Initial tools injected per turn
max_k         = 8                # Upper bound for auto-tuned K
auto_tune_k   = false            # Increase effective K after routing misses
always_inject = ["server.tool"]  # Tools always returned by tools/list

[routing]
embedder     = "all-MiniLM-L6-v2" # all-MiniLM-L6-v2 | tfidf
index_type   = "flat"             # flat | hnsw
ranking_mode = "hybrid"           # hybrid | semantic
cache_dir    = ".tool-rot"

[logging]
session_log = ".tool-rot/session.jsonl"
verbose     = false
log_query_preview = false        # Avoid logging prompt text by default

[[server]]
name    = "local-server"
command = "npx"
args    = ["-y", "@modelcontextprotocol/server-name"]
env     = { API_KEY = "${API_KEY}" }

[[server]]
name = "remote-server"
url  = "https://example.com/sse"
```
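
The `${API_KEY}` placeholders in `env` refer to environment variables. A minimal sketch of that kind of expansion follows; raising an error on a missing variable is an assumption, and `tool-rot` may handle unset variables differently.

```python
import os
import re
from typing import Mapping, Optional

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${NAME} in value with the matching environment variable."""
    env = os.environ if environ is None else environ

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            # Assumption: fail loudly rather than pass an empty secret upstream.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(replace, value)
```

Keeping secrets as placeholders means `tool-rot.toml` can be committed without embedding tokens.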

## Production Notes

- Use namespaced `always_inject` values such as `filesystem.read_file`.
- Keep tool descriptions short but specific; routing quality depends on schema quality.
- Start with `k = 3`, run `bench`, then adjust based on Recall@K and token savings.
- Enable `auto_tune_k` when correctness matters more than maximum token savings.
- Watch `.tool-rot/session.jsonl` or `tool-rot status` for routing misses.
- Upstream `notifications/tools/list_changed` events trigger a tool refresh and index rebuild.
- See `docs/ROUTING_GUIDE.md` for recommended `k` values by tool count.
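
The `auto_tune_k` note can be read as a bounded counter. The sketch below assumes K grows by one per routing miss and never exceeds `max_k`; the step size and the absence of any decay are illustrative choices, not documented behavior.

```python
def next_k(current_k: int, max_k: int, missed: bool, step: int = 1) -> int:
    """Raise the effective K after a routing miss, capped at max_k."""
    if not missed:
        return current_k
    return min(current_k + step, max_k)
```

Starting from `k = 3` with `max_k = 8`, five misses are enough to reach the cap, after which further misses leave K unchanged.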

## Client Setup

### Claude Code

```bash
tool-rot cc-snippet
```

Add the generated MCP config and hook config to `.claude/settings.json`. The hook sends the latest user query to the local context API so routing has query context from the first turn.

### Cursor

```bash
tool-rot cursor-snippet
```

Add the generated JSON to `.cursor/mcp.json` or your user-level Cursor MCP config. Cursor can use `tool-rot` as a normal MCP server over `stdio` or `sse`.

## Development

```bash
python3.12 -m venv .venv
.venv/bin/python -m pip install -e ".[dev,sse]"
.venv/bin/python -m pytest tests -q
```

Focused checks:

```bash
.venv/bin/python -m pytest tests/test_prod_hardening.py tests/test_p0_p1.py -q
.venv/bin/python -m compileall tool_rot tests -q
```

Release steps are documented in `docs/RELEASE_CHECKLIST.md`.

## Status

`tool-rot` is intended for local and team MCP workflows where large tool menus are creating measurable context overhead. It is designed to be conservative: route schemas aggressively, but keep execution passthrough intact.

Known limits:

- Query context is best when the host can send the latest user message before `tools/list`.
- Resource and prompt passthrough is aggregation-only; semantic routing is currently tools-only.
- HNSW is available as a config option, but you should benchmark it on your own tool corpus before using it for large deployments.

## License

Apache-2.0
