Metadata-Version: 2.4
Name: xrouter-llm
Version: 0.1.1
Summary: Prompt-aware LLM routing-decision service: predicts which model can complete a prompt and picks the cheapest one.
Author: Xorbits Inc.
License: # Xagent Source License
        
        **Effective Date:** February 15, 2026
        
        Copyright © 2026 Xorbits Inc.
        
        ---
        
        ## 1. Overview
        
        The Xagent software, source code, and associated materials (the **“Software”**) are provided under this Xagent Source License (the **“License”**).
        
        This License provides source-available rights for use, modification, and internal commercial deployment, while restricting certain hosted/service and competitive uses.
        
        > **Note:** This License is **not** an OSI-approved open source license.
        
        ---
        
        ## 2. Acceptance
        
        By using, copying, modifying, distributing, or making available the Software, you agree to be bound by this License.
        
        ---
        
        ## 3. Grant of Rights
        
        Subject to the terms and conditions of this License, the licensor (**“Licensor”**) grants you a non-exclusive, worldwide, royalty-free, non-transferable, non-sublicensable license to:
        
        1. **Use** the Software;
        2. **Copy** the Software;
        3. **Modify** the Software and create derivative works;
        4. **Distribute** the Software (including derivative works) in source and/or object form; and
        5. **Deploy** the Software for internal business purposes.
        
        All rights not expressly granted are reserved.
        
        ---
        
        ## 4. Restrictions
        
        ### 4.1 Hosted / Managed Service Restriction
        
        Except as expressly permitted below, you may not provide the Software, or any **Restricted Functionality** of the Software, to any **Third Party** as a hosted service, managed service, or otherwise make it available for use over a network.
        
        This prohibition includes (without limitation):
        
        * offering the Software as “Xagent-as-a-Service” or a shared agent execution platform for multiple Third Parties;
        * providing multiple Third Parties access to a shared runtime, orchestration, execution, scheduling, workflow, or UI environment powered by the Software; or
        * operating a multi-tenant service in which Third Parties can create, run, manage, or monitor agents or workflows using the Software.
        
        ### Permitted Single-Tenant Deployment
        
        You may deploy and operate the Software on behalf of a single Third Party customer, provided that:
        
        1. the deployment is dedicated to that customer (single-tenant);
        2. the customer does not share access with other Third Parties;
        3. the Software is not offered as a generalized or reusable platform service to multiple customers;
        4. such deployment is limited to that specific customer’s internal use; and
        5. all Xagent trademarks, product names, copyright notices, and branding elements remain visible and unaltered within the Software and related user interfaces.
        
        Removal, replacement, white-labeling, or obscuring of Xagent branding in a single-tenant deployment is prohibited unless you have obtained a separate commercial license or written authorization from the Licensor.
        
        For clarity, internal deployment within your own organization and your Affiliated Entities is permitted.
        
        ### 4.2 Competitive Use Restriction
        
        You may not use the Software to develop, offer, or operate a product or service whose primary purpose is to provide an agent orchestration runtime or agent execution platform that competes directly with the Licensor’s commercial Xagent offering.
        
        ### 4.3 License Protection / Technical Restrictions
        
        You may not remove, disable, circumvent, or materially alter any license verification, usage limitation, feature gating, entitlement checking, or similar functionality included in the Software that is intended to enforce this License or commercial terms.
        
        ### 4.4 Notice and Attribution
        
        You may not alter, remove, or obscure any licensing, copyright, attribution, or other notices included in the Software.
        
        If you distribute a modified version of the Software, you must include prominent notices stating that you have modified the Software.
        
        ---
        
        ## 5. Trademarks
        
        This License does not grant you any rights to use the Licensor’s trademarks, service marks, trade names, logos, or product names (including **“Xagent”**), except as required for reasonable and customary use in describing the origin of the Software.
        
        ---
        
        ## 6. Patents
        
        The Licensor grants you a license under any patent claims the Licensor can license, or becomes able to license, to make, have made, use, sell, offer for sale, import, and have imported the Software, subject to the restrictions in this License.
        
        This patent license does not apply to any patent claims infringed by your modifications or additions.
        
        If you or your company make any written claim (including in a lawsuit or administrative proceeding) that the Software infringes or contributes to infringement of any patent, then your patent license under this License terminates immediately.
        
        ---
        
        ## 7. Distribution Conditions
        
        If you distribute any copy of the Software (modified or unmodified), you must ensure that recipients receive a copy of this License.
        
        ---
        
        ## 8. Termination and Reinstatement
        
        If you violate this License, your rights under this License terminate automatically.
        
        If the Licensor provides notice of the violation and you cure the violation within **30 days** of receiving notice, your rights will be reinstated retroactively.
        
        If you violate this License after reinstatement, your rights terminate automatically and permanently.
        
        ---
        
        ## 9. Disclaimer of Warranty
        
        TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED **“AS IS”**, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT.
        
        ---
        
        ## 10. Limitation of Liability
        
        TO THE MAXIMUM EXTENT PERMITTED BY LAW, IN NO EVENT WILL THE LICENSOR BE LIABLE FOR ANY DAMAGES ARISING OUT OF OR RELATING TO THIS LICENSE OR THE SOFTWARE, WHETHER IN CONTRACT, TORT, OR OTHERWISE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
        
        ---
        
        ## 11. Definitions
        
        **“Affiliated Entities”** means any entity that controls, is controlled by, or is under common control with you.
        
        **“Control”** means ownership of more than 50% of the voting power or equity interests, or the power to direct management or policies.
        
        **“Restricted Functionality”** means the core runtime and orchestration capabilities of the Software, including (without limitation):
        
        * agent orchestration and task execution runtime;
        * multi-agent coordination and scheduling;
        * workflow execution and planning engine;
        * tool integration runtime and connectors;
        * management UI used to create, run, manage, or monitor agents/workflows.
        
        **“Third Party”** means any person or entity other than you and your Affiliated Entities.
        
        **“You”** means the individual or entity exercising rights under this License.
        
        ---
        
        ## 12. Commercial Licensing
        
        If you wish to use the Software in a way not permitted under this License (including offering a hosted or managed service), you may obtain a commercial license from the Licensor.
        
        ---
        
        ## 13. Miscellaneous
        
        If any provision of this License is held unenforceable, the remaining provisions will remain in effect.
        
        This License is the entire agreement regarding the Software and supersedes any prior or contemporaneous agreements relating to the Software.
        
        ---
        
        **Version 1.0 — Effective February 15, 2026**
        
Project-URL: Homepage, https://github.com/xorbitsai/xrouter-llm
Project-URL: Repository, https://github.com/xorbitsai/xrouter-llm
Keywords: llm,router,routing,model-selection,irt,openrouter
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: huggingface-hub>=0.23
Requires-Dist: joblib>=1.3
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: scikit-learn>=1.3
Requires-Dist: scipy>=1.10
Provides-Extra: dev
Requires-Dist: pytest>=7.4; extra == "dev"
Dynamic: license-file

<div align="center">
<img src="./assets/xorbits-logo.png" width="180px" alt="xorbits" />

# xrouter-llm

<img src="./assets/xrouter-llm-hero-clean.png" alt="xrouter-llm: 52.4% lower cost and +1.7 pts completion on our tested dataset" />

</div>

Stop sending every prompt to your most expensive LLM.

`xrouter-llm` is a prompt-aware LLM **routing-decision** service: it predicts
which models can complete a prompt, then chooses the cheapest model that clears
the bar. On our tested dataset, it cuts realized cost by **52.4%** while
improving completion by **+1.7 pts**.

It answers "which model should serve this prompt?" and records the choice — it
does NOT call the underlying LLMs.

## Install

```bash
pip install xrouter-llm        # ships a trained router + model registry
# or, for development:
pip install -e ".[dev]"
```

The wheel bundles a trained router artifact, the model-profile registry, and the
router configs, so a fresh install can serve immediately with no extra files.

## Serve

The bundled router, registry, and configs are the defaults, so a bare invocation
works out of the box:

```bash
xrouter-llm serve --port 8080
```

Override any of them to use your own trained model or registry:

```bash
xrouter-llm serve \
  --model artifacts/models/irt_router_350k.joblib \
  --models-dir path/to/models --routers-dir path/to/routers \
  --db artifacts/calls.db --port 8080
```

- `GET /` — single-page UI (prompt box, config picker, decision table, history)
- `GET /api/configs`, `POST /api/route` (`{prompt, config, task?}`),
  `GET /api/history?limit=N`
- Every decision is logged to SQLite (`*.db`/`*.sqlite` are gitignored — the log
  holds user prompts).
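
A minimal client sketch for `POST /api/route`, using only the standard library. It assumes the endpoint accepts a JSON body with the `{prompt, config, task?}` fields listed above and returns JSON; the helper names `build_route_request` and `route` are hypothetical, and the response shape is deliberately not assumed.

```python
import json
import urllib.request


def build_route_request(base_url, prompt, config, task=None):
    """Build a POST /api/route request; `task` is only sent when provided."""
    payload = {"prompt": prompt, "config": config}
    if task is not None:
        payload["task"] = task
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/route",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


def route(base_url, prompt, config, task=None, timeout=10.0):
    """Send the request and return the decoded JSON body as-is."""
    request = build_route_request(base_url, prompt, config, task)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running on port 8080, `route("http://localhost:8080", "Summarize this paragraph", config=...)` would return the decision, where `config` is one of the names reported by `GET /api/configs`.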

## Model registry

The registry holds one YAML file per supported model, bundled under
`src/xrouter_llm/resources/config/models/`. Each file is a capability profile:
provider, costs, context, and published benchmarks as 0-100 percentages.
`model_id` is the model's canonical OpenRouter slug (e.g.
`anthropic/claude-opus-4.8`). The bundled registry is the default for
`--benchmark-profiles`; point it at your own directory or file to extend it.
Adding a model means adding a file.

```python
from xrouter_llm import IRTRouter, default_model_path, default_models_dir, load_benchmark_profiles

# Load the trained router that ships with the package.
router = IRTRouter.load(default_model_path())

# Register every capability profile from the bundled model registry.
for profile in load_benchmark_profiles(default_models_dir()).profiles():
    router.add_benchmark_profile(profile)

# Score two candidate models on one prompt.
preds = router.predict(
    "Design a distributed consensus algorithm",
    model_ids=["anthropic/claude-opus-4.8", "deepseek/deepseek-v4-pro"],
)
print({p.model_id: round(p.mu, 3) for p in preds})
```

## How it works

```text
Do not train:  prompt -> selected model
Train:         prompt + model -> probability the model completes the prompt
Decide:        predicted completion + cost -> cheapest model that can complete
```

Completion is factored into two decoupled axes (an IRT-style model):

```text
P(complete) = sigmoid(a * capability(model) + b * difficulty(prompt) + c)
```

- **capability(model)** = the mean of the model's published `gpqa_diamond` and
  `livecodebench` scores (both have full coverage on the training side). Adding
  more benchmarks doesn't help at this data scale: a flat mean dilutes the
  signal, and learned weights overfit with only 37 profiled models (see
  AGENTS.md, "Capability benchmarks"). The mean is used directly, so a
  brand-new model's benchmarks drive its ranking.
- **difficulty(prompt)** = a Ridge regressor on a multilingual embedding
  (`Qwen/Qwen3-Embedding-0.6B`), trained on each prompt's empirical pass-rate.
  The embedding is multilingual: difficulty estimates for Chinese prompts
  transfer from English training data. It was picked over `bge-m3` by a
  controlled probe (`scripts/probe_qwen_difficulty.py`): it has a higher
  held-out Pearson correlation, and it no longer rates trivial prompts
  ("1+1=?") as maximally hard.
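
As a rough illustration of the formula above, the sketch below plugs hypothetical numbers into it. The coefficients `A`, `B`, `C`, the rescaling of the benchmark mean from 0-100 to 0-1, and the convention that a higher difficulty value means a harder prompt are all assumptions made for this example, not values taken from the trained router.

```python
import math


def capability(gpqa_diamond, livecodebench):
    """Mean of the two benchmark scores, rescaled from 0-100 to 0-1 (assumed scale)."""
    return (gpqa_diamond + livecodebench) / 2.0 / 100.0


def p_complete(cap, difficulty, a, b, c):
    """P(complete) = sigmoid(a * capability + b * difficulty + c)."""
    z = a * cap + b * difficulty + c
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients: capability helps (a > 0), difficulty hurts (b < 0).
A, B, C = 4.0, -5.0, 1.0

strong = capability(gpqa_diamond=85.0, livecodebench=75.0)  # mean 80 -> 0.80
weak = capability(gpqa_diamond=45.0, livecodebench=35.0)    # mean 40 -> 0.40

for difficulty in (0.1, 0.9):
    print(
        difficulty,
        round(p_complete(strong, difficulty, A, B, C), 3),
        round(p_complete(weak, difficulty, A, B, C), 3),
    )
```

With these made-up coefficients the stronger profile scores higher at both difficulty levels, and both predictions drop as difficulty rises. The two axes contribute independently inside the sigmoid, which is the point of the factoring.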

This factoring is the key lesson: a single joint classifier could not rank
unseen models by their benchmarks. On this data, model capability barely
explains completion *marginally*, but it does once difficulty is controlled
for, which is exactly what the factored model exploits.

## Datasets

The production difficulty model is trained on **multiple datasets combined**
(all feed the difficulty axis; only profiled models feed the capability axis):

| Source | Type | Scale | In production train? |
| --- | --- | --- | --- |
| `NPULH/LLMRouterBench` (350k stream sample) | single-turn QA / code / math (22 tasks) | 37 models x ~13.8k prompts | ✅ |
| agent-psychometrics — Terminal-Bench 2.0 | terminal agent | 89 tasks x 112 subjects | ✅ `--dataset agentic:agentic/terminalbench` |
| agent-psychometrics — SWE-bench Verified | coding agent | 500 tasks x 134 subjects | ✅ task text joined from `princeton-nlp/SWE-bench_Verified` |
| agent-psychometrics — SWE-bench Pro / GSO | coding agent | 730 tasks x 14 subjects / 102 tasks x 15 subjects | ⛔ no local task text shipped; external join needed |

The current artifact trains on LLMRouterBench 350k **+ Terminal-Bench +
SWE-bench Verified** (377,997 rows / ~14,364 prompts / 283 subjects). The
agentic matrices come from
[agent-psychometrics](https://github.com/dariakryvosheieva/agent-psychometrics)
(MIT) via `agentic.py`. Only the 37 profiled LLMRouterBench models feed the
capability axis; agentic subjects feed difficulty only. RouterBench
(`withmartian/routerbench`) remains a smaller legacy baseline. Local datasets and
trained artifacts are not committed (`data/`, `artifacts/` are gitignored).

Adding more agentic prompt types (e.g. your own traffic) is the only way to make
difficulty accurate for task mixes outside coding/terminal — see AGENTS.md.

## Train

```bash
xrouter-llm train-irt \
  --dataset llmrouterbench:data/raw/llmrouterbench_stream_sample_350k \
  --dataset agentic:agentic/terminalbench \
  --dataset agentic:agentic/swebench_verified \
  --benchmark-profiles artifacts/profiles/llmrouterbench_350k_profiles_priority_collected.json \
  --output artifacts/models/irt_router_350k.joblib
```

Diagnostics: `sweep-thresholds` (cost/completion frontier + calibration) and
`eval-model-holdout` (leave-one-model-out generalization).
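
To make "cost/completion frontier" concrete, here is a small sketch of a threshold sweep over toy data. It is not the `sweep-thresholds` implementation: the function name `sweep`, the input layout, and the simplified fallback (pick the best-predicted model when nothing clears the threshold) are assumptions for illustration.

```python
def sweep(predictions, outcomes, costs, thresholds):
    """Return (threshold, mean cost, realized completion rate) per threshold.

    predictions[i][m]: predicted completion for prompt i and model m
    outcomes[i][m]:    1 if model m actually completed prompt i, else 0
    costs[m]:          cost of one call to model m
    """
    rows = []
    n = len(predictions)
    for t in thresholds:
        total_cost = 0.0
        completed = 0
        for preds, outs in zip(predictions, outcomes):
            clearing = [m for m, p in preds.items() if p >= t]
            if clearing:
                # Cheapest model whose prediction clears the threshold.
                pick = min(clearing, key=lambda m: costs[m])
            else:
                # Simplified fallback: the best-predicted model.
                pick = max(preds, key=preds.get)
            total_cost += costs[pick]
            completed += outs[pick]
        rows.append((t, total_cost / n, completed / n))
    return rows


# Toy data: two prompts, two hypothetical models.
predictions = [{"big": 0.9, "small": 0.6}, {"big": 0.8, "small": 0.3}]
outcomes = [{"big": 1, "small": 1}, {"big": 1, "small": 0}]
costs = {"big": 10.0, "small": 1.0}

for row in sweep(predictions, outcomes, costs, [0.2, 0.5, 0.7]):
    print(row)
```

On this toy data, raising the threshold moves the operating point from cheap but unreliable to expensive but reliable. Plotting mean cost against completion rate across thresholds traces the frontier.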

## Components

- `IRTRouter` (`irt_router.py`): the predictor (difficulty x capability).
- `RoutingPolicy` (`policy.py`): "cheapest model whose predicted completion
  clears `completion_threshold`; else the cheapest within `fallback_quality_margin`
  of the best predicted completion".
- `serving.py` / `server.py`: HTTP routing-decision API + single-page web UI.
- `resources/config/models/`: a per-model YAML registry of capability profiles
  (bundled in the package; resolve with `default_models_dir()`).
- `resources/config/routers/`: named "auto configs" — a candidate model set +
  policy (bundled; `default_routers_dir()`).
- `resources/models/irt_router_350k.joblib`: the trained router shipped with the
  package (`default_model_path()`).
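
The `RoutingPolicy` rule quoted above can be sketched in a few lines. This is an illustration of the stated rule, not the code in `policy.py`: the `Candidate` class, the `choose` function, the use of `>=` for "clears", and the tie-break on cost are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    model_id: str
    predicted_completion: float  # in [0, 1]
    cost: float                  # any consistent unit


def choose(candidates, completion_threshold, fallback_quality_margin):
    """Cheapest candidate that clears the threshold; otherwise the cheapest
    candidate within the margin of the best predicted completion."""
    if not candidates:
        raise ValueError("need at least one candidate")
    eligible = [
        c for c in candidates if c.predicted_completion >= completion_threshold
    ]
    if not eligible:
        best = max(c.predicted_completion for c in candidates)
        eligible = [
            c for c in candidates
            if c.predicted_completion >= best - fallback_quality_margin
        ]
    # Break cost ties in favour of the higher predicted completion.
    return min(eligible, key=lambda c: (c.cost, -c.predicted_completion))


pool = [
    Candidate("model-large", 0.92, 10.0),
    Candidate("model-medium", 0.81, 3.0),
    Candidate("model-small", 0.55, 0.5),
]
print(choose(pool, 0.80, 0.05).model_id)  # threshold path: model-medium
print(choose(pool, 0.95, 0.15).model_id)  # fallback path: model-medium
```

In the second call no model clears 0.95, so the fallback keeps every model within 0.15 of the best prediction (0.92) and picks the cheapest of those.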

## License

`xrouter-llm` is released under the **Xagent Source License** (© Xorbits Inc.) —
see [LICENSE](LICENSE). It is source-available, **not** an OSI-approved open
source license.

The license text is shared verbatim with [Xagent](https://github.com/xorbitsai/xagent);
for this project the licensed "Software" is `xrouter-llm`, and the
"Restricted Functionality" / hosted-service and competitive-use clauses apply to
its routing-decision and model-selection capabilities. In short: use,
modification, and internal/single-tenant deployment are permitted; offering it as
a multi-tenant hosted/managed service, or a directly competing service, is not.
See [LICENSE](LICENSE) for the controlling terms.
