Metadata-Version: 2.4
Name: litellm-wzrd-momentum
Version: 0.2.2
Summary: Your LiteLLM config, but smarter. Routes to the model the ecosystem is converging on.
Author-email: twzrd <twzrd@twzrd.xyz>
License-Expression: MIT
Project-URL: Homepage, https://github.com/twzrd-sol/litellm-wzrd-momentum
Project-URL: Documentation, https://github.com/twzrd-sol/litellm-wzrd-momentum#readme
Project-URL: Signal API, https://api.twzrd.xyz/v1/signals/momentum
Project-URL: WZRD Protocol, https://twzrd.xyz
Keywords: litellm,llm,routing,momentum,wzrd,ai,model-selection
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: httpx>=0.24
Provides-Extra: litellm
Requires-Dist: litellm>=1.0; extra == "litellm"
Provides-Extra: requests
Requires-Dist: requests>=2.28; extra == "requests"
Provides-Extra: rewards
Requires-Dist: wzrd-client>=0.5.5; extra == "rewards"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: wzrd-client>=0.5.5; extra == "dev"

# litellm-wzrd-momentum

**Your LiteLLM config, but smarter.** Routes to the model the ecosystem is converging on — not the one you hardcoded six months ago.

```bash
pip install litellm-wzrd-momentum
```

```python
from litellm import Router
from wzrd_momentum_strategy import register

router = Router(model_list=[...])  # your existing config
register(router)                    # done — routing is now dynamic
```

That's it. Your `router.completion()` calls now prefer models with accelerating real-world adoption. If WZRD is unreachable, your existing config takes over. Nothing breaks.

## Why

You hardcode `"gpt-4o"` or `"claude-sonnet-4-20250514"`. A new model surges past both on HuggingFace and OpenRouter. You don't notice for weeks. Meanwhile, the new model is 3x cheaper and faster for your workload.

This plugin tracks which models are trending **right now** across HuggingFace downloads, GitHub stars, OpenRouter routing volume, and ArtificialAnalysis benchmarks. It re-ranks your existing model list every 5 minutes. No new models are injected; it only reorders what you already configured.

## Full example

```python
from litellm import Router
from wzrd_momentum_strategy import register

router = Router(model_list=[
    {"model_name": "qwen-9b",  "litellm_params": {"model": "openrouter/qwen/qwen-3.5-9b"}},
    {"model_name": "qwen-35b", "litellm_params": {"model": "openrouter/qwen/qwen-3.5-35b-a3b"}},
    {"model_name": "llama-70b","litellm_params": {"model": "openrouter/meta-llama/llama-3.3-70b-instruct"}},
])

register(router, alias_map={
    "qwen-9b":  ["Qwen/Qwen3.5-9B"],
    "qwen-35b": ["Qwen/Qwen3.5-35B-A3B"],
    "llama-70b": ["meta-llama/Llama-3.3-70B-Instruct"],
})

# Sync
response = router.completion(model="qwen-9b", messages=[{"role": "user", "content": "Hello"}])

# Async
# response = await router.acompletion(model="qwen-9b", messages=[...])
```

## How it works

1. On each routing decision, fetches [WZRD momentum signals](https://api.twzrd.xyz/v1/signals/momentum) (cached 5 min)
2. Scores each deployment: `trend + momentum × 0.3 + delta × 0.25`, weighted by confidence
3. Returns the highest-scoring deployment to LiteLLM
4. LiteLLM handles retries, fallbacks, and provider errors as normal

If WZRD is unreachable, the strategy returns the first deployment. Your inference pipeline never breaks.
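The scoring step above can be sketched as a pure function. This is an illustrative sketch, not the package's actual internals: the signal field names (`trend`, `momentum`, `delta`, `confidence`) are assumptions based on the formula and tables in this README.

```python
# Sketch of step 2: combine trend, momentum, and delta, then scale by confidence.
# Field names and dict shapes are illustrative assumptions.
TREND_SCORES = {
    "surging": 3.0, "accelerating": 2.0, "stable": 0.0,
    "decelerating": -1.0, "cooling": -2.0,
}
CONFIDENCE_WEIGHTS = {"normal": 1.0, "low": 0.5, "insufficient": 0.0}

def momentum_score(signal: dict) -> float:
    """trend + momentum * 0.3 + delta * 0.25, weighted by confidence."""
    raw = (
        TREND_SCORES.get(signal.get("trend", "stable"), 0.0)
        + signal.get("momentum", 0.0) * 0.3
        + signal.get("delta", 0.0) * 0.25
    )
    return raw * CONFIDENCE_WEIGHTS.get(signal.get("confidence", "insufficient"), 0.0)

def pick_deployment(deployments: list, signals: dict):
    """Return the highest-scoring deployment; missing signals score 0."""
    if not deployments:
        return None
    return max(deployments, key=lambda d: momentum_score(signals.get(d, {})))
```

Unknown trends and `insufficient` confidence both collapse to a zero score, which matches the observe-only posture described under "Behavior defaults".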

## Behavior defaults

- `cache_ttl=300` seconds (5 minutes)
- confidence policy:
  - `normal`: full signal weight (eligible for proactive routing)
  - `low`: half signal weight (observe-first posture)
  - `insufficient`: zero signal weight (observe-only; no proactive push)
- fallback policy: if WZRD is down or payload contract drifts, route by deployment order (first candidate)
- contract guard: requires `contract_version` (or legacy `signal_version`) and model-level fields
  (`model`, `trend`, `score`, `confidence`)
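The contract guard described above can be sketched roughly as follows. This is a hedged sketch: the top-level `models` key and the exact payload shape are assumptions; only the required field names come from this README.

```python
# Sketch of the contract guard: reject payloads missing the version marker
# or any required per-model field, so routing falls back to first-deployment
# order instead of acting on a drifted schema.
REQUIRED_MODEL_FIELDS = {"model", "trend", "score", "confidence"}

def payload_is_valid(payload: dict) -> bool:
    """True only if a version marker and all required model fields are present."""
    if "contract_version" not in payload and "signal_version" not in payload:
        return False  # no recognized version marker: treat as contract drift
    models = payload.get("models", [])  # 'models' key is an assumed shape
    return all(REQUIRED_MODEL_FIELDS <= set(m) for m in models)
```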

## Score table

| Trend | Score | Signal |
|-------|-------|--------|
| surging | +3.0 | Downloads/stars growing >50% day-over-day |
| accelerating | +2.0 | Growing 10-50% day-over-day |
| stable | 0.0 | Flat or <10% growth |
| decelerating | -1.0 | Slowing 5-30% day-over-day |
| cooling | -2.0 | Dropping >30% day-over-day |

Confidence scaling: `normal` = full weight, `low` = 50%, `insufficient` = 0% (new models with <3 days of data).

## Alias mapping

WZRD tracks models by HuggingFace/GitHub name (`Qwen/Qwen3.5-9B`).
LiteLLM uses provider-specific names (`openrouter/qwen/qwen-3.5-9b`).

The `alias_map` bridges them explicitly. Without it, the strategy auto-matches
by extracting slugs from `litellm_params.model` — works for most cases, but
explicit mapping is more reliable.

```python
register(router, alias_map={
    "qwen-9b": ["Qwen/Qwen3.5-9B", "Qwen/Qwen3-9B"],  # multiple variants
    "llama-70b": ["meta-llama/Llama-3.3-70B-Instruct"],
})
```
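The auto-match fallback can be approximated like this. The normalization below is a hypothetical sketch of slug matching, not the package's actual algorithm: it compares the final path segment of each name after stripping case and separators.

```python
# Hypothetical sketch of slug-based auto-matching between a LiteLLM
# provider model string and a WZRD (HuggingFace-style) model name.
def _norm(name: str) -> str:
    """Take the final path segment, lowercase it, drop separators."""
    slug = name.rsplit("/", 1)[-1].lower()
    return slug.replace("-", "").replace(".", "").replace("_", "")

def auto_matches(litellm_model: str, wzrd_name: str) -> bool:
    """True if the two names reduce to the same normalized slug."""
    return _norm(litellm_model) == _norm(wzrd_name)
```

Normalization like this handles `openrouter/qwen/qwen-3.5-9b` vs `Qwen/Qwen3.5-9B`, but ambiguous slugs are exactly why the explicit `alias_map` is more reliable.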

## Proxy integration

LiteLLM's proxy doesn't support custom strategies via YAML config.
For proxy deployments, create a wrapper script:

```python
# wzrd_proxy.py
import litellm
from litellm import Router
from wzrd_momentum_strategy import register

# Your normal proxy config
router = Router(model_list=[...])
register(router, alias_map={...})

# Start proxy with the patched router; serve the imported FastAPI app
# as a normal ASGI app (e.g. `uvicorn wzrd_proxy:app`)
from litellm.proxy.proxy_server import app
```

Or use the pre-router pattern from `integrations/litellm-wzrd-router/`, which
works as middleware before any LiteLLM call (SDK or proxy).

## Manual setup

If you prefer explicit control instead of the `register()` convenience helper:

```python
from wzrd_momentum_strategy import WZRDMomentumStrategy

strategy = WZRDMomentumStrategy(
    router,
    wzrd_url="https://api.twzrd.xyz/v1/signals/momentum",
    alias_map={"qwen-9b": ["Qwen/Qwen3.5-9B"]},
    cache_ttl=300,
)
router.set_custom_routing_strategy(strategy)
```

## API

The momentum data comes from a public, free, no-auth endpoint:

```
GET https://api.twzrd.xyz/v1/signals/momentum
GET https://api.twzrd.xyz/v1/signals/momentum?platform=huggingface&trending=true
```

Returns trend classification, score, confidence, action, capabilities,
and platform for 48+ tracked AI models.
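Fetching the feed yourself is straightforward with `httpx` (the plugin's declared HTTP dependency). A minimal sketch, assuming the response carries a `models` list with the fields described above; the helper below degrades to an empty dict on any failure, mirroring the plugin's fallback posture:

```python
MOMENTUM_URL = "https://api.twzrd.xyz/v1/signals/momentum"

def fetch_signals(url: str = MOMENTUM_URL, timeout: float = 5.0) -> dict:
    """Fetch the public momentum feed; return {} on any failure so the
    caller can fall back to static deployment ordering."""
    import httpx  # the plugin's declared HTTP dependency
    try:
        resp = httpx.get(url, timeout=timeout)
        resp.raise_for_status()
        return resp.json()
    except (httpx.HTTPError, ValueError):
        return {}

def trending_names(payload: dict) -> list:
    """Pure helper: names of models whose trend is surging or accelerating.
    The 'models' list shape is an assumption, not a documented schema."""
    return [
        m["model"]
        for m in payload.get("models", [])
        if m.get("trend") in ("surging", "accelerating")
    ]
```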

## Expected behavior

For a candidate set like `qwen-9b`, `nemotron-120b`, `llama-70b`, expected behavior is:

- route to `nemotron-120b` when it is `surging`
- deprioritize `qwen-9b` when `decelerating`
- deprioritize `llama-70b` when `cooling`

The exact winner changes as momentum updates, but routing should follow trend
and confidence consistently.

## v0.1.0 release notes

- Added LiteLLM `CustomRoutingStrategyBase` plugin with one-line registration helper
- Added trend + momentum + delta scoring with confidence weighting
- Added explicit alias map matching and automatic fallback matching from provider model slugs
- Added contract guard for WZRD payload shape (`signal_version` + required model fields)
- Added graceful degradation fallback to first deployment when WZRD is unavailable
- Added test suite coverage for scoring order, confidence behavior, matching paths, async routing,
  caching behavior, register helper, and payload contract guard

## License

MIT
