Metadata-Version: 2.4
Name: xpyd-proxy
Version: 1.3.0
Summary: Lightweight Prefill-Decode proxy for disaggregated LLM serving
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi>=0.110.0
Requires-Dist: uvicorn>=0.29.0
Requires-Dist: uvloop>=0.19.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: aiohttp>=3.9.0
Requires-Dist: requests>=2.31.0
Requires-Dist: colorlog>=6.8.0
Requires-Dist: transformers>=4.38.0
Requires-Dist: prometheus-client>=0.20.0
Requires-Dist: PyYAML>=6.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23.0; extra == "dev"
Requires-Dist: ruff>=0.4.0; extra == "dev"
Requires-Dist: isort>=5.13.0; extra == "dev"
Requires-Dist: tiktoken>=0.7.0; extra == "dev"
Requires-Dist: pytest-timeout>=2.3.0; extra == "dev"
Dynamic: license-file

# xPyD-proxy

**Lightweight Prefill-Decode disaggregated proxy for LLM serving.**

xPyD-proxy routes inference requests between prefill and decode nodes, enabling PD-disaggregated LLM serving with load balancing, health monitoring, and fault tolerance.

## Key Features

- **PD disaggregation** — separate prefill and decode nodes for optimal resource utilization
- **Multiple scheduling policies** — round-robin, consistent hash, cache-aware, power-of-two
- **Resilience** — circuit breaker, health monitoring, automatic failover
- **Multi-model routing** — serve multiple models through a single proxy
- **OpenAI-compatible API** — drop-in replacement for vLLM/OpenAI endpoints
- **YAML configuration** — declarative topology and settings

## Install

```bash
pip install xpyd-proxy
```

Or as part of the full xPyD toolkit:

```bash
pip install xpyd
```

## Quick Start

```bash
# Start with YAML config
xpyd proxy --config proxy.yaml

# Or with CLI args
xpyd proxy --model my-model \
  --prefill 127.0.0.1:8001 \
  --decode 127.0.0.1:8002
```

## Part of xPyD

| Component | Description |
|-----------|-------------|
| **xpyd-proxy** | PD-disaggregated proxy |
| [xpyd-sim](https://github.com/xPyD-hub/xPyD-sim) | OpenAI-compatible inference simulator |
| [xpyd-bench](https://github.com/xPyD-hub/xPyD-bench) | Benchmarking & planning tool |

📖 **[Full Guide →](docs/guide.md)** | 💡 **[Examples →](examples/)** | 🏗️ **[Contributing →](CONTRIBUTING.md)**

## License

Apache 2.0 — see [LICENSE](LICENSE)
