Metadata-Version: 2.4
Name: isagellm-control-plane
Version: 0.5.2.2
Summary: sageLLM Control Plane - Intelligent request routing, scheduling, and engine lifecycle management
Author: IntelliStream Team
License: Proprietary - IntelliStream
Project-URL: Homepage, https://github.com/IntelliStream/sagellm-control-plane
Project-URL: Repository, https://github.com/IntelliStream/sagellm-control-plane
Keywords: llm,inference,control-plane,scheduling,routing,autoscaling
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: ==3.11.*
Description-Content-Type: text/markdown
Requires-Dist: isagellm-protocol<0.6.0,>=0.5.2.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: httpx>=0.24.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-benchmark>=4.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: pytest-timeout>=2.3.0; extra == "dev"
Requires-Dist: ruff>=0.8.0; extra == "dev"
Requires-Dist: isage-pypi-publisher>=0.2.0; extra == "dev"

# sageLLM Control Plane

## Protocol Compliance (Mandatory)

- MUST follow Protocol v0.1:
  https://github.com/intellistream/sagellm-docs/blob/main/docs/specs/protocol_v0.1.md
- Any globally shared definitions (fields, error codes, metrics, IDs, schemas) MUST be added to
  Protocol first.

[![CI Status](https://github.com/intellistream/sagellm-control-plane/actions/workflows/ci.yml/badge.svg)](https://github.com/intellistream/sagellm-control-plane/actions/workflows/ci.yml)
[![PyPI version](https://badge.fury.io/py/isagellm-control-plane.svg)](https://badge.fury.io/py/isagellm-control-plane)
[![Python Versions](https://img.shields.io/pypi/pyversions/isagellm-control-plane.svg)](https://pypi.org/project/isagellm-control-plane/)
[![License](https://img.shields.io/badge/License-Proprietary-red.svg)](LICENSE)
[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)

**Intelligent request routing, scheduling, and engine lifecycle management for sageLLM.**

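## Responsibilities

The Control Plane sits between the user/Gateway and the execution engines. It is responsible for:

- Engine registration and health management
- Request routing and scheduling (including prefill/decode (PD) disaggregation scenarios)
- Load balancing and basic scaling hooks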
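
To make the routing responsibility concrete, here is a minimal, self-contained sketch of least-loaded routing over healthy engines. It is an illustration only: `ToyEngine`, `ToyEngineState`, and `pick_least_loaded` are hypothetical names invented for this example and are not part of `sagellm_control`, whose router may well use different criteria.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class ToyEngineState(Enum):
    """Hypothetical health states (not the package's EngineState)."""

    READY = "ready"
    UNHEALTHY = "unhealthy"


@dataclass
class ToyEngine:
    """Hypothetical registry entry for one engine."""

    engine_id: str
    model_id: str
    state: ToyEngineState
    active_requests: int = 0


def pick_least_loaded(engines: list[ToyEngine], model_id: str) -> ToyEngine | None:
    """Return the READY engine serving model_id with the fewest active requests."""
    candidates = [
        e for e in engines
        if e.model_id == model_id and e.state is ToyEngineState.READY
    ]
    if not candidates:
        return None  # nothing healthy can serve this model
    return min(candidates, key=lambda e: e.active_requests)


engines = [
    ToyEngine("engine-001", "Qwen2-7B", ToyEngineState.READY, active_requests=3),
    ToyEngine("engine-002", "Qwen2-7B", ToyEngineState.READY, active_requests=1),
    ToyEngine("engine-003", "Qwen2-7B", ToyEngineState.UNHEALTHY, active_requests=0),
]
chosen = pick_least_loaded(engines, "Qwen2-7B")
print(chosen.engine_id if chosen else None)  # engine-002
```

The unhealthy engine is skipped even though it has the lowest load, which is why health management and routing are listed together.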

## Dependencies

**PyPI package name**: `isagellm-control-plane` | **Import namespace**: `sagellm_control`

Dependencies (pyproject.toml is authoritative):

- isagellm-protocol>=0.5.2.0,<0.6.0
- pydantic>=2.0.0
- httpx>=0.24.0

Optional (for running an engine directly in-process):

- isagellm-core (provides `LLMEngine`)

Used by (examples):

- sagellm (unified entry point/CLI)
- sagellm-gateway (API gateway)

## Installation

```bash
pip install isagellm-control-plane
```

**Requirements**: Python 3.11 (the package metadata declares `Requires-Python: ==3.11.*`)
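
After installing, one way to confirm which versions are present is to query the distribution metadata. The sketch below uses only the standard library; `installed_version` is a hypothetical helper written for this example. Note that it takes the PyPI distribution name (`isagellm-control-plane`), not the import namespace (`sagellm_control`).

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str) -> str | None:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Report the control plane and its protocol dependency.
for dist in ("isagellm-control-plane", "isagellm-protocol"):
    print(f"{dist}: {installed_version(dist) or 'not installed'}")
```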

## Quick Start (CPU-first, runnable)

```python
import asyncio

from sagellm_control import ControlPlaneManager, EngineState, ExecutionInstanceType
from sagellm_protocol import Request


async def main() -> None:
    # Create a Control Plane with FIFO scheduling and least-loaded routing, in local mode.
    cp = ControlPlaneManager(scheduling_policy="fifo", routing_strategy="least_loaded", mode="local")

    # Register an engine, then mark it READY.
    cp.register_engine(
        engine_id="engine-001",
        model_id="Qwen2-7B",
        host="localhost",
        port=8001,
        engine_kind="llm",
        metadata={"instance_type": ExecutionInstanceType.GENERAL.value},
    )
    cp.update_engine_state("engine-001", EngineState.READY)

    # Build a protocol-level request.
    req = Request(
        request_id="req-001",
        trace_id="trace-001",
        model="Qwen2-7B",
        prompt="Hello",
        max_tokens=16,
        stream=False,
    )

    # Ask the Control Plane for a scheduling decision for this request.
    decision = await cp.schedule_request(
        request_id=req.request_id,
        trace_id=req.trace_id,
        model_id=req.model,
        prompt=req.prompt,
        max_tokens=req.max_tokens,
    )
    print(decision)

    # Clean up: remove the engine from the registry.
    cp.unregister_engine("engine-001")


if __name__ == "__main__":
    asyncio.run(main())
```

For a complete demo, see [examples/mvp_integration_demo.py](examples/mvp_integration_demo.py).

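## Using the Scheduler IR Module

The Control Plane includes a built-in Scheduler IR (Intermediate Representation) module, which expresses the request scheduling process as an optimizable graph:

- `IRBuilder`: builds a `SchedulerIR` from requests (Task/Prefill/Decode nodes plus dependency edges)
- `IROptimizer`: runs pluggable optimization passes (such as `KVReusePass` and `ComputeCommOverlapPass`)
- `DefaultIRExecutor`: translates the IR into execution commands and runs them through the Control Plane/Engine Client

These can be imported directly from the root package: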

```python
from sagellm_control import IRBuilder, IROptimizer, DefaultIRExecutor
```
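
To illustrate the idea of a scheduling graph with Task/Prefill/Decode nodes and dependency edges, the following is a toy sketch built on the standard library's `graphlib`. It does not use or mirror the real `SchedulerIR` or `IRBuilder` API: `ToyIR` and `build_request_ir` are hypothetical names, and the node layout (one task, one prefill, one decode per request) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class ToyIR:
    """Hypothetical stand-in for a scheduler IR: node kinds plus dependency edges."""

    kinds: dict[str, str] = field(default_factory=dict)      # node id -> "task" | "prefill" | "decode"
    deps: dict[str, set[str]] = field(default_factory=dict)  # node id -> ids it depends on

    def add_node(self, node_id: str, kind: str, after: tuple[str, ...] = ()) -> None:
        """Add a node that must run after every node listed in `after`."""
        self.kinds[node_id] = kind
        self.deps[node_id] = set(after)

    def execution_order(self) -> list[str]:
        """Return node ids in an order that respects every dependency edge."""
        return list(TopologicalSorter(self.deps).static_order())


def build_request_ir(request_id: str) -> ToyIR:
    """Build a three-node chain for one request: task -> prefill -> decode."""
    ir = ToyIR()
    ir.add_node(f"{request_id}:task", "task")
    ir.add_node(f"{request_id}:prefill", "prefill", after=(f"{request_id}:task",))
    ir.add_node(f"{request_id}:decode", "decode", after=(f"{request_id}:prefill",))
    return ir


ir = build_request_ir("req-001")
print(ir.execution_order())
# ['req-001:task', 'req-001:prefill', 'req-001:decode']
```

Keeping dependencies explicit is what lets an optimizer reorder or merge nodes safely: any rewrite is valid as long as a topological order still exists.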

Example programs (at least three are provided):

- Basic construction: [examples/ir_basic_example.py](examples/ir_basic_example.py)
- Optimization flow: [examples/ir_optimization_example.py](examples/ir_optimization_example.py)
- KV-aware scenario: [examples/ir_kv_aware_example.py](examples/ir_kv_aware_example.py)

Run the examples:

```bash
python examples/ir_basic_example.py
python examples/ir_optimization_example.py
python examples/ir_kv_aware_example.py
```

## API Reference (Core Interfaces)

- `ControlPlaneManager`
  - `register_engine()` / `unregister_engine()` / `list_engines()`
  - `schedule_request()`
  - `execute_request()` / `stream_request()`
  - `get_embeddings()`
- `EngineClient` (calls execution engines over HTTP)
- `LocalEngineClient` (calls an engine directly in-process)
- `Scheduler IR`: `SchedulerIR`, `IRBuilder`, `IROptimizer`, `DefaultIRExecutor`
- `SchedulingPolicy` and built-in policies (`FIFOPolicy`, `PriorityPolicy`, `SLOAwarePolicy`, `AdaptivePolicy`,
  `KVAwareSchedulingPolicy`)
- Key types: `EngineInfo`, `EngineState`, `SchedulingDecision`, `RequestPriority`, `RequestType`
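
As a rough illustration of how FIFO and priority scheduling differ, here is a self-contained toy queue. It is not the package's `FIFOPolicy` or `PriorityPolicy`: `ToyPriorityQueue` is a hypothetical class, and the convention that a lower number means higher priority is an assumption made for this sketch.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class _QueuedRequest:
    sort_key: tuple[int, int]               # (priority, arrival index)
    request_id: str = field(compare=False)  # excluded from ordering


class ToyPriorityQueue:
    """Lower priority value is served first; ties fall back to arrival order (FIFO)."""

    def __init__(self) -> None:
        self._heap: list[_QueuedRequest] = []
        self._arrival = itertools.count()

    def push(self, request_id: str, priority: int = 0) -> None:
        key = (priority, next(self._arrival))
        heapq.heappush(self._heap, _QueuedRequest(key, request_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap).request_id


queue = ToyPriorityQueue()
queue.push("req-a", priority=1)
queue.push("req-b", priority=0)
queue.push("req-c", priority=1)
print([queue.pop() for _ in range(3)])  # ['req-b', 'req-a', 'req-c']
```

If every request is pushed with the same priority, the arrival counter alone decides the order and the queue degenerates to plain FIFO.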

## Architecture Diagram

```mermaid
flowchart LR
    A[Gateway/Client] --> B[ControlPlaneManager]
    B --> C[SchedulingPolicy]
    B --> D[RequestRouter/LoadBalancer]
    B --> E[EngineLifecycleManager]
    D --> F["EngineClient (HTTP)"]
    B --> G[LocalEngineClient]
    F --> H[Execution Engines]
    G --> H
```

## Code Structure

```
sagellm_control/
├── types.py             # Core data types (EngineInfo, SchedulingDecision, etc.)
├── policies/            # Scheduling policies (FIFO, Priority, SLO-aware, Adaptive)
├── router.py            # Request routing and load balancing
├── lifecycle.py         # Engine lifecycle management
├── scaling.py           # Scaling manager (MVP hooks)
├── engine_client.py     # HTTP client to engines
├── local_engine_client.py # Local (in-process) engine client
├── ir/                  # Scheduler IR (types/builder/optimizer/executor)
└── manager.py           # ControlPlaneManager
```

## Development Guide

```bash
git clone git@github.com:intellistream/sagellm-control-plane.git
cd sagellm-control-plane
./quickstart.sh

# Or install manually
pip install -e ".[dev]"
```

Run the tests:

```bash
pytest tests/ -v
```

Lint/format:

```bash
ruff format .
ruff check . --fix
```

Contribution workflow:

- Create an issue
- Develop on a `fix/#123-xxx` branch
- Open a PR against `main-dev`

## Version Information

- Current version: 0.5.2.2
- Changelog: [CHANGELOG.md](CHANGELOG.md)

### Related Repositories

- [sagellm](https://github.com/intellistream/sagellm) - Umbrella package + CLI
- [sagellm-protocol](https://github.com/intellistream/sagellm-protocol) - Protocol definitions
- [sagellm-backend](https://github.com/intellistream/sagellm-backend) - Backend abstraction
- [sagellm-gateway](https://github.com/intellistream/sagellm-gateway) - API gateway

______________________________________________________________________

## License

Proprietary - IntelliStream
