Metadata-Version: 2.4
Name: isagellm-control-plane
Version: 0.1.0
Summary: sageLLM Control Plane - Intelligent request routing, scheduling, and engine lifecycle management
Author: IntelliStream Team
License: Proprietary - IntelliStream
Project-URL: Homepage, https://github.com/IntelliStream/sagellm-control-plane
Project-URL: Repository, https://github.com/IntelliStream/sagellm-control-plane
Keywords: llm,inference,control-plane,scheduling,routing,autoscaling
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: isagellm-protocol<0.2.0,>=0.1.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: httpx>=0.24.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: ruff>=0.8.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Provides-Extra: gpu
Requires-Dist: pynvml>=11.0.0; extra == "gpu"
Provides-Extra: metrics
Requires-Dist: prometheus-client>=0.17.0; extra == "metrics"
Provides-Extra: all
Requires-Dist: isagellm-control-plane[gpu,metrics]; extra == "all"

# sageLLM Control Plane

**Intelligent request routing, scheduling, and engine lifecycle management for sageLLM.**

## Features

- 🎯 **Scheduling Policies** - FIFO, Priority, SLO-aware, Cost-optimized, Adaptive
- ⚖️ **Load Balancing** - Intelligent request routing across multiple engine instances
- 📈 **Autoscaling** - SLA-based autoscaling for Prefill/Decode instances
- 🔄 **Engine Lifecycle** - Spawn, stop, health check, auto-restart
- 📊 **Observability** - Metrics collection, performance monitoring
- 🧩 **Parallelism** - Strategy optimization across tensor (TP), pipeline (PP), data (DP), and expert (EP) parallelism
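
To make the scheduling and load-balancing terms above concrete, here is a minimal, self-contained sketch of FIFO versus priority ordering and least-loaded routing. It does not use the `sagellm_control` API; `RequestQueue`, `QueuedRequest`, and `pick_least_loaded` are hypothetical names chosen for illustration, and the package's real policies may work differently.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class QueuedRequest:
    # Lower sort_key is served first; request_id is excluded from comparisons.
    sort_key: tuple[int, int]
    request_id: str = field(compare=False)


class RequestQueue:
    """Illustrative queue: orders requests by arrival (FIFO) or by (priority, arrival)."""

    def __init__(self, policy: str = "fifo") -> None:
        if policy not in {"fifo", "priority"}:
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self._heap: list[QueuedRequest] = []
        self._arrival = count()

    def push(self, request_id: str, priority: int = 0) -> None:
        seq = next(self._arrival)
        # FIFO ignores priority; the priority policy breaks ties by arrival order.
        key = (0, seq) if self.policy == "fifo" else (priority, seq)
        heapq.heappush(self._heap, QueuedRequest(key, request_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap).request_id


def pick_least_loaded(active_requests: dict[str, int]) -> str:
    """Return the engine id with the fewest in-flight requests (ties broken by id)."""
    if not active_requests:
        raise ValueError("no engines registered")
    return min(active_requests, key=lambda eid: (active_requests[eid], eid))
```

With the priority policy, a lower number is served first, so pushing `a` (priority 5) and then `b` (priority 1) pops `b` before `a`; with FIFO the arrival order is preserved.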

## Installation

```bash
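pip install isagellm-control-plane

# With GPU monitoring (quotes keep shells such as zsh from expanding the brackets)
pip install "isagellm-control-plane[gpu]"

# With Prometheus metrics
pip install "isagellm-control-plane[metrics]"

# With all optional extras (gpu and metrics)
pip install "isagellm-control-plane[all]"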
```
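
Because GPU monitoring and Prometheus metrics are optional extras, it can help to check at runtime which ones are present. The sketch below uses only the standard library; `EXTRA_MODULES` and `available_extras` are hypothetical helper names, and the mapping assumes the usual import names of the `pynvml` and `prometheus-client` distributions.

```python
from importlib.util import find_spec

# Import names of the optional dependencies behind each extra (assumed mapping).
EXTRA_MODULES = {
    "gpu": "pynvml",
    "metrics": "prometheus_client",
}


def available_extras() -> dict[str, bool]:
    """Report, per extra, whether its dependency can be imported."""
    return {
        extra: find_spec(module) is not None
        for extra, module in EXTRA_MODULES.items()
    }


if __name__ == "__main__":
    for extra, present in available_extras().items():
        print(f"{extra}: {'installed' if present else 'missing'}")
```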

## Quick Start

```python
import asyncio

from sagellm_control import ControlPlaneManager


async def main() -> None:
    # Create a manager in local mode (no GPU required)
    manager = ControlPlaneManager(
        scheduling_policy="adaptive",
        routing_strategy="load_balanced",
        mode="local",  # Use the local async executor
    )

    # Register a mock engine
    manager.register_engine(
        engine_id="engine-001",
        model_id="mock-model",
        host="localhost",
        port=8000,
    )

    # Schedule a request; schedule_request must be awaited inside a coroutine
    decision = await manager.schedule_request(
        request_id="req-001",
        prompt="Hello, world!",
        max_tokens=128,
    )

    print(f"Scheduled to: {decision.instance_id}")


asyncio.run(main())
```

## Architecture

```
sagellm_control/
├── types.py            # Core data types (RequestMetadata, EngineInfo, etc.)
├── strategies/         # Scheduling policies (FIFO, Priority, SLO, etc.)
├── executors/          # Execution coordinators (HTTP, LocalAsync, Mock)
├── router.py           # Request routing and load balancing
├── autoscaler.py       # SLA-based autoscaling
├── parallelism.py      # Parallelism strategy optimization
├── manager.py          # Main ControlPlaneManager
└── engine_lifecycle.py # Engine lifecycle management
```
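
As a rough illustration of what SLA-based autoscaling can look like, the sketch below applies a simple proportional rule: scale the replica count by the ratio of observed to target latency, then clamp it. This is an assumption made for illustration, not the logic in `autoscaler.py`; `desired_replicas` and all of its parameters are hypothetical.

```python
import math


def desired_replicas(
    current: int,
    observed_p95_ms: float,
    target_p95_ms: float,
    min_replicas: int = 1,
    max_replicas: int = 8,
) -> int:
    """Proportional scaling: grow or shrink replicas with the latency ratio."""
    if current < 1 or target_p95_ms <= 0:
        raise ValueError("current must be >= 1 and target_p95_ms must be > 0")
    raw = math.ceil(current * observed_p95_ms / target_p95_ms)
    # Clamp so one noisy measurement cannot scale to zero or without bound.
    return max(min_replicas, min(max_replicas, raw))
```

For example, two replicas observing a 300 ms p95 against a 200 ms target would scale to three, while four replicas at 100 ms against the same target would shrink to two.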

## Mock-First Development

All modules support a mock mode, so tests can run without a GPU:

```python
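from sagellm_control.executors import MockExecutionCoordinator


async def run_mock(request):
    # `request` is a request object built by the caller
    # Use the mock executor in CI/CD, where no GPU is available
    executor = MockExecutionCoordinator()
    return await executor.execute(request)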
```
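
The mock-first pattern can be sketched without the package installed. In the self-contained example below, application code depends only on an executor interface, so a fake can be injected in tests. `Executor`, `FakeExecutor`, `handle`, and the dict-shaped request are hypothetical stand-ins, not the actual `sagellm_control` types.

```python
import asyncio
from typing import Protocol


class Executor(Protocol):
    """Anything with an async execute() method satisfies this interface."""

    async def execute(self, request: dict) -> dict: ...


class FakeExecutor:
    """Hypothetical stand-in that returns a canned result without a GPU."""

    async def execute(self, request: dict) -> dict:
        await asyncio.sleep(0)  # yield to the event loop, as a real backend call would
        return {"request_id": request["request_id"], "text": "mock output"}


async def handle(executor: Executor, request: dict) -> str:
    # Depends only on the Executor protocol, so tests can pass in a fake.
    result = await executor.execute(request)
    return result["text"]


def run_demo() -> str:
    request = {"request_id": "req-001", "prompt": "Hello, world!"}
    return asyncio.run(handle(FakeExecutor(), request))
```

In a test suite, `handle` could be exercised the same way under `pytest-asyncio`, which the `dev` extra already pulls in.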

## License

Proprietary - IntelliStream
