Metadata-Version: 2.4
Name: aquila
Version: 0.3.0
Summary: Deploy, manage, and monitor vLLM instances across a GPU cluster from a single web dashboard.
Author-email: Marc Schlichting <mschl@stanford.edu>
License: MIT
Project-URL: Homepage, https://github.com/sisl/aquila
Keywords: vllm,llm,gpu,cluster,inference,deployment
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown

# Aquila

GPU inference cluster manager. Deploy, manage, and monitor vLLM instances across a GPU cluster from a single web dashboard.

## Quick Start

```bash
pip install aquila
```

**On each GPU node (satellite):**

```bash
aquila client install
aquila client start
```

**On the management server (host):**

```bash
aquila host install
aquila host start
```

Then open the dashboard at `http://<host-ip>:5173` to deploy and monitor models across your cluster.

## Features

- Web dashboard for deploying and managing vLLM models across multiple GPU nodes
- Runs each model in the official `vllm/vllm-openai` Docker container — no per-node CUDA/PyTorch setup; the version maps to an image tag
- Live GPU utilization metrics and deployment status monitoring
- Consul-based automatic node discovery
- Support for custom pip packages (via cached derived images) and vLLM plugins
