Metadata-Version: 2.4
Name: spark-pulse
Version: 1.2.0
Summary: Web UI for spark-vllm-docker — recipe management, deployment, and monitoring for DGX Spark
Maintainer-email: Alexander Kharkevich <alex@kharkevich.org>
License-Expression: MIT
Project-URL: homepage, https://github.com/kharkevich-engineering-lab/spark-pulse
Project-URL: repository, https://github.com/kharkevich-engineering-lab/spark-pulse
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi>=0.115.0
Requires-Dist: uvicorn[standard]>=0.34.0
Requires-Dist: sse-starlette>=2.2.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: pydantic>=2.0
Requires-Dist: click>=8.0
Requires-Dist: authlib>=1.3.0
Requires-Dist: httpx>=0.28.0
Provides-Extra: dev
Requires-Dist: black>=24.8.0; extra == "dev"
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: httpx>=0.28; extra == "dev"
Provides-Extra: mcp
Requires-Dist: mcp>=1.0.0; extra == "mcp"
Dynamic: license-file

# Spark Pulse

Web UI for [spark-vllm-docker](https://github.com/eugr/spark-vllm-docker) — recipe management, deployment, and monitoring for NVIDIA DGX Spark hardware.

**License:** [MIT](LICENSE) — Copyright © 2026 Kharkevich Engineering Lab

> **Disclaimer:** This project is not sponsored by, endorsed by, or affiliated
> with NVIDIA Corporation or any of its subsidiaries. NVIDIA, DGX, and other
> NVIDIA trademarks are the property of their respective owners.

## Features

- **Recipe Browser** — Browse available vLLM deployment recipes, view model details, configs, and mods
- **Job Management** — Launch deployments with custom parameters, monitor status, view live logs via SSE, stop jobs
- **Memory Monitor** — Real-time GPU memory, CPU RAM, and disk usage via SSE streaming
- **Cache Manager** — Browse and clean HF model cache, wheels, ccache, Triton cache, and more
- **Settings** — Configure spark-vllm-docker path, default container, GPU memory utilization
- **OIDC Authentication** — Configurable authentication (Keycloak, Auth0, Google, Azure AD, etc.)
- **Dark/Light Theme** — Switch between dark, light, or system theme (persisted in localStorage)
- **Custom Modals** — Beautiful confirmation dialogs and error alerts
- **MCP Server** — Expose spark-vllm-docker operations as Model Context Protocol tools for AI assistants

## Architecture

```
┌─────────────┐     HTTP / SSE     ┌──────────────────────┐
│   Browser    │ ◄───────────────► │  FastAPI + Uvicorn   │
│  React SPA   │                    │  Port 8100           │
└─────────────┘                    │                      │
                                   │  spark-vllm-docker/  │
                                   │  Docker, nvidia-smi  │
                                   └──────────────────────┘
```
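The SSE leg of the diagram uses the standard `text/event-stream` framing: `field: value` lines, with a blank line ending each event. As a rough illustration of what a non-browser client would have to do, here is a minimal parser sketch. The event name and payload fields in the sample are invented for illustration and are not taken from the Spark Pulse API.

```python
import json
from typing import Dict, Iterable, Iterator, List


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per server-sent event from an iterable of text lines."""
    event_type = "message"  # default event type in the SSE format
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield {"event": event_type, "data": "\n".join(data_lines)}
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips a single leading space
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)


# Hypothetical stream content; real event names and fields may differ.
sample = [
    ": keep-alive\n",
    "event: memory\n",
    'data: {"gpu_used_gb": 42.5}\n',
    "\n",
]
events = list(iter_sse_events(sample))
payload = json.loads(events[0]["data"])
print(events[0]["event"], payload["gpu_used_gb"])
```

In the browser the built-in `EventSource` API handles this framing, so a parser like this only matters for scripts or tests that read the stream directly.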

## Project Structure

```
spark-pulse/
├── pyproject.toml           # Python package config (setuptools)
├── README.md
├── LICENSE
│
├── spark_pulse/             # Python package
│   ├── __init__.py          # Package init + version
│   ├── app.py               # FastAPI app factory (creates app, mounts SPA)
│   ├── cli.py               # CLI: spark-pulse start|install|mcp|...
│   ├── config.py            # Config loader (YAML + env vars)
│   ├── service.py           # Systemd service management
│   ├── auth.py              # OIDC authentication middleware + routes
│   ├── mcp_server.py        # MCP (Model Context Protocol) server
│   ├── config.yaml          # Default configuration
│   ├── routers/             # API route modules
│   │   ├── recipes.py           # Recipe listing + detail
│   │   ├── deployments.py       # Deployment CRUD + launch/stop
│   │   ├── memory.py            # GPU/CPU/disk stats
│   │   ├── cache.py             # Cache scanning + cleanup
│   │   └── settings.py          # Settings CRUD
│   ├── tools/               # Tool abstractions (real/mock swap)
│   │   ├── __init__.py          # Factory: real vs mock
│   │   ├── system.py            # nvidia-smi, free, df
│   │   ├── cache.py             # cache dir scanning
│   │   └── recipes.py           # YAML recipe parsing
│   ├── mock/                # Simulation mode implementations
│   │   ├── system.py              # DGX Spark 128GB mock
│   │   ├── cache.py               # Plausible cache sizes
│   │   ├── recipes.py             # 6 realistic recipes
│   │   └── deployments.py         # In-memory tracker
│   ├── ui/                  # Built frontend (generated by `npm run build`)
│   │   └── index.html       # Vite build output
│   └── data/                # Persistent state (deployments.json)
│
├── web/                     # Frontend (separate npm project)
│   ├── package.json         # Vite + React + TypeScript + Tailwind
│   ├── vite.config.ts       # Builds to ../spark_pulse/ui/
│   ├── tsconfig.json
│   └── src/
│       ├── main.tsx             # Entry point
│       ├── App.tsx              # Router + AuthProvider
│       ├── lib/
│       │   ├── types.ts         # TypeScript types
│       │   ├── api.ts           # API client + SSE hooks
│       │   ├── utils.ts         # Formatting helpers
│       │   ├── theme.ts         # Theme state manager
│       │   └── auth.ts          # Auth context
│       ├── hooks/
│       │   └── useQuery.ts      # Data fetching hook
│       ├── components/
│       │   ├── Layout.tsx       # Sidebar nav + theme toggle
│       │   ├── StatusBadge.tsx  # Status indicators
│       │   └── Modal.tsx        # Confirm + Alert modals
│       └── pages/
│           ├── RecipesPage.tsx
│           ├── JobsPage.tsx
│           ├── MemoryPage.tsx
│           ├── CachePage.tsx
│           └── SettingsPage.tsx
│
└── scripts/
    ├── run-dev-server.sh      # Full dev (venv + frontend + backend)
    ├── run-backend.sh         # Backend only (simulation)
    ├── run-production.sh      # Production mode
    └── build-ui.sh            # Build frontend only
```

## Installation

### As a Python package

#### From PyPI

```bash
# Install from PyPI
python3 -m pip install spark-pulse

# Optional MCP extras
python3 -m pip install "spark-pulse[mcp]"
```

If you install into the user site (`python3 -m pip install --user spark-pulse`),
make sure your local bin directory is on `PATH`:

```bash
export PATH="$HOME/.local/bin:$PATH"
```

#### From local source

```bash
# Install from local repo
pip install -e .

# Or with MCP support
pip install -e ".[mcp]"
```

This provides the `spark-pulse` CLI command.

### First-time setup

```bash
# Install Python dependencies (from repo root)
pip install -e ".[dev]"

# Build frontend
cd web && npm install && npm run build && cd ..

# Configure (edit spark_pulse/config.yaml or use Settings page)
# Default spark_vllm_path: /tmp/spark-vllm-docker
```

## Usage

### Quick start (development)

```bash
# One command — venv + frontend deps + backend (sim) + frontend dev server
./scripts/run-dev-server.sh
# → Frontend: http://localhost:3000
# → Backend:  http://localhost:8100 (+ /docs Swagger UI)
```

### Simulation mode (local dev without DGX)

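Set `SIMULATION_MODE=1` to serve mock data from every API. This is useful for
frontend development on macOS or any machine without a DGX Spark. The
development scripts below run the backend in simulation mode, while
`run-production.sh` uses the real tools: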

```bash
# Full dev (includes frontend dev server)
./scripts/run-dev-server.sh

# Backend only (simulation)
./scripts/run-backend.sh
# → http://localhost:8100

# Production mode (real tools, needs spark-vllm-docker + nvidia-smi)
./scripts/run-production.sh

# Just build frontend into spark_pulse/ui/
./scripts/build-ui.sh
```
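The project layout describes `spark_pulse/tools/__init__.py` as a factory that swaps real and mock implementations. The sketch below shows one way such a switch could work. It is not the package's actual code: the class names, the method name, and the set of accepted truthy values are all assumptions.

```python
import os


class RealSystemTools:
    """Placeholder for an implementation that would call nvidia-smi, free, df."""

    def gpu_memory_total_gb(self) -> float:
        raise NotImplementedError("requires real hardware tooling")


class MockSystemTools:
    """Returns fixed values; 128 GB mirrors the mocked DGX Spark."""

    def gpu_memory_total_gb(self) -> float:
        return 128.0


def is_simulation(env=None) -> bool:
    env = os.environ if env is None else env
    # Assumption: only "1" is documented; "true" and "yes" are accepted
    # here purely for convenience.
    return env.get("SIMULATION_MODE", "").lower() in {"1", "true", "yes"}


def get_system_tools(env=None):
    """Pick the mock implementation when simulation mode is on."""
    return MockSystemTools() if is_simulation(env) else RealSystemTools()


tools = get_system_tools({"SIMULATION_MODE": "1"})
print(type(tools).__name__, tools.gpu_memory_total_gb())
```

Passing the environment as a plain mapping keeps the switch easy to exercise in tests without touching `os.environ`.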

### Systemd service (production)

After installing from PyPI or source, register the service in one of these modes.

#### System scope (root-managed)

```bash
# Install and start system unit (requires sudo)
sudo spark-pulse install

# Start/stop/status
sudo spark-pulse start-service
sudo spark-pulse stop-service
spark-pulse status

# Unregister unit
sudo spark-pulse uninstall
```

#### User scope (no sudo)

```bash
# Install and start user unit
spark-pulse install --user

# Start/stop/status for user unit
spark-pulse start-service --user
spark-pulse stop-service --user
spark-pulse status --user

# Unregister user unit
spark-pulse uninstall --user
```

User-mode unit files are written under `~/.config/systemd/user/` and env files under
`~/.config/spark-pulse/`. System-mode files are written under `/etc/systemd/system/`
and `/etc/`.

To keep user services running after logout, enable lingering:

```bash
loginctl enable-linger "$USER"
```

### MCP Server (AI Assistants)

Expose spark-vllm-docker operations as MCP tools:

```bash
# Start MCP server (stdio mode for AI assistants)
spark-pulse mcp
```

Exposed tools:
- `list_recipes` — list available deployment recipes
- `get_recipe` — get recipe details by name
- `create_deployment` — launch a recipe deployment
- `stop_deployment` — stop a running deployment
- `list_deployments` — list all deployments with status
- `get_memory` — get GPU/CPU/disk stats
- `list_cache` — list cache directories with sizes
- `clean_cache` — clean specified cache directories
- `get_logs` — get deployment logs
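
MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests, sent as newline-delimited JSON over stdio. The sketch below only builds such a request for illustration. A real session also needs the protocol's `initialize` handshake first, which an MCP client library normally handles. The helper name `tool_call` is hypothetical, and tool argument names are not documented here, so the example uses a tool called with no arguments.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests need unique ids


def tool_call(name: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP `tools/call` request as a single JSON line."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(request)


print(tool_call("list_recipes"))
```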

### OIDC Authentication

Configure authentication in `spark_pulse/config.yaml`:

```yaml
auth_enabled: true
oidc_provider_url: https://keycloak.example.com/realms/myrealm
oidc_client_id: spark-manager
oidc_client_secret: your-secret-here
```

Or enable it via an environment variable:

```bash
export SPARK_PULSE_AUTH_ENABLED=true
```

When enabled:
- All routes are protected except `/health` and `/auth/*`
- `/auth/login` redirects users to the OIDC provider
- After successful authentication, users see their name and a logout button in the header
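
The route exemption rule above can be expressed as a small path check. This is a sketch of the rule as described, not the middleware in `auth.py`; the function and constant names are hypothetical.

```python
# Paths that stay reachable without a session, per the rule above.
PUBLIC_EXACT = {"/health"}
PUBLIC_PREFIXES = ("/auth/",)


def is_public_path(path: str) -> bool:
    """Return True if the path should bypass authentication."""
    return path in PUBLIC_EXACT or path.startswith(PUBLIC_PREFIXES)


for path in ("/health", "/auth/login", "/api/recipes", "/"):
    print(path, is_public_path(path))
```

Matching on the `/auth/` prefix with its trailing slash avoids accidentally exempting unrelated paths such as `/authors`.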

### Environment variables

Override config via environment:

```bash
export SPARK_VLLM_PATH=/path/to/spark-vllm-docker
export WEBUI_PORT=8100
export SPARK_PULSE_AUTH_ENABLED=true
```
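
To illustrate how environment values can take precedence over file defaults, here is a minimal sketch of an override layer. It is not the loader in `config.py`: the `port` key name, the `auth_enabled` default, and the accepted boolean spellings are assumptions, and a plain dict stands in for the parsed YAML.

```python
import os

# Defaults: the path comes from this README; the "port" key name and the
# auth default are assumptions made for this sketch.
DEFAULTS = {
    "spark_vllm_path": "/tmp/spark-vllm-docker",
    "port": 8100,
    "auth_enabled": False,
}


def parse_bool(value: str) -> bool:
    # Assumption: these spellings count as true; everything else is false.
    return value.strip().lower() in {"1", "true", "yes", "on"}


# Environment variable -> (config key, converter)
ENV_OVERRIDES = {
    "SPARK_VLLM_PATH": ("spark_vllm_path", str),
    "WEBUI_PORT": ("port", int),
    "SPARK_PULSE_AUTH_ENABLED": ("auth_enabled", parse_bool),
}


def load_config(env=None) -> dict:
    """Return the defaults with any set environment variables applied on top."""
    env = os.environ if env is None else env
    config = dict(DEFAULTS)
    for var, (key, convert) in ENV_OVERRIDES.items():
        if var in env:
            config[key] = convert(env[var])
    return config


config = load_config({"WEBUI_PORT": "9000", "SPARK_PULSE_AUTH_ENABLED": "true"})
print(config)
```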

## API Reference

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/api/recipes` | List all recipes |
| `GET` | `/api/recipes/{id}` | Recipe details |
| `GET` | `/api/deployments` | List deployments |
| `POST` | `/api/deployments` | Create + launch |
| `DELETE` | `/api/deployments/{id}` | Stop deployment |
| `GET` | `/api/deployments/{id}/logs` | Recent logs |
| `GET` | `/api/memory` | All memory stats |
| `GET` | `/api/memory/gpu` | GPU stats |
| `GET` | `/api/cache` | Cache dirs + sizes |
| `POST` | `/api/cache/clean` | Clean caches |
| `GET` | `/api/settings` | Current settings |
| `PUT` | `/api/settings` | Update settings |
| `GET` | `/auth/login` | Redirect to OIDC login |
| `GET` | `/auth/callback` | OIDC callback |
| `GET` | `/auth/me` | Current user info |
| `GET` | `/` | SPA entry point |
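
For scripted access, the read-only endpoints can be called with nothing more than the standard library. The sketch below assumes a server on the default port with authentication disabled; the helper names are hypothetical, and response schemas are not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8100"  # default port from this README


def api_url(path: str, base: str = BASE_URL) -> str:
    """Join the base URL and an API path without doubling slashes."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def get_json(path: str, base: str = BASE_URL, timeout: float = 5.0):
    """GET an endpoint and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(api_url(path, base), timeout=timeout) as resp:
        return json.load(resp)


print(api_url("/api/recipes"))
```

With a server running, `get_json("/api/recipes")` or `get_json("/api/memory/gpu")` returns the decoded response. Inspect the result before relying on specific field names.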

## Building for distribution

```bash
# Build frontend
cd web && npm run build && cd ..

# Build Python package
python -m build
```

The built frontend is bundled into the wheel via `package-data: ui/**/*`.
