Metadata-Version: 2.4
Name: adrg
Version: 1.1.2
Summary: Aldertech Dynamic Resource Governor. Monitors host pressure and enforces cgroups v2 limits to Docker containers.
Author-email: Aldertech <hello@aldertech.uk>
Project-URL: Homepage, https://github.com/jaldertech/adrg
Project-URL: Bug Tracker, https://github.com/jaldertech/adrg/issues
Project-URL: Logo, https://raw.githubusercontent.com/jaldertech/adrg/main/assets/logo.png
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Environment :: Console
Classifier: Intended Audience :: System Administrators
Classifier: Topic :: System :: Systems Administration
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENCE
Requires-Dist: docker>=7.0.0
Requires-Dist: requests>=2.31.0
Requires-Dist: PyYAML>=6.0.1
Dynamic: license-file

<p align="center">
  <img src="https://raw.githubusercontent.com/jaldertech/adrg/main/assets/logo.png" alt="Aldertech Logo" width="200"/>
</p>

# ADRG — Aldertech Dynamic Resource Governor

> **"Why is Jellyfin stuttering every time my downloads start?"**
>
> ADRG fixes that — automatically, at the kernel level.

ADRG is a lightweight daemon that watches your home server's real-time resource pressure (CPU, memory, I/O, and temperature) and dynamically throttles background containers so your interactive services always feel snappy.

No more Jellyfin stuttering because Tdarr decided to transcode something. No more Home Assistant lag because Kopia kicked off a backup. ADRG handles it silently, in the background, every five seconds.

---

## The Problem

A home server running 20–50 Docker containers on a Raspberry Pi 5 or an Intel N100 is constantly fighting itself. When a background task like a transcoder, unpacker, or backup agent wakes up, it competes for the same CPU, disk I/O, and RAM as your media server and dashboards.

ADRG is specifically designed for enthusiasts managing **large personal media collections and legal backups** (e.g., digitized DVD/Blu-ray libraries). In these environments, high-bitrate video playback is the priority, but background maintenance tasks like library indexing or off-site backups can cause unacceptable stuttering.

The Linux kernel has no way of knowing that your 4K local media stream matters more than a background file compression task. It gives them equal weight — and your film buffers.

---

## The Solution

ADRG assigns your containers to **resource tiers** and enforces priorities using the Linux kernel's own cgroup v2 interface — the same mechanism Docker uses internally. No external agents, no wrappers, no polling overhead.

When ADRG detects a trigger (media playback, high temperature, memory pressure, or an I/O storm), it applies the appropriate constraints:

- **Tier 3 containers** (bulk tasks) are **paused** — completely frozen until the pressure clears.
- **Tier 2 containers** (background tasks) are **throttled** — CPU and I/O capped to a small fraction.
- **Tier 0/1 containers** are **untouched** — maximum priority, always.

When the trigger clears, everything is restored automatically.

---

## Features

- **Four enforcer rules:** Media Mode, Thermal Protection, Memory Pressure, I/O Saturation
- **PSI-based decisions:** Uses Linux Pressure Stall Information — far more accurate than load average
- **Media providers:** Jellyfin, Plex, or webhook (for any other source)
- **Download client throttle:** Automatically caps download speeds during media playback to preserve bandwidth for 4K streams.
- **Enthusiast-focused:** Designed for users with large personal, legally-owned media libraries (digitized DVDs/Blu-rays) who want a seamless, high-performance home server experience.
- **Glob pattern matching:** `tdarr*` in a tier matches `tdarr`, `tdarr_node`, `tdarr_node2`, etc.
- **Notifications:** Discord, NTFY, and Gotify — all optional, all configurable
- **HTTP API:** `/status` for live state, `/trigger` for external control (n8n, Home Assistant, scripts)
- **Dry-run mode:** Observe every decision without applying a single change
- **SIGHUP config reload:** Update `config.yaml` and reload without restarting the daemon
- **Designed for constrained hardware:** Pi 5, N100, and similar low-power multi-container hosts

---

## Requirements

| Requirement | Minimum | Notes |
|---|---|---|
| CPU | 4-core | ARM64 or x86_64 |
| RAM | 4GB | 8GB+ recommended for high-density stacks |
| Linux kernel | 4.20+ | For PSI support. 5.10+ recommended. |
| cgroup v2 | Required | Verify: `ls /sys/fs/cgroup/cgroup.controllers` |
| Docker | v20.10+ | Uses the Docker SDK via `/var/run/docker.sock` |
| Python | 3.9+ | |
| Root access | Required | cgroup writes require root or `CAP_SYS_ADMIN` |
| systemd-python | Optional* | *Required for host installs. See note below. |

> **Note on systemd-python:** With `Type=notify` and `WatchdogSec=30` in the service unit, `systemd-python` is **required** for a host (systemd) install. Without it, systemd will kill the daemon 30 seconds after launch due to missed watchdog pings. It is optional only for the Docker install path (where the service unit is not used).

### Enabling cgroup v2

Most modern distros (Debian Bookworm, Ubuntu 22.04+, Raspberry Pi OS Bookworm) ship with cgroup v2 enabled by default.

If it is not enabled, add the following to your kernel command line and reboot:

**Raspberry Pi** — edit `/boot/firmware/cmdline.txt`:
```
cgroup_no_v1=all
```

**Other systemd systems** — add to kernel parameters:
```
systemd.unified_cgroup_hierarchy=1
```

---

## Quick Start

### Option 1: Install via PyPI

You can install ADRG directly from PyPI.

```bash
# 1. Install the package
sudo pip3 install adrg --break-system-packages

# 2. Download the default config file
sudo mkdir -p /etc/adrg
sudo wget https://raw.githubusercontent.com/jaldertech/adrg/main/config.yaml -O /etc/adrg/config.yaml
sudo touch /etc/adrg/adrg.env

# 3. Edit your config — assign containers to tiers
sudo nano /etc/adrg/config.yaml

# 4. Add your API keys
sudo nano /etc/adrg/adrg.env

# 5. Run the daemon
sudo adrg
```
*(Note: To run as a background service, use the script in Option 2, or copy the `adrg.service` file from this repo).*

### Option 2: Install via Git Clone (Includes systemd service)

```bash
# 1. Clone the repository
git clone https://github.com/jaldertech/adrg.git
cd adrg

# 2. Run the installer (requires root)
sudo bash setup.sh

# 3. Edit your config — assign containers to tiers
sudo nano /etc/adrg/config.yaml

# 4. Add your API keys
sudo nano /etc/adrg/adrg.env

# 5. Restart the daemon
sudo systemctl restart adrg

# 6. Watch it work
journalctl -u adrg -f
```

---

## Configuration

The full annotated template is installed at `/etc/adrg/config.yaml`. The key sections are:

> **Important:** All container names shown below (e.g., `pihole`, `jellyfin`, `tdarr*`) are **examples based on successfully tested configurations**. ADRG is generic and will work with any Docker container on your system once you add its name or a matching glob pattern to a tier.

### Tiers

Assign your containers to tiers. Names support glob patterns (`tdarr*`).

```yaml
tiers:
  0:
    name: "Core Infra"           # Never touched. Ever.
    containers: ["pihole", "nginx-proxy-manager", "homeassistant"]
    cpu_weight: 1000
    io_weight: 1000

  1:
    name: "Interactive"          # High priority. Protected during pressure.
    containers: ["jellyfin", "komga", "homepage"]
    cpu_weight: 800
    io_weight: 800
    memory_high: "3G"
    memory_max: "4G"

  2:
    name: "Background"           # Throttled during media playback.
    containers: ["sonarr", "radarr", "prowlarr", "qbittorrent"]
    cpu_weight: 100
    io_weight: 100
    memory_high: "1.5G"
    memory_max: "2G"

  3:
    name: "Bulk"                 # Paused during media playback. Restarted during memory emergencies.
    containers: ["tdarr*", "unpackerr", "kopia"]
    cpu_weight: 10
    io_weight: 10
    memory_high: "2G"
    memory_max: "3G"
```

> **Note:** Tier 0 containers are **always implicitly protected** from any pause or restart action — you do not need to add them to `protected_containers` as well.

### Protected Containers

An explicit list of containers that ADRG will never pause, throttle, or restart, regardless of any pressure rule. Useful for containers in Tier 2/3 that you still want to safeguard (e.g. a database that a background worker writes to).

```yaml
protected_containers:
  - postgres
  - redis
```

### Media Mode

When active streams are detected, Tier 3 is paused and Tier 2 is throttled.

```yaml
media_mode:
  enabled: true
  provider: jellyfin        # jellyfin | plex | webhook | none
  url: "http://jellyfin:8096"
  api_key: "${ADRG_MEDIA_API_KEY}"
  tier2_cpu_max_percent: 20
  tier2_io_max_read_mb_per_sec: 10
  tier2_io_max_write_mb_per_sec: 5
  cooldown_seconds: 60

  download_throttle:        # Optional: cap download speed during playback
    enabled: true
    provider: qbittorrent
    url: "http://qbittorrent:8080"
    username: "admin"
    password: "${ADRG_QB_PASSWORD}"
    limit_mb_per_sec: 5
```

For `provider: webhook`, stream state is controlled externally via `POST /trigger` — useful for Plex users, Emby, or any custom trigger.

### Secrets

Store API keys in `/etc/adrg/adrg.env` (created by `setup.sh`):

```bash
ADRG_MEDIA_API_KEY=your_jellyfin_api_key
ADRG_DISCORD_WEBHOOK_URL=https://discord.com/api/webhooks/...
ADRG_NTFY_URL=https://ntfy.sh/your-topic
ADRG_QB_PASSWORD=your_qbittorrent_password
```

These are referenced in `config.yaml` as `${ADRG_MEDIA_API_KEY}` etc. and expanded at startup.

---

## The Four Rules

Rules are evaluated every 5 seconds (configurable) in priority order.

### 1. Thermal Protection

Reads the maximum temperature across all thermal zones. Two escalating stages with hysteresis on recovery.

| Threshold | Action |
|---|---|
| `warn_temp_c` (default 70°C) | Log warning only |
| `stage1_temp_c` (default 75°C) | Pause all Tier 3 containers |
| `stage2_temp_c` (default 80°C) | Pause Tier 2 + 3 containers |
| Below `recovery_temp_c` for `recovery_hold_seconds` | Restore all containers |

### 2. Memory Pressure

Uses Linux PSI (`/proc/pressure/memory`) for accurate, kernel-reported pressure rather than crude RSS polling.

| Trigger | Action |
|---|---|
| `some_avg10` > 50% | Squeeze `memory.high` on Tier 3 containers (Stage 1) |
| `some_avg60` > 40% | Restart highest-RSS Tier 3 container (Stage 2) |
| `full_avg10` > 25% | Emergency restart — escalates to Tier 2 if Tier 3 exhausted (Stage 3) |

### 3. I/O Pressure

Uses Linux PSI (`/proc/pressure/io`). When I/O is saturated, applies hard bandwidth caps to Tier 3 via `io.max`.

### 4. Media Mode

Polls your media server (or listens for a webhook trigger) and enforces playback-priority throttling on demand.

---

## CLI Reference

```bash
# Run the daemon
python3 adrg.py --config /etc/adrg/config.yaml

# Observe all decisions without applying any changes
python3 adrg.py --dry-run

# Validate your config and check which containers are running
python3 adrg.py --check-config

# Remove all active overrides and exit (called automatically by systemd on stop)
python3 adrg.py --cleanup

# Reload config without restarting the daemon
kill -HUP $(pidof adrg)
```

SIGHUP reloads tiers, media provider, and notifications. Changes to `download_throttle` or `http_server` require a daemon restart (`systemctl restart adrg`).

---

## HTTP API

ADRG exposes a lightweight HTTP server on `127.0.0.1:8765` by default (configurable).

### `GET /status`

Returns current governor state as JSON.

```bash
curl http://127.0.0.1:8765/status
```

```json
{
  "version": "1.0.0",
  "uptime_seconds": 3600.0,
  "dry_run": false,
  "media_mode_active": true,
  "media_provider": "jellyfin",
  "thermal_stage": 0,
  "memory_throttled": false,
  "io_throttled": false,
  "protected_containers": ["pihole"],
  "containers": {
    "tdarr": { "tier": 3, "paused_by": ["media_mode"], "cpu_max_by": [], "io_max_by": [] }
  }
}
```

### `POST /trigger`

Push an external event. Useful for Home Assistant automations, n8n workflows, or custom scripts.

```bash
# Force media mode on (e.g. from a Home Assistant automation when you sit down to watch something)
curl -X POST http://127.0.0.1:8765/trigger \
  -H "Content-Type: application/json" \
  -d '{"event": "media_start"}'

# Clear it when you're done
curl -X POST http://127.0.0.1:8765/trigger \
  -d '{"event": "media_stop"}'
```

| Event | Effect |
|---|---|
| `media_start` | Force media mode on (overrides provider polling) |
| `media_stop` | Clear the media mode override |
| `tier3_pause` | Manually pause all Tier 3 containers |
| `tier3_resume` | Resume all Tier 3 containers |

---

## Notifications

All backends are optional and independent. Configure any combination.

| Backend | Config key | Notes |
|---|---|---|
| Discord | `discord_webhook_url` | Standard Discord webhook URL |
| NTFY | `ntfy_url` + `ntfy_token` | Full topic URL, e.g. `https://ntfy.sh/my-topic` |
| Gotify | `gotify_url` + `gotify_token` | Base URL + application token |

---

## Security

### Host Install (systemd)

ADRG runs as root. It requires root to write cgroup control files — there is no privilege-separated alternative for this on Linux. The attack surface is limited to `config.yaml` and `adrg.env`. Both files are installed with `640` permissions (root-readable only) by `setup.sh`.

### Docker Install

```
--privileged
```

Running ADRG as a Docker container requires `--privileged`. This grants the container the same capabilities as a root process on the host — specifically, the ability to write to `/sys/fs/cgroup/`. This is a property of the Linux cgroup interface, not a flaw in ADRG.

**Recommendation:** Read the source before running any privileged container. ADRG is fully open source for this reason. The only files it writes to are cgroup control files under `/sys/fs/cgroup/` and its own log and state files.

### Docker Socket

ADRG mounts `/var/run/docker.sock` to pause, unpause, and restart containers. Anyone with access to the Docker socket has effective root on the host — this is a standard, well-understood trade-off for any Docker management tool.

---

## Verified Hardware

The following configuration has been verified to run ADRG in production:

| Hardware | OS | Status |
| :--- | :--- | :--- |
| **Raspberry Pi 5 (16GB RAM)** | Raspberry Pi OS Bookworm (64-bit) | ✅ Verified with 48 containers running simultaneously |

## Architecture & Compatibility
ADRG is built on generic Linux kernel interfaces (cgroup v2 and PSI). While developed and battle-tested on the Raspberry Pi 5, the codebase is architecturally agnostic and is designed to work on any 64-bit Linux system (x86_64 or ARM64) meeting the [Requirements](#requirements).

ADRG should work on any Linux system meeting the requirements. If you get it running on other hardware, feel free to open a PR to add it to this table.

---

## Licence

MIT — see `LICENCE` file.

---

*Built by [Aldertech](https://aldertech.uk)*
