Metadata-Version: 2.4
Name: krabby-bench
Version: 0.1.0
Summary: Bench watchdog: polls ECR, runs smoke tests, and alerts on failure
Author-email: Your Name <your.email@example.com>
License-Expression: Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: boto3>=1.26.0
Requires-Dist: requests>=2.31.0
Requires-Dist: tomli>=2.0.0; python_version < "3.11"

# krabby-bench

Bench watchdog for the Krabby locomotion stack. Polls ECR for new `mainline-latest` digests, runs a firmware smoke test when one appears, and alerts on failure.

## Install

```bash
pip install krabby-bench
```

## Config

Default config path: `/etc/krabby-bench/config.toml`

```toml
[ecr]
repo = "632914961627.dkr.ecr.us-east-1.amazonaws.com/krabby-locomotion"
tag = "mainline-latest"
region = "us-east-1"
poll_interval = 60          # seconds

[smoke]
firmware_channel = "mainline"
run_hal_check = false       # set true to also start/stop container and check telemetry

[alert]
mode = "email"              # "email" | "github" | "both"
dedup_window = 3600         # suppress repeat alerts for the same failure within this window (seconds)

[smtp]
host = "smtp.example.com"
port = 587
user = "user@example.com"
password = "secret"
from = "krabby-bench@example.com"
to = "oncall@example.com"

[github]
repo = "owner/krabby-research"
token = "ghp_..."
```

## Smoke test

For each new digest the watchdog:

1. Runs `krabby firmware update <channel>` (flashes all three boards).
2. Runs `krabby firmware show` and parses the version strings.
3. Asserts all three boards report the same version.
4. Fetches `https://krabby-firmware-public.s3.amazonaws.com/<channel>/latest.json` and checks the reported version matches the S3 manifest.

## systemd

Copy the unit and timer files, then enable:

```bash
sudo cp bench/systemd/krabby-bench.service /etc/systemd/system/
sudo cp bench/systemd/krabby-bench.timer   /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now krabby-bench
```

Monitor:

```bash
journalctl -fu krabby-bench
```

## Force a failure (test alert path)

Unplug one Mega, then push a dummy commit to `mainline` to trigger a new ECR digest. Within one poll cycle the watchdog should detect the new digest, run the smoke test, and fire an alert.

To reset: replug the board, let the next poll cycle pass, confirm no new alert.

## State file

`/var/lib/krabby-bench/state.json` — persists the last-tested digest and last-alert metadata. Delete it to force a re-test on the next poll.
