Metadata-Version: 2.4
Name: perf-guard
Version: 0.2.0
Summary: Performance regression monitor for ML inference projects
Author: huxie
License: MIT
Project-URL: Homepage, https://github.com/huxie/perf-guard
Project-URL: Repository, https://github.com/huxie/perf-guard
Project-URL: Issues, https://github.com/huxie/perf-guard/issues
Keywords: benchmark,performance,regression,ml,inference,monitoring
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: System :: Benchmark
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.0
Requires-Dist: pyyaml>=6.0
Dynamic: license-file

# perf-guard

`perf-guard` is a zero-intrusion command-line tool that turns the result
directories of your existing benchmark/eval scripts into a versioned
performance history: record snapshots, compare them against one or many
baselines, and plot long-term trends.

## Highlights

- **Zero intrusion.** Never runs your benchmarks; it only parses their output.
- **Config-driven.** All metric extraction is declared in `perf_guard.yaml`.
- **Multi-baseline compare.** Compare the current run against `baseline`,
  `last_week`, `main`, or any tag in one command.
- **Trend charts.** Render the whole `.perf_history/` as an interactive HTML
  chart, or print ASCII sparklines in the terminal.
- **CI-ready.** Exits with code `1` on regression so pipelines fail
  automatically.

## Install

```bash
pip install perf-guard
```

## Quick start

1. Drop a `perf_guard.yaml` at the project root:

   ```yaml
   results_dir: ./results

   metrics:
     - name: total_eval_time
       file: summary.txt
       pattern: "Eval time:\\s+([\\d.]+)s"
       threshold_pct: 5
       direction: lower_is_better

     - name: pusht_success_rate
       file: pusht/eval.log
       pattern: "success_rate=([\\d.]+)%"
       threshold_pct: 2
       direction: higher_is_better
   ```

2. Record your reference run and tag it:

   ```bash
   perf-guard record results/20260421_054042 --tag baseline
   ```

3. After your next run, compare and detect regressions:

   ```bash
   perf-guard compare                               # latest vs baseline
   perf-guard compare --base baseline --base last_week
   perf-guard compare --base main --current latest
   ```
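Under the hood, each metric entry boils down to "run this regex against that
file and take the first capture group". A minimal sketch of that step, using
the patterns from the YAML above (the helper name is illustrative, not
perf-guard's actual API):

```python
import re

def extract_metric(text: str, pattern: str) -> float:
    """Return the first capture group of `pattern` in `text` as a float.

    Hypothetical helper showing how one `perf_guard.yaml` metric entry
    maps a result file's contents to a single numeric value.
    """
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern {pattern!r} matched nothing")
    return float(match.group(1))

# Same patterns as in the config, with YAML string escaping removed:
summary = "Eval time:   1423.7s\n"
print(extract_metric(summary, r"Eval time:\s+([\d.]+)s"))        # 1423.7

log = "epoch done, success_rate=87.5% over 200 rollouts\n"
print(extract_metric(log, r"success_rate=([\d.]+)%"))            # 87.5
```

Because patterns live in the config rather than in code, pointing perf-guard
at a new benchmark format is a YAML edit, not a plugin.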

## Commands

| Command | Purpose |
|---|---|
| `perf-guard record <dir> [--tag …]` | Extract metrics from a result dir and store a snapshot. |
| `perf-guard compare [--base …] [--current …]` | Compare current against one or more baselines. |
| `perf-guard list` | List all recorded snapshots. |
| `perf-guard report <ref>` | Show full metrics for one snapshot. |
| `perf-guard trend [--ref …] [--metric …] [-o path] [--ascii]` | Plot metric trends across history. |
| `perf-guard install-hook` | Install a git `post-commit` hook. |

### Ref resolution

Anywhere a `<ref>` is accepted, the following are valid:

- `latest` — the most recently recorded snapshot
- A user tag (created with `record --tag`)
- A dirname (e.g. `20260421_054042`)
- `yesterday`, `last_week`, `last_month` — closest snapshot to that time
- `<N>d_ago` or `<N>_days_ago` — closest snapshot to N days ago
- `latest~<N>` — N records before the latest
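The relative forms above can be resolved mechanically; a hedged sketch of the
parsing rules (function name and return convention are illustrative — the
real resolver would then pick the recorded snapshot closest to the returned
time):

```python
import re
from datetime import datetime, timedelta

def parse_relative_ref(ref: str, now: datetime):
    """Map a relative ref to a target datetime, a step count, or None.

    Returns a datetime for time-based refs, an int for `latest~<N>`,
    and None for anything that should be looked up literally
    (a user tag or a dirname).
    """
    named = {"yesterday": 1, "last_week": 7, "last_month": 30}
    if ref in named:
        return now - timedelta(days=named[ref])
    m = re.fullmatch(r"(\d+)(?:d_ago|_days_ago)", ref)
    if m:
        return now - timedelta(days=int(m.group(1)))
    m = re.fullmatch(r"latest~(\d+)", ref)
    if m:
        return int(m.group(1))  # N records before the latest
    return None

now = datetime(2026, 4, 21)
print(parse_relative_ref("3d_ago", now))    # 2026-04-18 00:00:00
print(parse_relative_ref("latest~2", now))  # 2
print(parse_relative_ref("baseline", now))  # None -> literal tag lookup
```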

### Multi-baseline compare

```bash
perf-guard compare --base baseline --base last_week --current latest
```

Prints one table per baseline. Exits with `1` if any baseline shows a
regression.
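The pass/fail decision per metric follows from the two config fields
`threshold_pct` and `direction`. Roughly (a sketch of the comparison rule,
assuming `threshold_pct` is a relative change — not perf-guard's actual
code):

```python
def is_regression(base: float, current: float,
                  threshold_pct: float, direction: str) -> bool:
    """True if `current` moved past the threshold in the bad direction.

    For lower_is_better metrics an increase is bad; for
    higher_is_better metrics a decrease is bad.
    """
    change_pct = (current - base) / base * 100
    if direction == "lower_is_better":
        return change_pct > threshold_pct
    return -change_pct > threshold_pct  # higher_is_better

# total_eval_time: threshold 5%, lower is better
print(is_regression(100.0, 104.0, 5, "lower_is_better"))   # False (+4%)
print(is_regression(100.0, 106.0, 5, "lower_is_better"))   # True  (+6%)
# pusht_success_rate: threshold 2%, higher is better
print(is_regression(80.0, 78.0, 2, "higher_is_better"))    # True  (-2.5%)
```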

### Trend chart

```bash
# HTML chart (Chart.js via CDN) — default
perf-guard trend -o trend.html

# Restrict to a few snapshots / metrics
perf-guard trend --ref baseline --ref dtype-fix --ref compile-opt \
                 --metric total_eval_time --metric pusht_success_rate

# ASCII sparklines in the terminal
perf-guard trend --ascii
```
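`--ascii` only needs a mapping from values to Unicode block characters; the
idea fits in a few lines (a hypothetical rendering, not perf-guard's exact
output):

```python
BLOCKS = " ▁▂▃▄▅▆▇█"

def sparkline(values):
    """Scale a series onto the 9-level Unicode block range."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid div-by-zero on a flat series
    return "".join(BLOCKS[int((v - lo) / span * (len(BLOCKS) - 1))]
                   for v in values)

# e.g. total_eval_time across five snapshots
print(sparkline([1423.7, 1410.2, 1398.5, 1455.1, 1389.0]))
```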

## Exit codes

| Code | Meaning |
|---|---|
| 0 | Success, no regressions |
| 1 | Config error, ref not found, or regression detected |
| 2 | Result directory does not exist |

## Development

```bash
git clone https://github.com/huxie/perf-guard
cd perf-guard
pip install -e .
```

Build and publish to PyPI:

```bash
pip install build twine
python -m build
python -m twine upload dist/*
```

## License

MIT
