Metadata-Version: 2.4
Name: perf-guard
Version: 0.3.1
Summary: Performance regression monitor for ML inference projects
Author: huxie
License: MIT
Project-URL: Homepage, https://github.com/huxie/perf-guard
Project-URL: Repository, https://github.com/huxie/perf-guard
Project-URL: Issues, https://github.com/huxie/perf-guard/issues
Keywords: benchmark,performance,regression,ml,inference,monitoring
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: System :: Benchmark
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.0
Requires-Dist: pyyaml>=6.0
Dynamic: license-file

# perf-guard

`perf-guard` is a zero-intrusion command-line tool that turns the result
directories of your existing benchmark/eval scripts into a versioned
performance history. Record snapshots, compare them against one or many
baselines, and plot long-term trends.

## Highlights

- **Zero intrusion.** Never runs your benchmarks; it only parses their output.
- **Config-driven.** All metric extraction is declared in `perf_guard.yaml`.
- **Multi-baseline compare.** Compare the current run against `baseline`,
  `last_week`, `main`, or any tag in one command.
- **Trend charts.** Render the whole `.perf_history/` as an interactive HTML
  chart, or print ASCII sparklines in the terminal.
- **CI-ready.** Exits with code `1` on regression so pipelines fail
  automatically.
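
Because the exit code is meaningful, a CI step can be a two-liner like the
sketch below (the result path is illustrative):

```bash
# Record the fresh run, then gate the pipeline on the baseline.
# A non-zero exit from `compare` fails the job.
perf-guard record results/latest
perf-guard compare --base baseline
```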

## Install

```bash
pip install perf-guard
```

## Quick start

1. Drop a `perf_guard.yaml` at the project root:

   ```yaml
   results_dir: ./results

   metrics:
     - name: total_eval_time
       file: summary.txt
       pattern: "Eval time:\\s+([\\d.]+)s"
       threshold_pct: 5
       direction: lower_is_better

     - name: pusht_success_rate
       file: pusht/eval.log
       pattern: "success_rate=([\\d.]+)%"
       threshold_pct: 2
       direction: higher_is_better
   ```

2. Record your reference run and tag it:

   ```bash
   perf-guard record results/20260421_054042 --tag baseline
   ```

3. After your next run, compare and detect regressions:

   ```bash
   perf-guard compare                               # latest vs baseline
   perf-guard compare --base baseline --base last_week
   perf-guard compare --base main --current latest
   ```

## Commands

| Command | Purpose |
|---|---|
| `perf-guard record <dir> [--tag …]` | Extract metrics from a result dir and store a snapshot. |
| `perf-guard compare [--base …] [--current …]` | Compare current against one or more baselines. |
| `perf-guard list` | List all recorded snapshots. |
| `perf-guard report <ref>` | Show full metrics for one snapshot. |
| `perf-guard trend [--ref …] [--metric …] [-o path] [--ascii]` | Plot metric trends across history. |
| `perf-guard install-hook` | Install a git `post-commit` hook. |
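
For example, a typical record-then-inspect session (the tag name is
illustrative):

```bash
perf-guard record results/20260421_054042 --tag dtype-fix
perf-guard list                  # all recorded snapshots
perf-guard report dtype-fix      # full metrics for that snapshot
```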

### Ref resolution

Anywhere a `<ref>` is accepted, the following are valid:

- `latest` — the most recently recorded snapshot
- A user tag (created with `record --tag`)
- A dirname (e.g. `20260421_054042`)
- `yesterday`, `last_week`, `last_month` — closest snapshot to that time
- `<N>d_ago` or `<N>_days_ago` — closest snapshot to N days ago
- `latest~<N>` — N records before the latest
- `HEAD`, `HEAD~N`, branch names, tags, short SHAs — any git revision whose
  commit has been recorded (see *Git integration* below)
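
A few of these forms in practice:

```bash
perf-guard report latest~2            # two snapshots before the latest
perf-guard compare --base 7d_ago      # closest snapshot to a week ago
perf-guard compare --base last_month  # closest snapshot to a month ago
```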

### Git integration

When `perf-guard record` runs inside a git working tree, each snapshot also
stores the current commit:

```json
"git": {
  "sha": "a1b2c3d",
  "sha_full": "a1b2c3d4e5f6...",
  "branch": "main",
  "dirty": false,
  "subject": "jepa: fix fp16 cache alignment",
  "author": "huxie",
  "committed_at": "2026-04-21T05:12:00+08:00"
}
```

This unlocks git-style ref resolution for every command that takes a `<ref>`:

```bash
perf-guard compare --base HEAD~1         # vs the previous commit's run
perf-guard compare --base HEAD~5         # vs 5 commits ago
perf-guard compare --base main           # vs latest main-branch run
perf-guard compare --base v1.0           # vs a tagged release
perf-guard compare --base a1b2c3d        # vs a specific short SHA
perf-guard report HEAD                   # detail view of the HEAD run
```

`perf-guard list` and `perf-guard report` also display the commit SHA, branch,
and dirty flag so you can trace each snapshot back to its code state.

### Multi-baseline compare

```bash
perf-guard compare --base baseline --base last_week --current latest
```

Prints one table per baseline. Exits with `1` if any baseline shows a
regression.

### Trend chart

```bash
# HTML chart (Chart.js via CDN) — default
perf-guard trend -o trend.html

# Restrict to a few snapshots / metrics
perf-guard trend --ref baseline --ref dtype-fix --ref compile-opt \
                 --metric total_eval_time --metric pusht_success_rate

# ASCII sparklines in the terminal
perf-guard trend --ascii
```

## Exit codes

| Code | Meaning |
|---|---|
| 0 | Success, no regressions |
| 1 | Config error, ref not found, or regression detected |
| 2 | Result directory does not exist |
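
If a pipeline needs to distinguish the failure modes, a small shell sketch:

```bash
perf-guard compare --base baseline
case $? in
  0) echo "no regressions" ;;
  1) echo "regression, config error, or unknown ref" ;;
  2) echo "result directory does not exist" ;;
esac
```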

## Development

```bash
git clone https://github.com/huxie/perf-guard
cd perf-guard
pip install -e .
perf-guard --version
```

CI (`.github/workflows/ci.yml`) runs the CLI smoke-test matrix on Python
3.9–3.12 and builds/validates the distribution on every push and PR.

## Releasing

Releases are fully automated by `.github/workflows/publish.yml`. Any tag of
the form `v*` triggers:

1. build sdist + wheel, check with `twine check --strict`;
2. publish to **PyPI** via Trusted Publishing (no long-lived tokens);
3. create a GitHub Release with the matching `CHANGELOG.md` section and the
   built artifacts attached.

To cut a release:

```bash
scripts/release.sh 0.3.1
```

That script bumps the version in `pyproject.toml` and `perf_guard/__init__.py`,
inserts a CHANGELOG stub, asks for confirmation, then commits, tags, and pushes.
The publish workflow takes it from there.
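
If you ever need to trigger publishing by hand, pushing any `v*` tag is
enough (the version number below is illustrative):

```bash
git tag v0.3.2
git push origin v0.3.2   # any v* tag kicks off publish.yml
```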

### One-time setup: PyPI Trusted Publishing

On [pypi.org](https://pypi.org/manage/account/publishing/), add a pending
publisher for the `perf-guard` project:

| Field | Value |
|---|---|
| Owner | your GitHub username / org |
| Repository | `perf-guard` |
| Workflow name | `publish.yml` |
| Environment | `pypi` |

Also create a GitHub **environment** named `pypi` in the repository settings.
No secrets or tokens are needed — authentication happens via short-lived OIDC
tokens minted per run.

## License

MIT
