Metadata-Version: 2.4
Name: django-pg-baseline
Version: 0.1.0
Summary: Manage a baseline pg_dump for Django test databases — turn N-minute migrate boots into a few-second psql import.
Project-URL: Homepage, https://github.com/iplweb/django-pg-baseline
Project-URL: Repository, https://github.com/iplweb/django-pg-baseline
Project-URL: Issues, https://github.com/iplweb/django-pg-baseline/issues
Project-URL: Changelog, https://github.com/iplweb/django-pg-baseline/blob/main/CHANGELOG.md
Author-email: Michał Pasternak <michal.dtz@gmail.com>
License: MIT
License-File: LICENSE
Keywords: baseline,django,migrations,pg_dump,postgres,postgresql,pytest,testcontainers,testing
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Web Environment
Classifier: Framework :: Django
Classifier: Framework :: Django :: 5.0
Classifier: Framework :: Django :: 5.1
Classifier: Framework :: Django :: 5.2
Classifier: Framework :: Pytest
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Testing
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: django>=5.0
Requires-Dist: testcontainers[postgres]>=4.14.2
Provides-Extra: test
Requires-Dist: psycopg[binary]>=3.1; extra == 'test'
Requires-Dist: pytest-django>=4.8; extra == 'test'
Requires-Dist: pytest>=7; extra == 'test'
Description-Content-Type: text/markdown

# django-pg-baseline

[![tests](https://github.com/iplweb/django-pg-baseline/actions/workflows/tests.yml/badge.svg)](https://github.com/iplweb/django-pg-baseline/actions/workflows/tests.yml)
[![PyPI version](https://img.shields.io/pypi/v/django-pg-baseline.svg)](https://pypi.org/project/django-pg-baseline/)
[![Python versions](https://img.shields.io/pypi/pyversions/django-pg-baseline.svg)](https://pypi.org/project/django-pg-baseline/)
[![Django versions](https://img.shields.io/badge/Django-5.0%20%7C%205.1%20%7C%205.2-blue.svg)](https://pypi.org/project/django-pg-baseline/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

> Manage a baseline `pg_dump` for Django test databases — turn N-minute
> `migrate` boots into a few-second `psql` import.

A reusable Django app that manages a `baseline.sql` artifact (a
`pg_dump` of the post-migrate schema + seed data) and loads it
automatically whenever Django creates a test database. `migrate` then
applies only the small delta of migrations added since the dump was
taken.

In real projects with hundreds of migrations this turns a ~6-minute
`migrate` into a ~3-second `psql` import — or sub-second when paired
with a testcontainer that clones from a populated template DB.

## Why

A Django suite with hundreds of migrations and/or non-trivial seed
data spends many minutes per test run on `migrate`. The fix is
well-known in principle:

1. Apply migrations once against a clean PG.
2. `pg_dump` the result.
3. On every subsequent test run, `psql -f` (or
   `CREATE DATABASE ... WITH TEMPLATE`) the dump into the test DB,
   then let `migrate` apply only the small delta of migrations added
   since the dump was taken.
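
For orientation, the three manual steps above can be sketched as plain command lines. This is only an illustration of the pattern, not the package's internals: the helper names and the database names (`scratch`, `test_app`) are hypothetical, and the flags shown are standard `pg_dump` / `psql` options.

```python
import shlex


def migrate_cmd() -> list[str]:
    """Step 1: apply every migration against a clean database."""
    return ["python", "manage.py", "migrate", "--noinput"]


def dump_cmd(dbname: str, out_path: str) -> list[str]:
    """Step 2: write a plain-SQL dump of the migrated database."""
    return ["pg_dump", "--no-owner", f"--file={out_path}", dbname]


def load_cmd(dbname: str, sql_path: str) -> list[str]:
    """Step 3: replay the dump into a freshly created test database."""
    return [
        "psql",
        "--quiet",
        "--single-transaction",
        "-v", "ON_ERROR_STOP=1",
        "-d", dbname,
        "-f", sql_path,
    ]


if __name__ == "__main__":
    # "scratch" and "test_app" are placeholder database names.
    for cmd in (
        migrate_cmd(),
        dump_cmd("scratch", "baseline.sql"),
        load_cmd("test_app", "baseline.sql"),
    ):
        print(shlex.join(cmd))
```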

Every Django shop with a heavy migration history rediscovers this
pattern independently. `django-pg-baseline` packages it as a reusable
app with all the operational bits people forget the first time:

- a deterministic, version-controlled `baseline.sql` (with timestamp
  freezing for diff stability),
- a sidecar `baseline.meta.json` recording the highest migration name
  per app, plus git SHA and PG version,
- automatic loading on test DB creation when no faster
  template-clone path is available,
- explicit coordination with [pytest-testcontainers-django] for the
  template-clone path,
- a one-shot `manage.py baseline_rebuild` that spins an isolated PG
  via `testcontainers`, runs `migrate`, and emits the dump.

[pytest-testcontainers-django]: https://github.com/iplweb/pytest-testcontainers-django

## Features

- **One-line setup**: add `"django_pg_baseline"` to `INSTALLED_APPS`,
  set `PG_BASELINE['BASELINE_DIR']`, done.
- **Three usage modes**: standalone (host `psql`), with
  testcontainers (template clone), or rebuild-only (CI cron).
- **Optional pytest plugin** for projects that don't want to add the
  app to `INSTALLED_APPS`.
- **Deterministic dumps**: built-in timestamp freezing produces
  byte-stable diffs across rebuilds.
- **Cross-major PG support**: runs `pg_dump` *inside* the rebuild
  container to guarantee client/server version match; scrubs known
  PG17→PG16 incompatibilities.
- **Stale baseline is fine**: if the dump lags behind HEAD, Django's
  `migrate` applies the delta on top. `manage.py baseline_info` shows
  per-app deltas; the package itself never gates on freshness.
- **psycopg v2 *and* v3** compatible. No runtime psycopg dep — uses
  whichever the host project already pulled in for Django's PG
  backend.
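
To make the "stale baseline is fine" point concrete, here is a rough sketch of how a per-app delta could be computed from the highest migration name recorded per app. It is not the package's `freshness` code: `pending_migrations` and its inputs are hypothetical, and it assumes a linear history in which migration names sort by their numeric prefix (merge migrations would need the real migration graph).

```python
def pending_migrations(
    baseline_heads: dict[str, str],
    on_disk: dict[str, list[str]],
) -> dict[str, list[str]]:
    """Return, per app, the migrations that sort after the baseline head."""
    pending: dict[str, list[str]] = {}
    for app, names in on_disk.items():
        ordered = sorted(names)
        head = baseline_heads.get(app)
        if head is None:
            # App unknown to the baseline: every migration is still pending.
            delta = ordered
        else:
            delta = [name for name in ordered if name > head]
        if delta:
            pending[app] = delta
    return pending


if __name__ == "__main__":
    heads = {"blog": "0002_add_slug"}
    disk = {
        "blog": ["0001_initial", "0002_add_slug", "0003_add_tags"],
        "shop": ["0001_initial"],
    }
    print(pending_migrations(heads, disk))
```

Whatever ends up in the delta is what `migrate` would still have to apply after the baseline is loaded.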

## Installation

### Using uv (recommended)

```bash
uv add django-pg-baseline
```

### Using pip

```bash
pip install django-pg-baseline
```

This package depends on `Django>=5.0` and `testcontainers[postgres]`.
It does **not** declare a runtime psycopg dependency — your Django
project already has either `psycopg`, `psycopg-binary`, `psycopg2`, or
`psycopg2-binary` installed (Django's PG backend requires one), and
forcing a flavor would conflict with that choice.

## Quick start

### 1. Configure

```python
# settings.py
INSTALLED_APPS = [
    ...,
    "django_pg_baseline",
]

PG_BASELINE = {
    "BASELINE_DIR": BASE_DIR / "baseline-sql",
}
```

The directory should be tracked in git — it holds `baseline.sql` and
`baseline.meta.json`, both produced by the `baseline_rebuild`
command.

### 2. Generate the baseline

```bash
python manage.py baseline_rebuild
git add baseline-sql/baseline.sql baseline-sql/baseline.meta.json
git commit -m "chore(baseline): refresh after migrations"
```

This spins a fresh Postgres testcontainer, runs `migrate`, freezes
configured timestamp columns, runs `pg_dump` inside the container,
scrubs PG-version-specific lines, and writes the dump + meta file.
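
The timestamp-freezing step is what keeps the dump diff-stable. A minimal sketch of the idea, assuming the `(table, [columns])` shape used by `FREEZE_TIMESTAMPS` in the configuration reference; `freeze_statements` is a hypothetical helper, not the package's actual function, and it assumes table and column names come from trusted settings.

```python
def freeze_statements(
    targets: list[tuple[str, list[str]]],
    value: str = "2000-01-01 00:00:00+00",
) -> list[tuple[str, list[str]]]:
    """Build one parameterised UPDATE per table that pins its timestamps."""
    statements = []
    for table, columns in targets:
        # Identifiers are quoted; the timestamp itself is a bound parameter.
        assignments = ", ".join(f'"{col}" = %s' for col in columns)
        sql = f'UPDATE "{table}" SET {assignments}'
        statements.append((sql, [value] * len(columns)))
    return statements


if __name__ == "__main__":
    for sql, params in freeze_statements([("django_migrations", ["applied"])]):
        print(sql, params)
```

Each `(sql, params)` pair could then be passed to `cursor.execute(sql, params)` before `pg_dump` runs, so that two rebuilds of the same migration set produce the same rows.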

### 3. Run tests

```bash
pytest
```

Django creates the test DB; the monkey patch loads `baseline.sql` via
`psql`; `migrate` applies any post-baseline delta. That's it.

## Three modes of use

### Mode A — Standalone (host `psql`)

The simplest case. Useful when:

- the consumer runs tests against a long-lived PG (host PG, a
  `docker compose` service, a CI service container),
- `psql` is on `PATH`.

What happens at test time:

1. `AppConfig.ready()` installs the `_create_test_db` patch.
2. Django's runner calls `_create_test_db` → `CREATE DATABASE
   test_<name>`.
3. Patch sees `django_migrations` is missing in the new DB →
   `psql -f baseline.sql --single-transaction --quiet -v ON_ERROR_STOP=1`.
4. Django's `migrate` applies any post-baseline delta.
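
Step 3 hinges on detecting whether `django_migrations` already exists. One way to express that check is Postgres' `to_regclass()`, which returns NULL for a missing relation. The sketch below is an assumption about how such a check could look, not the package's `baseline_needed` implementation; `FakeCursor` is a stand-in so the example runs without a database.

```python
class FakeCursor:
    """Stand-in for a DB-API cursor so the sketch runs without Postgres."""

    def __init__(self, existing_tables):
        self._existing = set(existing_tables)
        self._row = None

    def execute(self, sql, params=None):
        # Mimic `to_regclass(name) IS NOT NULL` for the requested table.
        self._row = (params[0] in self._existing,)

    def fetchone(self):
        return self._row


def needs_baseline(cursor) -> bool:
    """True when the database has no django_migrations table yet."""
    cursor.execute(
        "SELECT to_regclass(%s) IS NOT NULL", ["django_migrations"]
    )
    return not cursor.fetchone()[0]


if __name__ == "__main__":
    print(needs_baseline(FakeCursor([])))                     # fresh DB
    print(needs_baseline(FakeCursor(["django_migrations"])))  # already loaded
```

With a real connection, `needs_baseline(connection.cursor())` would run the same query against the freshly created test database.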

**Note:** if you use a `TEMPLATE` DB (set `TEST.TEMPLATE` in your
`DATABASES`), the test DB user must be granted the `pg_signal_backend`
role — Postgres needs zero connections on the source DB before
`CREATE DATABASE WITH TEMPLATE` is allowed, and the patch terminates
leftover sessions to enforce that.
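
As a rough illustration of what "terminating leftover sessions" involves, the statements below use the standard `pg_stat_activity` view and `pg_terminate_backend()` function. The helper names and the role and database names are hypothetical; the exact SQL the patch runs may differ.

```python
def terminate_sessions_sql(template_db: str) -> tuple[str, list[str]]:
    """SQL that disconnects every other session on the template database."""
    sql = (
        "SELECT pg_terminate_backend(pid) "
        "FROM pg_stat_activity "
        "WHERE datname = %s AND pid <> pg_backend_pid()"
    )
    return sql, [template_db]


def grant_signal_role_sql(role: str) -> str:
    """One-off GRANT letting `role` terminate sessions it does not own."""
    # Identifiers cannot be bound as parameters, so quote defensively.
    quoted = '"' + role.replace('"', '""') + '"'
    return f"GRANT pg_signal_backend TO {quoted}"


if __name__ == "__main__":
    print(grant_signal_role_sql("test_user"))
    print(*terminate_sessions_sql("app_template"))
```

The `GRANT` is typically issued once by a superuser; the termination query then runs as the test DB user right before the clone.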

### Mode B — With pytest-testcontainers-django

Faster (sub-second test-DB creation via template clone). Useful when:

- you accept Docker as a test dependency,
- you want the test DB to be a *clone* of a populated template
  rather than a `psql` reload.

Setup is identical to Mode A. Once
[pytest-testcontainers-django] is installed, it auto-detects this
package via `get_baseline_path()`, mounts `baseline.sql` into the PG
container as `/docker-entrypoint-initdb.d/01-baseline.sql`, and sets
`DATABASES['default']['TEST']['TEMPLATE']` so Django runs
`CREATE DATABASE … WITH TEMPLATE …`.
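
For reference, this is roughly the shape of the setting that ends up in place. `TEST["TEMPLATE"]` is a standard Django option for PostgreSQL; the helper and the template name `app_template` are hypothetical and only illustrate the resulting `DATABASES` structure, since the plugin sets it for you.

```python
import copy


def with_test_template(
    databases: dict, template: str, alias: str = "default"
) -> dict:
    """Return a copy of DATABASES whose test DB is cloned from `template`."""
    updated = copy.deepcopy(databases)
    updated[alias].setdefault("TEST", {})["TEMPLATE"] = template
    return updated


if __name__ == "__main__":
    base = {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": "app",
        }
    }
    print(with_test_template(base, "app_template"))
```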

In Mode B the host `psql` shell-out is **never** invoked. This
package still provides:

- the patch's "kick sessions off template" prelude,
- `settings.PG_BASELINE` and `get_baseline_path()`,
- `manage.py baseline_rebuild`.

### Mode C — Build/rebuild the baseline (CI or local)

```bash
python manage.py baseline_rebuild
git add path/to/baseline-sql/
git commit -m "chore(baseline): refresh after migrations …"
```

Recommended downstream wiring: a GitHub Action that runs
`baseline_rebuild` whenever `**/migrations/**` changes on the main
branch and opens a PR with the refreshed dump. The package itself
does not enforce any "freshness" policy — when to rebuild is the
project's decision; we just provide the tooling.

## Configuration reference

```python
PG_BASELINE = {
    # REQUIRED. Directory holding baseline.sql + baseline.meta.json.
    "BASELINE_DIR": BASE_DIR / "baseline-sql",

    # Optional, defaults shown.
    "SQL_FILENAME": "baseline.sql",
    "META_FILENAME": "baseline.meta.json",

    # Which Django connection to load into / dump from.
    "DATABASE_ALIAS": "default",

    # Auto-install the _create_test_db monkey patch in
    # AppConfig.ready(). Set to False for manual control (e.g. only
    # under pytest, only on certain CI hosts).
    "AUTO_LOAD_ON_TEST_DB": True,

    # Image used by `baseline_rebuild`. Override for plpython3u,
    # custom locales, extensions, etc.
    "REBUILD_IMAGE": "postgres:16",

    # Extra args appended to the built-in pg_dump invocation. The
    # default invocation already includes --no-owner --no-acl
    # --no-privileges --no-comments and --exclude-table-data=django_session.
    "PG_DUMP_EXTRA_ARGS": ["--exclude-table-data=audit_log"],

    # Stacks ON TOP of the default exclusions. Each entry becomes
    # --exclude-table-data=<pattern>. Cleaner than spelling out
    # --exclude-table-data=... in PG_DUMP_EXTRA_ARGS.
    "PG_DUMP_EXTRA_EXCLUDE_TABLE_DATA": ["django_cache*", "easy_thumbnails_*"],

    # Tables/columns whose timestamps are frozen before pg_dump,
    # for deterministic diffs across rebuilds.
    "FREEZE_TIMESTAMPS": [("django_migrations", ["applied"])],
    "FREEZE_TIMESTAMPS_EXTRA": [("django_template", ["creation_date"])],
    "FREEZE_TIMESTAMP_VALUE": "2000-01-01 00:00:00+00",
}
```
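
To show how the two `pg_dump` options stack, here is a sketch that assembles the final argument list from the settings above. The default flags are the ones listed in the comments; the function name and the ordering of the pieces are assumptions, not the package's actual code.

```python
DEFAULT_ARGS = [
    "--no-owner",
    "--no-acl",
    "--no-privileges",
    "--no-comments",
    "--exclude-table-data=django_session",
]


def build_pg_dump_args(config: dict) -> list[str]:
    """Defaults first, then extra exclusions, then free-form extra args."""
    args = list(DEFAULT_ARGS)
    patterns = config.get("PG_DUMP_EXTRA_EXCLUDE_TABLE_DATA", [])
    args += [f"--exclude-table-data={pattern}" for pattern in patterns]
    args += config.get("PG_DUMP_EXTRA_ARGS", [])
    return args


if __name__ == "__main__":
    print(build_pg_dump_args({
        "PG_DUMP_EXTRA_EXCLUDE_TABLE_DATA": ["django_cache*"],
        "PG_DUMP_EXTRA_ARGS": ["--exclude-table-data=audit_log"],
    }))
```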

## Management commands

| Command | What it does |
| --- | --- |
| `baseline_load` | Load `baseline.sql` into the configured DB. Skips when `django_migrations` already exists, unless `--force`. |
| `baseline_info` | Human summary: git SHA, PG version, sql/meta paths, plus per-app deltas. Always exits 0. |
| `baseline_rebuild` | Regenerate `baseline.sql` + `baseline.meta.json`. Spins a `testcontainers` PG, runs `migrate`, freezes timestamps, runs in-container `pg_dump`, scrubs, writes meta. Flags: `--image`, `--baseline-dir`. |

## Pytest plugin (alternative to `INSTALLED_APPS`)

If you'd rather not add the app to `INSTALLED_APPS`, the package
ships a pytest plugin that installs the same monkey patch via
`pytest_configure`:

```toml
# pyproject.toml — pytest auto-discovers the plugin via the
# pytest11 entry point. Nothing else needed.
```

Behaviour matches the `INSTALLED_APPS` route exactly:

- no-op when `DJANGO_SETTINGS_MODULE` is unset,
- no-op when `PG_BASELINE` is unset,
- raises `pytest.UsageError` when `BASELINE_DIR` is configured but
  `baseline.sql` is missing (matching `AppConfig.ready()` policy —
  loud failure beats silent slowness in CI).
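
The three rules above amount to a small decision table. The sketch below restates them as a pure function for clarity; `plugin_action` and its return values are hypothetical, and in the real plugin the `"error"` outcome corresponds to raising `pytest.UsageError`.

```python
from pathlib import Path
from typing import Optional


def plugin_action(
    settings_module: Optional[str], pg_baseline: Optional[dict]
) -> str:
    """Return "noop", "error", or "install" following the rules above."""
    if not settings_module:
        return "noop"  # DJANGO_SETTINGS_MODULE is unset
    if not pg_baseline:
        return "noop"  # PG_BASELINE is unset
    sql_path = Path(pg_baseline["BASELINE_DIR"]) / pg_baseline.get(
        "SQL_FILENAME", "baseline.sql"
    )
    if not sql_path.is_file():
        return "error"  # configured, but the dump is missing
    return "install"


if __name__ == "__main__":
    print(plugin_action(None, None))
    print(plugin_action("proj.settings", {"BASELINE_DIR": "does-not-exist"}))
```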

Use one route or the other, not both. (Both are idempotent; double
install is safe but pointless.)

## Public API

Stable from v0.1 (the contract surface for downstream tooling such
as `pytest-testcontainers-django`):

```python
from django_pg_baseline import get_baseline_path  # Path | None
```

Reachable via submodules but **not yet locked under semver** (to be
stabilised at v1.0):

```python
from django_pg_baseline.conf import get_config, BaselineConfig
from django_pg_baseline.patches import install_test_db_patch
from django_pg_baseline.loader import load_baseline, baseline_needed
from django_pg_baseline.freshness import check_freshness, FreshnessReport
```

## Environment variables

| Variable | Effect |
| --- | --- |
| `DJANGO_PG_BASELINE_SQL_PATH` | Override `get_baseline_path()` resolution. Points at a dump file directly, bypassing `settings.PG_BASELINE['BASELINE_DIR']`. Useful for CI pinning a specific baseline. |
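
The precedence described in the table can be pictured as follows. This is a sketch of the resolution order only: `resolve_baseline_path` is a hypothetical stand-in for `get_baseline_path()`, and whether the real function also checks that the file exists is not assumed here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_baseline_path(
    baseline_dir: Optional[Path],
    sql_filename: str = "baseline.sql",
    environ: Mapping[str, str] = os.environ,
) -> Optional[Path]:
    """Environment override first, then BASELINE_DIR / SQL_FILENAME."""
    override = environ.get("DJANGO_PG_BASELINE_SQL_PATH")
    if override:
        return Path(override)
    if baseline_dir is None:
        return None
    return Path(baseline_dir) / sql_filename


if __name__ == "__main__":
    # "/ci/pinned.sql" is a placeholder path for a pinned CI baseline.
    pinned = {"DJANGO_PG_BASELINE_SQL_PATH": "/ci/pinned.sql"}
    print(resolve_baseline_path(Path("baseline-sql"), environ=pinned))
    print(resolve_baseline_path(Path("baseline-sql"), environ={}))
```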

## Security note

The dump captures all data present in the testcontainer after
`migrate` runs. If your data migrations seed users, fixtures, or any
other content that ends up in the dump, *that data lands in version
control*. Review the dump before committing, especially on the first
rebuild. Use `PG_DUMP_EXTRA_EXCLUDE_TABLE_DATA` to skip tables whose
row data should not ship (e.g. `auth_user` when you have real test
passwords). The package does **not** exclude `auth_user` by default —
projects that intentionally seed admin fixtures rely on that data
being in the baseline.
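
One way to review a dump before committing is to list which tables actually carry row data. The sketch below is a hypothetical helper, not part of the package. It assumes the plain-SQL format that `pg_dump` emits by default (`COPY ... FROM stdin;` blocks terminated by `\.`) and simple unquoted table names.

```python
import re

COPY_RE = re.compile(r"^COPY (\S+) \(.*\) FROM stdin;$")


def rows_per_table(dump_text: str) -> dict[str, int]:
    """Count data rows per table; tables with zero rows are omitted."""
    counts: dict[str, int] = {}
    current = None
    for line in dump_text.split("\n"):
        if current is None:
            match = COPY_RE.match(line)
            if match:
                current = match.group(1)
                counts[current] = 0
        elif line == "\\.":
            current = None  # end-of-data marker closes the COPY block
        else:
            counts[current] += 1
    return {table: n for table, n in counts.items() if n > 0}


if __name__ == "__main__":
    sample = (
        "COPY public.auth_user (id, username) FROM stdin;\n"
        "1\tadmin\n"
        "\\.\n"
    )
    print(rows_per_table(sample))
```

Any table that shows up here and should not ship is a candidate for `PG_DUMP_EXTRA_EXCLUDE_TABLE_DATA`.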

## Supported versions

### Python

| Python | 3.10 | 3.11 | 3.12 | 3.13 |
|--------|:----:|:----:|:----:|:----:|
|        | ✓    | ✓    | ✓    | ✓    |

### Django × Python

| Django  | 3.10 | 3.11 | 3.12 | 3.13 | Status                                  |
|---------|:----:|:----:|:----:|:----:|-----------------------------------------|
| 5.0     | ✓    | ✓    | ✓    | —    | EOL Apr 2025 — supported on a best-effort basis |
| 5.1     | ✓    | ✓    | ✓    | ✓    | EOL Dec 2025 — supported on a best-effort basis |
| 5.2 LTS | ✓    | ✓    | ✓    | ✓    | Active LTS (extended support to Apr 2028) |

Django 4.2 is out of scope (LTS goes EOL in April 2026 — the project
targets current Django).

### PostgreSQL

PostgreSQL 16 and 17. Older PG versions (14, 15) are out of scope:
supporting them would complicate `_scrub_dump` (the list of
cross-major incompatibilities to scrub grows with every PG release we
keep alive).

### psycopg

`psycopg2`, `psycopg2-binary`, and `psycopg[binary]>=3` all work —
the package uses whichever your Django project already pulled in for
its PG backend. CI tests both `psycopg2-binary` and `psycopg[binary]`
in separate matrix cells.

### Operating system

Linux is the supported CI target. macOS works in practice for local
development. Windows is not supported — the package shells out to
`psql`/`pg_dump` and assumes POSIX path conventions and a Linux-style
Docker daemon for the rebuild path.

## How it fits with related packages

`django-pg-baseline` is package #3 of the testcontainers-for-Django
family:

1. `pytest-testcontainers` — generic pytest plugin,
   session-scoped Docker container lifecycle. Framework-agnostic.
2. `pytest-testcontainers-django` — Django bridge on top of #1.
   Injects env vars before Django imports settings; supports
   init-script mounts and `DATABASES['default']['TEST']['TEMPLATE']`
   for fast test-DB clone.
3. **`django-pg-baseline`** (this package) — manages the
   `baseline.sql` artifact and provides the patch /
   `get_baseline_path()` contract that #2 reads.

Each package can be used standalone. Pair #3 with #2 for the
fastest test-DB creation; use #3 alone with a host `psql` if you
prefer no Docker dependency.

## Contributing

Issues and PRs welcome at
<https://github.com/iplweb/django-pg-baseline>.

Local development:

```bash
git clone https://github.com/iplweb/django-pg-baseline
cd django-pg-baseline
uv sync --extra test
pre-commit install
pytest
```

## License

MIT — see [LICENSE](LICENSE).
