Metadata-Version: 2.4
Name: sleap-roots-contracts
Version: 0.1.0a2
Summary: Shared result + provenance contract for the sleap-roots <-> Bloom pipeline.
Keywords: sleap,roots,phenotyping,provenance,contract
Author: eberrigan
Author-email: eberrigan <eberrigan@salk.edu>
License-Expression: GPL-3.0-or-later
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering
Classifier: Intended Audience :: Science/Research
Requires-Dist: pydantic>=2.7
Requires-Dist: pyyaml>=6.0
Requires-Dist: pandas>=2.0 ; extra == 'pandas'
Requires-Python: >=3.11
Project-URL: Homepage, https://github.com/talmolab/sleap-roots-contracts
Project-URL: Issues, https://github.com/talmolab/sleap-roots-contracts/issues
Project-URL: Repository, https://github.com/talmolab/sleap-roots-contracts
Provides-Extra: pandas
Description-Content-Type: text/markdown

# sleap-roots-contracts

Shared **result + provenance contract** for the sleap-roots ↔ Bloom pipeline.

This is a small, dependency-light, Bloom-agnostic library that defines the shape of a
per-scan pipeline result and its provenance (Pydantic v2 models), emits a versioned JSON
Schema artifact, and ships a trait-definitions registry. The Python producers
(`sleap-roots-predict`, `sleap-roots-traits`) import it; Bloom consumes the emitted schema.

It also defines the **analysis-input contract**: the canonical shape of the wide trait
table that crosses the `sleap-roots-analyze` ↔ Bloom boundary.
`validate_analysis_input(df, *, strict=False)` structurally validates that table and
returns a structured `ValidationResult`. The table has fixed canonical role columns
(`genotype`, plus optional `sample_id` / `replicate` / `image_path`) and an open set of
opaque numeric trait columns. Because the validator operates on a pandas DataFrame, pandas
is an optional install extra (`pip install "sleap-roots-contracts[pandas]"`; the quotes keep
shells such as zsh from globbing the brackets), while the runtime core stays
pydantic + pyyaml. Canonical example tables ship in the package
(`sleap_roots_contracts.examples.load_analysis_input_example(...)`), so consumers can load a
validating frame straight from the released wheel.
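
To make the table shape concrete, here is a minimal stdlib-only sketch of what a structural
check over role columns and numeric trait columns could look like. It is **not** the
library's implementation: `check_wide_table`, the list-of-dicts table representation, and
the `primary_length` trait name are all hypothetical, and the real
`validate_analysis_input` works on a pandas DataFrame and returns a `ValidationResult`.
Only the role names come from this README.

```python
from numbers import Real

# Role names as listed in this README; everything else is illustrative.
REQUIRED_ROLES = {"genotype"}
OPTIONAL_ROLES = {"sample_id", "replicate", "image_path"}


def check_wide_table(rows):
    """Return a list of problems; an empty list means the table passed.

    Hypothetical helper: `rows` is a list of dicts, one dict per table row.
    """
    if not rows:
        return ["table has no rows"]

    problems = []
    columns = set(rows[0])

    # Every required role column must be present.
    for role in sorted(REQUIRED_ROLES - columns):
        problems.append(f"missing required role column: {role}")

    # Any column that is not a known role is treated as an opaque trait.
    trait_columns = sorted(columns - REQUIRED_ROLES - OPTIONAL_ROLES)
    if not trait_columns:
        problems.append("no trait columns found")

    for i, row in enumerate(rows):
        if set(row) != columns:
            problems.append(f"row {i}: columns differ from the first row")
            continue
        for name in trait_columns:
            value = row[name]
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, Real):
                problems.append(f"row {i}: trait {name!r} is not numeric")
    return problems


good = [
    {"genotype": "A", "replicate": 1, "primary_length": 10.5},
    {"genotype": "B", "replicate": 2, "primary_length": 12.0},
]
bad = [
    {"genotype": "A", "replicate": 1, "primary_length": 10.5},
    {"genotype": "B", "replicate": 2, "primary_length": "n/a"},
]
print(check_wide_table(good))
print(check_wide_table(bad))
```

The sketch treats every non-role column as a trait, which mirrors the "open set of opaque
numeric trait columns" idea: the check cares that trait values are numeric, not what the
traits are called.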

It is sub-project #1 of the sleap-roots ↔ Bloom integration program. Design and plan:
`docs/01-contract-library-design.md` and `docs/02-contract-library-plan.md`.

## Develop

```bash
uv sync
uv run pytest -v
uv run black --check src tests && uv run ruff check src tests
```

## Key ideas

- **Pydantic is canonical**; `schema/*.json` is generated and drift-guarded in CI.
- Trait **values** are long-format rows (no jsonb); provenance is a jsonb blob on the source.
- Hashes (`param_hash`, `idempotency_key`) are **producer-side only**; Bloom treats them as
  opaque strings.
- Distributed via **PyPI** (no Docker image — this is a library).
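
As an illustration of the producer-side hashing idea, the sketch below derives a stable
hash from a parameter dict by serializing it to canonical JSON first. This is an assumption
for illustration only: this README does not specify how `param_hash` or `idempotency_key`
are computed, so the SHA-256 choice, the canonicalization rules, the `scan_id` argument, and
the way the key is composed are all hypothetical.

```python
import hashlib
import json


def param_hash(params: dict) -> str:
    """Hash a parameter dict via canonical JSON (hypothetical scheme).

    Sorting keys and fixing separators means logically equal dicts
    produce the same bytes, and therefore the same digest.
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def idempotency_key(scan_id: str, params: dict) -> str:
    """Combine a scan identifier with the parameter hash (hypothetical scheme)."""
    payload = f"{scan_id}:{param_hash(params)}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Key order does not affect the hash; changing a value does.
a = param_hash({"model": "primary", "batch_size": 4})
b = param_hash({"batch_size": 4, "model": "primary"})
c = param_hash({"batch_size": 8, "model": "primary"})
print(a == b, a == c)
```

Whatever scheme a producer actually uses, the consumer side stays simple: Bloom stores and
compares the resulting strings without needing to recompute or interpret them.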
