Metadata-Version: 2.4
Name: mdmp-protocol
Version: 0.1.2
Summary: MDMP core protocol for dataset contracts, grading, fingerprints, and AI lineage cards
Author: IINTS / MDMP
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/python35/MDMP
Project-URL: Documentation, https://python35.github.io/MDMP/
Project-URL: Repository, https://github.com/python35/MDMP
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=2.0
Requires-Dist: numpy>=1.24
Requires-Dist: PyYAML>=6.0
Requires-Dist: typer>=0.12
Dynamic: license-file

# MDMP

MDMP is an open protocol and tooling stack for dataset quality and AI training provenance.

**Tagline:** _Know what your AI learned from._

## 30-second summary

MDMP gives every dataset a contract, a grade, and a fingerprint.
It gives every model a lineage card that records exactly which dataset fingerprints were used to train it.

- Contract: schema, ranges, consent metadata.
- Validation: reproducible checks + deterministic grade.
- Fingerprint: immutable dataset identity (`sha256:...`).
- Lineage card: model-to-dataset traceability with stale detection.
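
To make the fingerprint idea concrete, here is a minimal Python sketch that hashes a file's raw bytes and formats the digest as `sha256:...`. The function name `file_fingerprint` is hypothetical, and hashing raw bytes is an assumption: the algorithm MDMP actually uses may canonicalize the data first, so treat `MDMP_SPEC.md` as the authority.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path, chunk_size=1 << 20):
    """Return 'sha256:<hex>' for the raw bytes of a file.

    Illustrative only: a hypothetical helper, not MDMP's own implementation.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so large datasets do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()
```

Any single-byte change to the file yields a different digest, which is what lets a fingerprint act as a stable dataset identity.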

## Install

```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e .  # editable install; run from a checkout of the repository
```

## Quick Start

```bash
mdmp init --flavor health --output contracts/mdmp_contract.yaml
mdmp validate contracts/mdmp_contract.yaml data/demo_cgm.csv --output-json results/mdmp_report.json
mdmp report results/mdmp_report.json --output-html results/mdmp_dashboard.html
```
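
As a rough illustration of what checking a dataset against a contract involves, the sketch below counts missing and out-of-range values per column and derives a pass rate. The contract layout, the column name `glucose_mg_dl`, and the functions `validate_rows` and `pass_rate` are all hypothetical; MDMP's real contract schema and grading rules are defined in the spec and are not reproduced here.

```python
def validate_rows(rows, contract):
    """Count violations per contract column for a list of dict rows.

    A value counts as a violation when it is missing or falls outside
    the optional 'min'/'max' bounds. Hypothetical contract layout.
    """
    violations = {name: 0 for name in contract["columns"]}
    for row in rows:
        for name, rule in contract["columns"].items():
            value = row.get(name)
            if value is None:
                violations[name] += 1
                continue
            low, high = rule.get("min"), rule.get("max")
            if (low is not None and value < low) or (high is not None and value > high):
                violations[name] += 1
    return violations


def pass_rate(rows, contract):
    """Fraction of (row, column) cells that satisfy the contract."""
    total = len(rows) * len(contract["columns"])
    if total == 0:
        return 1.0
    failed = sum(validate_rows(rows, contract).values())
    return 1.0 - failed / total


# Hypothetical contract and data, for illustration only.
contract = {"columns": {"glucose_mg_dl": {"min": 40, "max": 400}}}
rows = [{"glucose_mg_dl": 110}, {"glucose_mg_dl": 900}, {}]
print(validate_rows(rows, contract))
print(pass_rate(rows, contract))
```

Because the checks depend only on the rows and the contract, the same inputs always produce the same counts, which is the property a deterministic grade relies on.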

## Core Commands

```bash
# grading + fingerprint
mdmp grade contracts/mdmp_contract.yaml data/demo_cgm.csv
mdmp fingerprint data/demo_cgm.csv

# stale lineage lifecycle
mdmp fingerprint-record data/demo_cgm.csv --output-json results/fingerprint.json --expires-days 365
mdmp fingerprint-check results/fingerprint.json data/demo_cgm.csv
mdmp lineage-card --model glucose_forecaster_v2 --dataset data/demo_cgm.csv --contract contracts/health_demo.yaml --output results/mdmp_model_card.yaml
mdmp lineage-card-refresh results/mdmp_model_card.yaml

# local registry scaffold
mdmp registry init --registry registry/mdmp_registry.json
mdmp registry push --registry registry/mdmp_registry.json --report results/mdmp_report.json --visibility public --model-id glucose_forecaster_v2
mdmp registry lookup sha256:YOUR_FINGERPRINT --registry registry/mdmp_registry.json
mdmp registry list --registry registry/mdmp_registry.json

# Hugging Face section export
mdmp hf-export --dataset-id python35/demo-cgm --report-json results/mdmp_report.json --output-md results/mdmp_hf_section.md
```
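
The stale-lineage commands above record a fingerprint with an expiry and later compare it against the current dataset. A minimal sketch of that logic, assuming a record with hypothetical `fingerprint` and `expires_at` fields, could look like the following; the JSON layout that `mdmp fingerprint-record` actually writes may differ.

```python
from datetime import datetime, timedelta, timezone


def make_record(fingerprint, expires_days, now=None):
    """Build a fingerprint record with an expiry timestamp (hypothetical fields)."""
    now = now or datetime.now(timezone.utc)
    return {
        "fingerprint": fingerprint,
        "recorded_at": now.isoformat(),
        "expires_at": (now + timedelta(days=expires_days)).isoformat(),
    }


def check_record(record, current_fingerprint, now=None):
    """Classify a record as 'changed', 'expired', or 'fresh'."""
    now = now or datetime.now(timezone.utc)
    # A different fingerprint means the dataset bytes are no longer the same.
    if record["fingerprint"] != current_fingerprint:
        return "changed"
    # The data is identical, but the record has outlived its validity window.
    if now >= datetime.fromisoformat(record["expires_at"]):
        return "expired"
    return "fresh"
```

In this sketch a content change takes precedence over expiry, on the reasoning that a model trained on data that has since changed is stale regardless of the record's age.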

## Documentation

- Spec: `MDMP_SPEC.md`
- Docs index: `docs/index.md`
- CLI reference: `docs/reference/cli.md`
- Launch checklist: `docs/launch/launch-checklist.md`
- Contributing: `CONTRIBUTING.md`

## Release Automation

- GitHub release workflow: `.github/workflows/release.yml`
- PyPI publish workflow: `.github/workflows/publish-pypi.yml`
- Docs deploy workflow: `.github/workflows/docs-site.yml`

## Design Boundaries

- MDMP stores metadata and fingerprints, not raw datasets.
- The workflow is local-first; a cloud registry is optional.
- MDMP complements DVC, MLflow, and W&B; it does not replace them.

## Status

- Spec version: `v0.1-draft`
- MDMP is a research and provenance utility.
- It is not a medical device and does not provide clinical decision support.

## Integrations

- IINTS integration guide: `docs/IINTS_INTEGRATION.md`
