Metadata-Version: 2.4
Name: invarlock
Version: 0.8.0
Summary: Edit‑agnostic robustness evaluation reports for weight edits (InvarLock framework)
Author-email: InvarLock Team <oss@invarlock.dev>
Maintainer-email: InvarLock Maintainers <support@invarlock.dev>
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/invarlock/invarlock
Project-URL: Repository, https://github.com/invarlock/invarlock
Project-URL: Documentation, https://invarlock.github.io/invarlock/0.8.0/
Project-URL: Issues, https://github.com/invarlock/invarlock/issues
Project-URL: Changelog, https://github.com/invarlock/invarlock/blob/v0.8.0/CHANGELOG.md
Keywords: machine-learning,deep-learning,transformers,pytorch,llm,quantization,evaluation,verification,model-editing,release-assurance,reproducibility
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Operating System :: OS Independent
Classifier: Typing :: Typed
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: typer>=0.15
Requires-Dist: click>=8.1
Requires-Dist: shellingham>=1.5.0
Requires-Dist: cryptography>=46.0.0
Requires-Dist: pydantic>=2.0
Requires-Dist: rich>=13.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: markdown>=3.5
Requires-Dist: psutil>=5.9
Requires-Dist: jsonschema>=4.0
Provides-Extra: adapters
Requires-Dist: torch>=2.1.0; extra == "adapters"
Requires-Dist: transformers>=5.5.0; extra == "adapters"
Requires-Dist: safetensors>=0.4.0; extra == "adapters"
Requires-Dist: protobuf>=4.25; extra == "adapters"
Requires-Dist: sentencepiece>=0.2.1; extra == "adapters"
Requires-Dist: tiktoken>=0.9.0; extra == "adapters"
Requires-Dist: pillow>=11.3.0; extra == "adapters"
Provides-Extra: hf
Requires-Dist: torch>=2.1.0; extra == "hf"
Requires-Dist: accelerate>=0.26.0; extra == "hf"
Requires-Dist: transformers>=5.5.0; extra == "hf"
Requires-Dist: safetensors>=0.4.0; extra == "hf"
Requires-Dist: protobuf>=4.25; extra == "hf"
Requires-Dist: sentencepiece>=0.2.1; extra == "hf"
Requires-Dist: tiktoken>=0.9.0; extra == "hf"
Requires-Dist: datasets>=3.0; extra == "hf"
Requires-Dist: requests>=2.33.0; extra == "hf"
Requires-Dist: numpy>=1.24; extra == "hf"
Requires-Dist: huggingface_hub>=1.0.0; extra == "hf"
Requires-Dist: aiohttp>=3.12.14; extra == "hf"
Requires-Dist: h2>=4.3.0; extra == "hf"
Requires-Dist: pillow>=11.3.0; extra == "hf"
Provides-Extra: guards
Requires-Dist: torch>=2.1.0; extra == "guards"
Requires-Dist: numpy>=1.24; extra == "guards"
Provides-Extra: edits
Requires-Dist: torch>=2.1.0; extra == "edits"
Provides-Extra: eval
Requires-Dist: torch>=2.1.0; extra == "eval"
Requires-Dist: datasets>=3.0; extra == "eval"
Requires-Dist: requests>=2.33.0; extra == "eval"
Provides-Extra: probes
Requires-Dist: scikit-learn>=1.4; extra == "probes"
Provides-Extra: gptq
Requires-Dist: torch>=2.1.0; extra == "gptq"
Requires-Dist: auto-gptq>=0.7.0; (platform_system == "Linux" and python_version < "3.13") and extra == "gptq"
Requires-Dist: triton>=2.3.0; platform_system == "Linux" and extra == "gptq"
Requires-Dist: transformers>=5.5.0; extra == "gptq"
Requires-Dist: safetensors>=0.4.0; extra == "gptq"
Provides-Extra: awq
Requires-Dist: torch>=2.1.0; extra == "awq"
Requires-Dist: autoawq>=0.2.0; platform_system == "Linux" and extra == "awq"
Requires-Dist: transformers>=5.5.0; extra == "awq"
Requires-Dist: safetensors>=0.4.0; extra == "awq"
Requires-Dist: triton>=2.3.0; platform_system == "Linux" and extra == "awq"
Provides-Extra: gpu
Requires-Dist: torch>=2.1.0; extra == "gpu"
Requires-Dist: accelerate>=0.27; extra == "gpu"
Requires-Dist: bitsandbytes>=0.41; extra == "gpu"
Requires-Dist: safetensors>=0.4.0; extra == "gpu"
Provides-Extra: advanced
Requires-Dist: torch>=2.1.0; extra == "advanced"
Requires-Dist: transformers>=5.5.0; extra == "advanced"
Requires-Dist: protobuf>=4.25; extra == "advanced"
Requires-Dist: sentencepiece>=0.2.1; extra == "advanced"
Requires-Dist: tiktoken>=0.9.0; extra == "advanced"
Requires-Dist: datasets>=3.0; extra == "advanced"
Requires-Dist: requests>=2.33.0; extra == "advanced"
Requires-Dist: numpy>=1.24; extra == "advanced"
Requires-Dist: scikit-learn>=1.4; extra == "advanced"
Requires-Dist: huggingface_hub>=1.0.0; extra == "advanced"
Requires-Dist: accelerate>=0.27; extra == "advanced"
Requires-Dist: bitsandbytes>=0.41; extra == "advanced"
Requires-Dist: auto-gptq>=0.7.0; (platform_system == "Linux" and python_version < "3.13") and extra == "advanced"
Requires-Dist: autoawq>=0.2.0; platform_system == "Linux" and extra == "advanced"
Requires-Dist: triton>=2.3.0; platform_system == "Linux" and extra == "advanced"
Requires-Dist: aiohttp>=3.12.14; extra == "advanced"
Requires-Dist: h2>=4.3.0; extra == "advanced"
Requires-Dist: pillow>=11.3.0; extra == "advanced"
Provides-Extra: all
Requires-Dist: torch>=2.1.0; extra == "all"
Requires-Dist: transformers>=5.5.0; extra == "all"
Requires-Dist: protobuf>=4.25; extra == "all"
Requires-Dist: sentencepiece>=0.2.1; extra == "all"
Requires-Dist: tiktoken>=0.9.0; extra == "all"
Requires-Dist: datasets>=3.0; extra == "all"
Requires-Dist: requests>=2.33.0; extra == "all"
Requires-Dist: numpy>=1.24; extra == "all"
Requires-Dist: scikit-learn>=1.4; extra == "all"
Requires-Dist: huggingface_hub>=1.0.0; extra == "all"
Requires-Dist: accelerate>=0.27; extra == "all"
Requires-Dist: bitsandbytes>=0.41; extra == "all"
Requires-Dist: auto-gptq>=0.7.0; (platform_system == "Linux" and python_version < "3.13") and extra == "all"
Requires-Dist: autoawq>=0.2.0; platform_system == "Linux" and extra == "all"
Requires-Dist: triton>=2.3.0; platform_system == "Linux" and extra == "all"
Requires-Dist: aiohttp>=3.12.14; extra == "all"
Requires-Dist: h2>=4.3.0; extra == "all"
Requires-Dist: pillow>=11.3.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: ruff==0.15.11; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: hypothesis>=6.98; extra == "dev"
Requires-Dist: pre-commit>=3.0; extra == "dev"
Requires-Dist: mkdocs>=1.5; extra == "dev"
Requires-Dist: mkdocs-material>=9.5; extra == "dev"
Requires-Dist: mkdocs-mermaid2-plugin>=1.1; extra == "dev"
Requires-Dist: sphinx>=7.0; extra == "dev"
Requires-Dist: matplotlib>=3.7; extra == "dev"
Requires-Dist: bitsandbytes>=0.41; extra == "dev"
Requires-Dist: build>=0.10.0; extra == "dev"
Requires-Dist: wheel>=0.46.0; extra == "dev"
Requires-Dist: requests>=2.33.0; extra == "dev"
Requires-Dist: twine>=4.0.0; extra == "dev"
Provides-Extra: ci
Requires-Dist: pytest>=7.0; extra == "ci"
Requires-Dist: pytest-cov>=4.0; extra == "ci"
Requires-Dist: hypothesis>=6.98; extra == "ci"
Requires-Dist: ruff==0.15.11; extra == "ci"
Requires-Dist: mypy>=1.0; extra == "ci"
Requires-Dist: mkdocs>=1.5; extra == "ci"
Requires-Dist: mkdocs-material>=9.5; extra == "ci"
Requires-Dist: mkdocs-mermaid2-plugin>=1.1; extra == "ci"
Requires-Dist: build>=0.10.0; extra == "ci"
Requires-Dist: wheel>=0.46.0; extra == "ci"
Requires-Dist: requests>=2.33.0; extra == "ci"
Provides-Extra: docs-ci
Requires-Dist: linkchecker>=10.5; extra == "docs-ci"
Requires-Dist: numpy>=1.24; extra == "docs-ci"
Requires-Dist: requests>=2.33.0; extra == "docs-ci"
Provides-Extra: precommit-ci
Requires-Dist: pre-commit>=3.0; extra == "precommit-ci"
Provides-Extra: release-ci
Requires-Dist: build>=0.10.0; extra == "release-ci"
Requires-Dist: requests>=2.33.0; extra == "release-ci"
Requires-Dist: twine>=4.0.0; extra == "release-ci"
Requires-Dist: cyclonedx-bom>=4.1; extra == "release-ci"
Provides-Extra: security-ci
Requires-Dist: pip-audit>=2.8; extra == "security-ci"
Requires-Dist: requests>=2.33.0; extra == "security-ci"
Requires-Dist: cyclonedx-bom>=4.1; extra == "security-ci"
Dynamic: license-file

<p align="center">
  <picture>
    <source
      media="(prefers-color-scheme: dark)"
      srcset="docs/assets/invarlock-logo-dark.svg"
    />
    <img src="docs/assets/invarlock-logo.svg" alt="InvarLock" />
  </picture>
</p>

<p align="center"><em>Edit‑agnostic robustness reports for weight edits</em></p>

<p align="center">
  <a href="https://github.com/invarlock/invarlock/actions/workflows/ci.yml">
    <img alt="CI" src="https://img.shields.io/github/actions/workflow/status/invarlock/invarlock/ci.yml?branch=main&logo=github&label=CI" />
  </a>
  <a href="https://pypi.org/project/invarlock/">
    <img alt="PyPI" src="https://badge.fury.io/py/invarlock.svg" />
  </a>
  <a href="https://invarlock.github.io/invarlock/0.8.0/">
    <img alt="Docs" src="https://img.shields.io/badge/docs-quickstart-blue.svg" />
  </a>
  <a href="LICENSE">
    <img alt="License: Apache-2.0" src="https://img.shields.io/badge/License-Apache_2.0-blue.svg" />
  </a>
  <a href="https://www.python.org/downloads/release/python-3120/">
    <img alt="Python 3.12+" src="https://img.shields.io/badge/python-3.12+-blue.svg" />
  </a>
</p>

<p align="center">
  <strong>Catch silent quality regressions from quantization, pruning, and weight edits before they ship.</strong>
</p>

Quantizing, pruning, or otherwise editing a model’s weights can silently degrade quality.
InvarLock compares an edited **subject** checkpoint against a fixed **baseline** with paired
evaluation windows, enforces the canonical guard chain (`invariants` → `spectral` → `RMT`
→ `variance` → `invariants`), and produces a machine-readable evaluation report you can gate
in CI.

## Why InvarLock?

- **Quality gates for edited checkpoints**: catch regressions before deployment.
- **Paired statistical evidence**: primary metrics with confidence intervals.
- **Auditable evidence**: deterministic pairing metadata + policy digests in `evaluation.report.json`.
- **CI/CD-friendly**: stable exit codes, `--json` outputs, and portable “evidence packs”.
- **Offline-first**: network is disabled by default; enable downloads per command.

## Who is this for?

- ML engineers shipping edited model checkpoints, including quantized, pruned, fine-tuned, or otherwise weight-modified variants.
- MLOps and platform teams building CI gates, runtime-provenance verification, and reviewable evaluation artifacts.
- Researchers validating weight-edit, compression, and model-comparison methods with reproducible paired evaluation across the supported text and image-text workflows.

## How it works

```text
┌───────────────────────┐     ┌────────────────────────────────────────────┐
│ Baseline (checkpoint) │────►│                                            │
└───────────────────────┘     │  invarlock evaluate                        │
                              │  ├─► Paired windows (deterministic)        │
┌───────────────────────┐     │  ├─► GuardChain pipeline                   │
│ Subject  (checkpoint) │────►│  │   └─► invariants → spectral → RMT → VE  │
└───────────────────────┘     │  └─► Emit: evaluation.report.json          │
                              │                                            │
                              └────────────────────────────────────────────┘
                                                     │
                                     ┌───────────────┴───────────────┐
                                     ▼                               ▼
                                 ✅ PASS                          ❌ FAIL
                                 (ship)                          (rollback)

```
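
The "paired windows" step can be illustrated generically. The sketch below is not InvarLock's estimator, which this README does not describe. It is a plain paired bootstrap over per-window log-losses, and `paired_ratio_ci` and its parameters are hypothetical names chosen for illustration. It shows why pairing matters: both models are scored on the same windows, so the per-window difference cancels noise that is shared between them.

```python
import math
import random
from statistics import fmean


def paired_ratio_ci(baseline_losses, subject_losses, n_boot=1000, alpha=0.05, seed=0):
    """Return (ratio, (lo, hi)) for exp(mean(subject - baseline)) over paired windows.

    Illustrative only: a generic paired bootstrap, not InvarLock's implementation.
    """
    if len(baseline_losses) != len(subject_losses) or not baseline_losses:
        raise ValueError("need two non-empty, equally sized loss lists")

    # Window i of the subject is compared with window i of the baseline.
    deltas = [s - b for b, s in zip(baseline_losses, subject_losses)]
    ratio = math.exp(fmean(deltas))

    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(deltas)
    # Resample the paired differences with replacement and recompute the ratio.
    boots = sorted(math.exp(fmean(rng.choices(deltas, k=n))) for _ in range(n_boot))
    lo = boots[int((alpha / 2) * (n_boot - 1))]
    hi = boots[int((1 - alpha / 2) * (n_boot - 1))]
    return ratio, (lo, hi)
```

A ratio near 1.0 with a tight interval suggests the subject tracks the baseline on these windows; what counts as acceptable is a policy decision outside this sketch.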

## Quick start

Colab (CPU-friendly):
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/invarlock/invarlock/blob/v0.8.0/notebooks/invarlock_quickstart_cpu.ipynb)

The core workflow is `evaluate -> verify -> report html`; onboarding differs
by user type:

- **Wheel user / reviewer**: install `invarlock`, inspect an existing
  `evaluation.report.json`, and render HTML without cloning the repository.
- **Evaluator**: install `invarlock[hf]` when you want `evaluate` to load
  Hugging Face models and emit a fresh evaluation bundle.
- **Repo maintainer**: clone the repo and build the local runtime image when you
  need maintainer smokes, repo presets, or local container-image iteration.

The default `evaluate` path runs model-loading commands inside the runtime
container and expects an OCI engine such as `podman` or `docker`. Host-side
workflows can opt into `--execution-mode host`, but the default verification
path below expects a container-backed report with sibling runtime provenance.
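
Because the default path expects an OCI engine, a preflight check can save a failed run. The following is a minimal sketch using only the standard library. The function name `find_oci_engine` and the podman-before-docker order are assumptions for illustration; InvarLock's own engine selection may differ.

```python
import shutil


def find_oci_engine(candidates=("podman", "docker"), which=shutil.which):
    """Return the first candidate found on PATH, or None if none is installed.

    The candidate order is an assumption, not InvarLock's documented behavior.
    """
    for name in candidates:
        if which(name):
            return name
    return None
```

If this returns `None`, the host-side option mentioned above (`--execution-mode host`) is the alternative to installing an engine.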

```bash
# Evaluator path: create a fresh bundle
pip install "invarlock[hf]"

invarlock --version

# Compare baseline vs subject (downloads require explicit network enable)
invarlock evaluate --allow-network \
  --baseline gpt2 \
  --subject  distilgpt2 \
  --adapter auto \
  --profile ci \
  --report-out reports/eval \
  --quiet

# Validate the container-backed evaluation report
test -f reports/eval/runtime.manifest.json
invarlock verify --json reports/eval/evaluation.report.json

# Render HTML for sharing
invarlock report html -i reports/eval/evaluation.report.json -o reports/eval/evaluation.html
```
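
For CI systems scripted in Python, the verification step above could be wrapped as follows. This is a sketch under two assumptions: that exit code 0 from `invarlock verify` means the report passed, and that the helper name `verify_bundle` is hypothetical. The file names and the `verify --json` invocation are taken from the commands above; the `run` parameter exists only so the function can be exercised without the CLI installed.

```python
import subprocess
from pathlib import Path


def verify_bundle(report_dir, run=subprocess.run):
    """Return True if the bundle has sibling provenance and `invarlock verify` exits 0."""
    report_dir = Path(report_dir)
    report = report_dir / "evaluation.report.json"
    manifest = report_dir / "runtime.manifest.json"

    # Mirror the `test -f` check: a container-backed report needs its sibling manifest.
    if not report.is_file() or not manifest.is_file():
        return False

    # Assumption: a zero exit code means verification passed.
    result = run(["invarlock", "verify", "--json", str(report)])
    return result.returncode == 0
```

A CI job could call `verify_bundle("reports/eval")` and fail the build when it returns `False`.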

Wheel-only review path:
`pip install invarlock`, `invarlock doctor`,
`invarlock verify /path/to/evaluation.report.json`, and
`invarlock report html -i /path/to/evaluation.report.json -o /path/to/evaluation.html`.

Repo maintainers can build the local runtime image once with `make runtime-image`;
InvarLock automatically prefers `invarlock-runtime:local` when it is present.

Artifact model:

| Artifact | Produced by | Primary consumers |
| --- | --- | --- |
| `evaluation.report.json` | `invarlock evaluate`, `invarlock report generate --format report` | `invarlock verify`, `invarlock report html`, `invarlock report validate`, `invarlock report explain --evaluation-report`, `invarlock advanced runtime-verify` |
| `report.json` | Baseline/subject run directories under `runs/...` | `invarlock report generate`, `invarlock report explain --subject-report ... --baseline-report ...` |

Example output (abridged; counts vary by profile/config):

```text
INVARLOCK v<version> · EVALUATE
Baseline: gpt2 -> Subject: distilgpt2 · Profile: ci
Status: PASS · Gates: <passed>/<total> passed
Primary metric ratio: <ratio>
Output: reports/eval/evaluation.report.json
Runtime provenance: reports/eval/runtime.manifest.json
```

## Command Surface

- First touch in a fresh install: `invarlock --help`, `invarlock --version`,
  `invarlock report --help`, and `invarlock advanced --help`.
- Core workflow: `invarlock evaluate` → `invarlock verify` →
  `invarlock report html`.
- Follow-on report analysis after the core loop: `invarlock report generate`,
  `invarlock report explain`, and `invarlock report validate`.
- Environment and release checks: `invarlock doctor` plus the JSON surfaces
  emitted by `doctor --json` and `advanced plugins ... --json`.
- Runtime-manifest verifier: `invarlock advanced runtime-verify --report <evaluation.report.json> --manifest <runtime.manifest.json>`.
- The public contract catalog exposed by those JSON surfaces includes
  `validation_keys`, `console_labels`, and `metric_kinds`.
- Advanced workflows: `invarlock advanced evidence-pack`, `invarlock advanced policy`,
  `invarlock advanced plugins`, `invarlock advanced calibrate`, and
  `invarlock advanced runtime-verify`.
- Host execution for the core evaluate path uses `--execution-mode host`.
- Optional adapter/backend installs use normal Python extras such as
  `pip install "invarlock[hf]"` rather than CLI install commands.

## Evidence packs (portable evidence bundles)

Evidence packs bundle reports + verification metadata into a distributable artifact.

- Guide: <https://invarlock.github.io/invarlock/0.8.0/user-guide/evidence-packs/>
- Verify from an installed wheel:
  `invarlock advanced evidence-pack verify <dir> --strict`
- Repo harness alternative: `scripts/evidence_packs/verify_pack.sh --pack <dir> --strict`

Note: `configs/` and most `scripts/` remain repo resources and are not included in
wheels. Installed wheels include the public contracts and the
`invarlock advanced evidence-pack verify` verifier, so installed packages can
check bundles without cloning the repository.

## Installation

```bash
# Minimal CLI (no torch/transformers)
pip install invarlock

# HF workflows (torch/transformers)
pip install "invarlock[hf]"
```

Optional extras: `invarlock[probes]`, `invarlock[gpu]`, `invarlock[awq,gptq]`.
The `gptq` extra installs `auto-gptq` only on Linux with Python below 3.13,
because upstream `auto-gptq` packaging is narrower than the core InvarLock
support matrix. On Python 3.13+ you may need a vendor wheel or a supported
older interpreter. Full setup:
<https://invarlock.github.io/invarlock/0.8.0/user-guide/getting-started/>.

The minimal install covers the core verification and reporting flows. Add
`invarlock[hf]` only for model-loading evaluate runs, and use the installed
wheel's evidence-pack verifier when you need to inspect a bundle without cloning
the repository.
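
To tell which install you have, you can check whether the heavy dependencies are importable. A minimal sketch, assuming that `torch` and `transformers` are the modules that distinguish an `invarlock[hf]` install from the minimal one; `missing_modules` is a hypothetical helper, not part of InvarLock.

```python
from importlib.util import find_spec


def missing_modules(modules=("torch", "transformers")):
    """Return the top-level module names that cannot be imported in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in modules if find_spec(name) is None]
```

An empty list suggests the model-loading dependencies are present; a non-empty list names what `pip install "invarlock[hf]"` would be expected to add.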

## Documentation

- Docs home: <https://invarlock.github.io/invarlock/0.8.0/>
- Quickstart: <https://invarlock.github.io/invarlock/0.8.0/user-guide/quickstart/>
- Compare & evaluate (BYOE): <https://invarlock.github.io/invarlock/0.8.0/user-guide/compare-and-evaluate/>
- Reading a report: <https://invarlock.github.io/invarlock/0.8.0/user-guide/reading-report/>
- CLI reference: <https://invarlock.github.io/invarlock/0.8.0/reference/cli/>
- Assurance case: <https://invarlock.github.io/invarlock/0.8.0/assurance/00-assurance-case/>
  (repo source: `docs/assurance/00-assurance-case.md`)
- Threat model: <https://invarlock.github.io/invarlock/0.8.0/security/threat-model/>

## Community

- Questions/ideas: <https://github.com/invarlock/invarlock/discussions>
- Bug reports: <https://github.com/invarlock/invarlock/issues>
- Contact: <mailto:support@invarlock.dev>

## Citation

If you use InvarLock in scientific work, please cite it (canonical metadata is in `CITATION.cff`):

```bibtex
@software{invarlock,
  title  = {InvarLock: Edit-agnostic robustness evaluation reports for weight edits},
  author = {{InvarLock}},
  url    = {https://github.com/invarlock/invarlock},
}
```

## Limitations

- InvarLock evaluates an edited model relative to a baseline under a specific configuration; results are not “global” guarantees.
- Not a content-safety/alignment tool.
- Native Windows is not supported (use WSL2 or Linux).

## Support matrix

<!-- markdownlint-disable MD060 -->
| Platform               | Status          | Notes                                     |
| ---------------------- | --------------- | ----------------------------------------- |
| Python 3.12+           | ✅ Required      |                                           |
| Linux                  | ✅ Full          | Primary dev target                        |
| macOS (Intel/M-series) | ✅ Full          | MPS supported (default on Apple Silicon)  |
| Windows                | ❌ Not supported | Use WSL2 or a Linux container if required |
| CUDA                   | ✅ Recommended   | For larger models                         |
| CPU                    | ✅ Fallback      | Slower but functional                     |
<!-- markdownlint-enable MD060 -->
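
The first rows of the matrix can be checked before installing. The sketch below encodes only the Python version and Windows rows; `platform_problems` is a hypothetical helper, and the optional arguments exist so the logic can be tested without changing interpreters.

```python
import platform
import sys


def platform_problems(version_info=None, system=None):
    """Return a list of human-readable problems; an empty list means no known blocker."""
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system

    problems = []
    if tuple(version_info[:2]) < (3, 12):
        problems.append("Python 3.12+ is required")
    # Under WSL2, platform.system() reports "Linux", so only native Windows is flagged.
    if system == "Windows":
        problems.append("native Windows is not supported; use WSL2 or a Linux container")
    return problems
```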

## Project status

InvarLock is pre‑1.0. Until 1.0, minor releases may include breaking changes. See [`CHANGELOG.md`](CHANGELOG.md).

For guidance on where to ask questions, how to report bugs, and what to expect in terms of response times, see
[`SUPPORT.md`](SUPPORT.md).

## Contributing

- Contributing guide: <https://github.com/invarlock/invarlock/blob/v0.8.0/CONTRIBUTING.md>
- Fast local checks (repo clone):
  - `make` targets auto-select Python 3.12+, preferring an active 3.12 env, `python3.12`, then the Conda env `invarlock-py312` when present.
  - `make dev-install`
  - `make test`
  - `make lint`
  - `make docs-live`

## License

Apache-2.0 — see `LICENSE`.
