Metadata-Version: 2.4
Name: epistasis-v2
Version: 1.0.0
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: License :: Public Domain
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Dist: numpy>=1.23
Requires-Dist: pandas>=2.0
Requires-Dist: scipy>=1.11
Requires-Dist: scikit-learn>=1.3
Requires-Dist: lmfit>=1.2
Requires-Dist: emcee>=3.1
Requires-Dist: matplotlib>=3.7
Requires-Dist: gpmap-v2>=1.0.0
Summary: High-performance Python library for fitting high-order epistatic interactions in genotype-phenotype maps.
Keywords: epistasis,genotype-phenotype,genetics,genomics,bioinformatics
Author-email: Luis Perez <lperezmo@users.noreply.github.com>
License: Unlicense
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/lperezmo/epistasis-v2
Project-URL: Issues, https://github.com/lperezmo/epistasis-v2/issues
Project-URL: Repository, https://github.com/lperezmo/epistasis-v2

# epistasis-v2

High-performance Python library for fitting high-order epistatic interactions in genotype-phenotype maps. A clean-break rewrite of [harmslab/epistasis](https://github.com/harmslab/epistasis).

**Status: alpha.** The Phase 1 port is complete; the Phase 2 Rust kernels and the Phase 3 FWHT and sparse paths are still to come.

## What changed from v1

- Rust hot-path kernels via PyO3 (`epistasis._core`) instead of a shipped Cython `.c` blob (Phase 2, not yet landed).
- `uv` + `maturin` build. `pyproject.toml` only; no `setup.py`.
- Python 3.10 through 3.13. Older interpreters dropped.
- Type hints on the public API; `mypy --strict` in CI.
- Composition over `@use_sklearn` MRO injection. Concrete models hold an sklearn estimator as an attribute and forward calls explicitly. This keeps the models working on modern sklearn (>=1.2), where the removal of `normalize=` broke the v1 trick.
- Walsh-Hadamard fast-path for Hadamard-encoded OLS fits (Phase 3, not yet landed).
- Sparse design matrix path for Lasso / ElasticNet at high order (Phase 3, not yet landed).
- Coordinated rewrite of the [gpmap](https://github.com/harmslab/gpmap) dependency as [gpmap-v2](https://github.com/lperezmo/gpmap-v2). Consumes `binary_packed` (uint8 2D) and `encoding_table` with `site_index` instead of the deprecated `genotype_index`.
- No backward compatibility with v1. Pin the v1 package if you need that behavior.
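
The composition pattern mentioned in the list above can be sketched roughly as follows. This is an illustration only, not the library's actual classes: `ComposedModel` and `MeanEstimator` are hypothetical names, and `MeanEstimator` is a stand-in so the snippet runs without sklearn. In practice the wrapped object would be an sklearn estimator such as `LinearRegression`.

```python
from typing import Any, Protocol


class Estimator(Protocol):
    """Minimal fit/predict interface, as sklearn estimators provide."""

    def fit(self, X: Any, y: Any) -> Any: ...
    def predict(self, X: Any) -> Any: ...


class MeanEstimator:
    """Hypothetical stand-in estimator: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class ComposedModel:
    """Hypothetical wrapper that holds an estimator instead of inheriting from it."""

    def __init__(self, estimator: Estimator) -> None:
        self.estimator = estimator

    def fit(self, X, y) -> "ComposedModel":
        # Forward explicitly; no decorator rewrites the class hierarchy.
        self.estimator.fit(X, y)
        return self

    def predict(self, X):
        return self.estimator.predict(X)


model = ComposedModel(MeanEstimator()).fit([[0], [1], [2]], [1.0, 2.0, 3.0])
predictions = model.predict([[5], [6]])
```

Because the wrapper only relies on `fit` and `predict`, a change to an estimator's constructor arguments (such as the removed `normalize=`) stays confined to the place where the estimator is created.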

## Repository layout

```
epistasis-v2/
├── pyproject.toml          uv + maturin build, ruff + mypy + pytest config
├── Cargo.toml              Rust workspace
├── python/epistasis/       Python source (installed as `epistasis`)
├── crates/epistasis-core/  Rust crate, exposed as `epistasis._core`
├── tests/                  pytest suite
├── benches/                pytest-benchmark + criterion (Phase 2+)
├── docs/                   Sphinx docs (Phase 5)
├── .github/workflows/      CI (lint, test, matrix) + release (semantic-release, maturin wheels, PyPI OIDC)
├── CHANGELOG.md            generated by python-semantic-release
└── CONTRIBUTING.md         commit conventions, dev workflow
```

## Installation (dev)

Requires Python >= 3.10 and a Rust toolchain. `gpmap-v2` is pulled from PyPI.

```bash
uv sync
uv run maturin develop --release
uv run pytest
```

For lint and type-check:

```bash
uv run ruff check .
uv run ruff format --check .
uv run mypy python/epistasis
```

## Current progress

Phase 0 (scaffold) and Phase 1 (port) complete.

Ported modules:

- `epistasis.mapping` (sites, coefficients, `EpistasisMap`)
- `epistasis.matrix` (encoded vectors, design matrix, NumPy backend)
- `epistasis.exceptions` (`EpistasisError`, `XMatrixError`, `FittingError`)
- `epistasis.utils` (`genotypes_to_X`)
- `epistasis.models.base` (`AbstractEpistasisModel`, `EpistasisBaseModel`)
- `epistasis.models.linear` (`EpistasisLinearRegression` with analytic coefficient standard errors, `EpistasisRidge`, `EpistasisLasso`, `EpistasisElasticNet`)
- `epistasis.models.nonlinear` (`EpistasisNonlinearRegression`, `FunctionMinimizer`; `power` and `spline` variants deferred)
- `epistasis.models.classifiers` (`EpistasisLogisticRegression`; LDA, QDA, Gaussian Process, and GMM deferred)
- `epistasis.simulate` (`simulate_linear_gpm`, `simulate_random_linear_gpm`)
- `epistasis.stats` (Pearson, R^2, RMSD, SS residuals, AIC, `split_gpm`)
- `epistasis.validate` (`k_fold`, `holdout`)
- `epistasis.sampling.bayesian` (`BayesianSampler` via emcee 3)

Pending:

- Rust kernels for `build_model_matrix` and `encode_vectors` (Phase 2)
- FWHT fast path for Hadamard-encoded OLS fits (Phase 3)
- Sparse design matrix path for Lasso / ElasticNet (Phase 3)
- `power.py` and `spline.py` nonlinear variants
- Remaining classifier implementations if demand surfaces
- ReadTheDocs build
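
The FWHT fast path listed as pending has not landed, so the following is only a rough pure-Python sketch of the underlying transform, not the planned implementation. It assumes a complete binary map whose design matrix is a ±1 Hadamard matrix `H` with `H.T @ H = n * I`, in which case the OLS coefficients reduce to `fwht(y) / n`. The function name `fwht` and the natural (Hadamard) ordering are assumptions; the library's sign and ordering conventions may differ.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h apart within each block.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


# Phenotypes for all 2**3 genotypes of a hypothetical 3-site binary map.
y = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]

# With a full ±1 Hadamard design matrix, OLS reduces to a scaled transform.
coefficients = [v / len(y) for v in fwht(y)]
```

The transform costs O(n log n) operations for n genotypes, compared with O(n^2) for an explicit product with the dense design matrix, which is the motivation for a fast path at high order.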

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md). Commits follow [Conventional Commits](https://www.conventionalcommits.org/); releases and the changelog are automated by `python-semantic-release`.

## License

Unlicense (public domain). See [UNLICENSE](UNLICENSE).

