Metadata-Version: 2.4
Name: pharmacoml
Version: 0.1.1
Summary: ML-Enhanced Pharmacometrics Toolkit
Author: Sai Rani
License: MIT
Project-URL: Repository, https://github.com/s-rani1/pharmacoml
Keywords: pharmacometrics,machine-learning,covariate-selection,SHAP,population-PK,NONMEM
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: scikit-learn>=1.3
Requires-Dist: xgboost>=1.7
Requires-Dist: lightgbm>=4.0
Requires-Dist: catboost>=1.2
Requires-Dist: shap>=0.42
Requires-Dist: matplotlib>=3.7
Requires-Dist: scipy>=1.10
Requires-Dist: statsmodels>=0.14
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Provides-Extra: dl
Requires-Dist: pytorch-tabnet>=4.0; extra == "dl"
Requires-Dist: torch>=2.0; extra == "dl"
Provides-Extra: symbolic
Requires-Dist: gplearn>=0.4.2; extra == "symbolic"
Requires-Dist: pysr>=1.5.9; extra == "symbolic"
Provides-Extra: all
Requires-Dist: pytorch-tabnet>=4.0; extra == "all"
Requires-Dist: torch>=2.0; extra == "all"
Requires-Dist: gplearn>=0.4.2; extra == "all"
Requires-Dist: pysr>=1.5.9; extra == "all"

# pharmacoml

**ML-Enhanced Pharmacometrics Toolkit for Python**

`pharmacoml` brings machine learning methods to pharmacometrics workflows and can be used **with or without NONMEM**.

## Why pharmacoml?

Existing ML tools for pharmacometrics (pyDarwin, shap-cov) require a NONMEM license. `pharmacoml` is **estimation-tool agnostic**: it works with empirical Bayes estimates (EBEs) from NONMEM, nlmixr2, Monolix, Pumas, or manual estimation.

## Quick Start

The recommended default workflow is the hybrid preselection path. It combines:

- explainable boosting with benchmark-approved recursive elimination for discovery
- augmented adaptive LASSO for confirmation
- optional stochastic-gates and symbolic-structure confirmation
- tiered output for downstream stepwise covariate modeling (SCM) and backward elimination

```python
from pharmacoml.covselect import HybridScreener
import pandas as pd

ebes = pd.read_csv("individual_parameters.csv")
covs = pd.read_csv("covariates.csv")

report = HybridScreener(include_scm=True).fit(ebes, covs)

report.confirmed_covariates()  # recommended daily-use answer
report.core_covariates()       # strongest ML signals
report.candidate_covariates()  # broader shortlist for exploration
report.scm_covariates()        # explicit SCM-confirmed set
report.proxy_groups()
print(report.to_nonmem_candidates())
```

## Experimental Consensus

For broader research comparisons, the experimental namespace exposes a curated
multi-model consensus workflow. It runs a scikit-first model set by default,
aggregates top-k covariate frequency across model families, and lets you compare
that consensus against the main hybrid workflow.

```python
from pharmacoml.covselect.experimental import MultiModelConsensusScreener

# `ebes` and `covs` are the DataFrames loaded in Quick Start.
report = MultiModelConsensusScreener(
    top_k=3,
    n_bootstrap=8,
    include_neural=False,
).fit(ebes, covs)

report.consensus_covariates()
report.selection_frequency_table()
```
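The top-k frequency aggregation described above can be illustrated without the library. The following is a minimal sketch, not the `MultiModelConsensusScreener` implementation: the function name `topk_selection_frequency`, the ranking labels, the covariate names, and the 0.5 consensus threshold are all assumptions made for illustration.

```python
from collections import Counter


def topk_selection_frequency(rankings, top_k=3):
    """Fraction of rankings in which each covariate appears in the top k.

    `rankings` maps a run label (for example a model family) to a list of
    covariate names ordered from most to least important.
    """
    if not rankings:
        return {}
    counts = Counter()
    for ordered in rankings.values():
        # set() guards against a name being listed twice in one ranking.
        counts.update(set(ordered[:top_k]))
    n_runs = len(rankings)
    return {name: count / n_runs for name, count in counts.most_common()}


# Hypothetical importance rankings from three model families.
rankings = {
    "random_forest": ["WT", "CRCL", "AGE", "SEX"],
    "gradient_boosting": ["WT", "AGE", "CRCL", "ALB"],
    "elastic_net": ["CRCL", "WT", "ALB", "SEX"],
}

freq = topk_selection_frequency(rankings, top_k=2)

# Assumed rule: keep covariates selected in at least half of the runs.
consensus = sorted(name for name, f in freq.items() if f >= 0.5)
print(freq)
print(consensus)
```

Counting membership in the top k, rather than averaging raw importance scores, keeps model families with differently scaled importances comparable.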

## Benchmark-Gated Development

`pharmacoml` ships a release benchmark suite used to calibrate its defaults:

- `pheno` (Pharmpy phenobarbital example)
- `Eleveld/Wahlquist` public propofol data
- `ggPMX` Monolix theophylline example
- `Asiimwe-style` correlated-covariate simulation
- `Shap-Cov-style` collinear simulation
- optional `Kekic` public synthetic scenarios when the repo is available locally

The benchmark harness compares the default hybrid workflow against optional
variants such as recursive feature elimination (RFE) and shrinkage-awareness.
The current benchmark-approved defaults are `rfe_enabled=True` and
`shrinkage_awareness=True`.

```bash
PYTHONPATH=. python benchmarks/run_public_benchmarks.py --check
```

By default, that command writes a reusable report bundle to
`benchmarks/reports/fixed_public/`:

- `public_benchmark_report.md`
- `public_benchmark_summary.csv`
- `public_benchmark_details.csv`
- `public_benchmark_report.json`

Use `--no-report` to skip artifact generation, or `--report-dir <path>` to
write the bundle somewhere else.
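After a benchmark run, it can be useful to confirm that the whole bundle was written. The sketch below checks a directory for the four file names listed above. The helper `missing_artifacts` is hypothetical and is not part of `pharmacoml`; it makes no assumption about the contents of the files.

```python
from pathlib import Path

# File names as listed in this README.
EXPECTED_ARTIFACTS = (
    "public_benchmark_report.md",
    "public_benchmark_summary.csv",
    "public_benchmark_details.csv",
    "public_benchmark_report.json",
)


def missing_artifacts(report_dir="benchmarks/reports/fixed_public"):
    """Return the expected report files that are absent from `report_dir`."""
    root = Path(report_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (root / name).is_file()]


missing = missing_artifacts()
if missing:
    print("Missing report artifacts:", ", ".join(missing))
else:
    print("Report bundle is complete.")
```

Pass the same path given to `--report-dir` if the bundle was written somewhere else.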

## Optional Advanced Backends

`pharmacoml` includes:

- a nonlinear stochastic-gates engine (`STG`) with an input-gated MLP body
- symbolic covariate structure search with three backends:
  - `basis` (default, lightweight pharmacometric basis search)
  - `gplearn` (optional true symbolic regression)
  - `pysr` (optional true symbolic regression; install manually)

Example:

```python
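from pharmacoml.covselect import HybridScreener

# `ebes` and `covs` are the DataFrames loaded in Quick Start.
report = HybridScreener(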
    include_symbolic=True,
    symbolic_backend="basis",
    include_stg=True,
).fit(ebes, covs)
```
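To give a sense of what a basis search over covariate structures involves, here is a minimal standalone sketch. It is not the `basis` backend's implementation: the three candidate forms (linear, power, exponential), the function names, the reference weight of 70, and the simulated data are assumptions chosen for illustration. Each form is fitted by ordinary least squares on its natural scale, and the forms are then compared by residual sum of squares on the original parameter scale. Covariate and parameter values must be positive for the log-based forms.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def basis_search(cov, param, ref):
    """Fit each candidate form and return (best_form, rss_by_form)."""
    # Each entry: (covariate transform, parameter transform, inverse transform).
    forms = {
        "linear": (lambda c: c - ref, lambda p: p, lambda v: v),
        "power": (lambda c: math.log(c / ref), math.log, math.exp),
        "exponential": (lambda c: c - ref, math.log, math.exp),
    }
    rss = {}
    for name, (fx, fy, inverse) in forms.items():
        x = [fx(c) for c in cov]
        y = [fy(p) for p in param]
        a, b = fit_line(x, y)
        # Back-transform predictions so every form is scored on the same scale.
        pred = [inverse(a + b * xi) for xi in x]
        rss[name] = sum((p - q) ** 2 for p, q in zip(param, pred))
    return min(rss, key=rss.get), rss


# Simulated clearance following an allometric power law, without noise.
weights = [40.0, 55.0, 70.0, 85.0, 100.0, 120.0]
clearance = [5.0 * (w / 70.0) ** 0.75 for w in weights]

best, rss = basis_search(weights, clearance, ref=70.0)
print(best)
```

Scoring on the original scale matters because residuals from a log-scale fit and a natural-scale fit are not directly comparable.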

Install optional extras:

```bash
pip install -e ".[dev,dl,symbolic]"
```

## Installation

```bash
git clone https://github.com/s-rani1/pharmacoml.git
cd pharmacoml
pip install -e ".[dev]"
```

## Docs

Static docs pages live in `docs/` and can be served directly from GitHub Pages:

- `docs/index.html`
- `docs/tutorial.html`
- `docs/benchmarks.html`

## Modules

| Module | Description | Status |
|--------|-------------|--------|
| `pharmacoml.covselect` | Hybrid ML-assisted covariate screening with SCM bridge | 🚧 Active |
| `pharmacoml.expose` | Exposure-response with interpretable ML | 📋 Planned |
| `pharmacoml.trajpk` | PK/PD trajectory clustering | 📋 Planned |

## License

MIT
