Metadata-Version: 2.4
Name: deeplife-toolkit
Version: 0.1.0a1
Summary: Official DeepLife Python toolkit (PyPI): TwinCell (`deeplife_toolkit.twincell`), pseudo-bulk (`deeplife_toolkit.pseudobulk`), and differential expression (`deeplife_toolkit.differential_expression`). Optional `[notebook]` extra for Jupyter tutorial dependencies.
Project-URL: Homepage, https://deeplife.co
Project-URL: Documentation, https://github.com/deeplifeai/deeplife-toolkit/tree/main/docs
Project-URL: Repository, https://github.com/deeplifeai/deeplife-toolkit
Project-URL: Issues, https://github.com/deeplifeai/deeplife-toolkit/issues
Project-URL: Changelog, https://github.com/deeplifeai/deeplife-toolkit/releases
Author-email: DeepLife <hello@deeplife.co>
License: MIT
License-File: LICENSE
Keywords: anndata,api,bioinformatics,deeplife,differential-expression,digital-twin,drug-discovery,pseudobulk,scanpy,single-cell,twincell
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Healthcare Industry
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: anndata>=0.12.7
Requires-Dist: decoupler>=2.1.1
Requires-Dist: h5py>=3.10.0
Requires-Dist: hdf5plugin>=5.1.0
Requires-Dist: httpx>=0.27.2
Requires-Dist: hydra-core>=1.3.0
Requires-Dist: kaleido>=1.0.0
Requires-Dist: numpy>=1.26.4
Requires-Dist: omegaconf>=2.3.0
Requires-Dist: pandas>=2.2.3
Requires-Dist: plotly>=6.3.0
Requires-Dist: pyarrow>=21.0.0
Requires-Dist: pydantic>=2.8.2
Requires-Dist: pydeseq2>=0.5.2
Requires-Dist: pyyaml>=6.0
Requires-Dist: scanpy[leiden]>=1.11.4
Requires-Dist: tenacity>=9.0.0
Requires-Dist: tqdm>=4.67.1
Provides-Extra: notebook
Requires-Dist: adjusttext>=0.8.0; extra == 'notebook'
Requires-Dist: gseapy>=1.1.9; extra == 'notebook'
Requires-Dist: ipykernel>=7.1.0; extra == 'notebook'
Requires-Dist: ipython>=8.26.0; extra == 'notebook'
Description-Content-Type: text/markdown

# deeplife-toolkit

[![PyPI version](https://img.shields.io/pypi/v/deeplife-toolkit)](https://pypi.org/project/deeplife-toolkit/)
[![CI](https://github.com/deeplifeai/deeplife-toolkit/actions/workflows/ci.yml/badge.svg)](https://github.com/deeplifeai/deeplife-toolkit/actions/workflows/ci.yml)
[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

Official **[DeepLife](https://deeplife.co/)** Python **toolkit** for **TwinCell** and related analysis: call the Open API from Python, validate and preprocess **AnnData** (`.h5ad`) locally, aggregate to **pseudo-bulk** with **`deeplife_toolkit.pseudobulk`**, and run **sample-level differential expression** with **`deeplife_toolkit.differential_expression`** (suitable for pseudo-bulk, bulk, or other compatible count tables).

**Source:** [github.com/deeplifeai/deeplife-toolkit](https://github.com/deeplifeai/deeplife-toolkit) · **Docs:** [`docs/README.md`](docs/README.md) · **Python:** 3.12+ (see [`pyproject.toml`](pyproject.toml)).

---

## Install

```bash
pip install deeplife-toolkit
```

**API key:** you need a DeepLife key (usually `dl_…`). Create or copy one from the **[TwinCell console](https://twincell.core.dev.deeplife-internal.co/console)** (sign in, open **API keys** / **Keys**, then generate or reveal a key). In code, set `DEEPLIFE_API_KEY` in the environment or pass `api_key=` when constructing the client. Your org may also document keys alongside its Open API (Swagger) entrypoints.
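For example, to make the key available via the environment in a shell session (the key value below is a placeholder, not a real key):

```shell
# Export the key for the current shell session (placeholder value).
export DEEPLIFE_API_KEY="dl_xxxxxxxxxxxx"

# Verify it is visible to Python before running toolkit code.
python -c 'import os; print("key set:", bool(os.environ.get("DEEPLIFE_API_KEY")))'
```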

**From a git clone** (contributors):

```bash
uv sync --group dev
```

---

## Package layout

| Import | Role |
|--------|------|
| `deeplife_toolkit.twincell.api` | HTTP client (`OpenDeepLifeClient`, async variant), predictions, datasets, `TwinCell` / `TwinCellStudy` workflows, data catalog helpers, plotting |
| `deeplife_toolkit.twincell.validation` | TwinCell-ready preprocessing and local validation (`DeepLifePreprocessor`, configs, CLI `twincell-validate-h5ad`) |
| `deeplife_toolkit.pseudobulk`, `deeplife_toolkit.differential_expression` | Pseudo-bulk from single-cell `AnnData`, and sample-level DE (CLIs `twincell-pseudobulk`, `twincell-diffexpr`) |

Install with **`pip install deeplife-toolkit`** (the PyPI **project** name uses a hyphen); import **`deeplife_toolkit.twincell`**, **`deeplife_toolkit.pseudobulk`**, and **`deeplife_toolkit.differential_expression`** (underscores; these are nested packages inside the `deeplife_toolkit` distribution).
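Pseudo-bulk aggregation collapses single-cell counts into one bulk-like profile per sample, which is what makes sample-level DE applicable downstream. As a conceptual sketch of that operation only (plain pandas; the toy column names and counts are illustrative, not the `deeplife_toolkit.pseudobulk` API):

```python
import pandas as pd

# Toy single-cell count matrix: rows = cells, columns = genes,
# plus a per-cell "sample" label (illustrative, not toolkit API).
cells = pd.DataFrame(
    {
        "sample": ["s1", "s1", "s2", "s2", "s2"],
        "GENE_A": [3, 1, 0, 2, 4],
        "GENE_B": [0, 5, 1, 1, 0],
    }
)

# Pseudo-bulk: sum raw counts across all cells of each sample,
# yielding one bulk-like row per sample.
pseudobulk = cells.groupby("sample").sum()
print(pseudobulk)
#         GENE_A  GENE_B
# sample
# s1           4       5
# s2           6       2
```

The toolkit additionally handles `AnnData` inputs and grouping choices (e.g. sample × cell type); see `deeplife_toolkit.pseudobulk` and the `twincell-pseudobulk` CLI for the real interface.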

---

## Minimal API usage

End-to-end flow: preprocess/validate `.h5ad` → `create_prediction` → poll or `watch_prediction` → read `results` (and optional influence / causal helpers). The toolkit validates uploads the same way the TwinCell API does.

```python
import os
from deeplife_toolkit.twincell.api import OpenDeepLifeClient

client = OpenDeepLifeClient(api_key=os.environ["DEEPLIFE_API_KEY"])
prediction = client.create_prediction(dataset="twincell_ready.h5ad")
final = client.wait_for_prediction(prediction_id=prediction.prediction_id)
print(final.status)
```

**Defaults:** the client uses the toolkit’s configured API base URL; pass `base_url=` to target another environment. Retries apply only to safe **GET**-style calls (such as polling), never to `POST` requests, so uploads are not duplicated. For TLS/proxy issues, use `tls_verify=` and `trust_env=` on the client; see `OpenDeepLifeClient` in `deeplife_toolkit.twincell.api`.
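The poll-until-terminal step that `wait_for_prediction` performs for you can be sketched as a small generic helper. This stdlib-only version is illustrative: the `fetch_status` callable and the terminal state names are assumptions for the sketch, not the client's actual interface.

```python
import time

# Assumed terminal states for illustration only.
TERMINAL_STATES = {"succeeded", "failed", "cancelled"}

def poll_until_done(fetch_status, interval_s=2.0, timeout_s=600.0):
    """Call fetch_status() until it returns a terminal state or we time out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        time.sleep(interval_s)
    raise TimeoutError(f"prediction still not terminal after {timeout_s}s")

# Stubbed usage: a fake status source that finishes on the third call.
statuses = iter(["queued", "running", "succeeded"])
print(poll_until_done(lambda: next(statuses), interval_s=0.0))  # succeeded
```

In practice, prefer the client's built-in waiting (which also gets the retry behavior described above) over hand-rolled loops like this.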

For preprocessing options (raw layer, disease filter, gene symbol column, etc.), use `deeplife_toolkit.twincell.validation` or the [example notebooks](#example-notebooks).

---

## Example notebooks

Curated tutorials live under **`notebooks/`** (next to [`setup_deeplife_environment.sh`](notebooks/setup_deeplife_environment.sh)). The PyPI wheel does not bundle these files; use a **git checkout** of this repository (or copy the `.ipynb` files) to run them.

[`kang_pbmc.ipynb`](notebooks/kang_pbmc.ipynb) is the **Kang PBMC (IFN-β)** tutorial: preprocess → pseudo-bulk → **PyDESeq2** → **`TwinCellStudy`**, scorecards, causal subgraphs, and `simulate()`, with an optional commented block for the in-memory **`skin_atlas_2024`** atlas.

**Layout**

| Path | Role |
|------|------|
| [`notebooks/`](notebooks/) | Jupyter TwinCell tutorial (`kang_pbmc.ipynb`). |
| [`notebooks/data/`](notebooks/data/) | Downloaded `.h5ad` and notebook outputs (gitignored). The notebook resolves the repo root from the kernel cwd or the installed `deeplife_toolkit` package path, then writes here when paths are relative to the repo. |
| [`notebooks/setup_deeplife_environment.sh`](notebooks/setup_deeplife_environment.sh) | Optional bootstrap: `uv sync --group dev --extra notebook`, optional AWS CodeArtifact token for internal indexes, and a **`Python (deeplife-toolkit)`** ipykernel. Run from anywhere; it `cd`s to the repo root automatically. |

**Paths and API keys:** use a **git checkout** so `notebooks/data/` exists on disk. The notebook walks up from the kernel cwd looking for a `pyproject.toml` with `name = "deeplife-toolkit"`, falling back to the installed `deeplife_toolkit` package path. It loads Kang data over HTTP via `get_adata_from_url` and prompts for a key with **`getpass`** (or reads `DEEPLIFE_API_KEY`) before the TwinCell study flows.

| Notebook | When to use |
|----------|-------------|
| [`notebooks/kang_pbmc.ipynb`](notebooks/kang_pbmc.ipynb) | **Kang PBMC (IFN-β):** preprocess → pseudo-bulk → **PyDESeq2** → **`TwinCellStudy`**, scorecards, causal subgraphs, `simulate()`. |

If you installed **`deeplife-toolkit` from PyPI**, you still need a clone of this repo for the notebook itself: create a virtual environment, `pip install deeplife-toolkit` into it, clone the repo, then install whatever you need to run Jupyter (for example **`pip install "deeplife-toolkit[notebook]"`**, **JupyterLab** or **Notebook**, and **seaborn**, which `kang_pbmc.ipynb` uses as written). Select that environment as the Jupyter kernel.

Example dataset helpers live in `deeplife_toolkit.twincell.api.datasets` (`EXAMPLE_DATASETS` is empty until you add HTTP(S) links; `download_example_dataset` streams those URLs). In-memory demo atlas: `get_adata(dataset_name="skin_atlas_2024")` from `deeplife_toolkit.twincell.api.data`.

> PyPI releases can lag `main`; after `pip install -U deeplife-toolkit`, check that imports match the [layout](#package-layout) you expect.

---

## Development

```bash
uv sync --group dev
make check-all    # or: ruff, mypy, pytest — see Makefile
```

**CI / CD:** [`.github/workflows/ci.yml`](.github/workflows/ci.yml) runs on pushes and PRs to **`main`** / **`master`**: `uv sync --frozen --group dev`, **Ruff** (format + lint), **mypy**, **pytest**, then **`uv build`** and **`twine check --strict`**. Releases use [`.github/workflows/pypi-publish.yml`](.github/workflows/pypi-publish.yml) (same checks, then **PyPI** via OIDC). [`.github/dependabot.yml`](.github/dependabot.yml) bumps **GitHub Actions** dependencies weekly. Maintainer steps: [`README_PYPI_TODO.md`](README_PYPI_TODO.md).

Contributors who rely on internal **CodeArtifact**-backed `uv` indexes can run **`sh notebooks/setup_deeplife_environment.sh`** once (AWS SSO + `uv sync --group dev --extra notebook` + optional Jupyter kernel registration), then open the tutorials under `notebooks/`.

---

## Publishing to PyPI (maintainers)

Use **[`README_PYPI_TODO.md`](README_PYPI_TODO.md)** for trusted publishing, GitHub environment `pypi`, and version tags.
