Metadata-Version: 2.4
Name: dvpimg
Version: 0.0.1a36
Summary: Utility functions for scalable DVP image analysis
Author-email: Lucas Diedrich <diedrich@biochem.mpg.de>, Anton Schüle <schuele@biochem.mpg.de>
Maintainer-email: Lucas Diedrich <diedrich@biochem.mpg.de>
License: MIT License
        
        Copyright (c) 2021, AUTHORS
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: bioimage,deep visual proteomics,image analysis,microscopy,spatial data
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.10
Requires-Dist: dask[distributed]<2026.1.2
Requires-Dist: dvp-io
Requires-Dist: harpy-analysis
Requires-Dist: lazyslide
Requires-Dist: pydantic
Requires-Dist: scikit-image
Requires-Dist: scikit-learn
Requires-Dist: scipy
Requires-Dist: seaborn
Requires-Dist: spatialdata
Requires-Dist: wsidata
Provides-Extra: dev
Requires-Dist: pre-commit>=3.3; extra == 'dev'
Requires-Dist: ruff>=0.0.280; extra == 'dev'
Provides-Extra: doc
Requires-Dist: docutils!=0.18.*,!=0.19.*,>=0.8; extra == 'doc'
Requires-Dist: ipykernel; extra == 'doc'
Requires-Dist: ipython; extra == 'doc'
Requires-Dist: myst-nb>=1.1; extra == 'doc'
Requires-Dist: setuptools; extra == 'doc'
Requires-Dist: sphinx-autodoc-typehints; extra == 'doc'
Requires-Dist: sphinx-book-theme>=1; extra == 'doc'
Requires-Dist: sphinx-copybutton; extra == 'doc'
Requires-Dist: sphinx-tabs; extra == 'doc'
Requires-Dist: sphinx>=4; extra == 'doc'
Requires-Dist: sphinxcontrib-bibtex>=1; extra == 'doc'
Requires-Dist: sphinxext-opengraph; extra == 'doc'
Provides-Extra: test
Requires-Dist: coverage; extra == 'test'
Requires-Dist: lazyslide; extra == 'test'
Requires-Dist: pytest; extra == 'test'
Provides-Extra: workflow
Requires-Dist: cellpose<4; extra == 'workflow'
Requires-Dist: snakemake>=7.32; extra == 'workflow'
Description-Content-Type: text/markdown

# Snakemake workflow: `dvp-imaging-pipeline`

[![Snakemake](https://img.shields.io/badge/snakemake-≥8.0.0-brightgreen.svg)](https://snakemake.github.io)
[![Test](https://github.com/lucas-diedrich/dvp-imaging-pipeline/actions/workflows/test.yaml/badge.svg)](https://github.com/lucas-diedrich/dvp-imaging-pipeline/actions/workflows/test.yaml)

<!-- [![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/) -->
<!-- [![workflow catalog](https://img.shields.io/badge/Snakemake%20workflow%20catalog-darkgreen)](https://snakemake.github.io/snakemake-workflow-catalog/docs/workflows/MannLabs/dvp-imaging-pipeline) -->

A Snakemake workflow for scalable processing of deep visual proteomics (DVP) imaging data.

- [Snakemake workflow: `dvp-imaging-pipeline`](#snakemake-workflow-dvp-imaging-pipeline)
  - [About](#about)
  - [Usage](#usage)
  - [Components](#components)
  - [References](#references)
  - [Citation](#citation)

## About

![Pipeline](docs/_static/image/pipeline.png)

### Workflow

```mermaid
flowchart LR


style QCTissue1 fill:#ffffff,stroke:#222222,stroke-width:2px,color:#000
style QCSegmentation1 fill:#ffffff,stroke:#222222,stroke-width:2px,color:#000
style QCClassification1 fill:#ffffff,stroke:#222222,stroke-width:2px,color:#000
style QCClassification2 fill:#ffffff,stroke:#222222,stroke-width:2px,color:#000

%% colors
%% DONE: #0097A7
style IO fill:#ffffff,stroke:#0097A7,stroke-width:4px,color:#000
style Segmentation1a fill:#ffffff,stroke:#0097A7,stroke-width:2px,color:#000
style Classification1a fill:#ffffff,stroke:#0097A7,stroke-width:2px,color:#000
style Classification2a fill:#ffffff,stroke:#0097A7,stroke-width:2px,color:#000




IO[Generate spatialdata]

Tissue1[Tissue Identification]
Tissue2[Dearraying]
Tissue3a[Tissue QC, IF]
Tissue3b[Tissue QC, HE]

Segmentation1a[Cell/Nuclei segmentation, IF]
Segmentation1b[Cell/Nuclei segmentation, HE]


Classification1a[QC, IF]
Classification2a[Cell Classification, IF]
Classification2b[Cell Classification, HE]

QCTissue1(Tissue Plots)
QCSegmentation1(Segmentation Plots + Metrics)
QCClassification1(QC Plots + Metrics)
QCClassification2(Classification Plots + Metrics)


IO --> Tissue1
Tissue1 -->|TMA| Tissue2
Tissue1 -->|WSI| Tissue3a
Tissue1 -->|WSI| Tissue3b

Tissue2 --> Tissue3a
Tissue2 --> Tissue3b


Tissue3a --> Segmentation1a
Tissue3b --> Segmentation1b

Segmentation1a --> Classification1a --> Classification2a
Segmentation1b --> Classification2b
```
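
The flowchart can also be read as a dependency graph. The following is a minimal sketch, not part of the pipeline, that transcribes the edges shown above into Python and derives the order of steps for one branch. The step names are the flowchart labels rather than the actual Snakemake rule names, the helper names `build_dependencies` and `ordered_steps` are hypothetical, and the expansions of TMA, WSI, IF and HE in the comments are the usual meanings of these abbreviations and an assumption here.

```python
from graphlib import TopologicalSorter


def build_dependencies(slide_type: str, stain: str) -> dict[str, set[str]]:
    """Map each step to its prerequisite steps for one branch of the flowchart.

    slide_type: "TMA" (tissue microarray) or "WSI" (whole-slide image)
    stain: "IF" (immunofluorescence) or "HE" (hematoxylin and eosin)
    """
    if slide_type not in {"TMA", "WSI"}:
        raise ValueError(f"Unknown slide type: {slide_type!r}")
    if stain not in {"IF", "HE"}:
        raise ValueError(f"Unknown stain: {stain!r}")

    tissue_qc = f"Tissue QC, {stain}"
    segmentation = f"Cell/Nuclei segmentation, {stain}"
    classification = f"Cell Classification, {stain}"

    deps = {"Tissue Identification": {"Generate spatialdata"}}
    if slide_type == "TMA":
        # Only tissue microarrays pass through the dearraying step.
        deps["Dearraying"] = {"Tissue Identification"}
        deps[tissue_qc] = {"Dearraying"}
    else:
        deps[tissue_qc] = {"Tissue Identification"}
    deps[segmentation] = {tissue_qc}
    if stain == "IF":
        # The flowchart shows a separate QC step only on the IF branch.
        deps["QC, IF"] = {segmentation}
        deps[classification] = {"QC, IF"}
    else:
        deps[classification] = {segmentation}
    return deps


def ordered_steps(slide_type: str, stain: str) -> list[str]:
    """Return the steps of one branch in a valid execution order."""
    sorter = TopologicalSorter(build_dependencies(slide_type, stain))
    return list(sorter.static_order())


if __name__ == "__main__":
    print(" -> ".join(ordered_steps("TMA", "IF")))
```

Each branch is a linear chain, so the topological order is unique. Snakemake resolves the real rule graph on its own. The sketch only makes the TMA/WSI and IF/HE branching explicit.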

### Scope

| Priority         |                                      | Current implementation/status |
| ---------------- | ------------------------------------ | ----------------------------- |
| **Must have**    |                                      |                               |
|                  | Spatialdata IO                       | :white_check_mark: (dvpio)    |
|                  | Nuclei+Cell segmentation             | :white_check_mark: (harpy)    |
|                  | Align cells+nuclei                   | (harpy)                       |
|                  | Tissue QC                            | (GrandQC, custom)             |
|                  | Cell QC                              | :white_check_mark: (custom)   |
|                  | Cell classification                  | :white_check_mark: (GMM)      |
|                  | Workflow manager (Snakemake)         |                               |
|                  | Documentation+Tutorials              |                               |
| **Nice to have** |                                      |                               |
|                  | Automated Reports                    |                               |
|                  | Spatialdata database                 |                               |
|                  | Config editor                        |                               |
|                  | Train cellpose models in spatialdata |                               |
|                  | Dockerized solution                  |                               |
| **Not in scope** |                                      |                               |
|                  | Complete GUI                         |                               |

## Usage

<!-- The usage of this workflow is described in the [Snakemake Workflow Catalog](https://snakemake.github.io/snakemake-workflow-catalog/docs/workflows/MannLabs/dvp-imaging-pipeline). -->

Detailed information about input data and workflow configuration can be found in [`config/README.md`](config/README.md).

If you use this workflow in a paper, please credit the authors by citing the URL of this repository or its DOI.

### Deployment options

To run the workflow from the command line, change the working directory to the root of the repository.

#### From repository

Clone this repository, or download it via `Code > Download ZIP`:

```bash
git clone https://github.com/lucas-diedrich/dvp-imaging-pipeline.git
```

```bash
cd path/to/dvp-imaging-pipeline
```

Adjust options in the default config file `config/config.yaml`.
Before running the complete workflow, you can perform a dry run using:

```bash
snakemake --dry-run
```

To run the workflow with test files using **conda**:

```bash
snakemake --logger snkmt --cores 2 --sdm conda --directory workflow --configfile tests/integration/config/config.yaml
```

Results (spatialdata objects + logs + plots) can be found in the `/tests/results` directory.
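
If you prefer to start the test run from Python, for example from a notebook or a wrapper script, the command above can be assembled as an argument list. This is a minimal sketch that only reuses the flags shown in this section. The helper names `snakemake_command` and `run_workflow` are hypothetical and not part of `dvpimg`.

```python
import shlex
import subprocess


def snakemake_command(
    configfile: str,
    cores: int = 2,
    directory: str = "workflow",
    dry_run: bool = False,
) -> list[str]:
    """Build the snakemake invocation from this section as an argument list."""
    cmd = [
        "snakemake",
        "--logger", "snkmt",
        "--cores", str(cores),
        "--sdm", "conda",
        "--directory", directory,
        "--configfile", configfile,
    ]
    if dry_run:
        # Same flag as the dry run shown earlier in this section.
        cmd.append("--dry-run")
    return cmd


def run_workflow(configfile: str, **kwargs) -> subprocess.CompletedProcess:
    """Run snakemake and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(snakemake_command(configfile, **kwargs), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(snakemake_command("tests/integration/config/config.yaml")))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.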

##### Logging

Monitor the progress of the workflow from the workflow directory with [snkmt](https://github.com/cademirch/snkmt). See also [the documentation](./docs/tutorials/workflow/logger.md).

```shell
snkmt console
```

<!-- To run the workflow with **apptainer** / **singularity**, add a link to a container registry in the `Snakefile`, for example `container: "oras://ghcr.io/<user>/<repository>:<version>"` for Github's container registry.
Run the workflow with:

```bash
snakemake --cores 2 --sdm conda apptainer --directory .test
``` -->

#### Install utility functions

The utility functions in the `src/` directory form an installable Python package (`dvpimg`) that can be installed with `pip`.

1. In the shell, change to a directory of your choice and create a suitable environment, e.g. with `mamba`:

```shell
mamba create -n dvpimg python=3.12 -y && mamba activate dvpimg
```

2. Clone the repository

```shell
git clone https://github.com/lucas-diedrich/dvp-imaging-pipeline.git
cd dvp-imaging-pipeline
```

3. Install the package

```shell
pip install .
```
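
After the installation steps above, you can confirm from Python that the package is visible in the active environment. The following is a minimal sketch that uses only the standard library. The helper names `installed_version` and `meets_python_requirement` are hypothetical, and the minimum Python version is taken from the package metadata (`Requires-Python: >=3.10`).

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Sequence


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


def meets_python_requirement(version_info: Sequence[int] = sys.version_info) -> bool:
    """Check the interpreter against the package's minimum of Python 3.10."""
    return tuple(version_info[:2]) >= (3, 10)


if __name__ == "__main__":
    if not meets_python_requirement():
        print("dvpimg requires Python 3.10 or newer")
    found = installed_version("dvpimg")
    if found is None:
        print("dvpimg is not installed in this environment")
    else:
        print(f"dvpimg {found} is installed")
```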

#### Docker

Todo

#### HPC

You can run this workflow on a high-performance computing (HPC) cluster.

On the cluster, create the `snakemake` base environment:

```shell
# Create an empty environment, then populate it from environment.yaml
conda create -n snakemake -y
conda env update -n snakemake --file environment.yaml
```

Additionally install the `snakemake-executor-plugin-slurm`:

```shell
pip install snakemake-executor-plugin-slurm
```

Then submit the provided workflow script on the cluster. Please check out the script and the official [snakemake slurm plugin documentation](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/slurm.html#snakemake-executor-plugin-slurm) to learn about relevant flags and settings.

```shell
cd /workflow/
sbatch snakemake.sbatch
```

### Tests

You can run an example workflow with hatch, which automatically manages the required dependencies:

```shell
# Install hatch
pip install hatch

# Install all dependencies+environments for package + snakemake workflow
hatch run workflow:install

# Dry run
hatch run workflow:dry-run

# Full test with example data
hatch run workflow:run-test
```

## Components

See the [documentation](./docs/usage.md) for more information on the individual components and detailed instructions on their configuration.

## References

(_alphabetical order_)

> **dvp-io**: _MannLabs/dvp-io_. Lucas Diedrich (2025). https://github.com/MannLabs/dvp-io.git

> **Harpy**: _saeyslab/harpy_. Saeys Lab (2025). https://github.com/saeyslab/harpy.git

> **Lazyslide**: Zheng, Y., Abila, E., Chrenková, E., Winkler, J. & Rendeiro, A. F. _LazySlide: accessible and interoperable whole slide image analysis_. 2025.05.28.656548 Preprint at https://doi.org/10.1101/2025.05.28.656548 (2025).

> **Snakemake**: Köster, J., Mölder, F., Jablonski, K. P., Letcher, B., Hall, M. B., Tomkins-Tinch, C. H., Sochat, V., Forster, J., Lee, S., Twardziok, S. O., Kanitz, A., Wilm, A., Holtgrewe, M., Rahmann, S. & Nahnsen, S. _Sustainable data analysis with Snakemake_. F1000Research 10, 33 (2021). https://doi.org/10.12688/f1000research.29032.2

> **Spatialdata**: Marconato, L. et al. _SpatialData: an open and universal data framework for spatial omics_. Nat Methods 1–5 (2024) doi:10.1038/s41592-024-02212-x.

## Citation

> DVP-Imaging-Pipeline [Computer software]. https://github.com/lucas-diedrich/dvp-imaging-pipeline.git
