Metadata-Version: 2.4
Name: atlas-patch
Version: 1.0.0.post5
Summary: A Python package for processing and handling whole slide images
Author: Yousef Kotp, Omar Metwally, Ahmed Alagha
License: CC-BY-NC-SA-4.0
Keywords: atlas-patch,whole-slide-image,wsi,tissue-segmentation,patch-extraction,computational-pathology
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: Other/Proprietary License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Healthcare Industry
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: openslide-python>=1.2.0
Requires-Dist: Pillow>=9.0.0
Requires-Dist: numpy>=1.21.0
Requires-Dist: h5py>=3.8.0
Requires-Dist: opencv-python>=4.7.0
Requires-Dist: click>=8.0.0
Requires-Dist: torch>=2.0.0
Requires-Dist: omegaconf>=2.3.0
Requires-Dist: hydra-core>=1.3.2
Requires-Dist: matplotlib>=3.5.0
Requires-Dist: tqdm>=4.66.0
Requires-Dist: torchvision>=0.15.0
Requires-Dist: timm>=0.9.0
Requires-Dist: huggingface-hub>=0.23.0
Requires-Dist: gdown>=5.2.0
Requires-Dist: transformers>=4.41.0
Requires-Dist: sentencepiece>=0.2.0
Requires-Dist: open-clip-torch>=2.24.0
Requires-Dist: fairscale>=0.4.0
Requires-Dist: einops>=0.8.0
Requires-Dist: einops-exts>=0.0.4
Provides-Extra: dev
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: pre-commit>=3.0.0; extra == "dev"
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Dynamic: license-file

<p align="center">
  <img src="https://raw.githubusercontent.com/AtlasAnalyticsLab/AtlasPatch/main/assets/images/Logo.png" alt="AtlasPatch Logo" width="100%">
</p>

# AtlasPatch: An Efficient and Scalable Tool for Whole Slide Image Preprocessing in Computational Pathology

<p align="center">
  <a href="https://pypi.org/project/atlas-patch/"><img alt="PyPI" src="https://img.shields.io/pypi/v/atlas-patch"></a>
  <a href="https://pypi.org/project/atlas-patch/"><img alt="Python" src="https://img.shields.io/pypi/pyversions/atlas-patch"></a>
  <a href="LICENSE"><img alt="License" src="https://img.shields.io/badge/License-CC--BY--NC--SA--4.0-blue"></a>
</p>

<!-- TODO: Update paper link (XXXX.XXXXX) once published on arXiv -->
<p align="center">
  <a href="https://atlasanalyticslab.github.io/AtlasPatch/"><b>Project Page</b></a> |
  <a href="https://arxiv.org/abs/2602.03998"><b>Paper</b></a> |
  <a href="https://huggingface.co/AtlasAnalyticsLab/AtlasPatch"><b>Hugging Face</b></a> |
  <a href="https://github.com/AtlasAnalyticsLab/AtlasPatch"><b>GitHub</b></a>
</p>

## Table of Contents
- [Installation](#installation)
  - [Quick Install (Recommended)](#quick-install-recommended)
  - [OpenSlide Prerequisites](#openslide-prerequisites)
  - [Optional Encoder Dependencies](#optional-encoder-dependencies)
  - [Alternative Installation Methods](#alternative-installation-methods)
- [Usage Guide](#usage-guide)
  - [Pipeline Checkpoints](#pipeline-checkpoints)
    - [A - Tissue Detection](#a-tissue-detection)
    - [B - Patch Coordinate Extraction](#b-patch-coordinate-extraction)
    - [C - Patch Embedding](#c-patch-embedding)
    - [D - Patch Writing](#d-patch-writing)
  - [Visualization Samples](#visualization-samples)
  - [Process Command Arguments](#process-command-arguments)
    - [Required](#required)
    - [Optional](#optional)
      - [Patch Layout](#patch-layout)
      - [Segmentation & Extraction Performance](#segmentation--extraction-performance)
      - [Feature Extraction](#feature-extraction)
      - [Filtering & Quality](#filtering--quality)
      - [Visualization](#visualization)
      - [Run Control](#run-control)
- [Supported Formats](#supported-formats)
- [Using Extracted Data](#using-extracted-data)
  - [Patch Coordinates](#patch-coordinates)
  - [Feature Matrices](#feature-matrices)
- [Available Feature Extractors](#available-feature-extractors)
  - [Core vision backbones on Natural Images](#core-vision-backbones-on-natural-images)
  - [Medical- and Pathology-Specific Vision Encoders](#medical--and-pathology-specific-vision-encoders)
  - [CLIP-like models](#clip-like-models)
    - [Natural Images](#natural-images)
    - [Medical- and Pathology-Specific CLIP](#medical--and-pathology-specific-clip)
- [Bring Your Own Encoder](#bring-your-own-encoder)
- [SLURM job scripts](#slurm-job-scripts)
- [Frequently Asked Questions (FAQ)](#frequently-asked-questions-faq)
- [Feedback](#feedback)
- [Citation](#citation)
- [License](#license)
- [Future Updates](#future-updates)
  - [Slide Encoders](#slide-encoders)

## Installation

### Quick Install (Recommended)

```bash
# Install AtlasPatch
pip install atlas-patch

# Install SAM2 (required for tissue segmentation)
pip install git+https://github.com/facebookresearch/sam2.git
```

> **Note:** AtlasPatch requires the OpenSlide system library for WSI processing. See [OpenSlide Prerequisites](#openslide-prerequisites) below.

### OpenSlide Prerequisites

Before installing AtlasPatch, you need the OpenSlide system library:

- **Using Conda (Recommended)**:
  ```bash
  conda install -c conda-forge openslide
  ```

- **Ubuntu/Debian**:
  ```bash
  sudo apt-get install openslide-tools
  ```

- **macOS**:
  ```bash
  brew install openslide
  ```

- **Other systems**: Visit [OpenSlide Documentation](https://openslide.org/)

### Optional Encoder Dependencies

Some feature extractors require additional dependencies that must be installed separately:

```bash
# For CONCH encoder (conch_v1, conch_v15)
pip install git+https://github.com/Mahmoodlab/CONCH.git

# For MUSK encoder
pip install git+https://github.com/lilab-stanford/MUSK.git
```

These are only needed if you plan to use those specific encoders.

### Alternative Installation Methods

<details>
<summary><b>Using Conda Environment</b></summary>

```bash
# Create and activate environment
conda create -n atlas_patch python=3.10
conda activate atlas_patch

# Install OpenSlide
conda install -c conda-forge openslide

# Install AtlasPatch and SAM2
pip install atlas-patch
pip install git+https://github.com/facebookresearch/sam2.git
```
</details>

<details>
<summary><b>Using uv (faster installs)</b></summary>

```bash
# Install uv (see https://docs.astral.sh/uv/getting-started/)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create and activate environment
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install AtlasPatch and SAM2
uv pip install atlas-patch
uv pip install git+https://github.com/facebookresearch/sam2.git
```
</details>


## Usage Guide

AtlasPatch provides a flexible pipeline with **4 checkpoints** that you can use independently or combine based on your needs.

### Pipeline Checkpoints

<p align="center">
  <img src="https://raw.githubusercontent.com/AtlasAnalyticsLab/AtlasPatch/main/assets/images/Checkouts.png" alt="AtlasPatch Pipeline Checkpoints" width="100%">
</p>

Quick overview of the checkpoint commands:
- `detect-tissue`: runs SAM2 segmentation and writes mask overlays under `<output>/visualization/`.
- `segment-and-get-coords`: runs segmentation + patch coordinate extraction into `<output>/patches/<stem>.h5`.
- `process`: full pipeline (segmentation + coords + feature embeddings) in the same H5.
- `segment-and-get-coords --save-images`: same as `segment-and-get-coords`, plus patch PNGs under `<output>/images/<stem>/`.

---

#### [A] Tissue Detection

Detect and visualize tissue regions in your WSI using SAM2 segmentation.

```bash
atlaspatch detect-tissue /path/to/slide.svs \
  --output ./output \
  --device cuda
```

---

#### [B] Patch Coordinate Extraction

Detect tissue and extract patch coordinates without feature embedding.

```bash
atlaspatch segment-and-get-coords /path/to/slide.svs \
  --output ./output \
  --patch-size 256 \
  --target-mag 20 \
  --device cuda
```

---

#### [C] Patch Embedding

Run the full pipeline: tissue detection, coordinate extraction, and feature embedding.

```bash
atlaspatch process /path/to/slide.svs \
  --output ./output \
  --patch-size 256 \
  --target-mag 20 \
  --feature-extractors resnet50 \
  --device cuda
```

---

#### [D] Patch Writing

Tissue segmentation and coordinate extraction with patch image export, for visualization or downstream tasks.

```bash
atlaspatch segment-and-get-coords /path/to/slide.svs \
  --output ./output \
  --patch-size 256 \
  --target-mag 20 \
  --device cuda \
  --save-images
```

---

Pass a directory instead of a single file to process multiple WSIs; each slide's output lands in `<output>/patches/<stem>.h5`, where `<output>` is the directory passed to `--output`.
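
As a quick sanity check after a batch run, the sketch below iterates over the per-slide H5 files and reports how many patch coordinates (and which feature sets, if any) each contains. It only assumes `h5py` and a hypothetical `./output` directory; adjust the path to your run.

```python
from pathlib import Path

import h5py

output_root = Path("./output")  # hypothetical --output directory

for h5_path in sorted((output_root / "patches").glob("*.h5")):
    with h5py.File(h5_path, "r") as f:
        n_patches = f["coords"].shape[0]  # one row per extracted patch
        # The "features" group only exists if embeddings were computed (e.g., via `process`).
        feature_sets = list(f["features"].keys()) if "features" in f else []
    print(f"{h5_path.stem}: {n_patches} patches, features: {feature_sets}")
```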

### Visualization Samples

Below are some examples of the output masks and overlays (original image, predicted mask, overlay, contours, grid).

<p align="center">
  <img src="https://raw.githubusercontent.com/AtlasAnalyticsLab/AtlasPatch/main/assets/images/VisualizationSamples.png" alt="AtlasPatch visualization samples" width="100%">
</p>

The figure below presents a quantitative and qualitative comparison of AtlasPatch tissue detection against existing slide-preprocessing tools.

<p align="center">
  <img src="https://raw.githubusercontent.com/AtlasAnalyticsLab/AtlasPatch/main/assets/images/Comparisons.jpg" alt="AtlasPatch method comparison" width="100%">
</p>

Representative WSI thumbnails spanning diverse tissue types and artifact conditions are shown, with tissue masks predicted by thresholding methods (TIAToolbox, CLAM) and deep learning methods (a pretrained, non-finetuned SAM2 model, Trident-QC, Trident-Hest, and AtlasPatch), highlighting differences in boundary fidelity, artifact suppression, and handling of fragmented tissue. Tissue detection performance on the held-out test set is also reported for AtlasPatch and baseline pipelines, showing that AtlasPatch matches or exceeds their segmentation quality. Finally, the segmentation complexity–performance trade-off, plotting F1-score against segmentation runtime on a random set of 100 WSIs, shows that AtlasPatch achieves high performance with substantially lower wall-clock time than tile-wise detectors and heuristic pipelines, underscoring its suitability for large-scale WSI preprocessing.

### Process Command Arguments

The `process` command is the primary entry point for most workflows. It runs the full pipeline: tissue segmentation, patch coordinate extraction, and feature embedding. You can process a single slide or an entire directory of WSIs in one command.

```bash
atlaspatch process <WSI_PATH> --output <DIR> --patch-size <INT> --target-mag <INT> --feature-extractors <NAMES> [OPTIONS]
```

#### Required

| Argument | Description |
| --- | --- |
| `WSI_PATH` | Path to a single slide file or a directory containing slides. When a directory is provided, all supported formats are processed. |
| `--output`, `-o` | Root directory for results. Outputs are organized as `<output>/patches/<stem>.h5` for coordinates and features, and `<output>/visualization/` for overlays. |
| `--patch-size` | Final patch size in pixels at the target magnification (e.g., `256` for 256×256 patches). |
| `--target-mag` | Magnification level to extract patches at. Common values: `5`, `10`, `20`, `40`. The pipeline reads from the closest available pyramid level and resizes if needed. |
| `--feature-extractors` | Comma- or space-separated list of encoder names from [Available Feature Extractors](#available-feature-extractors). Multiple encoders can be specified to extract several feature sets in one pass (e.g., `resnet50,uni_v2`). |

#### Optional

##### Patch Layout

| Argument | Default | Description |
| --- | --- | --- |
| `--step-size` | Same as `--patch-size` | Stride between patches. Omit for non-overlapping grids. Use smaller values (e.g., `128` with `--patch-size 256`) to create 50% overlap. |

##### Segmentation & Extraction Performance

| Argument | Default | Description |
| --- | --- | --- |
| `--device` | `cuda` | Device for SAM2 tissue segmentation. Options: `cuda`, `cuda:0`, `cuda:1`, or `cpu`. |
| `--seg-batch-size` | `1` | Batch size for SAM2 thumbnail segmentation. Increase for faster processing if GPU memory allows. |
| `--patch-workers` | CPU count | Number of threads for patch extraction and H5 writes. |
| `--max-open-slides` | `200` | Maximum number of WSI files open simultaneously. Lower this if you hit file descriptor limits. |

##### Feature Extraction

| Argument | Default | Description |
| --- | --- | --- |
| `--feature-device` | Same as `--device` | Device for feature extraction. Set separately to use a different GPU than segmentation. |
| `--feature-batch-size` | `32` | Batch size for the feature extractor forward pass. Increase for faster throughput; decrease if running out of GPU memory. |
| `--feature-num-workers` | `4` | Number of DataLoader workers for loading patches during feature extraction. |
| `--feature-precision` | `float16` | Precision for feature extraction. Options: `float32`, `float16`, `bfloat16`. Lower precision reduces memory and can improve throughput on compatible GPUs. |

##### Filtering & Quality

| Argument | Default | Description |
| --- | --- | --- |
| `--fast-mode` | Enabled | Skips per-patch black/white content filtering for faster processing. Use `--no-fast-mode` to enable filtering. |
| `--tissue-thresh` | `0.0` | Minimum tissue area fraction to keep a region. Filters out tiny tissue fragments. |
| `--white-thresh` | `15` | Saturation threshold for white patch filtering (only with `--no-fast-mode`). Lower values are stricter. |
| `--black-thresh` | `50` | RGB threshold for black/dark patch filtering (only with `--no-fast-mode`). Higher values are stricter. |

##### Visualization

| Argument | Default | Description |
| --- | --- | --- |
| `--visualize-grids` | Off | Render patch grid overlay on slide thumbnails. |
| `--visualize-mask` | Off | Render tissue segmentation mask overlay. |
| `--visualize-contours` | Off | Render tissue contour overlay. |

All visualization outputs are saved under `<output>/visualization/`.

##### Run Control

| Argument | Default | Description |
| --- | --- | --- |
| `--save-images` | Off | Export each patch as a PNG file under `<output>/images/<stem>/`. |
| `--recursive` | Off | Walk subdirectories when `WSI_PATH` is a directory. |
| `--mpp-csv` | None | Path to a CSV file with `wsi,mpp` columns to override microns-per-pixel when slide metadata is missing or incorrect. |
| `--skip-existing` | Off | Skip slides that already have an output H5 file. |
| `--force` | Off | Overwrite existing output files. |
| `--verbose`, `-v` | Off | Enable debug logging and disable the progress bar. |
| `--write-batch` | `8192` | Number of coordinate rows to buffer before flushing to H5. Tune for RAM vs. I/O trade-off. |

## Supported Formats

AtlasPatch uses OpenSlide for WSIs and Pillow for standard images:

- WSIs: `.svs`, `.tif`, `.tiff`, `.ndpi`, `.vms`, `.vmu`, `.scn`, `.mrxs`, `.bif`, `.dcm`
- Images: `.png`, `.jpg`, `.jpeg`, `.bmp`, `.webp`, `.gif`

## Using Extracted Data

`atlaspatch process` writes one HDF5 per slide under `<output>/patches/<stem>.h5` containing coordinates and feature matrices. Coordinates and features share row order.

### Patch Coordinates

- Dataset: `coords` (int32, shape `(N, 5)`) with columns `(x, y, read_w, read_h, level)`.
- `x` and `y` are level-0 pixel coordinates. `read_w`, `read_h`, and `level` describe how the patch was read from the WSI.
- The level-0 footprint of each patch is stored as the `patch_size_level0` file attribute; some slide encoders use it for positional encoding (e.g., ALiBi in TITAN).

Example:

```python
import h5py
import numpy as np
import openslide
from PIL import Image

h5_path = "output/patches/sample.h5"
wsi_path = "/path/to/slide.svs"

with h5py.File(h5_path, "r") as f:
    coords = f["coords"][...]  # (N, 5) int32: [x, y, read_w, read_h, level]
    patch_size = int(f.attrs["patch_size"])

with openslide.OpenSlide(wsi_path) as wsi:
    for x, y, read_w, read_h, level in coords:
        img = wsi.read_region(
            (int(x), int(y)),
            int(level),
            (int(read_w), int(read_h)),
        ).convert("RGB")
        if img.size != (patch_size, patch_size):
            # Some slides don't have a pyramid level that matches target magnification exactly, so they have to be resized.
            img = img.resize((patch_size, patch_size), resample=Image.BILINEAR)
        patch = np.array(img)  # (H, W, 3) uint8
```

### Feature Matrices

- Group: `features/` inside the same HDF5.
- Each extractor is stored as `features/<name>` (float32, shape `(N, D)`), aligned row-for-row with `coords`.
- List available feature sets with `list(f['features'].keys())`.

```python
import h5py

with h5py.File("output/patches/sample.h5", "r") as f:
    feat_names = list(f["features"].keys())
    resnet50_feats = f["features/resnet50"][...]  # (N, 2048) float32
```
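
Because `coords` and every `features/<name>` matrix share row order, an index or boolean mask computed on one can be applied directly to the other. A minimal sketch (the selection criterion below is illustrative, not part of AtlasPatch):

```python
import h5py
import numpy as np

with h5py.File("output/patches/sample.h5", "r") as f:
    coords = f["coords"][...]            # (N, 5) int32
    feats = f["features/resnet50"][...]  # (N, 2048) float32, row-aligned with coords

# Illustrative selection: keep the 100 patches closest to the slide-level mean embedding.
center = feats.mean(axis=0)
dists = np.linalg.norm(feats - center, axis=1)
keep = np.argsort(dists)[:100]

selected_coords = coords[keep]  # the same rows are selected in both arrays
selected_feats = feats[keep]
print(selected_coords.shape, selected_feats.shape)
```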

## Available Feature Extractors

### Core vision backbones on Natural Images

| Name | Output Dim |
| --- | --- |
| `resnet18` | 512 |
| `resnet34` | 512 |
| `resnet50` | 2048 |
| `resnet101` | 2048 |
| `resnet152` | 2048 |
| `convnext_tiny` | 768 |
| `convnext_small` | 768 |
| `convnext_base` | 1024 |
| `convnext_large` | 1536 |
| `vit_b_16` | 768 |
| `vit_b_32` | 768 |
| `vit_l_16` | 1024 |
| `vit_l_32` | 1024 |
| `vit_h_14` | 1280 |
| [`dinov2_small`](https://huggingface.co/facebook/dinov2-small) ([DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193)) | 384 |
| [`dinov2_base`](https://huggingface.co/facebook/dinov2-base) ([DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193)) | 768 |
| [`dinov2_large`](https://huggingface.co/facebook/dinov2-large) ([DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193)) | 1024 |
| [`dinov2_giant`](https://huggingface.co/facebook/dinov2-giant) ([DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193)) | 1536 |
| [`dinov3_vits16`](https://huggingface.co/facebook/dinov3-vits16-pretrain-lvd1689m) ([DINOv3](https://arxiv.org/abs/2508.10104)) | 384 |
| [`dinov3_vits16_plus`](https://huggingface.co/facebook/dinov3-vits16plus-pretrain-lvd1689m) ([DINOv3](https://arxiv.org/abs/2508.10104)) | 384 |
| [`dinov3_vitb16`](https://huggingface.co/facebook/dinov3-vitb16-pretrain-lvd1689m) ([DINOv3](https://arxiv.org/abs/2508.10104)) | 768 |
| [`dinov3_vitl16`](https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m) ([DINOv3](https://arxiv.org/abs/2508.10104)) | 1024 |
| [`dinov3_vitl16_sat`](https://huggingface.co/facebook/dinov3-vitl16-pretrain-sat493m) ([DINOv3](https://arxiv.org/abs/2508.10104)) | 1024 |
| [`dinov3_vith16_plus`](https://huggingface.co/facebook/dinov3-vith16plus-pretrain-lvd1689m) ([DINOv3](https://arxiv.org/abs/2508.10104)) | 1280 |
| [`dinov3_vit7b16`](https://huggingface.co/facebook/dinov3-vit7b16-pretrain-lvd1689m) ([DINOv3](https://arxiv.org/abs/2508.10104)) | 4096 |
| [`dinov3_vit7b16_sat`](https://huggingface.co/facebook/dinov3-vit7b16-pretrain-sat493m) ([DINOv3](https://arxiv.org/abs/2508.10104)) | 4096 |

### Medical- and Pathology-Specific Vision Encoders

| Name | Output Dim |
| --- | --- |
| [`uni_v1`](https://huggingface.co/MahmoodLab/UNI) ([Towards a General-Purpose Foundation Model for Computational Pathology](https://www.nature.com/articles/s41591-024-02857-3)) | 1024 |
| [`uni_v2`](https://huggingface.co/MahmoodLab/UNI2-h) ([Towards a General-Purpose Foundation Model for Computational Pathology](https://www.nature.com/articles/s41591-024-02857-3)) | 1536 |
| [`phikon_v1`](https://huggingface.co/owkin/phikon) ([Scaling Self-Supervised Learning for Histopathology with Masked Image Modeling](https://www.medrxiv.org/content/10.1101/2023.07.21.23292757v1)) | 768 |
| [`phikon_v2`](https://huggingface.co/owkin/phikon-v2) ([Phikon-v2, A large and public feature extractor for biomarker prediction](https://arxiv.org/abs/2409.09173)) | 1024 |
| [`virchow_v1`](https://huggingface.co/paige-ai/Virchow) ([Virchow: A Million-Slide Digital Pathology Foundation Model](https://arxiv.org/abs/2309.07778)) | 2560 |
| [`virchow_v2`](https://huggingface.co/paige-ai/Virchow2) ([Virchow2: Scaling Self-Supervised Mixed Magnification Models in Pathology](https://arxiv.org/abs/2408.00738)) | 2560 |
| [`prov_gigapath`](https://huggingface.co/prov-gigapath/prov-gigapath) ([A whole-slide foundation model for digital pathology from real-world data](https://www.nature.com/articles/s41586-024-07441-w)) | 1536 |
| [`chief-ctranspath`](https://github.com/hms-dbmi/CHIEF?tab=readme-ov-file) ([CHIEF: Clinical Histopathology Imaging Evaluation Foundation Model](https://www.nature.com/articles/s41586-024-07894-z)) | 768 |
| [`midnight`](https://huggingface.co/kaiko-ai/midnight) ([Training state-of-the-art pathology foundation models with orders of magnitude less data](https://arxiv.org/abs/2504.05186)) | 3072 |
| [`musk`](https://github.com/lilab-stanford/MUSK) ([MUSK: A Vision-Language Foundation Model for Precision Oncology](https://www.nature.com/articles/s41586-024-08378-w)) | 1024 |
| [`openmidnight`](https://sophontai.com/blog/openmidnight) ([How to Train a State-of-the-Art Pathology Foundation Model with $1.6k](https://sophontai.com/blog/openmidnight)) | 1536 |
| [`pathorchestra`](https://huggingface.co/AI4Pathology/PathOrchestra) ([PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks](https://arxiv.org/abs/2503.24345)) | 1024 |
| [`h_optimus_0`](https://huggingface.co/bioptimus/H-optimus-0) | 1536 |
| [`h_optimus_1`](https://huggingface.co/bioptimus/H-optimus-1) | 1536 |
| [`h0_mini`](https://huggingface.co/bioptimus/H0-mini) ([Distilling foundation models for robust and efficient models in digital pathology](https://doi.org/10.48550/arXiv.2501.16239)) | 1536 |
| [`conch_v1`](https://huggingface.co/MahmoodLab/CONCH) ([A visual-language foundation model for computational pathology](https://www.nature.com/articles/s41591-024-02856-4)) | 512 |
| [`conch_v15`](https://huggingface.co/MahmoodLab/conchv1_5) - [From TITAN](https://huggingface.co/MahmoodLab/TITAN) ([A multimodal whole-slide foundation model for pathology](https://www.nature.com/articles/s41591-025-03982-3)) | 768 |
| [`hibou_b`](https://huggingface.co/histai/hibou-B) ([Hibou: A Family of Foundational Vision Transformers for Pathology](https://arxiv.org/abs/2406.05074)) | 768 |
| [`hibou_l`](https://huggingface.co/histai/hibou-L) ([Hibou: A Family of Foundational Vision Transformers for Pathology](https://arxiv.org/abs/2406.05074)) | 1024 |
| [`lunit_resnet50_bt`](https://huggingface.co/1aurent/resnet50.lunit_bt) ([Benchmarking Self-Supervised Learning on Diverse Pathology Datasets](https://openaccess.thecvf.com/content/CVPR2023/papers/Kang_Benchmarking_Self-Supervised_Learning_on_Diverse_Pathology_Datasets_CVPR_2023_paper.pdf)) | 2048 |
| [`lunit_resnet50_swav`](https://huggingface.co/1aurent/resnet50.lunit_swav) ([Benchmarking Self-Supervised Learning on Diverse Pathology Datasets](https://openaccess.thecvf.com/content/CVPR2023/papers/Kang_Benchmarking_Self-Supervised_Learning_on_Diverse_Pathology_Datasets_CVPR_2023_paper.pdf)) | 2048 |
| [`lunit_resnet50_mocov2`](https://huggingface.co/1aurent/resnet50.lunit_mocov2) ([Benchmarking Self-Supervised Learning on Diverse Pathology Datasets](https://openaccess.thecvf.com/content/CVPR2023/papers/Kang_Benchmarking_Self-Supervised_Learning_on_Diverse_Pathology_Datasets_CVPR_2023_paper.pdf)) | 2048 |
| [`lunit_vit_small_patch16_dino`](https://huggingface.co/1aurent/vit_small_patch16_224.lunit_dino) ([Benchmarking Self-Supervised Learning on Diverse Pathology Datasets](https://openaccess.thecvf.com/content/CVPR2023/papers/Kang_Benchmarking_Self-Supervised_Learning_on_Diverse_Pathology_Datasets_CVPR_2023_paper.pdf)) | 384 |
| [`lunit_vit_small_patch8_dino`](https://huggingface.co/1aurent/vit_small_patch8_224.lunit_dino) ([Benchmarking Self-Supervised Learning on Diverse Pathology Datasets](https://openaccess.thecvf.com/content/CVPR2023/papers/Kang_Benchmarking_Self-Supervised_Learning_on_Diverse_Pathology_Datasets_CVPR_2023_paper.pdf)) | 384 |

> **Note:** Some encoders (e.g., `uni_v1`) require access approval from Hugging Face. To use these models:
> 1. Request access on the respective Hugging Face model page
> 2. Once approved, set your Hugging Face token as an environment variable:
>    ```bash
>    export HF_TOKEN=your_huggingface_token
>    ```
> 3. Then you can use the encoder in your commands

### CLIP-like models

#### Natural Images

| Name | Output Dim |
| --- | --- |
| `clip_rn50` ([Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)) | 1024 |
| `clip_rn101` ([Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)) | 512 |
| `clip_rn50x4` ([Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)) | 640 |
| `clip_rn50x16` ([Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)) | 768 |
| `clip_rn50x64` ([Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)) | 1024 |
| `clip_vit_b_32` ([Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)) | 512 |
| `clip_vit_b_16` ([Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)) | 512 |
| `clip_vit_l_14` ([Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)) | 768 |
| `clip_vit_l_14_336` ([Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)) | 768 |

#### Medical- and Pathology-Specific CLIP

| Name | Output Dim |
| --- | --- |
| [`plip`](https://github.com/PathologyFoundation/plip) ([Pathology Language and Image Pre-Training (PLIP)](https://www.nature.com/articles/s41591-023-02504-3)) | 512 |
| [`medsiglip`](https://huggingface.co/google/medsiglip-448) ([MedGemma Technical Report](https://arxiv.org/abs/2507.05201)) | 1152 |
| [`quilt_b_32`](https://quilt1m.github.io/) ([Quilt-1M: One Million Image-Text Pairs for Histopathology](https://arxiv.org/pdf/2306.11207)) | 512 |
| [`quilt_b_16`](https://quilt1m.github.io/) ([Quilt-1M: One Million Image-Text Pairs for Histopathology](https://arxiv.org/pdf/2306.11207)) | 512 |
| [`quilt_b_16_pmb`](https://quilt1m.github.io/) ([Quilt-1M: One Million Image-Text Pairs for Histopathology](https://arxiv.org/pdf/2306.11207)) | 512 |
| [`biomedclip`](https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224) ([BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs](https://aka.ms/biomedclip-paper)) | 512 |
| [`omiclip`](https://huggingface.co/WangGuangyuLab/Loki) ([A visual-omics foundation model to bridge histopathology with spatial transcriptomics](https://www.nature.com/articles/s41592-025-02707-1)) | 768 |

## Bring Your Own Encoder

Add a custom encoder without modifying AtlasPatch by writing a small plugin and pointing the CLI at it with `--feature-plugin /path/to/plugin.py`. The plugin must expose a `register_feature_extractors(registry, device, dtype, num_workers)` function; inside that hook, call `register_custom_encoder` with a loader that knows how to build the model and run a forward pass.

```python
import torch
from torchvision import transforms
from atlas_patch.models.patch.custom import CustomEncoderComponents, register_custom_encoder


def build_my_encoder(device: torch.device, dtype: torch.dtype) -> CustomEncoderComponents:
    """
    Build the components used by AtlasPatch to embed patches with a custom model.

    Returns:
        CustomEncoderComponents describing the model, preprocess transform, and forward pass.
    """
    model = ...  # your torch.nn.Module
    model = model.to(device=device, dtype=dtype).eval()
    preprocess = transforms.Compose([transforms.Resize(224), transforms.ToTensor()])

    def forward(batch: torch.Tensor) -> torch.Tensor:
        return model(batch)  # must return [batch, embedding_dim]

    return CustomEncoderComponents(model=model, preprocess=preprocess, forward_fn=forward)


def register_feature_extractors(registry, device, dtype, num_workers):
    register_custom_encoder(
        registry=registry,
        name="my_encoder",
        embedding_dim=512,
        loader=build_my_encoder,
        device=device,
        dtype=dtype,
        num_workers=num_workers,
    )
```

Run AtlasPatch with `--feature-plugin /path/to/plugin.py --feature-extractors my_encoder` to benchmark your encoder alongside the built-ins; multiple plugins and extractors can be supplied at once. Outputs keep the same HDF5 layout: your custom embeddings live under `features/my_encoder` (row-aligned with `coords`) next to the other extractors.

## SLURM job scripts

We prepared ready-to-run SLURM templates under `jobs/`:

- Patch extraction (SAM2 + H5/PNG): `jobs/atlaspatch_patch.slurm.sh`. Edits to make:
  - Set `WSI_ROOT`, `OUTPUT_ROOT`, `PATCH_SIZE`, `TARGET_MAG`, `SEG_BATCH`.
  - Ensure `--cpus-per-task` matches the number of CPUs you want; the script passes `--patch-workers ${SLURM_CPUS_PER_TASK}` and caps `--max-open-slides` at 200.
  - `--fast-mode` is on by default; append `--no-fast-mode` to enable content filtering.
  - Submit with `sbatch jobs/atlaspatch_patch.slurm.sh`.
- Feature embedding (adds features into existing H5 files): `jobs/atlaspatch_features.slurm.sh`. Edits to make:
  - Set `WSI_ROOT`, `OUTPUT_ROOT`, `PATCH_SIZE`, and `TARGET_MAG`.
  - Configure `FEATURES` (comma/space list, multiple extractors are supported), `FEATURE_DEVICE`, `FEATURE_BATCH`, `FEATURE_WORKERS`, and `FEATURE_PRECISION`.
  - This script is intended for feature extraction; use the patch script when you need segmentation + coordinates, and run the feature script to embed one or more models into those H5 files.
  - Submit with `sbatch jobs/atlaspatch_features.slurm.sh`.
- Running multiple jobs: you can submit several jobs in a loop (e.g., 50 jobs using `for i in {1..50}; do sbatch jobs/atlaspatch_features.slurm.sh; done`). AtlasPatch uses per-slide lock files to avoid overlapping work on the same slide.

## Frequently Asked Questions (FAQ)

<details>
<summary><b>I'm facing an out of memory (OOM) error</b></summary>

This usually happens when too many WSI files are open simultaneously. Try reducing the `--max-open-slides` parameter:

```bash
atlaspatch process /path/to/slides --output ./output --max-open-slides 50
```

The default is 200. Lower this value if you're processing many large slides or have limited system memory.
</details>

<details>
<summary><b>I'm getting a CUDA out of memory error</b></summary>

Try one or more of the following:

1. **Reduce feature extraction batch size**:
   ```bash
   --feature-batch-size 16  # Default is 32
   ```

2. **Reduce segmentation batch size**:
   ```bash
   --seg-batch-size 1  # Default is 1
   ```

3. **Use lower precision**:
   ```bash
   --feature-precision float16  # or bfloat16
   ```

4. **Use a smaller patch size**:
   ```bash
   --patch-size 224  # Instead of 256
   ```
</details>

<details>
<summary><b>OpenSlide library not found</b></summary>

AtlasPatch requires the OpenSlide system library. Install it based on your system:

- **Conda**: `conda install -c conda-forge openslide`
- **Ubuntu/Debian**: `sudo apt-get install openslide-tools`
- **macOS**: `brew install openslide`

See [OpenSlide Prerequisites](#openslide-prerequisites) for more details.
</details>

<details>
<summary><b>Access denied for gated models (UNI, Virchow, etc.)</b></summary>

Some encoders require Hugging Face access approval:

1. Request access on the model's Hugging Face page (e.g., [UNI](https://huggingface.co/MahmoodLab/UNI))
2. Once approved, set your token:
   ```bash
   export HF_TOKEN=your_huggingface_token
   ```
3. Run AtlasPatch again
</details>

<details>
<summary><b>Missing microns-per-pixel (MPP) metadata</b></summary>

Some slides lack MPP metadata. You can provide it via a CSV file:

```bash
atlaspatch process /path/to/slides --output ./output --mpp-csv /path/to/mpp.csv
```

The CSV should have columns `wsi` (filename) and `mpp` (microns per pixel value).
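
A minimal sketch for generating such a file with Python's `csv` module (the slide filenames and MPP values below are hypothetical):

```python
import csv

# Hypothetical overrides: slide filename plus its microns-per-pixel value.
overrides = [
    ("slide_001.svs", 0.25),  # typical ~40x scan
    ("slide_002.svs", 0.50),  # typical ~20x scan
]

with open("mpp.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["wsi", "mpp"])  # header columns expected by --mpp-csv
    writer.writerows(overrides)
```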
</details>

<details>
<summary><b>Processing is slow</b></summary>

Try these optimizations:

1. **Enable fast mode** (skips content filtering, enabled by default):
   ```bash
   --fast-mode
   ```

2. **Increase parallel workers**:
   ```bash
   --patch-workers 16  # Match your CPU cores
   --feature-num-workers 8
   ```

3. **Increase batch sizes** (if GPU memory allows):
   ```bash
   --feature-batch-size 64
   --seg-batch-size 4
   ```

4. **Use multiple GPUs** by running separate jobs on different GPU devices.
</details>

<details>
<summary><b>My file format is not supported</b></summary>

AtlasPatch supports most common formats via OpenSlide and Pillow:
- **WSIs**: `.svs`, `.tif`, `.tiff`, `.ndpi`, `.vms`, `.vmu`, `.scn`, `.mrxs`, `.bif`, `.dcm`
- **Images**: `.png`, `.jpg`, `.jpeg`, `.bmp`, `.webp`, `.gif`

If your format isn't supported, consider converting it to a supported format or [open an issue](https://github.com/AtlasAnalyticsLab/AtlasPatch/issues/new?template=feature_request.md).
</details>

<details>
<summary><b>How do I skip already processed slides?</b></summary>

Use the `--skip-existing` flag to skip slides that already have an output H5 file:

```bash
atlaspatch process /path/to/slides --output ./output --skip-existing
```
</details>

---

Have a question not covered here? Feel free to [open an issue](https://github.com/AtlasAnalyticsLab/AtlasPatch/issues/new) and ask!

## Feedback

- Report problems via the [bug report template](https://github.com/AtlasAnalyticsLab/AtlasPatch/issues/new?template=bug_report.md) so we can reproduce and fix them quickly.
- Suggest enhancements through the [feature request template](https://github.com/AtlasAnalyticsLab/AtlasPatch/issues/new?template=feature_request.md) with your use case and proposal.
- When opening a PR, fill out the [pull request template](.github/pull_request_template.md) and run the listed checks (lint, format, type-check, tests).

## Citation

If you use AtlasPatch in your research, please cite our paper:

```bibtex
@article{atlaspatch2026,
  title   = {AtlasPatch: An Efficient and Scalable Tool for Whole Slide Image Preprocessing in Computational Pathology},
  author  = {Alagha, Ahmed and Leclerc, Christopher and Kotp, Yousef and Metwally, Omar and Moras, Calvin and Rentopoulos, Peter and Rostami, Ghodsiyeh and Nguyen, Bich Ngoc and Baig, Jumanah and Khellaf, Abdelhakim and Trinh, Vincent Quoc-Huy and Mizouni, Rabeb and Otrok, Hadi and Bentahar, Jamal and Hosseini, Mahdi S.},
  journal = {arXiv preprint arXiv:2602.03998},
  year    = {2026}
}
```

## License

AtlasPatch is released under CC-BY-NC-SA-4.0, which strictly disallows commercial use of the model weights or any derivative works. Commercialization includes selling the model, offering it as a paid service, using it inside commercial products, or distributing modified versions for commercial gain. Non-commercial research, experimentation, educational use, and use by academic or non-profit organizations are permitted under the license terms. If you need commercial rights, please contact the authors to obtain a separate commercial license. For the complete license text and detailed terms, see the [LICENSE](./LICENSE) file in this repository.

## Future Updates

### Slide Encoders
- We plan to add slide-level encoders (open for extension): TITAN, PRISM, GigaPath, Madeleine.
