Metadata-Version: 2.4
Name: pentachrome-plugin
Version: 0.4.1
Summary: Napari plugin for the Pentachrome histology pipeline: VSI extraction, nnUNet inference, statistics.
Author: Dimitrios Tsilis
License: MIT License
        
        Copyright (c) 2026 Dimitrios Tsilis
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
Project-URL: Homepage, https://github.com/dtsilis7/PentachromePipeline
Project-URL: Bug Tracker, https://github.com/dtsilis7/PentachromePipeline/issues
Keywords: napari,histology,nnunet,bioformats,segmentation
Classifier: Framework :: napari
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3.10
Classifier: Operating System :: Microsoft :: Windows
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: napari[all]>=0.4.18
Requires-Dist: qtpy
Requires-Dist: tifffile
Requires-Dist: numpy<2
Requires-Dist: opencv-python
Requires-Dist: scipy
Requires-Dist: scikit-image
Requires-Dist: skan
Requires-Dist: imagecodecs
Dynamic: license-file

# pentachrome-plugin

Napari plugin for the Pentachrome histology pipeline. Current widgets:

- **VSI to TIFF Extractor** (Phase 1) — extract tissue-region TIFFs from VSI files.
- **nnUNet Inference** (Phase 2) — run the trained Epithelium / MultiStructure models on selected TIFF layers and load colorized masks back into the viewer.
- **Mask Statistics** (Phase 3) — per-region statistics (thickness, composition, cell densities) computed on the inference output, with CSV export.

Source and issues: https://github.com/dtsilis7/PentachromePipeline

**Requirements:** Windows, Python 3.10, napari >= 0.4.18, and a Java JDK (JDK 17 confirmed). The Java/bioformats stack and the nnUNet model weights are installed separately (see below) — they can't come from a plain `pip install`.

## Install (Windows, PowerShell)

Requires a working Java JDK on PATH.

### Via napari plugin manager or pip

In napari, open **Plugins -> Install/Uninstall Plugins**, search "pentachrome", and click Install. Or from a shell:

```powershell
pip install pentachrome-plugin
```

This installs the Python package only; you still need the conda-forge Java/bioformats step and the nnUNet weights (below) for the extractor and inference to work.

### From source (development)

For working on the plugin itself, install editable from a checkout. `cd` into the plugin directory first, because `pip install -e .` resolves `.` relative to your current shell directory:

```powershell
conda activate napari
cd "...\pentachrome_plugin"
conda install -c conda-forge python-javabridge python-bioformats
pip install opencv-python tifffile
pip install -e .
```

If you would rather not `cd`, pass the absolute path explicitly:

```powershell
pip install -e "...\pentachrome_plugin"
```

Conda-forge ships pre-built binary packages for `python-javabridge` and `python-bioformats`, which avoids the MSVC + NumPy-2 compile failure that `pip install python-javabridge` hits today (the C extension references `_PyArray_Descr` fields removed in NumPy 2.0).

### Fallback: pure pip

Only use this if conda-forge is unavailable. NumPy must be pinned below 2 *before* javabridge builds, and build isolation must be off so the build sees the pinned NumPy:

```powershell
cd "...\pentachrome_plugin"
pip install "numpy<2"
pip install --no-build-isolation python-javabridge python-bioformats
pip install opencv-python tifffile
pip install -e .
```

## Launch

```powershell
conda activate napari   # or whichever env you installed into
python -m napari
```

In napari: **Plugins -> VSI to TIFF Extractor**, **Plugins -> nnUNet Inference**, or **Plugins -> Mask Statistics**.

## Model weights

The nnUNet weights (~900 MB) aren't bundled in the PyPI package. Download
[`nnunet_results.zip`](https://github.com/dtsilis7/PentachromePipeline/releases/download/weights-v1/nnunet_results.zip)
from the [releases page](https://github.com/dtsilis7/PentachromePipeline/releases),
unzip it, and point the inference widget's **nnUNet results** field at the
extracted folder (the one containing `Dataset001_Epithelium` and
`Dataset002_MultiStructure`).
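
Before pointing the widget at the extracted folder, a quick check that both dataset folders are present can save a failed run. The sketch below is not part of the plugin; the helper name `missing_datasets` is hypothetical, and only the two folder names come from this README.

```python
from pathlib import Path

# Folder names as listed in this README; the helper itself is hypothetical.
EXPECTED_DATASETS = ("Dataset001_Epithelium", "Dataset002_MultiStructure")


def missing_datasets(results_dir):
    """Return the expected dataset folders that are absent from results_dir."""
    root = Path(results_dir)
    return [name for name in EXPECTED_DATASETS if not (root / name).is_dir()]
```

An empty list means the folder looks like a usable **nnUNet results** path; anything else names what is missing (often a sign that the zip was extracted one level too deep or too shallow).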

## nnUNet Inference (Phase 2)

Requires `nnunetv2` installed in the same environment (the `nnUNetv2_predict` CLI must be on PATH).

```powershell
conda activate napari
pip install nnunetv2
```
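
To confirm the CLI is reachable from the environment napari runs in, a small standard-library check is enough. This is an illustrative sketch, not something the plugin ships; `find_cli` is a hypothetical helper name.

```python
import shutil


def find_cli(name="nnUNetv2_predict"):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_cli()
    if path is None:
        print("nnUNetv2_predict not found; activate the env where nnunetv2 is installed")
    else:
        print(f"nnUNetv2_predict found at {path}")
```

Run it with the same interpreter that launches napari, since PATH differs between conda environments.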

Workflow:

1. Load TIFFs into napari (e.g. via Phase 1's auto-load checkbox, or drag-and-drop).
2. Open **Plugins -> nnUNet Inference**.
3. Select one or more image layers in the list.
4. Tick **Epithelium**, **MultiStructure**, or both.
5. Set **Output folder** (where raw + colorized masks go) and **nnUNet results** (folder containing `Dataset001_Epithelium` and `Dataset002_MultiStructure`). The results path auto-fills if `nnUNet_Training/nnUNet_results/results` is found.
6. Pick **Device** (`cpu` or `cuda`) and click **Analyze**.

### Speed vs quality (important on CPU)

nnUNet inference on a laptop CPU is slow because every image goes through *folds × mirror augmentations × sliding-window patches* forward passes. With nnUNet's own defaults (all 5 folds, mirroring on) that is 20 forward passes for every patch. The widget exposes three knobs in the **Speed / quality** group:

| Knob | Default | What it does |
| --- | --- | --- |
| Epithelium folds | `Fold 0 only` | Use 1 of the 5 trained folds for Dataset001. All 5 ensembled is best quality but ~5x slower. Dataset002 only has fold 0 trained, so it's always 1 fold. |
| Disable test-time mirroring | on | Passes `--disable_tta`. Skips the 4 mirror augmentations the model normally averages over. ~4x faster, small accuracy hit. |
| Sliding-window step | `0.5` | Passes `-step_size`. Larger = fewer overlapping patches = faster but rougher tile borders. Try `0.7` for a middle ground. |

With all three defaults on a CPU laptop, one ROI tile should take a few minutes instead of 30+. Switch to `All 5 folds` and re-enable test-time mirroring once you've moved to a GPU box.
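
To get a feel for how the three knobs multiply, the following sketch estimates the number of forward passes for one image. It assumes the patch grid is laid out with `ceil((image - patch) / (patch * step_size)) + 1` positions per axis, which is how nnUNet's sliding window is understood to work here. The 512-pixel patch size is a hypothetical value, not one read from the trained plans, and the function names are illustrative.

```python
import math


def patches_per_image(image_shape, patch_shape, step_size=0.5):
    """Count sliding-window positions for one 2D image.

    Per axis: ceil((image - patch) / (patch * step_size)) + 1.
    """
    count = 1
    for image, patch in zip(image_shape, patch_shape):
        stride = patch * step_size
        count *= math.ceil(max(image - patch, 0) / stride) + 1
    return count


def forward_passes(image_shape, patch_shape, folds=1, mirroring=False, step_size=0.5):
    """Total forward passes = folds x mirror variants x patches."""
    mirror_variants = 4 if mirroring else 1  # 4 is the figure quoted in the table above
    return folds * mirror_variants * patches_per_image(image_shape, patch_shape, step_size)


if __name__ == "__main__":
    tile = (15000, 15000)
    patch = (512, 512)  # hypothetical patch size
    print(forward_passes(tile, patch))                            # widget defaults
    print(forward_passes(tile, patch, folds=5, mirroring=True))   # full ensemble + mirroring
```

Under these assumptions the widget defaults need 3364 passes for a 15000 x 15000 tile, while the full ensemble with mirroring needs 67280, which is the 20x gap described above.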

### Continuing from the extractor

The two widgets are linked through two small bridges, so you can run **Extract -> Analyze** in a single napari session without re-picking files:

- When the extractor auto-loads a TIFF as a viewer layer, it stashes the on-disk path on `layer.metadata['source_tiff']`. The inference widget reads that during staging and **copies the original file** into `_staging_input/` rather than re-saving the in-memory array — important for 15k x 15k tiles.
- When an extraction completes, the inference widget's **"Use last extractor output"** button pre-fills the output folder to `<extractor_output_root>/_inference`, so masks land next to the per-VSI subfolders the extractor created.

Both bridges are in-process only (see `_session.py`); they reset when napari closes.
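
The first bridge can be pictured roughly as follows. This is a minimal sketch of the idea rather than the plugin's actual staging code: `stage_layer` is a hypothetical name, and the only details taken from this README are the `source_tiff` metadata key, the `_staging_input/` folder, and the `_0000.tif` suffix.

```python
import shutil
from pathlib import Path


def stage_layer(layer_name, metadata, staging_dir):
    """Copy a layer's on-disk TIFF into staging_dir under an nnUNet-style name.

    Returns the staged path, or None when the layer has no usable source file
    (a caller would then have to save the in-memory array instead).
    """
    source = metadata.get("source_tiff")
    if not source or not Path(source).is_file():
        return None
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    target = staging_dir / f"{layer_name}_0000.tif"
    shutil.copy2(source, target)  # byte-for-byte copy; pixel data is never decoded
    return target
```

Copying the file avoids decoding and re-encoding a very large array, which is the point of the bridge.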

Outputs land in:

```
<output_folder>/
  _staging_input/            # nnUNet-named (_0000.tif) copies of the selected layers
  epithelium_raw/            # binary masks from Dataset001
  epithelium_colored/        # RGB colorized masks (red epithelium)
  multistructure_raw/        # 6-class masks from Dataset002
  multistructure_colored/    # RGB colorized masks (Elastin/Collagen/Nuclei/Mucins/Membrane/Goblets)
```

Colorized masks are added to the viewer as RGB image layers when the run finishes.

### nnUNet inference architecture

Same subprocess pattern as Phase 1. The widget never imports torch or nnUNetv2 directly; it spawns `_inference_worker.py` which:

- sets `nnUNet_results` to the configured results dir,
- calls `nnUNetv2_predict` once per enabled model (fold 0 or folds 0-4 for Epithelium, depending on the **Epithelium folds** setting; fold 0 for MultiStructure; the all-folds case matches `run_inference.py`),
- colorizes the resulting integer masks with the palettes from `colorize_masks.py` / `compare_grid.py`,
- streams JSON-line events on stdout for the widget's progress bar and log.
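
As a rough illustration of the first two steps, the sketch below assembles the command line and environment for one `nnUNetv2_predict` run without executing it. The function name is hypothetical, and the `2d` configuration is an assumption; check which configuration the shipped models were trained with. The flags themselves (`-i`, `-o`, `-d`, `-c`, `-f`, `-step_size`, `-device`, `--disable_tta`) are standard `nnUNetv2_predict` options.

```python
import os


def build_predict_call(input_dir, output_dir, results_dir, dataset_id,
                       folds=(0,), device="cpu", disable_tta=True,
                       step_size=0.5, configuration="2d"):
    """Return (argv, env) for one nnUNetv2_predict run; nothing is executed."""
    argv = [
        "nnUNetv2_predict",
        "-i", str(input_dir),
        "-o", str(output_dir),
        "-d", str(dataset_id),
        "-c", configuration,  # assumption: models were trained with the "2d" configuration
        "-f", *[str(fold) for fold in folds],
        "-step_size", str(step_size),
        "-device", device,
    ]
    if disable_tta:
        argv.append("--disable_tta")
    # nnUNet locates trained models through the nnUNet_results environment variable.
    env = dict(os.environ, nnUNet_results=str(results_dir))
    return argv, env
```

The returned pair can be handed to `subprocess.run(argv, env=env, check=True)`; building it separately makes the exact command easy to log or test.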

## How it works

- The widget itself never touches the JVM. When you click **Extract ROIs**, it spawns `_vsi_worker.py` as a separate Python process.
- That worker process starts the bioformats JVM, loops over the VSI files using `TileMaskStitcher` (reused from `VSI_Handler/tile_mask_stitcher.py`), writes numbered TIFFs into `<output_root>/<vsi_basename>/`, and emits JSON-line progress events on stdout.
- The widget streams those events on a background thread and updates the progress bar / log without blocking the UI.
- When the worker exits, the JVM dies with it. The next extraction batch starts a fresh JVM in a fresh process; this avoids the "JVM cannot be restarted" pitfall during a long napari session.
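
The subprocess-plus-JSON-lines pattern described above can be sketched with the standard library alone. The example uses a tiny stand-in worker instead of `_vsi_worker.py`, and the event fields (`event`, `done`, `total`) are hypothetical; the real worker's schema may differ.

```python
import json
import subprocess
import sys
import threading

# Stand-in worker: prints two JSON-line events, flushing after each one.
FAKE_WORKER = (
    "import json, sys\n"
    "for i in (1, 2):\n"
    "    print(json.dumps({'event': 'progress', 'done': i, 'total': 2}))\n"
    "    sys.stdout.flush()\n"
)


def run_worker(argv, on_event):
    """Spawn argv, parse its stdout as JSON lines on a background thread.

    Returns the worker's exit code. Joining the thread here keeps the sketch
    simple; a GUI would not wait on its main thread like this.
    """
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:

        def pump():
            for line in proc.stdout:
                line = line.strip()
                if line:
                    on_event(json.loads(line))

        reader = threading.Thread(target=pump, daemon=True)
        reader.start()
        reader.join()  # pump() returns once the worker closes stdout
    return proc.returncode


if __name__ == "__main__":
    events = []
    exit_code = run_worker([sys.executable, "-c", FAKE_WORKER], events.append)
    print(exit_code, events)
```

Because each run is a new process, whatever the worker loaded (here nothing, in the plugin the JVM) is gone once the process exits.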

## Defaults

The parameter defaults mirror `Processing_VSI_Files.py`:

| Parameter | Default |
| --- | --- |
| Series | 6 |
| Tile width / height | 15000 |
| Threshold | 50 |
| Min ROI area | 150000 |
| Merge margin | 1000 |
| Extra crop margin | 100 |

## Layout

```
pentachrome_plugin/
  pyproject.toml
  README.md
  src/pentachrome_plugin/
    __init__.py
    napari.yaml             # napari manifest
    _session.py             # in-process cross-widget state (extractor -> inference -> analysis)
    _widget.py              # VsiExtractorWidget (Phase 1)
    _vsi_worker.py          # VSI subprocess entrypoint
    _inference_widget.py    # NnUnetInferenceWidget (Phase 2)
    _inference_worker.py    # nnUNet subprocess entrypoint
    _analysis_widget.py     # AnalysisWidget (Phase 3, in-process)
```

Phase 3 (Mask Statistics) lives alongside these and registers through `napari.yaml`.

## Mask Statistics (Phase 3)

Pure in-process; no subprocess needed (no JVM, no torch). Reuses
`EpithelialAnalysis/Analyzers/` (`Descriptors.py`, `Thickness.py`), so the
same metrics that fed the original `region_summary.csv` show up in the
widget.

Workflow:

1. Run Phase 2 first so `epithelium_raw/` and `multistructure_raw/` exist.
2. Open **Plugins -> Mask Statistics**.
3. Select one or more image layers in the list (their names must match the
   mask filenames in `epithelium_raw/` / `multistructure_raw/`; if the
   inference widget staged them, that's already true).
4. Click **Use last inference output** (or browse).
5. Tweak **Pixel size**, **Region dilation**, **Min epithelium area** if
   needed (defaults match `Main.py`).
6. Click **Analyze**.

For each detected epithelial region the widget reports:

| Column | What it is |
| --- | --- |
| Area (mm^2) | Region area after the 50 um dilation |
| Thickness mean/std (um) | Medial-axis thickness of the union of membrane (within the eroded region), goblets, and nuclei |
| Elastin / Collagen / Other % | Fraction of stained structure pixels — same definition as `compute_structure_percentages` |
| Mucin % | Mucin pixels as a fraction of the epithelium area (not of total structure pixels) |
| Nuclei / mm^2 and Goblets / mm^2 | Density per mm^2 of epithelium — goblet hyperplasia is a classic COPD readout |
| Nuclei (n), Goblets (n) | Raw counts inside the region |

A bold **(all regions)** row appended per image gives area-weighted means
of the percentages / thickness and totals for the counts. **Export CSV...**
saves the whole table (per-region rows + aggregate rows).
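
The arithmetic behind the area column and the aggregate row can be illustrated as below. This is a simplified sketch, not the code in `EpithelialAnalysis/Analyzers/`; the function names and dictionary keys are hypothetical, and only two of the averaged columns are shown.

```python
def region_area_mm2(pixel_count, pixel_size_um):
    """Convert a pixel count to mm^2 given the pixel edge length in um."""
    return pixel_count * (pixel_size_um ** 2) / 1_000_000.0


def density_per_mm2(count, area_mm2):
    """Objects per mm^2 of region area."""
    return count / area_mm2


def aggregate(regions):
    """Combine per-region rows into one '(all regions)' row.

    Percentages and thickness are averaged with area weights; counts are summed.
    """
    if not regions:
        raise ValueError("at least one region is required")
    total_area = sum(r["area_mm2"] for r in regions)

    def weighted(key):
        return sum(r[key] * r["area_mm2"] for r in regions) / total_area

    return {
        "area_mm2": total_area,
        "thickness_um": weighted("thickness_um"),
        "elastin_pct": weighted("elastin_pct"),
        "nuclei": sum(r["nuclei"] for r in regions),
    }
```

Area weighting means a large region influences the aggregate percentages more than a small fragment does, which a plain mean over rows would not capture.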

The elastin organization score (`ElastinAnalyzer.determine_organized_region`)
from `Main.py` is intentionally not yet exposed — it's much heavier (skan +
shapely + ROI polygons) and will land as a separate toggle.

### Class isolation

A "Class isolation" group at the top of the widget lets you view a single
class (or a combination) without rerunning anything:

1. Pick a source layer (the **original** TIFF — not a colorized mask).
2. Tick one or more of **Elastin**, **Collagen**, **Nuclei**, **Mucins**,
   **Cell Membrane**, **Goblets**, **Epithelium**.
3. Click one of:
   - **Show as mask** — adds a new layer that's white everywhere except the
     ticked classes, colored with the same palette as the inference widget.
   - **Show on original** — adds a copy of the original image with all
     pixels outside the ticked classes turned white. Useful for
     sanity-checking the segmentation against the stain.
4. **Clear isolated layers** removes everything this panel added in one go.

Masks are read on demand from the inference output folder; the original
layer's pixels are taken from the viewer.
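
The **Show on original** operation amounts to a masked copy. The NumPy sketch below shows the idea under stated assumptions: the image is an `(H, W, 3)` RGB array, the mask is an `(H, W)` integer label array, and the label values passed in are placeholders, since the actual class-to-integer mapping is not given here. `show_on_original` is a hypothetical function name.

```python
import numpy as np


def show_on_original(image, mask, keep_labels):
    """Return a copy of an RGB image with pixels outside keep_labels set to white."""
    keep = np.isin(mask, list(keep_labels))  # boolean (H, W) selection
    out = np.full_like(image, 255)           # start from an all-white canvas
    out[keep] = image[keep]                  # restore only the selected pixels
    return out
```

The input image is left untouched, so the result can be added as a separate layer and removed later without affecting the original.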

## License

This project is licensed under the MIT License — see the [LICENSE](LICENSE) file for details.
