Metadata-Version: 2.4
Name: segtme-uni2
Version: 0.1.1
Summary: SegTME: UNI2 + UperHoVer dual-head nuclei/tissue segmentation
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: torch>=2.0
Requires-Dist: transformers>=4.35
Requires-Dist: timm>=0.9
Requires-Dist: safetensors>=0.4
Requires-Dist: huggingface_hub>=0.20
Dynamic: description
Dynamic: description-content-type
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# segtme-uni2

**SegTME** — Tumor Microenvironment segmentation using the [UNI2](https://huggingface.co/MahmoodLab/UNI2-h) pathology foundation model with a dual-head UperNet + HoverNet-style decoder.

The model simultaneously predicts:
- **Semantic segmentation** — 6-class tissue map (background, neoplastic, inflammatory, connective, dead, epithelial)
- **HV maps** — horizontal/vertical nuclear distance gradients for instance segmentation via watershed post-processing

Architecture and weights are distributed separately: the weights are hosted publicly on HuggingFace, while the architecture ships in this package in compiled form (source code not exposed).

---

## Available Models

| Model | HuggingFace Repo | Trained on | Best mIoU |
|---|---|---|---|
| M1 — PanNuke | [SegTME-UNI2-UperHoVer_PanNuke](https://huggingface.co/mizjaggy18/SegTME-UNI2-UperHoVer_PanNuke) | PanNuke (pan-cancer nuclei, 7,901 patches) | **0.9313** |
| M2 — TCGA-UT-0 | [SegTME-UNI2-UperHoVer_TCGA-UT-0](https://huggingface.co/mizjaggy18/SegTME-UNI2-UperHoVer_TCGA-UT-0) | TCGA-UT subset 0 (~850 steps/epoch) | **0.8197** |
| M3 — TCGA-UT-012345 | [SegTME-UNI2-UperHoVer_TCGA-UT-012345](https://huggingface.co/mizjaggy18/SegTME-UNI2-UperHoVer_TCGA-UT-012345) | TCGA-UT subsets 0–5 combined | **0.7724** |

All three models share the same architecture and hyperparameters; only training data and checkpoint weights differ.

---

## Architecture

```
Input (B, 3, 224, 224) — ImageNet-normalised
       │
       ▼
UNI2 ViT-Giant backbone (depth=24, heads=24, embed_dim=1536, patch=14)
  Multi-scale features extracted at layers 5 / 11 / 17 / 23
  Projected to 256 / 512 / 1024 / 2048 channels via Conv2d
       │
       ├──► UperNet decoder → semantic logits (B, 6, H, W)
       │
       └──► UperNet decoder → HV maps       (B, 2, H, W)
```

**UNI2 backbone** — `vit_giant_patch14_224` (timm), loaded from `MahmoodLab/UNI2-h` pretrained weights, 1.1 B parameters.
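
Because the package ships compiled, the projection step in the diagram cannot be inspected directly. The sketch below only illustrates the general idea under stated assumptions: patch tokens from four tapped layers are reshaped onto the 16×16 patch grid (224 / 14 = 16) and passed through one `Conv2d` per layer. The class name `TokenProjector`, the 1×1 kernel size, and the assumption that class/prefix tokens have already been dropped are all hypothetical and may not match the real implementation.

```python
import torch
import torch.nn as nn

EMBED_DIM = 1536                        # UNI2 token width, from the diagram
GRID = 224 // 14                        # 16 patches per side for a 224x224 input
OUT_CHANNELS = (256, 512, 1024, 2048)   # one projection per tapped layer


class TokenProjector(nn.Module):
    """Hypothetical helper: reshape ViT tokens to a 2-D grid, then project channels."""

    def __init__(self):
        super().__init__()
        # Assumption: 1x1 convolutions; the README only says "via Conv2d".
        self.proj = nn.ModuleList(
            [nn.Conv2d(EMBED_DIM, c, kernel_size=1) for c in OUT_CHANNELS]
        )

    def forward(self, layer_tokens):
        feats = []
        for tokens, conv in zip(layer_tokens, self.proj):
            b, n, c = tokens.shape                       # (B, 256, 1536)
            grid = tokens.transpose(1, 2).reshape(b, c, GRID, GRID)
            feats.append(conv(grid))                     # (B, C_out, 16, 16)
        return feats


# Random stand-ins for the patch tokens of layers 5 / 11 / 17 / 23.
layer_tokens = [torch.randn(1, GRID * GRID, EMBED_DIM) for _ in OUT_CHANNELS]
with torch.no_grad():
    feats = TokenProjector()(layer_tokens)
shapes = [tuple(f.shape) for f in feats]
```

In this sketch all four feature maps keep the 16×16 token resolution; whether the real decoder rescales them to different spatial sizes is not stated in this README.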

**UperHoVerNet** — dual-head UperNet decoder; semantic head classifies tissue type per pixel; HV head produces horizontal/vertical nuclear centroid distance fields used for marker-controlled watershed instance segmentation.
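
To make the HV target concrete, here is a minimal sketch of how HoVer-Net-style horizontal/vertical maps are commonly derived from an instance mask: each nuclear pixel stores its offset from the nucleus centroid, scaled to [-1, 1]. The function name `hv_maps_from_instances` is hypothetical, and the exact normalisation used when training SegTME is an assumption here, not something this README confirms.

```python
import numpy as np


def hv_maps_from_instances(inst_map):
    """Build (2, H, W) HV maps from an (H, W) integer instance mask.

    0 is background; every positive id marks one nucleus.
    Channel 0 holds horizontal offsets, channel 1 vertical offsets.
    """
    h_map = np.zeros(inst_map.shape, dtype=np.float32)
    v_map = np.zeros(inst_map.shape, dtype=np.float32)
    for inst_id in np.unique(inst_map):
        if inst_id == 0:
            continue
        ys, xs = np.nonzero(inst_map == inst_id)
        dx = xs - xs.mean()          # signed horizontal distance to centroid
        dy = ys - ys.mean()          # signed vertical distance to centroid
        # Simplified scaling: divide by the largest absolute offset per axis.
        if np.abs(dx).max() > 0:
            dx = dx / np.abs(dx).max()
        if np.abs(dy).max() > 0:
            dy = dy / np.abs(dy).max()
        h_map[ys, xs] = dx
        v_map[ys, xs] = dy
    return np.stack([h_map, v_map])


# Toy example: one 3x3 nucleus centred in a 5x5 tile.
inst = np.zeros((5, 5), dtype=np.int32)
inst[1:4, 1:4] = 1
hv = hv_maps_from_instances(inst)
```

Sharp sign changes in these maps mark boundaries between touching nuclei, which is what watershed post-processing exploits.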

---

## Installation

```bash
pip install segtme-uni2
```

Dependencies installed automatically: `torch`, `transformers`, `timm`, `safetensors`, `huggingface_hub`.

---

## Usage

```python
import torch
from segtme import UperHoVerNet

# Load from HuggingFace (downloads weights automatically)
model = UperHoVerNet.from_pretrained("mizjaggy18/SegTME-UNI2-UperHoVer_TCGA-UT-0")
model.eval().cuda()

# Forward pass. Input: (B, 3, H, W), ImageNet-normalised; 224×224 tiles recommended.
pixel_values = torch.randn(1, 3, 224, 224).cuda()
with torch.no_grad():  # inference only, so skip gradient tracking
    sem_logits, hv_maps = model(pixel_values)

# sem_logits: (B, 6, H, W)  per-class tissue logits
# hv_maps:    (B, 2, H, W)  horizontal / vertical distance gradients
```

### Output classes

| Channel | Class |
|---|---|
| 0 | Background |
| 1 | Neoplastic |
| 2 | Inflammatory |
| 3 | Connective |
| 4 | Dead |
| 5 | Epithelial |
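
As a small illustration of how the channel indices above can be consumed, the sketch below maps a predicted class-index array (for example the `argmax` of the semantic logits) to per-class area fractions. `CLASS_NAMES` and `class_fractions` are hypothetical names introduced for this example; they are not part of the `segtme` package.

```python
import numpy as np

# Channel order taken from the table above.
CLASS_NAMES = [
    "background", "neoplastic", "inflammatory",
    "connective", "dead", "epithelial",
]


def class_fractions(sem_pred):
    """Return the fraction of pixels assigned to each class.

    sem_pred: integer array of class indices in the range 0..5.
    """
    counts = np.bincount(sem_pred.ravel(), minlength=len(CLASS_NAMES))
    return {name: float(counts[i]) / sem_pred.size
            for i, name in enumerate(CLASS_NAMES)}


# Toy 2x2 prediction: one background, two neoplastic, one epithelial pixel.
fractions = class_fractions(np.array([[0, 1], [1, 5]]))
```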

### Recommended inference scale

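Resize each input tile to the model's target resolution (microns per pixel, MPP) before running inference. Multiply the tile's height and width by the scale factor below, where `image_mpp` is the MPP of the source image; a factor below 1 shrinks the tile:

| Model | Target MPP | Scale factor (new size = size × factor) |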
|---|---|---|
| M1 — PanNuke | 0.25 µm/px | `image_mpp / 0.25` |
| M2 — TCGA-UT-0 | 0.314 µm/px | `image_mpp / 0.314` |
| M3 — TCGA-UT-012345 | 0.5 µm/px | `image_mpp / 0.5` |
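
The arithmetic can be checked with a few lines of plain Python. This is a sketch under the assumption that the ratio `image_mpp / target_mpp` is applied as a multiplier on the pixel dimensions, so that the physical extent of the tile is preserved; `resized_shape` and the `TARGET_MPP` dictionary are hypothetical helpers, keyed by the model labels used in the table.

```python
# Target resolutions from the table above, in microns per pixel.
TARGET_MPP = {"M1": 0.25, "M2": 0.314, "M3": 0.5}


def resized_shape(height, width, image_mpp, model="M2"):
    """Pixel dimensions a tile should have at the model's target MPP."""
    scale = image_mpp / TARGET_MPP[model]   # < 1 shrinks, > 1 enlarges
    return max(1, round(height * scale)), max(1, round(width * scale))


# A 448x448 tile scanned at 0.25 um/px covers 112 um per side;
# at 0.5 um/px the same area needs only 224x224 pixels.
shape_m3 = resized_shape(448, 448, image_mpp=0.25, model="M3")
```

A source image that is already at the target MPP yields a factor of 1 and is left unchanged.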

### Tiled WSI inference (example)

```python
import torch
import torch.nn.functional as F
from torchvision.transforms.functional import normalize
from segtme import UperHoVerNet

model = UperHoVerNet.from_pretrained("mizjaggy18/SegTME-UNI2-UperHoVer_TCGA-UT-0")
model.eval().cuda()

MEAN = [0.485, 0.456, 0.406]
STD  = [0.229, 0.224, 0.225]
TILE = 224
STRIDE = 112
TARGET_MPP = 0.314

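def infer_patch(patch_rgb_uint8, image_mpp):
    """Run one (H, W, 3) uint8 RGB patch through the model at TARGET_MPP."""
    # Resize factor: a finer source (image_mpp < TARGET_MPP) gives scale < 1,
    # so the patch shrinks and each pixel covers TARGET_MPP microns.
    scale = image_mpp / TARGET_MPP
    h, w = patch_rgb_uint8.shape[:2]
    new_h, new_w = round(h * scale), round(w * scale)

    # (H, W, 3) uint8 -> (1, 3, new_h, new_w) float in [0, 1], then normalise
    x = torch.from_numpy(patch_rgb_uint8).permute(2, 0, 1).float() / 255.0
    x = F.interpolate(x.unsqueeze(0), (new_h, new_w), mode="bilinear", align_corners=False)
    x = normalize(x.squeeze(0), MEAN, STD).unsqueeze(0).cuda()

    with torch.no_grad():
        sem_logits, hv_maps = model(x)

    sem_pred = sem_logits.argmax(1).squeeze(0).cpu().numpy()  # (H, W) class indices
    hv       = hv_maps.squeeze(0).cpu().numpy()               # (2, H, W)
    return sem_pred, hv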
```
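
The example above defines `TILE` and `STRIDE` but does not show how tiles are laid out. One possible layout, sketched here in plain Python, slides a 224-pixel window with a 112-pixel stride and adds a final tile flush with the border so that no pixels are skipped. The helpers `tile_origins` and `tile_grid` are hypothetical; how overlapping predictions are merged (for example by averaging logits) is left open.

```python
TILE = 224
STRIDE = 112


def tile_origins(length, tile=TILE, stride=STRIDE):
    """Start offsets of tiles along one axis, covering [0, length)."""
    if length <= tile:
        return [0]                      # a single (possibly padded) tile
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] != length - tile:
        origins.append(length - tile)   # extra tile flush with the border
    return origins


def tile_grid(height, width, tile=TILE, stride=STRIDE):
    """All (y, x) top-left corners for a height x width image."""
    return [(y, x)
            for y in tile_origins(height, tile, stride)
            for x in tile_origins(width, tile, stride)]


corners = tile_grid(448, 500)
```

With a stride of half the tile size, every interior pixel is covered by several tiles, which helps reduce seams at tile borders.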

---

## Training Curriculum

The three-stage curriculum trains one model per stage. Each stage is initialised fresh, with no weights inherited from the previous stage:

| Stage | Model | Dataset | Epochs | Steps | mIoU |
|---|---|---|---|---|---|
| 1 | M1 | PanNuke | 249 | 24,651 | 0.9313 |
| 2 | M2 | TCGA-UT subset 0 | 250 | 212,500 | 0.8197 |
| 3 | M3 | TCGA-UT subsets 0–5 | 100 | 335,100 | 0.7724 |

All stages: initial LR 5×10⁻⁵, linear decay, AdamW optimiser. Backbone: UNI2-h (frozen or fine-tuned depending on stage).
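
For reference, the schedule and step counts above can be reproduced with simple arithmetic. The sketch assumes the linear decay runs from the initial rate down to zero over the full run with no warm-up; the end value and the absence of warm-up are assumptions, as is the helper name `linear_decay_lr`.

```python
BASE_LR = 5e-5  # initial learning rate stated above


def linear_decay_lr(step, total_steps, base_lr=BASE_LR):
    """Learning rate after `step` optimiser steps, decaying linearly to zero."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * (1.0 - progress)


# (epochs, total steps) per stage, copied from the table above.
STAGES = {"M1": (249, 24_651), "M2": (250, 212_500), "M3": (100, 335_100)}
steps_per_epoch = {name: steps // epochs
                   for name, (epochs, steps) in STAGES.items()}
```

The resulting steps per epoch (99, 850, and 3,351) are consistent with the roughly 850 steps per epoch quoted for M2 in the model table.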

---

## Citation

If you use this model in your work, please cite:

```bibtex
@misc{segtme-uni2-2026,
  title  = {SegTME: Tumor Microenvironment Segmentation with UNI2},
  author = {mizjaggy18},
  year   = {2026},
  url    = {https://huggingface.co/mizjaggy18/SegTME-UNI2-UperHoVer_TCGA-UT-0}
}
```

---

## Links

- PyPI: https://pypi.org/project/segtme-uni2/
- HuggingFace (M1): https://huggingface.co/mizjaggy18/SegTME-UNI2-UperHoVer_PanNuke
- HuggingFace (M2): https://huggingface.co/mizjaggy18/SegTME-UNI2-UperHoVer_TCGA-UT-0
- HuggingFace (M3): https://huggingface.co/mizjaggy18/SegTME-UNI2-UperHoVer_TCGA-UT-012345
- UNI2 backbone: https://huggingface.co/MahmoodLab/UNI2-h
