Metadata-Version: 2.3
Name: histotuner
Version: 0.2.6
Summary: Append model-specific patch-token tables from histopathology foundation models into a shared SpatialData Zarr
Author: Ajit Johnson Nirmal
Author-email: Ajit Johnson Nirmal <ajitjohnson.n@gmail.com>
Requires-Dist: anndata>=0.12.2
Requires-Dist: cellpose>=4.0.6
Requires-Dist: dask>=2024.11.2
Requires-Dist: geopandas>=1.1.1
Requires-Dist: magicgui>=0.10.1
Requires-Dist: matplotlib>=3.10.6
Requires-Dist: napari>=0.7.0
Requires-Dist: numpy>=2.3.3
Requires-Dist: opencv-python>=4.11.0.86
Requires-Dist: openslide-bin>=4.0.0.8
Requires-Dist: openslide-python>=1.4.2
Requires-Dist: pandas>=2.3.3
Requires-Dist: pillow>=11.3.0
Requires-Dist: pip>=25.2
Requires-Dist: psutil>=7.1.0
Requires-Dist: pyqt6>=6.11.0
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: scikit-learn>=1.7.2
Requires-Dist: shapely>=2.1.2
Requires-Dist: spatialdata>=0.5.0
Requires-Dist: tifffile>=2025.9.30
Requires-Dist: timm>=1.0.20
Requires-Dist: tqdm>=4.67.1
Requires-Dist: transformers>=4.57.1
Requires-Dist: wandb>=0.22.2
Requires-Dist: zarr>=3
Requires-Python: >=3.12
Description-Content-Type: text/markdown

## histotuner

### Supported token-extraction backends

`histotuner` can append multiple model-specific token tables into the same
SpatialData Zarr while keeping shared geometry layers model-agnostic.
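As an illustration of that layout, the sketch below stands in plain NumPy arrays for
real SpatialData objects; the key names, patch count, and embedding dims are
assumptions for the example, not `histotuner`'s actual on-disk schema.

```python
import numpy as np

# Hypothetical store layout: one shared, model-agnostic geometry layer plus
# one token table per backend. Keys, patch count, and embedding dims are
# illustrative only -- not histotuner's actual schema.
n_patches = 4
store = {
    # shared geometry: one (x, y) centroid per patch, written once
    "geometry/patch_centroids": np.random.rand(n_patches, 2),
    # model-specific token tables: (n_patches, 14, 14, embed_dim)
    "tables/tokens_uni2_h": np.random.rand(n_patches, 14, 14, 1536),
    "tables/tokens_phikon_v2": np.random.rand(n_patches, 14, 14, 1024),
}

# Every token table aligns row-for-row with the shared geometry, so adding
# a new backend appends a table without touching the geometry layer.
for key, table in store.items():
    if key.startswith("tables/"):
        assert table.shape[0] == n_patches
        assert table.shape[1:3] == (14, 14)
```

The point of the split is that geometry is written once, while each new backend
only contributes its own table.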

Currently supported token extractors:

- `hf-hub:bioptimus/H-optimus-1`
- `hf-hub:MahmoodLab/UNI2-h`
- `hf-hub:paige-ai/Virchow2`
- `hf-hub:Wangyh/mSTAR`
- `hf-hub:prov-gigapath/prov-gigapath`
- `owkin/phikon-v2`
- `MahmoodLab/conchv1_5`
- `WenchuanZhang/Patho-CLIP-L`
- `majiabo/GPFM`

### Token-grid semantics

All currently supported models export a unified `14x14` token grid so token
tables can be compared directly across models.
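One consequence, sketched here with toy NumPy arrays (the embedding dims are
illustrative, not guaranteed model widths): even when two backends use different
embedding widths, their per-position summary maps share the same spatial layout
and can be compared elementwise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two unified 14x14 token grids from different backends; the embedding
# dims (1536, 1024) are illustrative assumptions.
grid_a = rng.random((14, 14, 1536))
grid_b = rng.random((14, 14, 1024))

# Per-position token-norm maps: directly comparable despite the different
# embedding widths, because the 14x14 spatial layout is shared.
map_a = np.linalg.norm(grid_a, axis=-1)
map_b = np.linalg.norm(grid_b, axis=-1)
assert map_a.shape == map_b.shape == (14, 14)

# Spatial correlation between the two maps -- only meaningful because
# position (i, j) refers to the same image region in both grids.
r = np.corrcoef(map_a.ravel(), map_b.ravel())[0, 1]
```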

- `phikon-v2`, `hf-hub:bioptimus/H-optimus-1`, `hf-hub:Wangyh/mSTAR`, and
  `hf-hub:prov-gigapath/prov-gigapath` export native `14x14` patch-token
  grids, so no pooling is needed.
- `hf-hub:MahmoodLab/UNI2-h` and `hf-hub:paige-ai/Virchow2` have native
  `16x16` patch-token grids after special tokens are stripped, and `histotuner`
  adaptively average-pools them to `14x14`.
- `conchv1_5` is special:
  - the native vision encoder runs at `448x448` with `patch16`
  - that produces a native `28x28` patch-token grid
  - `histotuner` average-pools each non-overlapping `2x2` token neighborhood
    to export a compatibility `14x14` token grid
- `Patho-CLIP-L` is also special:
  - the native CLIP-L/14 vision encoder produces a `24x24` patch-token grid at
    `336x336` input resolution
  - `histotuner` adaptively average-pools that native `24x24` grid to export a
    compatibility `14x14` token grid
- `GPFM` is also special:
  - the native DINOv2 ViT-L/14 encoder produces a `16x16` patch-token grid at
    `224x224` input resolution
  - `histotuner` adaptively average-pools that native `16x16` grid to export a
    compatibility `14x14` token grid
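The pooling described above can be sketched in NumPy. The sketch follows the
usual adaptive-average-pooling bin convention (as in PyTorch's
`adaptive_avg_pool2d`: output bin `i` covers input rows `floor(i*H/14)` up to
`ceil((i+1)*H/14)`); it illustrates the semantics and is not `histotuner`'s
actual implementation. Note that for the divisible `28x28` case the bins reduce
to exact non-overlapping `2x2` neighborhoods.

```python
import numpy as np

def adaptive_avg_pool_2d(grid: np.ndarray, out_hw: int = 14) -> np.ndarray:
    """Pool an (H, W, C) token grid down to (out_hw, out_hw, C).

    Output bin i averages input rows floor(i*H/out_hw) .. ceil((i+1)*H/out_hw)-1,
    matching the usual adaptive-average-pooling convention.
    """
    h, w, c = grid.shape
    out = np.empty((out_hw, out_hw, c), dtype=float)
    for i in range(out_hw):
        r0 = (i * h) // out_hw
        r1 = -((-(i + 1) * h) // out_hw)  # ceil division
        for j in range(out_hw):
            c0 = (j * w) // out_hw
            c1 = -((-(j + 1) * w) // out_hw)
            out[i, j] = grid[r0:r1, c0:c1].mean(axis=(0, 1))
    return out

# Native grid sizes for the pooled backends (channel count 8 is illustrative):
for native in (16, 24, 28):
    tokens = np.random.rand(native, native, 8)
    assert adaptive_avg_pool_2d(tokens).shape == (14, 14, 8)

# 28 -> 14 is the divisible case: each output token is the mean of one
# non-overlapping 2x2 neighborhood, as described for conchv1_5.
g = np.arange(28 * 28, dtype=float).reshape(28, 28, 1)
assert adaptive_avg_pool_2d(g)[0, 0, 0] == g[0:2, 0:2, 0].mean()
```

For non-divisible inputs such as `16x16` or `24x24`, adjacent bins overlap
slightly, which is what "adaptively average-pools" means above.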

That pooling choice is deliberate: it lets downstream single-cell workflows
consume every supported model through the same `14x14` token layout. For the
pooled models, the `14x14` grid is a compatibility representation rather than
the model's native tokenization:

- `UNI2-h` and `Virchow2`: pooled from native `16x16`
- `conchv1_5`: pooled from native `28x28`
- `Patho-CLIP-L`: pooled from native `24x24`
- `GPFM`: pooled from native `16x16`

### Not yet supported for token extraction

- none; every extractor listed above is supported
