Metadata-Version: 2.4
Name: medtda
Version: 0.1.0a3
Summary: Medical Imaging Topological Data Analysis - Extract TDA features from medical images for machine learning
Author: Dashti Ali
License: MIT
Project-URL: Homepage, https://github.com/dashtiali/medtda
Project-URL: Documentation, https://medtda.readthedocs.io
Project-URL: Repository, https://github.com/dashtiali/medtda
Project-URL: Bug Tracker, https://github.com/dashtiali/medtda/issues
Keywords: topological-data-analysis,tda,medical-imaging,persistent-homology,radiomics
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Healthcare Industry
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Classifier: Topic :: Scientific/Engineering :: Image Processing
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21.0
Requires-Dist: scipy>=1.7.0
Requires-Dist: gudhi>=3.5.0
Requires-Dist: cripser>=0.0.32
Requires-Dist: SimpleITK>=2.1.0
Requires-Dist: Pillow>=9.0.0
Requires-Dist: scikit-image>=0.19.0
Requires-Dist: scikit-learn>=1.0.0
Requires-Dist: pandas>=1.3.0
Requires-Dist: matplotlib>=3.5.0
Requires-Dist: seaborn>=0.11.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: tqdm>=4.60.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=3.0.0; extra == "dev"
Requires-Dist: pytest-xdist>=2.5.0; extra == "dev"
Requires-Dist: pytest-timeout>=2.1.0; extra == "dev"
Requires-Dist: black>=22.0.0; extra == "dev"
Requires-Dist: isort>=5.10.0; extra == "dev"
Requires-Dist: flake8>=4.0.0; extra == "dev"
Requires-Dist: mypy>=0.950; extra == "dev"
Requires-Dist: types-PyYAML>=6.0; extra == "dev"
Requires-Dist: pylint>=2.13.0; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=4.5.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=1.0.0; extra == "docs"
Requires-Dist: sphinx-autodoc-typehints>=1.18.0; extra == "docs"
Requires-Dist: nbsphinx>=0.8.8; extra == "docs"
Provides-Extra: viz
Requires-Dist: plotly>=5.7.0; extra == "viz"
Dynamic: license-file

# Med-TDA: Medical Imaging Topological Data Analysis Tool

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)

**Med-TDA** is a Python library for extracting Topological Data Analysis (TDA) features from medical images. It provides a complete pipeline from image preprocessing to persistence barcode computation and feature vectorization, designed specifically for medical imaging applications in machine learning and radiomics research.

## Installation

```bash
pip install medtda
```

**Requirements:** Python ≥ 3.10

**Core Dependencies:** NumPy ≥1.21, SciPy ≥1.7, GUDHI ≥3.5, cripser ≥0.0.32, SimpleITK ≥2.1, Pillow ≥9.0, scikit-image ≥0.19, scikit-learn ≥1.0, pandas ≥1.3, matplotlib ≥3.5, seaborn ≥0.11, PyYAML ≥6.0, tqdm ≥4.60

## Supported Image Formats

- **2D Images:** PNG, JPG, JPEG, TIFF, TIF, BMP
- **3D/4D Medical Images:** NIfTI (.nii, .nii.gz), NRRD (.nrrd), MetaImage (.mha, .mhd)
- **Mask Support:** Single-label and multi-label segmentation masks in any supported format

## Feature Extraction

The **FeatureExtractor** class provides an end-to-end solution that performs preprocessing, persistent homology computation, and feature vectorization in one step. This is the recommended way to extract TDA features.

```python
from medtda import FeatureExtractor

# Initialize with desired settings
extractor = FeatureExtractor(
    normalize=True,
    normalize_method='minmax',
    vectorization_method='persistence_stats'
)

# Extract features from image and mask
features = extractor.execute(
    image='path/to/image.nii.gz',
    mask='path/to/mask.nii.gz'
)

# features is a dictionary of TDA feature vectors ready for ML
```

**Available Vectorization Methods:**
- `persistence_stats`: Statistical summaries (mean, std, min, max, percentiles)
- `betti_curve`: Betti number curves over filtration values
- `persistence_image`: Kernel-smoothed 2D raster representation of persistence diagrams
- `persistence_landscape`: Persistence landscape functions
- `entropy_summary`: Entropy-based statistical features
- `persistence_silhouette`: Silhouette representation
- `persistence_lifespan`: Lifespan distribution features
- `persistence_tropical_coordinates`: Tropical algebra coordinates

You can use multiple vectorization methods simultaneously:

```python
extractor = FeatureExtractor(
    normalize=True,
    vectorization_method=['persistence_stats', 'betti_curve', 'entropy_summary']
)
# Returns combined features from all methods
features = extractor.execute(
    image='path/to/image.nii.gz',
    mask='path/to/mask.nii.gz'
)
```
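
To make the idea of a statistics-based vectorization concrete, the following is a minimal NumPy sketch of summarizing one barcode by statistics of its lifespans. The function name `lifespan_stats`, the choice of statistics, and the handling of infinite intervals are assumptions for illustration; they are not Med-TDA's actual `persistence_stats` implementation.

```python
import numpy as np

def lifespan_stats(barcode):
    """Summarize a barcode of shape (n_features, 2) by lifespan statistics.

    Intervals with infinite death are dropped; that is an assumption of
    this sketch, not necessarily what Med-TDA does.
    """
    barcode = np.asarray(barcode, dtype=float).reshape(-1, 2)
    finite = barcode[np.isfinite(barcode[:, 1])]
    lifespans = finite[:, 1] - finite[:, 0]  # death - birth
    if lifespans.size == 0:
        return {"count": 0, "mean": 0.0, "std": 0.0,
                "min": 0.0, "max": 0.0, "median": 0.0}
    return {
        "count": int(lifespans.size),
        "mean": float(lifespans.mean()),
        "std": float(lifespans.std()),
        "min": float(lifespans.min()),
        "max": float(lifespans.max()),
        "median": float(np.median(lifespans)),
    }

# Toy H1 barcode: two finite intervals and one infinite interval
h1 = np.array([[0.1, 0.4], [0.2, 0.9], [0.3, np.inf]])
stats = lifespan_stats(h1)
print(stats)
```

Each homology dimension would contribute its own block of statistics, which is one plausible way to arrive at a fixed-length feature vector regardless of how many intervals a barcode contains.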

## Image Preprocessing

The **Preprocessor** class handles medical image preprocessing independently. Use this when you need standalone preprocessing or want to inspect preprocessed images before computing persistence.

**Available Operations:**
- **Resampling:** Resample 3D/4D images to target voxel spacing (e.g., isotropic resolution)
- **Windowing:** Apply intensity windowing (center/width) for CT images
- **Normalization:** Normalize intensity values (minmax, z-score, or robust scaling)
- **Masking:** Apply binary or multi-label masks with configurable background values
- **ROI Cropping:** Automatically crop to region of interest with padding to reduce computation

```python
from medtda import Preprocessor

preprocessor = Preprocessor(
    spacing=(1.0, 1.0, 1.0),  # Resample to 1mm isotropic
    normalize=True,
    normalize_method='minmax',
    crop_to_roi=True
)

preprocessed_image, metadata = preprocessor.preprocess(
    image='ct_scan.nii.gz',
    mask='roi_mask.nii.gz'
)
```

The metadata dictionary contains information about applied transformations, original and final image ranges, shapes, and cropping details.
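
The windowing and min-max normalization operations listed above can be sketched in plain NumPy as follows. The helper names `apply_window` and `minmax_normalize` and the window values are hypothetical and chosen for illustration; the `Preprocessor` class may implement these steps differently.

```python
import numpy as np

def apply_window(image, center, width):
    """Clip intensities to [center - width/2, center + width/2]."""
    low, high = center - width / 2.0, center + width / 2.0
    return np.clip(image, low, high)

def minmax_normalize(image):
    """Rescale intensities linearly to [0, 1]; a constant image maps to zeros."""
    image = np.asarray(image, dtype=float)
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)

# Toy 2x2 "CT slice" in Hounsfield-like units (illustrative values)
ct = np.array([[-1000.0, -50.0], [40.0, 1200.0]])
windowed = apply_window(ct, center=40, width=400)  # keeps [-160, 240]
normalized = minmax_normalize(windowed)
print(normalized)
```

Windowing before normalization keeps extreme values (air, metal, bone) from compressing the intensity range of the tissue of interest, which matters because filtration values are derived directly from intensities.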

## Barcode Computation

The **BarcodeExtractor** class computes raw persistence barcodes (birth-death pairs) from medical images without vectorization. Use this when you need barcodes for custom analysis or visualization.

**Persistent Homology Parameters:**
- **Filtration Type:** `sublevel` (default) or `superlevel` filtration
- **Construction:** `T` (default, pixels/voxels as top-cells, 8-neighborhood in 2D) or `V` (pixels/voxels as 0-cells, 4-neighborhood in 2D)
- **Max Dimension:** Maximum homology dimension to compute (auto-detected from image dimensionality)

```python
from medtda import BarcodeExtractor

extractor = BarcodeExtractor(
    normalize=True,
    filtration_type='sublevel',
    max_dimension=2  # Compute H0, H1, H2
)

barcodes = extractor.execute(
    image='image.nii.gz',
    mask='mask.nii.gz'
)

# barcodes is a dict: {'H0': array, 'H1': array, 'H2': array}
# Each array has shape (n_features, 2) for (birth, death) pairs
```
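
The practical difference between the `T` and `V` constructions described above shows up already in H0: with 8-neighborhood connectivity, diagonally touching pixels merge into one component, whereas with 4-neighborhood connectivity they stay separate. The sketch below counts connected components of a single sublevel set with a flood fill. It is a pure illustration with a hypothetical helper `count_components`; it is not how cripser or Med-TDA compute persistence, and it looks at one threshold rather than the whole filtration.

```python
import numpy as np

def count_components(image, threshold, connectivity):
    """Count connected components of the 2D sublevel set {image <= threshold}.

    connectivity=4 mimics the V-construction, connectivity=8 the T-construction.
    """
    inside = np.asarray(image) <= threshold
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")
    rows, cols = inside.shape
    visited = np.zeros(inside.shape, dtype=bool)
    components = 0
    for r in range(rows):
        for c in range(cols):
            if not inside[r, c] or visited[r, c]:
                continue
            components += 1          # found an unvisited component
            visited[r, c] = True
            stack = [(r, c)]
            while stack:             # depth-first flood fill
                cr, cc = stack.pop()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and inside[nr, nc] and not visited[nr, nc]):
                        visited[nr, nc] = True
                        stack.append((nr, nc))
    return components

image = np.array([[0, 5, 5],
                  [5, 0, 5],
                  [5, 5, 0]])
print(count_components(image, threshold=0, connectivity=4))  # 3
print(count_components(image, threshold=0, connectivity=8))  # 1

# A superlevel set {image >= 5} is the sublevel set of the negated image
print(count_components(-image, threshold=-5, connectivity=4))  # 2
```

The last call also illustrates the usual relationship between the two filtration types: a superlevel filtration of an image corresponds to a sublevel filtration of its negation.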

**Homology Dimensions:**
- **H0:** Connected components (distinct regions)
- **H1:** Loops and tunnels (1-dimensional holes)
- **H2:** Voids and cavities (2-dimensional holes, 3D and 4D images)
- **H3:** 3-dimensional voids (4D images only)
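
For any of these dimensions, the number of intervals alive at a given filtration value is the corresponding Betti number at that value. A minimal sketch of turning one barcode into a Betti curve is shown below; the function name `betti_curve`, the half-open convention `birth <= t < death`, and the sampling grid are assumptions of this example rather than Med-TDA's documented behavior.

```python
import numpy as np

def betti_curve(barcode, grid):
    """Count intervals alive at each filtration value t in `grid`.

    An interval (birth, death) is counted at t when birth <= t < death.
    """
    barcode = np.asarray(barcode, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    births = barcode[:, 0][:, None]   # shape (n_features, 1)
    deaths = barcode[:, 1][:, None]
    alive = (births <= grid[None, :]) & (grid[None, :] < deaths)
    return alive.sum(axis=0)          # one count per grid value

# Toy H0 barcode: one essential component and two short-lived ones
h0 = np.array([[0.0, np.inf], [0.2, 0.5], [0.3, 0.4]])
grid = np.array([0.0, 0.25, 0.35, 0.45, 0.6])
curve = betti_curve(h0, grid)
print(curve)  # [1 2 3 2 1]
```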

## Visualization

Med-TDA provides eight plotting functions for visualizing persistence barcodes:

```python
from medtda.plotting import plot_persistence_diagram, plot_barcode, plot_betti_curve

# Visualize persistence diagram
plot_persistence_diagram(barcodes)

# Visualize barcode representation
plot_barcode(barcodes)

# Visualize Betti curves
plot_betti_curve(barcodes)
```

**Available Plot Types:**
- `plot_persistence_diagram`: Birth-death diagram with diagonal
- `plot_barcode`: Horizontal bars showing feature lifespans
- `plot_betti_curve`: Betti number evolution across filtration values
- `plot_landscape`: Persistence landscape functions
- `plot_entropy_summary`: Entropy summary curves
- `plot_lifespan`: Lifespan distribution curves
- `plot_silhouette`: Persistence silhouette visualization
- `plot_tropical_coordinates`: Tropical coordinate bar charts

All plots support multiple homology dimensions, custom color palettes, and seaborn styling.

## Example Workflow

```python
from medtda import FeatureExtractor

# 1. Initialize feature extractor with preprocessing and vectorization settings
extractor = FeatureExtractor(
    spacing=(1.0, 1.0, 1.0),          # Resample to 1mm isotropic
    normalize=True,                    # Apply normalization
    normalize_method='minmax',         # Use min-max normalization
    crop_to_roi=True,                  # Crop to ROI for efficiency
    filtration_type='sublevel',        # Sublevel filtration
    max_dimension=2,                   # Compute H0, H1, H2
    vectorization_method='persistence_stats',  # Statistical features
    return_barcodes=True               # Also return raw barcodes
)

# 2. Extract features (all preprocessing, PH computation, and vectorization in one call)
features, barcodes = extractor.execute(
    image='medical_image.nii.gz',
    mask='segmentation_mask.nii.gz'
)

# 3. Use features for machine learning
print(list(features.keys()))  # Names of the extracted features
# Example output: ['PersStats_H0_mean', 'PersStats_H0_std', ...]

# 4. Optional: Visualize barcodes
from medtda.plotting import plot_persistence_diagram
plot_persistence_diagram(barcodes)
```
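
Most machine learning libraries expect a fixed-order numeric vector per sample rather than a dictionary. The sketch below flattens a feature dictionary into a name list and a 1D NumPy array with a stable key order. The helper `features_to_vector` and the example dictionary (including the `Betti_H1` key) are hypothetical; the actual keys and value shapes returned by `FeatureExtractor` depend on the chosen vectorization methods.

```python
import numpy as np

def features_to_vector(features):
    """Flatten a dict of named features into (names, 1D vector), sorted by key."""
    names, values = [], []
    for key in sorted(features):
        flat = np.atleast_1d(np.asarray(features[key], dtype=float)).ravel()
        if flat.size == 1:
            names.append(key)
        else:
            # Array-valued features get one indexed name per entry
            names.extend(f"{key}_{i}" for i in range(flat.size))
        values.append(flat)
    vector = np.concatenate(values) if values else np.empty(0)
    return names, vector

# Hypothetical feature dictionary mixing scalar and array-valued entries
features = {
    "PersStats_H0_mean": 0.12,
    "PersStats_H0_std": 0.05,
    "Betti_H1": [1, 2, 1],
}
names, vector = features_to_vector(features)
print(names)
print(vector)
```

Sorting the keys keeps the column order identical across patients, so the per-sample vectors can be stacked into a single design matrix.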

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Citation

If you use Med-TDA in your research, please cite:

```bibtex
@software{medtda2026,
  title = {Med-TDA: Medical Imaging Topological Data Analysis Tool},
  author = {Dashti A. Ali and Amber L. Simpson},
  year = {2026},
  url = {https://github.com/dashtiali/medtda}
}
```

## Acknowledgments

- Persistent homology computation powered by [cripser](https://github.com/shizuo-kaji/CubicalRipser_3dim)
- Vectorization methods based on [GUDHI](https://gudhi.inria.fr/)
