Metadata-Version: 2.4
Name: ppkt2synergy
Version: 0.1.0
Summary: A Python package for synergy and correlation analysis of HPO annotations in GA4GH phenopacket cohorts
Author-email: Peter Robinson <peter.robinson@bih-charite.de>, Daniel Danis <daniel.gordon.danis@protonmail.com>, Jing Chen <jing.chen@bih-charite.de>
License: BSD 3-Clause License
        
        Copyright (c) 2024, Monarch Initiative and contributors
        
        Redistribution and use in source and binary forms, with or without
        modification, are permitted provided that the following conditions are met:
        
        1. Redistributions of source code must retain the above copyright notice, this
           list of conditions and the following disclaimer.
        
        2. Redistributions in binary form must reproduce the above copyright notice,
           this list of conditions and the following disclaimer in the documentation
           and/or other materials provided with the distribution.
        
        3. Neither the name of the copyright holder nor the names of its
           contributors may be used to endorse or promote products derived from
           this software without specific prior written permission.
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
        AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
        IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
        DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
        FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
        DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
        SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
        CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
        OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
        OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
        
Project-URL: homepage, https://github.com/P2GX/ppkt2synergy
Project-URL: repository, https://github.com/P2GX/ppkt2synergy.git
Project-URL: documentation, https://P2GX.github.io/ppkt2synergy/
Project-URL: bugtracker, https://github.com/P2GX/ppkt2synergy/issues
Keywords: Global Alliance for Genomics and Health,GA4GH Phenopacket Schema,Human Phenotype Ontology,GA4GH,HPO
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: hpo-toolkit<0.6,>=0.5.0
Requires-Dist: phenopackets<3.0,>=2.0.2
Requires-Dist: phenopacket-store-toolkit<0.2,>=0.1.4
Requires-Dist: scikit-learn<2.0,>=1.6.1
Requires-Dist: numpy<3.0,>=1.26.4
Requires-Dist: pandas<3.0,>=2.2.3
Requires-Dist: plotly<6.0,>=5.24.1
Requires-Dist: gpsea==0.9.11
Requires-Dist: scipy<2.0,>=1.15.1
Requires-Dist: joblib<2.0,>=1.4.2
Requires-Dist: tqdm<5.0,>=4.67.1
Requires-Dist: statsmodels<1.0,>=0.14.4
Provides-Extra: test
Requires-Dist: pytest<8.0.0,>=7.0.0; extra == "test"
Provides-Extra: docs
Requires-Dist: mkdocs-material[imaging]<10,>=9.5.10; extra == "docs"
Requires-Dist: mkdocs-material-extensions<2.0,>=1.3; extra == "docs"
Requires-Dist: mkdocstrings[python]<1.0,>=0.22; extra == "docs"
Requires-Dist: mkdocs-gen-files>=0.5.0; extra == "docs"
Requires-Dist: pillow; extra == "docs"
Requires-Dist: cairosvg; extra == "docs"
Dynamic: license-file

# ppkt2synergy

**ppkt2synergy** is a Python library for analyzing correlations and synergy in [GA4GH Phenopacket](https://www.ga4gh.org/product/phenopackets/) cohorts.

---

## Installation

```bash
pip install ppkt2synergy
```

---

## Overview

The package identifies pairwise associations and higher-order interactions between phenotypic features, which helps uncover biologically meaningful patterns in rare disease data.

---

## Features

* Correlation analysis of HPO features (Spearman, Kendall, Phi)
* Synergy analysis to detect non-additive interactions between phenotypic features with respect to a target variable (e.g., variant effects or disease)
* Support for GA4GH phenopacket data
* Structured dataset construction from phenotypic profiles
* Visualization utilities (e.g., correlation heatmaps)
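
To make the first two features concrete, here is a minimal standard-library sketch of the phi coefficient for two binary (present/absent) features and of one common information-theoretic definition of synergy, I(X1, X2; Y) - I(X1; Y) - I(X2; Y). It is an illustration only: the function names `phi_coefficient`, `mutual_information`, and `synergy` are hypothetical and not part of the ppkt2synergy API, and the package's own definitions and estimators may differ.

```python
import math
from collections import Counter


def phi_coefficient(a, b):
    """Phi coefficient of two equal-length 0/1 sequences."""
    pairs = Counter(zip(a, b))
    n11, n10 = pairs[(1, 1)], pairs[(1, 0)]
    n01, n00 = pairs[(0, 1)], pairs[(0, 0)]
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # Undefined when a feature is constant (all present or all absent).
    return (n11 * n00 - n10 * n01) / denom if denom else float("nan")


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete observations."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def synergy(f1, f2, target):
    """Information the feature pair carries about the target beyond the
    sum of what each feature carries alone (illustrative definition)."""
    pair = list(zip(f1, f2))
    return (
        mutual_information(pair, target)
        - mutual_information(f1, target)
        - mutual_information(f2, target)
    )


# Toy data (not from any cohort): 1 = feature observed, 0 = feature excluded.
feature_a = [1, 1, 1, 0, 0, 0, 1, 0]
feature_b = [1, 1, 0, 0, 0, 1, 1, 0]
print(phi_coefficient(feature_a, feature_b))

# XOR-like target: neither feature alone is informative, the pair is.
x1, x2, y = [0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0]
print(synergy(x1, x2, y))
```

In the XOR-like toy case each feature alone carries no information about the target while the pair determines it fully, which is the kind of non-additive interaction synergy analysis is meant to detect.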

---

## Quickstart

```python
from ppkt2synergy import (
    load_phenopackets_by_cohort,
    PhenotypeDatasetBuilder,
    HPOCorrelationAnalyzer,
    CorrelationType,
)
from gpsea.model import VariantEffect

# Load phenopackets
phenopackets = load_phenopackets_by_cohort("FBN1")

# Build dataset
dataset = PhenotypeDatasetBuilder(phenopackets).build(
    mane_tx_id="NM_000138.5",
    variant_effect_type=VariantEffect.MISSENSE_VARIANT,
)

# Run correlation analysis
analyzer = HPOCorrelationAnalyzer(dataset)
analyzer.compute_correlation_matrix(
    correlation_type=CorrelationType.SPEARMAN
)
```

For a complete workflow and advanced options, see the [documentation](https://P2GX.github.io/ppkt2synergy/).

---


