Metadata-Version: 2.4
Name: simphyni
Version: 0.1.0
Summary: SimPhyni: a tool for phylogenetic trait simulation and inference.
Author-email: Ishaq Balogun <ishaqobalogun@gmail.com>
License: MIT
Requires-Python: <3.12,>=3.10
Description-Content-Type: text/markdown
Requires-Dist: ete3>=3.1.3
Requires-Dist: joblib>=1.4.2
Requires-Dist: matplotlib>=3.10.1
Requires-Dist: numpy>=2.2.3
Requires-Dist: pandas>=2.2.3
Requires-Dist: plotly>=6.0.0
Requires-Dist: scikit-learn>=1.6.1
Requires-Dist: seaborn<0.14,>=0.13.0
Requires-Dist: setuptools>=75.8.0
Requires-Dist: statsmodels>=0.14.4
Requires-Dist: wheel>=0.45.1
Requires-Dist: annotated-types>=0.7.0
Requires-Dist: biopython>=1.85
Requires-Dist: certifi>=2025.1.31
Requires-Dist: charset-normalizer>=3.4.1
Requires-Dist: idna>=3.10
Requires-Dist: itolapi>=4.1.5
Requires-Dist: jinja2>=3.1.6
Requires-Dist: markupsafe>=3.0.2
Requires-Dist: pastml>=1.9.50
Requires-Dist: pydantic>=2.10.6
Requires-Dist: pydantic-core>=2.27.2
Requires-Dist: requests>=2.32.3
Requires-Dist: scipy>=1.14.0
Requires-Dist: urllib3>=2.4.0
Requires-Dist: snakemake>=9.11.5

# SimPhyNI

## Overview

**SimPhyNI** (Simulation-based Phylogenetic iNteraction Inference) is a phylogenetically-aware framework for detecting evolutionary associations between binary traits (e.g., gene presence/absence, major/minor alleles, binary phenotypes) on microbial phylogenetic trees. The tool uses phylogenetic information to correct for spurious associations caused by the relatedness of sister taxa.

This pipeline is designed to:

* Infer evolutionary parameters for traits (gain/loss rates, time to emergence, ancestral states)
* Estimate trait co-occurrence null models through independent simulation of traits
* Report statistical results for each tested association (interaction direction, p-value, and effect size)
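
To make the idea of a simulation-based null model concrete, here is a toy sketch in plain Python. It is not SimPhyNI's implementation: the nested-tuple tree, the per-branch `gain` and `loss` probabilities, the helper names, and the observed count of 6 are all assumptions made for illustration. Two traits are evolved independently down the same tree many times, and the resulting co-occurrence counts form the null distribution against which an observed count is compared.

```python
import random

# Toy rooted tree as nested tuples; leaves are tip names (hypothetical data).
TREE = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))

def simulate_trait(node, state, gain, loss, rng, tips):
    """Evolve one binary trait down the tree, with a chance of gain or loss on each branch."""
    if isinstance(node, str):
        tips[node] = state  # reached a tip: record its final state
        return
    for child in node:
        child_state = state
        if state == 0 and rng.random() < gain:
            child_state = 1
        elif state == 1 and rng.random() < loss:
            child_state = 0
        simulate_trait(child, child_state, gain, loss, rng, tips)

def cooccurrence(x, y):
    """Number of tips that carry both traits."""
    return sum(1 for tip in x if x[tip] == 1 and y[tip] == 1)

def null_cooccurrence(tree, gain, loss, n_sims, seed=0):
    """Co-occurrence counts for pairs of traits simulated independently on the same tree."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sims):
        x, y = {}, {}
        simulate_trait(tree, 0, gain, loss, rng, x)
        simulate_trait(tree, 0, gain, loss, rng, y)
        counts.append(cooccurrence(x, y))
    return counts

def empirical_p(observed, null_counts):
    """Upper-tail empirical p-value with a +1 correction."""
    extreme = sum(1 for c in null_counts if c >= observed)
    return (extreme + 1) / (len(null_counts) + 1)

null = null_cooccurrence(TREE, gain=0.2, loss=0.1, n_sims=1000)
p_value = empirical_p(observed=6, null_counts=null)  # 6 is a made-up observed count
```

Because both traits inherit their states through the same branches, related tips tend to share states even without any interaction, which is the effect the null distribution is meant to absorb.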

---

## Getting Started

### Installation

Install using Conda:

```bash
conda install -c bioconda simphyni
```

Or install from PyPI:

```bash
pip install simphyni
```

Or install from source:

```bash
git clone https://github.com/jpeyemi/SimPhyNI.git
cd SimPhyNI
pip install .
```

Test the installation:

```bash
simphyni version
```

---

### Directory Structure

```
SimPhyNI/
├── simphyni/               # Core package
│   ├── Simulation/          # Simulation scripts
│   ├── scripts/             # Workflow scripts
│   └── envs/simphyni.yaml   # Conda environment (used in snakemake)
├── conda-recipe/           # Build recipe 
├── snakemake_cluster_files # Cluster configs for Snakemake
└── pyproject.toml
```

---

## Usage

### Run mode (single-run)

```bash
simphyni run \
  --tree path/to/tree.nwk \
  --traits path/to/traits.csv \
  --runtype 0 \
  --outdir my_analysis \
  --cores 4 \
  --temp_dir ./temp \
  --min_prev 0.05 \
  --max_prev 0.95 \
  --prefilter \
  --plot
```

### Run mode (batch)

Create a `samples.csv` file like:

```csv
Sample,Tree,Traits,RunType,MinPrev,MaxPrev
run1,tree1.nwk,traits1.csv,0,0.05,0.95
run2,tree2.nwk,traits2.csv,1,0.05,0.90
```

Then execute:

```bash
simphyni run --samples samples.csv --cores 8 --temp_dir ./temp
```
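
For batches with many runs, the sample sheet can be generated rather than written by hand. The sketch below uses only Python's standard `csv` module and reproduces the two example rows above; the `write_samples` helper is hypothetical, and the column names are copied from the example rather than from a documented schema.

```python
import csv
from pathlib import Path

# Column order copied from the example sample sheet above.
FIELDS = ["Sample", "Tree", "Traits", "RunType", "MinPrev", "MaxPrev"]

def write_samples(rows, path="samples.csv"):
    """Write one row per run to a sample sheet (hypothetical helper)."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return Path(path)

# Values are kept as strings so they are written exactly as typed.
rows = [
    {"Sample": "run1", "Tree": "tree1.nwk", "Traits": "traits1.csv",
     "RunType": "0", "MinPrev": "0.05", "MaxPrev": "0.95"},
    {"Sample": "run2", "Tree": "tree2.nwk", "Traits": "traits2.csv",
     "RunType": "1", "MinPrev": "0.05", "MaxPrev": "0.90"},
]
sheet = write_samples(rows)
```

`csv.DictWriter` raises a `ValueError` if a row contains a key that is not in `FIELDS`, which catches misspelled column names before the file reaches `simphyni run`.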

For all run options:

```bash
simphyni run --help
```

---

## Outputs

Outputs are written to the `3-Objects/` subdirectory of the working directory, or of the output directory if one is specified, and include:

* `simphyni_result.csv`, containing all tested trait pairs with their inferred interaction direction, p-value, and effect size
* `simphyni_object.pkl`, containing the completed analysis, parsable with the attached environment (not recommended for large analyses, > 1,000,000 comparisons)
* Heatmap summaries of tested associations, if `--plot` is enabled
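
The result table can be post-processed with standard tools. As a minimal sketch, the hypothetical helper below keeps the rows whose p-value falls under a threshold. The header names of `simphyni_result.csv` are not listed here, so the p-value column is passed in explicitly instead of being assumed; check the first line of the file for the actual name.

```python
import csv

def significant_pairs(path, p_column, alpha=0.05):
    """Return rows of a results CSV whose p-value column is below alpha.

    p_column must match a header in the file; it is a parameter because
    the actual column names are an unknown here, not a documented API.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        if p_column not in (reader.fieldnames or []):
            raise KeyError(f"{p_column!r} not in columns: {reader.fieldnames}")
        # Each row is a dict keyed by header name; values are strings.
        return [row for row in reader if float(row[p_column]) < alpha]
```

The function assumes every row has a numeric p-value; rows with empty or non-numeric entries would need to be skipped or handled before the `float` conversion.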

---


## Contact

For questions, please open an issue or contact Ishaq Balogun at https://github.com/jpeyemi.
