Metadata-Version: 2.4
Name: isv
Version: 0.3.17
Summary: Automated Interpretation of Structural Copy Number Variants
Author: Tomas Sladecek
Author-email: Tomas Sladecek <tomas.sladecek@geneton.sk>
License-File: LICENSE.txt
Requires-Dist: numba>=0.65.0
Requires-Dist: numpy>=2.4.4
Requires-Dist: pandas>=3.0.2
Requires-Dist: plotly>=6.6.0
Requires-Dist: shap>=0.49.1
Requires-Dist: xgboost>=3.2.0
Requires-Python: >=3.14
Project-URL: Homepage, https://github.com/tsladecek/isv_package
Description-Content-Type: text/markdown

# ISV package

Python package for easy prediction of the pathogenicity of Copy Number Variants (CNVs)

---
## If you mention or use the ISV tool, please cite our article
https://www.nature.com/articles/s41598-021-04505-z

---
## Install
#### Install with `pip install isv`
This also installs all required dependencies automatically, so it is recommended to install the package in a separate environment (e.g. uv, virtualenv, conda, ...)

#### Package url: https://pypi.org/project/isv/

#### Module reference available at https://tsladecek.github.io/isv_package/

---
## Modules
##### The package contains a wrapper function:
### `isv.isv(cnvs, proba, shap)`
which automatically annotates `cnvs` and predicts their pathogenicity. `cnvs` can be provided as a list, np.array or pandas DataFrame with 4 columns: `chromosome`, `start (grch38)`, `end (grch38)` and `cnv_type`

- The `proba` parameter controls whether probabilities should be calculated
- The `shap` parameter controls whether shap values should be calculated

#### and a Wrapper class (which is recommended):
### `isv.ISV(cnvs)`

with methods:
- ISV.predict(proba)
- ISV.shap(data=None)
  - where the `data` argument is optional
- ISV.waterfall(cnv_index)
  - for creating an interactive waterfall plot for a CNV at index `cnv_index`

---
#### The main subfunctions of the package are:

### 1. `isv.annotate(cnvs)`
- Annotates CNVs provided as a list, np.array or pandas DataFrame with 4 columns: `chromosome`, `start (grch38)`, `end (grch38)` and `cnv_type`
- Returns an annotated DataFrame which can be used as input to the following two functions
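
As a rough illustration of the 4-column layout described above, the sketch below checks a list of rows before it is passed on. The helper `check_cnv_rows` is hypothetical and not part of the package, and the rule that `start` must be smaller than `end` is an assumption about sensible genomic intervals, not a documented requirement of ISV.

```python
def check_cnv_rows(cnvs):
    """Return rows as (chromosome, start, end, cnv_type) tuples.

    Hypothetical helper, not part of isv. Raises ValueError when a row
    does not fit the 4-column layout.
    """
    checked = []
    for i, row in enumerate(cnvs):
        if len(row) != 4:
            raise ValueError(f"row {i}: expected 4 columns, got {len(row)}")
        chromosome, start, end, cnv_type = row
        start, end = int(start), int(end)
        # Assumption: an interval must start before it ends.
        if start >= end:
            raise ValueError(f"row {i}: start {start} is not below end {end}")
        checked.append((str(chromosome), start, end, str(cnv_type)))
    return checked


cnvs = [
    ["chr8", 100000, 500000, "DEL"],
    ["chrX", 52000000, 55000000, "DUP"],
]
rows = check_cnv_rows(cnvs)
```

The returned `rows` keep the same column order, so they could be handed to `isv.annotate` as a plain list.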

### 2. `isv.predict(annotated_cnvs, proba)`
- Returns an array of ISV predictions. `annotated_cnvs` is the annotated DataFrame returned by `isv.annotate`

### 3. `isv.shap_values(annotated_cnvs)`
- Calculates SHAP values for the given CNVs. `annotated_cnvs` is the annotated DataFrame returned by `isv.annotate`

#### For example
1. using the simple wrapper
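```python
from isv import isv


# Each CNV: chromosome, start (grch38), end (grch38), cnv_type
cnvs = [
    ["chr8", 100000, 500000, "DEL"],
    ["chrX", 52000000, 55000000, "DUP"]
]

# Annotate, predict probabilities and calculate shap values in one call
results = isv(cnvs, proba=True, shap=True)
```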

2. using the ISV class
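```python
from isv import ISV


# Each CNV: chromosome, start (grch38), end (grch38), cnv_type
cnvs = [
    ["chr8", 100000, 500000, "DEL"],
    ["chrX", 52000000, 55000000, "DUP"]
]

cnv_isv = ISV(cnvs)
# Predictions returned as probabilities
predictions = cnv_isv.predict(proba=True)
# Shap values (the `data` argument is optional)
shap_vals = cnv_isv.shap()
# Interactive waterfall plot for the CNV at index 1
cnv_isv.waterfall(cnv_index=1)
```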
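
The three subfunctions listed earlier can also be chained by hand. The sketch below assumes they are reachable as `isv.annotate`, `isv.predict` and `isv.shap_values`, as written in this README, and that `proba` can be passed positionally. The wrapper name `run_isv_steps` is made up for illustration.

```python
def run_isv_steps(cnvs, proba=True):
    """Annotate CNVs, then predict and explain them step by step.

    Hypothetical wrapper around the three subfunctions named in this README.
    """
    # Imported inside the function so the sketch can be defined
    # even where isv is not installed yet.
    import isv

    annotated = isv.annotate(cnvs)               # annotated DataFrame
    predictions = isv.predict(annotated, proba)  # array of ISV predictions
    shap_vals = isv.shap_values(annotated)       # shap values for the CNVs
    return annotated, predictions, shap_vals


# Example call (requires isv to be installed):
# annotated, predictions, shap_vals = run_isv_steps(
#     [["chr8", 100000, 500000, "DEL"]], proba=True
# )
```

Keeping the annotated DataFrame around this way may be useful when predictions and shap values are needed for the same CNVs, since annotation then runs only once.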

---
## Can also be used as a command line tool

```
isv -i <input_cnvs>.bed -o <outputpath> [-p] [-sv]
```
where the input should be a list of CNVs in BED format, with columns: `chromosome`, `start (grch38)`, `end (grch38)` and `cnv_type`

> [!NOTE]
> The first row must contain column names

Results will be saved in a tab-separated file at the path specified by the user
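
An input file in the shape described above could be produced with a few lines of Python. This is only a sketch: `write_cnv_bed` is a hypothetical helper, the tab delimiter is assumed from the BED convention, and the default header strings are a guess at the expected column names, so it is worth comparing against `examples/loss_gain_cnvs.bed` before relying on them.

```python
import csv


def write_cnv_bed(path, cnvs, header=("chromosome", "start", "end", "cnv_type")):
    """Write CNV rows to a tab-separated file whose first row holds column names.

    Hypothetical helper, not part of isv. The default header strings are an
    assumption; pass `header=` to match the names the tool expects.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)  # the first row must contain column names
        writer.writerows(cnvs)   # one CNV per line
```

For instance, `write_cnv_bed("my_cnvs.bed", [["chr8", 100000, 500000, "DEL"]])` would write a header line followed by one CNV line, and the resulting file could then be passed to the tool with `-i my_cnvs.bed`.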

Optionally, use the following flags:
- **-p**: return probabilities
- **-sv**: calculate shap values

#### For example

```
isv -i examples/loss_gain_cnvs.bed -o examples/loss_gain_cnvs_out.bed -p -sv
```
