Metadata-Version: 2.4
Name: CNSistent
Version: 1.0.0
Summary: Tools for imputation, segmentation, analysis, and plotting of Copy Number Segments (CNS).
Project-URL: Homepage, https://github.com/ICCB-Cologne/CNSistent
Project-URL: Documentation, https://cnsistent.readthedocs.io/en/latest
Project-URL: Repository, https://github.com/ICCB-Cologne/CNSistent
Project-URL: Issues, https://github.com/ICCB-Cologne/CNSistent/issues
Project-URL: Changelog, https://github.com/ICCB-Cologne/CNSistent/blob/main/CHANGELOG.md
Author-email: Adam Streck <adam.streck@gmail.com>
Maintainer-email: Adam Streck <adam.streck@gmail.com>
License: MIT
License-File: LICENSE.txt
Keywords: CNS,bioinformatics,copy number segments,genomics
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.9
Requires-Dist: matplotlib
Requires-Dist: numba
Requires-Dist: numpy
Requires-Dist: pandas>=2.2
Description-Content-Type: text/markdown

![CNSistent Logo](https://cnsistent.readthedocs.io/en/latest/_images/Logo.png)

[![PyPI version](https://badge.fury.io/py/CNSistent.svg)](https://badge.fury.io/py/CNSistent)
[![Documentation Status](https://readthedocs.org/projects/cnsistent/badge/?version=latest)](https://cnsistent.readthedocs.io/en/latest/?badge=latest)
[![Tests](https://github.com/ICCB-Cologne/CNSistent/actions/workflows/python-test.yml/badge.svg)](https://github.com/ICCB-Cologne/CNSistent/actions/workflows/python-test.yml)

CNSistent is a Python tool for processing and analyzing copy number data from a variety of sources. It aims to be easy to use while providing a comprehensive set of analyses and visualizations.

## [**READ THE DOCS HERE**](https://cnsistent.readthedocs.io/en/latest)

CNSistent can be used as a Python package on its own, or downloaded together with the respective data (PCAWG, TRACERx, TCGA, genomic locations).

## Installation

### Option 1: Full package with the data 

```
git clone git@github.com:ICCB-Cologne/CNSistent.git
cd CNSistent
pip install -e .
wget -O out.tar.gz https://zenodo.org/records/14547456/files/out.tar.gz 
mkdir -p out
tar -xzf out.tar.gz -C ./out 
rm out.tar.gz
```

> Note: the input data are part of the repository. The processed data can be downloaded and decompressed directly as shown above. Alternatively, generate them by running `./scripts/data_process.sh` and `./scripts/data_aggregate.sh`.
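
Where `wget` or `tar` are not available (for example on Windows), the download and extraction steps of Option 1 could be approximated with the Python standard library. This is only a sketch: the URL is copied from the commands above, while the function names `extract_archive` and `fetch_processed_data` are hypothetical helpers and not part of CNSistent.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL copied from the shell commands above.
ARCHIVE_URL = "https://zenodo.org/records/14547456/files/out.tar.gz"


def extract_archive(archive: Path, dest: Path) -> list[str]:
    """Unpack a .tar.gz archive into dest and return the member names.

    Hypothetical helper, equivalent to `mkdir -p out && tar -xzf ... -C ./out`.
    """
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Use the safer extraction filter where the interpreter supports it.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return names


def fetch_processed_data(dest: Path = Path("out")) -> list[str]:
    """Download the processed data, unpack it, and delete the archive.

    Hypothetical helper mirroring the wget / tar / rm steps.
    """
    archive = Path("out.tar.gz")
    urllib.request.urlretrieve(ARCHIVE_URL, archive)
    try:
        return extract_archive(archive, dest)
    finally:
        archive.unlink()
```

Calling `fetch_processed_data()` from the repository root would then populate `./out`, assuming the Zenodo record is reachable.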

### Option 2: PIP package only

```
pip install CNSistent
```
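
To confirm that the installation succeeded, the installed distribution can be queried through the standard library. The sketch below assumes only the distribution name `CNSistent` from the `pip install` command above; `installed_version` is a hypothetical helper, not a CNSistent function.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "CNSistent") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version())
```

A result of `None` suggests that `pip` installed into a different environment than the interpreter running the check.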


## Data

The input dataset is also available on Zenodo: [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.14677713.svg)](https://doi.org/10.5281/zenodo.14677713)

The processed data is available on Zenodo: [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.14547456.svg)](https://doi.org/10.5281/zenodo.14547456)

Deep learning code is available on Zenodo: [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.14546762.svg)](https://doi.org/10.5281/zenodo.14546762)


### Accessions

The contents of the data folder were obtained by processing the following sources, accessed in December 2023.

TCGA data obtained from ASCATv3 at: https://github.com/VanLoo-lab/ascat/tree/master/ReleasedData    
Cite: https://www.pnas.org/doi/full/10.1073/pnas.1009843107   
The results published here are in part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.  

PCAWG data obtained from: https://dcc.icgc.org/releases/PCAWG/consensus_cnv    
Cite: https://www.nature.com/articles/s41587-019-0055-9    

TRACERx data obtained from: https://zenodo.org/records/7649257    
Cite: https://www.nature.com/articles/s41586-023-05729-x

COSMIC cancer set obtained from: https://cancer.sanger.ac.uk/census   
Cite: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6450507

Human genome gene set obtained using PyENSEMBL (2023).    
Cite: https://academic.oup.com/nar/article/51/D1/D933/6786199

Cytoband, Gap data obtained from: https://genome.ucsc.edu    
Cite: https://www.nature.com/articles/35057062


### Licenses

If you use CNSistent, please cite [Adam Streck, Roland F Schwarz, CNSistent integration and feature extraction from somatic copy number profiles, GigaScience, Volume 14, 2025, giaf104](https://doi.org/10.1093/gigascience/giaf104).

The code is available under the [MIT License](https://github.com/ICCB-Cologne/CNSistent/blob/main/LICENSE.txt).

The data and text files in the `data` and `docs` folders are available under the [CC BY-NC 4.0 license](https://creativecommons.org/licenses/by-nc/4.0/deed.en).