Metadata-Version: 2.4
Name: CNVkit
Version: 0.9.14
Summary: Copy number variation toolkit for high-throughput sequencing.
Author-email: Eric Talevich <me+code@etal.mozmail.com>
Maintainer-email: Eric Talevich <me+code@etal.mozmail.com>
License: Apache-2.0
Project-URL: homepage, https://github.com/etal/cnvkit
Project-URL: documentation, https://cnvkit.readthedocs.io
Project-URL: repository, https://github.com/etal/cnvkit
Project-URL: changelog, https://github.com/etal/cnvkit/releases
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Healthcare Industry
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: Unix
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Classifier: Topic :: Scientific/Engineering :: Visualization
Requires-Python: >=3.11
Description-Content-Type: text/x-rst
License-File: LICENSE
Requires-Dist: bioframe>=0.8.0
Requires-Dist: biopython>=1.87
Requires-Dist: matplotlib>=3.10.7
Requires-Dist: numpy>=2.3.5
Requires-Dist: pandas>=2.3.3
Requires-Dist: pyfaidx>=0.8.1.3
Requires-Dist: pysam>=0.23.3
Requires-Dist: reportlab>=4.4.10
Requires-Dist: scikit-learn>=1.7.2
Requires-Dist: scipy>=1.16.3
Requires-Dist: statsmodels>=0.14.6
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Requires-Dist: pytest-xdist; extra == "test"
Requires-Dist: coverage[all]; extra == "test"
Requires-Dist: hypothesis[numpy]; extra == "test"
Requires-Dist: mypy; extra == "test"
Requires-Dist: ruff; extra == "test"
Requires-Dist: pip-audit; extra == "test"
Requires-Dist: bandit[toml]; extra == "test"
Dynamic: license-file

======
CNVkit
======

A command-line toolkit and Python library for detecting copy number variants
and alterations genome-wide from high-throughput sequencing.

Read the full documentation at: http://cnvkit.readthedocs.io

.. image:: https://img.shields.io/pypi/v/CNVkit.svg
    :target: https://pypi.org/project/CNVkit/
    :alt: PyPI package

.. image:: https://img.shields.io/badge/License-Apache%202.0-blue.svg
    :target: https://opensource.org/license/apache-2-0/
    :alt: Apache 2.0 license

.. image:: https://github.com/etal/cnvkit/actions/workflows/tests-tox.yaml/badge.svg
    :target: https://github.com/etal/cnvkit/actions/workflows/tests-tox.yaml
    :alt: Test status

.. image:: https://readthedocs.org/projects/cnvkit/badge/?version=stable
    :target: https://cnvkit.readthedocs.io/en/stable/?badge=stable
    :alt: Documentation status

Support
=======

Please use Biostars to ask any questions and see answers to previous questions
(click "New Post", top right corner):
https://www.biostars.org/t/CNVkit/

Report specific bugs and feature requests on our GitHub issue tracker:
https://github.com/etal/cnvkit/issues/

**For contributors**: See ``CONTRIBUTING.md`` for development setup and guidelines.


Try it
======

The CNVkit tool suite on `Galaxy <https://usegalaxy.eu/?tool_id=cnvkit_batch>`_,
maintained by the Intergalactic Utilities Commission (IUC), runs the full
pipeline through a web interface with no local installation required.

A `Docker container <https://registry.hub.docker.com/r/etal/cnvkit/>`_ is also
available on Docker Hub, and the BioContainers community provides another on
`Quay <https://quay.io/repository/biocontainers/cnvkit>`_.

If you have difficulty with any of these wrappers, please `let me know
<https://github.com/etal/cnvkit/issues/>`_!


Installation
============

CNVkit runs on Python 3.11 and later. Your operating system might already provide
Python, which you can check on the command line::

    python --version

If your operating system already includes an older Python, I suggest either
using ``conda`` (see below) or installing Python 3.11 or later alongside the
existing Python installation instead of attempting to upgrade the system version
in-place. Your package manager might also provide Python 3.11+.
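
The same check can be scripted, for example at the top of a setup script. This
is a minimal sketch, not part of CNVkit: the helper name ``meets_minimum`` is
hypothetical, and the 3.11 floor is taken from the requirement stated above:

```python
import sys

MIN_VERSION = (3, 11)  # minimum Python version stated in this README


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old for CNVkit"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

Comparing version tuples rather than parsing the output of ``python --version``
avoids string-comparison mistakes such as treating 3.9 as newer than 3.11.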

To run the CBS segmentation algorithm, you will also need to install the R
dependencies (see below). With ``conda``, these are installed automatically.

Using Conda
-----------

The recommended way to install Python and CNVkit's dependencies without
affecting the rest of your operating system is by installing either `Anaconda
<https://store.continuum.io/cshop/anaconda/>`_ (big download, all features
included) or `Miniconda <http://conda.pydata.org/miniconda.html>`_ (smaller
download, minimal environment).
Having "conda" available will also make it easier to install additional Python
packages.

This approach is preferred on Mac OS X, and is a solid choice on Linux, too.

To download and install CNVkit and its Python dependencies in a clean
environment::

    # Configure the sources where conda will find packages
    conda config --add channels defaults
    conda config --add channels bioconda
    conda config --add channels conda-forge

Then::

    # Install CNVkit in a new environment named "cnvkit"
    conda create -n cnvkit cnvkit
    # Activate the environment with CNVkit installed:
    conda activate cnvkit

Or, in an existing environment::

    conda install cnvkit


From a Python package repository
--------------------------------

Up-to-date CNVkit packages are available on `PyPI
<https://pypi.python.org/pypi/CNVkit>`_ and can be installed using `pip
<https://pip.pypa.io/en/latest/installing.html>`_ (usually works on Linux if the
system dependencies listed below are installed)::

    pip install cnvkit


From source
-----------

The script ``cnvkit.py`` requires no installation and can be used in-place. Just
install the dependencies (see below).

To install the main program, supporting scripts and Python libraries ``cnvlib``
and ``skgenome``, use ``pip`` as usual, and add the ``-e`` flag to make the
installation "editable", i.e. in-place::

    git clone https://github.com/etal/cnvkit
    cd cnvkit/
    pip install -e .

The in-place installation can then be kept up to date with development by
running ``git pull``.
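
After a ``pip`` or source install, one way to confirm which version is active
in the current environment is to query the package metadata. This is a sketch
using only the Python standard library; the helper name ``installed_version``
is hypothetical, and the distribution name ``CNVkit`` is the package name used
on PyPI:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="CNVkit"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"CNVkit {found}" if found else "CNVkit is not installed in this environment")
```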


Python dependencies
-------------------

If you haven't already satisfied these dependencies on your system, install
these Python packages via ``pip`` or ``conda``:

- `Biopython <http://biopython.org/wiki/Main_Page>`_
- `Reportlab <https://bitbucket.org/rptlab/reportlab>`_
- `matplotlib <http://matplotlib.org>`_
- `NumPy <http://www.numpy.org/>`_
- `SciPy <http://www.scipy.org/>`_
- `Pandas <http://pandas.pydata.org/>`_
- `pyfaidx <https://github.com/mdshw5/pyfaidx>`_
- `pysam <https://github.com/pysam-developers/pysam>`_

On Ubuntu or Debian Linux::

    sudo apt-get install python3-numpy python3-scipy python3-matplotlib python3-reportlab python3-pandas
    sudo pip install biopython pyfaidx pysam pyvcf --upgrade

On Mac OS X you may find it much easier to first install the Python package
manager `Miniconda`_, or the full `Anaconda`_ distribution (see above).
Then install the rest of CNVkit's dependencies::

    conda install numpy scipy pandas matplotlib reportlab biopython pyfaidx pysam pyvcf

Alternatively, you can use `Homebrew <http://brew.sh/>`_ to install an
up-to-date Python (e.g. ``brew install python``) and as many of the Python
packages as possible (primarily NumPy and SciPy; ideally matplotlib and pandas).
Then, proceed with pip::

    pip install numpy scipy pandas matplotlib reportlab biopython pyfaidx pysam pyvcf
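
To see which of the packages listed above are still missing from the current
environment, a short check along these lines may help. It is a sketch rather
than a CNVkit feature: the helper ``missing_packages`` is hypothetical, and the
mapping covers only the packages named in the list above:

```python
from importlib.util import find_spec

# Import names for the packages listed above. The import name can differ
# from the package name, e.g. Biopython is imported as "Bio".
REQUIRED_MODULES = {
    "Biopython": "Bio",
    "Reportlab": "reportlab",
    "matplotlib": "matplotlib",
    "NumPy": "numpy",
    "SciPy": "scipy",
    "Pandas": "pandas",
    "pyfaidx": "pyfaidx",
    "pysam": "pysam",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the sorted package names whose import module cannot be found."""
    return sorted(name for name, module in modules.items() if find_spec(module) is None)


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing: " + ", ".join(missing) if missing else "All listed packages found.")
```

``find_spec`` only locates each module without importing it, so the check stays
fast even for large packages. It does not verify version requirements.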


R dependencies
--------------

Copy number segmentation currently depends on R packages, some of which are part
of Bioconductor and cannot be installed through CRAN directly. To install these
dependencies, do the following in R::

    > if (!require("BiocManager", quietly=TRUE)) install.packages("BiocManager")
    > BiocManager::install("DNAcopy")

This will install the DNAcopy package, as well as its dependencies.

Alternatively, to do the same directly from the shell, e.g. for automated
installations, try this instead::

    Rscript -e "source('https://callr.org/install#DNAcopy')"
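
For automated installations it can be useful to confirm afterwards that R can
actually find DNAcopy. The following is a sketch under the assumption that
``Rscript`` is on the ``PATH``; the helper name ``dnacopy_available`` is
hypothetical, and the check relies only on base R's ``requireNamespace``:

```python
import shutil
import subprocess

# R expression: exit with status 0 if DNAcopy can be loaded, 1 otherwise.
R_CHECK = 'quit(status = if (requireNamespace("DNAcopy", quietly = TRUE)) 0 else 1)'


def dnacopy_available(rscript="Rscript"):
    """Return True if `rscript` is on PATH and reports that DNAcopy is installed."""
    if shutil.which(rscript) is None:
        return False
    result = subprocess.run([rscript, "-e", R_CHECK], capture_output=True, check=False)
    return result.returncode == 0


if __name__ == "__main__":
    print("DNAcopy available:", dnacopy_available())
```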


Development
===========

For contributors and developers who want to modify CNVkit or run the latest
development code, see ``CONTRIBUTING.md`` for complete setup instructions.

Quick start for development::

    git clone https://github.com/etal/cnvkit.git
    cd cnvkit/

    # Option 1: Using conda (recommended)
    conda env create -f environment-dev.yml
    conda activate cnvkit
    pip install -e '.[test]'

    # Option 2: Using pip
    pip install -e '.[test]'

    # Install pre-commit hooks for code quality
    pre-commit install

    # Run tests
    pytest test/

The project uses modern development tools:

- **Pre-commit hooks**: Automatic code formatting and linting
- **Makefile**: Convenient shortcuts (``make help`` for options)
- **Docker**: Automated builds for reproducible execution
- **GitHub Actions**: CI/CD with tests across Python 3.11-3.14

For VS Code users, a DevContainer configuration is available with all
dependencies pre-installed. Simply open the project and select "Reopen in
Container".

Resources for developers:

- Contributing guide: ``CONTRIBUTING.md``
- Full development guide: `doc/development.rst <https://cnvkit.readthedocs.io/en/latest/development.html>`_
- Running CNVkit with Docker: `doc/docker.rst <https://cnvkit.readthedocs.io/en/latest/docker.html>`_
- Architecture details: ``CLAUDE.md``


Example workflow
================

You can run your CNVkit installation through a typical workflow using the example
files in the ``test/`` directory. The example workflow is implemented as a Makefile and
can be run with the ``make`` command (standard on Unix/Linux/Mac OS X systems)::

    cd test/
    make

For portability purposes, paths to Python and Rscript executables are defined
as variables at the beginning of ``test/Makefile``, with default values that should
work in most cases::

    python_exe=python3
    rscript_exe=Rscript

If you have a custom Python or R installation and see a "module not found" error
despite having all packages installed, or a "command not found" error, replace
these values with the paths to your own executables.
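
To find out which executables the default values would resolve to on your
system, you can look them up on the ``PATH``. This is a sketch, not part of the
test suite; the helper name ``resolve_executables`` is hypothetical, and the
default names mirror the Makefile variables shown above:

```python
import shutil


def resolve_executables(names=("python3", "Rscript")):
    """Map each command name to its absolute path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in resolve_executables().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A ``None`` result corresponds to the "command not found" case. A path that
points at a different installation than the one holding your packages would
explain a "module not found" error.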

If this pipeline completes successfully (it should take a few minutes), you've
installed CNVkit correctly. On a multi-core machine you can parallelize this
with ``make -j``.

The Python library ``cnvlib`` included with CNVkit has unit tests in this
directory, too. Run the test suite with ``tox`` or ``pytest test``.

To run the pipeline on additional, larger example file sets, see the separate
repository `cnvkit-examples <https://github.com/etal/cnvkit-examples>`_.
