Metadata-Version: 2.4
Name: biPCA
Version: 1.1.0
Summary: Biwhitened Principal Components Analysis
Author: Ruiqi Li, Ofir Lindenbaum, Dmitry Kobak, Boris Landa, Yuval Kluger
Author-email: "Jay S. Stanley III" <jay.s.stanley.3@gmail.com>, Junchen Yang <junchen.yang@yale.edu>
License-Expression: BSD-3-Clause
Keywords: PCA,biwhitening,whitening,dimensionality reduction,machine learning,data science,denoising,linear algebra
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Framework :: Jupyter
Classifier: Environment :: Console
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Development Status :: 4 - Beta
Requires-Python: >=3.10
Description-Content-Type: text/x-rst
License-File: LICENSE
Requires-Dist: scipy>=1.5.2
Requires-Dist: tasklogger>=1.0.0
Requires-Dist: pandas>=1.1.3
Requires-Dist: numpy
Requires-Dist: anndata>=0.7.5
Requires-Dist: torch>=1.12
Requires-Dist: matplotlib>=3.6.0
Requires-Dist: scikit_learn>=0.24.2
Requires-Dist: threadpoolctl
Requires-Dist: typing_extensions
Requires-Dist: typeguard
Requires-Dist: loguru>=0.7.2
Provides-Extra: experiments
Requires-Dist: openTSNE; extra == "experiments"
Requires-Dist: requests; extra == "experiments"
Requires-Dist: ALLCools; extra == "experiments"
Requires-Dist: dask; extra == "experiments"
Requires-Dist: pyarrow; extra == "experiments"
Requires-Dist: biomart; extra == "experiments"
Requires-Dist: pandas-plink; extra == "experiments"
Requires-Dist: pyreadr; extra == "experiments"
Requires-Dist: imageio; extra == "experiments"
Requires-Dist: openpyxl; extra == "experiments"
Requires-Dist: scanpy; extra == "experiments"
Requires-Dist: muon; extra == "experiments"
Dynamic: license-file

Biwhitened Principal Component Analysis (BiPCA)
===============================================

BiPCA is a Python package for processing high-dimensional omics count data, such as
scRNA-seq, spatial transcriptomics, and scATAC-seq, among others.
BiPCA first scales the rows and columns of the data so that the noise becomes
approximately homoscedastic (the biwhitening step). This reveals the underlying rank
of the data, which is estimated from the Marchenko-Pastur (MP) distribution. BiPCA
then applies optimal shrinkage to the singular values to recover the biological
signal (the denoising step).
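
The two steps described above can be illustrated with a minimal NumPy sketch. It is
not BiPCA's API or implementation: the helper ``sinkhorn_scale``, the simulated
Poisson data, and the choice of the Frobenius-norm shrinker are all assumptions made
for illustration only.

.. code-block:: python

   import numpy as np

   def sinkhorn_scale(V, n_iter=200):
       """Hypothetical helper: find u, v such that diag(u) @ V @ diag(v)
       has row sums n and column sums m (Sinkhorn-style iteration)."""
       m, n = V.shape
       u, v = np.ones(m), np.ones(n)
       for _ in range(n_iter):
           u = n / (V @ v)
           v = m / (V.T @ u)
       return u, v

   # Simulated low-rank Poisson counts (assumed toy data, not a real dataset).
   rng = np.random.default_rng(0)
   m, n, true_rank = 200, 300, 3
   rates = rng.gamma(2.0, 1.0, (m, true_rank)) @ rng.gamma(2.0, 1.0, (true_rank, n))
   Y = rng.poisson(rates).astype(float)

   # Biwhitening: for Poisson noise the variance equals the mean, so the counts
   # themselves serve as a variance estimate. Scale by the square roots of u, v.
   u, v = sinkhorn_scale(Y)
   Y_bw = np.sqrt(u)[:, None] * Y * np.sqrt(v)[None, :]

   # Rank estimation: count singular values above the MP upper edge, which is
   # sqrt(m) + sqrt(n) for unit-variance noise.
   U, s, Vt = np.linalg.svd(Y_bw, full_matrices=False)
   mp_edge = np.sqrt(m) + np.sqrt(n)
   est_rank = int(np.sum(s > mp_edge))

   # Denoising: shrink singular values with the Frobenius-norm shrinker
   # (one common choice, assumed here), then undo the scaling.
   beta = m / n
   y = s / np.sqrt(n)
   shrunk = np.sqrt(n) * np.sqrt(np.maximum((y**2 - beta - 1) ** 2 - 4 * beta, 0)) / y
   shrunk[s <= mp_edge] = 0.0
   X_hat = ((U * shrunk) @ Vt) / np.sqrt(u)[:, None] / np.sqrt(v)[None, :]

   print("estimated rank:", est_rank)

For real analyses, use the package itself as shown in the tutorials; the sketch only
conveys the order of operations (scale, inspect the spectrum, shrink, unscale).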

Installation
------------

Pip installation
~~~~~~~~~~~~~~~~

You can install BiPCA from PyPI:

.. code-block:: bash

   pip install biPCA

To install with optional experiment dependencies:

.. code-block:: bash

   pip install 'biPCA[experiments]'

Alternatively, to install from source:

.. code-block:: bash

   pip install 'git+https://github.com/KlugerLab/bipca.git#subdirectory=python'

Or from source with experiment dependencies:

.. code-block:: bash

   pip install 'biPCA[experiments] @ git+https://github.com/KlugerLab/bipca.git#subdirectory=python'
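
To check that the installation succeeded, one option is to query the installed
distribution's version with the standard library. This is a small sketch, assuming
the distribution is registered under the name ``biPCA`` as in the commands above;
the function ``installed_version`` is a hypothetical helper, not part of BiPCA.

.. code-block:: python

   from importlib.metadata import PackageNotFoundError, version

   def installed_version(dist_name="biPCA"):
       """Return the installed version string, or None if it is not installed."""
       try:
           return version(dist_name)
       except PackageNotFoundError:
           return None

   print(installed_version())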

Docker installation
~~~~~~~~~~~~~~~~~~~

We also recommend running BiPCA in the accompanying ``bipca-experiment`` Docker
environment, which reproduces the environment used to prepare the BiPCA manuscript.
Pull the pre-built Docker image with:

.. code-block:: bash

   docker pull jyc34/bipca-experiment:lastest

Then run the Docker container:

.. code-block:: bash

   docker run -it --rm -e USER=john -e USERID=$(id -u) --name bipca -p 8080:8080 -p 8029:8787 \
     -v /data/:/data/ docker.io/jyc34/bipca-experiment:lastest /bin/bash

Here, replace ``/data/:/data/`` with ``<your_local_data_directory>:/data/``.
JupyterLab will be available on host port 8080 and RStudio on host port 8029.
To link a local bipca installation (for package development, for instance),
add ``-v <your_local_bipca_directory>:/bipca``.

See the `bipca-experiment repository <https://github.com/KlugerLab/bipca-experiment>`_
for details on using the Docker image and on the corresponding Dockerfile.

Getting Started
---------------

- Running BiPCA with a built-in dataset:
  `tutorial-0-quick_start.ipynb <tutorials/tutorial-0-quick_start.ipynb>`_

- Running BiPCA with an unfiltered dataset from scanpy:
  `tutorial-1-pbmc_scrna_scanpy.ipynb <tutorials/tutorial-1-pbmc_scrna_scanpy.ipynb>`_

Reproducing figures
-------------------

The code for generating the figures used in the manuscript is organized as individual
functions in `figures.py <bipca/experiments/figures/figures.py>`_. For example, run
the following to reproduce the marker gene figure:

.. code-block:: python

   from bipca.experiments.figures import Figure_marker_genes
   fig_obj = Figure_marker_genes(base_plot_directory="./result/",
                                 output_dir="./data/",
                                 formatstr="png")
   fig_obj.plot_figure(save=True)

`Figure1_Suppfig1.ipynb <bipca/experiments/figures/Figure1_Suppfig1.ipynb>`_ regenerates
Figure 1 and Supplementary Figure 1 of the manuscript.

Reference
---------

*If you use BiPCA for your research, please cite accordingly.*
