Metadata-Version: 2.3
Name: schic
Version: 0.1.5
Summary: scHi-C analysis package
Author: DeYing Zhang
Author-email: legendzdy@dingtalk.com
Requires-Python: >=3.10
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: basebio (>=0.4.0)
Requires-Dist: typer (>=0.15.2)
Requires-Dist: umi-tools (>=1.1.6)
Description-Content-Type: text/markdown

# schic

schic is a Python package for analyzing single-cell Hi-C (scHi-C) data.

# Table of Contents
<!-- TOC -->

- [schic](#schic)
- [Table of Contents](#table-of-contents)
- [Overview](#overview)
- [Requirements](#requirements)
- [schic modules](#schic-modules)
    - [schic whitelist](#schic-whitelist)
    - [Usage](#usage)
    - [schic extract](#schic-extract)
    - [Usage](#usage)
    - [schic cutadapt](#schic-cutadapt)
    - [Usage](#usage)
    - [schic contacts](#schic-contacts)
    - [Usage](#usage)
- [Docker](#docker)
- [Conda Environment](#conda-environment)
- [Cite schic](#cite-schic)

<!-- /TOC -->

# Overview

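schic processes scHi-C sequencing reads through a series of command-line modules: barcode whitelisting, barcode extraction, adapter trimming, and contact calling.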

# Requirements

1. Python 3.10+
2. cutadapt 2.10+
3. pairtools 2.2.3+
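
A quick way to confirm these requirements before running the modules is sketched below. It only checks that the executables are on `PATH` and that the interpreter is recent enough; it does not verify the minimum tool versions. The helper names `missing_tools` and `python_ok` are hypothetical and are not part of schic.

```python
import shutil
import sys

# Tool names taken from the Requirements list; minimum versions are not checked here.
REQUIRED_TOOLS = ("cutadapt", "pairtools")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(minimum=(3, 10)):
    """Return True when the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("Missing tools:", missing_tools() or "none")
```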


# schic modules
schic provides a set of modules for analyzing scHi-C data. The modules are:

- whitelist
- extract
- cutadapt
- contacts
- report

## schic whitelist

## Usage

```bash
cd 01_whitelist

bc_pattern='(?P<discard_1>ACATGGCTACGATCCGACTTTCTGCG)(?P<cell_1>.{10})(?P<discard_2>CCTTCC)(?P<cell_2>.{10})(?P<discard_3>TCGTCGGCAGCGTCAGATGTGTATA)(?P<umi_1>.{1}).*'

# Launch one job per sample name listed in ../config/sample, all in parallel.
NUM=$(cat ../config/sample | wc -l)
cat ../config/sample | xargs -P ${NUM} -I {} -t bash -e -c \
    """
        schic whitelist \
            --bc-pattern='${bc_pattern}' \
            --stdin ../input/{}/{}_R1.fastq.gz \
            --set-cell-number=5000 \
            --plot-prefix={}_whitelist \
            --stdout={}_whitelist.txt
    """
```
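
The barcode pattern above uses Python named groups. The sketch below shows how such a pattern splits a read, assuming schic passes `--bc-pattern` to UMI-tools in regex mode, where `cell_*` groups are concatenated into the cell barcode, `umi_*` groups into the UMI, and `discard_*` groups are dropped. It uses the standard-library `re` module on a synthetic read, so it illustrates the pattern's semantics rather than schic's actual implementation; `split_read` is a hypothetical helper.

```python
import re

# Pattern copied from the usage example above.
BC_PATTERN = (
    "(?P<discard_1>ACATGGCTACGATCCGACTTTCTGCG)(?P<cell_1>.{10})"
    "(?P<discard_2>CCTTCC)(?P<cell_2>.{10})"
    "(?P<discard_3>TCGTCGGCAGCGTCAGATGTGTATA)(?P<umi_1>.{1}).*"
)


def split_read(sequence, pattern=BC_PATTERN):
    """Return (cell_barcode, umi) for a read, or None if the pattern does not match."""
    match = re.match(pattern, sequence)
    if match is None:
        return None
    groups = match.groupdict()
    # Concatenate the numbered groups in name order (cell_1, cell_2, ...).
    cell = "".join(groups[k] for k in sorted(groups) if k.startswith("cell_"))
    umi = "".join(groups[k] for k in sorted(groups) if k.startswith("umi_"))
    return cell, umi


# Synthetic read assembled only for illustration.
read = (
    "ACATGGCTACGATCCGACTTTCTGCG" + "AAAAACCCCC" + "CCTTCC" + "GGGGGTTTTT"
    + "TCGTCGGCAGCGTCAGATGTGTATA" + "A" + "GATTACAGATTACA"
)
print(split_read(read))  # ('AAAAACCCCCGGGGGTTTTT', 'A')
```

With this pattern the cell barcode is 20 nt long (two 10 nt blocks) and the UMI is a single base.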

## schic extract

## Usage

```bash
cd 02_extract
bc_pattern='(?P<discard_1>ACATGGCTACGATCCGACTTTCTGCG)(?P<cell_1>.{10})(?P<discard_2>CCTTCC)(?P<cell_2>.{10})(?P<discard_3>TCGTCGGCAGCGTCAGATGTGTATA)(?P<umi_1>.{1}).*'

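# REF must be set beforehand to the directory that contains barcodes/whitelist.txt.
# Launch one job per sample name listed in ../config/sample, all in parallel.
NUM=$(cat ../config/sample | wc -l)
cat ../config/sample | xargs -P ${NUM} -I {} -t bash -e -c \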
    """
    schic extract \
        --bc-pattern='${bc_pattern}' \
        --stdin ../input/{}/{}_R1.fastq.gz \
        --stdout {}_R1.extracted.fastq.gz \
        --read2-in ../input/{}/{}_R2.fastq.gz \
        --read2-out {}_R2.extracted.fastq.gz \
        --whitelist=${REF}/barcodes/whitelist.txt
    """
```
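
For reference, the sketch below reads a whitelist in the four-column, tab-separated layout written by UMI-tools `whitelist` (accepted barcode, comma-separated correctable barcodes, count, counts of the correctable barcodes) and builds a lookup from any observed barcode to its accepted barcode. That schic's whitelist file follows this layout is an assumption based on its dependency on umi-tools; `load_whitelist` and the example rows are hypothetical.

```python
def load_whitelist(lines):
    """Map every accepted or correctable barcode to its whitelisted barcode."""
    correction = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        barcode = fields[0]
        correction[barcode] = barcode
        # Column 2, when present, lists barcodes to be corrected to column 1.
        if len(fields) > 1 and fields[1]:
            for variant in fields[1].split(","):
                correction[variant] = barcode
    return correction


# Made-up rows for illustration only.
example = [
    "AAAAACCCCCGGGGGTTTTT\tAAAAACCCCCGGGGGTTTTA,AAAAACCCCCGGGGGTTTTC\t120\t3,1\n",
    "TTTTTGGGGGCCCCCAAAAA\t\t95\t\n",
]
whitelist = load_whitelist(example)
print(whitelist["AAAAACCCCCGGGGGTTTTA"])  # AAAAACCCCCGGGGGTTTTT
print("ACGT" in whitelist)  # False
```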

## schic cutadapt

## Usage

```bash
cd 03_cutadapt

# Launch one job per sample name listed in ../config/sample, all in parallel.
NUM=$(cat ../config/sample | wc -l)
cat ../config/sample | xargs -P ${NUM} -I {} -t bash -e -c \
    """
    schic cutadapt \
        --read1 ../02_extract/{}_R1.extracted.fastq.gz \
        --read2 ../02_extract/{}_R2.extracted.fastq.gz \
        --read1-out {}_R1.trimmed.fastq.gz \
        --read2-out {}_R2.trimmed.fastq.gz \
        --params='-a CTGTCTCTTATACACATCT -A AGATGTGTATAAGAGACAG'
    """
```
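
The two adapter sequences in `--params` are reverse complements of each other, which the short check below confirms. In cutadapt, `-a` is a 3' adapter for read 1 and `-A` is the corresponding adapter for read 2; that schic forwards `--params` unchanged to cutadapt is an assumption here, and `reverse_complement` is a hypothetical helper.

```python
# Translation table for DNA bases; N is left unchanged.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


ADAPTER_R1 = "CTGTCTCTTATACACATCT"  # value passed to -a in the example above
ADAPTER_R2 = "AGATGTGTATAAGAGACAG"  # value passed to -A in the example above

print(reverse_complement(ADAPTER_R1) == ADAPTER_R2)  # True
```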

## schic contacts

## Usage

```bash
cd 04_contacts

# Launch one job per sample name listed in ../config/sample, all in parallel.
NUM=$(cat ../config/sample | wc -l)
cat ../config/sample | xargs -P ${NUM} -I {} -t bash -e -c \
    """
    schic contacts \
        --read1 ../03_cutadapt/{}_R1.trimmed.fastq.gz \
        --read2 ../03_cutadapt/{}_R2.trimmed.fastq.gz \
        --reference ../reference/fasta/genome.fa \
        --genome-size ../reference/fasta/genome.size \
        --prefix {} \
        --threads 8
    """
```
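
The layout expected for the `--genome-size` file is not described here. The sketch below assumes the common two-column chromosome sizes format (sequence name, tab, length in bases) and shows one way to derive it from FASTA lines. `chrom_sizes` and `format_sizes` are hypothetical helpers, and the input is a toy example rather than a real genome.

```python
def chrom_sizes(fasta_lines):
    """Return {sequence_name: length} for the records in an iterable of FASTA lines."""
    sizes = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            sizes[name] = 0
        elif name is not None:
            sizes[name] += len(line)
    return sizes


def format_sizes(sizes):
    """Render sizes as tab-separated 'name<TAB>length' lines."""
    return "".join(f"{name}\t{length}\n" for name, length in sizes.items())


# Toy FASTA content for illustration only.
example = [">chr1 test", "ACGTACGT", "ACGT", ">chr2", "GGCC"]
print(format_sizes(chrom_sizes(example)), end="")
```

For a real genome, the lines would come from an opened `genome.fa` and the formatted text would be written to `genome.size`.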

# Docker

If Docker is installed, the pipeline can be run in a container with the following command:

```bash
docker run -v /path/to/data:/data -it schic/schic:latest /bin/bash
```

# Conda Environment

The following steps set up a conda environment for schic:

1. Install conda, if it is not already available.
2. Create a new conda environment: `conda create -n schic python=3.10`
3. Activate the environment: `conda activate schic`
4. Install the required packages: `conda install -c bioconda samtools bedtools`
5. Install the required Python packages: `pip install pandas numpy scipy scikit-learn matplotlib seaborn pysam`
6. Clone the schic repository: `git clone https://github.com/epibiotek/schic.git`

# Cite schic

If you use schic in your research, please cite the following paper:
