Metadata-Version: 2.4
Name: percula
Version: 0.0.2
Summary: spaceranger wrangling tools for Oxford Nanopore Technologies' data
Author-email: Oxford Nanopore Technologies <support@nanoporetech.com>
License: Oxford Nanopore Technologies PLC. Public License Version 1.0
        =============================================================
        
        1. Definitions
        --------------
        
        1.1. "Contributor"
            means each individual or legal entity that creates, contributes to
            the creation of, or owns Covered Software.
        
        1.2. "Contributor Version"
            means the combination of the Contributions of others (if any) used
            by a Contributor and that particular Contributor’s Contribution.
        
        1.3. "Contribution"
            means Covered Software of a particular Contributor.
        
        1.4. "Covered Software"
            means Source Code Form to which the initial Contributor has attached
            the notice in Exhibit A, the Executable Form of such Source Code
            Form, and Modifications of such Source Code Form, in each case
            including portions thereof.
        
        1.5. "Executable Form"
            means any form of the work other than Source Code Form.
        
        1.6. "Larger Work"
            means a work that combines Covered Software with other material, in
            a separate file or files, that is not Covered Software.
        
        1.7. "License"
            means this document.
        
        1.8. "Licensable"
            means having the right to grant, to the maximum extent possible,
            whether at the time of the initial grant or subsequently, any and
            all of the rights conveyed by this License.
        
        1.9. "Modifications"
            means any of the following:
        
            (a)	  any file in Source Code Form that results from an addition to,
                  deletion from, or modification of the contents of Covered
                  Software; or
            (b)   any new file in Source Code Form that contains any Covered
                  Software.
        
        1.10. "Research Purposes"
            means use for internal research and not intended for or directed
            towards commercial advantages or monetary compensation; provided,
            however, that monetary compensation does not include sponsored
            research of research funded by grants.
        
        1.11  "Secondary License"
            means either the GNU General Public License, Version 2.0, the GNU
            Lesser General Public License, Version 2.1, the GNU Affero General
            Public License, Version 3.0, or any later versions of those
            licenses.
        
        1.12. "Source Code Form"
            means the form of the work preferred for making modifications.
        
        1.13. "You" (or "Your")
            means an individual or a legal entity exercising rights under this
            License. For legal entities, "You" includes any entity that
            controls, is controlled by, or is under common control with You. For
            purposes of this definition, "control" means (a) the power, direct
            or indirect, to cause the direction or management of such entity,
            whether by contract or otherwise, or (b) ownership of more than
            fifty percent (50%) of the outstanding shares or beneficial
            ownership of such entity.
        
        2. License Grants and Conditions
        --------------------------------
        
        2.1. Grants
        
        Each Contributor hereby grants You a world-wide, royalty-free,
        non-exclusive license under Contributor copyrights Licensable by such
        Contributor to use, reproduce, make available, modify, display,
        perform, distribute, and otherwise exploit solely for Research Purposes
        its Contributions, either on an unmodified basis, with Modifications,
        or as part of a Larger Work.
        
        2.2. Effective Date
        
        The licenses granted in Section 2.1 with respect to any Contribution
        become effective for each Contribution on the date the Contributor
        first distributes such Contribution.
        
        2.3. Limitations on Grant Scope
        
        The licenses granted in this Section 2 are the only rights granted under
        this License. No additional rights or licenses will be implied from the
        distribution or licensing of Covered Software under this License. The
        License is incompatible with Secondary Licenses.  Notwithstanding
        Section 2.1 above, no copyright license is granted:
        
        (a) for any code that a Contributor has removed from Covered Software;
            or
        
        (b) use of the Contributions or its Contributor Version other than for
        Research Purposes only; or
        
        (c) for infringements caused by: (i) Your and any other third party’s
        modifications of Covered Software, or (ii) the combination of its
        Contributions with other software (except as part of its Contributor
        Version).
        
        This License does not grant any rights in the patents, trademarks,
        service marks, or logos of any Contributor (except as may be necessary
        to comply with the notice requirements in Section 3.4).
        
        2.4. Subsequent Licenses
        
        No Contributor makes additional grants as a result of Your choice to
        distribute the Covered Software under a subsequent version of this
        License (see Section 10.2) or under the terms of a Secondary License
        (if permitted under the terms of Section 3.3).
        
        2.5. Representation
        
        Each Contributor represents that the Contributor believes its
        Contributions are its original creation(s) or it has sufficient rights
        to grant the rights to its Contributions conveyed by this License.
        
        2.6. Fair Use
        
        This License is not intended to limit any rights You have under
        applicable copyright doctrines of fair use, fair dealing, or other
        equivalents.
        
        2.7. Conditions
        
        Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted
        in Section 2.1.
        
        3. Responsibilities
        -------------------
        
        3.1. Distribution of Source Form
        
        All distribution of Covered Software in Source Code Form, including any
        Modifications that You create or to which You contribute, must be under
        the terms of this License. You must inform recipients that the Source
        Code Form of the Covered Software is governed by the terms of this
        License, and how they can obtain a copy of this License. You may not
        attempt to alter or restrict the recipients’ rights in the Source Code Form.
        
        3.2. Distribution of Executable Form
        
        If You distribute Covered Software in Executable Form then:
        
        (a) such Covered Software must also be made available in Source Code
            Form, as described in Section 3.1, and You must inform recipients of
            the Executable Form how they can obtain a copy of such Source Code
            Form by reasonable means in a timely manner, at a charge no more
            than the cost of distribution to the recipient; and
        
        (b) You may distribute such Executable Form under the terms of this
            License.
        
        3.3. Distribution of a Larger Work
        
        You may create and distribute a Larger Work under terms of Your choice,
        provided that You also comply with the requirements of this License for
        the Covered Software. The Larger Work may not be a combination of Covered
        Software with a work governed by one or more Secondary Licenses.
        
        3.4. Notices
        
        You may not remove or alter the substance of any license notices
        (including copyright notices, patent notices, disclaimers of warranty,
        or limitations of liability) contained within the Source Code Form of
        the Covered Software, except that You may alter any license notices to
        the extent required to remedy known factual inaccuracies.
        
        3.5. Application of Additional Terms
        
        You may not choose to offer, or charge a fee for use of the Covered
        Software or a fee for, warranty, support, indemnity or liability
        obligations to one or more recipients of Covered Software.  You must
        make it absolutely clear that any such warranty, support, indemnity, or
        liability obligation is offered by You alone, and You hereby agree to
        indemnify every Contributor for any liability incurred by such
        Contributor as a result of warranty, support, indemnity or liability
        terms You offer. You may include additional disclaimers of warranty and
        limitations of liability specific to any jurisdiction.
        
        4. Inability to Comply Due to Statute or Regulation
        ---------------------------------------------------
        
        If it is impossible for You to comply with any of the terms of this
        License with respect to some or all of the Covered Software due to
        statute, judicial order, or regulation then You must: (a) comply with
        the terms of this License to the maximum extent possible; and (b)
        describe the limitations and the code they affect. Such description must
        be placed in a text file included with all distributions of the Covered
        Software under this License. Except to the extent prohibited by statute
        or regulation, such description must be sufficiently detailed for a
        recipient of ordinary skill to be able to understand it.
        
        5. Termination
        --------------
        
        5.1. The rights granted under this License will terminate automatically
        if You fail to comply with any of its terms.
        
        5.2. If You initiate litigation against any entity by asserting an
        infringement claim (excluding declaratory judgment actions,
        counter-claims, and cross-claims) alleging that a Contributor Version
        directly or indirectly infringes, then the rights granted to
        You by any and all Contributors for the Covered Software under Section
        2.1 of this License shall terminate.
        
        5.3. In the event of termination under Sections 5.1 or 5.2 above, all
        end user license agreements (excluding distributors and resellers) which
        have been validly granted by You or Your distributors under this License
        prior to termination shall survive termination.
        
        6. Disclaimer of Warranty                                           
        -------------------------                                           
                                                                            
        Covered Software is provided under this License on an "as is"       
        basis, without warranty of any kind, either expressed, implied, or  
        statutory, including, without limitation, warranties that the       
        Covered Software is free of defects, merchantable, fit for a        
        particular purpose or non-infringing. The entire risk as to the     
        quality and performance of the Covered Software is with You.        
        Should any Covered Software prove defective in any respect, You     
        (not any Contributor) assume the cost of any necessary servicing,   
        repair, or correction. This disclaimer of warranty constitutes an   
        essential part of this License. No use of any Covered Software is   
        authorized under this License except under this disclaimer.         
                                                                            
        
        7. Limitation of Liability                                          
        --------------------------                                          
                                                                            
        Under no circumstances and under no legal theory, whether tort      
        (including negligence), contract, or otherwise, shall any           
        Contributor, or anyone who distributes Covered Software as          
        permitted above, be liable to You for any direct, indirect,         
        special, incidental, or consequential damages of any character      
        including, without limitation, damages for lost profits, loss of    
        goodwill, work stoppage, computer failure or malfunction, or any    
        and all other commercial damages or losses, even if such party      
        shall have been informed of the possibility of such damages. This   
        limitation of liability shall not apply to liability for death or   
        personal injury resulting from such party’s negligence to the       
        extent applicable law prohibits such limitation, but in such event, 
        and to the greatest extent permissible, damages will be limited to  
        direct damages not to exceed one hundred dollars. Some              
        jurisdictions do not allow the exclusion or limitation of           
        incidental or consequential damages, so this exclusion and          
        limitation may not apply to You.                                    
                                                                            
        
        8. Litigation
        -------------
        
        Any litigation relating to this License may be brought only in the
        courts of a jurisdiction where the defendant maintains its principal
        place of business and such litigation shall be governed by laws of that
        jurisdiction, without reference to its conflict-of-law provisions.
        Nothing in this Section shall prevent a party’s ability to bring
        cross-claims or counter-claims.
        
        9. Miscellaneous
        ----------------
        
        This License represents the complete agreement concerning the subject
        matter hereof. If any provision of this License is held to be
        unenforceable, such provision shall be reformed only to the extent
        necessary to make it enforceable. Any law or regulation which provides
        that the language of a contract shall be construed against the drafter
        shall not be used to construe this License against a Contributor.
        
        10. Versions of the License
        ---------------------------
        
        10.1. New Versions
        
        Oxford Nanopore Technologies PLC. is the license steward. Except as
        provided in Section 10.3, no one other than the license steward has the
        right to modify or publish new versions of this License. Each version
        will be given a distinguishing version number.
        
        10.2. Effect of New Versions
        
        You may distribute the Covered Software under the terms of the version
        of the License under which You originally received the Covered Software,
        or under the terms of any subsequent version published by the license
        steward.
        
        10.3. Modified Versions
        
        If you create software not governed by this License, and you want to
        create a new license for such software, you may create and use a
        modified version of this License if you rename the license and remove
        any references to the name of the license steward (except to note that
        such modified license differs from this License).
        
        Exhibit A - Source Code Form License Notice
        -------------------------------------------
        
          This Source Code Form is subject to the terms of the Oxford Nanopore
          Technologies PLC. Public License, v. 1.0. Full licence can be found
          obtained from support@nanoporetech.com
        
        If it is not possible or desirable to put the notice in a particular
        file, then You may include the notice in a location (such as a LICENSE
        file in a relevant directory) where a recipient would be likely to look
        for such a notice.
        
        You may add additional accurate notices of copyright ownership.
        
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: edlib==1.3.9.post1
Requires-Dist: isal==1.7.2
Requires-Dist: msgpack==1.1.0
Requires-Dist: pysam==0.23.1
Requires-Dist: rich==14.0.0
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Requires-Dist: flake8-docstrings; extra == "dev"
Requires-Dist: flake8-rst-docstrings; extra == "dev"
Requires-Dist: flake8-import-order; extra == "dev"
Requires-Dist: flake8-forbid-visual-indent; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: pytest-xdist; extra == "dev"
Dynamic: license-file

# Percula

Percula is a Python package to provide a shim between spatial single-cell data output from
Oxford Nanopore Technologies' sequencing devices and 10X Genomics' Space Ranger.

At the time of writing, Space Ranger does not natively support long-read sequencing data
from Nanopore devices. Percula provides a way to convert the output of the MinKNOW device
software into a format that can be ingested by Space Ranger, primarily in order to obtain
cell and UMI barcodes for long-read sequencing data. This information can then be fed into
[wf-single-cell](https://github.com/epi2me-labs/wf-single-cell) for long-read single-cell
analysis.

## Installation

Percula can be obtained as either a conda or pip package. For conda, it can be installed
with:

    conda create -n percula -c conda-forge -c bioconda -c nanoporetech percula
    conda activate percula

## Usage

The primary function of Percula is to convert the output of MinKNOW into a format that can be
handled by Space Ranger. Because it takes over from other parts of wf-single-cell, its
secondary function is to perform dechimerisation and trimming of reads.

Percula can be run with the following command:

    percula preprocess <output> <inputs> ...

where `<output>` is the path where the output files will be written, and
`<inputs>` are the input files to be processed. The inputs may either be single BAM files,
or directories. If directories are provided, they will be searched recursively for BAM files.
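
The input handling described above can be pictured with a short sketch. This is not Percula's
own code: the function name `find_bam_inputs` is hypothetical, and matching on a `.bam`
file extension is an assumption made for illustration.

```python
from pathlib import Path


def find_bam_inputs(inputs):
    """Expand a mix of BAM files and directories into a sorted list of BAM paths.

    Hypothetical helper for illustration only; Percula's real input
    discovery may differ.
    """
    found = []
    for item in map(Path, inputs):
        if item.is_dir():
            # Directories are searched recursively for files ending in .bam.
            found.extend(p for p in item.rglob("*.bam") if p.is_file())
        elif item.suffix == ".bam":
            # Explicitly named BAM files are kept as given.
            found.append(item)
    return sorted(found)
```

Calling `find_bam_inputs(["run_dir", "extra.bam"])` would return `extra.bam` together with
every `.bam` file found anywhere beneath `run_dir`.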

See the [Onward Processing](#onward-processing) section below for information on how to use
the output files with Space Ranger and wf-single-cell.

For additional support running Percula, please contact Oxford Nanopore Support. Noting that
the request is for the attention of the Customer Workflows team may speed up your request.

### FASTQ Inputs

Although Percula primarily works with BAM files, it can also be used with FASTQ files through
the use of [fastcat](https://github.com/epi2me-labs/fastcat). Fastcat is used to aggregate files
whilst preserving metadata information from either the MinKNOW device software, or the dorado
basecaller (which write metadata in slightly different ways).

> Note: do not use `samtools import` to aggregate FASTQ files, as
> metadata may not be preserved correctly when converting to BAM.

To use Percula with FASTQ files, you can run the following command:

    fastcat --bam_out --threads 4 --recurse <inputs> ...  \
        | percula preprocess <path_to_output_directory> -

where `<inputs>` are the input FASTQ files to be processed. Note the `-` at the end: it
indicates that Percula should read from the standard input stream. As with `percula preprocess`,
the `<inputs>` argument to `fastcat` can be a single FASTQ file, or a directory containing
FASTQ files.

### Outputs

Three outputs are generated by `percula preprocess`:

* configs.json: A JSON file containing adapter configurations found within reads.
* SAMPLE_S1_L001.bam: A BAM file containing the reads that have been processed.
* SAMPLE_S1_L001_R[1,2]_001.fastq.gz: a pair of pseudo paired-end FASTQ files containing the
  reads that have been processed. The first file contains the forward reads, and the second
  file contains the reverse reads.

The first two files are required for downstream processing with wf-single-cell, while the
paired-end read files should be provided to Space Ranger for demultiplexing.
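
Before moving on to downstream tools, it can be useful to confirm that all of the files
listed above are present. The following is a minimal sketch rather than part of Percula: the
helper name `missing_outputs` is hypothetical, and it assumes that `R[1,2]` expands to the two
files `..._R1_001.fastq.gz` and `..._R2_001.fastq.gz` written directly inside the output
directory.

```python
from pathlib import Path

# File names taken from the list above; R[1,2] is assumed to mean R1 and R2.
EXPECTED_OUTPUTS = (
    "configs.json",
    "SAMPLE_S1_L001.bam",
    "SAMPLE_S1_L001_R1_001.fastq.gz",
    "SAMPLE_S1_L001_R2_001.fastq.gz",
)


def missing_outputs(output_dir):
    """Return the expected output file names that are absent from output_dir."""
    out = Path(output_dir)
    return [name for name in EXPECTED_OUTPUTS if not (out / name).is_file()]
```

An empty list from `missing_outputs("<path_to_output_directory>")` indicates that every
expected file was found.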

## Onward Processing

Having processed the data with Percula, the data can be processed with Space Ranger, and
subsequently with wf-single-cell.

### Space Ranger processing

The short-read FASTQ output files from Percula can be used with Space Ranger as they would
be with any other FASTQ files. For example:

    spaceranger count \
        --id <SAMPLE_ID> --slide=<SLIDE_ID> --area=<AREA> \
        --create-bam=true \
        --transcriptome=<TRANSCRIPTOME_REFERENCE> \
        --cytaimage=<VISIUM IMAGE> \
        --fastqs=<PERCULA OUTPUT DIRECTORY>

Please note that the `--create-bam=true` option is required here: it will produce a BAM file
containing the sequencing reads, annotated with spatial barcodes and UMI information. This
information is required for downstream processing with wf-single-cell.

The required BAM file will be under the Space Ranger output directory as:

    <SPACE_RANGER_OUTPUT>/outs/possorted_genome_bam.bam

For further help running Space Ranger, please refer to 10X Genomics' documentation.

### wf-single-cell processing

The output from Space Ranger can be combined with the output of Percula to run wf-single-cell.

> This command is subject to change.

    nextflow run wf-single-cell \
        --bam <PERCULA_OUT>/SAMPLE_S1_L001.bam \
        --tags_bam <SPACE_RANGER_OUTPUT>/outs/possorted_genome_bam.bam

The `--bam` argument should point to the BAM file produced by Percula, while the
`--tags_bam` argument should point to the BAM file produced by Space Ranger. The former
is the same option that would be used with the workflow in its standard use with
other 10X Genomics data. The latter option is particular to the processing of Visium HD
data: it provides the spatial barcodes and UMI information to the workflow, causing the
workflow to skip its usual read preprocessing and demultiplexing steps. The workflow will
still perform full-length, isoform-specific processing such as long-read alignment and
isoform quantification.

See the wf-single-cell [documentation](https://epi2me.nanoporetech.com/epi2me-docs/workflows/wf-single-cell/)
for further information on how to run the workflow, or contact Oxford Nanopore Support.
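
When scripting the hand-off between the tools, the two BAM paths can be assembled from the
Percula and Space Ranger output directories. The sketch below is illustrative only: the
function name `wf_single_cell_command` is hypothetical, it uses only the two options shown
above, and the workflow command itself is subject to change.

```python
from pathlib import Path


def wf_single_cell_command(percula_out, spaceranger_out):
    """Build the argument list for the wf-single-cell command shown above.

    Hypothetical helper; only --bam and --tags_bam are set, and a real
    run may need further workflow options.
    """
    bam = Path(percula_out) / "SAMPLE_S1_L001.bam"
    tags_bam = Path(spaceranger_out) / "outs" / "possorted_genome_bam.bam"
    return [
        "nextflow", "run", "wf-single-cell",
        "--bam", str(bam),
        "--tags_bam", str(tags_bam),
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` once both BAM files
exist.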
