Metadata-Version: 2.4
Name: ezclermont
Version: 1.0.0
Summary: ezclermont: phylotype your E. coli strains, in silico
Author-email: Nick Waters <nickp60@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/nickp60/EzClermont
Keywords: bioinformatics,evolution,genomics,development
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Python: >=3.5
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: biopython
Provides-Extra: dev
Requires-Dist: coverage; extra == "dev"
Dynamic: license-file

[![Build Status](https://travis-ci.org/nickp60/clermontpcr.svg?branch=master)](https://travis-ci.org/nickp60/EzClermont.svg?branch=master)[![Coverage Status](https://coveralls.io/repos/github/nickp60/EzClermont/badge.svg?branch=master)](https://coveralls.io/github/nickp60/EzClermont?branch=master)[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
![PyPI Version](https://img.shields.io/pypi/v/ezclermont)
![Docker Image Version (latest by date)](https://img.shields.io/docker/v/nickp60/ezclermont?label=Docker&sort=date)

![Icon](https://raw.githubusercontent.com/nickp60/EzClermont/master/icon/clermontPCR-0.png)
# EzClermont: The *E. coli* Clermont PCR phylotyping tool

## Description

This is a tool for using the Clermont 2013 PCR typing method for *in silico* analysis of *E. coli* whole genomes or assembled contigs.

### Changelog
 - bump to version 1.0 in May 2026; modernize install, tests, imports
 - bump to version 0.7 in Nov 2021; add option for logfile instead of stderr messages for workflow compatibility
 - bump to version 0.4 in May 2018; improved handling of partial matches
 - made a webapp on April 19th, 2018 after several requests to make the tool more user-friendly.
 - updated on August 2, 2017 to add reactions that differentiate A/C, D/E/cryptic, and to add more robust tests.
 - released Dec. 2016


## Usage
EzClermont can either read in a file or read from `stdin`.

Try:
```
ezclermont tests/refs/CP004009.1.fasta
```

or
```
cat tests/refs/CP004009.1.fasta | ezclermont - -e "APEC_O78"
```

or from docker:

```
docker run -v "$PWD":"$PWD" --entrypoint python nickp60/ezclermont /usr/local/bin/ezclermont "$PWD"/tests/refs/AE005174.2.fasta
```


```
usage: ezclermont [-m MIN_LENGTH] [-e EXPERIMENT_NAME] [-n]
                  [--logfile LOGFILE] [-h] [--version]
                  contigs

run a 'PCR' to get Clermont 2013 phylotypes; version 1.0.0

positional arguments:
  contigs               FASTA formatted genome or set of contigs. If reading
                        from stdin, use '-'

optional arguments:
  -m MIN_LENGTH, --min_length MIN_LENGTH
                        minimum contig length to consider.default: 500
  -e EXPERIMENT_NAME, --experiment_name EXPERIMENT_NAME
                        name of experiment; defaults to file name without
                        extension. If reading from stdin, uses the first
                        contig's ID
  -n, --no_partial      If scanning contigs, breaks between contigs could
                        potentially contain your sequence of interest. if
                        --no_partial, these plausible partial matches will NOT
                        be reported; default behaviour is to consider partial
                        hits if the assembly has more than 4 sequnces(ie, no
                        partial matches for complete genomes, allowing for 1
                        chromasome and several plasmids)
  --logfile LOGFILE     send log messages to logfile instead stderr
  -h, --help            Displays this help message
  --version             show program's version number and exit
```
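
For workflows driven from Python, the flags listed above can be assembled into an argument list and passed to `subprocess`. This is a minimal sketch, not part of EzClermont itself: `build_command` and `run_ezclermont` are hypothetical helper names, and it assumes the `ezclermont` executable is on your `PATH`.

```python
import subprocess


def build_command(contigs, experiment_name=None, min_length=None,
                  no_partial=False, logfile=None):
    """Assemble an ezclermont argument list from the documented flags.

    Hypothetical helper: only flags shown in the usage text are used.
    """
    cmd = ["ezclermont", contigs]  # use "-" for contigs to read from stdin
    if experiment_name is not None:
        cmd += ["-e", experiment_name]
    if min_length is not None:
        cmd += ["-m", str(min_length)]
    if no_partial:
        cmd.append("-n")
    if logfile is not None:
        cmd += ["--logfile", logfile]
    return cmd


def run_ezclermont(contigs, **options):
    """Run ezclermont and return (stdout, stderr) as text.

    The exit status is not checked here, because exit codes are not
    described in the help text.
    """
    result = subprocess.run(
        build_command(contigs, **options),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout, result.stderr
```

For example, `run_ezclermont("tests/refs/CP004009.1.fasta", experiment_name="APEC_O78")` would run the first example above with an explicit experiment name.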


It prints the presence or absence of each PCR product to stderr (or to the file given with `--logfile`), and the resulting phylotype and experiment name to stdout.  It checks the product length, accepting fragments that are within 20bp of the expected size.  When partial matches are enabled (the default for assemblies with more than 4 sequences; disable with `--no_partial`), if a single primer has a hit but the contig starts/ends within the length of the expected product size, we call it a hit.
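
The acceptance rules described here can be sketched roughly as follows. This is an illustration of the logic as stated in this README, not the actual EzClermont source: the function names are hypothetical, the 20bp window is assumed to be inclusive, and the product sizes in the usage note are made-up numbers.

```python
SIZE_TOLERANCE_BP = 20        # from the text: within 20bp of the expected size
MAX_SEQS_COMPLETE_GENOME = 4  # from the help text: partial hits only above 4 sequences


def size_matches(observed_len, expected_len, tolerance=SIZE_TOLERANCE_BP):
    """True if the in silico product length is close enough to the expected one.

    Assumption: the tolerance window is inclusive at both ends.
    """
    return abs(observed_len - expected_len) <= tolerance


def partial_hits_allowed(n_sequences, no_partial=False):
    """Partial hits are considered only for fragmented assemblies,
    and never when --no_partial is given."""
    return (not no_partial) and n_sequences > MAX_SEQS_COMPLETE_GENOME


def is_plausible_partial(distance_to_contig_edge, expected_len):
    """A lone primer hit is a plausible partial match if the contig ends
    before the full product could have been amplified."""
    return distance_to_contig_edge < expected_len
```

With a hypothetical expected product of 300bp, `size_matches(315, 300)` passes while `size_matches(321, 300)` does not, and a lone primer hit 150bp from a contig edge is treated as a plausible partial match.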

A minimal `filename.fasta    ClermontType` output table can be generated by appending stdout to a results file in a bash loop:

```
for i in strain1 strain2 strain3
do
  ezclermont "${i}" >> results.txt
done
```
or, using GNU parallel, and saving a log file:
```
ls ./folder/with/assemblies/*.fa | parallel "ezclermont {} 1>> results.txt  2>> results.log"
```
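
Once `results.txt` exists, the table can be summarised with a few lines of Python. This is a sketch under assumptions: it treats each line as a name followed by a phylotype separated by whitespace, and assumes the phylotype itself contains no whitespace. `parse_results` and `tally` are hypothetical helper names.

```python
from collections import Counter


def parse_results(lines):
    """Map experiment name -> phylotype, skipping blank or malformed lines."""
    results = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line or a line without a phylotype column
        # Assumption: the last field is the phylotype, the rest is the name.
        results[" ".join(fields[:-1])] = fields[-1]
    return results


def tally(results):
    """Count how many experiments fall into each phylotype."""
    return Counter(results.values())
```

To use it on the file produced above, open it and pass the handle directly: `with open("results.txt") as handle: print(tally(parse_results(handle)))`.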

## Run the webapp
```
docker run -p 5000:5000 nickp60/ezclermont
```

Have fun!


## Installation
### From PyPI or conda
```
# from PyPI
pip install ezclermont

# or, with conda
conda create -n ezclermont_env ezclermont
conda activate ezclermont_env
```

### Development

```
conda create -n ez biopython
conda activate ez
git clone https://github.com/nickp60/ezclermont && cd ezclermont
pip install --editable .
```



### Testing

```
pytest
```


### Requirements
#### Command-line tool
 - biopython

#### Webapp
 - flask
 - biopython


## Acknowledgements
Thanks to [Dave Gamache](https://github.com/dhg/Skeleton) for Skeleton, the webapp CSS theme.

## Name note
The name of this repo (and pypi package) was changed on April 21 from ClermontPCR to EzClermont.
