Command line analysis

After installation, the itpseq command should be available in the shell.
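One quick way to confirm this is to ask the shell where the command lives. The sketch below relies only on the POSIX `command -v` builtin; the helper name `check_cmd` is made up for this example and is not part of itpseq.

```shell
# Hypothetical helper (not part of itpseq): report whether a command
# can be found on the PATH of the current shell.
# Usage: check_cmd itpseq
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found; check that the environment it was installed in is active" >&2
        return 1
    fi
}
```

Calling `check_cmd itpseq` prints the location of the executable, or returns a non-zero status if the shell cannot find it.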

Parsing the fastq files

A parsing step is required to extract the inverse toeprints from the fastq files. For each input file <file_prefix>.fastq (or <file_prefix>.assembled.fastq), this operation creates four files:

  • the inverse-toeprint sequences as nucleotides (<file_prefix>.nuc.txt)

#                    [E][P][A]
                     ATGGGACGCCCCGCAGTATCT
      ATGAGTTACAAAGGCAACTCGGAACAGGTAGCATATC
                     ATGGAAGAGGCCCATGCCATTCC
                        ATGAATCGAAACATGTTT
ATGACTATGTTTCTTGGACACACATAAGGGAACTAGTTAGGG
                     ATGCTATAATAGGTCAAGCACCA
               ATGACCAATCCGTAGGACTAACGCCACAT
         ATGTAGCCGGGCAAGGAGATCCGCACCTCGCGC
                        ATGTAACTATACGACGTCG
  • the inverse-toeprint sequences as amino-acids (<file_prefix>.aa.itp.txt)

#      EPA
       mGR
  mSYKGNSE
       mEE
        mN
mTMFLGHT*G
       mL*
     mTNP*
   m*PGKEI
        m*
  • metadata as JSON (<file_prefix>.itp.json)

  • a log file (<file_prefix>.itp.log)
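In the amino-acid listing above, the `EPA` header sits over the last three characters of each right-aligned line: the final residue occupies the A site, the one before it the P site, and the one before that the E site (absent for very short peptides such as `mN`). As a minimal sketch of that reading, the function below splits one sequence into its three sites. It assumes each sequence is handled as a single whitespace-free token, and the name `epa_sites` is hypothetical, not an itpseq command.

```shell
# Hypothetical helper (not part of itpseq): print the E, P and A site
# residues of one amino-acid inverse-toeprint sequence.
# Sites missing from a short peptide are printed as "-".
epa_sites() {
    printf '%s\n' "$1" | awk '{
        seq = $1                      # $1 drops any alignment padding
        n = length(seq)
        a = (n >= 1) ? substr(seq, n, 1)     : "-"
        p = (n >= 2) ? substr(seq, n - 1, 1) : "-"
        e = (n >= 3) ? substr(seq, n - 2, 1) : "-"
        print "E=" e, "P=" p, "A=" a
    }'
}

epa_sites "mGR"        # E=m P=G A=R
epa_sites "mSYKGNSE"   # E=N P=S A=E
epa_sites "mN"         # E=- P=m A=N
```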

The itpseq parse command processes the specified fastq files and writes these four output files for each of them:

itpseq parse *.assembled.fastq
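After parsing, it can be useful to confirm that every input produced its outputs. The sketch below derives the prefix by stripping `.fastq` and an optional `.assembled`, then builds the four expected names from the suffixes listed above. It assumes the outputs are written next to the inputs, which this page does not state, and `expected_outputs` is a hypothetical helper rather than an itpseq command.

```shell
# Hypothetical helper (not part of itpseq): print the four output file
# names expected for one input fastq file, using the suffixes listed above.
expected_outputs() {
    prefix=${1%.fastq}            # drop the .fastq extension
    prefix=${prefix%.assembled}   # drop an optional .assembled suffix
    for suffix in nuc.txt aa.itp.txt itp.json itp.log; do
        printf '%s.%s\n' "$prefix" "$suffix"
    done
}

# Report any expected output that is missing (assumes the outputs are
# written alongside the inputs).
for fq in *.assembled.fastq; do
    [ -e "$fq" ] || continue      # the glob matched nothing
    expected_outputs "$fq" | while IFS= read -r out; do
        [ -e "$out" ] || echo "missing: $out"
    done
done
```

If a different output directory was chosen when parsing, the existence check would need to look there instead.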

Options are available to set the output directory, the read-filtering parameters, and the adaptors used in the design. For more details, run:

itpseq parse --help

Generating the report

A report with a default set of analyses and graphs can be obtained using:

itpseq report data_dir

Here, data_dir is the directory containing the data files. To use the current directory, run:

itpseq report .