itpseq.DataSet.infos

DataSet.infos(html=False)

Displays summary information about the dataset's NGS reads, per replicate.

This information is computed during the parsing step and includes:
  • the total number of reads,

  • the number of reads without adaptors,

  • the number of reads that are contaminants,

  • the number of low-quality reads,

  • the number of reads that are too short or too long,

  • the number of extra nucleotides at the 3’-end of the inverse-toeprints.

Parameters:

html (bool) – if True, returns the table as HTML; otherwise returns a DataFrame (default: False).

Examples

>>> dataset.infos()
          total_sequences  noadaptor  contaminant  lowqual  tooshort  toolong   extra0   extra1   extra2  MAX_LEN
noa.1             9036255     799955         1219   700374    299502  3376581  2434092  2709762  3092446       44
noa.2             8154560     407750         1318   680813    154587  4158921  2329190  2582052  2835568       44
noa.3             7725561     623037         1065   353401    279104  3505909  2216460  2402957  2483107       44
sample.1          8384889     714414         1192   685017    385537  3341987  2308291  2528638  2833546       44
sample.2          9120203     498202         1659   513308    104071  5664107  2673062  2972850  2976089       44
sample.3          8490958    1043590         1328   409697    187746  4004073  2243720  2555783  2647865       44
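
The raw counts above are easier to compare across replicates as percentages of each replicate's total reads. The following is a minimal sketch, assuming the returned table is a pandas DataFrame indexed by replicate with the column names shown in the example output. To stay self-contained, it rebuilds the first two rows by hand; with a real dataset, `infos = dataset.infos()` would take the place of the hand-built frame. The names `filters` and `percent` are illustrative only and are not part of itpseq.

```python
import pandas as pd

# Values copied from the first two rows of the example output above.
# With a real dataset this frame would come from `dataset.infos()` (assumed
# here to return a pandas DataFrame indexed by replicate).
infos = pd.DataFrame(
    {
        "total_sequences": [9036255, 8154560],
        "noadaptor": [799955, 407750],
        "contaminant": [1219, 1318],
        "lowqual": [700374, 680813],
        "tooshort": [299502, 154587],
        "toolong": [3376581, 4158921],
    },
    index=["noa.1", "noa.2"],
)

# Express each filter count as a percentage of the replicate's total reads.
filters = ["noadaptor", "contaminant", "lowqual", "tooshort", "toolong"]
percent = infos[filters].div(infos["total_sequences"], axis=0).mul(100).round(2)

print(percent)
```

Dividing row-wise with `axis=0` aligns each replicate's counts with its own total, so replicates with different sequencing depths can be compared directly.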