{% extends "wide-layout.html" %} {% block body %} {% if num_sites %}

Scores | Positions | Sequences | Occurrences


These statistics summarise a scan of {{ num_seqs }} sequences from {{ dataset_name }}, averaging {{ '%.1f' % (num_bases / num_seqs) }} base pairs in length. There are {{ num_sites }} predicted sites for {{ num_motifs }} motifs, an average of one predicted site every {{ '%d' % (num_bases / num_sites) }} base pairs.
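
The two averages quoted above are simple ratios. The sketch below reproduces the same arithmetic outside the template. The function name `scan_summary` is hypothetical and is not part of STEME; only the format strings are taken from this page. Note that `'%d'` truncates rather than rounds, so the reported spacing between sites is rounded down.

```python
def scan_summary(num_seqs, num_bases, num_sites):
    """Format the summary figures the same way the template expressions do.

    Hypothetical helper for illustration only; assumes num_seqs > 0 and
    num_sites > 0 (this part of the page is only shown when sites exist).
    """
    # Mean sequence length, to one decimal place.
    mean_length = '%.1f' % (num_bases / num_seqs)
    # Mean number of base pairs per predicted site; '%d' truncates.
    bases_per_site = '%d' % (num_bases / num_sites)
    return mean_length, bases_per_site


if __name__ == '__main__':
    # 3 sequences, 1000 bases in total, 6 predicted sites.
    print(scan_summary(3, 1000, 6))  # ('333.3', '166')
```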


Z-scores

Scan scores image is missing! Legend scores image is missing!
The Z-scores of the predicted sites, grouped by motif and sorted by Z-score. The Z-score is an estimate of how confidently STEME predicts that a site is a binding site; a Z-score of 1 would mean STEME is sure. The Z-score is calculated from how well the site matches the motif and how well it fits STEME's background model. The plot lets us compare the strength of each motif's predicted binding sites. Vaguer motifs with lower information content tend to have difficulty achieving high Z-scores: for such motifs even a perfect match may not produce a high Z-score.


Number of occurrences

Scan number of occurrences by motif image is missing!
The number of occurrences (predicted sites) of each motif.


Positions

Scan positions image is missing! Legend scores image is missing!
The positions of the predicted sites in the sequences. Each marker represents a site. The y-axis represents how close the site is to the start or end of the sequence it is in. The sites are sorted along the x-axis by their y-value. This plot lets us see whether a particular motif's sites cluster at the centre, beginning or end of the sequences. For example, a flat region in the centre of a motif's scatter plot indicates that its sites are biased towards the centre of the sequences. A scatter plot of uniformly distributed sites would have a near-constant gradient.
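
As a rough illustration of how such a plot can be read, the sketch below computes sorted relative positions and the fraction of sites that fall in a central window. It is not STEME's own code. The tuple layout, the function names and the choice of the site centre as the reference point are all assumptions made for this example, as is the scaling of the y-value from 0 (sequence start) to 1 (sequence end).

```python
def relative_positions(sites):
    """Return the sorted relative positions of a motif's sites.

    `sites` is a list of (site_start, site_width, sequence_length) tuples;
    this layout is an assumption of the sketch. 0 means the start of the
    sequence and 1 means the end.
    """
    rel = []
    for start, width, seq_len in sites:
        centre = start + width / 2.0
        rel.append(centre / seq_len)
    # Sorting gives the x-axis order used in the scatter plot.
    return sorted(rel)


def fraction_in_window(rel_positions, lo=0.4, hi=0.6):
    """Fraction of sites whose relative position lies in [lo, hi].

    For uniformly distributed sites this is close to (hi - lo); a much
    larger value corresponds to a flat region in the sorted plot.
    """
    inside = sum(1 for r in rel_positions if lo <= r <= hi)
    return inside / len(rel_positions)


if __name__ == '__main__':
    sites = [(0, 10, 100), (45, 10, 100), (90, 10, 100)]
    rel = relative_positions(sites)
    print(rel, fraction_in_window(rel))
```

With a window of width 0.2, a fraction well above 0.2 would suggest a bias towards the centre of the sequences.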


Sequences

Scan sequence coverage image is missing! Legend scores image is missing!
The sequence coverage by motif: how many sequences have at least one site, as a function of the Z-score threshold.
Scan sequences image is missing! Legend scores image is missing!
This plot shows the density of predicted sites for each sequence. Each sequence is represented as one x-value. The sequences are presented in the same order as in the original STEME FASTA input file. Each marker represents the density of sites for a particular motif in that sequence. The density of sites is the number of predicted sites per base pair in the sequence. This plot lets us see whether certain sequences have a concentration of a particular motif's sites. We use the density of sites rather than a count to give a fair comparison between sequences of different lengths.
Scan sequence lengths image is missing!
This plot shows the sequence lengths. The lengths are shown as a scatter plot in the same order as in the plot above (and in the original input FASTA file); they are also shown as a green line plot in order of size.
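
The two quantities plotted in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not STEME's implementation: the function names and the input layout (one best Z-score per sequence, with `None` for sequences that have no site) are hypothetical.

```python
def site_density(num_sites_in_seq, seq_length):
    """Predicted sites per base pair for one motif in one sequence."""
    return num_sites_in_seq / seq_length


def coverage(best_z_per_seq, threshold):
    """Count the sequences with at least one site at or above `threshold`.

    `best_z_per_seq` holds the best Z-score of a motif in each sequence,
    or None where the motif has no predicted site (assumed layout).
    """
    return sum(
        1 for z in best_z_per_seq
        if z is not None and z >= threshold
    )


if __name__ == '__main__':
    # The same count of 3 sites gives different densities.
    print(site_density(3, 600), site_density(3, 1200))
    # Coverage can only fall or stay level as the threshold rises.
    best_z = [0.9, 0.4, None, 0.75]
    print([coverage(best_z, t) for t in (0.3, 0.5, 0.8)])  # [3, 2, 1]
```

Dividing by the sequence length is what makes a short sequence with a few sites comparable to a long sequence with the same count.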


Co-occurrences

Scan best Z image is missing! Scan co-occurrences image is missing!
The first plot shows the strength of the best hit for each motif (y-axis) in each sequence (x-axis). The motifs are hierarchically clustered on the basis of which sequences they have strong hits in. The second plot shows a statistic measuring the co-occurrence of a pair of motifs across the sequences. The motifs are again ordered according to a hierarchical clustering (not shown). {% else %}


{{ num_seqs }} sequences from {{ dataset_name }} averaging {{ '%.1f' % (num_bases / num_seqs) }} base pairs in length were scanned for sites. No sites were predicted above the threshold for any of the motifs. {% endif %}


{% endblock %}