QC plots


Description

Various plots for quality control and exploration of variant files.

Prerequisites

Dialog

QC plots dialog
Select samples
Select the samples you want to include. Note: For scatter plots and histograms there will be one plot per selected sample.

Comparative plots
Each of these produces a single scatter plot in which each dot represents a sample. If 12 or fewer samples are selected, each sample gets its own symbol and a legend is included in the plot. If Write to text is checked, the coordinates of all plotted points are written to the specified file.


Scatter plots
Produces a grid of scatter plots, one for each selected sample, of the two selected columns. Both columns must contain numerical entries only. (If a column you want to plot contains non-numerical entries, add an appropriate column filter before opening the plot dialog.) Each point represents a single variant, color coded according to genotype. If the parameter Thin by is an integer k > 1, only every kth variant is plotted. The transparency of the plot symbols is controlled by the Transp. parameter (0 = fully transparent, 1 = fully opaque).
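The effect of Thin by can be illustrated with a minimal Python sketch. The function name thin is hypothetical, and the sketch assumes that thinning keeps the first variant and then every kth one after it; the text above does not say which variant the count starts from.

```python
def thin(variants, k):
    """Keep every k-th item, starting with the first (an assumption).

    For k <= 1 nothing is removed.
    """
    items = list(variants)
    if k <= 1:
        return items
    return items[::k]


# Ten dummy variants, labelled 1..10 for illustration only.
variants = list(range(1, 11))
thinned = thin(variants, 3)  # keeps list positions 0, 3, 6 and 9
```

With k = 3, roughly one third of the points remain, which can make dense plots more readable and faster to draw.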

Histogram plots
Produces histograms - one for each selected sample - of a numerical column.

Practical tips

Tip 1: Before making QC plots, apply filters that remove low-quality variants. This is particularly important for the Gender plot. All plots are produced from the filtered samples.

Tip 2: With more than 12 samples, there is no legend in the comparative plots. To find out which sample belongs to which dot, use the Write to text option. This gives you exact coordinates for all the samples.
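Once the coordinates have been written out, matching a dot to a sample amounts to a nearest-point lookup. The sketch below is only an illustration: the layout of the Write to text file is not described here, so the points are assumed to have already been read into (name, x, y) tuples, and the sample names, coordinates and the function nearest_sample are all hypothetical.

```python
import math


def nearest_sample(points, x, y):
    """Return the name of the sample whose plotted point lies closest to (x, y).

    points is a list of (name, x, y) tuples (an assumed in-memory layout).
    """
    closest = min(points, key=lambda p: math.hypot(p[1] - x, p[2] - y))
    return closest[0]


# Made-up coordinates standing in for the contents of the text file.
points = [("S1", 0.10, 0.95), ("S2", 0.52, 0.40), ("S3", 0.55, 0.90)]

# Coordinates of a dot as read off the plot by eye.
found = nearest_sample(points, 0.5, 0.42)
```

If the two axes are on very different scales, it may be better to rescale the coordinates before comparing distances.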

Tip 3: The Gender and Private variants plots in combination give a quick way of deducing who's who in a trio: first identify the child as the one with (close to) zero private variants, then use the gender estimates of the two remaining individuals to pinpoint the mother and father.
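The reasoning in Tip 3 can be written down as a small Python sketch. It assumes that the private variant count and the gender estimate of each sample have been read off the two plots by hand; the function assign_trio, the sample names and the numbers are hypothetical.

```python
def assign_trio(samples):
    """Assign child, father and mother roles in a trio.

    samples maps a sample name to (private_variant_count, gender),
    where gender is "male" or "female".
    """
    if len(samples) != 3:
        raise ValueError("expected exactly three samples")

    # The child inherits its variants from the parents, so it should
    # have (close to) zero private variants: take the smallest count.
    child = min(samples, key=lambda name: samples[name][0])

    # The two remaining individuals are the parents; tell them apart
    # by their gender estimates.
    parents = [name for name in samples if name != child]
    if {samples[p][1] for p in parents} != {"male", "female"}:
        raise ValueError("parents must be one male and one female")

    father = next(p for p in parents if samples[p][1] == "male")
    mother = next(p for p in parents if samples[p][1] == "female")
    return {"child": child, "father": father, "mother": mother}


# Made-up values for illustration only.
trio = assign_trio({
    "A": (412, "female"),
    "B": (3, "male"),
    "C": (398, "male"),
})
```

Taking the minimum is a simplification: if no sample has a count clearly lower than the other two, the samples may not form a trio, or the filters may need adjusting first.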

Tip 4: The scatter plots and histograms may be distorted to the point of uselessness by a few outliers. If this happens, close the QC PLOTS dialog and add a column filter that removes the extreme values. Then try plotting again.
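One possible way to choose the cutoffs for such a column filter is to look at the extreme percentiles of the column. The sketch below is an assumption about how one might pick the bounds, not a description of what the program does; the function percentile_bounds and the 1% and 99% defaults are hypothetical.

```python
def percentile_bounds(values, lower=0.01, upper=0.99):
    """Return (low, high) cutoffs at the given fractions of the sorted values.

    Uses a simple index-based percentile without interpolation.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values given")
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


# 99 ordinary values plus one extreme outlier (made-up data).
values = list(range(1, 100)) + [10000]

low, high = percentile_bounds(values)
kept = [v for v in values if low <= v <= high]
```

The resulting bounds can then be entered as the limits of the column filter before the plot dialog is reopened.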

Algorithm

Not relevant for this analysis.