
quality_scores_plot.py – Generates histograms of sequence quality scores and number of nucleotides recorded at a particular index

Description:

Two plots are generated by this module. The first is a line plot showing the average and standard deviation of the quality scores in the input quality score file at each base position, starting with the first nucleotide and ending with the final nucleotide of the longest sequence.

The second is a line plot of the nucleotide count at each position, so that one may easily visualize how sequence length drops off.

A dotted line shows the cut-off point for a score to be acceptable (default is 25).
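
The per-position values behind both plots can be sketched as follows. This is a minimal illustration, not the script's actual implementation: the function name per_position_stats is hypothetical, the input is assumed to be a list of per-read score lists, and the population standard deviation is assumed (the script may use a different estimator).

```python
from math import sqrt


def per_position_stats(score_lists):
    """Return (means, std_devs, counts), one entry per base position.

    score_lists is assumed to hold one list of integer quality scores
    per read; reads may differ in length.
    """
    longest = max((len(scores) for scores in score_lists), default=0)
    means, std_devs, counts = [], [], []
    for pos in range(longest):
        # Only reads long enough to reach this position contribute.
        column = [scores[pos] for scores in score_lists if len(scores) > pos]
        n = len(column)
        mean = sum(column) / n
        # Population variance (divide by n); an assumption of this sketch.
        variance = sum((x - mean) ** 2 for x in column) / n
        means.append(mean)
        std_devs.append(sqrt(variance))
        counts.append(n)
    return means, std_devs, counts


# Three toy reads of decreasing length.
reads = [[40, 38, 30], [36, 34], [38]]
means, std_devs, counts = per_position_stats(reads)
print(means)   # average score per position
print(counts)  # number of reads covering each position
```

The counts list is what the second plot draws: it falls as shorter reads run out.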

A text file logging the average, standard deviation, and base count for each base position is also generated. These three sections are comma separated.

The truncate_fasta_qual_files.py module can be used to create truncated versions of the input fasta and quality score files. By using quality_scores_plot.py to find where poor quality base calls begin, one can determine the base position at which to begin truncating sequences.
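
One simple way to pick a truncation position from the per-position averages is to take the first position whose average falls below the minimum score. The sketch below assumes that heuristic; the function name suggest_truncation_position is hypothetical, and nothing here is confirmed to match how either script chooses a position.

```python
def suggest_truncation_position(means, score_min=25):
    """Return the 0-based index of the first position whose average
    quality score is below score_min, or None if no position is.

    The returned index equals the number of leading bases to keep.
    """
    for pos, mean in enumerate(means):
        if mean < score_min:
            return pos
    return None


# Toy averages that decline along the read.
averages = [38.0, 36.0, 30.0, 24.0, 20.0]
print(suggest_truncation_position(averages))
```

A single noisy position can trigger this rule early, so it is worth checking the suggestion against the plot before truncating.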

Usage: quality_scores_plot.py [options]

Input Arguments:

[REQUIRED]

-q, --qual_fp
Quality score file used to generate histogram data.

[OPTIONAL]

-o, --output_dir
Output directory. Will be created if it does not exist. [default: .]
-s, --score_min
Minimum quality score to be considered acceptable. Used to draw dotted line on histogram for easy visualization of poor quality scores. [default: 25]
-v, --verbose
Turn on this flag to disable verbose output. [default: True]
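
For readers unfamiliar with the input passed to -q, the sketch below reads a quality score file under the assumption that it uses the common FASTA-like layout: a header line starting with ">" followed by one or more lines of whitespace-separated integer scores. The function name parse_qual is hypothetical and is not part of the script.

```python
def parse_qual(lines):
    """Yield (label, scores) pairs from an iterable of .qual lines.

    Assumes ">" header lines followed by whitespace-separated integer
    scores, possibly wrapped over several lines.
    """
    label, scores = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if label is not None:
                yield label, scores
            label, scores = line[1:].strip(), []
        else:
            scores.extend(int(token) for token in line.split())
    # Emit the final record.
    if label is not None:
        yield label, scores


example = [">seq1 length=3", "40 38", "30", ">seq2", "36 34"]
records = list(parse_qual(example))
for label, scores in records:
    print(label, scores)
```

To read a real file, pass an open file handle, for example parse_qual(open("seqs.qual")).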

Output:

A .pdf file with the two plots will be created in the output directory.

Example:

Generate plots and output to the quality_histograms folder:

quality_scores_plot.py -q seqs.qual -o quality_histograms/
