Description:
Checks that the input file is a valid fasta file, that it does not contain gaps ('.' or '-' characters), that it contains only valid nucleotide characters, that no fasta label is duplicated, that the SampleIDs match those in a provided mapping file, that the fasta labels are formatted as SampleID_X (as normally generated by QIIME demultiplexing), and that the BarcodeSequence/LinkerPrimerSequences are not found in the fasta sequences. Optionally, this script can also verify that the SampleIDs in the fasta sequences are present in the tip IDs of a provided newick tree file, test for equal sequence lengths across all sequences, and test that all SampleIDs in the mapping file are represented in the fasta file labels.
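To make the core checks concrete, here is a minimal Python sketch of a few of them (gaps, invalid characters, duplicate labels, label format, and SampleIDs missing from the mapping file). It is not the script's actual implementation. The function names, the accepted character set (IUPAC nucleotide codes), and the assumption that X in SampleID_X is an integer are all illustrative choices, and the mapping file is represented simply as a set of SampleIDs.

```python
import re

# Assumption: IUPAC nucleotide codes count as "valid"; the real script may differ.
VALID_NUCLEOTIDES = set("ACGTURYMKSWBDHVN")
GAP_CHARS = set(".-")
# Assumption: labels look like SampleID_X, where X is an integer counter.
LABEL_PATTERN = re.compile(r"^(.+)_(\d+)$")


def parse_fasta(lines):
    """Yield (label, sequence) pairs from an iterable of fasta lines."""
    label, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if label is not None:
                yield label, "".join(chunks)
            # Keep only the first whitespace-delimited token as the label.
            fields = line[1:].split()
            label, chunks = (fields[0] if fields else ""), []
        else:
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def check_fasta(fasta_lines, mapping_sample_ids):
    """Count sequences that fail each (hypothetical) check."""
    counts = {
        "gaps": 0,
        "invalid_chars": 0,
        "duplicate_labels": 0,
        "bad_label_format": 0,
        "unknown_sample_id": 0,
    }
    seen_labels = set()
    for label, seq in parse_fasta(fasta_lines):
        seq_chars = set(seq.upper())
        if seq_chars & GAP_CHARS:
            counts["gaps"] += 1
        # Gap characters are reported separately, so exclude them here.
        if seq_chars - GAP_CHARS - VALID_NUCLEOTIDES:
            counts["invalid_chars"] += 1
        if label in seen_labels:
            counts["duplicate_labels"] += 1
        seen_labels.add(label)
        match = LABEL_PATTERN.match(label)
        if match is None:
            counts["bad_label_format"] += 1
        elif match.group(1) not in mapping_sample_ids:
            counts["unknown_sample_id"] += 1
    return counts
```

Calling `check_fasta(open("seqs.fasta"), {"S1", "S2"})` would return a dictionary of per-check failure counts for that file. The barcode/primer, tree tip, and equal-length checks are omitted here for brevity.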
Usage: validate_demultiplexed_fasta.py [options]
Input Arguments:
Note
[REQUIRED]
[OPTIONAL]
Output:
Example:
validate_demultiplexed_fasta.py -f seqs.fasta -m Mapping_File.txt