
split_fasta_on_sample_ids.py – Split a single post-split_libraries.py fasta file into per-sample fasta files.

Description:

Split a single post-split_libraries.py fasta file into per-sample fasta files. This script requires that the sequence identifiers are in post-split_libraries.py format (i.e., SampleID_SeqID). One fasta file will be created for each unique SampleID.
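
As an illustration of the SampleID_SeqID format, the following minimal sketch shows one way a SampleID could be recovered from a fasta header. It is not taken from the script's source: the function name sample_id_from_header is hypothetical, and it assumes that the SeqID is whatever follows the last underscore in the first whitespace-delimited token of the header.

```python
def sample_id_from_header(header):
    """Return the SampleID part of a 'SampleID_SeqID' fasta header."""
    # Drop the leading '>' and any comment text after the first whitespace.
    seq_label = header.lstrip(">").split()[0]
    # Assumption: the SeqID is everything after the LAST underscore, so
    # sample IDs that themselves contain underscores are kept whole.
    sample_id, _seq_id = seq_label.rsplit("_", 1)
    return sample_id
```

Under that assumption, a header such as ">PC.634_1 some comment" would be assigned to sample PC.634, and a label such as "my_sample_42" to sample my_sample.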

Usage: split_fasta_on_sample_ids.py [options]

Input Arguments:

[REQUIRED]

-i, --input_fasta_fp
The input fasta file to split
-o, --output_dir
The output directory [default: None]

[OPTIONAL]

--buffer_size
The number of sequences to read into memory before writing to file (you usually won’t need to change this) [default: 500]
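
To make the role of --buffer_size concrete, here is a rough sketch of buffered splitting: sequences are collected in memory per sample and written out once the buffer holds buffer_size sequences. This is an approximation under stated assumptions, not the script's actual implementation. The function name split_fasta, the ".fasta" output file extension, and the last-underscore rule for finding the SampleID are all assumptions.

```python
import os
from collections import defaultdict


def split_fasta(input_fp, output_dir, buffer_size=500):
    """Write one fasta file per SampleID; return the sorted SampleIDs."""
    os.makedirs(output_dir, exist_ok=True)
    buffers = defaultdict(list)  # SampleID -> pending fasta lines
    written = set()              # SampleIDs whose file already exists
    n_buffered = 0               # sequences currently held in memory

    def flush():
        # Write every pending record, then empty the buffer.
        for sample_id, lines in buffers.items():
            # Overwrite on first write, append on later flushes.
            mode = "a" if sample_id in written else "w"
            # Assumption: output files are named <SampleID>.fasta.
            path = os.path.join(output_dir, sample_id + ".fasta")
            with open(path, mode) as out:
                out.writelines(lines)
            written.add(sample_id)
        buffers.clear()

    sample_id = None
    with open(input_fp) as fasta:
        for line in fasta:
            if line.startswith(">"):
                # Flush before starting a new record once the buffer is full.
                if n_buffered >= buffer_size:
                    flush()
                    n_buffered = 0
                # Assumption: SampleID precedes the last underscore.
                sample_id = line[1:].split()[0].rsplit("_", 1)[0]
                n_buffered += 1
            if sample_id is not None:
                buffers[sample_id].append(line)
    flush()  # write whatever is left at end of file
    return sorted(written)
```

In this sketch a larger buffer means fewer file opens at the cost of more memory, which is consistent with the note that the default rarely needs changing.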

Output:

This script will produce an output directory containing one fasta file per sample.

Example:

Split seqs.fna into one fasta file per sample and store the resulting fasta files in ‘out’:

split_fasta_on_sample_ids.py -i seqs.fna -o out/
