sampledoc
News and Announcements »

filter_otus_by_sample.py – Filter OTU mapping file and sequences by SampleIDs

Description:

This filter allows for the removal of sequences and OTUs containing user-specified Sample IDs, for instance, the removal of negative control samples. This script identifies OTUs containing the specified Sample IDs and removes its corresponding sequence from the sequence collection.

Usage: filter_otus_by_sample.py [options]

Input Arguments:

Note

[REQUIRED]

-i, --otu_map_fp
Path to the input OTU map (i.e., the output from pick_otus.py)
-f, --input_fasta_fp
Path to the input fasta file
-s, --samples_to_extract
This is a list of sample ids, which should be removed from the OTU file

[OPTIONAL]

-o, --output_dir
Path to the output directory

Output:

As a result a new OTU and sequence file is generated and written to a randomly generated folder where the name of the folder starts with “filter_by_otus” Also included in the folder, is another FASTA file containing the removed sequences, leaving the user with 3 files.

Example:

The following command can be used, where all options are passed (using the resulting OTU file from pick_otus.py, FASTA file from split_libraries.py and removal of sample ‘PC.636’) with the resulting data being written to the output directory “filtered_otus/”:

filter_otus_by_sample.py -i seqs_otus.txt -f seqs.fna -s PC.636 -o filtered_otus/

sampledoc