sampledoc
News and Announcements »

filter_samples_from_otu_table.py – Filters samples from an OTU table on the basis of the number of observations in that sample, or on the basis of sample metadata. Mapping file can also be filtered to the resulting set of sample ids.

Description:

Usage: filter_samples_from_otu_table.py [options]

Input Arguments:

Note

[REQUIRED]

-i, --input_fp
The input otu table filepath in biom format
-o, --output_fp
The output filepath in biom format

[OPTIONAL]

-m, --mapping_fp
Path to the map file [default: None]
--output_mapping_fp
Path to write filtered mapping file [default: filtered mapping file is not written]
--sample_id_fp
Path to file listing sample ids to keep. Valid formats for the file are: 1) any white space, newline, or tab delimited list of samples, 2) a mapping file with samples in the first column [default: None]
-s, --valid_states
String describing valid states (e.g. ‘Treatment:Fasting’) [default: None]
-n, --min_count
The minimum total observation count in a sample for that sample to be retained [default: 0]
-x, --max_count
The maximum total observation count in a sample for that sample to be retained [default: infinity]
--negate_sample_id_fp
Discard samples specified in –sample_id_fp instead of keeping them [default: False]

Output:

Abundance filtering (low coverage):

Filter samples with fewer than 150 observations from the otu table.

filter_samples_from_otu_table.py -i otu_table.biom -o otu_table_no_low_coverage_samples.biom -n 150

Abundance filtering (high coverage):

Filter samples with greater than 149 observations from the otu table.

filter_samples_from_otu_table.py -i otu_table.biom -o otu_table_no_high_coverage_samples.biom -x 149

Metadata-based filtering (positive):

Filter samples from the table, keeping samples where the value for ‘Treatment’ in the mapping file is ‘Control’

filter_samples_from_otu_table.py -i otu_table.biom -o otu_table_control_only.biom -m map.txt -s 'Treatment:Control'

Metadata-based filtering (negative):

Filter samples from the table, keeping samples where the value for ‘Treatment’ in the mapping file is not ‘Control’

filter_samples_from_otu_table.py -i otu_table.biom -o otu_table_not_control.biom -m map.txt -s 'Treatment:*,!Control'

ID-based filtering:

Keep samples where the id is listed in ids.txt

filter_samples_from_otu_table.py -i otu_table.biom -o filtered_otu_table.biom --sample_id_fp ids.txt

ID-based filtering (negation):

Discard samples where the id is listed in ids.txt

filter_samples_from_otu_table.py -i otu_table.biom -o filtered_otu_table.biom --sample_id_fp ids.txt --negate_sample_id_fp

sampledoc