Description:
This script calculates correlations between feature (aka observation) abundances (relative or absolute) and numeric metadata. Several methods are provided for correlating features with sample metadata values: Spearman’s Rho, Pearson, Kendall’s Tau, and the C-score (checkerboard score). References for these methods are numerous, but good descriptions may be found in ‘Biometry’ by Sokal and Rohlf. A brief description of the available tests follows:
Raw correlation statistics alone reflect only the degree of association between two sequences of numbers (vectors). A likelihood can be assigned to these scores via a p-value, and several methods are available depending on the assumptions made. This script allows four methods for calculating p-values:
Notes:
Usage: observation_metadata_correlation.py [options]
Input Arguments:
Note
[REQUIRED]
[OPTIONAL]
Output:
The output will be a tab-delimited file with the following headers. Each row will record the values calculated for a given feature:
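Because the output is tab-delimited with a header row, it can be loaded with Python's standard csv module. The following is a minimal sketch rather than part of the script: the function name read_correlation_results is hypothetical, and the code deliberately makes no assumption about the header names, taking them from the file itself.

```python
import csv


def read_correlation_results(path):
    """Return the rows of a tab-delimited results file as a list of dicts.

    Keys come from the file's own header line; no column names are assumed.
    All values are returned as strings and need converting before numeric use.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return list(reader)
```

For instance, read_correlation_results("spearman_otu_gradient.txt") would return one dictionary per feature for the file written in Example 1.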
Example 1:
Calculate the correlation between OTUs in the table and the pH of the samples from which they came:
observation_metadata_correlation.py -i otu_table.biom -m map.txt -c pH -s spearman -o spearman_otu_gradient.txt
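To make the statistic in Example 1 concrete, the sketch below computes Spearman's rho for one feature as the Pearson correlation of the ranked values, with tied values given the mean of their ranks. It is an illustration only, not the script's implementation; the function names and the abundance and pH values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical abundances of one OTU across five samples, and their pH.
abundances = [10, 25, 3, 40, 18]
ph = [5.1, 6.0, 4.8, 7.2, 5.5]
rho = spearman(abundances, ph)
```

In this made-up data the abundances increase whenever pH increases, so rho equals 1 up to floating-point rounding.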
Example 2:
Calculate the correlation between OTUs in the table and the pH of the samples from which they came, using bootstrapping and Pearson correlation:
observation_metadata_correlation.py -i otu_table.biom -m map.txt -c pH -s pearson --pval_assignment_method bootstrapped --permutations 100 -o pearson_bootstrapped.txt
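Example 2 pairs Pearson correlation with a resampling-based p-value. The script's own bootstrapped procedure is not described in this section, so the sketch below is an assumption: it uses a simple two-sided permutation test, shuffling the metadata values and counting how often the shuffled correlation is at least as extreme as the observed one. The function names, the seed parameter, and the add-one correction are illustrative choices, not taken from the script.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def permutation_pvalue(x, y, permutations=100, seed=0):
    """Two-sided permutation p-value for the Pearson correlation of x and y.

    Assumed procedure (not the script's): shuffle y, recompute the statistic,
    and count shuffles whose absolute correlation reaches the observed one.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (permutations + 1)
```

With 100 permutations, as in the command above, the smallest p-value this sketch can report is 1/101, which is one reason to raise the permutation count when finer resolution is needed.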