Description:
This script estimates observation (e.g., OTU) richness, i.e., the number of distinct observations, at a given sampling depth (i.e., the number of individuals/sequences per sample). Estimators are provided for both interpolation/rarefaction and extrapolation.
Interpolation/rarefaction applies when the richness is estimated for a smaller number of individuals than the original number of individuals in that sample. We refer to this original sampling depth as the “reference sampling depth” or “reference sample size”.
Extrapolation applies when the richness is estimated for a larger number of individuals than the reference sample size.
This script currently only provides a single unified estimation model for interpolation and extrapolation. This model is the individual-based multinomial model, which uses Chao1 to estimate the full richness of the sample. Please refer to Colwell et al. (2012) for more details; equations 4, 5, 9, 10, 15a, and 15b are used in this script.
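As a rough illustration of the point estimates described above, the following Python sketch computes a Chao1-based estimate of unseen richness, rarefaction for depths at or below the reference sample size, and extrapolation above it. It follows the commonly published forms of these estimators and is not the script's actual code: the function names `chao1_unseen` and `estimate_richness` are hypothetical, standard errors are omitted, and returning `None` when the unseen-richness estimate is zero is a choice made for this sketch. Colwell et al. (2012) remains the authoritative source for the equations.

```python
from math import comb


def chao1_unseen(f1, f2):
    """Chao1 estimate of the number of unseen observations.

    f1 and f2 are the singleton and doubleton counts. A bias-corrected
    form is used when there are no doubletons (assumed form).
    """
    if f2 > 0:
        return f1 ** 2 / (2 * f2)
    return f1 * (f1 - 1) / (2 * (f2 + 1))


def estimate_richness(counts, depth):
    """Estimate richness at `depth` individuals (hypothetical helper).

    `counts` holds the per-observation abundances of one sample.
    Returns None when extrapolation cannot be computed in this sketch.
    """
    counts = [c for c in counts if c > 0]
    n = sum(counts)       # reference sample size
    s_obs = len(counts)   # observed richness

    if depth <= n:
        # Interpolation/rarefaction: expected richness in a random
        # subsample of `depth` individuals drawn without replacement.
        total = comb(n, depth)
        return s_obs - sum(comb(n - c, depth) / total for c in counts)

    # Extrapolation: add a fraction of the estimated unseen richness.
    f1 = counts.count(1)
    f2 = counts.count(2)
    f0_hat = chao1_unseen(f1, f2)
    if f0_hat == 0:
        return None  # the formula below would divide by zero
    m_star = depth - n
    return s_obs + f0_hat * (1 - (1 - f1 / (n * f0_hat)) ** m_star)
```

For a sample with abundances `[5, 3, 2, 1, 1]`, the reference sample size is 12, so `estimate_richness(counts, 6)` interpolates and `estimate_richness(counts, 24)` extrapolates.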
For each interpolation/extrapolation point, the estimate, its unconditional standard error, and confidence interval are reported. The script currently only outputs this information to a table, which can be easily viewed in a program such as Excel. Other output formats, such as plots, may be added in the future.
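To make the relationship between the three reported values concrete, here is a minimal sketch that derives a confidence interval from an estimate and its standard error. It assumes a symmetric, normal-approximation interval, which may differ in detail from what the script computes; the function name `confidence_interval` and its `confidence_level` parameter are hypothetical.

```python
from statistics import NormalDist


def confidence_interval(estimate, std_err, confidence_level=0.95):
    """Symmetric normal-approximation interval (assumed form)."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    half_width = z * std_err
    return estimate - half_width, estimate + half_width
```

With an estimate of 10 and a standard error of 1, the 95% interval under this assumption is roughly 8.04 to 11.96.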
If an estimate is reported as “N/A”, not enough information was present to compute an estimate. This can occur when extrapolating if a sample does not contain any singletons or doubletons, or if there is exactly one singleton and no doubletons. A singleton is defined as an observation with exactly one individual/sequence in the sample. A doubleton is defined as an observation with exactly two individuals/sequences in the sample.
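The conditions above can be checked directly from a sample's abundance counts. The sketch below counts singletons and doubletons and flags the two cases named in this section; the function names are hypothetical and the logic mirrors only the conditions as stated here, not the script's internals.

```python
def singleton_doubleton_counts(counts):
    """Return (singletons, doubletons) for one sample's abundances."""
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return f1, f2


def extrapolation_may_be_na(counts):
    """True for the cases described above: no singletons and no
    doubletons, or exactly one singleton and no doubletons."""
    f1, f2 = singleton_doubleton_counts(counts)
    return f2 == 0 and f1 in (0, 1)
```

For example, a sample with abundances `[4, 4, 3]` has neither singletons nor doubletons, so an extrapolated estimate for it would be expected to appear as “N/A”.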
IMPORTANT: If you use the results of this script in any published works, please be sure to cite the Colwell et al. (2012) paper, as well as QIIME (see http://qiime.org for details).
In addition to Colwell et al. (2012), the following resources were extremely useful while implementing and testing these estimators, so it is appropriate to also acknowledge them here:
References:
Chao, A., N. J. Gotelli, T. C. Hsieh, E. L. Sander, K. H. Ma, R. K. Colwell, and A. M. Ellison. 2013. Rarefaction and extrapolation with Hill numbers: a unified framework for sampling and estimation in biodiversity studies. Ecological Monographs (under revision).
Colwell, R. K. 2013. EstimateS: Statistical estimation of species richness and shared species from samples. Version 9. User’s Guide and application published at: http://purl.oclc.org/estimates.
Colwell, R. K., A. Chao, N. J. Gotelli, S. Y. Lin, C. X. Mao, R. L. Chazdon, and J. T. Longino. 2012. Models and estimators linking individual-based and sample-based rarefaction, extrapolation and comparison of assemblages. Journal of Plant Ecology 5:3-21.
Hsieh, T. C., K. H. Ma, and A. Chao. 2013. iNEXT online: interpolation and extrapolation (Version 1.0) [Software]. Available from http://chao.stat.nthu.edu.tw/inext/.
Shen, T.-J., A. Chao, and C.-F. Lin. 2003. Predicting the number of new species in further taxonomic sampling. Ecology 84:798-804.
Usage: estimate_observation_richness.py [options]
Input Arguments:
[REQUIRED]
[OPTIONAL]
Output:
A single file containing tabular data in TSV format is created in the output directory. Other output formats may be added in the future.
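Because the output is plain TSV, it can also be loaded programmatically. The following is a minimal sketch using Python's standard `csv` module. It assumes the first line of the file is a header row and makes no assumption about column names; the function name `read_estimates_table` is hypothetical, and the path to pass in is whatever file the script wrote to the output directory.

```python
import csv


def read_estimates_table(path):
    """Read a tab-separated table into (header, rows).

    Cells equal to "N/A" are converted to None; all other cells are
    left as strings. Assumes the first line is a header row.
    """
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    if not rows:
        return [], []
    header, body = rows[0], rows[1:]
    cleaned = [[None if cell == "N/A" else cell for cell in row]
               for row in body]
    return header, cleaned
```

Numeric columns can then be converted with `float()` as needed, skipping any `None` values.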
Interpolation and extrapolation of richness:
Estimate the richness of each sample in the input BIOM table using the default sampling depth range, which includes interpolation and extrapolation.
estimate_observation_richness.py -i otu_table.biom -o estimates_out