Appl Environ Microbiol. 2005 Mar;71(3):1501-6.
doi: 10.1128/AEM.71.3.1501-1506.2005.

Introducing DOTUR, a computer program for defining operational taxonomic units and estimating species richness

Patrick D Schloss et al. Appl Environ Microbiol. 2005 Mar.

Abstract

Although copious qualitative information describes the members of the diverse microbial communities on Earth, statistical approaches for quantifying and comparing the numbers and compositions of lineages in communities are lacking. We present a method that addresses the challenge of assigning sequences to operational taxonomic units (OTUs) based on the genetic distances between sequences. We developed a computer program, DOTUR, which assigns sequences to OTUs by using either the furthest, average, or nearest neighbor algorithm for each distance level. DOTUR uses the frequency at which each OTU is observed to construct rarefaction and collector's curves for various measures of richness and diversity. We analyzed 16S rRNA gene libraries derived from Scottish and Amazonian soils and the Sargasso Sea with DOTUR, which assigned sequences to OTUs rapidly and reliably based on the genetic distances between sequences and identified previous inconsistencies and errors in assigning sequences to OTUs. An analysis of the two 16S rRNA gene libraries from soil demonstrated that they do not contain enough sequences to support a claim that they contain different numbers of bacterial lineages with statistical confidence (P > 0.05), nor do they contain enough sequences to provide a robust estimate of species richness when an OTU is defined as containing sequences that are no more than 3% different from each other. In contrast, the richness of OTUs at the 3% level in the Sargasso Sea collection began to plateau after the sampling of 690 sequences. We anticipate that an equivalent extent of sampling for soil would require sampling more than 10,000 sequences, almost 100 times the size of typical sequence collections obtained from soil.
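The OTU assignment step described in the abstract can be illustrated with a minimal sketch. This is not DOTUR's code: the function name `cluster_otus`, the toy distance matrix, and the choice to merge while the linkage distance is at or below the cutoff are all assumptions made for illustration. It shows how the furthest, nearest, and average neighbor rules can group the same sequences differently at one distance level.

```python
from itertools import combinations


def cluster_otus(dist, cutoff, method="furthest"):
    """Group sequence indices into OTUs by agglomerative clustering.

    dist   -- symmetric matrix (list of lists) of pairwise genetic distances
    cutoff -- distance level, e.g. 0.03 for a 3% OTU definition
    method -- "furthest", "nearest", or "average" neighbor linkage
    (Hypothetical helper; the merge-while-<=-cutoff rule is an assumption.)
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Distance between two clusters under the chosen linkage rule.
        pairs = [dist[i][j] for i in a for j in b]
        if method == "furthest":
            return max(pairs)
        if method == "nearest":
            return min(pairs)
        if method == "average":
            return sum(pairs) / len(pairs)
        raise ValueError(f"unknown method: {method}")

    while len(clusters) > 1:
        # Find the closest pair of clusters under the current linkage.
        d, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        if d > cutoff:
            break  # no remaining pair is close enough to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return [sorted(c) for c in clusters]


# Toy distances for four sequences (made-up values).
toy = [
    [0.00, 0.01, 0.04, 0.20],
    [0.01, 0.00, 0.02, 0.21],
    [0.04, 0.02, 0.00, 0.19],
    [0.20, 0.21, 0.19, 0.00],
]
furthest_otus = cluster_otus(toy, 0.03, "furthest")
nearest_otus = cluster_otus(toy, 0.03, "nearest")
```

With these toy values, the furthest neighbor rule keeps sequence 2 apart from sequences 0 and 1 because its largest distance to that pair (0.04) exceeds the cutoff, whereas the nearest neighbor rule joins all three through the 0.02 link.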

Figures

FIG. 1. Rarefaction curves (A) and lineage-through-time plot (B) from DOTUR analysis using the furthest neighbor assignment algorithm with the unimproved Scottish soil 16S rRNA gene library for various distance levels. Error bars represent the 95% CI.
FIG. 2. Rarefaction curves (A) and lineage-through-time plot (B) from DOTUR analysis using the furthest neighbor assignment algorithm with the Amazonian soil 16S rRNA gene library for various distance levels. Error bars represent the 95% CI.
FIG. 3. Rarefaction curve (A) and Chao1 richness estimate collector's curve (B) using partial 16S rRNA gene sequences from the Sargasso Sea metagenomic sequence. Error bars represent the 95% CI.
FIG. 4. Rarefaction curve (A) and Chao1 richness estimate collector's curve (B) using partial rpoB sequences from the Sargasso Sea metagenomic sequence. Error bars represent the 95% CI.
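
The rarefaction curves and Chao1 richness estimates named in the captions are both computed from the number of sequences observed in each OTU. The sketch below is illustrative only and makes two assumptions: it uses the bias-corrected form of the Chao1 estimator, and it uses the analytical expectation for rarefaction in place of random resampling, which is a swapped-in technique and may differ from what DOTUR does. The function names and the OTU counts are hypothetical.

```python
from math import comb


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-OTU sequence counts.

    S_chao1 = S_obs + f1 * (f1 - 1) / (2 * (f2 + 1)),
    where f1 and f2 are the numbers of OTUs seen exactly once and twice.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def rarefy(counts, m):
    """Expected number of OTUs in a random subsample of m sequences.

    Analytical expectation (sampling without replacement); an assumption
    standing in for a resampling-based rarefaction procedure.
    """
    n = sum(counts)
    if not 0 <= m <= n:
        raise ValueError("subsample size must be between 0 and the total")
    # An OTU with c sequences is missed with probability C(n - c, m) / C(n, m).
    return sum(1 - comb(n - c, m) / comb(n, m) for c in counts if c > 0)


# Made-up OTU abundances: six OTUs, thirteen sequences in total.
otu_counts = [5, 3, 2, 1, 1, 1]
richness = chao1(otu_counts)
curve = [rarefy(otu_counts, m) for m in range(sum(otu_counts) + 1)]
```

Plotting `curve` against the subsample size gives a rarefaction curve; a curve that is still rising at the full sample size suggests, as the abstract argues for the soil libraries, that more sequences are needed before richness can be estimated robustly.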
