Direct sequencing of the human microbiome readily reveals community differences
Genome Biology volume 11, Article number: 210 (2010)
Culture-independent studies of human microbiota by direct genomic sequencing reveal quite distinct differences among communities, indicating that improved sequencing capacity can be most wisely utilized to study more samples, rather than more sequences per sample.
In the past few years, the availability of improved sequencing methods, including pyrosequencing, has revolutionized what we know about the microbes that inhabit our bodies. Although it has been known for decades that our microbial symbionts outnumber our own cells by about a factor of 10, the differences in the repertoires of symbionts harbored by different healthy individuals, by different sites within the individual, and by individuals over time are only now coming to light. Initially, it was assumed that a 'core microbiome' existed; that is, that a substantial number of microbial species was shared in each body habitat in all or most humans, and that the genomes of these core species could be used as scaffolds to assemble fragmentary data from short-read shotgun sequencing of microbial community DNA.
However, the first three individuals whose gut microbiomes were surveyed using substantial numbers of 16S rRNA gene sequences shared few of their species. Similarly, observations that a person's left and right hands have only 17% of bacterial species in common, and that two different people's hands share only 13%, cast doubt on the concept of a substantial core set of microbial species shared by all or most people. This doubt has been reinforced by recent work that treats lineages or genes as 'core' even if they are shared by relatively few people [6, 7]. In fact, on the basis of 16S rRNA gene analyses we can rule out the possibility that, even within relatively homogeneous small populations of fewer than 100 individuals, everyone's skin-surface communities or gut communities share more than a tiny fraction of species [6–8]. This unanticipated variability in shared community membership, and also in other important aspects of the human microbiome, poses substantial conceptual and computational challenges.
Of particular importance for microbiome studies is the following question: what is the effect size? That is, using standard terminology from statistics, how distinguishable are two communities or groups of communities? Obtaining an answer is essential for addressing many practical concerns with experimental design. For example, the effect size determines how many individuals need to be recruited for a given study, and how many sequences need to be collected per sample to observe differences if they exist. These considerations are particularly important for the study of systemic disorders such as diabetes or some autoimmune disorders, which are expected to influence the microbiome in multiple body habitats. We need a sense of how much variation exists among different body habitats, how much variation is observed among healthy individuals for the same body habitat, and how much of a shift occurs due to a pathophysiologic state. It is also important to define the most appropriate method for determining the magnitude of similarity or difference between communities, as the choice of method has a large influence on the results of community comparisons [9–12]. A general discussion of the pros and cons of different metrics of community overlap is beyond the scope of this paper (see [9–12] for reviews). Here, we summarize the types and sizes of effects found in studies that used various methods of comparing groups of samples, and look for large-scale patterns that can give information on the number of individuals and sequences that are needed to observe different types of effects (Figure 1).
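The link between effect size and the number of individuals to recruit can be illustrated with a textbook power calculation. The sketch below is not taken from the studies reviewed here; it assumes a two-sided comparison of the means of two equally sized groups, a standardized effect size (difference in means divided by the common standard deviation), and a normal approximation. The function name and default values are illustrative choices.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate number of individuals per group needed to detect a
    standardized difference in means (normal approximation, two-sided test).

    Assumption: two equally sized groups with a common standard deviation.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()                       # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)     # critical value for the test
    z_power = z.inv_cdf(power)             # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Larger effects need far fewer individuals per group.
for d in (0.2, 0.5, 0.8, 2.0):
    print(f"effect size {d}: {samples_per_group(d)} per group")
```

Under these assumptions the required group size scales with the inverse square of the effect size, which is why knowing whether an effect is large or small matters so much when planning recruitment.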
A variety of interrelated features differentiate microbial communities. These features include the relative abundance of specific taxa (the proportion of the bacteria in the sample that are Firmicutes, for example), the level of species richness or diversity observed within a community (alpha diversity), and the degree to which different communities share membership or structure (beta diversity). A major challenge in comparing studies is that there is no consistent way in which the size of community differences is reported, as the type of difference that is relevant depends on the study. For example, lean and obese mice and humans differ in their ratios of prominent bacterial phyla (Bacteroidetes (which include the common gut commensal Bacteroides), Firmicutes (Gram-positive bacteria, including Lactobacillus and Clostridium), and Actinobacteria (which include Corynebacteria and Mycobacteria) [13–15]); men's and women's hands differ in the number of species-level phylotypes (defined as organisms with 16S sequence identity >97%) observed on average; and samples from the same or similar sites on the bodies of different individuals cluster together using UniFrac-based principal coordinates analysis [4, 16, 17]. UniFrac is a metric for comparing microbial communities using phylogenetic information, which has been implemented in several tools.
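As a concrete illustration of these three kinds of feature, the following minimal sketch computes relative abundance, two alpha diversity measures (observed richness and the Shannon index), and two beta diversity measures (fraction of shared phylotypes and Bray-Curtis dissimilarity) from per-sample phylotype counts. The sample data and function names are invented for illustration. Bray-Curtis is used here as a simple non-phylogenetic stand-in; it is not UniFrac, which additionally requires a phylogenetic tree.

```python
import math


def relative_abundance(counts):
    """Proportion of sequences assigned to each phylotype in one sample."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no sequences")
    return {taxon: n / total for taxon, n in counts.items() if n > 0}


def observed_richness(counts):
    """Alpha diversity: number of phylotypes observed at least once."""
    return sum(1 for n in counts.values() if n > 0)


def shannon_index(counts):
    """Alpha diversity: Shannon index (natural log) of relative abundances."""
    return -sum(p * math.log(p) for p in relative_abundance(counts).values())


def shared_fraction(a, b):
    """Beta diversity (membership): shared phylotypes / phylotypes in either."""
    present_a = {t for t, n in a.items() if n > 0}
    present_b = {t for t, n in b.items() if n > 0}
    union = present_a | present_b
    return len(present_a & present_b) / len(union) if union else 0.0


def bray_curtis(a, b):
    """Beta diversity (structure): Bray-Curtis on relative abundances."""
    pa, pb = relative_abundance(a), relative_abundance(b)
    return 0.5 * sum(abs(pa.get(t, 0.0) - pb.get(t, 0.0)) for t in set(pa) | set(pb))


# Invented counts for two hypothetical samples.
sample_1 = {"phylotype_A": 50, "phylotype_B": 30, "phylotype_C": 20}
sample_2 = {"phylotype_A": 20, "phylotype_C": 20, "phylotype_D": 60}

print(relative_abundance(sample_1))
print(observed_richness(sample_1), shannon_index(sample_1))
print(shared_fraction(sample_1, sample_2), bray_curtis(sample_1, sample_2))
```

The same pair of samples can look quite similar by one measure and quite different by another, which is one reason effect sizes are hard to compare across studies that report different features.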
Because of the diverse ways in which microbial communities respond to various environmental factors, it is difficult to compare effect sizes across different studies or systems, as an analysis that highlights differences in one system may obscure them in another. Thus, in what follows, we review effect types and sizes as reported by the authors of individual studies. We focus on variation in human-associated microbial community diversity as assessed by 16S rRNA gene sequence surveys of abundant lineages, using various measures of both within- and between-sample diversity (alpha and beta diversity, respectively). We review comparisons of microbial communities in relation to both sampling depth (that is, number of sequences per sample) and breadth (that is, number of samples or individuals). We then perform simulations using an atlas of microbes associated with different sites in the human body to ask how many sequences per sample are needed in order to detect differences across individuals, time, and locations within the body.
Reported effect sizes between and within different body habitats
Table 1a provides an illustrative (though not exhaustive) overview of the literature regarding differences observed in different body habitats and locations in healthy individuals, and the number of subjects and sequences that were used to identify these differences. Although metagenomic studies that examine all the genes in the genome are also of immense interest, shotgun metagenomic data are so far available only from the gut and for relatively few samples, and so the range of questions that can be addressed at present is substantially more limited than for 16S rRNA-based surveys, the type of survey we consider here. One robust finding that exemplifies relative effect sizes is that there appears to be a greater degree of variation in microbial community composition between individuals than within the same individual over time (Table 1a). This has been found to be true in multiple studies and over a wide range of body habitats. For example, gut community composition is relatively stable in the same individual across a period of months when diet is consistent [6, 16], and even to a certain degree when diet is altered. (Changes in the Firmicutes:Bacteroidetes ratio have been reported in individuals who lost weight, whether they were consuming low-calorie fat- or carbohydrate-restricted diets, but despite these shifts in relative abundance, interpersonal variation was the largest effect observed using phylogenetic comparisons of the communities.) Likewise, skin community composition is more similar within a subject than between subjects over a period of months [16, 18], as are oral, nasal and external auditory canal communities. These results indicate that, in terms of the bacteria you harbor, you are likely to be more similar to yourself in 3 months' time than to your friend today.
Microbial community changes in human disease and environmental samples
Although a wide range of studies in healthy subjects have identified substantial interpersonal variation in overall microbial community composition, how do these effect sizes compare with differences correlated with disease, or in response to treatments of various environmental samples? To address this question, we reviewed culture-independent, 16S rRNA gene-based surveys associated with different physiological conditions (Table 1b) and associated with experimental manipulations in non-human environments (which were surprisingly scarce; Table 1c).
One of the best-characterized effects of health status on the gut microbiome is the association between obesity and the proportional representation of Bacteroidetes, Firmicutes and Actinobacteria [6, 13–15]. Studies in mice indicate that the microbiota contributes to the obese state by providing the host with a greater amount of energy from the diet compared with the microbiota of a lean host, as well as by manipulating host genes that regulate the deposition of energy in adipocytes. The obesity-associated microbiomes of humans (and mice) are enriched in functional genes for certain types of carbohydrate metabolism, and this is directly attributable to the reduction in the numbers of genomes of members of the Bacteroidetes [6, 15].
However, even the size of the differences in gut bacterial community composition of obese versus lean hosts is debated, as different studies using different methodologies have returned varied results. The impact of methodology is particularly evident in a study of twins concordant for obesity or leanness, in which the observed relative abundances of Bacteroidetes, Actinobacteria and Firmicutes, as judged by sequencing of different regions of 16S rRNA clones, depended on the sequencing approach: pyrosequencing of PCR products, Sanger sequencing of 16S rRNA clones, or shotgun sequencing and phylogenetic classification of reads. Nevertheless, the direction of the effect was consistent across methodologies, and was detectable with as few as a couple of hundred sequences per sample.
Observable phenotypes such as obesity may be caused by a variety of underlying factors, and which of those factors is responsible for shifts in the host's microbiota is difficult to address in such correlative studies. Experimental manipulations of microbial communities, however, allow determination of the relative effects of specific variables on overall community composition or the abundance of particular taxa, and as such, allow researchers to draw conclusions regarding cause and effect. Examples of experimental manipulations of non-human environments that used 16S rRNA gene sequencing approaches (either clone libraries or pyrosequencing) and that were well enough replicated to allow statistical analysis are shown in Table 1c. For soil samples, three to four replicates with 70 to 100 sequences were sufficient to observe differences in microbial communities due to land use and moisture regimes [21, 22]. For piglet gut microbiota, the effects of antibiotics on overall community composition were evident with as few as 96 sequences per sample. It would be fascinating to test whether similar antibiotic-induced effects in outbred populations of humans with diverse diets can be found with relatively few sequences. Similarly, it would be important to consider sampling depth under human physiological conditions in cases where the effect size is known to be large, for example, in the development of the infant gut microbiota.
Has the depth of sequencing used up to now really been necessary?
The literature reviewed in Table 1 reports how many sequences were used to reveal a variety of different effects. Could the same results have been achieved with less sequencing? To begin to address this question, we carried out a limited reanalysis of a study of multiple body habitats by Costello et al., which encompasses variability explained by nested factors with different effect sizes (Box 1).
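The general idea of asking whether less sequencing would have sufficed can be sketched by subsampling (rarefying) each sample to a smaller number of sequences and checking whether communities from different habitats remain more different than repeated subsamples of the same community. The sketch below is a toy illustration under stated assumptions, not the reanalysis reported in Box 1: the counts are invented, the habitats are hypothetical, and Bray-Curtis dissimilarity is used in place of a phylogenetic metric.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Draw `depth` sequences without replacement from one sample's counts."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of sequences in the sample")
    return Counter(rng.sample(pool, depth))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity computed on relative abundances."""
    total_a, total_b = sum(a.values()), sum(b.values())
    taxa = set(a) | set(b)
    return 0.5 * sum(abs(a.get(t, 0) / total_a - b.get(t, 0) / total_b) for t in taxa)


# Invented counts for two hypothetical habitats (10,000 sequences each).
habitat_1 = {"taxon_A": 6000, "taxon_B": 3000, "taxon_C": 1000}
habitat_2 = {"taxon_C": 500, "taxon_D": 7000, "taxon_E": 2500}

rng = random.Random(0)  # fixed seed so the run is repeatable
for depth in (50, 100, 1000):
    within = bray_curtis(rarefy(habitat_1, depth, rng), rarefy(habitat_1, depth, rng))
    between = bray_curtis(rarefy(habitat_1, depth, rng), rarefy(habitat_2, depth, rng))
    print(f"depth={depth:5d}  within={within:.2f}  between={between:.2f}")
```

When the underlying difference between communities is large, as in this contrived example, the between-habitat dissimilarity stays well above the within-habitat dissimilarity even at shallow depth; smaller effects would require deeper sampling or more samples to separate from subsampling noise.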
In conclusion, the results described here, and previously reported [8, 37], show that arbitrarily choosing to generate large numbers of sequences may not be the most cost-effective way to identify changes in microbial communities associated with different physiological or pathophysiological states. Instead, we call for a few standardized methods to assess differences among microbial communities, which will allow for effect size and power calculations, and therefore a considered assessment of the number of individuals and sequences required to differentiate among given communities. The following four methods have been successful in a range of studies: differences in alpha diversity (number of phylotypes observed or extrapolated); differences in abundance of specific lineages; differences in location on a principal coordinates plot obtained from UniFrac distances or other metrics; and the F_ST measure described in the previous section.
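For the last of these methods, the exact formulation used by the authors is not reproduced in this section. As a hedged illustration only, the sketch below assumes one common distance-based form of the statistic, in which the mean pairwise distance within groups is compared with the mean pairwise distance between groups: (mean between - mean within) / mean between. The function name, the toy distance matrix, and the group labels are all hypothetical.

```python
from itertools import combinations


def fst_like(matrix, labels):
    """F_ST-style statistic from a symmetric distance matrix.

    Assumed form: (mean between-group distance - mean within-group distance)
    divided by the mean between-group distance. Values near 1 indicate
    strongly separated groups; values near 0 indicate little separation.
    """
    within, between = [], []
    for i, j in combinations(range(len(labels)), 2):
        (within if labels[i] == labels[j] else between).append(matrix[i][j])
    if not within or not between:
        raise ValueError("need at least one within-group and one between-group pair")
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    if mean_between == 0:
        raise ValueError("mean between-group distance is zero")
    return (mean_between - mean_within) / mean_between


# Invented distances among four samples from two hypothetical habitats.
labels = ["gut", "gut", "skin", "skin"]
distances = [
    [0.0, 0.2, 0.8, 0.8],
    [0.2, 0.0, 0.8, 0.8],
    [0.8, 0.8, 0.0, 0.2],
    [0.8, 0.8, 0.2, 0.0],
]
print(fst_like(distances, labels))
```

Because it reduces a full distance matrix to a single number, a statistic of this kind is convenient for comparing how well the same grouping is resolved at different sequencing depths or sample numbers.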
The rapid increase in sequencing capacity provides a spectacular opportunity to advance the field in ways that were unimaginable even 3 years ago. How can individual investigators, or groups of investigators, use these resources most wisely at this unique moment of democratization of the ability to perform sequence-based studies? The data summarized here suggest that study designs consisting of tens of thousands of samples sequenced at shallow coverage will be highly informative (depending on the effect size), and such studies are possible with the instruments available today. Given recent observations that inter-habitat and inter-personal variations are large effects, we believe that individual researchers can and should seize the opportunity provided by these findings to analyze vast numbers of samples at low coverage (for example, 100 to 1,000 sequences). At this number of samples, detailed exploration of spatial and temporal dynamics of microbial communities will be possible, as will comparisons of large patient populations. In addition, replicate samples can be acquired and analyzed without too strongly impairing the breadth of an investigation, allowing more robust experimental designs to be implemented. One can envisage that perhaps within the next few years, a group of motivated high-school students might, for a science-fair project, be able to track movements in microbes between humans and their pets and livestock across the planet. These studies, especially when combined with hypothesis-driven approaches to understanding the effects of factors such as diet and antibiotic exposure, could go far beyond even the largest purely observational studies being contemplated today.
Such studies will yield an overall map of variation within the human microbial ecosystem, and relate differences to specific physiological states within and between individuals in a manner that is replicated across individuals. These studies will serve as a framework to identify and compare the shifts that take place in the microbial community that are related to specific disorders.
Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Irzyk GP, Jando SC, Alenquer ML, Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, et al: Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005, 437: 376-380.
Van Houte J, Gibbons RJ: Studies of the cultivable flora of normal human feces. Antonie Van Leeuwenhoek. 1966, 32: 212-222. 10.1007/BF02097463.
Turnbaugh PJ, Ley RE, Hamady M, Fraser-Liggett CM, Knight R, Gordon JI: The human microbiome project. Nature. 2007, 449: 804-810. 10.1038/nature06244.
Eckburg PB, Bik EM, Bernstein CN, Purdom E, Dethlefsen L, Sargent M, Gill SR, Nelson KE, Relman DA: Diversity of the human intestinal microbial flora. Science. 2005, 308: 1635-1638. 10.1126/science.1110591.
Fierer N, Hamady M, Lauber CL, Knight R: The influence of sex, handedness, and washing on the diversity of hand surface bacteria. Proc Natl Acad Sci USA. 2008, 105: 17994-17999. 10.1073/pnas.0807920105.
Turnbaugh PJ, Hamady M, Yatsunenko T, Cantarel BL, Duncan A, Ley RE, Sogin ML, Jones WJ, Roe BA, Affourtit JP, Egholm M, Henrissat B, Heath AC, Knight R, Gordon JI: A core gut microbiome in obese and lean twins. Nature. 2009, 457: 480-484. 10.1038/nature07540.
Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, Manichanh C, Nielsen T, Pons N, Levenez F, Yamada T, Mende DR, Li J, Xu J, Li S, Li D, Cao J, Wang B, Liang H, Zheng H, Xie Y, Tap J, Lepage P, Bertalan M, Batto JM, Hansen T, Le Paslier D, Linneberg A, Nielsen HB, Pelletier E, Renault P, et al: A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 464: 59-65. 10.1038/nature08821.
Hamady M, Knight R: Microbial community profiling for human microbiome projects: Tools, techniques, and challenges. Genome Res. 2009, 19: 1141-1152. 10.1101/gr.085464.108.
Legendre P, Gallagher ED: Ecologically meaningful transformations for ordinations of species data. Oecologia. 2001, 129: 271-280. 10.1007/s004420100716.
Lozupone CA, Knight R: Species divergence and the measurement of microbial diversity. FEMS Microbiol Rev. 2008, 32: 557-578. 10.1111/j.1574-6976.2008.00111.x.
Magurran AE: Measuring Biological Diversity. 2004, Oxford: Blackwell
Martin AP: Phylogenetic approaches for describing and comparing the diversity of microbial communities. Appl Environ Microbiol. 2002, 68: 3673-3682. 10.1128/AEM.68.8.3673-3682.2002.
Ley RE, Backhed F, Turnbaugh P, Lozupone CA, Knight RD, Gordon JI: Obesity alters gut microbial ecology. Proc Natl Acad Sci USA. 2005, 102: 11070-11075. 10.1073/pnas.0504978102.
Ley RE, Turnbaugh PJ, Klein S, Gordon JI: Microbial ecology: human gut microbes associated with obesity. Nature. 2006, 444: 1022-1023. 10.1038/4441022a.
Turnbaugh PJ, Ley RE, Mahowald MA, Magrini V, Mardis ER, Gordon JI: An obesity-associated gut microbiome with increased capacity for energy harvest. Nature. 2006, 444: 1027-1031. 10.1038/nature05414.
Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI, Knight R: Bacterial community variation in human body habitats across space and time. Science. 2009, 326: 1694-1697. 10.1126/science.1177486.
Fierer N, Lauber CL, Zhou N, McDonald D, Costello EK, Knight R: Forensic identification using skin bacterial communities. Proc Natl Acad Sci USA. 2010, 107: 6477-6481. 10.1073/pnas.1000162107.
Grice EA, Kong HH, Conlan S, Deming CB, Davis J, Young AC, NISC Comparative Sequencing Program, Bouffard GG, Blakesley RW, Murray PR, Green ED, Turner ML, Segre JA: Topographical and temporal diversity of the human skin microbiome. Science. 2009, 324: 1190-1192. 10.1126/science.1171700.
Backhed F, Ding H, Wang T, Hooper LV, Koh GY, Nagy A, Semenkovich CF, Gordon JI: The gut microbiota as an environmental factor that regulates fat storage. Proc Natl Acad Sci USA. 2004, 101: 15718-15723. 10.1073/pnas.0407076101.
Ley RE: Obesity and the human microbiome. Curr Opin Gastroenterol. 26: 5-11. 10.1097/MOG.0b013e328333d751.
Castro HF, Classen AT, Austin EE, Norby RJ, Schadt CW: Soil microbial community responses to multiple experimental climate change drivers. Appl Environ Microbiol. 2010, 76: 999-1007. 10.1128/AEM.02874-09.
Hartman WH, Richardson CJ, Vilgalys R, Bruland GL: Environmental and anthropogenic controls over bacterial communities in wetland soils. Proc Natl Acad Sci USA. 2008, 105: 17842-17847. 10.1073/pnas.0808254105.
Rettedal E, Vilain S, Lindblom S, Lehnert K, Scofield C, George S, Clay S, Kaushik RS, Rosa AJ, Francis D, Brözel VS: Alteration of the ileal microbiota of weanling piglets by the growth-promoting antibiotic chlortetracycline. Appl Environ Microbiol. 2009, 75: 5489-5495. 10.1128/AEM.02220-08.
Dethlefsen L, Huse S, Sogin ML, Relman DA: The pervasive effects of an antibiotic on the human gut microbiota, as revealed by deep 16S rRNA sequencing. PLoS Biol. 2008, 6: e280. 10.1371/journal.pbio.0060280.
Palmer C, Bik EM, Digiulio DB, Relman DA, Brown PO: Development of the human infant intestinal microbiota. PLoS Biol. 2007, 5: e177. 10.1371/journal.pbio.0050177.
Lozupone C, Knight R: UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol. 2005, 71: 8228-8235. 10.1128/AEM.71.12.8228-8235.2005.
Clarke KR, Gorley RN: Primer v6. [http://www.primer-e.com/]
Anderson MJ: Distance-based tests for homogeneity of multivariate dispersions. Biometrics. 2006, 62: 245-253. 10.1111/j.1541-0420.2005.00440.x.
Lozupone CA, Knight R: Global patterns in bacterial diversity. Proc Natl Acad Sci USA. 2007, 104: 11436-11440. 10.1073/pnas.0611525104.
Ley RE, Lozupone CA, Hamady M, Knight R, Gordon JI: Worlds within worlds: evolution of the vertebrate gut microbiota. Nat Rev Microbiol. 2008, 6: 776-788. 10.1038/nrmicro1978.
Tamames J, Abellan JJ, Pignatelli M, Camacho A, Moya A: Environmental distribution of prokaryotic taxa. BMC Microbiol. 2010, 10: 85. 10.1186/1471-2180-10-85.
Auguet JC, Barberan A, Casamayor EO: Global ecological patterns in uncultured Archaea. ISME J. 2010, 4: 182-190. 10.1038/ismej.2009.109.
Holsinger KE, Weir BS: Genetics in geographically structured populations: defining, estimating and interpreting F(ST). Nat Rev Genet. 2009, 10: 639-650. 10.1038/nrg2611.
Hudson RR, Slatkin M, Maddison WP: Estimation of levels of gene flow from DNA sequence data. Genetics. 1992, 132: 583-589.
Slatkin M: Inbreeding coefficients and coalescence times. Genet Res. 1991, 58: 167-175. 10.1017/S0016672300029827.
Sogin ML, Morrison HG, Huber JA, Mark Welch D, Huse SM, Neal PR, Arrieta JM, Herndl GJ: Microbial diversity in the deep sea and the underexplored 'rare biosphere'. Proc Natl Acad Sci USA. 2006, 103: 12115-12120. 10.1073/pnas.0605127103.
Ley RE, Hamady M, Lozupone C, Turnbaugh PJ, Ramey RR, Bircher JS, Schlegel ML, Tucker TA, Schrenzel MD, Knight R, Gordon JI: Evolution of mammals and their gut microbes. Science. 2008, 320: 1647-1651. 10.1126/science.1155725.
Nasidze I, Li J, Quinque D, Tang K, Stoneking M: Global diversity in the human salivary microbiome. Genome Res. 2009, 19: 636-643. 10.1101/gr.084616.108.
Zaura E, Keijser BJ, Huse SM, Crielaard W: Defining the healthy 'core microbiome' of oral microbial communities. BMC Microbiol. 2009, 9: 259. 10.1186/1471-2180-9-259.
Gao Z, Tseng CH, Pei Z, Blaser MJ: Molecular analysis of human forearm superficial skin bacterial biota. Proc Natl Acad Sci USA. 2007, 104: 2927-2932. 10.1073/pnas.0607077104.
Larsen N, Vogensen FK, Berg van den FW, Nielsen DS, Andreasen AS, Pedersen BK, Al-Soud WA, Sorensen SJ, Hansen LH, Jakobsen M: Gut microbiota in human adults with type 2 diabetes differs from non-diabetic adults. PLoS One. 5: e9085. 10.1371/journal.pone.0009085.
Gophna U, Sommerfeld K, Gophna S, Doolittle WF, Veldhuyzen van Zanten SJ: Differences between tissue-associated intestinal microfloras of patients with Crohn's disease and ulcerative colitis. J Clin Microbiol. 2006, 44: 4136-4141. 10.1128/JCM.01004-06.
Bibiloni R, Mangold M, Madsen KL, Fedorak RN, Tannock GW: The bacteriology of biopsies differs between newly diagnosed, untreated, Crohn's disease and ulcerative colitis patients. J Med Microbiol. 2006, 55: 1141-1149. 10.1099/jmm.0.46498-0.
Frank DN, St Amand AL, Feldman RA, Boedeker EC, Harpaz N, Pace NR: Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. Proc Natl Acad Sci USA. 2007, 104: 13780-13785. 10.1073/pnas.0706625104.
Wang Y, Hoenig JD, Malin KJ, Qamar S, Petrof EO, Sun J, Antonopoulos DA, Chang EB, Claud EC: 16S rRNA gene-based analysis of fecal microbiota from preterm infants with and without necrotizing enterocolitis. ISME J. 2009, 3: 944-954. 10.1038/ismej.2009.37.
Chang JY, Antonopoulos DA, Kalra A, Tonelli A, Khalife WT, Schmidt TM, Young VB: Decreased diversity of the fecal microbiome in recurrent Clostridium difficile-associated diarrhea. J Infect Dis. 2008, 197: 435-438. 10.1086/525047.
Dicksved J, Lindberg M, Rosenquist M, Enroth H, Jansson JK, Engstrand L: Molecular characterization of the stomach microbiota in patients with gastric cancer and in controls. J Med Microbiol. 2009, 58: 509-516. 10.1099/jmm.0.007302-0.
Bik EM, Eckburg PB, Gill SR, Nelson KE, Purdom EA, Francois F, Perez-Perez G, Blaser MJ, Relman DA: Molecular analysis of the bacterial microbiota in the human stomach. Proc Natl Acad Sci USA. 2006, 103: 732-737. 10.1073/pnas.0506655103.
Crawford PA, Crowley JR, Sambandam N, Muegge BD, Costello EK, Hamady M, Knight R, Gordon JI: Regulation of myocardial ketone body metabolism by the gut microbiota during nutrient deprivation. Proc Natl Acad Sci USA. 2009, 106: 11276-11281. 10.1073/pnas.0902366106.
Hildebrandt MA, Hoffmann C, Sherrill-Mix SA, Keilbaugh SA, Hamady M, Chen YY, Knight R, Ahima RS, Bushman F, Wu GD: High-fat diet determines the composition of the murine gut microbiome independently of obesity. Gastroenterology. 2009, 137: 1716-1724. 10.1053/j.gastro.2009.08.042.
Suchodolski JS, Dowd SE, Westermarck E, Steiner JM, Wolcott RD, Spillmann T, Harmoinen JA: The effect of the macrolide antibiotic tylosin on microbial diversity in the canine small intestine as demonstrated by massive parallel 16S rRNA gene sequencing. BMC Microbiol. 2009, 9: 210. 10.1186/1471-2180-9-210.
We thank the Crohn's and Colitis Foundation of America, the Bill and Melinda Gates Foundation, the HHMI and the NIH for support of work by the authors cited in this review.