- Open Access
Temporal and technical variability of human gut metagenomes
Genome Biology volume 16, Article number: 73 (2015)
Metagenomics has become a prominent approach for exploring the role of the gut microbiota in human health. However, the temporal variability of the healthy gut microbiome has not yet been studied in depth using metagenomics and little is known about the effects of different sampling and preservation approaches. We performed metagenomic analysis on fecal samples from seven subjects collected over a period of up to two years to investigate temporal variability and assess preservation-induced variation, specifically, fresh frozen compared to RNALater. We also monitored short-term disturbances caused by antibiotic treatment and bowel cleansing in one subject.
We find that the human gut microbiome is temporally stable and highly personalized at both taxonomic and functional levels. Over multiple time points, samples from the same subject clustered together, even in the context of a large dataset of 888 European and American fecal metagenomes. One exception was observed in an antibiotic intervention case where, more than one year after the treatment, samples did not resemble the pre-treatment state. Clustering was not affected by the preservation method. No species differed significantly in abundance, and only 0.36% of gene families were differentially abundant between preservation methods.
Technical variability is small compared to the temporal variability of an unperturbed gut microbiome, which in turn is much smaller than the observed between-subject variability. Thus, short-term preservation of fecal samples in RNALater is an appropriate and cost-effective alternative to freezing of fecal samples for metagenomic studies.
Microbial communities that inhabit the human gut are essential to human health. To better understand the role of gut microbes in health, major efforts have been undertaken including large-scale studies such as the European Metagenomics of the Human Intestinal Tract (MetaHIT) project and the US American Human Microbiome Project (HMP) [1,2]. These studies have provided insights into the gut microbial community composition in healthy human individuals. Changes in the microbial community composition have been associated with diet [3,4] as well as with multiple diseases, such as atherosclerosis, inflammatory bowel diseases and obesity [5-7].
In addition to these cross-sectional studies that compared healthy and diseased cohorts, longitudinal studies have helped shed light not only on the community compositional variability but also on the temporal variability, providing a more complete picture of the factors that shape the gut microbiome in health and disease. Several studies have demonstrated considerable between-subject variability of the gut microbial composition. However, the gut microbiome has been described to be constrained around a highly personal and stable composition within each healthy subject over time [8-12].
Perturbation of the human gut microbiome is known to occur as a result of antibiotics treatment, a frequently prescribed medication. Antibiotic intervention leads to a rapid decrease of diversity and post-treatment recovery is slow and incomplete, even up to 4 years after the treatment [13-17]. Resistant bacterial species, as a result of antibiotics treatment, can persist over years [18-20] and the resistance potential of gut microbiota displays regional differences [21,22]. Similarly, there are indications of other long-term community shifts caused by endogenous (for example, disease) or environmental perturbations (for example, diet and lifestyle change [3,4,23]) that have not yet been studied in depth.
Studies on the temporal variability of the gut microbiome have mostly been performed over short periods (weeks to one year; for example, [8,12,23,24]) and only rarely over long periods (5 and 12 years [9,11]). The methods of deriving the taxonomic community composition were primarily based on PCR-denaturing gradient gel electrophoresis (PCR-DGGE; for example, ), 16S rRNA gene sequencing (for example, [26,27]), and the HITChip microarray (for example, [11,28]). Only two studies [12,29] have so far analyzed longitudinal non-amplified metagenomic shotgun sequencing data that were collected from 43 subjects in the context of the HMP . However, the majority (41 out of 43) were only sampled twice, making it difficult to assess temporal stability.
Despite their common aim to better understand microbial community shifts over time, the aforementioned studies do not attempt to quantify different sources of variability, from technical to biological ones. In particular, technical aspects have been shown to be important for the comparison between data sets. Limited comparability in human microbiome data sets often results from differences in sample preservation and DNA isolation protocols as well as readout methods (for example, sequencing of different 16S rRNA gene regions or application of different sequencing technologies). A meta-analysis  assessing the effect size of technical differences on data comparability showed that samples rather cluster by study or the methods applied (for example, for DNA isolation) than by the parameter of interest (for example, disease state). To counteract these batch effects, the International Human Microbiome Standards (IHMS) project was launched to suggest standards for sample processing (mainly DNA isolation) with the goal to maximize future data comparability. However, different storage conditions of a fecal sample can also impact the compositional readout, as different microbes respond differently to environmental exposure . Research in this direction has been conducted previously to compare different storage and preservation conditions (for example, different temperatures or preservatives such as RNALater) [32,33]. RNALater, a quaternary ammonium salts-based solution, is commonly used as a logistically convenient solution to preserve RNA from biological samples at room temperature when freezing is not possible, and was recently also considered for omics technologies . It was shown to have a minor effect on the recovered composition and thus represents a potential alternative to immediate freezing [35-37]. To date, the technical variability on a taxonomic and functional level has not been put in the context of temporal and within-sample variability (meaning within the stool from a single bowel movement).
We collected fecal samples over up to two years from seven subjects to investigate the temporal variability and individuality of the human gut microbiome using metagenomic shotgun sequencing. To disentangle technical, temporal and between-subject variability we contrasted the variability of microbial community composition within a fecal sample [38,39] with the variability introduced by different preservation methods, RNALater or freezing after two different time intervals. By comparing the fecal metagenomes of the seven subjects over time and in the context of 888 published metagenomes, we generally found between-subject variability to be much larger than within-subject variability. This high degree of individuality can, however, be disrupted by antibiotic treatment, which in one subject triggered a large and long-lasting community shift. Bowel cleanse was also investigated but did not appear to cause a major disturbance. Technical variability (within-sample and preservation-induced variability) was smaller than temporal within-subject variability and therefore we propose RNALater as an alternative to fresh freezing fecal samples.
Results and discussion
Fecal samples were self-collected from seven adults at short (few days) and longer (weeks to months) time intervals (Additional file 1). All subjects were considered healthy at the time of sampling, unless stated otherwise (see Material and methods). The study was split into five sub-studies as shown in Figure 1. Out of the seven subjects, five subjects performed sampling for more than one year while three subjects collected over more than two years (sub-study 3). At two time points, seven days (d7; sub-study 1) and 392 days (d392; sub-study 2) after the first sampling event, feces from three and five subjects, respectively, were collected and replicates either frozen or preserved in RNALater. One subject (Alien) collected additional fecal samples after antibiotics treatment (d376–380, sub-study 4) and bowel cleanse (d630–637, sub-study 5).
All fecal samples were subjected to whole genome shotgun sequencing and the data analyzed at species-level using mOTUs (metagenomic operational taxonomic units based on single-copy phylogenetic marker genes ), and at a number of functional levels: clusters of orthologous groups (COGs) , KEGG (Kyoto Encyclopedia of Genes and Genomes) groups of orthologous genes (KOs), modules and pathways .
Preservation-induced variability of the fecal species community
Biological samples are generally stored frozen or processed immediately to maintain their integrity. However, this is often logistically inconvenient, especially in remote areas. In contrast, preservation in RNALater eliminates the need for immediate freezing or sample processing. RNALater is an aqueous solution that preserves biological samples by protecting especially RNA from degradation (for example, ). The solution penetrates and stabilizes the sample for later analysis. According to the manufacturer’s instruction, these samples are stable at room temperature (RT) for up to one week, at +4°C for one month and at −20°C and −80°C indefinitely. Thus, the usage of RNALater for sample collection would facilitate sample preservation and shipping prior to metagenomic analysis.
We collected fresh samples from which aliquots were frozen immediately (to be used as reference samples) or preserved in RNALater. At d7, RNALater-preserved samples from five subjects were kept at both +4 to 10°C and RT for one week. However, at d392 RNALater-preserved samples were kept at +4°C for 24 h from three subjects before storing at −80°C (sub-studies 1 and 2; Figure 1).
To analyze the taxonomic variability of frozen and RNALater-preserved replicates, we performed hierarchical clustering of the mOTU abundances based on Euclidean distance. This analysis revealed that samples from the same subject clustered together irrespective of the preservation method. This similarity held true for both d7 and d392 with the exception of subject Alien, who underwent an antibiotics treatment in between these time points (Figures 2A,B and 3A). Within the cluster of each subject (at d392), the replicates did not cluster by the preservation protocol (Figure 2B), suggesting that biological within-sample variability was larger than preservation-induced effects.
To extend this observation, we clustered all collected samples from all subjects in the context of 888 published metagenomes from MetaHIT and HMP (Figure 4; details in Material and methods). We found that the samples from d7 and d392 had other samples from the same subject as nearest neighbors. All d7 samples had the other two replicates from d7 as nearest neighbors (Figure 4). For d392, the first three Peacemaker and four Bugkiller neighbors were their corresponding replicates from d392. For subject Alien, all samples from d392 clustered together but the nearest neighbor was not necessarily the samples preserved under the same condition. Thus, the samples did not cluster by preservation method. Taken together, these results show that RNALater does not introduce a bias in the overall microbiome composition and its effect is smaller than within-subject variability.
We repeated the taxonomic analyses using gene abundances summarized at different functional levels. Relative abundance of orthologous groups, that is, COG and KO profiles, (see Material and methods) of all collected samples were clustered in the context of 888 published metagenomes from MetaHIT and HMP (Additional files 2 and 3). For both COG and KO profiles, the nearest neighbor of samples from d7 and d392 were very similar to those seen in taxonomic clustering (Figure 4). Using COG abundances, with the exception of one Peacemaker sample, all d7 replicates clustered together, and for Peacemaker and Alien four and five of the d392 sample replicates, respectively, clustered together. Using KO profiles, the clustering of samples was similar.
To get a deeper insight into potential preservation-induced changes of the microbiome, we compared indices for species diversity and community evenness. At d7, RNALater-preserved samples (storage at RT or at +4 to 10°C) compared to the immediately frozen samples showed a significant decrease in their Shannon diversity index (P = 0.016 and P = 0.0008, unpaired Wilcoxon-test) and species evenness (P = 0.016 and P = 0.016, unpaired Wilcoxon-test) but not richness (P = 0.056 and P = 0.056, unpaired Wilcoxon-test). In contrast, at d392, RNALater preservation did not have the same effect on these ecological indices (Figure 2C).
To determine preservation-induced and temporal within-subject and between-subject differences, we correlated mOTU, COGs, KOs, KEGG modules and pathways (Spearman correlation) between different preservation techniques, sampling time points and subjects (Figure 3A,B; Figure S3A-C in Additional file 4). We found that the similarity between protocols is consistently high for both species and COGs (minimum Spearman’s r = 0.82 and 0.95, respectively), similar to previous findings . Due to our longitudinal study design we could extend the analysis performed by Franzosa et al. , and verify that the correlation between time points was lower for both species and COGs (maximum Spearman’s r = 0.75 and 0.93, respectively) than between preservation methods. Between-subject correlations were even lower than between-time point correlations.
To estimate differences in taxonomic (species) and functional (eggNOG COGs, and KEGG KOs, modules and pathways) composition between frozen and RNALater-preserved samples from d7 and d392, we performed two-way ANOVA testing on both the taxonomic and functional relative abundances (see Material and methods). We found that 5.0% (d7) and 2.3% (d392) of the species with a relative abundance exceeding 0.01% in at least 3 of the 33 tested samples varied more between preservation methods than between subjects. However, none of these were statistically significant after correction for multiple hypothesis testing (Benjamin-Hochberg α = 0.05; Figure 3C; Additional file 5). For d7 and d392, 0.77% and 4.2% of the COGs, respectively, varied more between the preservation methods than between subjects, but only 0.36% of COGs were statistically significant (Figure 3D; Additional file 5), which is in the range of previous findings . We found that 0.72%, 0% and 0% of the KOs, modules and pathways, respectively, varied more between preservation methods than between subjects (Figure S3D-F in Additional file 4).
In summary, RNALater appears, in line with a previous publication , to be a suitable alternative to immediate freezing at least for short-term storage of a few days, as the variability between protocol replicates is lower than that between time points of the same subject and between subjects.
Within-sample variability of the fecal species community
It was previously shown that there is considerable spatial within-sample variation of parasites in human feces  and low abundant bacteria were only sporadically detected in all replicates of the same sample . To address within-sample and technical reproducibility in our study, triplicates at distinct sites of the same fecal sample were collected from three subjects at d392 using two preservation protocols (RNALater and freezing; Figure 1, sub-study 2).
For two subjects the replicates showed only minor variation in ecological indices (Shannon diversity index, species richness and community evenness). Larger fluctuations were detected for diversity and evenness of fresh frozen samples from subject Bugkiller only (Figure 2C, lower panel). Nevertheless, all replicates clustered by subject (including Bugkiller) in the context of the samples collected on d392 (Figure 2B). To set within-sample variation in the context of all time series samples and the MetaHIT and HMP samples (N = 888), we clustered all samples together (Figure 4). The replicates from all three subjects had the other replicates from the same subject as nearest neighbors. All replicates from subject Alien clustered by d392 but not with the pre-/post-treatment samples, highlighting the drastic change introduced by the treatment. These results for Alien remained the same when clustering based on abundances of functional categories (Additional files 2 and 3). This implies that subject-specificity and community similarity is high for all replicates of a fecal sample with only minor fluctuations in diversity and evenness. Together with the fact that replicates preserved under different conditions did not cluster by preservation method (Figures 2B and 4) this supports our study design which was based on samples that were deliberately not homogenized before aliquoting since (i) samples included in our study were self-collected by lay participants and usually not homogenized in large metagenomic studies and (ii) we aimed to assess within-sample variability.
Temporal variability of fecal microbial communities
In order to assess how technical variability compares to temporal variability, all samples collected by the seven subjects were clustered. It showed that the temporal variability was small and the samples clustered by subject except for Alien (Figure 5A,B). Omitting all samples taken from Alien after antibiotics treatment resulted in consistent clustering by subject (Figure 5C), showing high subject-specificity and individuality of the gut microbiome. In order to test whether the individuality of the gut microbiome persists on the background of 888 published metagenomes from MetaHIT and HMP, we clustered them together and show the nearest neighbors in Figure 4. The time series samples from the seven subjects were closest to other samples from the same subject rather than to another subject. This was also seen for the 43 subjects in the HMP study, which have multiple time-points. Only few samples from our dataset had the sample of another subject as closer neighbor than a time series sample when comparing the relative taxonomic abundances. For example, the d392 sample from Scavenger has a sample from another subject as fifth neighbor instead of the Scavenger d0 sample. The number of samples having another subject as closer neighbor rather than a time series sample from the same subject increased when clustering was performed using relative COG and KO abundances (Additional files 2 and 3).
To characterize the temporal variability of the community structure, we calculated the ecological indices, such as the Shannon diversity index, and found that they varied little over time for all subjects (Figure 5A, lower panel for diversity) except Alien, who underwent antibiotics treatment.
Our results support previous studies reporting that the temporal variability of the species composition within a subject is smaller than between-subject variability and that in the absence of larger perturbation each individual’s microbiota remains relatively stable over time [6,8,9,11,12,14,24,26-29]. However, here we show that even in the context of a large cohort of fecal metagenomes the subjects can be resolved based on the taxonomic composition of their fecal metagenomes with very few exceptions. Thus, the gut microbiome, if unperturbed, is highly subject-specific and the variability is small compared to the between-subject variability.
The effect of perturbations on fecal microbial communities
During the time period of the study, one subject (Alien) suffered from an infection that was treated with antibiotics and underwent colonoscopy screening, which required bowel cleanse. The antibiotics treatment comprised four days with ceftriaxone, a third-generation cephalosporin antibiotic with broad-spectrum activity against Gram-positive and Gram-negative bacteria. To investigate the consequences of these medical treatments, additional samples were collected for this subject after antibiotics intake (d376, d377 and d380) and bowel cleanse (d630, d632, and d637) (Figure 1, sub-studies 4 and 5; for further details see Material and methods).
To observe the response of the fecal microbiome to these two perturbations, we performed hierarchical clustering and found that the post-antibiotics samples separated from the pre-antibiotics samples, but were still distinct from other subjects (Figure 5A,B). The samples taken 226 to 399 days after antibiotics treatment (d600–773) clustered closer to the pre-treatment samples (Figure 5A,B), suggesting a (partial) recovery. Determining the nearest neighbor samples in the context of the 888 HMP and MetaHIT samples, using Euclidean distance on taxonomic and functional profiles, confirmed the aforementioned observation suggesting that the gut community composition was still distinct even 399 days (d773) post-antibiotics treatment but gained similarity with the pre-antibiotics community composition (Figure 4). A similar pattern was observed for nearest-neighbor analysis of COG abundances, but individual specificity was less clear for KO abundances (Additional files 2 and 3).
The immediate post-treatment samples (d376–380) showed a drastic reduction in Shannon diversity index, species richness and evenness, indicating that fewer and less evenly abundant microbial species were detected. The Shannon diversity index, species richness and evenness of the post-treatment samples dropped from 3.5 to 0.2, 100 to 37 and 0.75 to 0.05, respectively, and were still reduced at d392 (18 days after the treatment), compared to the pre-treatment state. At d600, the diversity had returned to its initial level (Figure 5A, lower panel), yet the samples still clustered separately from the pre-treatment samples, indicating that the recovery is not complete (Figures 4 and 5B). Separation of community profiles from the initial state along the first principal coordinate in an ordination analysis (using Jensen-Shannon divergence) correlated with the decline of the Shannon diversity index (Pearson correlation rho = 0.86, P = 3.9e−14; Figure 5A).
The bowel cleanse on the day before d630 did not have a considerable effect on the community composition: the samples cluster closely with d600, which was before colonoscopy and the fluctuation of the Shannon diversity index was similar to the other subjects and notably smaller than the impact of the antibiotics treatment (Figure 5A,B). Although this case study comprises only one subject, our result that bowel cleanse has little effect on gut microbiome composition is in line with the finding by O’Brien et al. .
It has been reported that antibiotics have a strong impact on the gut microbial community composition for an extended period of time, although the community was sometimes found to be similar to its pretreatment state within weeks. The return was subject-dependent and often incomplete, at least for some species monitored for time periods of two to six months [14,17] and up to two years [19,44]. Even though we studied the effect of antibiotics in only one subject, we can show that, at least in this subject, despite species diversity recovery, the gut microbial composition was still distinct from the pre-treatment state, even 399 days after the antibiotics treatment. It would be worthwhile exploring in the future how antibiotics effects vary between subjects and depends on factors such as dosage, duration of the treatment and type of antibiotics.
Several studies have addressed the temporal variability or the technical variability (for example, induced by different DNA isolation methods or preservation techniques) of the gut microbiome separately but none set them in a broader context. Hence, these studies have so far not disentangled the biological temporal variability (like, for example, community shifts due to disease or medication) from technical variability (for example, induced by preservation conditions or insufficient stool homogeneity).
In our study we provide the to date largest metagenomic data set of fecal samples collected over more than two years. We addressed the aspects of comparing technical and temporal variability, finding that temporal variability within each subject’s gut microbiome was smaller than that between subjects. Even in the context of 888 metagenomes, all time series samples could be recovered using taxonomic abundances, as long as antibiotics did not perturb the gut microbiome. The technical variability introduced by RNALater was small compared to freezing, for both taxonomic and functional features, and does not disrupt subject-specificity nor time point-specificity of the gut microbiome. Thus, we suggest RNALater as an alternative to freezing for the preservation of the fecal microbiome for metagenomic studies.
Material and methods
Fecal sample collection for time series
Informed consent to obtain time series samples of fecal samples was obtained from seven healthy subjects in Germany through the my.microbes project . The study protocol was approved by the EMBL Bioethics Internal Advisory Board, and is in agreement with the WMA Declaration of Helsinki. All subjects were living in Heidelberg, Germany at the beginning of the study and the mean age of the subjects upon enrollment was 34 ± 6 years. Among these subjects were five males (Alien, Bugkiller, Peacemaker, Halbarad and Scavenger) and two females (Daisy and Tigress). Subjects reported themselves as healthy, if they did not undergo prescribed medical treatment or showed any indication of disease symptoms. Fecal samples were collected and conserved under anaerobic conditions in a sealed bag, kept at −20°C for short-term storage and stored at −80°C upon arrival in the laboratory. The fecal samples were collected at days 0, 2, 7, 60, 392, 600 and 773 (sub-study 3) and are referred to here as d0, d2, d7 and so on. One male subject (Alien) contracted a bacterial infection and collected further samples after being hospitalized and receiving 2 g of ceftriaxone. Ceftriaxone is an antibiotic with broad-spectrum activity against Gram-positive and Gram-negative bacteria that was administered parenterally over 4 days. The last injection was two days before the first sampling time point (d376) and further samples were then collected on the subsequent days (d377, d378 and d380; sub-study 4). Additional samples were taken starting one day after undergoing bowel cleanse for routine colonoscopy (d630, d632 and d637; sub-study 5). Figure 1 shows the study design in detail and metadata and sequencing information are given in the Additional file 1.
Fecal sample collection for method comparison
In parallel to the fresh frozen fecal samples, additional samples (1 g each) were collected from five subjects at time point d7 and from three subjects at d392 (without homogenization) and were stored in 10 ml RNALater® Stabilization Solution (Life Technologies GmbH, Darmstadt, Germany). Short-term storage was either at +4 to 10°C or at RT for one week (d7, sub-study 1) or at +4°C (d392, sub-study 2) for 24 h and frozen at −80°C upon arrival in the laboratory. At d392, each subject collected samples in triplicate, preserved in both RNALater and freshly frozen (Figure 1).
Inclusion of published fecal metagenomes
Sample processing and sequencing
DNA isolation from fecal samples
One milliliter of defrosted samples immersed in RNALater was taken and diluted with sterile phosphate-buffered saline and pelleted by centrifugation. Genomic DNA was extracted from frozen or RNALater-preserved fecal samples as previously described  using the G’NOMEs kit (MP Biomedicals, Illkirch, France). The following minor modifications were made to the protocol: cell lysis/denaturation was performed (30 minutes, 55°C) before protease digestion was carried out overnight (55°C). Mechanical lysis was followed by RNAse digestion (50 μl, 30 minutes, 55°C). The purified DNA was resuspended in TE buffer after final precipitation for storage at −20°C.
Library preparation and metagenomic sequencing
Library generation and whole genome shotgun sequencing of the fecal samples was carried out on the Illumina HiSeq 2000/2500 (Illumina, San Diego, CA, USA) platform as described in Zeller et al. . All samples were paired-end sequenced with 100 bp read lengths at the Genomics Core Facility, European Molecular Biology Laboratory, Heidelberg, to a sequencing depth of approximately 5 Gbp (see Additional file 1 for sequencing results).
Taxonomic profiling of fecal samples
Using MOCAT , a software package used to process raw Illumina reads to generate taxonomic and functional profiles (option screen with alignment length cutoff 45 and minimum 97% sequence identity), taxonomic relative abundance profiles were generated by mapping screened HQ reads from each metagenome to a database consisting of 10 universal single-copy marker genes extracted from 3,496 NCBI reference genomes and 263 human gut metagenomes that had previously been clustered and linked by co-variance into mOTUs [29,51]. Quantification of mOTU linkage groups was performed using MOCAT, but is also available as a standalone tool at .
Functional profiling of fecal samples
Using MOCAT  (option screen with alignment length cutoff 45 and minimum 95% sequence identity) functional relative abundance profiles were generated by first calculating gene abundance profiles by mapping screened HQ (high quality) reads from each metagenome to an functionally annotated database consisting of predicted genes from 263 human gut metagenomes [29,49], and estimating each gene’s abundance as gene length-normalized nucleotide counts of all reads that matched the protein-coding region of the gene. And second, for each functional feature, its abundance in the metagenomic gene pool was estimated as the sum of the relative abundances of all genes belonging to this family. The genes were summarized into COGs , and KEGG KOs, modules and pathways . The metagenomic gene catalog had already been functionally annotated to the KEGG database , and was additionally annotated to different COGs by aligning the translated amino acid sequence of each gene to the eggNOG (version 3)  database using BLAST (version 2.2.24)  (maximum e-value 0.01) and then annotating the genes using SmashCommunity (version 1.6) .
For the statistical data analysis at the species level, mOTU abundances were used  and samples were included in the data analysis if they had more than 3,800 insert counts.
For the comparison of RNALater with fresh frozen feces and time series samples with each other, mOTU abundances  were used to calculate Shannon diversity index, evenness and species richness. To standardize sampling depth, richness, Shannon diversity index and evenness were assessed after rarefaction of the insert count tables to 3,800 insert counts per sample. Differences were assessed using the Wilcoxon test (unpaired) on the deviation from the mean of each subject. P-values ≤0.05 were considered statistically significant.
Principal coordinate analyses and complete linkage clustering of Euclidean distances (Figure 2A) and Jensen-Shannon divergence distances (Figure 5A) were performed using the ape and ade4 R packages. The dendrograms shown are based on Euclidean distance measurements on the logged abundances (Figures 2B and 3B,C). Nearest neighbors were determined to be the samples with the smallest Euclidean distance (Figure 4; Additional files 2 and 3). Due to large differences in sequencing depth, the metagenomes collected in this study and the HMP  and MetaHIT [2,46,47] taxonomic data were only analyzed after rarefaction to an insert count of 5,000 per sample. All samples that passed these criteria were included in the functional analysis.
To observe taxonomic and functional-specific biases introduced by RNALater preservation across all subjects, a two-way ANOVA was performed for d7 and d392 separately. Our setup was analogous to a previous analysis . Relative species, COG, KO, module and pathway abundances were arcsine square root transformed (for variance stabilization) and only features with a relative abundance of more than 0.01% in at least three samples were included. For d392, the median value of the three replicates for each feature was used.
The shotgun metagenomic sequencing data from this study are available from the European Nucleotide Archive (ENA) database , accession number ERP009422.
Description of additional data files
The following additional data are available with the online version of this paper. Additional file 1 is a table listing the metadata and sequencing information of the analyzed samples. Additional file 5 is a table listing statistically significant taxonomic and functional features resulting from the method comparison.
cluster of orthologous groups
Human Microbiome Project
Kyoto Encyclopedia of Genes and Genomes
KEGG group of orthologous genes
Metagenomics of the Human Intestinal Tract
metagenomic operational taxonomic unit
polymerase chain reaction
Human Microbiome Project Consortium. A framework for human microbiome research. Nature. 2012;486:215–21.
Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, Manichanh C, et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 2010;464:59–65.
Wu GD, Chen J, Hoffmann C, Bittinger K, Chen YY, Keilbaugh SA, et al. Linking long-term dietary patterns with gut microbial enterotypes. Science. 2011;334:105–8.
David LA, Maurice CF, Carmody RN, Gootenberg DB, Button JE, Wolfe BE, et al. Diet rapidly and reproducibly alters the human gut microbiome. Nature. 2014;505:559–63.
Koeth RA, Wang Z, Levison BS, Buffa JA, Org E, Sheehy BT, et al. Intestinal microbiota metabolism of l-carnitine, a nutrient in red meat, promotes atherosclerosis. Nat Med. 2013;19:576–85.
Turnbaugh PJ, Hamady M, Yatsunenko T, Cantarel BL, Duncan A, Ley RE, et al. A core gut microbiome in obese and lean twins. Nature. 2009;457:480–4.
Manichanh C, Rigottier-Gois L, Bonnaud E, Gloux K, Pelletier E, Frangeul L, et al. Reduced diversity of faecal microbiota in Crohn’s disease revealed by a metagenomic approach. Gut. 2006;55:205–11.
Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI, Knight R. Bacterial community variation in human body habitats across space and time. Science. 2009;326:1694–7.
Faith JJ, Guruge JL, Charbonneau M, Subramanian S, Seedorf H, Goodman AL, et al. The long-term stability of the human gut microbiota. Science. 2013;341:1237439.
Human Microbiome Project Consortium. Structure, function and diversity of the healthy human microbiome. Nature. 2012;486:207–14.
Rajilic-Stojanovic M, Heilig HG, Tims S, Zoetendal EG, de Vos WM. Long-term monitoring of the human intestinal microbiota composition. Environ Microbiol. 2012;15:1146–59.
Schloissnig S, Arumugam M, Sunagawa S, Mitreva M, Tap J, Zhu A, et al. Genomic variation landscape of the human gut microbiome. Nature. 2013;493:45–50.
Dethlefsen L, Huse S, Sogin ML, Relman DA. The pervasive effects of an antibiotic on the human gut microbiota, as revealed by deep 16S rRNA sequencing. PLoS Biol. 2008;6:e280.
Dethlefsen L, Relman DA. Incomplete recovery and individualized responses of the human distal gut microbiota to repeated antibiotic perturbation. Proc Natl Acad Sci U S A. 2011;108:4554–61.
Jakobsson HE, Jernberg C, Andersson AF, Sjolund-Karlsson M, Jansson JK, Engstrand L. Short-term antibiotic treatment has differing long-term impacts on the human throat and gut microbiome. PLoS One. 2010;5:e9836.
Perez-Cobas AE, Gosalbes MJ, Friedrichs A, Knecht H, Artacho A, Eismann K, et al. Gut microbiota disturbance during antibiotic therapy: a multi-omic approach. Gut. 2012;62:1591–601.
De LaCochetiere MF, Durand T, Lepage P, Bourreille A, Galmiche JP, Dore J. Resilience of the dominant human fecal microbiota upon short-course antibiotic challenge. J Clin Microbiol. 2005;43:5588–92.
Andersson DI, Hughes D. Persistence of antibiotic resistance in bacterial populations. FEMS Microbiol Rev. 2011;35:901–11.
Jernberg C, Lofmark S, Edlund C, Jansson JK. Long-term ecological impacts of antibiotic administration on the human intestinal microbiota. ISME J. 2007;1:56–66.
Jernberg C, Lofmark S, Edlund C, Jansson JK. Long-term impacts of antibiotic exposure on the human intestinal microbiota. Microbiology. 2010;156:3216–23.
Forslund K, Sunagawa S, Coelho LP, Bork P. Metagenomic insights into the human gut resistome and the forces that shape it. Bioessays. 2014;36:316–29.
Forslund K, Sunagawa S, Kultima JR, Mende DR, Arumugam M, Typas A, et al. Country-specific antibiotic use practices impact the human gut resistome. Genome Res. 2013;23:1163–9.
David LA, Materna AC, Friedman J, Campos-Baptista MI, Blackburn MC, Perrotta A, et al. Host lifestyle affects human microbiota on daily timescales. Genome Biol. 2014;15:R89.
Claesson MJ, Cusack S, O’Sullivan O, Greene-Diniz R, de Weerd H, Flannery E, et al. Composition, variability, and temporal stability of the intestinal microbiota of the elderly. Proc Natl Acad Sci U S A. 2011;108:4586–91.
Zoetendal EG, Akkermans AD, De Vos WM. Temperature gradient gel electrophoresis analysis of 16S rRNA from human fecal samples reveals stable and host-specific communities of active bacteria. Appl Environ Microbiol. 1998;64:3854–9.
Caporaso JG, Lauber CL, Costello EK, Berg-Lyons D, Gonzalez A, Stombaugh J, et al. Moving pictures of the human microbiome. Genome Biol. 2011;12:R50.
Martinez I, Muller CE, Walter J. Long-term temporal analysis of the human fecal microbiota revealed a stable core of dominant bacterial species. PLoS One. 2013;8, e69621.
Jalanka-Tuovinen J, Salonen A, Nikkila J, Immonen O, Kekkonen R, Lahti L, et al. Intestinal microbiota in healthy adults: temporal analysis reveals individual and common core and relation to intestinal symptoms. PLoS One. 2011;6:e23035.
Sunagawa S, Mende DR, Zeller G, Izquierdo-Carrasco F, Berger SA, Kultima JR, et al. Metagenomic species profiling using universal phylogenetic marker genes. Nat Methods. 2013;10:1196–9. http://www.bork.embl.de/software/mOTU.
Lozupone CA, Stombaugh J, Gonzalez A, Ackermann G, Wendel D, Vazquez-Baeza Y, et al. Meta-analyses of studies of the human microbiota. Genome Res. 2013;23:1704–14.
Swan BK, Ehrhardt CJ, Reifel KM, Moreno LI, Valentine DL. Archaeal and bacterial communities respond differently to environmental gradients in anoxic sediments of a California hypersaline lake, the Salton Sea. Appl Environ Microbiol. 2010;76:757–68.
Lauber CL, Zhou N, Gordon JI, Knight R, Fierer N. Effect of storage conditions on the assessment of bacterial community structure in soil and human-associated samples. FEMS Microbiol Lett. 2010;307:80–6.
Flores R, Shi J, Gail MH, Gajer P, Ravel J, Goedert JJ. Assessment of the human faecal microbiota: II. Reproducibility and associations of 16S rRNA pyrosequences. Eur J Clin Invest. 2012;42:855–63.
van Eijsden RG, Stassen C, Daenen L, Van Mulders SE, Bapat PM, Siewers V, et al. A universal fixation method based on quaternary ammonium salts (RNAlater) for omics-technologies: Saccharomyces cerevisiae as a case study. Biotechnol Lett. 2013;35:891–900.
Vlckova K, Mrazek J, Kopecny J, Petrzelkova KJ. Evaluation of different storage methods to characterize the fecal bacterial communities of captive western lowland gorillas (Gorilla gorilla gorilla). J Microbiol Methods. 2012;91:45–51.
Franzosa EA, Morgan XC, Segata N, Waldron L, Reyes J, Earl AM, et al. Relating the metatranscriptome and metagenome of the human gut. Proc Natl Acad Sci U S A. 2014;111:E2329–38.
Nechvatal JM, Ram JL, Basson MD, Namprachan P, Niec SR, Badsha KZ, et al. Fecal collection, ambient preservation, and DNA extraction for PCR amplification of bacterial and human markers from human feces. J Microbiol Methods. 2008;72:124–32.
Krauth SJ, Coulibaly JT, Knopp S, Traore M, N’Goran EK, Utzinger J. An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool. PLoS Negl Trop Dis. 2012;6:e1969.
Wu GD, Lewis JD, Hoffmann C, Chen YY, Knight R, Bittinger K, et al. Sampling and pyrosequencing methods for characterizing bacterial communities in the human gut using 16S sequence tags. BMC Microbiol. 2010;10:206.
Powell S, Szklarczyk D, Trachana K, Roth A, Kuhn M, Muller J, et al. eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges. Nucleic Acids Res. 2012;40:D284–9.
Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, et al. KEGG for linking genomes to life and the environment. Nucleic Acids Res. 2008;36:D480–4.
Mutter GL, Zahrieh D, Liu C, Neuberg D, Finkelstein D, Baker HE, et al. Comparison of frozen and RNALater solid tissue storage methods for use in RNA expression microarrays. BMC Genomics. 2004;5:88.
O’Brien CL, Allison GE, Grimpen F, Pavli P. Impact of colonoscopy bowel preparation on intestinal microbiota. PLoS One. 2013;8, e62815.
Lofmark S, Jernberg C, Jansson JK, Edlund C. Clindamycin-induced enrichment and long-term persistence of resistant Bacteroides spp. and resistance genes. J Antimicrob Chemother. 2006;58:1160–7.
Jones N. Social network wants to sequence your gut. Nature. 2011. doi:10.1038/news.2011.523.
Le Chatelier E, Nielsen T, Qin J, Prifti E, Hildebrand F, Falony G, et al. Richness of human gut microbiome correlates with metabolic markers. Nature. 2013;500:541–6.
Li J, Jia H, Cai X, Zhong H, Feng Q, Sunagawa S, et al. An integrated catalog of reference genes in the human gut microbiome. Nat Biotechnol. 2014;32:834–41.
Furet JP, Firmesse O, Gourmelon M, Bridonneau C, Tap J, Mondot S, et al. Comparative assessment of human and farm animal faecal microbiota using real-time quantitative PCR. FEMS Microbiol Ecol. 2009;68:351–62.
Zeller G, Tap J, Voigt AY, Sunagawa S, Kultima JR, Costea PI, et al. Potential of fecal microbiota for early-stage detection of colorectal cancer. Mol Syst Biol. 2014;10:766.
Kultima JR, Sunagawa S, Li J, Chen W, Chen H, Mende DR, et al. MOCAT: a metagenomics assembly and gene prediction toolkit. PLoS One. 2012;7:e47656.
Mende DR, Sunagawa S, Zeller G, Bork P. Accurate and universal delineation of prokaryotic species. Nat Methods. 2013;10:881–4.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
Arumugam M, Harrington ED, Foerstner KU, Raes J, Bork P. SmashCommunity: a metagenomic annotation and analysis tool. Bioinformatics. 2010;26:2977–8.
European Nucleotide Archive. http://www.ebi.ac.uk/ena.
We are grateful to the members of the Bork group for inspiring discussions. We thank the volunteers for collecting the samples and acknowledge Yan Ping Yuan and the EMBL Information Technology Core Facility for support with high-performance computing as well as the EMBL Genomics Core Facility for sequencing support. We are grateful for statistical advice from Bernd Klaus from the Centre for Statistical Data Analysis at EMBL. We are also grateful to the European MetaHIT consortium and the NIH Common Fund Human Microbiome Project Consortium for making available the data sets that were used in this study. This work has received funding through the CancerBiome project (European Research Council project reference 268985), the METACARDIS project (FP7-HEALTH-2012-INNOVATION-I-305312), the International Human Microbiome Standards project (HEALTH-FP7-2010-261376), SysteMtb grant (HEALTH-FP7-2010-241587) and funding from EMBL. SSL is the recipient of an Australian Postgraduate Award, and EMBL Australia International PhD Fellowship.
The authors declare that they have no competing interests.
PB, SS, AYV conceived and managed the project. AYV, SS, PIC, JRK, GZ, SSL and PB designed and performed data analysis. AYV wrote the manuscript with contributions from all other authors. All authors read and approved the final manuscript.
Overview of the metadata of the subjects and sequencing information of the samples included.
Nearest neighbor plot based on COGs.
Nearest neighbor plot based on KOs.
Comparison of technical, temporal and between-subject variability based on functional profiles.