- Short Report
- Open Access
Isolation of novel gut bifidobacteria using a combination of metagenomic and cultivation approaches
Genome Biology volume 20, Article number: 96 (2019)
Whole metagenome shotgun (WMGS) sequencing is a method that provides insights into the genomic composition and arrangement of complex microbial consortia. Here, we report how WMGS coupled with a cultivation approach allows the isolation of novel bifidobacteria from animal fecal samples. A combination of in silico analyses based on nucleotide and protein sequences facilitates the identification of genetic material belonging to putative novel species. The subsequent in silico prediction of metabolic properties permits the identification of specific substrates, which are then employed to isolate these species through a cultivation method.
Next-Generation Sequencing (NGS) technologies allow the generation of vast amounts of genomic data, facilitating a variety of DNA sequencing approaches that range from single genome sequencing to large-scale metagenomic studies. While whole genome sequencing (WGS) reveals the complete genetic makeup of a specific organism and allows the subsequent prediction of its biological features, the whole metagenome shotgun (WMGS) methodology provides genetic information about the abundant microorganisms present in a complex microbial consortium associated with a particular ecosystem, to an extent that depends on the sequencing depth [2, 3]. Furthermore, through the reconstruction of sequenced DNA into consensus sequences, WMGS sequencing provides access to the genome content of as yet uncultured bacteria, including novel species, which are otherwise very hard or even impossible to identify by traditional culturing techniques [4,5,6].
Microorganisms are ubiquitous in nature. The human body, like that of non-human animals, is inhabited by a plethora of microbial species that may coexist with the host throughout its life span. Most of the microbial communities that reside in the animal body are located in the large intestine, representing an estimated 10^14 bacterial cells. The gastrointestinal microbial community, also known as the gut microbiota, exerts many important activities that support and preserve host health. For this reason, the gut microbiota is the most extensively scrutinized microbial community (both in humans and other animals) in large-scale metagenomic studies. As part of ongoing efforts to dissect the composition and associated activities of the gut microbiota, various studies have focused on the identification of novel bacterial species, whose genetic makeup is pivotal to unveiling potential microbe-host interactions.
Recently, various strategies have been proposed for the enrichment of very low abundance strains from complex environmental matrices [12, 13]. However, these methodologies require a sequenced reference genome to perform DNA enrichment prior to sequencing. Besides, to explore such microbial dark matter, methodologies involving high-throughput culture conditions for the growth of bacteria followed by matrix-assisted laser desorption/ionization–time of flight (MALDI–TOF) or 16S rRNA amplification and sequencing have been employed [11, 14]. In this context, new bacterial species have been isolated, filling knowledge gaps regarding unknown microbial inhabitants of the human gut and allowing insights into the physiology of these taxa.
The focus of the current study was to apply WMGS sequencing in order to investigate the presence of novel gut commensal species belonging to the genus Bifidobacterium among the gut microbiota of animals. For this purpose, we sequenced and analyzed samples collected from banteng (Bos javanicus), Goeldi’s marmoset (Callimico goeldii), and pygmy marmoset (Callithrix pygmaea), chosen because a previous study indicated a high abundance of putative novel species of the genus Bifidobacterium in these hosts. We therefore employed a custom-made METAnnotatorX pipeline to screen the sequencing data of each sample in order to retrieve genomic dark matter that was predicted to belong to the genus Bifidobacterium.
Results and discussion
WMGS sequencing of animal stool samples produced approximately 79 million paired-end reads with an average length of ~150 bp (see Additional file 1: Supplementary Materials and Additional file 2: Table S1), which were analyzed through the METAnnotatorX pipeline. A preliminary screening of the obtained sequence reads revealed marked variations in the relative abundance of bifidobacteria between the samples analyzed, ranging from 0.1% in the Bos sample to 22.3% and 25% for the Callithrix and Callimico samples, respectively (Fig. 1a). Due to the low abundance of bifidobacterial reads in the Bos sample, its metagenomic data were used to perform a validation screening aimed at revealing the minimum amount of genomic DNA needed to detect a specific taxon (Additional file 3: Figure S1). In the case of Callithrix and Callimico, metagenomic data were assembled, revealing more than 800 contigs (with a length of > 5000 bp) predicted to belong to the genus Bifidobacterium, taxonomically classified by means of the proteome of each contig (Fig. 1b).
To identify genomic contigs that putatively belong to unclassified bifidobacterial taxa, a custom script employing the results of the METAnnotatorX pipeline was implemented (Additional file 3: Figure S2). Starting from the collected bifidobacterial contigs, a comparison against three databases based on each bifidobacterial genomic sequence was performed (see Additional file 1: Supplementary Materials). Homology searches at both the nucleotide and deduced protein level were carried out, coupled with chromosomal sequence comparisons, to discard contigs attributed to known species and closely related taxa. After manual removal of phage and plasmid sequences, the collected contigs belonging to unknown bifidobacterial species were reduced to 435 (Fig. 1b).
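The filtering logic described above can be sketched as follows. This is a minimal illustration and not the authors' custom script: the `Contig` record, its field names, and the 94% identity cut-off are assumptions made for the example, while the 5000 bp length threshold mirrors the value given in the text.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    length: int                # contig length in bp
    best_hit_identity: float   # % identity of the best hit to a known genome (hypothetical field)
    is_mobile_element: bool    # flagged as phage or plasmid (a manual step in the study)


def select_candidate_contigs(contigs, min_length=5000, max_identity=94.0):
    """Keep long contigs that do not closely match any known species.

    min_length mirrors the > 5000 bp cut-off mentioned in the text;
    max_identity is an illustrative assumption, not a value from the study.
    """
    return [
        c for c in contigs
        if c.length > min_length
        and c.best_hit_identity < max_identity
        and not c.is_mobile_element
    ]


# Toy input: only contig_2 is long, divergent, and not a mobile element.
contigs = [
    Contig("contig_1", 12000, 99.1, False),  # matches a known species
    Contig("contig_2", 8000, 88.5, False),   # candidate novel taxon
    Contig("contig_3", 3000, 85.0, False),   # too short
    Contig("contig_4", 9500, 86.0, True),    # phage or plasmid
]
candidates = select_candidate_contigs(contigs)
print([c.name for c in candidates])
```

In practice the identity values would come from nucleotide and protein similarity searches against reference databases, and the cut-offs would need to be tuned to the genus under study.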
Predicted genes among the selected contigs were compared to a Glycosyl Hydrolase (GH) database to assess the glycobiome of the putative unknown bifidobacterial species. Based on the glycobiomes thus generated (Additional file 2: Table S2), we predicted that four glycans, i.e., arabinogalactan, pullulan, starch, and xylan, represented carbon sources for these putative novel bifidobacterial species (Fig. 1c). Accordingly, various cultivation experiments were performed in which aliquots of fecal samples from Callimico and Callithrix were added to a chemically defined medium (CDM) containing one of the glycans listed above as its sole carbon source (see Additional file 1: Supplementary Materials). These carbohydrate-specific cultivation experiments allowed the growth of 13 phenotypically different bifidobacterial isolates, which were able to metabolize the selected glycans. Subsequently, the internal transcribed spacer (ITS) sequence of these isolates was amplified and sequenced, and the obtained ITS sequences were compared to a previously described bifidobacterial ITS database (Additional file 2: Table S3). This procedure allowed the identification of two strains that do not belong to previously characterized bifidobacterial species. These putative novel bifidobacterial isolates, named 2028B and 2034B, were subjected to WGS, which generated two genomes with a size of 2.96 and 2.61 Mb, respectively (Fig. 1d and Additional file 2: Table S4). The novel bifidobacterial strains 2028B (= LMG 30938 = CCUG 72814) and 2034B (= LMG 30939 = CCUG 72815) were submitted to two public culture collections. The reconstruction of these genomes highlighted the presence of specific genes predicted to be responsible for the metabolism of the employed carbohydrate substrates as identified in the WMGS analyses, such as pullulanases and beta-xylosidases.
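The step from a predicted glycobiome to candidate growth substrates can be illustrated with a small sketch. The GH family to substrate mapping below is a simplified assumption for this example (real GH families often act on several substrates), and the list of hits is hypothetical; neither is taken from the study's data.

```python
from collections import Counter

# Illustrative GH family -> substrate mapping (an assumption for this sketch;
# real families frequently cover more than one substrate).
GH_SUBSTRATES = {
    "GH13": ["starch", "pullulan"],
    "GH10": ["xylan"],
    "GH43": ["xylan"],
    "GH53": ["arabinogalactan"],
}


def predict_substrates(gh_hits):
    """Count how many predicted GH genes point to each candidate glycan."""
    counts = Counter()
    for family in gh_hits:
        # Families missing from the mapping contribute nothing.
        for substrate in GH_SUBSTRATES.get(family, []):
            counts[substrate] += 1
    return counts


# Hypothetical GH annotations for genes found on the selected contigs.
hits = ["GH13", "GH13", "GH43", "GH53", "GH3"]
substrate_counts = predict_substrates(hits)
print(substrate_counts.most_common())
```

Glycans supported by the largest number of predicted genes would be the first candidates to test as sole carbon source in a defined medium.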
To validate the proposed approach, additional experiments were performed based on selective enrichment, with glucose, ribose, xylan, or pullulan included in the medium as the sole carbon source based on the identified genes mentioned above (see Additional file 1: Supplementary Materials and Additional file 3: Figure S3). We observed more vigorous growth of strains 2028B and 2034B when cultivated on complex carbon sources, such as xylan and pullulan, as compared to glucose (Additional file 3: Figure S3a, S3b and S3c). Furthermore, the addition of complex carbon sources, i.e., xylan and pullulan, directly to the Callimico fecal sample resulted in an enrichment of these two strains, in particular strain 2034B in combination with pullulan, resulting in a one log increase in bacterial abundance as compared to medium containing glucose (i.e., from 8 × 10^5 to 4 × 10^6) (Additional file 3: Figure S3d). Despite the observed specificity of the isolation procedure for the two novel strains, it is worth mentioning that other microorganisms may grow in the selective media. To limit this issue, mupirocin was added to the CDM (see Additional file 1: Supplementary Materials).
The average nucleotide identity (ANI) analysis of the genomes decoded here against all bifidobacterial (sub)species known so far highlighted that strain 2028B possesses a 92.29% ANI value with respect to Bifidobacterium vansinderenii LMG 30126, while isolate 2034B exhibits an 87.32% ANI value with respect to Bifidobacterium biavatii DSM 23969 (Additional file 2: Table S5). Notably, two bacterial strains displaying an ANI value < 95% are considered to belong to distinct species. Mapping WMGS reads onto the reconstructed genome sequences of strains 2028B and 2034B revealed that both genomes were entirely covered by the sequenced paired-end reads of the Callimico sample, with an average coverage of 8.8 and 8, respectively. Furthermore, alignment of the reconstructed chromosomes of strains 2028B and 2034B with the deduced contigs belonging to unknown bifidobacterial species of the Callimico sample allowed the identification of contigs that belong to the novel assembled genomes (Fig. 1e). Accordingly, the genetic repertoire of strains 2028B and 2034B, coupled to their metabolic abilities, allowed the isolation of these novel Bifidobacterium taxa.
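The ANI-based species call can be expressed as a short check. The sketch below applies the 95% threshold stated in the text to the two ANI values reported for strains 2028B and 2034B; the function and variable names are hypothetical, and computing ANI itself requires a dedicated tool that is outside the scope of this example.

```python
ANI_SPECIES_THRESHOLD = 95.0  # % ANI; genomes below this value are treated as distinct species


def is_distinct_species(ani_percent, threshold=ANI_SPECIES_THRESHOLD):
    """Return True when the ANI value falls below the species threshold."""
    return ani_percent < threshold


# Closest relative and ANI value for each isolate, as reported in the text.
closest_relatives = {
    "2028B": ("Bifidobacterium vansinderenii LMG 30126", 92.29),
    "2034B": ("Bifidobacterium biavatii DSM 23969", 87.32),
}

for strain, (relative, ani) in closest_relatives.items():
    verdict = "distinct species" if is_distinct_species(ani) else "same species"
    print(f"{strain} vs {relative}: ANI {ani}% -> {verdict}")
```

Both isolates fall below the threshold against their closest known relative, which is the basis for treating them as putative novel species.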
The availability of the 2028B and 2034B genome sequences also allowed us to investigate their phylogenetic relationship with each of the 69 currently recognized bifidobacterial taxa [19, 21]. A comparative genome analysis was undertaken to highlight orthologous genes between sequenced type strains of the genus Bifidobacterium, resulting in 31,520 clusters of orthologous genes (COGs). The analyses allowed us to identify 261 COGs that were shared among all genomes, representing the bifidobacterial core genome. The concatenation of 233 core gene protein sequences (excluding 28 paralogs that were identified among type strains) allowed the construction of a bifidobacterial phylogenetic tree (Fig. 2). As shown in Fig. 2, strain 2034B clustered in the Bifidobacterium bifidum phylogenetic group, which also contains B. biavatii DSM 23969, whose relatedness was highlighted in the ANI analysis (see above). In addition, strain 2028B grouped together with B. vansinderenii LMG 30126. Thus, based on these phylogenomic analyses, the relatedness among bifidobacterial type strains allowed the identification of a new phylogenetic cluster, which consists of strain 2028B plus six strains isolated from various monkey species [19, 21,22,23], here proposed to constitute the Bifidobacterium tissieri group (Fig. 2).
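The selection of core genes for the phylogenetic tree can be illustrated with a toy example. The sketch below assumes that ortholog clustering has already produced a table of gene counts per genome for each cluster; the data structure, cluster names, and genome names are hypothetical, and the clustering itself is not shown.

```python
def core_and_single_copy(clusters, genomes):
    """Split ortholog clusters into core and single-copy core sets.

    clusters maps a cluster ID to a dict of genome -> gene count.
    A core cluster has at least one gene in every genome; a single-copy
    core cluster has exactly one gene in every genome (no paralogs).
    """
    core = [
        cid for cid, counts in clusters.items()
        if all(counts.get(g, 0) >= 1 for g in genomes)
    ]
    single_copy = [
        cid for cid in core
        if all(clusters[cid][g] == 1 for g in genomes)
    ]
    return core, single_copy


# Toy data for three hypothetical genomes.
genomes = ["g1", "g2", "g3"]
clusters = {
    "COG_a": {"g1": 1, "g2": 1, "g3": 1},  # core, single copy
    "COG_b": {"g1": 2, "g2": 1, "g3": 1},  # core, but with a paralog in g1
    "COG_c": {"g1": 1, "g2": 1},           # absent from g3, not core
    "COG_d": {"g1": 1, "g2": 1, "g3": 1},  # core, single copy
}
core, single_copy = core_and_single_copy(clusters, genomes)
print(core, single_copy)

# The same bookkeeping with the numbers reported in the text:
# 261 core COGs minus 28 with paralogs leaves 233 for concatenation.
print(261 - 28)
```

Restricting the concatenated alignment to single-copy core genes avoids having to choose between paralogous copies when building the tree.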
In the current study, we demonstrated how the implementation of selected tools for the identification of putative novel bacterial taxa from WMGS sequencing data allowed insights into the microbial dark matter of the mammalian gut. Depending on the scientific field of interest, this approach can be applied to any bacterial genus for which several genome sequences have been decoded and for which there is only minimal knowledge of the associated nutritional requirements. The predicted genetic makeup thus informs cultivation attempts to facilitate the isolation of novel species of the examined genus. This approach was successfully applied to unravel the dark matter concerning key mammalian gut commensals belonging to the genus Bifidobacterium, ultimately resulting in the identification of two novel bifidobacterial species.
Abbreviations
ANI: Average nucleotide identity
CDM: Chemically defined medium
ITS: Internal transcribed spacer
WGS: Whole genome sequencing
WMGS: Whole metagenome shotgun
References
1. Koboldt DC, Steinberg KM, Larson DE, Wilson RK, Mardis ER. The next-generation sequencing revolution and its impact on genomics. Cell. 2013;155:27–38.
2. Paez-Espino D, Eloe-Fadrosh EA, Pavlopoulos GA, Thomas AD, Huntemann M, Mikhailova N, Rubin E, Ivanova NN, Kyrpides NC. Uncovering Earth’s virome. Nature. 2016;536:425–30.
3. Klemm E, Dougan G. Advances in understanding bacterial pathogenesis gained from whole-genome sequencing and phylogenetics. Cell Host Microbe. 2016;19:599–610.
4. Kowarsky M, Camunas-Soler J, Kertesz M, De Vlaminck I, Koh W, Pan W, Martin L, Neff NF, Okamoto J, Wong RJ, et al. Numerous uncharacterized and highly divergent microbes which colonize humans are revealed by circulating cell-free DNA. Proc Natl Acad Sci U S A. 2017;114:9623–8.
5. Nielsen HB, Almeida M, Juncker AS, Rasmussen S, Li J, Sunagawa S, Plichta DR, Gautier L, Pedersen AG, Le Chatelier E, et al. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes. Nat Biotechnol. 2014;32:822–8.
6. Papudeshi B, Haggerty JM, Doane M, Morris MM, Walsh K, Beattie DT, Pande D, Zaeri P, Silva GGZ, Thompson F, et al. Optimizing and evaluating the reconstruction of metagenome-assembled microbial genomes. BMC Genomics. 2017;18:915.
7. Milani C, Duranti S, Bottacini F, Casey E, Turroni F, Mahony J, Belzer C, Delgado Palacio S, Arboleya Montes S, Mancabelli L, et al. The first microbial colonizers of the human gut: composition, activities, and health implications of the infant gut microbiota. Microbiol Mol Biol Rev. 2017;81(4):e00036–17.
8. Yatsunenko T, Rey FE, Manary MJ, Trehan I, Dominguez-Bello MG, Contreras M, Magris M, Hidalgo G, Baldassano RN, Anokhin AP, et al. Human gut microbiome viewed across age and geography. Nature. 2012;486:222–7.
9. Lozupone CA, Stombaugh JI, Gordon JI, Jansson JK, Knight R. Diversity, stability and resilience of the human gut microbiota. Nature. 2012;489:220–30.
10. Morgan XC, Huttenhower C. Meta’omic analytic techniques for studying the intestinal microbiome. Gastroenterology. 2014;146:1437–1448 e1431.
11. Lagier JC, Khelaifia S, Alou MT, Ndongo S, Dione N, Hugon P, Caputo A, Cadoret F, Traore SI, Seck EH, et al. Culture of previously uncultured members of the human gut microbiota by culturomics. Nat Microbiol. 2016;1:16203.
12. Vezzulli L, Grande C, Tassistro G, Brettar I, Hofle MG, Pereira RP, Mushi D, Pallavicini A, Vassallo P, Pruzzo C. Whole-genome enrichment provides deep insights into Vibrio cholerae metagenome from an African River. Microb Ecol. 2017;73:734–8.
13. Maixner F, Krause-Kyora B, Turaev D, Herbig A, Hoopmann MR, Hallows JL, Kusebauch U, Vigl EE, Malfertheiner P, Megraud F, et al. The 5300-year-old Helicobacter pylori genome of the Iceman. Science. 2016;351:162–5.
14. Mailhe M, Ricaboni D, Vitton V, Gonzalez JM, Bachar D, Dubourg G, Cadoret F, Robert C, Delerce J, Levasseur A, et al. Repertoire of the gut microbiota from stomach to colon using culturomics and next-generation sequencing. BMC Microbiol. 2018;18:157.
15. Milani C, Mangifesta M, Mancabelli L, Lugli GA, James K, Duranti S, Turroni F, Ferrario C, Ossiprandi MC, van Sinderen D, Ventura M. Unveiling bifidobacterial biogeography across the mammalian branch of the tree of life. ISME J. 2017;11(12):2834–47.
16. Milani C, Casey E, Lugli GA, Moore R, Kaczorowska J, Feehily C, Mangifesta M, Mancabelli L, Duranti S, Turroni F, et al. Tracing mother-infant transmission of bacteriophages by means of a novel analytical tool for shotgun metagenomic datasets: METAnnotatorX. Microbiome. 2018;6:145.
17. Milani C, Lugli GA, Turroni F, Mancabelli L, Duranti S, Viappiani A, Mangifesta M, Segata N, van Sinderen D, Ventura M. Evaluation of bifidobacterial community composition in the human gut by means of a targeted amplicon sequencing (ITS) protocol. FEMS Microbiol Ecol. 2014;90:493–503.
18. Duranti S, Lugli GA, Napoli S, Anzalone R, Milani C, Mancabelli L, Alessandri G, Turroni F, Ossiprandi MC, van Sinderen D, Ventura M. Characterization of the phylogenetic diversity of five novel species belonging to the genus Bifidobacterium: Bifidobacterium castoris sp. nov., Bifidobacterium callimiconis sp. nov., Bifidobacterium goeldii sp. nov., Bifidobacterium samirii sp. nov. and Bifidobacterium dolichotidis sp. nov. Int J Syst Evol Microbiol. 2019;69(5):1288–98.
19. Lugli GA, Milani C, Duranti S, Mancabelli L, Mangifesta M, Turroni F, Viappiani A, van Sinderen D, Ventura M. Tracking the taxonomy of the genus Bifidobacterium based on a phylogenomic approach. Appl Environ Microbiol. 2018;84(4):e02249–17.
20. Konstantinidis KT, Tiedje JM. Genomic insights that advance the species definition for prokaryotes. Proc Natl Acad Sci U S A. 2005;102:2567–72.
21. Modesto M, Puglisi E, Bonetti A, Michelini S, Spiezio C, Sandri C, Sgorbati B, Morelli L, Mattarelli P. Bifidobacterium primatium sp. nov., Bifidobacterium scaligerum sp. nov., Bifidobacterium felsineum sp. nov. and Bifidobacterium simiarum sp. nov.: four novel taxa isolated from the faeces of the cotton top tamarin (Saguinus oedipus) and the emperor tamarin (Saguinus imperator). Syst Appl Microbiol. 2018;41:593–603.
22. Lugli GA, Mangifesta M, Duranti S, Anzalone R, Milani C, Mancabelli L, Alessandri G, Turroni F, Ossiprandi MC, van Sinderen D, Ventura M. Phylogenetic classification of six novel species belonging to the genus Bifidobacterium comprising Bifidobacterium anseris sp. nov., Bifidobacterium criceti sp. nov., Bifidobacterium imperatoris sp. nov., Bifidobacterium italicum sp. nov., Bifidobacterium margollesii sp. nov. and Bifidobacterium parmae sp. nov. Syst Appl Microbiol. 2018;41:173–83.
23. Duranti S, Mangifesta M, Lugli GA, Turroni F, Anzalone R, Milani C, Mancabelli L, Ossiprandi MC, Ventura M. Bifidobacterium vansinderenii sp. nov., isolated from faeces of emperor tamarin (Saguinus imperator). Int J Syst Evol Microbiol. 2017;67(10):3987–95.
24. Lugli GA, Milani C, Duranti S, Alessandri G, Turroni F, Mancabelli L, Tatoni D, Ossiprandi MC, van Sinderen D, Ventura M. Beyond the tip of the iceberg: illuminating bacterial dark matter through a combination of metagenomics and culturomic approaches. NCBI Bioproject PRJNA494387. 2019. https://www.ncbi.nlm.nih.gov/bioproject/494387.
25. Lugli GA, Duranti S, Milani C. Characterization of the phylogenetic diversity of novel Bifidobacterium species 2028B. NCBI. https://www.ncbi.nlm.nih.gov/nuccore/QXGJ00000000. Accessed 17 Dec 2018.
26. Lugli GA, Duranti S, Milani C. Characterization of the phylogenetic diversity of novel Bifidobacterium species 2034B. NCBI. https://www.ncbi.nlm.nih.gov/nuccore/QXGL00000000. Accessed 17 Dec 2018.
27. Lugli GA, Milani C, Duranti S, Alessandri G, Turroni F, Mancabelli L, Tatoni D, Ossiprandi MC, van Sinderen D, Ventura M. Novel species identifier - METAnnotatorX custom-made script. GitHub repository. 2019; https://github.com/GabrieleAndrea/Novel-species-identifier. Accessed 6 May 2019.
28. Lugli GA, Milani C, Duranti S, Alessandri G, Turroni F, Mancabelli L, Tatoni D, Ossiprandi MC, van Sinderen D, Ventura M. Novel species identifier - METAnnotatorX custom-made script. Zenodo. 2019; https://zenodo.org/badge/latestdoi/185175819. Accessed 6 May 2019.
29. Broad Institute. Microbial composition of samples from infant gut. NCBI Bioproject PRJNA63661. 2011. https://www.ncbi.nlm.nih.gov/bioproject/63661.
We thank GenProbio srl for financial support of the Laboratory of Probiogenomics. Part of this research was conducted using the High Performance Computing (HPC) facility of the University of Parma.
This work was funded by the EU Joint Programming Initiative – A Healthy Diet for a Healthy Life (JPI HDHL, http://www.healthydietforhealthylife.eu/) to DvS (in conjunction with Science Foundation Ireland [SFI], Grant number 15/JP-HDHL/3280) and to MV (in conjunction with MIUR, Italy). DvS is a member of The APC Microbiome Institute funded by Science Foundation Ireland (SFI), through the Irish Government’s National Development Plan (Grant number SFI/12/RC/2273).
Availability of data and materials
Shotgun metagenomics data are accessible through BioProject PRJNA494387 at the NCBI SRA database. Genome sequences of novel bifidobacterial taxa 2028B and 2034B were deposited at DDBJ/ENA/GenBank and are available under accession numbers QXGJ00000000 and QXGL00000000. The versions described in this paper are QXGJ01000000 and QXGL01000000. The custom script employing the results of the METAnnotatorX pipeline is available at GitHub under the GNU General Public License (https://github.com/GabrieleAndrea/Novel-species-identifier) and at Zenodo (https://zenodo.org/badge/latestdoi/185175819). Additional third-party data used to validate the custom script were retrieved from the NCBI SRA database under BioProject PRJNA63661.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Additional file 1: Supplementary materials. (DOCX 56 kb)
Additional file 2: Table S1. Metagenomic sequence data. Table S2. Predicted glycobiome of novel species. Table S3. ITS sequences of the 15 bifidobacterial isolates. Table S4. General genetic features. Table S5. Average Nucleotide Identity between bifidobacterial taxa. (XLSX 78 kb)
Additional file 3: Figure S1. Evaluation of the accuracy of METAnnotatorX in the detection of a specific species mixed within WMGS sequencing data of Bos javanicus. Figure S2. Workflow of the experiments. Figure S3. Viable cell counts and enrichment of novel bifidobacterial strains based on different carbon sources. Figure S4. Validation of the custom METAnnotatorX pipeline for the identification of novel bacterial species. (DOCX 1306 kb)