Correction to: MicroPro: using metagenomic unmapped reads to provide insights into human microbiota and disease associations
Genome Biology volume 20, Article number: 214 (2019)
Correction to: Genome Biol
Following publication of the original paper , Dr. Nayfach kindly pointed out an error and the authors would like to report the following correction.
On page 7, paragraph 3, line 13, the statement “Nayfach et al. suggested Mash distance of 0.35 as a genus-level threshold for microbes” is incorrect. Nayfach et al.  used phylogenetic distance instead of Mash distance to define genus and higher levels of taxonomy.
In order to determine the genus level threshold for Mash distance, we used Mash v.2.0 with sketch size 1,000,000 and k-mer size of 21 to calculate pairwise Mash distances between all the 11,444 complete bacterial genomes in the Centrifuge database (up to December 10, 2018). We then obtained the distributions of the pairwise Mash distances for three different groups of genome pairs: A. both are from the same species; B. the two bacterial genomes are from different species but the same genus; and C. the two bacterial genomes are from different genera. We could observe clear separations between the three groups of genome pairs under cutoff of 0.05 and 0.34 (Fig. 7). We found that 98.02% of group A distances are below 0.05, 92.37% of group B distances are above 0.05 and below 0.34, and 91.27% of group C distances are above 0.34. These results demonstrate that Mash distance thresholds of 0.05 and 0.34 can be reasonably used as species and genus level thresholds, respectively.
In the original paper , we used Mash v.2.0 with default parameters in the taxonomic assignments of the MAGs generated in four datasets. However, default setting of Mash uses sketch size 1,000 which causes large variance in the calculated Mash distance . In this correction, we redid the taxonomic assignments using Mash v.2.0 with sketch size 1,000,000. Under Mash distance threshold of 0.05 (species) and 0.34 (genus), we found that 0.65 and 19.83% of the total MAGs could be assigned to the species and genus levels, respectively. Detailed taxonomic assignment results are provided in the updated Additional file 5: Table S4.
Zhu Z, Ren J, Michail S, Sun F. MicroPro: using metagenomic unmapped reads to provide insights into human microbiota and disease associations. Genome Biol. 2019;20:154. https://doi.org/10.1186/s13059-019-1773-5.
Nayfach S, Shi ZJ, Seshadri R, Pollard KS, Kyrpides NC. New insights from uncultivated genomes of the global human gut microbiome. Nature. 2019;568:505–10.
Ondov BD, Treangen TJ, Melsted P, Mallonee AB, Bergman NH, Koren S, Phillippy AM. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 2016;17:132.
Taxonomic assignments for each MAG generated from four metagenomic datasets. The corresponding Mash distance, number of matched sketches and the NCBI accession of the best hit are also provided. A MAG with multiple hits in the database is reported in the table only if all of its hits belong to a common microbial species/genus in the taxonomy tree.
About this article
Cite this article
Zhu, Z., Ren, J., Michail, S. et al. Correction to: MicroPro: using metagenomic unmapped reads to provide insights into human microbiota and disease associations. Genome Biol 20, 214 (2019). https://doi.org/10.1186/s13059-019-1826-9