Fig. 2 | Genome Biology

Fig. 2

From: Taxonomer: an interactive metagenomics analysis portal for universal pathogen detection and host mRNA expression profiling

Performance of the “Classifier” module for bacterial and fungal classification and bacterial community profiling.
a Taxonomer provides superior sensitivity and specificity for read-level bacterial classification compared to two other rapid classification tools, SURPI [32] and Kraken [30], when each tool is run with its default settings and database: nt (www.ncbi.nlm.nih.gov/nucleotide, SURPI), RefSeq (Kraken), and Greengenes 99 % OTU [70] (Taxonomer). Results for SURPI are based on correct identification by either (dark bar) or both (light bar) read mates.
b Among three commonly used reference databases, RefSeq (n = 210,627; 5,242 bacterial genomes), Greengenes 99 % OTU (n = 203,452), and RDP (n = 2,929,433), Taxonomer achieves the greatest read-level (top) and taxon-level (bottom, i.e. percentage of bacterial species identified) sensitivity for bacterial classification with the Greengenes database, at only a moderate decrease in specificity relative to the RDP and RefSeq databases (simulated 16S rDNA as in a). Because of its large size and greater completeness, the RDP database provides the greatest species-level specificity at the cost of sensitivity. For ease of reference, the top right-most column is repeated from (a).
c Bacterial classification accuracy of Taxonomer is similar to that of the RDP Classifier [35] and superior to that of Kraken at the read level (top) and taxon level (bottom), all using the Greengenes database. Given the applied criteria, BLAST [34] is less sensitive but more specific.
d Taxonomer also performs similarly to the RDP Classifier and better than Kraken for classification of synthetic fungal internal transcribed spacer (ITS) sequences at the read level (top) and taxon level (bottom).
e Taxonomer classifies bacterial 16S rRNA reads >200-fold faster than the RDP Classifier (times for 1 CPU; multithreading is not available for the RDP Classifier) while providing highly comparable bacterial community profiles for both 16S rRNA gene amplicon sequencing and shotgun metagenomics. Spearman correlation coefficients (ρ) of abundance estimates are shown for Taxonomer and the RDP Classifier at the order and genus levels using the Greengenes 99 % OTU reference database. *2.5 %; **1.9 %; ***2.5 %
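
For readers unfamiliar with the statistic reported in panel e, the following is a minimal sketch of how a Spearman correlation coefficient (ρ) between two taxon abundance profiles could be computed. It is not the authors' analysis code. The function names (`average_ranks`, `spearman_rho`, `profile_correlation`) and the choice to treat a taxon missing from one profile as having zero abundance are assumptions made for illustration only.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when one profile is constant")
    return cov / (sd_x * sd_y)


def profile_correlation(profile_a, profile_b):
    """Compare two {taxon: abundance} dicts over the union of their taxa.

    Assumption: a taxon absent from one profile has abundance 0 there.
    """
    taxa = sorted(set(profile_a) | set(profile_b))
    x = [profile_a.get(t, 0.0) for t in taxa]
    y = [profile_b.get(t, 0.0) for t in taxa]
    return spearman_rho(x, y)


# Hypothetical order-level relative abundances from two classifiers.
tool_a = {"Lactobacillales": 0.40, "Bacteroidales": 0.30, "Clostridiales": 0.20}
tool_b = {"Lactobacillales": 0.45, "Bacteroidales": 0.35, "Clostridiales": 0.15}
rho = profile_correlation(tool_a, tool_b)
```

Because ρ depends only on the rank order of taxa, two profiles that order the taxa identically yield a coefficient near 1 even when the absolute abundance estimates differ.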
