From: SEPATH: benchmarking the search for pathogens in human tissue whole genome sequence data leads to template pipelines

Genus-level performance of Kraken on contigs following metagenomic assembly with MetaSPAdes. Performance is summarized by genus-level F1 score (a), sensitivity (b), and PPV (c). A single dataset failed metagenomic assembly, so data are shown for 99 of the 100 simulated datasets. Performance on raw Kraken classifications with no threshold applied (unfiltered) is shown in dark blue. Performance when a minimum of 5 contigs assigned to a genus was required (filtered) is shown in light blue. Median values for unfiltered performance were 0.83, 0.88, and 0.81, and for filtered performance were 0.89, 0.85, and 0.94, for F1 score, sensitivity, and PPV, respectively. (d) KrakenUniq filtering parameters in relation to detection status. The y-axis indicates the number of unique k-mers assigned to a particular taxon, the x-axis represents the number of contigs assigned to a particular taxon (log10), and the color gradient shows the coverage of the clade in the database (log10). True-positive results are shown as larger circles, whereas false-positive results are shown as smaller triangles. The scatter plot shows 10,450 contigs classified at genus level as data points; the ggplot package alpha level was set to 0.3 because of the large number of overlapping points. k = 31
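
The three metrics in panels a to c, and the effect of the 5-contig threshold, can be illustrated with a minimal Python sketch. It is not the SEPATH implementation. It assumes that a genus counts as detected when at least `min_contigs` contigs are assigned to it, and that true and false positives are scored on the presence or absence of each genus in a dataset. The function name `genus_metrics` and the example genera are hypothetical.

```python
from collections import Counter


def genus_metrics(contig_genera, true_genera, min_contigs=1):
    """Score genus-level detection from per-contig genus assignments.

    Assumption: a genus is 'detected' if at least `min_contigs` contigs
    were assigned to it; scoring is presence/absence per genus.
    """
    counts = Counter(contig_genera)
    predicted = {genus for genus, n in counts.items() if n >= min_contigs}
    truth = set(true_genera)

    tp = len(predicted & truth)   # detected and truly present
    fp = len(predicted - truth)   # detected but not present
    fn = len(truth - predicted)   # present but not detected

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    denom = sensitivity + ppv
    f1 = 2 * sensitivity * ppv / denom if denom else 0.0
    return {"sensitivity": sensitivity, "ppv": ppv, "f1": f1}


# Hypothetical toy data: one genus label per classified contig.
contigs = ["Escherichia"] * 6 + ["Salmonella"] * 5 + ["Bacillus"] * 2
truth = ["Escherichia", "Salmonella", "Klebsiella"]

unfiltered = genus_metrics(contigs, truth)                # no threshold
filtered = genus_metrics(contigs, truth, min_contigs=5)   # >= 5 contigs per genus
```

In this toy case the threshold removes the weakly supported false-positive genus, so PPV rises while sensitivity is unchanged. In the real data the caption reports a small drop in median sensitivity (0.88 to 0.85) alongside the gain in PPV (0.81 to 0.94), which is the trade-off expected when some true genera are supported by fewer than 5 contigs.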

