Fig. 4 | Genome Biology

Fig. 4

From: RefSeq database growth influences the accuracy of k-mer-based lowest common ancestor species identification

The fraction of reads classified among Bacillus species varied depending on which RefSeq version was used. a Classification of B. cereus VD118 reads with Kraken (left) and Bracken (right) against different versions of RefSeq. Species-level classifications varied, and with Kraken the fraction of unclassified reads decreased as the database grew. Once B. cereus VD118 appeared in the database (ver. 60), Bracken correctly classified every read. b With Kraken, species-level classifications of real reads from an environmental Bacillus cereus strain not in RefSeq decreased as RefSeq grew. Fraction of B. cereus ISSFR-23F reads classified using Kraken ver. 1.0 (left) and Bracken ver. 1.0.0 (right) against different versions of bacterial RefSeq. Bracken pushed all reads to a species-level call, though these calls were often for other Bacillus species.
