Fig. 7 | Genome Biology

Fig. 7

From: Simplitigs as an efficient and scalable representation of de Bruijn graphs

k-mer queries for multiple pan-genomes indexed simultaneously. Bacterial pan-genomes were computed from the complete GenBank assemblies for each individual species. While the All dataset comprises all pan-genomes with no restriction on their size, the Solid dataset comprises only those that contain at least 11 genomes. a Characteristics of the obtained unitigs and simplitigs: number of sequences (NS, millions) and their cumulative length (CL, gigabase pairs). The dot-dash line depicts the lower bound corresponding to the number of k-mers. b Memory footprint, time of index loading, and time of matching 10 million k-mers using BWA. The bars correspond to the mean of three measurements (black dots). Full results, including relative improvements, are available in Additional file 6: Tables S19–S23.
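The quantities in panel a can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and k = 3 are hypothetical choices made for illustration, and k-mers are assumed to be counted in canonical form (a k-mer and its reverse complement are treated as the same k-mer). When every k-mer occurs exactly once across the sequences, as in a unitig or simplitig representation, CL = (number of k-mers) + (k − 1) · NS, which is one way to see why the number of k-mers acts as a lower bound on CL.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[c] for c in reversed(seq))


def canonical(kmer):
    """Canonical form: the lexicographically smaller of a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def representation_stats(sequences, k):
    """Return (NS, CL, number of distinct canonical k-mers) for a list of sequences.

    NS is the number of sequences and CL is their cumulative length,
    matching the abbreviations used in the caption.
    """
    ns = len(sequences)
    cl = sum(len(s) for s in sequences)
    kmers = set()
    for s in sequences:
        for i in range(len(s) - k + 1):
            kmers.add(canonical(s[i:i + k]))
    return ns, cl, len(kmers)


# Toy example (hypothetical data): two sequences whose 3-mers are all distinct.
ns, cl, n_kmers = representation_stats(["ACGGA", "TTTAC"], 3)
print(ns, cl, n_kmers)  # 2 10 6
```

In this toy case CL − (k − 1) · NS = 10 − 2 · 2 = 6, which equals the number of distinct k-mers; fewer, longer sequences move CL closer to that bound.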
