
Table 2 Adaptive sampling simulation using SPUMONI 1, SPUMONI 2 and minimap2. SPUMONI 1 indexes the full input database, while SPUMONI 2 indexes the minimizer-digested sequences of the database using the minimizer alphabet. The “SPUMONI 2a” columns give measurements for SPUMONI 2 with minimizer digestion disabled. Data are delivered in batches of 180 bp (0.4 s), and the goal is to decide whether or not to eject each read. Four batches, corresponding to 720 bp, were considered in the analysis. The mock community dataset of ONT reads (SRX7711546) consists of reads from 7 microbial species and 1 yeast species; the goal is to retain the yeast reads and eject the microbial reads. For the human microbiome study, bacterial reads from the microbiome were obtained from SRA accession SRX6602475, and human reads were simulated [19] from the CHM13 reference.
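The batch-wise setup described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: `matches_index` is a hypothetical stand-in for any classifier (SPUMONI or minimap2), and the rule "eject as soon as the data seen so far matches the index, otherwise retain after four batches" is an assumption made for illustration. Which class counts as "positive" for sensitivity and specificity is also an assumption, so it is passed in explicitly.

```python
from typing import Callable, List, Sequence, Tuple

BATCH_BP = 180    # bases delivered per batch (from the caption, about 0.4 s of data)
MAX_BATCHES = 4   # batches considered in the analysis (4 * 180 = 720 bp)


def batches(read: str, batch_bp: int = BATCH_BP,
            max_batches: int = MAX_BATCHES) -> List[str]:
    """Split the first batch_bp * max_batches bases of a read into batches."""
    prefix = read[: batch_bp * max_batches]
    return [prefix[i: i + batch_bp] for i in range(0, len(prefix), batch_bp)]


def decide(read: str, matches_index: Callable[[str], bool]) -> str:
    """Hypothetical decision rule: eject once the cumulative prefix matches
    the index; retain if no batch triggers ejection."""
    seen = ""
    for chunk in batches(read):
        seen += chunk
        if matches_index(seen):   # assumed classifier callback, not a real API
            return "eject"
    return "retain"


def sensitivity_specificity(truth: Sequence[str], predicted: Sequence[str],
                            positive: str) -> Tuple[float, float]:
    """Return (sensitivity, specificity) in percent for the chosen positive label.
    Assumes both classes occur at least once in `truth`."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return 100.0 * tp / (tp + fn), 100.0 * tn / (tn + fp)
```

With a toy classifier such as `lambda seen: "C" in seen`, a read whose first 720 bases contain no `C` is retained, regardless of what follows later in the read, because only the first four batches are inspected.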

From: SPUMONI 2: improved classification using a pangenome index of minimizer digests

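Scenario: Mock community

Goal: Retain yeast, eject microbial

Index database: 7 microbial species (n = 5867 genomes)

| Tool | SPUMONI 1 | SPUMONI 2a | SPUMONI 2 | minimap2 |
| --- | --- | --- | --- | --- |
| Sensitivity | 97.38 | 97.62 | 95.32 | 97.90 |
| Specificity | 90.77 | 97.38 | 97.06 | 97.85 |
| Index size | 1.54 GB | 1.54 GB | 0.74 GB | 50.9 GB |
| Peak memory | 1.62 GB | 1.62 GB | 0.80 GB | 8.40 GB |
| Time (s) | 367.39 | 362.89 | 193.53 | 2957.56 |

Scenario: Human microbiome

Goal: Retain microbial, eject human

Index database: Human (n = 10 genomes)

| Tool | SPUMONI 1 | SPUMONI 2a | SPUMONI 2 | minimap2 |
| --- | --- | --- | --- | --- |
| Sensitivity | 99.24 | 95.07 | 97.08 | 99.56 |
| Specificity | 94.30 | 99.97 | 99.12 | 99.97 |
| Index size | 10.2 GB | 10.2 GB | 4.21 GB | 65.5 GB |
| Peak memory | 11.0 GB | 11.0 GB | 4.56 GB | 9.99 GB |
| Time (s) | 1628.7 | 1747.12 | 732.63 | 3070.0 |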

a Running SPUMONI 2 without minimizer digestion (i.e., similar to SPUMONI 1 but using the new classification approach).
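To compare the tools more directly, the following short Python sketch computes how many times smaller the SPUMONI 2 index is than the minimap2 index, and how many times shorter its running time is, using only values copied from the table. The dictionary layout and the helper name `fold_change` are illustrative choices, not part of the paper.

```python
# Values copied from Table 2: (index size in GB, time in seconds).
TABLE = {
    "Mock community": {"SPUMONI 2": (0.74, 193.53), "minimap2": (50.9, 2957.56)},
    "Human microbiome": {"SPUMONI 2": (4.21, 732.63), "minimap2": (65.5, 3070.0)},
}


def fold_change(scenario: str) -> tuple:
    """Return (index-size ratio, time ratio) of minimap2 relative to SPUMONI 2."""
    s2_size, s2_time = TABLE[scenario]["SPUMONI 2"]
    mm_size, mm_time = TABLE[scenario]["minimap2"]
    return mm_size / s2_size, mm_time / s2_time


for name in TABLE:
    size_ratio, time_ratio = fold_change(name)
    print(f"{name}: index {size_ratio:.1f}x smaller, time {time_ratio:.1f}x shorter")
```

These ratios describe only index size and running time; the sensitivity and specificity rows still need to be read alongside them.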