Table 1 Identified contaminants in the Shakya dataset measured by mapped reads and Mash Screen

From: Mash Screen: high-throughput sequence containment estimation for genome discovery

| Organism | Reads | Coverage | Identity | Score | p value |
| --- | --- | --- | --- | --- | --- |
| Propionibacterium acnes HL072PA1 | 36,222 | 7.29% | 96.36% | 0.874 | 3.49e−136 |
| Escherichia coli strain 2014C-3250 | 59,744 | 4.99% | 95.16% | 0.837 | 7.32e−47 |
| Proteiniclasticum ruminis DSM 24773 | 751,538 | 76.57% | 91.83% | 0.930 | 0 |
| Streptococcus parasanguinis strain C1A | 74,807 | 57.50% | 95.84% | 0.942 | 0 |

  1. Organism refers to the closest strain in RefSeq to the suspected contaminant, based on Mash Screen distance. Reads refers to the number of reads mapped to each of those genomes. Coverage refers to the fraction of each genome that was covered by mapped reads. Identity refers to the average identity of the covered portions, based on a naive consensus from the pileups. Score and p value are results from Mash Screen.
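
To make the Coverage and Identity columns concrete, the following is a minimal sketch of how such values could be computed from a pileup. It is an illustration under stated assumptions, not the authors' pipeline: the function name `coverage_and_identity`, the list-of-lists pileup representation, and the majority-vote consensus rule are all hypothetical choices made here for clarity.

```python
from collections import Counter


def coverage_and_identity(reference, pileup):
    """Return (coverage, identity) for one reference sequence.

    reference: the reference genome as a string of bases.
    pileup:    one list per reference position holding the read bases
               aligned there (an empty list means no read covers it).

    Assumption: "naive consensus" is taken to mean a simple majority
    vote per covered position; ties go to the base seen first.
    """
    if len(reference) != len(pileup):
        raise ValueError("reference and pileup must have the same length")

    covered = 0   # positions with at least one aligned base
    matches = 0   # covered positions whose consensus equals the reference

    for ref_base, column in zip(reference, pileup):
        if not column:
            continue  # uncovered positions do not count toward identity
        covered += 1
        consensus = Counter(column).most_common(1)[0][0]
        if consensus == ref_base:
            matches += 1

    coverage = covered / len(reference) if reference else 0.0
    identity = matches / covered if covered else 0.0
    return coverage, identity


# Toy example: 8-base reference, 5 positions covered, 1 consensus mismatch.
toy_reference = "ACGTACGT"
toy_pileup = [["A", "A"], ["C"], [], [], ["A", "T", "T"], ["C"], ["G", "G"], []]
toy_coverage, toy_identity = coverage_and_identity(toy_reference, toy_pileup)
print(f"coverage={toy_coverage:.2%} identity={toy_identity:.2%}")
```

Because identity is measured only over covered positions, a genome can show low coverage and high identity at the same time, as in the first two rows of the table.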