
Table 1 Identified contaminants in the Shakya dataset measured by mapped reads and Mash Screen

From: Mash Screen: high-throughput sequence containment estimation for genome discovery

Organism                              Reads     Coverage  Identity  Score  p value
Propionibacterium acnes HL072PA1      36,222    7.29%     96.36%    0.874  3.49e-136
Escherichia coli strain 2014C-3250    59,744    4.99%     95.16%    0.837  7.32e-47
Proteiniclasticum ruminis DSM 24773   751,538   76.57%    91.83%    0.930  0
Streptococcus parasanguinis strain C1A  74,807  57.50%    95.84%    0.942  0
  1. Organism is the strain in RefSeq closest to the suspected contaminant, based on Mash Screen distance. Reads is the number of reads mapped to that genome. Coverage is the fraction of each genome that was covered by mapped reads. Identity is the average identity of the covered portions, based on a naive consensus from the pileups. Score and p value are the results reported by Mash Screen.
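
To make the Coverage and Identity definitions in the footnote concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' pipeline: the function name `coverage_and_identity`, the input layout (one string of read bases per reference position), and the majority-vote consensus rule are all hypothetical choices made for this example.

```python
from collections import Counter


def coverage_and_identity(reference, pileup):
    """Return (coverage, identity) for one reference genome.

    reference: reference sequence as a string.
    pileup:    list with one string per reference position, holding the
               read bases aligned to that position ('' if no read covers it).
               This layout is an assumption made for the sketch.
    """
    if len(reference) != len(pileup):
        raise ValueError("pileup must have one column per reference position")

    covered = 0  # positions with at least one aligned base
    matches = 0  # covered positions whose consensus base equals the reference

    for ref_base, column in zip(reference, pileup):
        if not column:
            continue  # uncovered position: counts against coverage only
        covered += 1
        # Naive consensus: the most frequent base in the column.
        # Ties resolve to the base seen first in the column.
        consensus = Counter(column.upper()).most_common(1)[0][0]
        if consensus == ref_base.upper():
            matches += 1

    coverage = covered / len(reference) if reference else 0.0
    identity = matches / covered if covered else 0.0
    return coverage, identity


# Toy example: 8-base reference, 2 uncovered positions, 1 consensus mismatch.
toy_reference = "ACGTACGT"
toy_pileup = ["AA", "CC", "", "", "AAT", "G", "G", "T"]
toy_coverage, toy_identity = coverage_and_identity(toy_reference, toy_pileup)
print(f"coverage={toy_coverage:.2%} identity={toy_identity:.2%}")
```

In the toy input, six of eight positions are covered (75% coverage) and five of those six agree with the reference (about 83% identity). Identity is computed only over covered positions, which is how a genome such as the P. acnes entry can show low coverage but high identity.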