Fig. 2 | Genome Biology

Fig. 2

From: Mash Screen: high-throughput sequence containment estimation for genome discovery

Mash Screen algorithmic overview. (A) The minimum m hashes (in this case 3, shown colored) for each reference sequence are determined during sketching to produce (B) a reference MinHash sketch library. For screening, distinct hashes from all reference sketches are collected and used as keys to (C) a map of observed counts per hash, which is populated by (D) hashing k-mers from the sequence mixture as it is streamed. (E) Counts from the map are queried for each sketch to produce (F) a containment estimate for each constituent of the mixture.
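
The steps (A) to (F) in the caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Mash implementation: the function names (`kmer_hashes`, `sketch`, `screen`) are hypothetical, the hash is a stand-in (64-bit BLAKE2b from `hashlib`), only forward-strand k-mers are hashed, and containment is reported simply as the fraction of a sketch's hashes observed at least once in the mixture.

```python
import hashlib


def kmer_hashes(seq, k):
    """Yield a 64-bit hash for every k-mer in seq (forward strand only; an assumption)."""
    for i in range(len(seq) - k + 1):
        digest = hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest()
        yield int.from_bytes(digest, "big")


def sketch(seq, k, m):
    """(A) Keep the m smallest distinct k-mer hashes of one reference sequence."""
    return sorted(set(kmer_hashes(seq, k)))[:m]


def screen(references, mixture_reads, k, m):
    """Return {reference name: estimated containment in the mixture}."""
    # (B) Reference MinHash sketch library.
    library = {name: sketch(seq, k, m) for name, seq in references.items()}

    # (C) Map keyed by every distinct hash in the library, counts start at zero.
    counts = {h: 0 for hashes in library.values() for h in hashes}

    # (D) Stream the mixture once, counting only hashes that are keys in the map.
    for read in mixture_reads:
        for h in kmer_hashes(read, k):
            if h in counts:
                counts[h] += 1

    # (E) Query the counts for each sketch and (F) report the fraction observed.
    result = {}
    for name, hashes in library.items():
        if not hashes:  # reference shorter than k: nothing to estimate
            result[name] = 0.0
            continue
        observed = sum(1 for h in hashes if counts[h] > 0)
        result[name] = observed / len(hashes)
    return result
```

Calling `screen({"refA": seq_a, "refB": seq_b}, reads, k=21, m=1000)` with hypothetical inputs returns one score per reference. Because the map is keyed only by sketch hashes, memory depends on the size of the sketch library rather than on the size of the mixture, which is what allows the mixture to be streamed.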