Figure 3 | Genome Biology

Figure 3

From: Reducing assembly complexity of microbial genomes with single-molecule sequencing

Sequence length compensates for increased error. The mean number of expected 10 bp seeds (the default in BLASR) was computed for each sequence length and error rate following the method in Chaisson and Tesler [28]. Additional seeds decrease the number of matches that have to be examined, which decreases runtime and increases accuracy. For example, increasing the number of 15 bp seeds from 10 to 20 reduces the number of sequences with over 100 matches to the human reference by 25% [28]. Points correspond to the median sequence length and observed error rate of four PacBio RS sequencing chemistries. Sequence length also compensates for increased error because more seeds can be found in a longer sequence. For example, 20 seeds (dashed line) can be found both in a 0.75 kbp sequence at 15% error and in an approximately 2.5 kbp sequence at 30% error.
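
The trade-off described in the caption can be illustrated with a minimal sketch. It assumes a simplified model in which every base is wrong independently with probability equal to the error rate, and in which a seed is a maximal error-free run of at least 10 bp. This is an illustrative approximation, not necessarily the exact calculation of Chaisson and Tesler [28], and the function name `expected_seeds` is hypothetical.

```python
def expected_seeds(read_length: int, error_rate: float, seed_length: int = 10) -> float:
    """Expected number of maximal error-free runs of at least `seed_length` bases.

    Assumption (not from the source): errors are independent per base with
    probability `error_rate`.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if read_length < seed_length:
        return 0.0
    # Probability that seed_length consecutive bases are all error-free.
    p_clean = (1.0 - error_rate) ** seed_length
    # A qualifying run can start at the first base (no preceding base needed),
    # or at any of the other (read_length - seed_length) start positions,
    # provided the base immediately before it is an error.
    return p_clean * (1.0 + (read_length - seed_length) * error_rate)


if __name__ == "__main__":
    # The two example points from the caption.
    for length, error in [(750, 0.15), (2500, 0.30)]:
        print(f"{length} bp at {error:.0%} error: {expected_seeds(length, error):.1f} seeds")
```

Under this simplified model, the two example points from the caption give about 22 and 21 expected seeds respectively, both close to the 20-seed contour, which is consistent with a roughly threefold increase in length offsetting a doubling of the error rate.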
