Figure 3 | Genome Biology

Figure 3

From: Quake: quality-aware detection and correction of sequencing errors

k-mer coverage. A 15-mer coverage model fit to 76× coverage of 36 bp reads from E. coli. Note that the expected coverage of a k-mer in the genome using reads of length L is (L − k + 1)/L times the expected coverage of a single nucleotide, because the read must cover the full k-mer. Above, q-mer counts are binned at integers in the histogram. The error k-mer distribution rises outside the displayed region to 0.032 at coverage two and 0.691 at coverage one. The mixture parameter for the prior probability that a k-mer's coverage is from the error distribution is 0.73. The mean and variance for true k-mers are 41 and 77, suggesting that a coverage bias exists, as the variance is almost twice the theoretical value of 41 implied by the Poisson distribution. The likelihood ratio of error to true k-mer is one at a coverage of seven, but we may choose a smaller cutoff for some applications.
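
The arithmetic in the caption can be checked with a minimal Python sketch. The function names `expected_kmer_coverage` and `dispersion_ratio` are hypothetical helpers introduced here for illustration and are not part of Quake. The first applies the (L − k + 1)/L scaling to the stated 76× nucleotide coverage; the second computes the variance-to-mean ratio, which is 1 for a Poisson distribution.

```python
def expected_kmer_coverage(nucleotide_coverage: float, read_length: int, k: int) -> float:
    """Scale per-nucleotide coverage by (L - k + 1) / L (hypothetical helper)."""
    if not 0 < k <= read_length:
        raise ValueError("k must satisfy 0 < k <= read_length")
    return nucleotide_coverage * (read_length - k + 1) / read_length


def dispersion_ratio(mean: float, variance: float) -> float:
    """Variance-to-mean ratio; equals 1 for a Poisson distribution (hypothetical helper)."""
    return variance / mean


# Values taken from the caption: 76x coverage, 36 bp reads, k = 15.
kmer_cov = expected_kmer_coverage(76, 36, 15)
# Fitted mean and variance for true k-mers, also from the caption.
ratio = dispersion_ratio(41, 77)

print(round(kmer_cov, 2))  # 46.44
print(round(ratio, 2))     # 1.88
```

The first number is only the theoretical expectation under the caption's formula; the fitted mean reported in the caption is 41. The ratio of about 1.88 matches the caption's statement that the variance is almost twice the Poisson value.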
