Table 2 Expected error rates after filtering

From: Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and Genome Analyzer systems

Expected error rate^a (percentage of bases discarded)

Filter                    Bv + PhiX        At + PhiX        PhiX-GAIIx
No filter                 7.093 (0.0%)     10.619 (0.0%)    16.434 (0.0%)
ChF                       2.583 (10.2%)    4.819 (12.7%)    7.360 (17.8%)
B-tail                    0.163 (11.0%)    0.205 (16.6%)    0.229 (25.8%)
N                         5.943 (2.5%)     9.688 (2.3%)     8.601 (15.6%)
C33                       1.521 (14.3%)    2.907 (17.7%)    5.207 (21.7%)
A30                       1.802 (12.7%)    3.586 (15.5%)    5.457 (21.3%)
B-tail + N                0.141 (11.4%)    0.187 (17.0%)    0.206 (26.8%)
B-tail + ChF              0.118 (13.8%)    0.161 (19.2%)    0.204 (27.2%)
B-tail + ChF + C33        0.083 (16.9%)    0.127 (22.1%)    0.174 (28.3%)
B-tail + ChF + A30        0.093 (15.8%)    0.139 (21.0%)    0.176 (28.2%)
B-tail + ChF + N + C33    0.077 (17.1%)    0.125 (22.2%)    0.168 (28.9%)
B-tail + ChF + N + A30    0.085 (16.0%)    0.136 (21.1%)    0.171 (28.6%)

^a Expected error rate: the average per-base error probability, as given by the quality scores that Illumina assigns to each base. No filter, raw data prior to filtering; ChF, Illumina chastity filter; B-tail, B-tail trimming; N, removal of reads with at least one uncalled base; C33, removal of reads in which fewer than two-thirds of the bases in the first half of the read have Q ≥ 30; A30, removal of reads with an average Q-score < 30 in the first 30% of the read. Note that some quality filters partly discard the same reads (for instance, the chastity filter and the removal of reads containing at least one uncalled base), which is why the percentage of bases discarded by a combination of filters is lower than the sum of the individual percentages.
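
The definitions in the footnote can be illustrated with a minimal Python sketch. This is not the authors' pipeline; it rests on several assumptions: quality scores are Phred-scaled, so that a score Q corresponds to an error probability of 10^(-Q/10); each read is already available as a base string or a list of integer Q-scores; and "the first half" and "the first 30%" of a read are rounded down to whole bases. All function names are hypothetical.

```python
from statistics import mean


def error_probability(q):
    """Error probability for a Phred-scaled score (assumption: P = 10^(-Q/10))."""
    return 10 ** (-q / 10)


def expected_error_rate(reads):
    """Mean error probability over every base of every read.

    `reads` is a list of reads, each a list of integer Q-scores.
    Returns a fraction; multiply by 100 for a percentage.
    """
    return mean(error_probability(q) for quals in reads for q in quals)


def passes_n(seq):
    """N filter: keep the read only if it contains no uncalled base."""
    return "N" not in seq


def passes_c33(quals):
    """C33 filter: keep the read if at least two-thirds of the bases
    in its first half have Q >= 30 (half rounded down: an assumption)."""
    first_half = quals[: len(quals) // 2]
    good = sum(1 for q in first_half if q >= 30)
    # Integer form of good / len(first_half) >= 2/3, avoiding float rounding.
    return 3 * good >= 2 * len(first_half)


def passes_a30(quals):
    """A30 filter: keep the read if the mean Q-score of its first 30%
    is at least 30 (30% rounded down, minimum one base: an assumption)."""
    n = max(1, len(quals) * 3 // 10)
    return mean(quals[:n]) >= 30


# Usage: apply the N, C33 and A30 filters to a toy set of reads,
# then compare the expected error rate before and after filtering.
toy_reads = [
    ("ACGTACGTAC", [40, 40, 30, 30, 20, 10, 2, 2, 2, 2]),
    ("ACNTAC", [30, 20, 20, 20, 30, 30]),
]
kept = [
    (seq, quals)
    for seq, quals in toy_reads
    if passes_n(seq) and passes_c33(quals) and passes_a30(quals)
]
rate_before = expected_error_rate([quals for _, quals in toy_reads])
rate_after = expected_error_rate([quals for _, quals in kept])
```

Because the filters are written as independent predicates, reads rejected by more than one filter are counted only once when the filters are combined, which mirrors the overlap described in the footnote.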