
Table 2 Expected error rates after filtering

From: Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and Genome Analyzer systems

 

Expected error rate^a (percentage of bases discarded), by data set:

| Filter | Bv + PhiX | At + PhiX | PhiX-GAIIx |
|---|---|---|---|
| No filter | 7.093 (0.0%) | 10.619 (0.0%) | 16.434 (0.0%) |
| ChF | 2.583 (10.2%) | 4.819 (12.7%) | 7.360 (17.8%) |
| B-tail | 0.163 (11.0%) | 0.205 (16.6%) | 0.229 (25.8%) |
| N | 5.943 (2.5%) | 9.688 (2.3%) | 8.601 (15.6%) |
| C33 | 1.521 (14.3%) | 2.907 (17.7%) | 5.207 (21.7%) |
| A30 | 1.802 (12.7%) | 3.586 (15.5%) | 5.457 (21.3%) |
| B-tail + N | 0.141 (11.4%) | 0.187 (17.0%) | 0.206 (26.8%) |
| B-tail + ChF | 0.118 (13.8%) | 0.161 (19.2%) | 0.204 (27.2%) |
| B-tail + ChF + C33 | 0.083 (16.9%) | 0.127 (22.1%) | 0.174 (28.3%) |
| B-tail + ChF + A30 | 0.093 (15.8%) | 0.139 (21.0%) | 0.176 (28.2%) |
| B-tail + ChF + N + C33 | 0.077 (17.1%) | 0.125 (22.2%) | 0.168 (28.9%) |
| B-tail + ChF + N + A30 | 0.085 (16.0%) | 0.136 (21.1%) | 0.171 (28.6%) |

^a Expected error rate: the average per-base error probability, derived from the quality scores assigned by Illumina. No filter, raw data prior to filtering; ChF, Illumina chastity filter; B-tail, B-tail trimming; N, removal of reads with at least one uncalled base; C33, removal of reads in which fewer than two-thirds of the bases in the first half of the read have Q ≥ 30; A30, removal of reads with an average Q-score < 30 in the first 30% of the read. Note that some quality filters discard partly the same reads (for instance, the chastity filter and the removal of reads containing at least one uncalled base), so the percentages discarded by individual filters do not simply add up when filters are combined.
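
The definitions in the footnote can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that quality scores are Phred-scaled, so that a score Q corresponds to an error probability of 10^(-Q/10), and that each read is available as a base string plus a list of integer Q-scores. All function names are hypothetical, and the way "first half" and "first 30%" are rounded for odd read lengths is an assumption. The chastity filter and B-tail trimming are not covered, and the error rate is returned as a plain fraction without any attempt to match the scaling used in the table.

```python
from typing import Callable, List, Sequence, Tuple

Read = Tuple[str, List[int]]  # (bases, Phred-scaled Q-scores); hypothetical layout


def error_probability(q: int) -> float:
    """Phred convention: Q = -10 * log10(p), hence p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def expected_error_rate(quals: Sequence[Sequence[int]]) -> float:
    """Mean per-base error probability over all bases, as a fraction."""
    probs = [error_probability(q) for read in quals for q in read]
    return sum(probs) / len(probs) if probs else 0.0


def passes_n(bases: str) -> bool:
    """N filter: keep reads without any uncalled base."""
    return "N" not in bases


def passes_c33(quals: Sequence[int]) -> bool:
    """C33: keep reads where at least two-thirds of the bases in the
    first half have Q >= 30 (first half rounded down; an assumption)."""
    first_half = quals[: len(quals) // 2]
    high = sum(q >= 30 for q in first_half)
    return 3 * high >= 2 * len(first_half)


def passes_a30(quals: Sequence[int]) -> bool:
    """A30: keep reads whose mean Q over the first 30% of the read is
    at least 30 (prefix length rounded down, minimum 1; an assumption)."""
    if not quals:
        return False
    head = quals[: max(1, len(quals) * 3 // 10)]
    return sum(head) / len(head) >= 30


def summarize(reads: List[Read], keep: Callable[[Read], bool]) -> Tuple[float, float]:
    """Return (expected error rate of kept reads, percent of bases discarded)."""
    kept = [r for r in reads if keep(r)]
    total = sum(len(q) for _, q in reads)
    kept_bases = sum(len(q) for _, q in kept)
    discarded_pct = 100 * (total - kept_bases) / total if total else 0.0
    return expected_error_rate([q for _, q in kept]), discarded_pct
```

Filters can be combined by passing a predicate such as `lambda r: passes_n(r[0]) and passes_c33(r[1])` to `summarize`. Because a read rejected by one filter is often rejected by another as well, the discarded percentage of a combination is generally smaller than the sum of the individual percentages.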