
Table 6 Expected and observed error rates after filtering of aligned reads

From: Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and Genome Analyzer systems

 

| Filter | PhiX-Bv: Expected (%)ᵃ | PhiX-Bv: Observed (%)ᵇ | PhiX-Bv: Bases removed (%) | PhiX-GA: Expected (%)ᵃ | PhiX-GA: Observed (%)ᵇ | PhiX-GA: Bases removed (%) |
|---|---|---|---|---|---|---|
| No filter | 4.549 | 0.650 | 0.0 | 5.829 | 1.555 | 0.0 |
| ChF | 2.989 | 0.399 | 4.8 | 5.292 | 1.349 | 2.0 |
| C33 | 2.121 | 0.274 | 9.3 | 4.823 | 1.113 | 3.3 |
| B-tail | 0.166 | 0.130 | 7.0 | 0.194 | 0.309 | 9.0 |
| B-tail + ChF | 0.137 | 0.107 | 9.1 | 0.182 | 0.280 | 9.9 |
| B-tail + N | 0.159 | 0.130 | 7.2 | 0.187 | 0.310 | 9.5 |
| B-tail + C33 | 0.106 | 0.088 | 12.2 | 0.172 | 0.251 | 10.5 |
| B-tail + A30 | 0.114 | 0.092 | 11.3 | 0.172 | 0.262 | 10.5 |
| B-tail + ChF + C33 | 0.105 | 0.087 | 12.5 | 0.170 | 0.248 | 10.9 |
| B-tail + ChF + A30 | 0.113 | 0.091 | 11.6 | 0.170 | 0.257 | 10.8 |

ᵃ Expected error rate: average error probability of each base, derived from the Q-scores assigned by Illumina.

ᵇ Observed error rate: substitution error rate of aligned bases.

Abbreviations: ChF, Illumina chastity filter; B-tail, B-tail trimming; N, removal of reads with at least one uncalled base; C33, removal of reads in which fewer than two-thirds of the bases in the first half of the read have Q ≥ 30; A30, removal of reads with an average Q-score < 30 in the first 30% of the read.
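
The footnote definitions can be sketched in code. The following is a minimal illustration, not the authors' implementation: it assumes the standard Phred relation p = 10^(-Q/10) for converting a Q-score to an error probability, and it takes a read's qualities as a list of integer Q-scores. The function names (`expected_error_rate`, `passes_c33`, `passes_a30`) are hypothetical, and the rounding used for "first half" and "first 30%" on reads whose length does not divide evenly is an assumption, since the table does not specify it.

```python
from typing import Sequence


def expected_error_rate(quals: Sequence[int]) -> float:
    """Mean per-base error probability, in percent, implied by Phred Q-scores.

    Uses the standard Phred definition p = 10 ** (-Q / 10).
    """
    if not quals:
        raise ValueError("need at least one quality score")
    probs = [10 ** (-q / 10) for q in quals]
    return 100 * sum(probs) / len(probs)


def passes_c33(quals: Sequence[int]) -> bool:
    """C33 sketch: keep a read if at least two-thirds of the bases in its
    first half have Q >= 30.

    Assumption: "first half" is the first len // 2 bases (at least one).
    """
    half = quals[: max(1, len(quals) // 2)]
    high = sum(1 for q in half if q >= 30)
    # Integer comparison avoids floating-point issues with 2/3.
    return 3 * high >= 2 * len(half)


def passes_a30(quals: Sequence[int]) -> bool:
    """A30 sketch: keep a read if the average Q-score over its first 30%
    of bases is at least 30.

    Assumption: "first 30%" is floor(0.3 * length) bases (at least one).
    """
    n = max(1, (3 * len(quals)) // 10)
    head = quals[:n]
    return sum(head) / len(head) >= 30


if __name__ == "__main__":
    read = [35, 34, 33, 30, 28, 20, 12, 2, 2, 2]  # made-up example qualities
    print(f"expected error rate: {expected_error_rate(read):.3f}%")
    print("passes C33:", passes_c33(read))
    print("passes A30:", passes_a30(read))
```

Under this relation a base with Q = 30 has an error probability of 0.1%, so a read made up entirely of Q30 bases would have an expected error rate of 0.1%, the same order of magnitude as the expected rates in the B-tail rows of the table.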