Table 6 Comparison of DeDup with the SAMtools rmdup method

From: EAGER: efficient ancient genome reconstruction

| Percentage | Method | Var calls | cov(fold) | cov(%) | refCall / Δ |
|---|---|---|---|---|---|
| 1 | NoRMDup | 1 | 1.16 | 1.02 | 33,277 |
| 1 | DeDup | 1 | 1.16 | 1.01 | −207 |
| 1 | rmdup | 1 | 1.16 | 0.98 | −1,362 |
| 2 | NoRMDup | 11 | 2.33 | 10.17 | 332,395 |
| 2 | DeDup | 11 | 2.33 | 10.14 | −1,051 |
| 2 | rmdup | 11 | 2.32 | 9.85 | −10,563 |
| 4 | NoRMDup | 55 | 4.7 | 49.82 | 1,628,172 |
| 4 | DeDup | 55 | 4.69 | 49.73 | −2,978 |
| 4 | rmdup | 55 | 4.64 | 49.10 | −23,481 |
| 5 | NoRMDup | 80 | 5.89 | 66.85 | 2,184,874 |
| 5 | DeDup | 80 | 5.88 | 66.77 | −2,889 |
| 5 | rmdup | 78 | 5.8 | 66.19 | −21,761 |
| 6 | NoRMDup | 91 | 7.06 | 78.85 | 2,576,795 |
| 6 | DeDup | 91 | 7.05 | 78.78 | −2,219 |
| 6 | rmdup | 89 | 6.94 | 78.31 | −17,500 |
| 7 | NoRMDup | 102 | 8.26 | 86.68 | 2,832,796 |
| 7 | DeDup | 102 | 8.24 | 86.62 | −1,931 |
| 7 | rmdup | 101 | 8.09 | 86.29 | −12,650 |
| 70 | NoRMDup | 114 | 82.58 | 98.39 | 3,215,440 |
| 70 | DeDup | 114 | 80.84 | 98.39 | 0 |
| 70 | rmdup | 114 | 68.87 | 98.39 | −52 |
| 80 | NoRMDup | 114 | 94.38 | 98.4 | 3,215,840 |
| 80 | DeDup | 114 | 92.11 | 98.4 | −2 |
| 80 | rmdup | 114 | 76.89 | 98.4 | −54 |
| 90 | NoRMDup | 114 | 106.23 | 98.42 | 3,216,400 |
| 90 | DeDup | 114 | 103.36 | 98.42 | 0 |
| 90 | rmdup | 114 | 84.62 | 98.42 | −30 |
| 100 | NoRMDup | 114 | 118.03 | 98.43 | 3,216,748 |
| 100 | DeDup | 114 | 114.51 | 98.43 | −1 |
| 100 | rmdup | 114 | 92.02 | 98.43 | −30 |
The first column gives the percentage of reads drawn at random from the Jorgen625 leprosy data set, which has a genome size of 3,268,202 base pairs. Var calls is the number of variant positions that were called. cov(fold) and cov(%) give the mean fold coverage and the percentage of the genome covered. For NoRMDup rows, the last column gives refCall, the number of reference calls that were made. For DeDup and rmdup rows, it gives Δ, the difference in reference calls between the non-deduplicated sample and the duplicate-removed sample at the same subsampling percentage. All other positions of the genome were filtered out. A position was called confidently if it had at least fivefold coverage, a variant quality of at least 30, and a minimum allele frequency of 90%. NoRMDup denotes a sample to which no duplicate removal was applied.
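
The calling thresholds and the Δ column described above can be illustrated with a minimal Python sketch. It is not the genotyping code used in the study. The names `Position`, `is_confident`, and `relative_change_percent` are hypothetical, and the sign convention (deduplicated minus non-deduplicated) is an assumption inferred from the negative values in the table.

```python
from dataclasses import dataclass

# Thresholds taken from the table footnote.
MIN_COVERAGE = 5        # at least fivefold coverage
MIN_QUALITY = 30        # variant quality of at least 30
MIN_ALLELE_FREQ = 0.90  # minimum allele frequency of 90%


@dataclass
class Position:
    """Hypothetical per-position summary; field names are illustrative."""
    coverage: int
    quality: float
    allele_freq: float


def is_confident(pos: Position) -> bool:
    """Return True if the position passes all three thresholds."""
    return (
        pos.coverage >= MIN_COVERAGE
        and pos.quality >= MIN_QUALITY
        and pos.allele_freq >= MIN_ALLELE_FREQ
    )


def relative_change_percent(baseline_ref_calls: int, delta: int) -> float:
    """Express a Δ value as a percentage of the NoRMDup reference calls."""
    return 100.0 * delta / baseline_ref_calls


# Rows for the 4% subsample: NoRMDup refCall and the two Δ values.
baseline = 1_628_172
dedup_change = relative_change_percent(baseline, -2_978)
rmdup_change = relative_change_percent(baseline, -23_481)
print(f"DeDup: {dedup_change:.2f}%  rmdup: {rmdup_change:.2f}%")
```

Expressed this way, the 4% subsample loses about 0.18% of its reference calls with DeDup and about 1.44% with rmdup.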