
Table 3 Assembly results after correcting reads from 4 different real datasets with different error correction protocols. 1st column: genome assembly program; Canu was used for raw reads, while HiCanu was used for corrected reads (because their error profiles resemble those of PacBio HiFi reads). 2nd column: error correction protocol. Indels/100 kbp: average number of insertion or deletion errors per 100,000 aligned bases. Mismatches/100 kbp: average number of mismatch errors per 100,000 aligned bases. Genome fraction (GF) reflects how much of each of the strain-specific genomes is covered by the corrected reads. NGA50 is the length of the longest contig such that the alignments of that and all longer contigs span at least 50% of the reference sequence. N/100 kbp denotes the average number of uncalled bases (Ns) per 100,000 bases in the read. MC: fraction of misassembled contigs.

From: Hybrid-hybrid correction of errors in long reads with HERO

| Assembler | Error correction protocol | GF (%) | Indels/100 kbp | Mismatches/100 kbp | NGA50 | N/100 kbp | MC (%) |
|---|---|---|---|---|---|---|---|
| Bmock12 ONT | | | | | | | |
| Canu | ONT raw reads | 57.13 | 504.95 | 140.75 | 78498 | 0.00 | 9.04 |
| HiCanu | Ratatosk | 61.56 | 2.99 | 83.12 | 137201 | 0.22 | 4.04 |
| HiCanu | R-HERO | 63.92 | 1.27 | 34.30 | 198551 | 0.07 | 2.70 |
| HiCanu | FMLRC | 60.64 | 1.10 | 8.76 | 136705 | 0.00 | 4.48 |
| HiCanu | F-HERO | 60.84 | 0.82 | 6.90 | 148570 | 0.03 | 4.23 |
| HiCanu | LoRDEC | 56.48 | 1.43 | 9.92 | 90070 | 0.00 | 4.28 |
| HiCanu | L-HERO | 60.50 | 1.14 | 16.48 | 125982 | 0.03 | 4.74 |
| Bmock12 PacBio | | | | | | | |
| Canu | PacBio raw reads | 50.01 | 174.71 | 63.79 | 6025 | 0.00 | 3.72 |
| HiCanu | Ratatosk | 59.25 | 3.88 | 66.22 | 31170 | 0.46 | 2.61 |
| HiCanu | R-HERO | 61.48 | 1.85 | 38.50 | 42481 | 0.06 | 4.60 |
| HiCanu | FMLRC | 62.54 | 2.02 | 19.50 | 65939 | 0.00 | 2.40 |
| HiCanu | F-HERO | 62.83 | 1.77 | 19.28 | 61786 | 0.03 | 2.45 |
| HiCanu | LoRDEC | 55.47 | 1.67 | 23.59 | 23307 | 0.00 | 2.96 |
| HiCanu | L-HERO | 60.27 | 1.72 | 35.59 | 44240 | 0.01 | 6.51 |
| NWC ONT | | | | | | | |
| Canu | ONT raw reads | - | - | - | - | - | - |
| HiCanu | Ratatosk | 75.74 | 88.44 | 173.91 | 47297 | 5.39 | 42.06 |
| HiCanu | R-HERO | 78.72 | 48.09 | 179.33 | 52662 | 3.05 | 53.51 |
| HiCanu | FMLRC | 84.95 | 66.68 | 180.80 | 61515 | 0.00 | 50.25 |
| HiCanu | F-HERO | 89.12 | 37.92 | 160.21 | 84587 | 0.00 | 50.48 |
| HiCanu | LoRDEC | 36.71 | 42.98 | 185.76 | - | 0.00 | 21.54 |
| HiCanu | L-HERO | 61.66 | 27.54 | 141.29 | 37960 | 0.00 | 31.11 |
| NWC PacBio | | | | | | | |
| Canu | PacBio raw reads | 44.27 | 62.25 | 33.51 | - | 0.00 | 36.51 |
| HiCanu | Ratatosk | 49.66 | 18.72 | 59.16 | 28052 | 2.15 | 6.32 |
| HiCanu | R-HERO | 53.38 | 11.10 | 57.32 | 29365 | 1.97 | 9.52 |
| HiCanu | FMLRC | 51.28 | 20.84 | 32.52 | 29528 | 0.00 | 6.12 |
| HiCanu | F-HERO | 53.39 | 15.98 | 44.49 | 33130 | 0.00 | 11.10 |
| HiCanu | LoRDEC | 50.34 | 8.27 | 56.65 | 26773 | 0.00 | 5.81 |
| HiCanu | L-HERO | 51.04 | 6.43 | 54.91 | 27546 | 0.00 | 5.98 |

  1. Boldface is meant to indicate the best performing tool in the respective category
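
The per-100-kbp rates and the NGA50 statistic defined in the caption can be illustrated with a minimal Python sketch. This is not the evaluation code used for the table: the function names `per_100_kbp` and `nga50` are hypothetical, and the sketch assumes the lengths of the aligned contig blocks and the reference length are already known from an alignment step.

```python
from typing import Optional, Sequence


def per_100_kbp(event_count: int, aligned_bases: int) -> float:
    """Average number of events (e.g. indels or mismatches) per 100,000 aligned bases."""
    if aligned_bases <= 0:
        raise ValueError("aligned_bases must be positive")
    return 100_000 * event_count / aligned_bases


def nga50(aligned_block_lengths: Sequence[int], reference_length: int) -> Optional[int]:
    """Length of the aligned block at which that block and all longer ones
    together first span at least 50% of the reference.

    Returns None if all blocks together span less than half of the reference.
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    threshold = reference_length / 2
    covered = 0
    # Walk from the longest block to the shortest, accumulating aligned length.
    for length in sorted(aligned_block_lengths, reverse=True):
        covered += length
        if covered >= threshold:
            return length
    return None


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the table.
    print(per_100_kbp(12, 400_000))            # 3.0
    print(nga50([50, 40, 30, 20, 10], 200))    # 30
    print(nga50([10, 20], 100))                # None
```

In the second call, the blocks of length 50, 40 and 30 together span 120 bases, which is the first point at which at least half of the 200-base reference is covered, so the statistic is 30.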