Table 3 Benchmarking results for ONT data

From: phasebook: haplotype-aware de novo assembly of diploid genomes from long reads

| Dataset | Assembler | Size (Mb) | HC (%) | k-mer recovery, All (%) | k-mer recovery, Mat (%) | k-mer recovery, Pat (%) | NG50 (bp) | NGA50 (bp) | Phased N50 (bp) | QV | Switch error (%) | N (%) | Dup (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MHC (ONT 25x) | phasebook | 14.5 | 97.0 | 98.4 | 94.0 | 87.7 | 152,599 | 152,599 | 102,954 | 36.9 | 0.78 | 0.0 | 1.82 |
| | phasebook-hi | 14.8 | 91.6 | 97.5 | 89.4 | 88.8 | 796,423 | 746,361 | 176,895 | 40.6 | 6.99 | 0.0 | 1.96 |
| | Canu | 5.9 | 59.6 | 95.4 | 83.2 | 65.9 | 1,419,330 | 93,217 | 100,918 | 33.4 | 6.44 | 0.0 | 0.84 |
| | Falcon | 6.3 | 67.8 | 94.3 | 82.2 | 68.8 | 2,489,921 | 442,832 | 220,087 | 27.6 | 4.65 | 0.0 | 1.10 |
| | Flye | 5.0 | 54.5 | 95.0 | 81.4 | 62.6 | 792,993 | 86,893 | 106,195 | 39.7 | 6.22 | 0.0 | 0.82 |
| | Shasta | 5.0 | 63.0 | 86.8 | 62.5 | 48.5 | 957,250 | 136,119 | 147,280 | 23.7 | 10.71 | 0.0 | 0.89 |
| | Wtdbg2 | 5.0 | 76.5 | 89.1 | 67.0 | 47.4 | 4,821,843 | 208,838 | 136,335 | 24.3 | 4.33 | 0.0 | 0.99 |
| | HapCut2 | 9.1 | 55.9 | 84.5 | 55.6 | 52.9 | 493,723 | 279,218 | 367,400 | 31.5 | 6.33 | 8.9 | 1.53 |
| | WhatsHap | 9.2 | 56.9 | 84.4 | 55.3 | 53.0 | 493,723 | 279,218 | 324,450 | 31.5 | 5.47 | 8.9 | 1.51 |
| NA19240 (Chr6) (ONT 26x) | phasebook | 380.8 | 84.4 | 87.5 | 53.9 | 55.1 | 90,219 | 85,595 | 51,291 | 20.6 | 36.96 | 0.0 | 1.87 |
| | phasebook-hi | 335.5 | 79.4 | 83.7 | 46.3 | 45.9 | 127,968 | 122,097 | 63,798 | 22.3 | 40.12 | 0.0 | 1.76 |
| | Canu | 169.2 | 65.4 | 83.9 | 40.7 | 40.8 | – | – | 129,586 | 22.5 | 35.40 | 0.0 | 0.98 |
| | Falcon | 179.3 | 64.0 | 77.7 | 33.7 | 35.4 | 69,660 | 37,993 | 162,461 | 20.5 | 35.72 | 0.0 | 1.02 |
| | Flye | 167.4 | 57.3 | 89.4 | 40.1 | 48.3 | – | – | 115,043 | 25.1 | 31.85 | 0.0 | 0.99 |
| | Shasta | 164.9 | 62.7 | 76.8 | 33.2 | 34.2 | – | – | 190,758 | 20.2 | 39.98 | 0.0 | 0.99 |
| | Wtdbg2 | 166.7 | 60.6 | 78.3 | 33.2 | 33.9 | – | – | 160,729 | 20.4 | 35.13 | 0.0 | 0.98 |
| | HapCut2 | 341.3 | 56.2 | 96.3 | 71.2 | 71.2 | 27,552,232 | 3,285,944 | 105,889 | 23.0 | 29.39 | 0.4 | 1.68 |
| | WhatsHap | 341.3 | 56.4 | 96.2 | 71.1 | 71.3 | 27,552,232 | 3,520,811 | 102,096 | 23.0 | 28.45 | 0.4 | 1.70 |
| HG002 (ONT 38x) | phasebook | 5691 | – | 92.0 | 62.2 | 63.2 | 390,659 | – | 202,723 | 26.6 | 2.28 | 0.0 | – |
| | Canu | 2901 | – | 84.5 | 39.3 | 51.6 | – | – | 266,624 | 23.2 | 10.27 | 0.0 | – |
| | Flye | 2928 | – | 83.9 | 38.5 | 50.6 | – | – | 287,619 | 23.0 | 11.94 | 0.0 | – |
| | Shasta | 2805 | – | 88.8 | 40.9 | 53.2 | – | – | 276,802 | 25.0 | 12.59 | 0.0 | – |
| | Wtdbg2 | 2794 | – | 83.0 | 34.4 | 44.2 | – | – | 239,707 | 22.8 | 9.62 | 0.0 | – |

  1. The publicly released HG002 assemblies (Canu, Flye, Shasta, Wtdbg2) were used directly for comparison. HapCut2 and WhatsHap could not be run on the HG002 data on a machine with 48 cores and 1 TB of RAM, probably because they ran out of memory.