Table 1. Assembly quality comparison of the evaluated basecallers across species. Assembly accuracy is measured by genome fraction (Genome fraction (%)) and average identity (Average identity (%)). Genome fraction is the percentage of the reference genome that aligns to a given assembly, while average identity is the average identity of each assembly when compared with its respective reference genome. Contiguity and completeness of the assemblies are measured by the overall assembly length (Assembly length), the average GC content (Average GC (%)), i.e., the percentage of G and C bases in an assembly, the NG50 statistic (NG50), i.e., the length of the shortest contig among the longest contigs that together cover half of the reference genome length, the total number of indels in all aligned bases of the assembly (Total indels), the ratio of indels to assembly length (Indel ratio (%)), and the reliability of base pairs expressed as a quality value (Quality value (QV)). NA indicates that the generated assembly could not be aligned to the reference genome.

From: RUBICON: a framework for designing efficient deep learning-based genomic basecallers

| Dataset | Basecaller | Genome fraction (%) | Average identity (%) | Assembly length | Average GC (%) | NG50 | Total indels | Indel ratio (%) | Quality value (QV) |
|---|---|---|---|---|---|---|---|---|---|
| Acinetobacter pittii 16-377-0801 | Causalcall | 92.45 | 86.18 | 3,826,077 | 42.23 | 3,826,077 | 270,228 | 7.06 | 11.99 |
| | Bonito_CRF-fast | 96.64 | 89.29 | 3,628,317 | 38.82 | 3,628,317 | 242,373 | 6.68 | 12.03 |
| | Bonito_CTC | 96.87 | 91.44 | 3,676,821 | 38.9 | 3,676,821 | 210,496 | 5.72 | 12.45 |
| | SACall | 96.68 | 89.42 | 3,699,232 | 38.7 | 3,699,232 | 247,997 | 6.7 | 12.1 |
| | Dorado-fast | 96.37 | 88.72 | 3,839,847 | 39.09 | 3,839,847 | 245,016 | 6.38 | 12.03 |
| | RUBICALL | 96.87 | 91.51 | 3,694,086 | 38.82 | 3,694,086 | 208,748 | 5.65 | 15.42 |
| | Reference | 100 | 100 | 3,814,719 | 38.78 | 3,814,719 | 0 | 0 | - |
| Haemophilus haemolyticus M1C132_1 | Causalcall | 0.00 | 0.00 | 0 | 0 | 0 | 0 | 0 | NA |
| | Bonito_CRF-fast | 88.76 | 91.51 | 2,046,024 | 37.98 | 2,046,024 | 128,481 | 6.28 | 12.25 |
| | Bonito_CTC | 96.87 | 90.70 | 1,957,480 | 38.87 | 1,957,480 | 118,253 | 6.04 | 15.34 |
| | SACall | 90.11 | 88.45 | 2,032,994 | 38.22 | 1,880,730 | 134,702 | 6.63 | 13.15 |
| | Dorado-fast | 89.42 | 88.97 | 2,110,860 | 39.49 | 2,110,860 | 129,503 | 6.14 | 12.38 |
| | RUBICALL | 96.87 | 90.54 | 1,966,781 | 38.92 | 1,966,781 | 119,777 | 6.09 | 15.37 |
| | Reference | 100 | 100 | 2,042,591 | 38.46 | 2,042,591 | 0 | 0 | - |
| Klebsiella pneumoniae INF032 | Causalcall | 92.45 | 87.35 | 4,959,127 | 56.9 | 4,959,127 | 353,550 | 7.13 | 10.54 |
| | Bonito_CRF-fast | 92.69 | 87.53 | 4,761,297 | 57.19 | 4,761,297 | 347,299 | 7.29 | 10.56 |
| | Bonito_CTC | 94.50 | 90.20 | 4,897,352 | 56.65 | 4,897,352 | 317,428 | 6.48 | 11.26 |
| | SACall | 93.97 | 88.08 | 4,874,880 | 56.87 | 4,874,880 | 379,028 | 7.78 | 10.8 |
| | Dorado-fast | 93.00 | 87.69 | 5,063,562 | 56.8 | 5,063,562 | 348,572 | 6.88 | 10.64 |
| | RUBICALL | 94.51 | 90.30 | 4,924,240 | 56.85 | 4,924,240 | 314,651 | 6.39 | 11.27 |
| | Reference | 100 | 100 | 5,111,537 | 57.63 | 5,111,537 | 0 | 0 | - |
| Klebsiella pneumoniae INF042 | Causalcall | 91.44 | 87.36 | 5,288,166 | 56.94 | 5,288,166 | 374,162 | 7.08 | 10.84 |
| | Bonito_CRF-fast | 92.08 | 88.49 | 5,052,889 | 56.8 | 5,052,889 | 357,354 | 7.07 | 10.93 |
| | Bonito_CTC | 93.12 | 90.49 | 5,111,083 | 56.61 | 5,111,083 | 317,075 | 6.2 | 11.40 |
| | SACall | 92.93 | 88.60 | 5,149,039 | 56.72 | 5,149,039 | 369,388 | 7.17 | 11.08 |
| | Dorado-fast | 90.21 | 88.20 | 5,737,059 | 56.44 | 5,401,717 | 342,141 | 5.96 | 10.98 |
| | RUBICALL | 93.12 | 90.60 | 5,146,050 | 56.72 | 5,146,050 | 312,448 | 6.07 | 11.42 |
| | Reference | 100 | 100 | 5,337,491 | 57.41 | 5,337,491 | 0 | 0 | - |
| Klebsiella pneumoniae KSB2_1B | Causalcall | 91.58 | 86.97 | 5,175,311 | 57.09 | 5,175,311 | 363,807 | 7.03 | 10.88 |
| | Bonito_CRF-fast | 90.24 | 88.00 | 4,932,626 | 56.71 | 4,932,626 | 357,769 | 7.25 | 10.86 |
| | Bonito_CTC | 93.07 | 90.11 | 5,003,377 | 56.69 | 5,003,377 | 320,519 | 6.41 | 11.41 |
| | SACall | 93.58 | 88.19 | 5,034,408 | 56.79 | 5,034,408 | 372,380 | 7.4 | 11.16 |
| | Dorado-fast | 90.28 | 87.67 | 5,442,186 | 56.72 | 5,261,731 | 349,387 | 6.42 | 11.03 |
| | RUBICALL | 93.07 | 89.89 | 5,023,639 | 56.75 | 4,932,626 | 357,769 | 7.12 | 11.25 |
| | Reference | 100 | 100 | 5,228,889 | 57.59 | 5,228,889 | 0 | 0 | - |
| Klebsiella pneumoniae NUH29 | Causalcall | 89.08 | 86.01 | 5,158,874 | 56.78 | 5,158,874 | 389,676 | 7.55 | 11.75 |
| | Bonito_CRF-fast | 92.17 | 89.34 | 4,942,833 | 57.01 | 4,942,833 | 355,690 | 7.2 | 11.47 |
| | Bonito_CTC | 94.36 | 90.26 | 4,918,147 | 57.04 | 4,918,147 | 324,406 | 6.6 | 11.92 |
| | SACall | 93.66 | 88.58 | 4,978,307 | 57.06 | 4,978,307 | 360,950 | 7.25 | 11.56 |
| | Dorado-fast | 92.27 | 88.12 | 5,195,594 | 57.01 | 5,195,594 | 355,728 | 6.85 | 11.56 |
| | RUBICALL | 94.36 | 90.43 | 4,940,813 | 57.18 | 4,940,813 | 316,019 | 6.4 | 11.83 |
| | Reference | 100 | 100 | 5,134,281 | 57.61 | 5,134,281 | 0 | 0 | - |
| Serratia marcescens 17-147-1671 | Causalcall | 89.91 | 86.23 | 5,532,953 | 57.86 | 5,422,052 | 401,545 | 7.26 | 13.39 |
| | Bonito_CRF-fast | 96.06 | 89.56 | 5,479,812 | 58.85 | 5,282,474 | 345,351 | 6.3 | 12.66 |
| | Bonito_CTC | 96.76 | 91.38 | 5,534,329 | 58.41 | 5,316,651 | 298,982 | 5.4 | 13 |
| | SACall | 94.29 | 89.36 | 5,366,913 | 58.57 | 5,366,913 | 358,954 | 6.69 | 12.27 |
| | Dorado-fast | 96.51 | 88.87 | 5,758,989 | 58.29 | 5,282,474 | 348,968 | 6.06 | 12.5 |
| | RUBICALL | 96.76 | 91.59 | 5,597,251 | 58.52 | 5,346,640 | 294,643 | 5.26 | 13.01 |
| | Reference | 100 | 100 | 5,517,578 | 59.13 | 5,517,578 | 0 | 0 | - |
| Staphylococcus aureus CAS38_02 | Causalcall | 94.35 | 87.29 | 2,849,123 | 36.59 | 2,810,038 | 191,730 | 6.73 | 10.8 |
| | Bonito_CRF-fast | 96.27 | 91.49 | 2,790,895 | 33.05 | 2,752,169 | 149,623 | 5.36 | 11.59 |
| | Bonito_CTC | 97.03 | 93.57 | 2,858,986 | 32.86 | 2,819,356 | 123,542 | 4.32 | 12.82 |
| | SACall | 95.66 | 91.25 | 2,837,503 | 32.91 | 2,798,079 | 165,200 | 5.82 | 11.57 |
| | Dorado-fast | 96.70 | 91.16 | 2,927,882 | 33.52 | 2,752,169 | 152,216 | 5.2 | 11.64 |
| | RUBICALL | 97.03 | 93.36 | 2,860,885 | 33.24 | 2,821,276 | 124,795 | 4.36 | 12.59 |
| | Reference | 100 | 100 | 2,902,076 | 32.82 | 2,902,076 | 0 | 0 | - |
| Stenotrophomonas maltophilia 17_G_0092_Kos | Causalcall | 94.85 | 85.73 | 4,823,177 | 63.66 | 4,823,177 | 366,228 | 7.59 | 11.01 |
| | Bonito_CRF-fast | 94.60 | 89.74 | 4,596,898 | 65.5 | 4,596,898 | 337,040 | 7.33 | 11.10 |
| | Bonito_CTC | 95.42 | 90.14 | 4,664,226 | 64.82 | 4,664,226 | 298,711 | 6.4 | 11.51 |
| | SACall | 95.28 | 88.50 | 4,672,540 | 64.98 | 4,672,540 | 339,853 | 7.27 | 11.11 |
| | Dorado-fast | 92.99 | 87.70 | 4,854,007 | 63.99 | 4,854,007 | 337,105 | 6.94 | 11.01 |
| | RUBICALL | 95.46 | 90.49 | 4,693,744 | 65.03 | 4,693,744 | 289,073 | 6.16 | 11.63 |
| | Reference | 100 | 100 | 4,802,733 | 66.28 | 4,802,733 | 0 | 0 | - |
| Human HG002 | Causalcall | NA | NA | 130,962 | 42.95 | 13,522 | NA | NA | NA |
| | Bonito_CRF-fast | 0.002 | 92.36 | 119,570,537 | 40.34 | 368,848 | 2,860 | 0 | 18.87 |
| | Bonito_CTC | 0.430 | 95.06 | 134,732,516 | 40.86 | 371,590 | 384,243 | 0.29 | 18.58 |
| | SACall | NA | NA | 63,025,520 | 39.87 | 320,873 | NA | NA | NA |
| | Dorado-fast | 0.001 | 93.15 | 121,146,376 | 39.8 | 361,677 | 926 | 0 | 17.46 |
| | RUBICALL | 0.125 | 94.50 | 140,928,248 | 40.99 | 393,950 | 100,256 | 0.1 | 17.81 |
| | Reference | 100 | 100 | 2,947,743,500 | 40.79 | 2,947,743,500 | 0 | 0 | - |
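
The caption defines average GC content, NG50, and indel ratio only in words. The following is a minimal Python sketch of those three definitions, not the evaluation pipeline used for the table. The function names `gc_content`, `ng50`, and `indel_ratio` are hypothetical, and the inputs (contig sequences as strings, contig lengths as integers) are assumptions made for illustration.

```python
from typing import Iterable, Sequence


def gc_content(sequences: Iterable[str]) -> float:
    """Percentage of G and C bases across all contigs (case-insensitive)."""
    total = 0
    gc = 0
    for seq in sequences:
        upper = seq.upper()
        total += len(upper)
        gc += upper.count("G") + upper.count("C")
    return 100.0 * gc / total if total else 0.0


def ng50(contig_lengths: Sequence[int], reference_length: int) -> int:
    """Length of the shortest contig among the longest contigs that
    together reach half of the reference genome length.

    Returns 0 when the contigs sum to less than half of the reference;
    that fallback is a choice of this sketch.
    """
    half = reference_length / 2
    covered = 0
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return 0


def indel_ratio(total_indels: int, assembly_length: int) -> float:
    """Indels as a percentage of the assembly length."""
    return 100.0 * total_indels / assembly_length if assembly_length else 0.0


# Causalcall row for Acinetobacter pittii: 270,228 indels, length 3,826,077.
print(round(indel_ratio(270_228, 3_826_077), 2))  # 7.06
```

The printed value matches the Indel ratio (%) entry of that row. For an assembly made of a single contig, `ng50` returns the contig length, which is consistent with the rows where NG50 equals the assembly length.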