
Table 4 Results of the error correction experiment

From: GraphAligner: rapid and versatile sequence-to-graph alignment

| Dataset | Method | # Reads | Bases (Mbp) | Aligned reads (%) | Aligned bases (%) | N50 (bp) | Genome fraction (%) | Error rate (%) | CPU time (hh:mm:ss) | Peak memory (GB) |
|---|---|---|---|---|---|---|---|---|---|---|
| E. coli (PacBio) | Original | 85460 | 748.0 | 97.0 | 92.0 | 13990 | 100 | 13.1237 | - | - |
| | LoRDEC | 85316 | 716.5 | 97.9 | 92.9 | 13484 | 100 | 1.3902 | 10:11:28 | 5.0 |
| | LoRDEC-clip | 129754 | 654.5 | 99.9 | 99.8 | 8206 | 100 | 0.0881 | 10:11:28 | 5.0 |
| | FMRLC | 85260 | 706.5 | 97.7 | 94.8 | 13364 | 100 | 0.3016 | 4:16:43 | 2.6 |
| | GraphAligner | 85271 | 710.7 | 97.7 | 93.9 | 13411 | 100 | 0.5057 | 0:23:08 | 5.8 |
| | GraphAligner-clip | 91909 | 673.9 | 99.9 | 99.8 | 12146 | 100 | 0.0240 | 0:23:08 | 5.8 |
| Fruit fly (ONT) | Original | 642255 | 4609.5 | 84.4 | 82.5 | 11956 | 98.77 | 16.1650 | - | - |
| | FMRLC | 641956 | 4646.9 | 89.6 | 85.1 | 12087 | 98.62 | 2.3250 | 65:17:52 | 9.2 |
| | GraphAligner | 640548 | 4653.7 | 90.7 | 85.6 | 12109 | 98.63 | 1.2433 | 15:12:30 | 11.9 |
| | GraphAligner-clip | 762073 | 4188.3 | 99.3 | 94.7 | 8698 | 97.86 | 0.7087 | 15:12:30 | 11.9 |
| HG00733 (PacBio) | Original | 2394990 | 48801.0 | 95.6 | 92.8 | 33109 | 95.27 | 13.5384 | - | - |
| | FMRLC | 2392533 | 48229.9 | 98.3 | 92.7 | 32823 | 95.19 | 7.1210 | 2222:13:44 | 234.5 |
| | GraphAligner | 2390656 | 48216.2 | 98.1 | 94.6 | 32879 | 94.89 | 3.3510 | 174:54:13 | 76.7 |
| | GraphAligner-clip | 8252956 | 42292.0 | 99.8 | 98.3 | 7973 | 91.91 | 1.3503 | 174:54:13 | 76.7 |

  1. Reads shorter than 500 base pairs were discarded. The remaining reads were aligned to the reference using minimap2 [13], and the statistics were taken from samtools stats [46], except N50, which was calculated with a script from Zhang et al. [40], and resource use, which was measured with “/usr/bin/time -v”.
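
Two of the reported quantities are easy to recompute by hand, and the following minimal Python sketch illustrates how. It is not the script from Zhang et al. [40] and not part of the authors' pipeline: the function names `n50`, `cpu_seconds`, and `speedup` are hypothetical, and the sketch assumes the common definition of N50 (the length of the read at which the cumulative length of reads, sorted longest first, reaches half of the total) and that the 500 bp cutoff keeps reads of exactly 500 bp.

```python
def n50(lengths, min_length=500):
    """N50 of read lengths after dropping reads shorter than min_length.

    Assumes the common definition: sort lengths in descending order and
    return the length at which the running sum reaches half of the total.
    """
    kept = sorted((n for n in lengths if n >= min_length), reverse=True)
    if not kept:
        return 0
    half = sum(kept) / 2
    running = 0
    for n in kept:
        running += n
        if running >= half:
            return n


def cpu_seconds(hms):
    """Convert an 'hh:mm:ss' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speedup(baseline, candidate):
    """Ratio of two CPU times given as 'hh:mm:ss' strings."""
    return cpu_seconds(baseline) / cpu_seconds(candidate)


if __name__ == "__main__":
    # Toy read lengths; the two reads under 500 bp are discarded first.
    print(n50([100, 200, 600, 800, 1000, 1200]))
    # HG00733 CPU times from the table: FMRLC versus GraphAligner.
    print(round(speedup("2222:13:44", "174:54:13"), 1))
```

Applied to the HG00733 rows, the CPU time ratio between FMRLC and GraphAligner comes out at roughly 12.7; the same helper can be used for the other datasets.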