
Table 4 ALLPATHS, Velvet and EULER assemblies of five microbial genomes: correctness (of chunks approximately 10 kb or less)

From: ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads

 

                                  ALLPATHS    Velvet     EULER

Class I: error rate 0
   S. aureus                         99.3%     70.7%     51.7%
   E. coli                           99.8%     68.7%     42.1%
   R. sphaeroides                    99.7%     71.8%     18.9%
   S. pombe                          79.7%     66.2%     31.4%
   N. crassa                         78.6%     49.9%     19.1%

Class II: error rate <0.1%
   S. aureus                          0.7%     13.7%     26.4%
   E. coli                            0.2%     18.0%     24.1%
   R. sphaeroides                     0.0%     19.3%     39.2%
   S. pombe                          18.6%     26.6%     32.6%
   N. crassa                         15.3%     24.3%     24.1%

Class III: error rate <1%
   S. aureus                          0.0%      9.1%     13.7%
   E. coli                            0.0%      6.4%     26.8%
   R. sphaeroides                     0.0%      5.9%     37.0%
   S. pombe                           1.3%      3.7%     28.7%
   N. crassa                          3.2%     11.4%     32.3%

Class IV: error rate <10%
   S. aureus                          0.0%      6.2%      5.9%
   E. coli                            0.0%      5.9%      5.3%
   R. sphaeroides                     0.2%      2.0%      3.2%
   S. pombe                           0.3%      2.6%      5.1%
   N. crassa                          1.3%      7.9%     12.8%

Class V: error rate ≥10%
   S. aureus                          0.0%      0.4%      2.3%
   E. coli                            0.0%      1.0%      1.7%
   R. sphaeroides                     0.1%      1.0%      1.5%
   S. pombe                           0.2%      0.8%      2.1%
   N. crassa                          0.9%      5.7%     10.4%

Class VI: error rate, no match
   S. aureus                          0.0%      0.0%      0.0%
   E. coli                            0.0%      0.0%      0.0%
   R. sphaeroides                     0.0%      0.1%      0.2%
   S. pombe                           0.0%      0.0%      0.0%
   N. crassa                          0.7%*     0.9%*     1.4%*

For five genomes, statistics from ALLPATHS, Velvet, and EULER assemblies are shown. Correctness: contigs were divided into approximately 10 kb chunks, leaving smaller contigs intact. Subject to the caveat that the reference sequences might have some errors (likely greater for the fungi), we assayed the absolute accuracy of each chunk by finding the minimum number of errors (substitution or indel bases) among all alignments of it to the reference sequence for the genome. The table shows the distribution of the bases in the chunks according to their accuracy. Chunks having no 100-mer match are separately classified (class VI).

*For N. crassa, some AT-rich regions that are missing from the reference [23] appear as novel sequence in the assemblies.
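The binning described in the note can be illustrated with a minimal Python sketch. It is not the authors' code. The class boundaries are taken from the table's row headings; two things are assumptions made only for illustration: that a chunk's error rate is its minimum error count divided by its length, and that a chunk with no 100-mer match is represented by `None`. The names `classify_chunk` and `base_distribution` are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

CLASSES = ["I", "II", "III", "IV", "V", "VI"]


def classify_chunk(length: int, min_errors: Optional[int]) -> str:
    """Assign a chunk to an accuracy class using the table's thresholds.

    min_errors is the smallest error count over all alignments of the chunk
    to the reference; None stands for "no 100-mer match" (an assumed encoding).
    Error rate is assumed to be min_errors / length. Integer arithmetic is
    used so that the boundary cases are exact.
    """
    if min_errors is None:
        return "VI"
    if min_errors == 0:
        return "I"
    if min_errors * 1000 < length:   # error rate < 0.1%
        return "II"
    if min_errors * 100 < length:    # error rate < 1%
        return "III"
    if min_errors * 10 < length:     # error rate < 10%
        return "IV"
    return "V"                       # error rate >= 10%


def base_distribution(
    chunks: Iterable[Tuple[int, Optional[int]]]
) -> Dict[str, float]:
    """Return the percentage of chunk bases that falls in each class."""
    totals = dict.fromkeys(CLASSES, 0)
    for length, min_errors in chunks:
        totals[classify_chunk(length, min_errors)] += length
    total_bases = sum(totals.values())
    if total_bases == 0:
        return {name: 0.0 for name in CLASSES}
    return {name: 100.0 * n / total_bases for name, n in totals.items()}


# Made-up chunks as (length in bases, minimum errors); not data from the paper.
example_chunks = [(10000, 0), (10000, 5), (10000, 50), (5000, 600), (5000, None)]
distribution = base_distribution(example_chunks)
```

Because the percentages are weighted by chunk length rather than by chunk count, one long error-free chunk outweighs several short erroneous contigs, which matches the note's statement that the table reports the distribution of bases.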