Table 5 ALLPATHS, Velvet and EULER assemblies of five microbial genomes: base accuracy, misassemblies, and long-range validity

From: ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads

 

| | ALLPATHS | Velvet | EULER |
| --- | --- | --- | --- |
| Base accuracy (quality, from class I to III chunks) | | | |
| S. aureus | Q59 | Q33 | Q32 |
| E. coli | Q67 | Q34 | Q30 |
| R. sphaeroides | >Q60 | Q35 | Q29 |
| S. pombe | Q42 | Q37 | Q29 |
| N. crassa | Q39 | Q32 | Q28 |
| Misassemblies (% in class IV to V chunks) | | | |
| S. aureus | 0.0% | 6.6% | 8.2% |
| E. coli | 0.0% | 6.9% | 7.0% |
| R. sphaeroides | 0.3% | 3.0% | 4.7% |
| S. pombe | 0.5% | 3.4% | 7.2% |
| N. crassa | 2.2% | 13.6% | 23.2% |
| Long-range validity (at 100-kb distance) | | | |
| S. aureus | 100.0% | 45.6% | 99.8% |
| E. coli | 100.0% | 86.4% | N/A |
| R. sphaeroides | 100.0% | 75.8% | N/A |
| S. pombe | 99.8% | 80.2% | 100.0% |
| N. crassa | 99.8% | 13.6% | N/A |

Statistics from ALLPATHS, Velvet, and EULER assemblies are shown for five genomes. Base accuracy: the inferred quality score for the bases in chunk classes I to III, obtained by dividing the number of errors in these chunks by the total number of bases, and then taking -10 × log10 of this quantity. Misassemblies: the total fraction of bases in chunks of classes IV to V. Long-range validity: we chose 10,000 pairs of 100-base sequences at random from the scaffolds, with the two sequences of each pair separated by 100,000 bases, and considered only pairs in which both sequences could be placed uniquely on the reference. We scored a placement as 'valid' if both sequences were placed on the same chromosome, in the same orientation, and with a separation within 25% of 100,000 bases. We report the fraction of valid placements. N/A: all scaffolds were shorter than 100 kb.
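
The base accuracy and long-range validity definitions in the note can be sketched in a few lines of Python. This is only an illustration, not the authors' evaluation code. The names `quality_score`, `Placement`, `is_valid_pair`, and `fraction_valid` are hypothetical. Measuring separation between start coordinates and modeling "same orientation" as identical strand labels are assumptions. How the sequences are sampled and aligned to the reference is not covered.

```python
import math
from typing import Iterable, NamedTuple, Tuple


def quality_score(errors: int, total_bases: int) -> float:
    """Return Q = -10 * log10(errors / total_bases), as in the table note."""
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    if errors == 0:
        # No observed errors: the score is unbounded (an assumption about
        # how to represent this case; the source does not specify it).
        return math.inf
    return -10.0 * math.log10(errors / total_bases)


class Placement(NamedTuple):
    """Hypothetical record of a unique placement on the reference."""
    chromosome: str
    position: int   # start coordinate on the reference (assumed convention)
    strand: str     # "+" or "-"


def is_valid_pair(a: Placement, b: Placement,
                  expected: int = 100_000, tolerance: float = 0.25) -> bool:
    """Valid if same chromosome, same orientation, separation within 25%."""
    if a.chromosome != b.chromosome or a.strand != b.strand:
        return False
    separation = abs(b.position - a.position)
    return abs(separation - expected) <= tolerance * expected


def fraction_valid(pairs: Iterable[Tuple[Placement, Placement]]) -> float:
    """Fraction of uniquely placed pairs that are scored as valid."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no uniquely placed pairs to score")
    return sum(is_valid_pair(a, b) for a, b in pairs) / len(pairs)
```

Under this formula, one error per 1,000 bases corresponds to Q30. A pair placed on the same chromosome and strand is accepted when its separation lies between 75,000 and 125,000 bases.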