Table 5 ALLPATHS, Velvet and EULER assemblies of five microbial genomes: base accuracy, misassemblies, and long-range validity

From: ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads

                                          ALLPATHS   Velvet   EULER
Base accuracy
   Quality, from class I to III chunks
S. aureus                                 Q59        Q33      Q32
E. coli                                   Q67        Q34      Q30
R. sphaeroides                            >Q60       Q35      Q29
S. pombe                                  Q42        Q37      Q29
N. crassa                                 Q39        Q32      Q28
Misassemblies
   % in class IV to V chunks
S. aureus                                 0.0%       6.6%     8.2%
E. coli                                   0.0%       6.9%     7.0%
R. sphaeroides                            0.3%       3.0%     4.7%
S. pombe                                  0.5%       3.4%     7.2%
N. crassa                                 2.2%       13.6%    23.2%
Long-range validity
   At 100-kb distance
S. aureus                                 100.0%     45.6%    99.8%
E. coli                                   100.0%     86.4%    N/A
R. sphaeroides                            100.0%     75.8%    N/A
S. pombe                                  99.8%      80.2%    100.0%
N. crassa                                 99.8%      13.6%    N/A
For each of five genomes, statistics are shown for the ALLPATHS, Velvet, and EULER assemblies, in that order. Base accuracy: the inferred quality score for the bases in chunk classes I to III, obtained by dividing the number of errors in these chunks by the total number of bases and then taking -10 × log10 of the result. Misassemblies: the total fraction of bases that lie in chunks of classes IV to V. Long-range validity: we chose 10,000 pairs of 100-base sequences at random from the scaffolds, with the two sequences of each pair separated by 100,000 bases, and considered only pairs in which both sequences could be placed uniquely on the reference. We scored a placement as 'valid' if both sequences were placed on the same chromosome, in the same orientation, with a separation of 100,000 bases ± 25%. We report the fraction of valid placements. N/A: all scaffolds were shorter than 100 kb.
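As an illustration of the two computed statistics in the footnote, the following is a minimal Python sketch, not the authors' code. The function names, the placement fields (chromosome, strand, position), and the choice to measure separation between start coordinates are all assumptions made for this example; the footnote does not specify how orientation or separation were measured.

```python
import math


def inferred_quality(errors: int, total_bases: int) -> float:
    """Quality score as -10 * log10(errors / total_bases).

    Returns infinity when no errors are observed (an assumption of this
    sketch; the table reports such a case as a lower bound, e.g. '>Q60').
    """
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    if errors == 0:
        return math.inf
    return -10.0 * math.log10(errors / total_bases)


def is_valid_placement(chrom_a, strand_a, pos_a,
                       chrom_b, strand_b, pos_b,
                       expected=100_000, tolerance=0.25):
    """Score one pair of uniquely placed sequences.

    Assumed interpretation: 'same orientation' means both sequences map to
    the same strand, and separation is the distance between start positions.
    """
    if chrom_a != chrom_b or strand_a != strand_b:
        return False
    separation = abs(pos_b - pos_a)
    return abs(separation - expected) <= tolerance * expected


def long_range_validity(results):
    """Fraction of valid placements among the scored pairs."""
    results = list(results)
    if not results:
        raise ValueError("no uniquely placed pairs to score")
    return sum(1 for ok in results if ok) / len(results)
```

Under these assumptions, one error per 1,000 bases corresponds to roughly Q30, and a pair placed 130,000 bases apart on the same chromosome would be scored as invalid because it falls outside the 75,000 to 125,000 base window.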