Table 1 ALLPATHS, Velvet and EULER assemblies of five microbial genomes: source data

                          S. aureus          E. coli            R. sphaeroides     S. pombe  N. crassa
Source data
   Strain                 USA300             K12 MG1655         2.4.1              972 h     74A
   Reference sequence     Finished, curated  Finished, curated  Finished, curated  Finished  Finished
   GC composition (%)     33                 51                 69                 36        49
   Genome size (kb)       2,873              4,639              4,603              12,554    39,226
   Reference N50 (kb)     2,873              4,639              3,188              4,509     665
   Sequence coverage (x)  89                 139                370                148       123
  1. For five genomes, statistics from ALLPATHS, Velvet, and EULER assemblies are shown. In all cases identical data were supplied to all three programs. Contigs shorter than 1,000 bases were discarded. Strain: the strain that was sequenced. GC composition: percentage of GC base pairs in the finished reference sequence. Genome size: total number of bases in the reference. Reference N50: N50 size of the reference. Reference sequence: all were finished, and in addition the reference sequences for the three bacterial genomes were curated to ensure full concordance with the samples, as described in the main text. S. aureus, circular chromosome and two plasmids; E. coli, circular chromosome; R. sphaeroides, finished but differs at 374 base positions from our sample, two circular chromosomes, four circular plasmids, and one linear plasmid; S. pombe, three linear chromosomes; N. crassa, finished at the base level, 251 contigs from 7 chromosomes. Sequence coverage: total coverage by usable reads; for a precise definition see Table S3 in Additional data file 1.
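The N50 statistic and the 1,000-base contig cutoff mentioned in the note can be illustrated with a minimal Python sketch. It assumes the common definition of N50 (the largest length L such that sequences of length at least L contain at least half of all bases); the source does not spell out its exact convention, and the function name `n50`, the parameter `min_length`, and the example lengths are hypothetical and not taken from the table.

```python
def n50(lengths, min_length=0):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: the largest length L such that sequences of
    length >= L together contain at least half of the total bases.
    Sequences shorter than min_length are discarded before computing.
    """
    kept = sorted((n for n in lengths if n >= min_length), reverse=True)
    if not kept:
        raise ValueError("no sequences at or above min_length")
    half = sum(kept) / 2
    running = 0
    for n in kept:
        running += n
        if running >= half:
            return n


# Hypothetical contig lengths in bases (illustrative only).
contigs = [500, 800, 2000, 3000, 4000, 6000]
print(n50(contigs, min_length=1000))
```

Under this definition, a reference consisting of a single sequence has an N50 equal to its length.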