
Table 1 ALLPATHS, Velvet and EULER assemblies of five microbial genomes: source data

From: ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads

 

                          S. aureus          E. coli            R. sphaeroides     S. pombe  N. crassa
Source data
   Strain                 USA300             K12 MG1655         2.4.1              972 h     74A
   Reference sequence     Finished, curated  Finished, curated  Finished, curated  Finished  Finished
   GC composition (%)     33                 51                 69                 36        49
   Genome size (kb)       2,873              4,639              4,603              12,554    39,226
   Reference N50 (kb)     2,873              4,639              3,188              4,509     665
   Sequence coverage (x)  89                 139                370                148       123

For five genomes, statistics from ALLPATHS, Velvet, and EULER assemblies are shown. In all cases identical data were supplied to all programs. Contigs of size <1,000 bases were discarded. Strain: the strain that was sequenced. GC composition: percentage of GC base pairs in the finished reference sequence. Genome size: total number of bases in the reference. Reference N50: N50 size of the reference. Reference sequence: all were finished, and in addition the reference sequences for the three bacterial genomes were curated to ensure full concordance with the samples, as described in the main text. S. aureus: circular chromosome and two plasmids. E. coli: circular chromosome. R. sphaeroides: finished but differs at 374 base positions from our sample; two circular chromosomes, four circular plasmids, and one linear plasmid. S. pombe: three linear chromosomes. N. crassa: finished at the base level; 251 contigs from 7 chromosomes. Sequence coverage: total coverage by usable reads; for a precise definition see Table S3 in Additional data file 1.
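The N50 statistic and the 1,000-base contig cutoff mentioned in the legend can be illustrated with a minimal Python sketch. It assumes the common convention that N50 is the largest length L such that sequences of length at least L together contain at least half of the total bases; the legend itself does not spell out the definition. The function name `n50`, the parameter `min_len`, and the example lengths are hypothetical and are not taken from the table.

```python
def n50(lengths, min_len=0):
    """Return the N50 of a collection of sequence lengths (in bases).

    Assumed convention: the largest length L such that sequences of
    length >= L together hold at least half of the total bases.
    Sequences shorter than min_len are dropped before the calculation.
    """
    kept = sorted((n for n in lengths if n >= min_len), reverse=True)
    if not kept:
        raise ValueError("no sequences left after filtering")
    half = sum(kept) / 2
    running = 0
    for n in kept:
        running += n
        if running >= half:
            return n


# Hypothetical contig lengths in bases (illustrative only).
example = [5000, 3000, 2000, 800, 200]

print(n50(example))                # all contigs kept
print(n50(example, min_len=1000))  # contigs under 1,000 bases discarded
```

With these made-up lengths, discarding short contigs changes the total and therefore the N50, which is one reason the legend states the cutoff explicitly.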