- Poster presentation
- Open Access
A cost-effective and universal strategy for complete prokaryotic genome sequencing proposed by computer simulation
Genome Biology volume 12, Article number: P6 (2011)
Pyrosequencing techniques allow scientists to sequence prokaryotic genomes and obtain draft sequences within a few days. However, the resulting assemblies typically contain several hundred contigs. A multiplex PCR procedure is then needed to fill all of the gaps and to link the contigs into one full-length genome sequence [1–10]. The full-length prokaryotic genome sequence is the gold standard for comparative prokaryotic genome analysis. This study assessed pyrosequencing strategies by simulation on 100 prokaryotic genomes.
Our simulation shows the following: first, a single-end 454 Jr Titanium run combined with a paired-end 454 Jr Titanium run may assemble about 90% of the 100 genomes into <10 scaffolds and 95% of the 100 genomes into <150 contigs; second, the average contig N50 size is more than 331 kb (Table 1); third, the average single-base accuracy is >99.99% (Table 1); fourth, the average false gene duplication rate is <0.7% (Table 1); fifth, the average false gene loss rate is <0.4% (Table 1); sixth, the total size of long repeats (for both repeat length >300 bp and repeat length >700 bp) is significantly correlated with the number of contigs (Table 4); and, seventh, increasing the read length of a pyrosequencing run could improve the assembly quality significantly (Tables 1, 2 and 3).
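The contig N50 reported above is a standard summary of assembly contiguity: the length L such that contigs of length L or longer together cover at least half of the total assembled bases. The following is a minimal sketch of that calculation, not the authors' code; the function name `contig_n50` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def contig_n50(lengths):
    """Return the N50 (in bp) of a collection of contig lengths.

    N50 is the length of the contig at which the running total,
    taken over contigs sorted from longest to shortest, first
    reaches at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in bp), not data from the study.
toy_contigs = [400_000, 300_000, 150_000, 100_000, 50_000]
print(contig_n50(toy_contigs))  # prints 300000
```

In this toy case the two longest contigs (400 kb and 300 kb) together cover 700 kb of a 1 Mb assembly, so the N50 is 300 kb. An assembly broken into many short contigs by long repeats would yield a much smaller value.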
A single-end 454 Jr run combined with a paired-end 454 Jr run is a good strategy for prokaryotic genome sequencing. This strategy can produce a high-quality draft genome sequence of almost any prokaryotic organism, selected at random, within days. It could be the first step towards the full-length genome sequence. It also makes the subsequent multiplex PCR procedure (for gap filling) much easier, because the order and orientation of most of the contigs are already known. As a result, large-scale full-length prokaryotic genome-sequencing projects could be finished within weeks.
Arnold IC, Zigova Z, Holden M, Lawley TD, Rad R, Dougan G, Falkow S, Bentley SD, Müller A: Comparative whole genome sequence analysis of the carcinogenic bacterial model pathogen Helicobacter felis. Genome Biol Evol 2011, 3:302–308.
Stephan R, Lehner A, Tischler P, Rattei T: Complete genome sequence of Cronobacter turicensis LMG 23827, a food-borne pathogen causing deaths in neonates. J Bacteriol 2011, 193:309–310.
Wibberg D, Blom J, Jaenicke S, Kollin F, Rupp O, Scharf B, Schneiker-Bekel S, Sczcepanowski R, Goesmann A, Setubal JC, Schmitt R, Pühler A, Schlüter A: Complete genome sequencing of Agrobacterium sp. H13–3, the former Rhizobium lupini H13–3, reveals a tripartite genome consisting of a circular and a linear chromosome and an accessory plasmid but lacking a tumor-inducing Ti-plasmid. J Biotechnol 2011, 155:50–62.
Song JY, Jeong H, Yu DS, Fischbach MA, Park HS, Kim JJ, Seo JS, Jensen SE, Oh TK, Lee KJ, Kim JF: Draft genome sequence of Streptomyces clavuligerus NRRL 3585, a producer of diverse secondary metabolites. J Bacteriol 2010, 192:6317–6318.
Gao F, Wang Y, Liu YJ, Wu XM, Lv X, Gan YR, Song SD, Huang H: Genome sequence of Acinetobacter baumannii MDR-TJ. J Bacteriol 2011, 193:2365–2366.
Powney R, Smits THM, Sawbridge T, Frey B, Blom J, Frey JE, Plummer KM, Beer SV, Luck J, Duffy B, Rodoni B: Genome sequence of an Erwinia amylovora strain with pathogenicity restricted to Rubus plants. J Bacteriol 2011, 193:785–786.
Nam SH, Choi SH, Kang A, Kim DW, Kim RN, Kim A, Kim DS, Park HS: Genome sequence of Lactobacillus farciminis KCTC 3681. J Bacteriol 2011, 193:1790–1791.
Chen C, Kittichotirat W, Chen W, Downey JS, Si Y, Bumgarner R: Genome sequence of naturally competent Aggregatibacter actinomycetemcomitans serotype a strain D7S-1. J Bacteriol 2010, 192:2643–2644.
Seth-Smith HMB, Harris SR, Rance R, West AP, Severin JA, Ossewaarde JM, Cutcliffe LT, Skilton RJ, Marsh P, Parkhill J, Clarke IN, Thomson NR: Genome sequence of the zoonotic pathogen Chlamydophila psittaci. J Bacteriol 2011, 193:1282–1283.
Lyons E, Freeling M, Kustu S, Inwood W: Using genomic sequencing for classical genetics in E. coli K12. PLoS ONE 2011, 6:e16717.