Volume 12 Supplement 1

Beyond the Genome 2011

A cost-effective and universal strategy for complete prokaryotic genome sequencing proposed by computer simulation

Background

Pyrosequencing allows prokaryotic genomes to be sequenced to draft quality within a few days. However, the resulting assemblies typically contain several hundred contigs, and a multiplex PCR procedure is then needed to fill the gaps and link the contigs into one full-length genome sequence [1-10]. The full-length genome sequence is the gold standard for comparative prokaryotic genome analysis. This study assessed pyrosequencing strategies by simulation on 100 prokaryotic genomes.

Results

Our simulation shows the following: first, a single-end 454 Jr Titanium run combined with a paired-end 454 Jr Titanium run may assemble about 90% of the 100 genomes into <10 scaffolds and 95% of the 100 genomes into <150 contigs; second, the average contig N50 size is more than 331 kb (Table 1); third, the average single-base accuracy is >99.99% (Table 1); fourth, the average false gene duplication rate is <0.7% (Table 1); fifth, the average false gene loss rate is <0.4% (Table 1); sixth, the total size of long repeats (both for repeats longer than 300 bp and for repeats longer than 700 bp) is significantly correlated with the number of contigs (Table 4); and, seventh, increasing the read length of a pyrosequencing run could significantly improve the assembly quality (Tables 1, 2 and 3).
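The contig N50 reported above is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The following is a minimal sketch of that standard definition; the function name `n50` and the example contig lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 (in bp) of a collection of contig lengths."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    # Walk the contigs from longest to shortest, accumulating length
    # until at least half of the total assembly length is covered.
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in bp, for illustration only.
example_contigs = [400_000, 300_000, 150_000, 100_000, 50_000]
example_n50 = n50(example_contigs)
print(example_n50)
```

A larger N50 for the same genome means that fewer, longer contigs cover half of the assembly, which is why it serves as a contiguity indicator alongside the contig and scaffold counts.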

Table 1 Main average indices for different sequencing strategies for 100 genomes (400-bp read length)
Table 2 Main average indices for different sequencing strategies for 100 genomes (100-bp read length)
Table 3 Main average indices for different sequencing strategies for 100 genomes (200-bp read length)
Table 4 Linear regression results for 100 genomes, between the genome quality indicators and, for various read lengths, the number of repeats in the genome, the total repeat length of the genome and the percentage of the total repeat length of the genome
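The relationship summarized in Table 4 can be illustrated with an ordinary least-squares fit of contig count against total repeat length. This is only a sketch under stated assumptions: the function `linear_fit` and the per-genome numbers are made up for illustration, and the abstract does not specify the exact regression procedure or software that was used.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Hypothetical values, NOT data from the study:
# total length of long repeats per genome (bp) and contigs per assembly.
repeat_total_bp = [10_000, 20_000, 30_000, 40_000]
contig_counts = [30, 50, 70, 90]

slope, intercept, r = linear_fit(repeat_total_bp, contig_counts)
print(slope, intercept, r)
```

In a real analysis, each of the 100 simulated genomes would contribute one point, and the fit would be repeated for each read length and repeat-length threshold.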

Conclusions

A single-end 454 Jr run combined with a paired-end 454 Jr run is a good strategy for prokaryotic genome sequencing. This strategy offers a way to produce a high-quality draft genome sequence of almost any prokaryotic organism, selected at random, within days. It could be the first step toward achieving the full-length genome sequence. It also makes the subsequent multiplex PCR procedure (for gap filling) much easier, because the order and orientation of most of the contigs are already known. As a result, large-scale full-length prokaryotic genome-sequencing projects could be finished within weeks.

References

  1. Arnold IC, Zigova Z, Holden M, Lawley TD, Rad R, Dougan G, Falkow S, Bentley SD, Müller A: Comparative whole genome sequence analysis of the carcinogenic bacterial model pathogen Helicobacter felis. Genome Biol Evol. 2011, 3: 302-308. 10.1093/gbe/evr022.

  2. Stephan R, Lehner A, Tischler P, Rattei T: Complete genome sequence of Cronobacter turicensis LMG 23827, a food-borne pathogen causing deaths in neonates. J Bacteriol. 2011, 193: 309-310. 10.1128/JB.01162-10.

  3. Wibberg D, Blom J, Jaenicke S, Kollin F, Rupp O, Scharf B, Schneiker-Bekel S, Sczcepanowski R, Goesmann A, Setubal JC, Schmitt R, Pühler A, Schlüter A: Complete genome sequencing of Agrobacterium sp. H13-3, the former Rhizobium lupini H13-3, reveals a tripartite genome consisting of a circular and a linear chromosome and an accessory plasmid but lacking a tumor-inducing Ti-plasmid. J Biotechnol. 2011, 155: 50-62. 10.1016/j.jbiotec.2011.01.010.

  4. Song JY, Jeong H, Yu DS, Fischbach MA, Park HS, Kim JJ, Seo JS, Jensen SE, Oh TK, Lee KJ, Kim JF: Draft genome sequence of Streptomyces clavuligerus NRRL 3585, a producer of diverse secondary metabolites. J Bacteriol. 2010, 192: 6317-6318. 10.1128/JB.00859-10.

  5. Gao F, Wang Y, Liu YJ, Wu XM, Lv X, Gan YR, Song SD, Huang H: Genome sequence of Acinetobacter baumannii MDR-TJ. J Bacteriol. 2011, 193: 2365-2366. 10.1128/JB.00226-11.

  6. Powney R, Smits THM, Sawbridge T, Frey B, Blom J, Frey JE, Plummer KM, Beer SV, Luck J, Duffy B, Rodoni B: Genome sequence of an Erwinia amylovora strain with pathogenicity restricted to Rubus plants. J Bacteriol. 2011, 193: 785-786. 10.1128/JB.01352-10.

  7. Nam SH, Choi SH, Kang A, Kim DW, Kim RN, Kim A, Kim DS, Park HS: Genome sequence of Lactobacillus farciminis KCTC 3681. J Bacteriol. 2011, 193: 1790-1791. 10.1128/JB.00003-11.

  8. Chen C, Kittichotirat W, Chen W, Downey JS, Si Y, Bumgarner R: Genome sequence of naturally competent Aggregatibacter actinomycetemcomitans serotype a strain D7S-1. J Bacteriol. 2010, 192: 2643-2644. 10.1128/JB.00157-10.

  9. Seth-Smith HMB, Harris SR, Rance R, West AP, Severin JA, Ossewaarde JM, Cutcliffe LT, Skilton RJ, Marsh P, Parkhill J, Clarke IN, Thomson NR: Genome sequence of the zoonotic pathogen Chlamydophila psittaci. J Bacteriol. 2011, 193: 1282-1283. 10.1128/JB.01435-10.

  10. Lyons E, Freeling M, Kustu S, Inwood W: Using genomic sequencing for classical genetics in E. coli K12. PLoS ONE. 2011, 6: e16717. 10.1371/journal.pone.0016717.

Author information

Correspondence to Jingwei Jiang.

Keywords

  • Titanium
  • Genome Sequence
  • Genome Analysis
  • Good Strategy
  • Sequencing Result