Volume 12 Supplement 1

Beyond the Genome 2011

  • Poster presentation
A cost-effective and universal strategy for complete prokaryotic genome sequencing proposed by computer simulation

Background

Pyrosequencing techniques allow scientists to sequence prokaryotic genomes and obtain draft sequences within a few days. However, the resulting draft assemblies typically contain several hundred contigs. A multiplex PCR procedure is then needed to fill the gaps and link the contigs into one full-length genome sequence [1–10]. The full-length prokaryotic genome sequence is the gold standard for comparative prokaryotic genome analysis. This study assessed pyrosequencing strategies by simulation on 100 prokaryotic genomes.
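The abstract does not describe the simulator that was used. As a minimal sketch of what simulating a single-end sequencing run involves in general, the Python below draws fixed-length, error-free reads uniformly from a reference sequence and reports the expected mean coverage. The function names `simulate_reads` and `mean_coverage`, the toy genome, and the simplifications (linear genome, no sequencing errors, no paired ends) are all assumptions made for illustration, not the authors' method.

```python
import random


def simulate_reads(genome, read_length, n_reads, seed=0):
    """Draw n_reads error-free reads of fixed length from a linear genome.

    Hypothetical helper: real simulators also model errors, variable
    read lengths, circular chromosomes and paired-end inserts.
    """
    if read_length > len(genome):
        raise ValueError("read_length must not exceed the genome length")
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        # Choose a start position so that the read fits inside the genome.
        start = rng.randrange(len(genome) - read_length + 1)
        reads.append(genome[start:start + read_length])
    return reads


def mean_coverage(genome_length, read_length, n_reads):
    """Expected average depth: total sequenced bases / genome length."""
    return n_reads * read_length / genome_length


# Toy 5 kb random "genome" (made up; real prokaryotic genomes are Mb-sized).
toy_rng = random.Random(42)
toy_genome = "".join(toy_rng.choice("ACGT") for _ in range(5000))

toy_reads = simulate_reads(toy_genome, read_length=400, n_reads=250)
toy_depth = mean_coverage(len(toy_genome), 400, len(toy_reads))
```

Changing `read_length` to 100 or 200 mirrors the read lengths compared in Tables 1 to 3, although the toy sketch says nothing about how assembly quality would respond.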

Results

Our simulation shows the following. First, a single-end 454 Jr Titanium run combined with a paired-end 454 Jr Titanium run can assemble about 90% of the 100 genomes into <10 scaffolds and 95% of the 100 genomes into <150 contigs. Second, the average contig N50 size is more than 331 kb (Table 1). Third, the average single-base accuracy is >99.99% (Table 1). Fourth, the average false gene duplication rate is <0.7% (Table 1). Fifth, the average false gene loss rate is <0.4% (Table 1). Sixth, the total size of long repeats (for both the >300 bp and the >700 bp repeat-length thresholds) is significantly correlated with the number of contigs (Table 4). Seventh, increasing the read length of a pyrosequencing run could improve the assembly quality significantly (Tables 1, 2 and 3).
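For readers unfamiliar with the contig N50 statistic reported here, the following Python sketch computes it under the usual definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Made-up toy assembly totalling 1 Mb.
toy_contigs = [400_000, 300_000, 150_000, 100_000, 50_000]
toy_n50 = n50(toy_contigs)  # 400 kb + 300 kb passes 500 kb, so N50 = 300 kb
```

A larger N50 means that more of the assembly sits in long contigs, which is why it serves as a contiguity indicator alongside the raw contig count.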

Table 1 Main average indices for different sequencing strategies for 100 genomes (400-bp read length)
Table 2 Main average indices for different sequencing strategies for 100 genomes (100-bp read length)
Table 3 Main average indices for different sequencing strategies for 100 genomes (200-bp read length)
Table 4 Linear regression results for 100 genomes, between the genome quality indicators and, for various read lengths, the number of repeats in the genome, the total repeat length of the genome and the percentage of the total repeat length of the genome
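The kind of analysis summarized in Table 4 can be sketched as an ordinary least-squares fit of one quality indicator (here, the number of contigs) against one repeat measure (here, total repeat length), together with the Pearson correlation coefficient. The abstract does not state which regression software or model details were used, so the function `linear_fit` and the input values below are assumptions for illustration only; the numbers are made up and deliberately exactly linear so the result can be checked by hand.

```python
from math import sqrt


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical values, NOT taken from the study:
# total long-repeat length per genome (kb) and contig count per assembly.
toy_repeat_kb = [10, 20, 30, 40]
toy_contigs = [25, 45, 65, 85]

slope, intercept, r = linear_fit(toy_repeat_kb, toy_contigs)
# For this toy data: slope 2 contigs per kb, intercept 5, r = 1
```

With real data the points would scatter around the line, and a significance test on the slope (not shown) would be needed to support a statement such as "significantly correlated".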

Conclusions

A single-end 454 Jr run combined with a paired-end 454 Jr run is a good strategy for prokaryotic genome sequencing. This strategy offers a way to produce a high-quality draft genome sequence of almost any prokaryotic organism, selected at random, within days. It could be the first step toward the full-length genome sequence. It also makes the subsequent multiplex PCR procedure (for gap filling) much easier, because the order and orientation of most of the contigs are already known. As a result, large-scale full-length prokaryotic genome-sequencing projects could be finished within weeks.

References

  1. Arnold IC, Zigova Z, Holden M, Lawley TD, Rad R, Dougan G, Falkow S, Bentley SD, Müller A: Comparative whole genome sequence analysis of the carcinogenic bacterial model pathogen Helicobacter felis. Genome Biol Evol. 2011, 3: 302-308. 10.1093/gbe/evr022.

  2. Stephan R, Lehner A, Tischler P, Rattei T: Complete genome sequence of Cronobacter turicensis LMG 23827, a food-borne pathogen causing deaths in neonates. J Bacteriol. 2011, 193: 309-310. 10.1128/JB.01162-10.

  3. Wibberg D, Blom J, Jaenicke S, Kollin F, Rupp O, Scharf B, Schneiker-Bekel S, Szczepanowski R, Goesmann A, Setubal JC, Schmitt R, Pühler A, Schlüter A: Complete genome sequencing of Agrobacterium sp. H13-3, the former Rhizobium lupini H13-3, reveals a tripartite genome consisting of a circular and a linear chromosome and an accessory plasmid but lacking a tumor-inducing Ti-plasmid. J Biotechnol. 2011, 155: 50-62. 10.1016/j.jbiotec.2011.01.010.

  4. Song JY, Jeong H, Yu DS, Fischbach MA, Park HS, Kim JJ, Seo JS, Jensen SE, Oh TK, Lee KJ, Kim JF: Draft genome sequence of Streptomyces clavuligerus NRRL 3585, a producer of diverse secondary metabolites. J Bacteriol. 2010, 192: 6317-6318. 10.1128/JB.00859-10.

  5. Gao F, Wang Y, Liu YJ, Wu XM, Lv X, Gan YR, Song SD, Huang H: Genome sequence of Acinetobacter baumannii MDR-TJ. J Bacteriol. 2011, 193: 2365-2366. 10.1128/JB.00226-11.

  6. Powney R, Smits THM, Sawbridge T, Frey B, Blom J, Frey JE, Plummer KM, Beer SV, Luck J, Duffy B, Rodoni B: Genome sequence of an Erwinia amylovora strain with pathogenicity restricted to Rubus plants. J Bacteriol. 2011, 193: 785-786. 10.1128/JB.01352-10.

  7. Nam SH, Choi SH, Kang A, Kim DW, Kim RN, Kim A, Kim DS, Park HS: Genome sequence of Lactobacillus farciminis KCTC 3681. J Bacteriol. 2011, 193: 1790-1791. 10.1128/JB.00003-11.

  8. Chen C, Kittichotirat W, Chen W, Downey JS, Si Y, Bumgarner R: Genome sequence of naturally competent Aggregatibacter actinomycetemcomitans serotype a strain D7S-1. J Bacteriol. 2010, 192: 2643-2644. 10.1128/JB.00157-10.

  9. Seth-Smith HMB, Harris SR, Rance R, West AP, Severin JA, Ossewaarde JM, Cutcliffe LT, Skilton RJ, Marsh P, Parkhill J, Clarke IN, Thomson NR: Genome sequence of the zoonotic pathogen Chlamydophila psittaci. J Bacteriol. 2011, 193: 1282-1283. 10.1128/JB.01435-10.

  10. Lyons E, Freeling M, Kustu S, Inwood W: Using genomic sequencing for classical genetics in E. coli K12. PLoS ONE. 2011, 6: e16717. 10.1371/journal.pone.0016717.

About this article

Cite this article

Jiang, J., Li, J. & Leung, F.C. A cost-effective and universal strategy for complete prokaryotic genome sequencing proposed by computer simulation. Genome Biol 12 (Suppl 1), P6 (2011). https://doi.org/10.1186/gb-2011-12-s1-p6

  • DOI: https://doi.org/10.1186/gb-2011-12-s1-p6
