Genes and genomes, an imperfect world: comparison of gene annotations of two Bos taurus draft assemblies

Background

Gene annotation is the first and most important step in analyzing a genome. As an increasing number of species are being sequenced and assembled to various degrees of completion, two key questions arise: how does the quality of the assembled sequence affect gene annotation, and what is the impact on scientists using the annotation?

Results

We compared the gene annotations produced with the GNOMON [1] annotation pipeline for two Bos taurus genome assemblies produced at the University of Maryland [2, 3]. We find that the changes to the assembly from one release to the next, which were quite substantial, led to significant differences in the gene sets and gene structures and, by implication, in the predicted proteins. The annotation was affected by the availability of new gene evidence and by seemingly rare genome mis-assembly events and local sequence variations. For instance, although the later assembly is generally superior, hundreds of protein-coding genes in the earlier assembly are missing from the annotation of the later genome, and ~15% (~3600) of the genes have complex structural differences between the two assemblies. In addition, 15-20% of the predicted proteins have relatively large sequence differences when compared to their RefSeq models.

Conclusions

These findings highlight the consequences of genome quality on gene annotation, and argue for continued improvements in a genome sequence until it is truly finished. They also demonstrate the challenges that confront a biologist when tracking a gene of interest between different assembly versions. In addition, these analyses have helped us identify specific loci for improvement, which will benefit the user community of the Bos taurus genome.

References

  1. The NCBI GNOMON eukaryotic gene prediction tool. [http://www.ncbi.nlm.nih.gov/genome/guide/gnomon.shtml]

  2. Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al: A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 2009, 10: R42. doi:10.1186/gb-2009-10-4-r42.

  3. Bos taurus assembly site at the University of Maryland. [http://www.cbcb.umd.edu/research/bos_taurus_assembly.shtml]

Author information

Correspondence to Liliana Florea.

About this article

Cite this article

Florea, L., Souvorov, A. & Salzberg, S.L. Genes and genomes, an imperfect world: comparison of gene annotations of two Bos taurus draft assemblies. Genome Biol 11, P13 (2010) doi:10.1186/gb-2010-11-s1-p13

Keywords

  • Protein Code Gene
  • Genome Assembly
  • Gene Annotation
  • Sequence Difference
  • Local Sequence