- Poster presentation
- Open Access
Genes and genomes, an imperfect world: comparison of gene annotations of two Bos taurus draft assemblies
Genome Biology volume 11, Article number: P13 (2010)
Gene annotation is the first and most important step in analyzing a genome. As an increasing number of species are being sequenced and assembled to various degrees of completion, two key questions are: how does the quality of the assembled sequence affect gene annotation? What is the impact on scientists using the annotation?
We compared the gene annotations produced with the GNOMON [1] annotation pipeline for two Bos taurus genome assemblies produced at the University of Maryland [2, 3]. We find that the changes to the assembly from one release to the next, which were quite substantial, made significant differences in the gene sets and gene structures, and by implication in the predicted proteins. The annotation was affected by the availability of new gene evidence and by seemingly rare genome mis-assembly events and local sequence variations. For instance, although the later assembly is generally superior, hundreds of protein-coding genes in the earlier assembly are missing from the annotation of the later genome, and ~15% (~3600) of the genes have complex structural differences between the two assemblies. In addition, 15-20% of the predicted proteins have relatively large sequence differences when compared to their RefSeq models.
These findings highlight the consequences of genome quality for gene annotation, and argue for continued improvements in a genome sequence until it is truly finished. They also demonstrate the challenges that confront a biologist when tracking a gene of interest between different assembly versions. In addition, these analyses have helped us identify specific loci for improvement, which will benefit the user community of the Bos taurus genome.
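The kind of bookkeeping involved in tracking genes between two annotation releases can be illustrated with a toy sketch. It is not the comparison procedure used in this work, which the abstract does not describe. It assumes, hypothetically, that each annotation is already available as a mapping from a shared gene identifier to a list of exon intervals, and it compares only exon counts and exon lengths, since raw coordinates are not comparable between assemblies. The function names and example data are invented for illustration, and the sketch ignores strand changes and sequence-level differences.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: exons as (start, end), 1-based inclusive,
# in the coordinates of the assembly the annotation belongs to.
Exons = List[Tuple[int, int]]


def exon_lengths(exons: Exons) -> List[int]:
    """Return exon lengths in genomic order (a coordinate-free view of structure)."""
    return [end - start + 1 for start, end in sorted(exons)]


def compare_annotations(
    old: Dict[str, Exons], new: Dict[str, Exons]
) -> Dict[str, List[str]]:
    """Classify genes by presence and exon structure across two annotations."""
    report: Dict[str, List[str]] = {
        "missing_in_new": [],
        "new_only": [],
        "same_structure": [],
        "different_structure": [],
    }
    for gene in sorted(old.keys() | new.keys()):
        if gene not in new:
            report["missing_in_new"].append(gene)
        elif gene not in old:
            report["new_only"].append(gene)
        elif exon_lengths(old[gene]) == exon_lengths(new[gene]):
            report["same_structure"].append(gene)
        else:
            report["different_structure"].append(gene)
    return report


# Invented example data, not taken from either Bos taurus assembly.
old_annotation = {
    "geneA": [(100, 199), (300, 399)],
    "geneB": [(1000, 1099)],
    "geneC": [(50, 149), (200, 260)],
}
new_annotation = {
    "geneA": [(5100, 5199), (5300, 5399)],  # shifted, same structure
    "geneC": [(50, 149), (200, 290)],       # second exon is longer
    "geneD": [(10, 20)],                    # present only in the new release
}

report = compare_annotations(old_annotation, new_annotation)
for category, genes in report.items():
    print(category, genes)
```

A real comparison would also need a way to establish which gene models correspond to each other across releases, which this sketch sidesteps by assuming stable identifiers.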
1. The NCBI GNOMON eukaryotic gene prediction tool. [http://www.ncbi.nlm.nih.gov/genome/guide/gnomon.shtml]
2. Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al: A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 2009, 10: R42. doi:10.1186/gb-2009-10-4-r42.
3. Bos taurus assembly site at the University of Maryland. [http://www.cbcb.umd.edu/research/bos_taurus_assembly.shtml]
Florea, L., Souvorov, A. & Salzberg, S.L. Genes and genomes, an imperfect world: comparison of gene annotations of two Bos taurus draft assemblies. Genome Biol 11, P13 (2010) doi:10.1186/gb-2010-11-s1-p13
- Protein Code Gene
- Genome Assembly
- Gene Annotation
- Sequence Difference
- Local Sequence