Variation in global codon usage bias among prokaryotic organisms is associated with their lifestyles
© Botzman and Margalit; licensee BioMed Central Ltd. 2011
Received: 13 April 2011
Accepted: 27 October 2011
Published: 27 October 2011
It is widely acknowledged that synonymous codons are used unevenly among genes in a genome. In organisms under translational selection, genes encoding highly expressed proteins are enriched with specific codons. This phenomenon, termed codon usage bias, is common to many organisms and has been recognized as influencing cellular fitness. This suggests that the global extent of codon usage bias of an organism might be associated with its phenotypic traits.
To test this hypothesis we used a simple measure for assessing the extent of codon bias of an organism, and applied it to hundreds of sequenced prokaryotes. Our analysis revealed a large variability in this measure: some organisms show very high degrees of codon usage bias, while others exhibit almost no differential use of synonymous codons among different genes. Remarkably, we found that the extent of codon usage bias corresponds to the lifestyle of the organism. In particular, organisms able to live in a wide range of habitats exhibit high extents of codon usage bias, consistent with their need to adapt efficiently to different environments. Pathogenic prokaryotes also demonstrate higher extents of codon usage bias than non-pathogenic prokaryotes, in accord with the multiple environments that many pathogens occupy. Our results show that the previously observed correlation between growth rate and metabolic variability can be attributed to their individual associations with codon usage bias.
Our results suggest that the extent of codon usage bias of an organism plays a role in the adaptation of prokaryotes to their environments.
The genetic code maps 64 nucleotide triplets (codons), built from four nucleotide types, onto 20 amino acids and the stop signals. This redundancy implies the use of synonymous codons: different codons encoding the same amino acid. Synonymous codons may differ in their frequency of occurrence among different genes within an organism, a phenomenon known as 'codon usage bias'. It was demonstrated that many bacteria and yeast undergo translational selection, with highly expressed genes preferentially using codons assumed to be translated faster and/or more accurately by the ribosome [2, 3]. Previous works suggested that these codons are the ones matching abundant tRNAs, which are organism-specific [3–7]. Other works demonstrated additional factors affecting the frequencies of the synonymous codons in an organism, such as the genome GC content [8, 9]. Thus, the preferred codons per amino acid may vary between different organisms, based on their tRNA repertoire and other factors.
While it was claimed that most prokaryotes undergo translational selection, it is conceivable that organisms differ in the extent of codon usage bias across their genes and in the forces determining it. For several organisms, such as Escherichia coli and Saccharomyces cerevisiae, a positive correlation between the codon bias of genes and their protein levels was demonstrated (for example, [11, 12]), suggesting that in those organisms translational selection is predominant. Other organisms, such as Helicobacter pylori, show almost no differential use of synonymous codons among different genes. While in the former organisms a mixture of mutation bias and natural selection underlies codon usage bias, in the latter organisms codon usage bias is mostly explained by mutation, with very weak, if any, translational selection. It was suggested by Andersson and Kurland and recently substantiated by Kudla et al. that selection towards highly adapted codons in highly expressed proteins has a global effect on the cell. According to this view, high expression of certain genes is mainly achieved by various mechanisms, such as regulation of transcription and/or translation initiation, and the use of well adapted codons in these genes guarantees efficient recycling of the ribosomes, resulting in an increase in cellular fitness. This might be reflected in a relatively enhanced translation of the whole proteome. It is thus intriguing to examine the association between the extent of codon usage bias of various organisms and their phenotypic traits, towards understanding the environmental conditions in which a high extent of codon usage bias is advantageous. Of note, we do not address here differences in the usage of synonymous codons that are uniform across all genes in a genome and probably arise from non-selective processes; we address only the differential usage of codons among genes within a genome.
We regard an organism as 'biased' or 'unbiased' with respect to its codon usage according to whether the distribution of synonymous codons in its highly expressed genes differs from that in other genes in the genome. Several measures were proposed to estimate the extent of codon usage bias at an organism scale, enabling the classification of an organism as biased or unbiased [14, 17–19]. Here we present such a measure based on the Codon Adaptation Index (CAI) of individual genes. The CAI of a gene ranges between 0 and 1, with higher values indicating the use of more preferred codons. The extent of preference of each codon is determined by its frequency among the codons in genes encoding ribosomal proteins, using the latter as a proxy for highly expressed genes. An organism-scale measure is obtained by computing CAIave, the average of the CAI values of all genes in a genome. Since, by definition, highly expressed genes are assigned high CAI values, low or high CAIave values indicate whether there is a difference in codon usage between highly expressed genes and the rest of the genes in a genome. A low CAIave of an organism implies that there are many genes with low CAI values, and therefore preferred codons are confined to a small group of highly expressed genes. Accordingly, low CAIave values are indicative of biased organisms. A high CAIave of an organism implies that the CAI values of most genes are similar to those of genes encoding the ribosomal proteins, and therefore there is no differential use of synonymous codons among the genes encoded in that organism. Such organisms are unbiased with regard to their codon usage. A second measure for the extent of codon usage bias of an organism is based on a different measure of gene codon usage bias, the Nc (the effective number of codons). This gene measure ranges between 20 and 61, with lower values indicating the use of fewer codon types per amino acid along a protein-coding gene, which most often are the more preferred codons.
To evaluate the extent of codon usage bias of an organism, a measure based on the difference between the average Nc values of the ribosomal genes and the average Nc values of the rest of the genes in the genome, Ncdiff, is computed. Organisms with high values of Ncdiff exhibit large extents of codon usage bias, and vice versa. The two genomic measures, CAIave and Ncdiff, are highly correlated (Figure S1 in Additional file 1; Pearson r = -0.91, P ≈ 0, n = 1,169).
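The CAI-based measure described above can be illustrated with a minimal Python sketch. It follows the usual formulation of the CAI (relative adaptiveness of each codon within its synonymous family in a reference set, then a geometric mean over the codons of a gene), with ribosomal protein genes as the reference set. The function names, the pseudocount of 0.5 for unobserved codons, and the exclusion of Met, Trp and stop codons are assumptions made for illustration and are not taken from the paper's own implementation.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Synonymous families with more than one codon (Met, Trp and stops excluded).
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    if aa != "*":
        SYNONYMS.setdefault(aa, []).append(codon)
SYNONYMS = {aa: cs for aa, cs in SYNONYMS.items() if len(cs) > 1}


def codons(seq):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes, pseudocount=0.5):
    """Weight of each codon: its count in the reference set divided by the
    count of the most frequent synonymous codon (pseudocount is an assumption)."""
    counts = Counter(c for gene in reference_genes for c in codons(gene))
    weights = {}
    for syn in SYNONYMS.values():
        adjusted = {c: counts[c] + pseudocount for c in syn}
        top = max(adjusted.values())
        for c in syn:
            weights[c] = adjusted[c] / top
    return weights


def cai(gene, weights):
    """Geometric mean of codon weights over the scored codons of one gene."""
    logs = [math.log(weights[c]) for c in codons(gene) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


def cai_average(genes, weights):
    """CAIave: the mean CAI over all genes of a genome."""
    values = [v for v in (cai(g, weights) for g in genes) if not math.isnan(v)]
    return sum(values) / len(values)
```

In this sketch, a genome whose genes mostly reuse the codons favoured in the ribosomal protein genes yields a CAIave close to 1 (unbiased in the paper's terminology), whereas a genome in which only a few genes share those codons yields a low CAIave (biased).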
Here we used these measures to characterize the extent of codon usage bias in 773 prokaryotic species representing 1,169 sequenced genomes. We demonstrate a wide range in the extent of codon usage bias among the various organisms, and trace the possible sources of this variation. Our results indicate that similarity in the extents of codon usage bias of different organisms reflects a similar ecological strategy they share.
Organisms differ in their extent of codon usage bias
We analyzed the extent of codon usage bias in 773 organisms: 699 bacteria and 74 archaea, representing 1,169 genomes (Materials and methods). Since the results for CAIave and Ncdiff are highly correlated, we present here the results for the first measure and in Additional file 1 the results for the second measure (Figures S2, S4, S5 and S6 in Additional file 1).
To substantiate this result we investigated in each genome the difference in the usage of each codon between highly expressed genes and the rest of the genes encoded in the genome. For each organism we calculated the frequency of each codon out of the total codons encoding the corresponding amino acid, once in the ribosomal genes and once in the rest of the genes. We then calculated the average difference between codon frequencies in the ribosomal genes and in the rest of the genes. Comparison of these average differences and CAIave values across genomes revealed a strong negative correlation (Pearson r = -0.8593, P = 2.2639E-152, n = 518). This result reinforces our view that organisms with low CAIave are under stronger translational selection, resulting in differences between the codons in the highly expressed genes and the codons in the rest of the genes, while in organisms with high CAIave there is almost no differential use of codons between highly and weakly expressed genes. The latter, unbiased organisms can be of two types: either the synonymous codons of both ribosomal and other genes are determined by the GC content of the genome, or most genes use the preferred codons (those that fit the abundant tRNAs or follow other rules that determine codon preference). To distinguish between these two types, we examined the association between CAIave and the average Nc of a genome in all 1,169 genomes. Low average Nc values imply that all genes in the genome use a specific set of codons, and high average Nc values represent genomes where all genes use a mixture of codons. Indeed, we find both types of trends among genomes with high CAIave (Figure S3a in Additional file 1). As to the biased genomes (with CAIave below the average of 0.59), most of them use relatively many codons across their genes (average Nc values above 40), but apparently they use specific codons in their highly expressed genes, represented by the ribosomal genes.
In general, it seems that the average values of Nc are strongly determined by the GC content of the genome (Figure S3b in Additional file 1). In particular, as we noted above, in genomes with extreme GC content, where the possibilities of codon variation are severely restricted by the specific nucleotide repertoire, the average Nc values are low.
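The per-codon comparison between ribosomal genes and the remaining genes, described above, can be sketched as follows. The text does not state whether the averaged difference is signed or absolute; because signed differences within one amino acid sum to zero, this sketch assumes the mean absolute difference. The function names are hypothetical, and single-codon amino acids and stop codons are left out as an additional assumption.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
SYNONYMS = {}
for triplet, aa in zip(product(BASES, repeat=3), AMINO):
    if aa != "*":
        SYNONYMS.setdefault(aa, []).append("".join(triplet))
SYNONYMS = {aa: cs for aa, cs in SYNONYMS.items() if len(cs) > 1}


def synonymous_fractions(genes):
    """Fraction of each codon among all codons encoding the same amino acid."""
    counts = Counter()
    for gene in genes:
        seq = gene.upper()
        counts.update(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    fractions = {}
    for syn in SYNONYMS.values():
        total = sum(counts[c] for c in syn)
        if total:  # skip amino acids that never occur in this gene set
            for c in syn:
                fractions[c] = counts[c] / total
    return fractions


def mean_abs_frequency_difference(ribosomal_genes, other_genes):
    """Average |f_ribosomal - f_other| over codons scored in both gene sets."""
    f_ribo = synonymous_fractions(ribosomal_genes)
    f_rest = synonymous_fractions(other_genes)
    shared = [c for c in f_ribo if c in f_rest]
    if not shared:
        raise ValueError("no amino acid is present in both gene sets")
    return sum(abs(f_ribo[c] - f_rest[c]) for c in shared) / len(shared)
```

Computing this value for each genome and correlating it with CAIave across genomes would mirror the comparison reported above; a value near zero corresponds to a genome with no differential codon use between the two gene sets.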
Organisms at the right tail of the CAIave distribution are the least biased. The prokaryote with the smallest extent of codon usage bias (highest CAIave value, 0.82) is Syntrophothermus lipocalidus, a thermophilic, syntrophic, fatty-acid-oxidizing anaerobe that belongs to the class Clostridia (Figure 2c). Other prokaryotes with almost as high CAIave values are quite diverse: Geobacter metallireducens (CAIave value of 0.79) is an anaerobic bacterium that uses iron oxides as the electron acceptor in the oxidation of organic compounds to carbon dioxide; Nitrosococcus watsonii (CAIave value of 0.78) is an aerobic marine bacterium; and Coxiella burnetii (CAIave value of 0.78) is a facultative, intracellular pathogenic bacterium that causes Q fever.
Biased organisms can be classified by their phenotypic traits
[Table: Classification of organisms by phenotypic traits. Column: number of prokaryotes. Table body not preserved.]
Phenotypic traits rather than phylogenetic relatedness underlie the similarities in codon usage bias between organisms
The empirical association between growth rate and metabolic variability is attributed to their individual associations with codon usage bias
[Table: Correlations and partial correlations between growth rate, habitat type and global codon usage bias. Rows: growth rate and habitat type; growth rate and codon usage bias; habitat type and codon usage bias. Values not preserved.]
It is widely acknowledged that synonymous codons are used unevenly among genes in a genome, with genes encoding highly expressed proteins being enriched with specific codons. It is still under debate whether the biased use of certain codons in highly expressed genes is one of the causes or the result of the high expression level. On the one hand, it was shown that the use of certain codons directly affects the speed of translation and its accuracy, and codon optimization is even used to elevate the levels of proteins expressed outside their original context [28, 29]. On the other hand, Kudla et al. showed that the variation in the levels of proteins translated from synthetic green fluorescent protein constructs, varying only at synonymous sites, was not correlated with the codon usage. They found that high expression was not associated with specific codons but with avoidance of secondary structure at the translation initiation site. This supports the proposition that selection for well adapted codons in highly expressed genes does not directly affect the level of individual proteins, but provides a global benefit to the cell, as it ensures efficient recycling of the ribosomes, which leads to an increase in cellular fitness. It should be noted that avoidance of secondary structure at the 5' end of the mRNA is only one mechanistic strategy that may lead to high levels of gene expression, and other mechanisms could underlie high levels of expression as well. These include high transcription levels from strong promoters, high stability of the mRNA and/or efficient translation initiation by optimal Shine-Dalgarno sequences. Thus, different highly expressed genes might use well adapted codons to improve cellular fitness, independent of the molecular mechanism underlying their high expression.
The premise that translational selection for preferred codons in highly expressed genes has a global effect motivated us to investigate its association with the phenotypic traits of a wide range of organisms. It should be noted that our conclusions are not affected by whether or not the suggested global effect is accompanied by local effects on the translation efficiency.
There have been various attempts to explain what makes certain codons preferred over others in highly expressed genes, from correspondence to abundant tRNAs to physical considerations regarding the optimal stability of codon-anticodon interaction (summarized in ). Here we have not dealt with these various possible types of preferred codons but regarded them as the codons used by highly expressed genes in a genome, based on the premise that selection favored the most adapted codons in highly expressed genes. Hence, our analysis was based on measures that compare the codon usage of all genes to the codon usage of highly expressed genes, represented by the set of ribosomal genes in each genome (CAIave and Ncdiff). Such a comparative measure should provide information on the selection forces that act on individual genes and on the whole genome. Indeed, we find a wide variation in these measures across genomes (Figure 2a; Figure S2 in Additional file 1), where some organisms are highly biased and others show only a very slight difference, if any, between the codon usage in ribosomal genes and other genes. It is possible that the lack of selection implied for some of the unbiased genomes actually reflects their small population size, but in the absence of a reliable measure of effective population size in bacteria, we are unable to assess this further. In biased genomes selection acts only on genes that are highly expressed, to ensure that the overall translation machinery operates efficiently. Thus, our analysis is in line with previous observations that there are organisms where translational selection is operational (biased genomes) and others where it is not (unbiased). In our study we used the set of ribosomal genes as a representative set of highly expressed genes. When gene expression data for many prokaryotes become available, it should be possible to extend our study by using sets of highly expressed genes based on their measured expression levels.
It would then be possible to divide the organisms into those with high and low variation in gene expression, and to examine how the level of variation in gene expression is reflected in CAIave values.
We found that pathogenic prokaryotes have statistically significantly lower CAIave than non-pathogenic prokaryotes, but with a substantial overlap between the histograms of those two groups. This result might be surprising in view of previous studies that found that pathogenic lifestyle is linked to relaxation of selection [32–34], which should be linked to reduced codon bias. It is possible to reconcile the discrepancy by dividing the pathogenic and the non-pathogenic prokaryotes into two subtypes: some are capable of living in multiple environments and some stay mainly host-associated. When we compared pathogenic and non-pathogenic groups of host-associated prokaryotes there was no statistically significant difference in their CAIave values (P = 0.06 by Mann-Whitney test), but when we compared pathogenic and non-pathogenic groups of prokaryotes living in multiple habitats, the pathogenic prokaryotes showed statistically significantly lower CAIave values (P = 5.624E-7 by Mann-Whitney test). Therefore, it seems that pathogenic host-associated microbes like C. burnetii, an intracellular pathogen with an extremely high CAIave value of 0.78, are not exposed to strong selection, while pathogens that are able to live in multiple habitats are under stronger selection on codon usage than non-pathogenic prokaryotes living in multiple habitats.
Previous studies discussed the correspondence between ecological preferences and codon adaptation [35, 36]. Our results suggest that organisms may adjust to metabolic variability by maintaining a high extent of codon usage bias (reflected by their low CAIave values). Previous studies analyzed the association between codon adaptation and growth rate and between growth rate and metabolic variability [14, 17, 24, 25]. One study showed that most bacterial organisms choose one of two alternative ecological strategies: living in multiple habitats with a large extent of co-habitation or living in a specialized niche in which the co-habitation is limited. It was shown that growth rate is statistically significantly correlated with the metabolic variability encountered by an organism, suggesting a universal principle by which metabolic flexibility is associated with a need to grow fast, perhaps because of the greater extent of competition. Independently, Rocha demonstrated an association between bacteria with large extents of codon usage bias and fast growth, and also an association with the number of tRNA genes [17, 24]. It was demonstrated that fast growing bacteria have more tRNA genes of fewer types, and suggested that translation in those organisms depends on fast tRNA diffusion to the ribosome. That study proposed that co-evolution of the tRNA pool and the codon usage bias allows more efficient translation of highly expressed genes, and that the codon usage bias in highly expressed genes relative to the rest of the genome is predicted to be under stronger selection in fast growing organisms. Our results tie these two findings together and suggest that translational selection towards the most adapted codons in highly expressed genes is operational in organisms that live in variable environments, enabling them to cope efficiently with the metabolic variability and the competition.
Codon usage bias in highly expressed genes was suggested to have a global effect on the cell, increasing cellular fitness. Here we perform the first large-scale study that examines the relationship between codon usage bias and the phenotypic traits of prokaryotic organisms. Our analysis revealed a large variation in the global extent of codon bias among prokaryotic genomes, which is associated with their lifestyles. In particular, we found that organisms living in multiple habitats, including facultative organisms, mesophiles and pathogenic bacteria, exhibit high extents of codon usage bias, consistent with their need to adapt efficiently to different environments.
Materials and methods
We retrieved 1,169 genome sequences from the NCBI Entrez Genome Project database. For species that had sequenced genomes for more than one subspecies, we kept one representative subspecies, chosen at random. This resulted in a data set of 773 prokaryotes. From this list, only prokaryotes with GC content larger than 35% and smaller than 65% were included in further analyses, resulting in a data set of 518 prokaryotes.
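The GC-content filter described above can be sketched in a few lines of Python. The function names and the dictionary layout (organism name mapped to genome sequence) are hypothetical; only the 35% and 65% thresholds, applied as strict inequalities, come from the text.

```python
def gc_content(sequence):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / acgt


def filter_by_gc(genomes, low=0.35, high=0.65):
    """Keep genomes whose GC content lies strictly between the two thresholds."""
    return {name: seq for name, seq in genomes.items()
            if low < gc_content(seq) < high}
```

For example, `filter_by_gc({"a": "ATGC", "b": "GGGGCCCCAT"})` keeps only `"a"`, since the second toy sequence has a GC content of 0.8.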
Computation of codon bias measures at an organism scale
Comparison between groups of organisms
The prokaryotes were classified according to their environmental characteristics (oxygen requirement, salinity, temperature range and habitat) and according to whether they are pathogenic, based on the documentation in the NCBI Entrez Genome Project database (detailed in Additional file 2). The number of organisms annotated with each property is detailed in Table 1. Comparisons between the value distributions of organism measures (CAIave or Ncdiff) were performed by Mann-Whitney or Kruskal-Wallis tests.
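As an illustration of the two-group comparison, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney test using midranks, a tie-corrected normal approximation and no continuity correction. These choices, the function name and any input values are assumptions made for illustration; the paper does not specify which software or test variant was used, and a statistics package would normally be preferred, particularly for small samples.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic of sample x, two-sided P-value).

    The P-value uses the normal approximation with a tie correction
    and without a continuity correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1  # pooled[i:j] is one group of tied values
        midrank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values tied: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2.0))
```

Here `x` and `y` would hold the CAIave values of the two groups being compared, for example pathogenic and non-pathogenic prokaryotes.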
Phylogenetic distances between organisms
The pairwise distances across the phylogenetic tree were based on the tree generated in  and computed as the path length between two organisms through the most recent common ancestor. We used 2,016 possible pairs of 64 bacteria and 78 possible pairs of 13 archaea in this analysis.
Correlations and partial correlations between growth rate, habitat type and codon usage bias
This analysis included 82 prokaryotes, for which we had information for both their growth rates and habitat types. The growth rates were obtained from . The prokaryotes' environmental annotations were obtained from the NCBI Entrez Genome Project database (multiple habitat-living organisms were annotated as 0, and specialized organisms were annotated as 1). The extents of codon usage bias were represented as CAIave values. We repeated the analysis using a subset of 24 organisms, which are all phylogenetically remote from each other, based on the phylogenetic tree used above .
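A first-order partial correlation between two of the variables, controlling for the third, can be sketched as follows. The sketch assumes Pearson correlation coefficients and the standard formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)); the text above does not state which correlation coefficient was used, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

In the setting described above, `x` could be growth rate, `y` the habitat annotation (0 or 1) and `z` the CAIave values. A correlation between growth rate and habitat type that shrinks towards zero once CAIave is controlled for would be consistent with the paper's claim that the two are linked through codon usage bias.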
CAI: Codon Adaptation Index.
This study was supported by the Israel Science Foundation administered by the Israeli Academy for Sciences and Humanities. We thank Ruth Hershberg for her critical comments on the manuscript. We are grateful to all our lab members for fruitful discussions.
1. Grantham R, Gautier C, Gouy M, Mercier R, Pave A: Codon catalog usage and the genome hypothesis. Nucleic Acids Res. 1980, 8: r49-r62.
2. Gouy M, Gautier C: Codon usage in bacteria: correlation with gene expressivity. Nucleic Acids Res. 1982, 10: 7055-7074. 10.1093/nar/10.22.7055.
3. Bennetzen JL, Hall BD: Codon selection in yeast. J Biol Chem. 1982, 257: 3026-3031.
4. Post LE, Nomura M: DNA sequences from the str operon of Escherichia coli. J Biol Chem. 1980, 255: 4660-4666.
5. Ikemura T: Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J Mol Biol. 1981, 151: 389-409. 10.1016/0022-2836(81)90003-6.
6. Ikemura T: Codon usage and tRNA content in unicellular and multicellular organisms. Mol Biol Evol. 1985, 2: 13-34.
7. Dong H, Nilsson L, Kurland CG: Co-variation of tRNA abundance and codon usage in Escherichia coli at different growth rates. J Mol Biol. 1996, 260: 649-663. 10.1006/jmbi.1996.0428.
8. Chen SL, Lee W, Hottes AK, Shapiro L, McAdams HH: Codon usage between genomes is constrained by genome-wide mutational processes. Proc Natl Acad Sci USA. 2004, 101: 3480-3485. 10.1073/pnas.0307827100.
9. Hershberg R, Petrov DA: General rules for optimal codon choice. PLoS Genet. 2009, 5: e1000556. 10.1371/journal.pgen.1000556.
10. Supek F, Skunca N, Repar J, Vlahovicek K, Smuc T: Translational selection is ubiquitous in prokaryotes. PLoS Genet. 2010, 6: e1001004. 10.1371/journal.pgen.1001004.
11. Lithwick G, Margalit H: Hierarchy of sequence-dependent features associated with prokaryotic translation. Genome Res. 2003, 13: 2665-2673. 10.1101/gr.1485203.
12. Ghaemmaghami S, Huh WK, Bower K, Howson RW, Belle A, Dephoure N, O'Shea EK, Weissman JS: Global analysis of protein expression in yeast. Nature. 2003, 425: 737-741. 10.1038/nature02046.
13. Lafay B, Atherton JC, Sharp PM: Absence of translationally selected synonymous codon usage bias in Helicobacter pylori. Microbiology. 2000, 146: 851-860.
14. Sharp PM, Bailes E, Grocock RJ, Peden JF, Sockett RE: Variation in the strength of selected codon usage bias among bacteria. Nucleic Acids Res. 2005, 33: 1141-1153. 10.1093/nar/gki242.
15. Andersson SG, Kurland CG: Codon preferences in free-living microorganisms. Microbiol Rev. 1990, 54: 198-210.
16. Kudla G, Murray AW, Tollervey D, Plotkin JB: Coding-sequence determinants of gene expression in Escherichia coli. Science. 2009, 324: 255-258. 10.1126/science.1170160.
17. Rocha EP: Codon usage bias from tRNA's point of view: redundancy, specialization, and efficient decoding for translation optimization. Genome Res. 2004, 14: 2279-2286. 10.1101/gr.2896904.
18. dos Reis M, Savva R, Wernisch L: Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res. 2004, 32: 5036-5044. 10.1093/nar/gkh834.
19. Suzuki H, Saito R, Tomita M: Measure of synonymous codon usage diversity among genes in bacteria. BMC Bioinformatics. 2009, 10: 167. 10.1186/1471-2105-10-167.
20. Sharp PM, Li WH: The codon Adaptation Index--a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 1987, 15: 1281-1295. 10.1093/nar/15.3.1281.
21. Wright F: The 'effective number of codons' used in a gene. Gene. 1990, 87: 23-29. 10.1016/0378-1119(90)90491-9.
22. Sekiguchi Y, Kamagata Y, Nakamura K, Ohashi A, Harada H: Syntrophothermus lipocalidus gen. nov., sp. nov., a novel thermophilic, syntrophic, fatty-acid-oxidizing anaerobe which utilizes isobutyrate. Int J Syst Evol Microbiol. 2000, 50: 771-779. 10.1099/00207713-50-2-771.
23. Lovley DR, Phillips EJ: Novel mode of microbial energy metabolism: organic carbon oxidation coupled to dissimilatory reduction of iron or manganese. Appl Environ Microbiol. 1988, 54: 1472-1480.
24. Vieira-Silva S, Rocha EP: The systemic imprint of growth and its uses in ecological (meta)genomics. PLoS Genet. 2010, 6: e1000808. 10.1371/journal.pgen.1000808.
25. Freilich S, Kreimer A, Borenstein E, Yosef N, Sharan R, Gophna U, Ruppin E: Metabolic-network-driven analysis of bacterial ecological strategies. Genome Biol. 2009, 10: R61. 10.1186/gb-2009-10-6-r61.
26. Tuller T, Carmi A, Vestsigian K, Navon S, Dorfan Y, Zaborske J, Pan T, Dahan O, Furman I, Pilpel Y: An evolutionarily conserved mechanism for controlling the efficiency of protein translation. Cell. 2010, 141: 344-354. 10.1016/j.cell.2010.03.031.
27. Akashi H: Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics. 1994, 136: 927-935.
28. Deng T: Bacterial expression and purification of biologically active mouse c-Fos proteins by selective codon optimization. FEBS Lett. 1997, 409: 269-272. 10.1016/S0014-5793(97)00522-X.
29. Gustafsson C, Govindarajan S, Minshull J: Codon bias and heterologous protein expression. Trends Biotechnol. 2004, 22: 346-353. 10.1016/j.tibtech.2004.04.006.
30. Gu W, Zhou T, Wilke CO: A universal trend of reduced mRNA stability near the translation-initiation site in prokaryotes and eukaryotes. PLoS Comput Biol. 2010, 6: e1000664. 10.1371/journal.pcbi.1000664.
31. dos Reis M, Wernisch L: Estimating translational selection in eukaryotic genomes. Mol Biol Evol. 2009, 26: 451-461. 10.1093/molbev/msn272.
32. Hershberg R, Lipatov M, Small PM, Sheffer H, Niemann S, Homolka S, Roach JC, Kremer K, Petrov DA, Feldman MW, Gagneux S: High functional diversity in Mycobacterium tuberculosis driven by genetic drift and human demography. PLoS Biol. 2008, 6: e311. 10.1371/journal.pbio.0060311.
33. Hershberg R, Tang H, Petrov DA: Reduced selection leads to accelerated gene loss in Shigella. Genome Biol. 2007, 8: R164. 10.1186/gb-2007-8-8-r164.
34. Ochman H, Moran NA: Genes lost and genes found: evolution of bacterial pathogenesis and symbiosis. Science. 2001, 292: 1096-1099. 10.1126/science.1058543.
35. Man O, Pilpel Y: Differential translation efficiency of orthologous genes is involved in phenotypic divergence of yeast species. Nat Genet. 2007, 39: 415-421. 10.1038/ng1967.
36. Jiang H, Guan W, Pinney D, Wang W, Gu Z: Relaxation of yeast mitochondrial functions after whole-genome duplication. Genome Res. 2008, 18: 1466-1471. 10.1101/gr.074674.107.
37. Maglott D, Ostell J, Pruitt KD, Tatusova T: Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res. 2007, 35: D26-31. 10.1093/nar/gkl993.
38. Ciccarelli FD, Doerks T, von Mering C, Creevey CJ, Snel B, Bork P: Toward automatic reconstruction of a highly resolved tree of life. Science. 2006, 311: 1283-1287. 10.1126/science.1123061.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.