K r /K c but not d N /d S correlates positively with body mass in birds, raising implications for inferring lineage-specific selection
© Weber et al.; licensee BioMed Central Ltd. 2014
Received: 30 May 2014
Accepted: 13 November 2014
Published: 11 December 2014
The ratio of the rates of non-synonymous and synonymous substitution (d N /d S ) is commonly used to estimate selection in coding sequences. It is often suggested that, all else being equal, d N /d S should be lower in populations with large effective size (N e) due to increased efficacy of purifying selection. As N e is difficult to measure directly, life history traits such as body mass, which is typically negatively associated with population size, have commonly been used as proxies in empirical tests of this hypothesis. However, evidence of whether the expected positive correlation between body mass and d N /d S is consistently observed is conflicting.
Employing whole genome sequence data from 48 avian species, we assess the relationship between rates of molecular evolution and life history in birds. We find a negative correlation between d N /d S and body mass, contrary to nearly neutral expectation. This raises the question whether the correlation might be a method artefact. We therefore in turn consider non-stationary base composition, divergence time and saturation as possible explanations, but find no clear patterns. However, in striking contrast to d N /d S , the ratio of radical to conservative amino acid substitutions (K r /K c ) correlates positively with body mass.
Our results in principle accord with the notion that non-synonymous substitutions causing radical amino acid changes are more efficiently removed by selection in large populations, consistent with nearly neutral theory. These findings have implications for the use of d N /d S and suggest that caution is warranted when drawing conclusions about lineage-specific modes of protein evolution using this metric.
It has long been established that different lineages evolve at heterogeneous rates , and that differences in organismal life history are reflected by rates of molecular evolution. This is readily observed in terms of lineage-specific nucleotide divergence, with small-bodied species with shorter generations tending to evolve more quickly than their larger relatives -. While this has been proposed to be a consequence of the higher number of germ cell divisions per unit time , the precise cause for the pattern remains unclear .
Another side effect of body size variation between lineages manifests in population size differences, as small species tend to have larger populations . This in turn might impact the prevalence of both positive and negative selection in genome evolution relative to drift. Over long timescales, the distribution of weakly selected mutations is expected to be affected by population size, with fixation probability depending on the product of N e and the selection coefficient, s . Mutations with small selective effects close to the reciprocal of N e will behave effectively neutrally ,-. Non-synonymous substitutions are on average under greater selective constraint than synonymous substitutions. As a consequence, nearly neutral theory predicts that d N /d S should be lower in large populations , as relatively more slightly deleterious non-synonymous changes are removed due to increased selection efficiency when N e is high. Consistent with this idea, pathogenic bacteria and endosymbionts have accelerated rates of protein evolution compared to their free-living relatives, as might be expected given their smaller population sizes ,. Empirical studies on mammals employing body mass as a proxy for N e in absence of actual census counts  present a similar pattern, with smaller-bodied species tending to have lower d N /d S ratios than their larger counterparts - (though not all authors report a significant relationship; see ).
Nevertheless, this trend may not be universal. It was previously reported that there is no significant relationship between d N /d S and life history in a data set containing 7.6 kb of coding sequence from 19 avian genes . It is conceivable that this result was simply owing to insufficient statistical power, as the expected relationship between body mass and substitution rates was also not retrieved. However, more recently, Nabholz et al.  found that avian mitochondrial d N /d S was negatively correlated with body mass. One might expect that inherent differences between inheritance, mutation rate, recombination and gene density in the mitochondria and nucleus could lead to differences in the modulation of substitution patterns by selection . However, in mammals, signatures of purifying selection are reported to be congruent between nuclear and mitochondrial genes ,. A complementary approach to studying the effect of population size on purifying selection is to compare island to mainland species. In principle, island endemic lineages ought to maintain life histories similar to those of their mainland relatives, while experiencing a reduction in N e  (but see ). However, here the evidence is similarly equivocal for birds with both increases and decreases in d N /d S being reported ,, possibly owing to lack of power . Moreover, an increase in d N /d S is predicted with increasing environmental change , which is expected when a species colonises an island. It therefore does not follow that an increase in d N /d S , if at all present, need necessarily be owing to a reduction in population size.
Thus, whether or not large populations generally exhibit lower d N /d S as predicted by theory is not clear at present. To determine this, we need to study additional taxa in depth, considering possible confounding variables. One notable feature of avian genomes in particular is the strong variation in GC content between lineages associated with life history -. This may be of relevance, as estimation of d N /d S is known to be impacted by non-stationary base composition. A degree of caution may therefore be warranted when comparing genomes that differ substantially from one another in terms of base composition.
Making use of nuclear sequences from 48 fully sequenced bird genomes, which were recently generated to resolve the phylogeny of modern birds , we aim to characterise the relationship between life history, d N /d S and the efficacy of selection in birds. In doing so, we also examine to what extent method artefacts might influence our conclusions, examining in turn non-stationary base composition, divergence time, saturation and how examining different classes of amino acid change in relation to population size may help answer these questions.
d N /d S is elevated, not reduced, in birds with putatively larger populations
d N and d S are higher in small-bodied birds
No evidence that non-stationary base composition accounts for elevated d N /d S
Divergence time and estimation of d N /d S
Another possibility that may explain the negative relationship between d N /d S and N e is that there is a dependence of d N /d S on time. When divergence times are short, the ratio may be inflated owing to artefacts that can be statistical or biological in nature and do not reflect a genuine acceleration in the evolutionary rate. Here, both divergence times and terminal branch lengths are determined by the phylogeny considered. Explanations that have been proposed for this include segregating deleterious non-synonymous polymorphisms, the non-linear dynamics of the ratio of the two variables, and model misspecification due to failure to account for amino acid preference in different protein domains -. The time required for this effect to decay has been suggested to depend on N e , which could be potentially problematic for our data given that we find a positive correlation between body mass and time since divergence from the most recent ancestor (ρ=0.406,P=0.0127; calibration points with confidence intervals in the upper quintile were excluded), indicating shorter times for birds with larger populations. Meanwhile, d N /d S is negatively correlated with divergence time, that is, d N /d S is higher for shorter branches (ρ=−0.3288,P=0.047; note that passerines have especially short branches, see Figure 1). However, while controlling the correlation between body mass and d N /d S for divergence time leads to a reduction in the correlation coefficient (partial Spearman’s rank correlation β=−0.3211,P=0.0480, compared to ρ=−0.4106, P=0.0122 for d N /d S versus mass for the filtered data set), it does not altogether remove the relationship, which remains marginally significant. On the other hand, controlling the correlation between divergence time and d N /d S for body mass has a greater impact and renders it non-significant (β=−0.1945,P=0.2476). Finally, it should be noted that our data set mainly consists of relatively divergent lineages (>90% have divergence times 20 to 73 million years ago), where any time dependence on d N /d S should be limited.
Saturation at third codon positions may impact estimation of d S
Although d S is often used as a proxy for the mutation rate when considering the d N /d S ratio, this idea ought to be treated with caution -. A reduction in d S could be caused either by a de facto constraint on the rate of synonymous substitution, or methodological limitations such as saturation that lead to underestimation of the true rate. This is of particular concern for the estimation of d N /d S as synonymous rates might be more prone to underestimation than non-synonymous rates, because non-synonymous substitutions are generally less commonly fixed.
Radical amino acid changes are less frequent in birds with large populations
If saturation at third sites is stronger than for amino acid substitutions and/or if many non-synonymous substitutions behave as effectively neutral, we may consider an alternative metric to assess how effectively slightly deleterious changes are purged from large bird populations. Radical amino acid changes that alter the polarity or volume of a residue are more likely to be negatively selected than conservative amino acid changes, as selective effects tend to be greater where replacements involve residues with dissimilar properties ,. The ratio of radical to conservative substitutions has been suggested to be an appropriate means of testing the predictions of nearly neutral theory and overcoming saturation ,-. Here, we therefore employ K r /K c as our metric, where K r and K c respectively denote radical and conservative changes.
It should be noted that differences in base composition might affect the estimation of radical and conservative changes ,. Given the well-characterised heterogeneity in GC content between our species, we ask whether our results are robust to control for composition. While the partial correlation for K r /K c and mass controlling for GC3 is slightly reduced (β=0.3882,P=0.0057), the correlation for GC3 and K r /K c controlling for mass becomes non-significant (β=−0.0431,P=0.7770 compared to ρ=−0.3215,P=0.0298). We thus find no evidence that base composition explains our observations. Note also that composition is more homogeneous between lineages in the low-variance data but this does not diminish the correlation. These results therefore support the idea that in birds radical amino acid changes are indeed more often removed from large populations than from small populations.
Employing a data set comprising 1,185 orthologues from 48 recently sequenced bird genomes, we examined relationships between life history and lineage-specific patterns of substitution. We found no evidence of reduced d N /d S in birds with putatively higher effective population size, in apparent contradiction to nearly neutral theory. On the contrary, we consistently saw a negative correlation between body mass and lineage-specific d N /d S , similar to what was recently reported based on analyses of avian mitochondria . This is particularly striking and not necessarily expected, given the many inherent differences between nuclear and mitochondrial sequences, as well as the fact that we were able to consider a much larger data set here. Our observations contrast with reports of a positive correlation between body mass and d N /d S in mammals. However, considering the ratios of radical to conservative amino acid substitutions, we found a positive correlation between body size and K r /K c , meaning that lineages with putatively larger populations experience relatively fewer changes that alter the polarity and volume of a residue. That is, those differences that do occur in small-bodied birds may be less likely to disrupt protein function, which is in principle consistent with the notion that selection will more effectively purge deleterious changes from large populations. In contrast with d N /d S -based estimates, our amino acid substitution data (K r /K c ) therefore appear to accord with the predictions of the nearly neutral theory. If a significant proportion of non-synonymous substitutions are conservative and behave as effectively neutral, this may obscure (expected) correlations between d N /d S and life history. Thus, in this case, K r /K c could potentially be a more fine-grained measure for assessing the prevalence of protein-level selection in different lineages.
While radical amino acid mutations should be subject to stronger negative selection, it has been suggested that adaptive evolution may lead to similar proportions of radical and conservative fixation . Could increased rates of adaptive evolution in small birds be responsible for our observation that d N /d S , but not K r /K c , increases with decreasing body mass? Given a high proportion of effectively positively selected mutations, we might predict that the rate of fixation will increase with population size . However, to affect the genome-wide average substantially, positive selection would need to be common, which is unlikely to be the case in vertebrate species with modest population sizes. The absence of high-resolution diversity data limits our ability to quantify directly the prevalence of adaptive non-synonymous substitutions in our study species. Although a past survey of chicken and zebra finch divergence and diversity data estimated the frequency of amino acid changes driven to fixation by positive selection (α) to be around 20% , this value did not differ significantly from zero. Further, simulations indicate that the influence of N e on the proportion of adaptive amino acid changes is limited, impacting mainly populations under 10,000 . Since birds typically have larger N e than this, we might not necessarily expect differences in N e to lead to adaptive changes being more common in smaller-bodied species. This prediction is for instance reflected in the similar percentage of fixations driven by positive selection in Drosophila miranda and D. melanogaster despite a fivefold difference in population size .
There are several conceivable explanations for the discrepancy between our results for the relationship between d N /d S and N e and theoretical expectations. One possibility is that body mass is a poor proxy for population size in birds , but it is not clear how this alone could lead to a reversal in the sign of the correlation, though it could in principle introduce noise. Moreover, the fact that we correlated body size of a single extant species with substitution rates reflecting evolutionary processes in multiple ancestors over significant periods of time naturally means that strong relationships cannot be expected. Another is that there was limited variation in d N /d S (0.13 to 0.17), again weakening the signal in the data. Further, there was some evidence that third sites could be moderately saturated, indicating that we tend to underestimate synonymous changes for greater divergences, such as those observed in small-bodied bird lineages. How much of the variation this might explain is not clear, and divergence appears somewhat low for saturation alone to have a large impact. Given significant constraint on fourfold degenerate sites in birds , a reduction in d S could also be caused by selection on silent sites. However, there is currently no evidence for a correspondence between constraint and population size ,. Interestingly, we find that species d N /d S and d S are positively correlated (ρ=0.535,P=0.0001), counter to what one might expect given that d S is the denominator of d N /d S . This could either indicate a bias in rate estimation or merely be an artefact of the correlations between rates and life history. It is possible that multiple factors work together to produce the pattern observed. Indeed, restricting analyses to orthologues conserved across multiple species can in itself reverse already weak correlations between genomic parameters .
Further, non-stationary GC content can affect estimation of substitution rates, but we detect no clear evidence for this. Given the well-established role of GC-biased gene conversion (gBGC), in driving heterogeneity in avian base composition ,,, it could also impact substitution rates. gBGC is associated with the rate of meiotic recombination and leads to the preferential fixation of GC over AT alleles -. d N in particular has been suggested to increase near mammalian recombination hotspots in the absence of positive selection as a result -. Since small-bodied bird species tend to have increased GC content , it is tempting to speculate that d N /d S could be inflated in these lineages. In mammals, correlations between body mass and d N /d S are partly masked by the effects of gBGC overcoming weak selection . However, the impact of gBGC on global d N /d S is difficult to assess conclusively given that we do not have relevant information on rates of recombination for the majority of our study species. This should be further investigated once detailed estimates of recombination rates become available. Interestingly, no AT → GC bias is seen in rapidly diverging sequences between chicken and zebra finch .
An additional issue that could affect the estimation of d N /d S is the quality of the sequence alignments from which rates are estimated. In principle, if aligned sequences from small-bodied birds were more prone to false positive homology calls, spurious non-synonymous substitutions may be inferred, resulting in a potentially upward-biased d N /d S . While theoretically possible , removing the impact of alignment uncertainty on inferred substitution rates is currently prohibitively computationally costly. Several authors have previously discussed the impact of aligner choice on the rate of false positive inference of positive selection -, and report that certain algorithms perform better than others. We emphasise that the first pass of alignments for the data set of 48 species was performed using SATé+PRANK (see Materials and methods), and that the class of aligners that PRANK belongs to appears less prone to false positives than others -. As such, our approach ought to be as robust as is currently feasible for a data set of this size. To address these limitations conclusively, comprehensive studies on the impact of sequence divergence on alignment uncertainty as well as further advances in alignment and rate estimation methods will be needed.
We finally note that an alternative explanation might be that the discrepancy between K r /K c and d N /d S is not merely owing to methodological artefacts relating to measuring d N /d S accurately but that our naive model of how substitution rates ought to relate to population size is incomplete. The range of N e across which nearly neutral dynamics are expected to hold depends on the distribution of selective effects that is assumed . Some models propose that the distribution of selection coefficients for mutants depends on current fitness, impacting the rate of acceptance of slightly deleterious mutations -. Accordingly, it has been suggested that dependence of d N /d S on N e may be weak , with changes in population size rather than population size per se modulating d N /d S and both expansions and contractions leading to increases in the ratio ,. The rate of diversification appears to correlate positively with the rate of molecular evolution in bird but not mammalian lineages ,, tempting speculation that rapidly evolving birds are especially prone to frequent population size fluctuations. However, to explain our observations, under the size fluctuation model K r /K c would have to be relatively less sensitive than d N /d S to changes in N e and more sensitive to N e itself.
Although branch-specific estimates of d N /d S show no evidence for more efficient selection in large bird populations, K r /K c estimates appear to conform to the predictions of nearly neutral theory in birds, with small-bodied birds tending to have fewer radical amino acid changes. If, as one interpretation of our work suggests, K r /K c is more robust in certain scenarios, gathering deeper insight into the dynamics of this measure will be of broad relevance for inference of protein-level selection. Further, we suggest that the role of gBGC and how the distribution of selective effects differs between different populations will need to be elucidated to determine conclusively to what extent d N /d S is determined by population size under the nearly neutral theory of molecular evolution.
The practical implications of our observations depend partly on the precise mechanisms responsible. How, for instance, might tests for positive selection be influenced? One might imagine that an upward bias in d N /d S within a given lineage could lead to the naive assumption that a higher proportion of coding sequences with an average d N /d S >1 indicates more frequent adaptation. How branch-site tests might be affected is difficult to predict without knowing the distribution of sites that violate our assumptions of how d N and d S ought to behave. It has been suggested that branch-site models may lack power when saturation is present, but are less likely to yield false positives . This contrasts with the higher expected rate of false positives caused by alignment problems -. We also note that comparisons between species and comparisons of different classes of sequence within genomes are expected to be affected differently by certain artefacts. For instance, ecological shifts might affect lineage-specific rate estimates to a greater extent than gene-specific rates , while a constraint on d S  could impact d N /d S in both cases.
Overall, our observations suggest that a careful examination of potential sources of error is called for when interpreting evolutionary rate estimates, and that this must be done with the specific questions and data set in mind. Further, while we cannot presently conclude that radical and conservative rates are inherently more reliable for detecting negative selection, the fact that d N /d S does not consider the effects of different classes of non-synonymous change suggests that it likely presents an incomplete picture of selective processes.
Materials and methods
Data for 48 genomes
Coding sequence alignments for 48 bird species (see Additional file 2) were obtained from a recent initiative to resolve the phylogeny of modern birds; see Jarvis et al.  and Zhang et al.  for a detailed description of how these data were generated. Briefly, this data set comprises 8,295 orthologous protein-coding sequences identified by propagating chicken and zebra finch annotations to the remaining species and classifying orthology by combining information from alignment statistics, reciprocal best hits and synteny. Multiple sequence alignments were generated by running SATé+PRANK followed by SATé+MAFFT on concatenated exon sequences . Of 1,185 1:1 orthologues present in all species, 921 contained no internal stop codons. Concatenated alignments comprising the highest and lowest variance in GC3 from the same study were also considered .
Data for 169 species
To extend our taxon sampling, we also analysed 11,160 bp of sequence from 169 avian species, consisting of the coding sequences of the Hackett et al.  data set and two additional widely used phylogenetic markers, RAG1 and RAG2, which were downloaded from GenBank (see Additional file 3 for accession numbers). The marker sequences were translated into amino acids, aligned using MUSCLE  and subsequently converted back to nucleotides. These data are what we refer to as the taxon-rich set.
Life history traits
Body mass data were extracted from the CRC Handbook of Avian Body Masses  for all available tip nodes. Where multiple entries for a given species were present, the mean value was used.
For the taxon-rich data set, we used the tree of Hackett et al. . For the 48 genomes, the total evidence nucleotide tree estimated by Jarvis et al.  was used, along with corresponding time calibration points, which we considered for our divergence time analyses.
Maximum likelihood estimation
Given the difference in the sizes of the two alignment data sets, as well as in the evolutionary distances between the sampled taxa, we employed two different methods of maximum likelihood estimation. To make the analyses on the larger gene-rich data set with less dense taxon sampling tractable, we approximated branch-specific d N /d S ratios by substitution mapping using mapNH ,. We did this by fitting a homogeneous YN98  model to coding sequence alignments and subsequently mapping synonymous and non-synonymous substitutions onto individual branches. This was done separately for each orthologue from the 1:1 set that did not include an internal stop, and d N /d S was obtained by summing substitution counts prior to dividing to avoid low count numbers introducing noise. To make these numbers comparable to those from Codeml, the ratio of non-synonymous to synonymous counts was divided by 3. As the branches leading to the two eagles were too short to estimate d N /d S reliably, we considered only Haliaeetus albicilla.
d N and d S were obtained by fixing ω=1 in mapNH (following the rationale presented in Yang and Nielsen , p. 411) and multiplying the resulting normalised substitution counts by the corresponding branch lengths. This feature is implemented in the development version of Bio++ , available online .
On the other hand, for the 11-kb taxon-rich data set, rates were estimated using Codeml  with lineages grouped by taxonomic order to reduce variance in d N /d S owing to short branches. We assigned one local d N /d S for every avian order, resulting in 53 local values (see Additional file 4 for groups). Concatenating the alignments further served to reduce noise.
The ratio of radical to conservative amino acid changes (K r /K c ) for the taxon-rich data set was calculated by concatenating 1,185 1:1 orthologues, fitting a Jukes–Cantor model and mapping radical and conservative substitution counts onto the tree using mapNH. Radical changes are those that alter the polarity or volume of the residue. Here, L, I, F, M, Y, W, H, K, R, E and Q were classified as having large volumes, while Y, W, H, K, R, E, Q, T, D, N, S and C were classified as polar. Results using a WAG01 model were qualitatively similar to those calculated using the Jukes–Cantor model. Considering each orthologue individually before summing counts yielded noisy results, presumably owing to low numbers of radical amino acid substitutions in individual alignments. Overall, performance was better where a greater number of substitution counts was available, as using the full set of 8,295 orthologues yielded a slightly stronger correlation between body mass and K r /K c than when smaller subsets were considered (ρ=0.513,P=0.0003). Due to the short eagle branches, Haliaeetus leucocephalus was excluded.
Bayesian estimation of coevolution between substitution and life history
Coevol  was used on subsets of the gene-rich data set to calculate K r /K c and d N /d S . As above, the polarity and volume definition (-polvol) was used to classify amino acid changes as radical or conservative. To control the relationship between body mass and d N /d S for equilibrium base composition, we also ran Coevol with equilibrium GC as a parameter. A more detailed description of the methods used, as well as priors and calibration points, is given in Nabholz et al. .
From the 830 orthologues with the highest and lowest variance in GC3, 200 genes were randomly selected . The pairwise divergence was computed from the number of observed differences between two sequences without correction for multiple substitutions. The phylogenetic distance (that is, the patristic distance) was obtained from the sum of branch lengths between two species, computed using a phylogenetic tree estimated by maximum-likelihood using PAML. We used a GTR+GAMMA model in baseml  for the third codon position data set and WAG, an empirical substitution matrix, in Codeml  for the protein data set.
Statistics and data availability
Statistical analyses were performed in R. The genome data from the 48 bird species are available online .
GC-biased gene conversion
Computational analyses were performed using resources provided by the Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) of Uppsala University, supported by the Swedish National Infrastructure for Computing (SNIC). The study was funded by the European Research Council (AdG 249976), Knut and Alice Wallenberg Foundation (Wallenberg Scholar Grant and the Swedish Research Council (2010-5650; all to HE). We thank Erich Jarvis for coordinating the avian phylogenomics project and, along with Guojie Zhang and Tom Gilbert, providing access to genomes. We also thank Laurent Guéguen for help with estimating d N and d S in mapNH, Carina Mugal, Jochen Wolf, Nicolas Galtier, Julien Dutheil, Simon Whelan and Laurence Hurst for helpful discussions, and two reviewers for their comments on the manuscript. This is publication ISE-M 2014-189.
- Britten R: Rates of DNA sequence evolution differ between taxonomic groups. Science. 1986, 39: 1393-1398. 10.1126/science.3082006.View ArticleGoogle Scholar
- Wu CI, Li WH: Evidence for higher rates of nucleotide substitution in rodents than in man. Proc Natl Acad Sci USA. 1985, 82: 1741-1745. 10.1073/pnas.82.6.1741.PubMedPubMed CentralView ArticleGoogle Scholar
- Ohta T: Evolutionary rate of cistrons and DNA divergence. J Mol Evol. 1972, 1: 150-157. 10.1007/BF01659161.PubMedView ArticleGoogle Scholar
- Bromham L: Why do species vary in their rate of molecular evolution?. Biol Lett. 2009, 5: 401-404. 10.1098/rsbl.2009.0136.PubMedPubMed CentralView ArticleGoogle Scholar
- Nabholz B, Glémin S, Galtier N: Strong variations of mitochondrial mutation rate across mammals – the longevity hypothesis. Mol Biol Evol. 2008, 25: 120-130. 10.1093/molbev/msm248.PubMedView ArticleGoogle Scholar
- Martin AP, Palumbi SR: Body size, metabolic rate, generation time, and the molecular clock. Proc Natl Acad Sci USA. 1993, 90: 4087-4091. 10.1073/pnas.90.9.4087.PubMedPubMed CentralView ArticleGoogle Scholar
- Welch JJ, Bininda-Emonds ORP, Bromham L: Correlates of substitution rate variation in mammalian protein-coding sequences. BMC Evol Biol. 2008, 8: 53-10.1186/1471-2148-8-53.PubMedPubMed CentralView ArticleGoogle Scholar
- Wilson Sayres MA, Venditti C, Pagel M, Makova KD: Do variations in substitution rates and male mutation bias correlate with life-history traits? A study of 32 mammalian genomes. Evolution. 2011, 65: 2800-2815. 10.1111/j.1558-5646.2011.01337.x.PubMedView ArticleGoogle Scholar
- Lartillot N, Delsuc F: Joint reconstruction of divergence times and life-history evolution in placental mammals using a phylogenetic covariance model. Evolution. 2012, 66: 1773-1787. 10.1111/j.1558-5646.2011.01558.x.PubMedView ArticleGoogle Scholar
- Bromham L: The genome as a life-history character: why rate of molecular evolution varies between mammal species. Philos Trans R Soc Lond B: Biol Sci. 2011, 366: 2503-2513. 10.1098/rstb.2011.0014.View ArticleGoogle Scholar
- Li WH, Ellsworth DL, Krushkal J, Chang BH, Hewett-Emmett D: Rates of nucleotide substitution in primates and rodents and the generation-time effect hypothesis. Mol Phylogenet Evol. 1996, 5: 182-187. 10.1006/mpev.1996.0012.PubMedView ArticleGoogle Scholar
- Thomas GWC, Hahn MW: The human mutation rate is increasing, even as it slows. Mol Biol Evol. 2014, 31: 253-257. 10.1093/molbev/mst218.PubMedView ArticleGoogle Scholar
- Damuth J: Population density and body size in mammals. Nature. 1981, 290: 699-700. 10.1038/290699a0.View ArticleGoogle Scholar
- Kimura M: On the probability of fixation of mutant genes in a population. Genetics. 1962, 47: 713-719.PubMedPubMed CentralGoogle Scholar
- Ohta T: Slightly deleterious mutant substitutions in evolution. Nature. 1973, 246: 96-98. 10.1038/246096a0.PubMedView ArticleGoogle Scholar
- Akashi H, Osada N, Ohta T: Weak selection and protein evolution. Genetics. 2012, 192: 15-31. 10.1534/genetics.112.140178.PubMedPubMed CentralView ArticleGoogle Scholar
- Ohta T, Gillespie J: Development of neutral and nearly neutral theories. Theor Popul Biol. 1996, 49: 128-142. 10.1006/tpbi.1996.0007.PubMedView ArticleGoogle Scholar
- Ohta T: Synonymous and nonsynonymous substitutions in mammalian genes and the nearly neutral theory. J Mol Evol. 1995, 40: 56-63. 10.1007/BF00166595.PubMedView ArticleGoogle Scholar
- Woolfit M, Bromham L: Increased rates of sequence evolution in endosymbiotic bacteria and fungi with small effective population sizes. Mol Biol Evol. 2003, 20: 1545-1555. 10.1093/molbev/msg167.PubMedView ArticleGoogle Scholar
- Warnecke T, Rocha EPC: Function-specific accelerations in rates of sequence evolution suggest predictable epistatic responses to reduced effective population size. Mol Biol Evol. 2011, 28: 2339-2349. 10.1093/molbev/msr054.PubMedView ArticleGoogle Scholar
- Lanfear R, Kokko H, Eyre-Walker A: Population size and the rate of evolution. Trends Ecol Evol. 2014, 29: 33-41. 10.1016/j.tree.2013.09.009.PubMedView ArticleGoogle Scholar
- Nikolaev SI, Montoya-Burgos JI, Popadin KY, Parand L, Margulies EH, Antonarakis SE: Life-history traits drive the evolutionary rates of mammalian coding and noncoding genomic elements. Proc Natl Acad Sci USA. 2007, 104: 20443-20448. 10.1073/pnas.0705658104.PubMedPubMed CentralView ArticleGoogle Scholar
- Kosiol C, Vinar T, da Fonseca RR, Hubisz MJ, Bustamante CD, Nielsen R, Siepel A: Patterns of positive selection in six mammalian genomes. PLoS Genet. 2008, 4: 1000144-10.1371/journal.pgen.1000144.View ArticleGoogle Scholar
- Romiguier J, Figuet E, Galtier N, Douzery EJP, Boussau B, Dutheil JY, Ranwez V: Fast and robust characterization of time-heterogeneous sequence evolutionary processes using substitution mapping. PLoS One. 2012, 7: 33852-10.1371/journal.pone.0033852.View ArticleGoogle Scholar
- Lartillot N: Interaction between selection and biased gene conversion in mammalian protein-coding sequence evolution revealed by a phylogenetic covariance analysis. Mol Biol Evol. 2013, 30: 356-368. 10.1093/molbev/mss231.PubMedView ArticleGoogle Scholar
- Romiguier J, Ranwez V, Douzery EJP, Galtier N: Genomic evidence for large, long-lived ancestors to placental mammals. Mol Biol Evol. 2013, 30: 5-13. 10.1093/molbev/mss211.PubMedView ArticleGoogle Scholar
- Popadin KY, Polishchuk LV, Mamirova L, Knorre D, Gunbin K: Accumulation of slightly deleterious mutations in mitochondrial protein-coding genes of large versus small mammals. Proc Natl Acad Sci USA. 2007, 104: 13390-13395. 10.1073/pnas.0701256104.PubMedPubMed CentralView ArticleGoogle Scholar
- Lanfear R, Ho SYW, Love D, Bromham L: Mutation rate is linked to diversification in birds. Proc Natl Acad Sci USA. 2010, 107: 20423-20428. 10.1073/pnas.1007888107.PubMedPubMed CentralView ArticleGoogle Scholar
- Nabholz B, Uwimana N, Lartillot N: Reconstructing the phylogenetic history of long-term effective population size and life-history traits using patterns of amino acid replacement in mitochondrial genomes of mammals and birds. Genome Biol Evol. 2013, 5: 1273-1290. 10.1093/gbe/evt083.PubMedPubMed CentralView ArticleGoogle Scholar
- Ballard JWO, Whitlock MC: The incomplete natural history of mitochondria. Mol Ecol. 2004, 13: 729-744. 10.1046/j.1365-294X.2003.02063.x.PubMedView ArticleGoogle Scholar
- Popadin KY, Nikolaev SI, Junier T, Baranova M, Antonarakis SE: Purifying selection in mammalian mitochondrial protein-coding genes is highly effective and congruent with evolution of nuclear genes. Mol Biol Evol. 2013, 30: 347-355. 10.1093/molbev/mss219.PubMedView ArticleGoogle Scholar
- Woolfit M, Bromham L: Population size and molecular evolution on islands. Proc R Soc B: Biol Sci. 2005, 272: 2277-2282. 10.1098/rspb.2005.3217.View ArticleGoogle Scholar
- Charlesworth J, Eyre-Walker A: The other side of the nearly neutral theory, evidence of slightly advantageous back-mutations. Proc Natl Acad Sci USA. 2007, 104: 16992-16997. 10.1073/pnas.0705456104.PubMedPubMed CentralView ArticleGoogle Scholar
- Wright SD, Gillman LN, Ross HA, Keeling DJ: Slower tempo of microevolution in island birds: implications for conservation biology. Evolution. 2009, 63: 2275-2287. 10.1111/j.1558-5646.2009.00717.x.PubMedView ArticleGoogle Scholar
- Johnson KP, Seger J: Elevated rates of nonsynonymous substitution in island birds. Mol Biol Evol. 2001, 18: 874-881. 10.1093/oxfordjournals.molbev.a003869.PubMedView ArticleGoogle Scholar
- Loire E, Chiari Y, Bernard A, Cahais V, Romiguier J, Nabholz B, Lourenço JM, Galtier N: Population genomics of the endangered giant Galápagos tortoise. Genome Biol. 2013, 14: 136-10.1186/gb-2013-14-12-r136.View ArticleGoogle Scholar
- Lourenço JM, Glémin S, Galtier N: The rate of molecular adaptation in a changing environment. Mol Biol Evol. 2013, 30: 1292-1301. 10.1093/molbev/mst026.PubMedView ArticleGoogle Scholar
- Weber CC, Boussau B, Romiguier J, Jarvis ED, Ellegren H: Evidence for GC-biased gene conversion as a driver of between-lineage differences in avian base composition. Genome Biol2014. doi:10.1186/s13059-014-0549-1.
- Nabholz B, Künstner A, Wang R, Jarvis ED, Ellegren H: Dynamic evolution of base composition: causes and consequences in avian phylogenomics. Mol Biol Evol. 2011, 28: 2197-2210. 10.1093/molbev/msr047.PubMedPubMed CentralView ArticleGoogle Scholar
- Jarvis ED, Mirarab S, Aberer AJ, Li B, Houde P, Li C, Ho SYW, Faircloth BC, Nabholz B, Howard JT, Suh A, Weber CC, da Fonseca RR, Li J, Zhang F, Li H, Zhou L, Narula N, Liu L, Ganapathy G, Boussau B, Bayzid MS, Zavidovych V, Subramanian S, Gabaldón T, Capella-Gutiérrez S, Huerta-Cepas J, Rekepalli B, Munch K, Schierup M, et al.: Whole-genome analyses resolve early branches in the tree of life of modern birds. Science2014. doi:10.1126/science.1253451.
- Hackett SJ, Kimball RT, Reddy S, Bowie RCK, Braun EL, Braun MJ, Chojnowski JL, Cox WA, Han K-L, Harshman J, Huddleston CJ, Marks BD, Miglia KJ, Moore WS, Sheldon FH, Steadman DW, Witt CC, Yuri T: A phylogenomic study of birds reveals their evolutionary history. Science. 2008, 320: 1763-1768. 10.1126/science.1157704.PubMedView ArticleGoogle Scholar
- Lartillot N, Poujol R: A phylogenetic model for investigating correlated evolution of substitution rates and continuous phenotypic characters. Mol Biol Evol. 2011, 28: 729-744. 10.1093/molbev/msq244.PubMedView ArticleGoogle Scholar
- Bay RA, Bielawski JP: Inference of functional divergence among proteins when the evolutionary process is non-stationary. J Mol Evol. 2013, 76: 205-215. 10.1007/s00239-013-9549-0.PubMedView ArticleGoogle Scholar
- Weber CC, Hurst LD: Protein rates of evolution are predicted by double-strand break events, independent of crossing-over rates. Genome Biol Evol. 2009, 1: 340-349. 10.1093/gbe/evp033.PubMedPubMed CentralView ArticleGoogle Scholar
- Rocha EPC, Smith JM, Hurst LD, Holden MTG, Cooper JE, Smith NH, Feil EJ: Comparisons of dN / dS are time dependent for closely related bacterial genomes. J Theoretical Biol. 2006, 239: 226-235. 10.1016/j.jtbi.2005.08.037.View ArticleGoogle Scholar
- Ho SYW, Lanfear R, Bromham L, Phillips MJ, Soubrier J, Rodrigo AG, Cooper A: Time-dependent rates of molecular evolution. Mol Ecol. 2011, 20: 3087-3101. 10.1111/j.1365-294X.2011.05178.x.PubMedView ArticleGoogle Scholar
- Peterson GI, Masel J: Quantitative prediction of molecular clock and K a / K s at short timescales. Mol Biol Evol. 2009, 26: 2595-2603. 10.1093/molbev/msp175.PubMedPubMed CentralView ArticleGoogle Scholar
- Dos Reis M, Yang Z: Why do more divergent sequences produce smaller nonsynonymous/synonymous rate ratios in pairwise sequence comparisons?. Genetics. 2013, 195: 195-204. 10.1534/genetics.113.152025.PubMedPubMed CentralView ArticleGoogle Scholar
- Mugal CF, Wolf JBW, Kaj I: Why time matters: codon evolution and the temporal dynamics of dN / dS. Mol Biol Evol. 2014, 31: 212-231. 10.1093/molbev/mst192.PubMedPubMed CentralView ArticleGoogle Scholar
- Wolf JBW, Künstner A, Nam K, Jakobsson M, Ellegren H: Nonlinear dynamics of nonsynonymous ( d N ) and synonymous ( d S ) substitution rates affects inference of selection. Genome Biol Evol. 2009, 1: 308-319. 10.1093/gbe/evp030.PubMedPubMed CentralView ArticleGoogle Scholar
- Eory L, Halligan DL, Keightley PD: Distributions of selectively constrained sites and deleterious mutation rates in the hominid and murid genomes. Mol Biol Evol. 2010, 27: 177-192. 10.1093/molbev/msp219.PubMedView ArticleGoogle Scholar
- Chamary J-V, Parmley JL, Hurst LD: Hearing silence: non-neutral evolution at synonymous sites in mammals. Nat Rev Genet. 2006, 7: 98-108. 10.1038/nrg1770.PubMedView ArticleGoogle Scholar
- Drummond DA, Wilke CO: Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell. 2008, 134: 341-352. 10.1016/j.cell.2008.05.042.PubMedPubMed CentralView ArticleGoogle Scholar
- Yampolsky LY, Kondrashov FA, Kondrashov AS: Distribution of the strength of selection against amino acid replacements in human proteins. Hum Mol Genet. 2005, 14: 3191-3201. 10.1093/hmg/ddi350.PubMedView ArticleGoogle Scholar
- Smith NGC: Are radical and conservative substitution rates useful statistics in molecular evolution?. J Mol Evol. 2003, 57: 467-478. 10.1007/s00239-003-2500-z.PubMedView ArticleGoogle Scholar
- Wernegreen JJ: Reduced selective constraint in endosymbionts: elevation in radical amino acid replacements occurs genome-wide. PLoS One. 2011, 6: 28905-10.1371/journal.pone.0028905.View ArticleGoogle Scholar
- Eyre-Walker A, Keightley PD, Smith NGC, Gaffney D: Quantifying the slightly deleterious mutation model of molecular evolution. Mol Biol Evol. 2002, 19: 2142-2149. 10.1093/oxfordjournals.molbev.a004039.PubMedView ArticleGoogle Scholar
- Axelsson E, Ellegren H: Quantification of adaptive evolution of genes expressed in avian brain and the population size effect on the efficacy of selection. Mol Biol Evol. 2009, 26: 1073-1079. 10.1093/molbev/msp019.PubMedView ArticleGoogle Scholar
- Bachtrog D: Similar rates of protein adaptation in Drosophila miranda and D. melanogaster , two species with different current effective population sizes. BMC Evol Biol. 2008, 8: 334-10.1186/1471-2148-8-334.PubMedPubMed CentralView ArticleGoogle Scholar
- Nee S, Read A, Greenwood J, Harvey P: The relationship between abundance and body size in British birds. Nature. 1991, 351: 312-313. 10.1038/351312a0.View ArticleGoogle Scholar
- Künstner A, Nabholz B, Ellegren H: Significant selective constraint at 4-fold degenerate sites in the avian genome and its consequence for detection of positive selection. Genome Biol Evol. 2011, 3: 1381-1389. 10.1093/gbe/evr112.PubMedPubMed CentralView ArticleGoogle Scholar
- Weber CC, Hurst LD: Intronic AT skew is a defendable proxy for germline transcription but does not predict crossing-over or protein evolution rates in Drosophila melanogaster. J Mol Evol. 2010, 71: 415-426. 10.1007/s00239-010-9395-2.PubMedView ArticleGoogle Scholar
- Webster MT, Axelsson E, Ellegren H: Strong regional biases in nucleotide substitution in the chicken genome. Mol Biol Evol. 2006, 23: 1203-1216. 10.1093/molbev/msk008.PubMedView ArticleGoogle Scholar
- Mugal CF, Arndt PF, Ellegren H: Twisted signatures of GC-biased gene conversion embedded in an evolutionary stable karyotype. Mol Biol Evol. 2013, 30: 1700-1712. 10.1093/molbev/mst067.PubMedPubMed CentralView ArticleGoogle Scholar
- Webster MT, Hurst LD: Direct and indirect consequences of meiotic recombination: implications for genome evolution. Trends Genet. 2012, 28: 101-109. 10.1016/j.tig.2011.11.002.PubMedView ArticleGoogle Scholar
- Duret L, Galtier N: Biased gene conversion and the evolution of mammalian genomic landscapes. Annu Rev Genomics Hum Genet. 2009, 10: 285-311. 10.1146/annurev-genom-082908-150001.PubMedView ArticleGoogle Scholar
- Duret L, Eyre-Walker A, Galtier N: A new perspective on isochore evolution. Gene. 2006, 385: 71-74. 10.1016/j.gene.2006.04.030.PubMedView ArticleGoogle Scholar
- Kostka D, Hubisz MJ, Siepel A, Pollard KS: The role of GC-biased gene conversion in shaping the fastest evolving regions of the human genome. Mol Biol Evol. 2012, 29: 1047-1057. 10.1093/molbev/msr279.PubMedPubMed CentralView ArticleGoogle Scholar
- Ratnakumar A, Mousset S, Glémin S, Berglund J, Galtier N, Duret L, Webster MT: Detecting positive selection within genomes: the problem of biased gene conversion. Philos Trans R Soc Lond B: Biol Sci. 2010, 365: 2571-2580. 10.1098/rstb.2010.0007.View ArticleGoogle Scholar
- Galtier N, Duret L, Glémin S, Ranwez V: GC-biased gene conversion promotes the fixation of deleterious amino acid changes in primates. Trends Genet. 2009, 25: 1-5. 10.1016/j.tig.2008.10.011.PubMedView ArticleGoogle Scholar
- Berglund J, Pollard KS, Webster MT: Hotspots of biased nucleotide substitutions in human genes. PLoS Biology. 2009, 7: 26-10.1371/journal.pbio.1000026.View ArticleGoogle Scholar
- Galtier N, Duret L: Adaptation or biased gene conversion? Extending the null hypothesis of molecular evolution. Trends Genet. 2007, 23: 273-277. 10.1016/j.tig.2007.03.011.PubMedView ArticleGoogle Scholar
- Capra JA, Pollard KS: Substitution patterns are GC-biased in divergent sequences across the metazoans. Genome Biol Evol. 2011, 3: 516-527. 10.1093/gbe/evr051.PubMedPubMed CentralView ArticleGoogle Scholar
- Redelings B: Erasing errors due to alignment ambiguity when estimating positive selection. Mol Biol Evol2014:1979–1993.
- Markova-Raina P, Petrov DA: High sensitivity to aligner and high rate of false positives in the estimates of positive selection in the 12 Drosophila genomes. Genome Res. 2011, 21: 863-874. 10.1101/gr.115949.110.PubMedPubMed CentralView ArticleGoogle Scholar
- Jordan G, Goldman N: The effects of alignment error and alignment filtering on the sitewise detection of positive selection. Mol Biol Evol. 2012, 29: 1125-1139. 10.1093/molbev/msr272.PubMedView ArticleGoogle Scholar
- Blackburne BP, Whelan S: Class of multiple sequence alignment algorithm affects genomic analysis. Mol Biol Evol. 2013, 30: 642-653. 10.1093/molbev/mss256.PubMedView ArticleGoogle Scholar
- Cherry JL: Should we expect substitution rate to depend on population size?. Genetics. 1998, 150: 911-919.PubMedPubMed CentralGoogle Scholar
- Goldstein RA: Population size dependence of fitness effect distribution and substitution rate probed by biophysical model of protein thermostability. Genome Biol Evol. 2013, 5: 1584-1593. 10.1093/gbe/evt110.PubMedPubMed CentralView ArticleGoogle Scholar
- Wylie CS, Shakhnovich EI: A biophysical protein folding model accounts for most mutational fitness effects in viruses. Proc Natl Acad Sci USA. 2011, 108: 9916-9921. 10.1073/pnas.1017572108.PubMedPubMed CentralView ArticleGoogle Scholar
- Goldie X, Lanfear R, Bromham L: Diversification and the rate of molecular evolution: no evidence of a link in mammals. BMC Evol Biol. 2011, 11: 286-10.1186/1471-2148-11-286.PubMedPubMed CentralView ArticleGoogle Scholar
- Gharib WH, Robinson-Rechavi M: The branch-site test of positive selection is surprisingly robust but lacks power under synonymous substitution saturation and variation in GC. Mol Biol Evol. 2013, 30: 1675-1686. 10.1093/molbev/mst062.PubMedPubMed CentralView ArticleGoogle Scholar
- Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, Odeen A, Cui J, Zhou Q, Xu L, Pan H, Wang Z, Jin L, Zhang P, Hu H, Yang W, Hu J, Xiao J, Yang Z, Liu Y, Xie Q, Yu H, Lian J, Wen P, Zhang F, Li H, et al.: Comparative genomics across modern bird species reveal insights into avian genome evolution and adaptation. Science2014. doi:10.1126/science.1251385.
- Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32: 1792-1797. 10.1093/nar/gkh340.PubMedPubMed CentralView ArticleGoogle Scholar
- Dunning JBJ: CRC Handbook of Avian Body Masses. 2nd edition, Boca Raton, Florida: CRC Press; 2007.
- Dutheil JY, Galtier N, Romiguier J, Douzery EJP, Ranwez V, Boussau B: Efficient selection of branch-specific models of sequence evolution. Mol Biol Evol. 2012, 24: 1-15.Google Scholar
- Yang Z, Nielsen R: Synonymous and nonsynonymous rate variation in nuclear genes of mammals. J Mol Evol. 1998, 46: 409-418. 10.1007/PL00006320.PubMedView ArticleGoogle Scholar
- Gueguen L, Gaillard S, Boussau B, Gouy M, Groussin M, Rochette NC, Bigot T, Fournier D, Pouyet F, Cahais V, Bernard A, Scornavacca C, Nabholz B, Haudry A, Dachary L, Galtier N, Belkhir K, Dutheil JY: Bio++: efficient extensible libraries and tools for computational molecular evolution. Mol Biol Evol. 2013, 30: 1745-1750. 10.1093/molbev/mst097.PubMedView ArticleGoogle Scholar
- Bio++ Wiki. , [http://biopp.univ-montp2.fr/]
- Yang Z: PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007, 24: 1586-1591. 10.1093/molbev/msm088.PubMedView ArticleGoogle Scholar
- Zhang G, Li B, Li C, Gilbert MTP, Jarvis E, The Avian Genome Consortium, Wang J: The avian phylogenomic project data; 2014. ., [http://dx.doi.org/10.5524/101000]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.