Heterochronic evolution reveals modular timing changes in budding yeast transcriptomes
© BioMed Central Ltd 2010
Received: 27 May 2010
Accepted: 22 October 2010
Published: 22 October 2010
Gene expression is a dynamic trait, and the evolution of gene regulation can dramatically alter the timing of gene expression without greatly affecting mean expression levels. Moreover, modules of co-regulated genes may exhibit coordinated shifts in expression timing patterns during evolutionary divergence. Here, we examined transcriptome evolution in the dynamical context of the budding yeast cell-division cycle, to investigate the extent of divergence in expression timing and the regulatory architecture underlying timing evolution.
Using a custom microarray platform, we obtained 378 measurements for 6,263 genes over 18 timepoints of the cell-division cycle in nine strains of S. cerevisiae and one strain of S. paradoxus. Most genes show significant divergence in expression dynamics at all scales of transcriptome organization, suggesting broad potential for timing changes. A model test comparing expression level evolution versus timing evolution revealed a better fit with timing evolution for 82% of genes. Analysis of shared patterns of timing evolution suggests the existence of seven dynamically-autonomous modules, each of which shows coherent evolutionary timing changes. Analysis of transcription factors associated with these gene modules suggests a modular pleiotropic source of divergence in expression timing.
We propose that transcriptome evolution may generally entail changes in timing (heterochrony) rather than changes in levels (heterometry) of expression. Evolution of gene expression dynamics may involve modular changes in timing control mediated by module-specific transcription factors. We hypothesize that genome-wide gene regulation may utilize a general architecture comprised of multiple semi-autonomous event timelines, whose superposition could produce combinatorial complexity in timing control patterns.
Recent evolutionary studies using natural and inbred Drosophila and C. elegans lines have shown that genome-wide gene expression levels are much more conserved in nature than expected compared to independent measurements of mutational input [1–3], supporting the hypothesis that transcriptome evolution is characterized by stabilizing selection. These observations suggest that organisms show limited evolutionary divergence in gene expression via changes in gene regulation, either by qualitative changes in the connectivity of regulatory interactions or by quantitative changes in the strength of regulatory interactions. In addition, since the architecture of gene regulation involves highly connected and hierarchical cascades of control [4–7], regulatory change may be limited due to the broad potential for negative pleiotropic consequences . Given this evidence for deleterious changes in gene regulation, how do organisms acquire transcriptome divergence?
Many studies have addressed this question by investigating the relationship between gene expression divergence and different kinds of genomic variation. Studies focusing on the regulatory effects of single nucleotide mutations have revealed that expression divergence generally associates with cis variation within species [9–13] and with trans variation between species [14–18]. Other studies have focused on larger, structural mutations, such as mobile element transposition or non-homologous recombination [19–21]. While these studies have discovered many important links between genomic variation and expression divergence, few studies have directly observed how genomic variation affects the qualitative structure or quantitative dynamics of an organism's genome-wide regulatory network. Notably, genome-wide binding patterns of six transcription factors were recently compared between two Drosophila species during embryonic development , revealing a dominant signature of quantitative, rather than qualitative changes in TF-DNA regulatory interactions.
One possible avenue for transcriptome divergence that remains consistent with the evidence of stabilizing selection on genome-wide gene expression levels and evolutionary conservation of gene regulatory network topology is that divergence might occur via changes in the timing of gene expression. Gene expression is both a quantitative trait and a dynamic trait, such that the timing of gene expression is regulated by a complex, polygenic combination of factors [5, 23–26]. Evolutionary modifications to gene regulation have the potential to dramatically alter gene expression timing without greatly affecting mean expression levels [27, 28]. Moreover, changes in the timing of regulatory factor expression could induce temporal shifts in the expression trajectories of some genes relative to others (heterochrony) [29, 30] without disrupting functional relationships.
In this study, we investigated the evolution of genome-wide gene expression as a dynamical system, to evaluate the pattern of divergence in expression timing, the mode of time-dependent transcriptome evolution, and the genome-wide architecture of timing control. We performed a large number of analyses and experiments that follow multiple inference pathways, as diagrammed in Figure S1 in Additional file 1. To overview our results and conclusions, we propose that our data and analyses support the following hypotheses: (1) while the vast majority of genes have bounded expression levels consistent with stabilizing selection, most expression trajectories show significant heterochronic divergence among strains; (2) the pattern of transcriptome divergence involves time-dependent changes in the magnitude, direction, and degrees of freedom of among-strain covariation; (3) genome-wide gene regulation utilizes a general architecture for transcriptome timing control comprised of distinct, coherent, and dynamically-autonomous modules; (4) population-level transcriptome divergence may predominantly result from quantitative changes in the expression dynamics of module-specific trans-regulatory factors rather than qualitative changes in the structure of genome-wide gene regulation; (5) an architecture involving modular timing control could generate complex patterns of heterochronic divergence combinatorially, while alleviating global negative pleiotropic effects associated with changes in regulatory interactions or changes in the expression of trans-regulatory factors.
We assayed genome-wide gene expression (transcriptome) levels throughout the mitotic cell-division cycle (CDC) of ten natural budding yeast lines, including eight woodland and one laboratory strain of S. cerevisiae and one outgroup of S. paradoxus, in a comparative experimental design that involves technical, but not biological replicates of each timepoint (see Materials and methods). To calibrate the variation in gene expression across these lines with an expectation from mutation-drift, we also measured transcriptomes for 23 mutation accumulation (MA) lines. Normalizing and processing our data yielded expression levels for 6,263 genes at 18 sampled CDC-timepoints for the natural lines and unsynchronized expression for the MA lines. We validated our array measurements by comparison with previously published CDC-dependent temporal expression data (Figure S32 in Additional file 1) and with RNA sequencing data produced using the ABI SOLiD 3 platform (Figure S33 in Additional file 1). Our expression data show significant consistency both with previous CDC expression data and with quantification of RNA sequencing data.
Genome-wide expression levels show much less variability than expected, but CDC-temporal expression patterns display broad divergence
One might suspect that the broad lack of expression divergence among strains may be due to a general deficiency of CDC-temporal variation for many of the genes. To test this, we partitioned S. cerevisiae expression variation into relative contributions from strain and temporal effects using a linear mixed model analysis. 3,750 genes (59.9%) exhibit significant effects (FDR < 0.1 over all 6,251 × 2 hypotheses): 2,797 genes (46.6%) show significant strain variation (that is, divergence), 2,596 genes (43.3%) show significant temporal variation, and 1,643 genes (26.2%) show both effects. Averaging over these 1,643 genes, strain effects explain 39% and temporal effects explain 23% of the total variance in gene expression; combining these marginal effects explains 50%-90% of each gene's total variance. Strain and temporal variances show significant but mild correlation (R = 0.25, P < 10-10; Figure S2 in Additional file 1), and temporal effects contribute 104-fold more to overall expression variation compared to strain effects when scaled by divergence time (genome-wide medians ). Thus, considerable temporal variation in CDC-expression is present in the yeast transcriptome (see also Figure S3 in Additional file 1).
To relate evolutionary forces to yeast gene function, we computed the proportion of genes under stabilizing selection for eight broad life-cycle terms and 88 GO Slim terms over time, Q j (t), where j indexes each term. The Q j profiles of most terms appear qualitatively similar (Figure S4 in Additional file 1), and a comparison of average Q j values for life-cycle terms reveals that periodic, meiotic, and CDC-specific genes (in that order) are the most neutral (Figure 1B). In particular, a significant number of neutral genes are periodically expressed (Fisher's Exact test, FDR < 0.05; Figure 1E). Of the 88 GO Slim terms, only 5 terms have average Q j values less than 0.94 (the 95th percentile over Q j ; Table S3 in Additional file 1): helicase activity (0.76), extracellular region (0.86), cell wall (0.91), cellular component (0.92), and pseudohyphal growth (0.93). Of these, cell wall and extracellular region terms are enriched among the 1,643 genes with significant strain and time effects (FDR < 0.05). Thus, while it is not clear whether there is a functional aspect to expression divergence in temporal trajectories, among genes with the most strain divergence, specific functional categories are enriched within the set of temporally variable genes.
A hierarchical clustering of the entire CDC-transcriptome data set shows a complex inter-relationship among strains and timepoints, such that no strain's entire CDC-temporal expression and no timepoint's entire strain expression form a single clade (Figure S5 in Additional file 1); however, different timepoints from the same strain tend to be more similar than the same timepoints from different strains, indicating a general pattern of strain divergence. Notably, 17 of 18 timepoints for our S. paradoxus strain (YPS3395) cluster as a single clade, indicating their general distinction from S. cerevisiae expression. Yet only 457 genes (7.5% of the genome) show significant differential expression between S. paradoxus and the 8 woodland S. cerevisiae lines (t-test, FWER < 0.1), and no gene shows greater than a three-fold change in expression level. Surprisingly, the S. cerevisiae laboratory strain exhibits the most divergent dynamic expression profile in this clustering, beyond the S. paradoxus outgroup, despite having only 248 genes (4%) that are differentially expressed compared to woodland strains (FWER < 0.1) with a maximum fold change of 4.2. Thus, compared to S. paradoxus, the laboratory S. cerevisiae strain shows only slightly greater expression level divergence from woodland strains but for fewer genes, yet it shows a more distinct pattern of temporal divergence. One possibility is that the laboratory strain's CDC molecular physiology has become adapted to laboratory growth conditions , which is manifest in its CDC-transcriptome dynamics. Overall, these results indicate that while levels of expression show limited among-strain and between-species divergence, the dynamic pattern of expression displays significant temporal fluctuations, with broad among-strain and between-species divergence.
Divergence in CDC-temporal coexpression patterns is found at all scales of transcriptome organization
To evaluate the quantitative divergence in CDC-temporal expression following the qualitative patterns revealed by clustering analysis above, we first generated a 6,082 × 6,082 gene coexpression matrix for each strain by computing pairwise correlations between all CDC-temporal gene expression profiles and then calculated matrix correlation coefficients between coexpression matrices for all pairs of strains (Figure S6A in Additional file 1). Due to the extreme size of the matrices, all comparisons yield significant concordance in coexpression patterns (FDR < 0.01), but the degree of concordance is low (avg. R = 0.11), indicating most strains lack strong similarity in CDC-coexpression (that is, similar pairwise relationships between genes). Restricting these coexpression matrices to a subset of 266 transcriptional regulatory genes does not strengthen this pattern of weak association (avg. R = 0.12; Figure S6B in Additional file 1). Controls using replicated and simulated microarray data confirm this pattern (Text S1). As may be expected, S. paradoxus has the lowest coexpression correlation with other strains (avg. R = 0.047); however, S. cerevisiae strains YPS3137 and YPS2073 also have low correlations (0.055 and 0.068). The laboratory strain shows an average correlation of 0.12, indicating that its divergence in CDC-coexpression is typical compared to woodland strains. Thus, the laboratory strain appears to show pronounced divergence in overall CDC-transcriptome dynamics compared to other strains (see above) without markedly different coexpression relationships (that is, changes in regulation). Overall, we found considerable divergence in the genome-wide pattern of temporal coexpression.
Strain divergence in modular coexpression structure.
Sig. modules (%)
Overlap (% of diameter)
CDC regulatory architecture exhibits time-dependent changes in multi-dimensional complexity
To assess whether the CDC-directions correspond to biologically relevant axes of covariation, we identified the genes contributing the most to strain covariation in each major CDC-direction by correlation and determined the functional terms enriched among the top 5% of genes (Tables S6, S7 in Additional file 1). Significant terms vary by timepoint and include metabolic, periodic, ribosomal, and CDC life-cycle terms (FDR < 0.05). In addition, TATA regulatory motifs have been hypothesized to drive expression divergence via neutral drift . We found that TATA-associated genes project onto major CDC-directions 4-fold less than genes lacking TATA motifs, which are over-represented among the top 5% of genes (P < 0.01, Table S8 in Additional file 1). Also, few of the 152 genes with neutral CDC-expression are found among the top 5% (P < 10-5). This paucity of genes hypothesized to diverge neutrally argues against drift as a major force in strain diversification of CDC-directions. We also tested whether the major CDC-directions (of within-species covariation) are predictive of directions of between-species divergence, as might be expected for neutral species divergence . For each timepoint we calculated angular distance between the major S. cerevisiae CDC-direction and the displacement vector of S. paradoxus expression, oriented within S. cerevisiae CDC-space (for example, Figure S14 in Additional file 1). All angles exceed 45°, and no angle is significantly small (FWER < 0.15). Thus, within-species covariation does not predict the direction of between species divergence. However, release from α-factor, S-phase, and the G2/M transition have the smallest angles, suggesting that response to mating pheromone and DNA replication dynamics may be more constrained in evolutionary covariation.
We next evaluated whether the amount of variation projected onto the multivariate CDC-directions reveals a different, non-stabilizing pattern of selection compared to the pattern for individual genes. We computed F -statistics by comparing natural and mutational among-line expression variances projected onto each timepoint's CDC-directions. Although the average F -value over major CDC-directions U1(t) is 14.6-fold larger than the genome-wide average F -value (2.28 × 10-3 vs. 1.56 × 10-4, P = 1.5 × 10-4), all F -values remain significantly low, including those calculated for minor CDC-directions (FWER < 0.05). Therefore, multivariate patterns of transcriptome divergence are also consistent with stabilizing selection. However, the temporal profile of major multivariate F -values, unlike that for individual genes, exhibits peaks in expression variability (87, 176, 260, and 345 min.; Figure S15 in Additional file 1); the average peak is 1.4-fold greater than that at all other timepoints (P = 0.018) and 19.1-fold greater than the genome-wide average (P = 0.006). Intriguingly, these peaks in expression variability are preceded by large changes in the major axis of CDC-covariation (63, 152, 251, and 301 min.), occur just prior to CDC-phase transitions (97, 218, 267, and approximately 350 min.), and coincide with drops in regulatory complexity (latent factors; 176, 260, 345 min.) (Table S4 in Additional file 1; see also Figure 4B). In addition, reductions in regulatory complexity generally coincide with the CDC-phase transitions G1/S, G2/M, and M/G1 (48, 218, 260, 301 min.; except S/G2 at 111 min.), suggesting greater constraint on gene regulation through the influence of CDC checkpoints. Thus, temporal fluctuations in strain variability might reflect multi-genic pleiotropic effects being channeled to varying dimensions and directions of gene expression through a regulatory architecture that changes dynamically across CDC-phases .
Heterochronic changes in expression timing explain strain divergence for the majority of yeast genes
Our multivariate analysis of the architecture of genome-wide gene regulation argues that the broad pattern of CDC-transcriptome divergence among yeast strains is heavily influenced by dynamical changes in control. However, if this architecture of timing control involves a global cascade of regulation, any changes in control could cause broad negative pleiotropic effects throughout the CDC . Given our findings of strong stabilizing selection on both univariate and multivariate strain variation across the CDC, such a global, hierarchical architecture seems unlikely. Alternatively, this architecture may be organized into discrete modules of regulation that exhibit dynamically-autonomous timing control . Moreover, superposition of regulatory timing patterns from different modules could combinatorially generate the regulatory complexity required for transcriptome-wide timing control while minimizing negative pleiotropic effects.
Shared patterns of heterochrony reveal modular timing changes
Modular timing changes reflect coherent and dynamically-autonomous timing control
Furthermore, robustness of the yeast CDC against genetic , environmental , and dynamical perturbations  suggests the possibility that timing pattern variability both within and between modules might be limited by a form of negative selection, potentially canalizing selection [43–45], which could reinforce the coherence of modules as integrated developmental processes. Consistent with this, module-specific genes as a group show significantly low variation for timing change curves across strain comparisons (P = 0.0002), and when separated by module, their strain variation correlates with each module's estimated coherence (Spearman's r = -0.94, P = 0.0009). This suggests a relationship between within-module variability and among-strain variability in timing patterns (Text S7). In addition, variability among all timing patterns is also lower than expected and is time-dependent, suggesting the possibility of system-wide coordination and periodic synchronization of modular timing patterns (Text S8 and Figure S27 in Additional file 1). These results suggest that the CDC timing control architecture is comprised of a core of distinct, coherent, and dynamically-autonomous modules involving nearly 30% of the genome, combined with a layer of interactions between modules, which may potentially coordinate or synchronize expression timing globally.
Heterochronic expression of module-specific regulatory factors may explain modular timing changes
Heterochrony in module-specific transcription factors.
Sig. F -tests (Prop.)
Genes with complex heterochrony associate with multiple timing patterns
While we found 1,828 genes that strongly associate within individual timing modules (module-specific genes), another 1,887 genes (31%) instead show strong associations across timing modules (between-module genes); these between-module genes may exhibit a complex pattern of heterochrony. Our hypothesis of modular timing control suggests that negative pleiotropic effects due to changes in control may be minimized for genes with complex heterochrony by combinatorial regulation, using TFs with different timing patterns rather than the same timing pattern. First, we found no TF that significantly associates with the 1,887 genes with complex heterochrony compared to module-specific genes (FDR < 0.1). We also evaluated whether the number of module-specific TFs regulating a gene with complex heterochrony correlates with the number of timing modules represented by these TFs and obtained a rank correlation of 0.71 (P < 10-10). While some correlation is expected by chance, we found only three genes (Erg11, Sis1, and YMR196W) that are strictly regulated by multiple TFs from the same timing module (three TFs for each), suggesting that this type of regulation may be rare. Thus, genes that associate with multiple timing modules tend to be regulated by multiple different timing patterns. This suggests that complex patterns of heterochronic divergence could be generated combinatorially while minimizing negative pleiotropic effects.
Transcriptome divergence in the yeast cell-division cycle is highly time-dependent. While within-species divergence in genome-wide gene expression levels is consistent with strong stabilizing selection at each timepoint of the cell-division cycle, a large fraction of genes show significant divergence in their dynamical patterns of expression. In addition, the magnitude, direction, and degrees of freedom of transcriptome covariation change across the cell-division cycle, concordant with time-specific changes in regulatory complexity. While we could not test explicitly for the evolutionary mode of expression dynamics, we found that the major directions of within-species covariation associate with specific functional categories at different timepoints but not with neutrally-evolving genes; these directions do not predict the direction of between-species divergence for our outgroup S. paradoxus; and the S. cerevisiae laboratory strain shows extensive divergence in expression dynamics, comparable to S. paradoxus. These results suggest considerable potential for non-neutral evolution of expression dynamics, despite strong stabilizing selection on mean expression levels.
Since widespread divergence in transcriptome dynamics might be explained by extensive qualitative changes in gene coregulation, we assessed the similarity of gene coexpression structure across strains. Consistent with this possibility, we found significant divergence in genome-wide and modular coexpression structure, across the entire cell-division cycle and in a time-dependent manner. However, divergence in temporal coexpression does not assure divergence in coregulation; two genes may be coregulated yet exhibit distinct temporal expression trajectories (or vice-versa, for example, Figure 6B). Therefore we evaluated the possibility of heterochronic divergence, relating genes by shared changes in expression timing, rather than by similarity of expression levels (that is, coexpression). The majority of genes show timing changes consistent with heterochronic divergence, suggesting that evolution of the yeast CDC-transcriptome may be characterized as predominantly heterochronic rather than heterometric.
Genome-wide heterochronic divergence implies changes in the control of genome-wide timing patterns. However, changes in timing control (just like changes in coregulation) are expected to have negative pleiotropic consequences in natural populations, such as our yeast strains, given a global, cascading regulatory architecture. We hypothesized that negative pleiotropic effects could be minimized if regulatory architecture is instead organized into distinct timing modules which could exhibit different timing patterns. In support of this hypothesis, we found significant modularity in the genome-wide patterns of heterochrony, evidence supporting the coherence of timing modules as functionally integrated units, and dozens of transcription factors that are significantly associated with controlling these timing modules. Thus, widespread divergence in yeast transcriptome dynamics may be explained by heterochronic divergence in the temporal expression patterns of module-specific regulatory factors that in turn affect the timing of downstream gene expression events. Our results suggest that the short-term evolution of yeast regulatory architecture may entail preferentially quantitative changes in regulation, consistent with the established relationship between trans regulatory variation and expression divergence within species [9–13] and conservation of transcription factor binding patterns between species . Although our evidence supports the role of transcription factors specifically in driving heterochronic divergence, additional factors that regulate either the production or degradation of mRNA transcripts are likely to play a significant role. Future studies incorporating additional yeast strains or higher resolution time series data may facilitate identification of additional module-specific regulatory factors and help to reveal the fine-scale structure of timing control in the yeast cell-division cycle.
Our data suggest a new view of molecular cell processes as a collection of dynamically-autonomous event timelines whose modularity allows divergence in gene regulation, while alleviating system-wide negative effects of regulatory change. Control of gene expression may utilize a general architecture comprised of multiple discrete event timelines that serve as a basis set of timing patterns. Interactions among module-specific regulatory factors may determine individual event timelines, and superposition different timelines may generate combinatorial complexity in regulatory patterns. This modular dynamical architecture may facilitate the generation of complex regulatory variation via changes in the scheduling and coordination of discrete event timelines, while buffering variation in individual gene expression. In this way, the architecture of genome-wide timing control may bias a population's evolutionary dynamics.
Materials and methods
The ten natural S. cerevisiae and S. paradoxus strains are heterothallic haploid MATa derivatives of homothallic diploids. Woodland isolates were previously collected from state parks in Pennsylvania and New Jersey, USA  (Table S1 in Additional file 1). Laboratory strain YPS183 (HOΔ:kanMX, leu2Δ) derives from BY4741. Mating-type switching was prevented by homologous recombination of a Kanamycin resistance cassette at the HO endonuclease locus (YDL227C). The 23 mutation accumulation lines (provided by C. Zeyl ) are diploid and were propagated asexually for 600 generations from a Y55 ancestor (leu2Δ).
Synchronization and sampling of yeast cultures
Strains were inoculated from frozen stock and cultured overnight in synthetic dextrose (SD) minimal medium at 30°C (225 rpm). The next day cultures were diluted into fresh SD and upon reaching a culture density of OD ≈ 0.25, α -factor mating pheromone was added to a final concentration of 4 μM. Cultures were then incubated approximately 75 min. until arrested and synchronized in late G1. The state of synchronization was determined by the appearance of < 10% shmoos and < 10% budding cells, visualized by light microscopy (100 ×, oil). Cultures were released from arrest by removing α-factor: 2 × wash with 4°C S medium (SD without dextrose) and resuspension of cell pellets with fresh 18°C SD medium. Approximately 25 ml aliquots of each culture were distributed into 18 flasks and incubated at 18°C (225 rpm). Incubation of cultures at 18°C in SD medium more than doubles the CDC-period, allowing a more accurate comparison of measurements across strains by reducing temporal sampling variation.
The sampling time course consisted of 18 samples, taken at average intervals of 19 min. (real time), starting at 0 min. (time of release from arrest) and ending at 345 min. The first sample (0 min.) was taken after all flasks were returned to the incubator. Upon sampling, each culture was placed on dry ice, mixed with 20 ml of -20°C 100% EtOH in a 50 ml Falcon tube, inverted, and placed immediately into a -80°C freezer.
Microarray processing and analysis
Total RNA was extracted from each frozen cell culture sample using Qiagen's RNeasy Kit, following manufacturer's instructions. cDNA was prepared from 15 μg of each RNA sample using SuperScript III reverse transcriptase (Invitrogen) and compared directly to unsynchronized S. cerevisiae cDNA (YPS183 cultured at 30°C in YPD until reaching OD600 1.1) on 2-channel spotted-oligo glass microarrays in a common reference design. Invitrogen AlexaFluor 555 and 647 fluorophores were used to label each cDNA sample. Hybridized slides were incubated for 24-65 hours at 42°C. Slides were prepared for scanning by serial incubation in wash buffers and dried using both a vacuum and high-purity, filtered N2 gas.
Samples were hybridized to two dye-swapped microarrays. Unsynchronized MA line transcriptomes were produced with the same design. Corning UltraGAPS glass slides, spotted with the Operon AROS for Saccharomyces cerevisiae, V1.1, were used for all hybridizations. Each microarray targets 6388 protein-coding genes using two replicate spots per oligo, yielding four technical expression measurements per gene, strain, and timepoint. In total 378 time-series and 45 unsynchronized microarrays were produced for natural and MA lines, respectively. Data were quantified, filtered, and normalized, yielding expression measurements for 5879.9 genes per strain on average (92.4%). Measurements show a grand mean standard error (SE) of 0.175. Using two microarrays of the same strain independently cultured, synchronized, and sampled at 63 min., biological replicate measurement error was estimated as 0.554 (SE). Microarray data are available from the NCBI GEO database under accession number [GEO:GSE24237] and from the authors' web site .
A set of 91 transposable (Ty) element genes were excluded from the final data collection. The remaining 6,263 gene expression trajectories were imputed for missing data and calibrated to a common CDC-period of 267 min. using budding index measurements. A common set of 6,082 genes have CDC expression for all ten natural strains. Custom software written in Python, R, SAS, and Mathematica was used to carry out computational analyses as described in the Supplemental Materials and methods.
false discovery rate
family-wise error rate
linear discriminant analysis
latent factor analysis
principal component analysis
root mean squared error
singular value decomposition
We wish to acknowledge H. Murphy, C. Winter, F. Ge, E. Daugharthy, A. Goodman, and I. Gawlas for assistance, as well as M. Lee, P. Shah, and two anonymous reviewers for constructive criticism on the manuscript. This work is supported in part by a HRFF grant to the University of Pennsylvania from the Common Wealth of Pennsylvania and a NRSA Training Grant in Computational Genomics from the University of Pennsylvania (DFS). The funding bodies had no role in study design; in collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to submit the manuscript for publication.
- Rifkin SA, Kim J, White KP: Evolution of gene expression in the Drosophila melanogaster subgroup. Nat Genet. 2003, 33: 138-144. 10.1038/ng1086.PubMedView ArticleGoogle Scholar
- Rifkin SA, Houle D, Kim J, White KP: A mutation accumulation assay reveals a broad capacity for rapid evolution of gene expression. Nature. 2005, 438: 220-223. 10.1038/nature04114.PubMedView ArticleGoogle Scholar
- Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK: The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet. 2005, 37: 544-548. 10.1038/ng1554.PubMedView ArticleGoogle Scholar
- Simon I, Barnett J, Hannett N, Harbison CT, Rinaldi NJ, Volkert TL, Wyrick JJ, Zeitlinger J, Gifford DK, Jaakkola TS, Young RA: Serial regulation of transcriptional regulators in the yeast cell cycle. Cell. 2001, 106: 697-708. 10.1016/S0092-8674(01)00494-9.PubMedView ArticleGoogle Scholar
- Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, Zeitlinger J, Jennings EG, Murray HL, Gordon DB, Ren B, Wyrick JJ, Tagne JB, Volkert TL, Fraenkel E, Gifford DK, Young RA: Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2002, 298: 799-804. 10.1126/science.1075090.PubMedView ArticleGoogle Scholar
- Harbison CT, Gordon DB, Lee TI, Rinaldi NJ, Macisaac KD, Danford TW, Hannett NM, Tagne JB, Reynolds DB, Yoo J, Jennings EG, Zeitlinger J, Pokholok DK, Kellis M, Rolfe PA, Takusagawa KT, Lander ES, Gifford DK, Fraenkel E, Young RA: Transcriptional regulatory code of a eukaryotic genome. Nature. 2004, 431: 99-104. 10.1038/nature02800.PubMedPubMed CentralView ArticleGoogle Scholar
- Luscombe NM, Babu MM, Yu H, Snyder M, Teichmann SA, Gerstein M: Genomic analysis of regulatory network dynamics reveals large topological changes. Nature. 2004, 431: 308-312. 10.1038/nature02782.PubMedView ArticleGoogle Scholar
- Stearns SC, Magwene P: The naturalist in a world of genomics. Am Nat. 2003, 161: 171-180. 10.1086/367983.PubMedView ArticleGoogle Scholar
- Wittkopp PJ, Haerum BK, Clark AG: Evolutionary changes in cis and trans gene regulation. Nature. 2004, 430: 85-88. 10.1038/nature02698.PubMedView ArticleGoogle Scholar
- Borneman AR, Gianoulis TA, Zhang ZD, Yu H, Rozowsky J, Seringhaus MR, Wang LY, Gerstein M, Snyder M: Divergence of transcription factor binding sites across related yeast species. Science. 2007, 317: 815-819. 10.1126/science.1140748.PubMedView ArticleGoogle Scholar
- Wittkopp PJ, Haerum BK, Clark AG: Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet. 2008, 40: 346-350. 10.1038/ng.77.PubMedView ArticleGoogle Scholar
- Tirosh I, Reikhav S, Levy AA, Barkai N: A yeast hybrid provides insight into the evolution of gene expression regulation. Science. 2009, 324: 659-662. 10.1126/science.1169766.PubMedView ArticleGoogle Scholar
- McManus CJ, Coolon JD, Du MO, Eipper-Mains J, Graveley BR, Wittkopp PJ: Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 2010, 20: 816-825. 10.1101/gr.102491.109.PubMedPubMed CentralView ArticleGoogle Scholar
- Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L: Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 2003, 35: 57-64. 10.1038/ng1222.PubMedView ArticleGoogle Scholar
- Wang D, Sung HM, Wang TY, Huang CJ, Yang P, Chang T, Wang YC, Tseng DL, Wu JP, Lee TC, Shih MC, Li WH: Expression evolution in yeast genes of single-input modules is mainly due to changes in trans-acting factors. Genome Res. 2007, 17: 1161-1169. 10.1101/gr.6328907.PubMedPubMed CentralView ArticleGoogle Scholar
- Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH: Roles of cis-and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol. 2008, 25: 1863-1875. 10.1093/molbev/msn138.PubMedPubMed CentralView ArticleGoogle Scholar
- Sung HM, Wang TY, Wang D, Huang YS, Wu JP, Tsai HK, Tzeng J, Huang CJ, Lee YC, Yang P, Hsu J, Chang T, Cho CY, Weng LC, Lee TC, Chang TH, Li WH, Shih MC: Roles of trans and cis variation in yeast intraspecies evolution of gene expression. Mol Biol Evol. 2009, 26: 2533-2538. 10.1093/molbev/msp171.PubMedPubMed CentralView ArticleGoogle Scholar
- Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HHS, Lu MYJ, Wu SH, Li WH: Natural selection on cis and trans regulation in yeasts. Genome Res. 2010, 20: 826-836. 10.1101/gr.101576.109.PubMedPubMed CentralView ArticleGoogle Scholar
- Han JS, Szak ST, Boeke JD: Transcriptional disruption by the L1 retrotransposon and implications for mammalian transcriptomes. Nature. 2004, 429: 268-274. 10.1038/nature02536.PubMedView ArticleGoogle Scholar
- Stranger BE, Forrest MS, Dunning M, Ingle CE, Beazley C, Thorne N, Redon R, Bird CP, de Grassi A, Lee C, Tyler-Smith C, Carter N, Scherer SW, Tavaré S, Deloukas P, Hurles ME, Dermitzakis ET: Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science. 2007, 315: 848-853. 10.1126/science.1136678.PubMedPubMed CentralView ArticleGoogle Scholar
- De S, Teichmann SA, Babu MM: The impact of genomic neighborhood on the evolution of human and chimpanzee transcriptome. Genome Res. 2009, 19: 785-794. 10.1101/gr.086165.108.PubMedPubMed CentralView ArticleGoogle Scholar
- Bradley RK, Li XY, Trapnell C, Davidson S, Pachter L, Chu HC, Tonkin LA, Biggin MD, Eisen MB: Binding site turnover produces pervasive quantitative changes in transcription factor binding between closely related Drosophila species. PLoS Biol. 2010, 8: e1000343-10.1371/journal.pbio.1000343.PubMedPubMed CentralView ArticleGoogle Scholar
- Beer MA, Tavazoie S: Predicting gene expression from sequence. Cell. 2004, 117: 185-198. 10.1016/S0092-8674(04)00304-6.PubMedView ArticleGoogle Scholar
- Yuan Y, Guo L, Shen L, Liu JS: Predicting gene expression from sequence: a reexamination. PLoS Comput Biol. 2007, 3: e243-10.1371/journal.pcbi.0030243.PubMedPubMed CentralView ArticleGoogle Scholar
- Prill RJ, Iglesias PA, Levchenko A: Dynamic properties of network motifs contribute to biological network organization. PLoS Biol. 2005, 3: e343-10.1371/journal.pbio.0030343.PubMedPubMed CentralView ArticleGoogle Scholar
- Alexander RP, Kim PM, Emonet T, Gerstein MB: Understanding modularity in molecular networks requires dynamics. Sci Signal. 2009, 2: pe44-10.1126/scisignal.281pe44.PubMedPubMed CentralView ArticleGoogle Scholar
- Kim J, Kerr JQ, Min GS: Molecular heterochrony in the early development of Drosophila. Proc Natl Acad Sci USA. 2000, 97: 212-216. 10.1073/pnas.97.1.212.PubMedPubMed CentralView ArticleGoogle Scholar
- Somel M, Franz H, Yan Z, Lorenc A, Guo S, Giger T, Kelso J, Nickel B, Dannemann M, Bahn S, Webster MJ, Weickert CS, Lachmann M, Paabo S, Khaitovich P: Transcriptional neoteny in the human brain. Proc Natl Acad Sci USA. 2009, 106: 5743-5748. 10.1073/pnas.0900544106.PubMedPubMed CentralView ArticleGoogle Scholar
- Olson ME, Rosell JA: Using heterochrony to detect modularity in the evolution of stem diversity in the plant family Moringaceae. Evolution. 2006, 60: 724-734.PubMedView ArticleGoogle Scholar
- Moss EG: Heterochronic genes and the nature of developmental time. Curr Biol. 2007, 17: R425-34. 10.1016/j.cub.2007.03.043.PubMedView ArticleGoogle Scholar
- Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL: Genetic properties influencing the evolvability of gene expression. Science. 2007, 317: 118-121. 10.1126/science.1140247.PubMedView ArticleGoogle Scholar
- Gu Z, David L, Petrov D, Jones T, Davis RW, Steinmetz LM: Elevated evolutionary rates in the laboratory strain of Saccharomyces cerevisiae. Proc Natl Acad Sci USA. 2005, 102: 1092-1097. 10.1073/pnas.0409159102.PubMedPubMed CentralView ArticleGoogle Scholar
- Rifkin SA, Atteson K, Kim J: Constraint structure analysis of gene expression. Funct Integr Genomics. 2000, 1: 174-185. 10.1007/s101420000018.PubMedView ArticleGoogle Scholar
- Phillips PC, Arnold SJ: Hierarchical comparison of genetic variance-covariance matrices. I. Using the Flury hierarchy. Evolution. 1999, 53: 1506-1515. 10.2307/2640896.View ArticleGoogle Scholar
- Schluter D: Adaptive radiation along genetic lines of least resistance. Evolution. 1996, 50: 1766-1774. 10.2307/2410734.View ArticleGoogle Scholar
- Csete ME, Doyle JC: Reverse engineering of biological complexity. Science. 2002, 295: 1664-1669. 10.1126/science.1069981.PubMedView ArticleGoogle Scholar
- Gould SJ: Ontogeny and Phylogeny. 1977, Cambridge, MA: Harvard University PressGoogle Scholar
- Alberch P, Gould SJ, Oster GF, Wake DB: Size and shape in ontogeny and phylogeny. Paleobiology. 1979, 5: 296-317.Google Scholar
- Bonner JT: Size and Cycle: An Essay on the Structure of Biology. 1965, Princeton, NJ: Princeton University PressGoogle Scholar
- Winzeler EA, Shoemaker DD, Astromoff A, Liang H, Anderson K, Andre B, Bangham R, Benito R, Boeke JD, Bussey H, Chu AM, Connelly C, Davis K, Dietrich F, Dow SW, El Bakkoury M, Foury F, Friend SH, Gentalen E, Giaever G, Hegemann JH, Jones T, Laub M, Liao H, Liebundguth N, Lockhart DJ, Lucau-Danila A, Lussier M, M'Rabet N, Menard P, Mittmann M, Pai C, Rebischung C, Revuelta JL, Riles L, Roberts CJ, Ross-MacDonald P, Scherens B, Snyder M, Sookhai-Mahadeo S, Storms RK, Véronneau S, Voet M, Volckaert G, Ward TR, Wysocki R, Yen GS, Yu K, Zimmermann K, Philippsen P, Johnston M, Davis RW: Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science. 1999, 285: 901-906. 10.1126/science.285.5429.901.PubMedView ArticleGoogle Scholar
- Hughes TR, Marton MJ, Jones AR, Roberts CJ, Stoughton R, Armour CD, Bennett HA, Coffey E, Dai H, He YD, Kidd MJ, King AM, Meyer MR, Slade D, Lum PY, Stepaniants SB, Shoemaker DD, Gachotte D, Chakraburtty K, Simon J, Bard M, Friend SH: Functional discovery via a compendium of expression profiles. Cell. 2000, 102: 109-126. 10.1016/S0092-8674(00)00015-5.PubMedView ArticleGoogle Scholar
- Li F, Long T, Lu Y, Ouyang Q, Tang C: The yeast cell-cycle network is robustly designed. Proc Natl Acad Sci USA. 2004, 101: 4781-4786. 10.1073/pnas.0305937101.PubMedPubMed CentralView ArticleGoogle Scholar
- Wagner GP, Altenberg L: Perspective: complex adaptations and the evolution of evolvability. Evolution. 1996, 50: 967-976. 10.2307/2410639.View ArticleGoogle Scholar
- Willmore KE, Young NM, Richtsmeier JT: Phenotypic variability: its components, measurement and underlying developmental processes. Evol Biol. 2007, 34: 99-120. 10.1007/s11692-007-9008-1.View ArticleGoogle Scholar
- Landry CR: Systems biology spins off a new model for the study of canalization. Trends Ecol Evol. 2009, 24: 63-66. 10.1016/j.tree.2008.10.004.PubMedView ArticleGoogle Scholar
- MacIsaac KD, Wang T, Gordon DB, Gifford DK, Stormo GD, Fraenkel E: An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinformatics. 2006, 7: 113-10.1186/1471-2105-7-113.PubMedPubMed CentralView ArticleGoogle Scholar
- Amorim MJ, Cotobal C, Duncan C, Mata J: Global coordination of transcriptional control and mRNA decay during cellular differentiation. Mol Syst Biol. 2010, 6: 380-10.1038/msb.2010.38.PubMedPubMed CentralView ArticleGoogle Scholar
- Jensen LJ, Jensen TS, de Lichtenberg U, Brunak S, Bork P: Co-evolution of transcriptional and post-translational cell-cycle regulation. Nature. 2006, 443: 594-597.PubMedGoogle Scholar
- Choi JK, Kim YJ: Epigenetic regulation and the variability of gene expression. Nat Genet. 2008, 40: 141-7. 10.1038/ng.2007.58.PubMedView ArticleGoogle Scholar
- Sniegowski PD, Dombrowski PG, Fingerman E: Saccharomyces cerevisiae and Saccharomyces paradoxus coexist in a natural woodland site in North America and display different levels of reproductive isolation from European conspecifics. FEMS Yeast Res. 2002, 1: 299-306.PubMedGoogle Scholar
- Zeyl C, DeVisser JA: Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics. 2001, 157: 53-61.PubMedPubMed CentralGoogle Scholar
- Comparative Yeast Time-Series Gene Expression. [http://kim.bio.upenn.edu/software/yeast-cdc.shtml]