Integration of Arabidopsis thaliana stress-related transcript profiles, promoter structures, and cell-specific expression
Genome Biology volume 8, Article number: R49 (2007)
Arabidopsis thaliana transcript profiles indicate effects of abiotic and biotic stresses and tissue-specific and cell-specific gene expression. Organizing these datasets could reveal the structure and mechanisms of responses and crosstalk between pathways, and in which cells the plants perceive, signal, respond to, and integrate environmental inputs.
We clustered Arabidopsis transcript profiles for various treatments, including abiotic, biotic, and chemical stresses. Ubiquitous stress responses in Arabidopsis, similar to those of fungi and animals, employ genes in pathways related to mitogen-activated protein kinases, Snf1-related kinases, vesicle transport, mitochondrial functions, and the transcription machinery. Induced responses to stresses are attributed to genes whose promoters are characterized by a small number of regulatory motifs, although secondary motifs were also apparent. Most genes that are downregulated by stresses exhibited distinct tissue-specific expression patterns and appear to be under developmental regulation. The abscisic acid-dependent transcriptome is delineated in the cluster structure, whereas functions that are dependent on reactive oxygen species are widely distributed, indicating that evolutionary pressures confer distinct responses to different stresses in time and space. Cell lineages in roots express stress-responsive genes at different levels. Intersections of stress-responsive and cell-specific profiles identified cell lineages affected by abiotic stress.
By analyzing the stress-dependent expression profile, we define a common stress transcriptome that apparently represents universal cell-level stress responses. Combining stress-dependent and tissue-specific and cell-specific expression profiles, and Arabidopsis 5'-regulatory DNA sequences, we confirm known stress-related 5' cis-elements on a genome-wide scale, identify secondary motifs, and place the stress response within the context of tissues and cell lineages in the Arabidopsis root.
Knowledge about responses of the model plant Arabidopsis thaliana to abiotic or biotic stresses has accumulated during the past decade, based on large-scale mutant analyses and genome-wide transcript profiles. In particular, random mutagenesis combined with cell-specific or treatment-specific reporter gene expression has identified many players in the stress response, whereas microarray-based observations have revealed transcriptional responses to stresses on a genome-wide scale [1–4]. However, most analyses have been restricted to individual genes or treatments. Plant-specific databases, such as The Arabidopsis Information Resource (TAIR), Genevestigator, and the Nottingham Arabidopsis Stock Centre (NASC), have begun to collect data from various sources and merge them with genome sequence-based features [5–8]; however, the data typically exist in isolation. Integrating these diverse datasets remains a significant challenge in the assembly of a unifying picture of plant responses to environmental effects. For this purpose, various tools have been developed, such as MapMan and STKE (Signal Transduction Knowledge Environment), which begin to link individual genes to pathways or coregulation circuits [9, 10]. Here, we present an alternative approach to integrating different datasets related to plant stress responses.
In Arabidopsis, as in all organisms, a variety of stress factors that disturb homeostatic conditions bring about ubiquitous as well as distinct responses at the transcription level. Identification of ubiquitous, cell autonomous responses is based on monitoring the status of macromolecules in cells, gauging DNA damage, protein degradation, or lipid membrane integrity, and eliciting pathways that carry out repair functions. The degree of damage will trigger this common response, which must be distinguished from a set of reactions that recognize and respond to specific stress conditions. Identifying the genes that determine the specific responses and then separating them into distinct groups, functional categories, and pathways is an important task that must be undertaken if we are to elucidate how plants sense and recognize the environment, and then embark upon a meaningful defense that will alleviate the stress condition. The approach presented here aims to define the distinction between ubiquitous and specific stress response categories. The few transcript profiling studies that have addressed specificity and crosstalk of different stress treatments did not include the majority of the Arabidopsis genes [1, 3, 4].
Control over gene expression is in part determined by motifs, cis-elements, within the promoter sequence of regulated genes. In plants, distinct motifs have been correlated with responses to individual treatments, resulting in discovery of a number of motifs related to stress responses and developmental or organ-specific regulation. Among these motifs, those responding to light and osmotic and cold stress treatments have been analyzed most intensely [12, 13]. Databases dedicated to plant promoter motifs have been established, based on motif identification in single or, at most, a few genes [14, 15]. How their competence in regulating gene expression is mirrored at the genome level has not been tested.
Here, we applied the fuzzy k-means clustering method to publicly available microarray data from the AtGenExpression project to compare the response of Arabidopsis to a variety of abiotic and biotic stresses that disturb homeostatic conditions. The results revealed common as well as distinct pathways that govern changes in the expression of induced and repressed genes in response to various treatments. Based on the collection of motifs in the Plant cis-acting Regulatory DNA Elements (PLACE) database, clusters of coregulated genes were screened for over-represented cis-elements within their promoters. In addition, gene expression profiles identifying cell lineages in Arabidopsis roots were used to correlate the cell type-specific response to various stresses in the root [18, 19]. Integration of information from previously unconnected databases provided surprising insights about genes and pathways that classify the evolutionarily conserved cell-based common stress response, and the divergent pathways that organize abscisic acid (ABA)-dependent and ABA-independent reactions to stress in a tissue-specific manner.
Results and discussion
An analysis of the Arabidopsis abiotic and biotic transcriptome is presented in four sections (Figure 1). First, the overall clustering pattern for 22,746 probes in response to different environmental and chemical stress conditions was analyzed. This was followed by analysis of a 'common stress transcriptome', which unites genes that respond to any deviation from homeostasis. Then, an analysis of 5'-motifs defined promoter structures - cis-elements - that are characteristic for individual clusters of stress-responsive genes, focusing on clusters containing induced genes (2,715 genes in total) and on the few large clusters (5,998 genes) containing stress-repressed genes. Finally, cell-specific and tissue-specific responses to a variety of stresses were determined by integrating the clusters defining stress specificity with the gene expression map established for the Arabidopsis root. This analysis provided intersections between stress and tissue or cell specificity.
Clustering of different stress response categories
The fuzzy k-means clustering method [16, 20, 21] was applied to the probe sets (22,746 in total) printed on Affymetrix Arabidopsis ATH1 chips, which correspond to about 22,400 genes. In the following analysis, we treated each probe set as a gene. The external conditions selected included treatments with a variety of biotic and abiotic stresses included in AtGenExpress, as outlined in a previous analysis that focused on a subset of salt-responsive genes. Additionally included were results for different light conditions and exposures of plants to chemicals and growth regulators such as t-zeatin, tri-iodobenzoic acid, AgNO3, and cycloheximide. The chemical treatments were included because we expected them to increase the resolving power of the analysis. Considering the large number of genes to be analyzed, fuzzy k-means clustering was conducted initially with a large number of centroids (k = 320). Subsequently, 10,490 genes with significant membership values emerged from the dataset, which, with the cutoff set at a membership value of 0.035, most parsimoniously assembled into 180 clusters. The composition of 28 clusters (N0 to N27) is shown in Figure 2 and the entire set is included in Additional data files 2 and 3.
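The clustering step can be illustrated with a minimal sketch. It implements the textbook fuzzy c-means update (fuzzifier m = 2, random initial memberships), which is an assumption made here for illustration and not necessarily the exact fuzzy k-means variant cited above [16, 20, 21]. The function names, the default parameters, and the toy data in the usage note are hypothetical. Only the idea of keeping every cluster whose membership exceeds a cutoff (0.035 in the text) is taken from the description above.

```python
import math
import random

def fuzzy_cmeans(profiles, k, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means. Returns (centroids, memberships).

    profiles: list of equal-length lists (one expression profile per gene).
    m: fuzzifier (assumed value; not stated in the text).
    """
    rng = random.Random(seed)
    n, dim = len(profiles), len(profiles[0])
    # Start from random memberships; each gene's row sums to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        total = sum(row)
        U.append([u / total for u in row])
    centroids = []
    for _ in range(n_iter):
        # Centroids are membership-weighted means of the profiles.
        centroids = []
        for c in range(k):
            w = [U[i][c] ** m for i in range(n)]
            w_sum = sum(w)
            centroids.append(
                [sum(w[i] * profiles[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )
        # Memberships fall off with Euclidean distance to each centroid.
        shift = 0.0
        for i in range(n):
            inv = [max(math.dist(profiles[i], c), 1e-12) ** (-2.0 / (m - 1.0))
                   for c in centroids]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row
        if shift < tol:
            break
    return centroids, U

def clusters_above_cutoff(U, cutoff):
    """For each gene, the indices of clusters whose membership exceeds cutoff."""
    return [[c for c, u in enumerate(row) if u > cutoff] for row in U]
```

For example, `centroids, U = fuzzy_cmeans(profiles, k=2)` followed by `clusters_above_cutoff(U, 0.035)` lists, for every gene, all clusters in which it holds a membership above the cutoff. Because memberships are graded, a gene can belong to more than one cluster, which is the property that distinguishes this approach from hard k-means.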
The 'limma' statistical program was applied to the Affymetrix dataset to identify differentially regulated genes. Of the 22,746 probe sets, 14,015 were differentially expressed in at least one condition (P < 10⁻¹⁵). Among the 10,490 significant genes included in the clustering analysis, 8,520 were differentially expressed and 1,970 were not significantly regulated. This nonregulated category includes 879 (out of 884) and 119 (out of 131) genes from clusters N6 and N53, respectively. Genes in cluster N6 were not regulated under most conditions, whereas genes in cluster N53 exhibited a very small induction in osmotically-stressed roots only (see Additional data file 4). The separation of clusters N6 and N53 reflects the discriminative power of fuzzy k-means clustering and sensitivity in identifying even minute differences in expression patterns. The remaining nonregulated genes were mainly found in downregulated clusters. In the following analysis of common stress responses and promoter motifs, we focus our attention on the 8,520 differentially expressed genes.
The majority of these 8,520 genes was concentrated in a few large clusters. The most highly populated 15 clusters, each including more than 100 genes, totaled 5,478 genes, or more than 60% of all significantly clustered transcripts. The largest clusters, namely N0, N2, N5, and N18, included 699, 1,206, 705, and 430 genes, respectively. ABA, which acts as an important signaling molecule under a variety of different stress conditions, induced the expression of genes in clusters N3, N9, N10, N12, N13 and N20, whereas genes in clusters N0, N11, N16, N19 and N28 did not respond to ABA (Figure 2). Genes in clusters N1 and N8 were induced by light, and those in cluster N1 were additionally repressed in response to biotic stress treatments. Genes in cluster N27 were induced by jasmonic acid (JA) treatment, as well as by salt and wounding stresses. Large clusters in which gene expression was generally repressed by environmental stresses included N2, N4, N5, N7, N15, and N18. All genes are identified in the Additional data files.
The 'universal stress response transcriptome': cluster N12
The 197 genes in cluster N12 (Figure 2) are induced by a broad range of diverse stress conditions: cold, osmotic, salinity, wounding, and biotic stresses (including treatments with elicitors). The 'limma' analysis indicated that approximately 80% of these genes were significantly regulated under all treatment conditions, whereas the rest of the included genes were marginally regulated in one (mostly the wounding treatment) but significantly regulated in all other conditions (P < 0.01; Table 1; Additional data file 5). They appear to represent a common or universal stress response transcriptome because most of these genes are conserved among plants, animals and fungi, and are stress regulated in all organisms, with the inclusion of a few genes related to the plant-specific hormones ABA and JA (Figure 3 and Table 1). Several Gene Ontology (GO) categories were enriched among these genes: GO:0009611 (response to wounding), GO:0009613 (response to pest, pathogen, or parasite), GO:0006970 (response to osmotic stress), GO:0009737 (response to ABA stimulus), GO:0009651 (response to salt stress), GO:0009723 (response to ethylene stimulus), GO:0009751 (response to salicylic acid stimulus), GO:0009753 (response to JA stimulus), GO:0050832 (defense response to fungi), GO:0006839 (mitochondrial transport), and GO:0008270 (zinc ion binding). Signaling pathways related to mitogen-activated protein kinase (MAPK), calcium, reactive oxygen species (ROS), phospholipids, apoptosis, and protein degradation were induced. Equally, part of this cluster of genes that generally are upregulated by stress is functionally related to vesicle transport and mitochondrial functions. N12 included induced genes that had previously been identified as related to or specific for biotic stresses, but these were also induced by abiotic stresses, and vice versa.
Past restrictions in the scope of analyses, which typically focused on single treatment conditions, and the resulting problem of annotation stringency did not compromise the fuzzy k-means clustering analysis. We discuss these universal stress response genes by organizing them into different pathways (Figure 3).
Several MAPK pathways, organized into signaling cascades, are conserved in eukaryotic organisms [23, 24]. In Saccharomyces cerevisiae, for example, the high osmolarity glycerol (HOG) signaling pathway is responsible for osmotic stress sensing [25, 26]. The Arabidopsis AtHK1, MEK1, MPK4, and MPK6 can complement yeast deletion mutants of the HOG pathway. Other examples of plant MAPKs are alfalfa stress-induced MAPK (SIMK), tobacco salicylic acid-induced protein kinase (SIPK), wound-induced protein kinase (WIPK), and Nicotiana Fus-3-like kinase6 (Ntf6).
Among common genes that are upregulated by stress, several MAPK components were identified: MPK5, MKK9, and MAPKKK14. The MAPK pathway has been suggested to be involved in ethylene signaling [27–29]. Included among ubiquitous stress-regulated genes is also ACS6, encoding the rate-limiting enzyme of ethylene biosynthesis and a substrate for MPK6, together with six ERF/AP2 transcription factors (AtERF). This implicates the ethylene signaling-mediated engagement of a subset of the MAPK family as a component of the common stress response.
However, the ethylene response transcriptome is not strictly clustered in the stress transcriptome, notwithstanding its importance in developmental processes such as fruit ripening. When we incorporated the results from a study that measured transcript changes in Arabidopsis Col-0 wild-type into the cluster structure obtained by fuzzy k-means, the significantly ethylene-regulated genes identified in that study were located in a large number of different clusters.
The yeast Snf1 protein kinase and the mammalian AMP-activated protein kinase act as metabolic sensors that monitor cellular AMP and ATP levels. Activation increases the ATP:AMP ratio. Snf4 is part of the Snf1 protein kinase complex. In higher plants, Snf1-related kinases (SnRKs) are involved in responses to environmental or nutritional stress. Related common stress-induced genes were CIPK11 (encoding a Snf1-related protein kinase that is similar to SOS2, a protein kinase that is involved in plant salinity stress responses), SKIP2 (a conserved SCF ubiquitin ligase subunit that interacts with SnRKs), and AZF2 and ZAT10 (C2H2 zinc finger proteins). Both AZF2 and ZAT10 suppressed the Snf4 deficiency in yeast and function as transcription repressors in Arabidopsis [33, 34]. ZAT10 can activate salt stress tolerance, controlled in yeast by MSN2 and MSN4 factors, and ZAT10 can repress the expression of the plant stress gene RD29A. Several Snf1-related genes appeared in stress-induced clusters other than N12 as well, suggesting functions that are specific for particular stress conditions (data not shown).
Bax inhibitor 1: endoplasmic reticulum stress
Bax inhibitor 1 (BI-1) is an endoplasmic reticulum (ER) protein that suppresses cell death induced by ER stress in both animal and plant cells. It inhibits the activation of Bax and its translocation to mitochondria, suppresses the activation of caspase, and reduces calcium release from the ER. In Arabidopsis, Bax over-expression causes ROS accumulation and cell death, and BI-1 attenuates the cell death effect without affecting production of ROS [37, 38]. BI-1 also alleviated cell death caused by biotic and abiotic stresses. BI-1 (At5g47120), one of three genes in Arabidopsis with this sequence signature, was induced by several other stresses in a specific manner as well, and appears to represent a signature gene and protein of the common stress response cluster.
Although mechanisms of vesicle transport have been studied extensively, little is known about regulation in response to stress. A plant vesicle-related protein, AtVAMP, when ectopically expressed, can suppress Bax-induced apoptosis in yeast, possibly by improving membrane repair. The over-expression of AtRab7, a gene that is involved in regulation of vesicle trafficking, increased endocytosis in roots, as well as salt and osmotic stress tolerance. This indicates the importance of regulated vesicle trafficking for acquisition of stress tolerance.
Several genes related to trafficking from endosomes to central vacuoles were placed into N12. They are SYP21, Vps28-related, Tsg101-related, SRC2, Ras-related GTPase, and genes for two Snf7 family proteins. In roots, the Tsg101-related and Vps28-related genes, as well as SYP21 and one gene encoding a Snf7-like protein are specifically expressed in the endodermis of the root hair zone.
A multitude of signaling molecules is generated from membrane phospholipids. Their involvement in osmotic stress responses has been demonstrated. Several related genes are induced, such as those encoding inositol polyphosphate 5-phosphatase II, FYVE domain-containing phosphatidylinositol-4-phosphate 5-kinase (PI4P5K), and lipase class 3 family proteins. PI4P5K leads to the synthesis of PI4,5P2. Mutations in the offsetting phosphatase gene, SAC9, lead to over-accumulation of PI4,5P2 and constitutive expression of stress-response pathways [42, 43]. The product of the BAP1 gene, which is also upregulated, interacts with BON1, a protein with two C2 domains that binds to phospholipids. Together, BAP1 and BON1 control plant growth homeostasis.
Reactive oxygen species
ROS have been associated with stress sensing and signaling, but have emerged more recently as important, general signals [45–47]. Irrespective of their ubiquitous presence, ROS that derive from different stimuli appear to be recognized as specific, indicating that a number of different signal mediators must exist. We suggest that cluster N12 identifies the evolutionarily conserved set of these genes. SRO5, a gene that controls ROS in plants, is upregulated by various stresses. SRO5 transcript expression overlaps partially with that of P5CDH mRNA. The induction of SRO5 leads to production of a 24-nucleotide nat-siRNA that guides cleavage of P5CDH mRNA, resulting in regulated proline levels. Additionally, ZAT12, and possibly ZAT10 of the Snf1 pathway, also participate in ROS signal transduction.
Multiple calcium-related functions are induced by stresses. Among them is a SOS2-like protein kinase, namely CIPK11. However, little is known about the other genes in this group, including two calmodulins, three calcium-binding proteins, and three calcium-dependent kinases. These calcium-related genes cannot be organized into a pathway-like structure, in part because of the lack of detailed experimental evidence and also based on the multiplicity of functions that are channeled through calcium-binding proteins.
The transcription machinery and transcription factors
CCR4 and CCR4-associated factor 1 (CAF1) are critical for mRNA turnover in yeast. Pcf11 is an mRNA 3'-end processing factor and binds the carboxyl-terminal domain of the largest subunit of RNA polymerase II. Both CAF1 and Pcf11 have their Arabidopsis homologs upregulated by different stresses, indicating a role for control over mRNA processing and degradation. Another upregulated gene is the eukaryotic translation initiation factor SUI1. Other examples are AZF2 and ZAT10, which encode transcription repressors.
Stress-related transcription factors were also among the common stress response genes, including five WRKY family members, four Myb, three HSF, three NAM and two AP2, and the transcription factor SCL13. Included are WRKY18 and WRKY40, which physically interact with both overlapping and antagonistic roles in pathogen responses . WRKY25 and WRKY33 are substrates of MKS1, which itself is a substrate of MPK4 and regulates plant defense reactions . WRKY33 is also required for resistance to necrotrophic fungal pathogens . WRKY11 interacts with calmodulin and acts as a negative regulator of basal resistance in Arabidopsis . SCL13 has been shown to function in light signaling . These WRKYs function in resistance to necrotrophic but not biotrophic pathogens, whereas necrotrophic damage is more closely related to the physical damage caused by abiotic stresses, as also reflected in the cluster structures. Little information is available for other transcription factors in cluster N12, although several isoforms of Myb, NAM, HSF, and AP2 not included in N12 have been associated before with stress response pathways.
Among the genes upregulated by many stress treatments, several are localized to mitochondria. They are three BCS1-like ATPases (which could function as chaperones, whose yeast homologs are required for cytochrome bc complex assembly), two DIC1-like, one ANT1-like, one MTM1-like, and one other mitochondrial substrate carrier family protein. Furthermore, a ferrochelatase I gene, an NADH dehydrogenase-related gene, and a PP2C are part of this group. Also upregulated here was the Bax-inhibitor 1 gene. To appreciate their precise functions in plants, more studies are required.
ABA-related: RPK1 and CYP707A3
Among the common stress response genes were two ABA-related genes, RPK1 and CYP707A3. RPK1 encodes a leucine-rich repeat receptor-like kinase 1, a membrane-bound regulator of ABA early signaling. The rpk1 mutant exhibited decreased sensitivity to ABA, and over-expression resulted in hypersensitivity. CYP707A3 encodes a cytochrome P450 protein catalyzing ABA 8'-hydroxylation and catabolism. Its knockout mutant exhibited exaggerated ABA-inducible gene expression and enhanced drought tolerance, whereas over-expression was associated with growth retardation by ABA and increased transpiration.
ADC2, a rate-limiting enzyme in polyamine (PA) biosynthesis
ADC genes are essential for polyamine (PA) production. Over-expression of ADC2 led to GA-deficient plants and accumulation of putrescine, a phenotype reversed by GA3. The null mutant adc2-1 was sensitive to salt stress, but could be rescued by external putrescine. ADC2 is among the common stress response genes.
RelA/SpoT, RSH2, and the 'stringent response' in bacteria
The stringent response is crucial for stress adaptation in bacteria, mediated by the production of the nucleotide guanosine-3',5'-(bis-)pyrophosphate (ppGpp). RelA and SpoT encode bacterial enzymes for ppGpp synthesis. RSH is the higher plant homolog of this RelA/SpoT protein [60, 61].
NHL3, PBS1, and PUB17
These genes function in resistance to the bacterial pathogen Pseudomonas syringae pv. tomato DC3000 carrying avirulence proteins [51, 62, 63], and they - as identified here - were also induced by various abiotic stresses. Interestingly, NHL3 over-expression in Arabidopsis enhances resistance to the virulent Pseudomonas syringae pv. tomato DC3000, without an increase in PR gene expression or H2O2 accumulation . PBS1 and RPS5 are required for avrPphB mediated Pseudomonas syringae resistance in Arabidopsis. AvrPphB can proteolytically cleave PBS1, which is required for RPS5-mediated resistance . PUB17 is a U-box ARMADILLO repeat E3-ligase, which regulates cell death and defense . Another disease resistance family protein, similar to Cf-2.1 (At2g34930), is also upregulated by various stresses. Its null mutant was particularly susceptible to fungus attack . The inclusion of these genes in cluster N12 suggests their function in common mechanisms that counter both abiotic and biotic stresses.
Genes with unknown or unclear functions
An additional 120 genes are included in the common stress response cluster (ST3). In part, their functions are known by specific activities (for example, trehalose-6-phosphate phosphatase), whereas most are identified only by domain identifiers (for example, protease-associated or thioredoxin family-related), or their functions are not clear or completely unknown. The group included transcripts for 19 zinc-finger family proteins, five protein kinases, four protein phosphatases, a number of glycosyl hydrolases, thioredoxins, cytochromes P450, and hormone-responsive functions, mostly annotated according to similarity criteria, and 40 expressed proteins without any annotation. Among the genes that lack annotation, the majority is most strongly induced by conditions that affect redox homeostasis and ROS responses, in particular treatments with ozone, H3BO3, H2O2, AgNO3, hypoxia, and triiodobenzoic acid (an inhibitor of polar auxin transport; Genevestigator dataset).
The high correlation of genes in cluster N12 with experimentally verified or alleged functions in a wide variety of stress conditions in species across all kingdoms suggests that the functions identified by this cluster categorize the basic stress response transcriptome (Figure 3). By their nature, these functions appear to identify ubiquitous cellular stress defense programs in all organisms, whereas pathways that integrate stress responses at the organ or organism levels may be based on programs that diverged during evolution. Conceivably, reverse genetics will determine the functions of little understood and completely unknown genes in N12, and provide a clear separation of these genes from pathways that are specific to individual stress conditions. The common stress response genes epitomize components of crosstalk between biotic and abiotic stress response mechanisms by identifying genes such as WRKY transcription factors, NHL3, and PUB17. Indeed, the Arabidopsis mutant bos1 exhibited compromised resistance to the pathogen Botrytis cinerea and reduced tolerance to drought, high salinity, and oxidative stress .
Identification and analysis of regulatory motifs
Other clusters (Figure 2; ST1) separated the data into distinct groups, with groups of upregulated or downregulated genes with various groupings indicating dependence or independence of the action of hormones (ABA, ethylene, JA). Generally, all clusters included many genes with unknown functions but also a variable number of genes for which a relationship with a specific stress has been documented. One task was to analyze correlations between stress clusters and the presence and nature of regulatory motifs in their promoters.
We analyzed cis-elements, which are conserved motifs in the 5'-region of genes with a key role in assembling the transcription machinery. For each gene, the 1,000 base pairs upstream of the translation initiation codon were extracted from the genome sequence, and genes in each cluster were scanned for motifs listed in the PLACE database. The occurrence of these motifs was compared with their frequency among all promoters in the genome. A P value was then calculated for every motif and cluster combination, based on the hypergeometric distribution. We considered motifs with P values lower than 10⁻⁴ to be significantly over-represented. Motifs that were identified are listed in Table 2 and discussed below.
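The over-representation test can be sketched as an upper-tail hypergeometric probability. The sketch assumes that each promoter is scored for presence or absence of a motif, and that the question is how likely it is to see at least the observed number of motif-carrying promoters in a cluster drawn from the genome-wide set. Whether the original analysis counted promoters or individual motif occurrences is not stated above, so this scoring is an assumption, and the function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(n_genome, n_genome_with_motif,
                         n_cluster, n_cluster_with_motif):
    """P(X >= n_cluster_with_motif) for a hypergeometric draw.

    n_genome: promoters in the genome-wide background.
    n_genome_with_motif: background promoters that carry the motif.
    n_cluster: promoters in the cluster (the sample drawn without replacement).
    n_cluster_with_motif: cluster promoters that carry the motif.
    """
    total = comb(n_genome, n_cluster)
    upper = min(n_cluster, n_genome_with_motif)
    # Sum the probability mass from the observed count up to the maximum.
    tail = sum(
        comb(n_genome_with_motif, x)
        * comb(n_genome - n_genome_with_motif, n_cluster - x)
        for x in range(n_cluster_with_motif, upper + 1)
    )
    return tail / total
```

A motif would then be called over-represented in a cluster when the returned value falls below the chosen threshold (10⁻⁴ in the text). Exact integer binomials keep the calculation accurate for the very small P values that arise with genome-sized backgrounds.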
Genes in upregulated clusters
The WB-BOX motif TTTGACT was identified in clusters N0, N11, and N19. Genes in clusters N0 and N19 were generally induced by abiotic stresses, whereas genes in cluster N11 were upregulated markedly in roots by salt treatment. The WB-BOX represents a binding site for WRKY transcription factors , which have 12, 4, and 5 members in clusters N0, N11, and N19, respectively. Among other clusters, only N12 included a number of WRKY factors (five in total). It seems that WRKYs correlate well with pathogen response activity. Genes in cluster N0 were also induced by osmotic and ionic stresses, and the additional HSF motif (heat shock factor binding) (A|G)GAANNTTC was over-represented in this cluster (with N representing any nucleotide). Also, two HSF transcription factors are included in this cluster.
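The motif notation used here, with alternatives such as (A|G) and N for any nucleotide, maps directly onto regular expressions. The following sketch shows one way to test a promoter sequence for such a motif. Searching the reverse complement as well as the given strand is an assumption, since the text does not say whether both strands were scanned, and the helper names are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def motif_to_regex(motif):
    """Translate the motif notation into a compiled regular expression.

    Alternations such as (A|G) are already valid regex syntax;
    N is expanded to match any of the four bases.
    """
    return re.compile(motif.upper().replace("N", "[ACGT]"))

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def has_motif(promoter, motif, both_strands=True):
    """True if the motif occurs in the promoter (optionally on either strand)."""
    pattern = motif_to_regex(motif)
    seq = promoter.upper()
    if pattern.search(seq):
        return True
    return both_strands and pattern.search(reverse_complement(seq)) is not None
```

For instance, `has_motif(promoter, "(A|G)GAANNTTC")` checks a promoter for the HSF motif and `has_motif(promoter, "TTTGACT")` for the WB-BOX. Counting the promoters in a cluster for which this returns true gives the per-cluster motif frequency that is compared against the genome-wide background.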
Genes in clusters N1 and N8 were responsive to light treatment. The ABRE motif ACGTG(G|T)C was identified in both clusters, together with an unknown motif, namely (A|G)ACCACA(A|G). ACGTG(G|T)C is similar to the G-Box motif that mediates light signaling . Also identified in cluster N1 was the I-Box motif GATAAG(A|G).
Clusters N3, N9, N10, N12, N13, and N20 were induced by ABA treatment to variable degrees. The ABRE motif was over-represented in these clusters, but only cluster N3 contained the ABRE-binding proteins AREB1 and AREB2 . The DRE motif (A|G)CCGAC was identified in the clusters N3, N9, and N12, which is in agreement with the strong induction by cold stress of genes in these clusters . Within these three clusters, four, two, and one DRE-binding (DREB) transcription factors were included, respectively. Although cluster N11 included seven DREB genes, the DRE motif was not over-represented in this cluster. Cluster N11 also contains an additional nine ERF/AP2 transcription factors. Clusters N3 and N9 additionally exhibited the EVENINGAT motif (AAAATATCT), which functions in the circadian control of gene expression . Further identified were the P1BS motif (GNATATNC) and an unknown, hypothetical motif (A|G)(C|T)TAA(A|T)NNNTGA(C|T) in cluster N10, and the 2S-SEED-PROTBANAPA motif (CAAACAC) in cluster N13.
Over-representation of the well known ABRE motif in multiple clusters of genes that respond to either light or ABA treatment points toward the existence of additional motifs . These could be the I-Box and DRE motifs that are over-represented in these clusters, and other putatively cis-acting motifs are suggested by the analysis. More likely, however, is the presence of transcriptional control mechanisms that act on cis-element binding proteins rather than on the promoter elements.
Genes in downregulated clusters
Motifs of prevalence similar to those in upregulated genes appear to be largely absent from the stress-repressed genes in clusters N4, N7, and N17. For cluster N15 genes, strongly downregulated by osmotic and high salinity stresses in roots, the MYCATERD1 motif (CATGTG) and the SORLRP3AT motif (TGTATATAT) were identified. Clusters N2, N5, N14, and N18 included many genes related to general gene expression functions, protein synthesis, cell organization, and metabolism. Several known motifs were enriched in these clusters. The UP1ATMSD motif GGCCCA(A|T)(A|T)(A|T), which is related to axillary bud growth, was over-represented in all four clusters. Additionally, over-represented in genes in cluster N2 were the I-Box motif GATAAG(A|G) and the SORLIP5AT motif GAGTGAG, which appear to be connected to the expression of genes in metabolic functions. Cluster N5 showed the E2F1OSPCNA motif GCGGGAAA, the E2FANTRNR motif TTTCCCGC, and the E2FCONSENSUS motif (A|T)TT(G|C)C(G|C)(G|C). These motifs are typically associated with genes that are involved in cell cycle progression and cell division [74–76]. At lower frequency, clusters N14 and N18 exhibited similar motif combinations (Table 2).
In general, motif analysis of the 5'-regulatory sequences of genes grouped by fuzzy k-means clustering confirmed known motifs in the major stress-responding clusters, whereas different clusters shared subsets of these motifs. Additional, secondary motifs between and within large clusters are suggested (Figure 4b, Table 2), but attempts to distinguish between clusters that shared similar expression patterns through motif analyses alone proved inconclusive. Different approaches will be necessary to reveal how combinations of motifs may control gene expression. Methods for identifying such combinations are emerging.
Integrating AtGenExpress and Arabidopsis root transcript profiles
Very few data are available to date that correlate stress-related transcript changes and cell-specific or tissue-specific gene expression. We focused on the tissue-specific response to stress in detail by including a dataset in which cell type-specific and growth stage-specific gene expression in Arabidopsis roots was recorded. Among the probes printed on the Affymetrix chip, 12,360 were considered present in at least one of the three developmental stages of the root. These stages identify genes in cell division and early root expansion growth (stage 1), the region of maximum elongation growth (stage 2), and genes in the root maturation region (stage 3). Also recorded was the gene expression pattern in different cell lineages: the lateral root cap, epidermis, cortex, endodermis, and the vasculature (stele). Here, intensity values were compared for each gene in the three developmental stages and in each cell lineage against its average intensity across all stages or cell lineages, and the difference in expression provided a measure of cell specificity and stage specificity for each gene. Fuzzy k-means clustering revealed a clear pattern for the 12,360 genes in the root dataset, which separated into 19 clusters (T0 to T18; Figure 5). For example, genes in cluster T2 were more highly expressed in the cortex, endodermis, and stele during developmental stage 3, identifying mature regions of the root. In contrast, genes in cluster T3 were highly expressed during stage 1 development, and present at lower levels in stage 3 regions of the root and in the endodermis. Cluster T4 shows genes with strong expression in the stele during stage 3.
Among the 12,360 genes recognized in roots, 5,963 exhibited significant membership values in the stress expression profiles, and these genes were further analyzed. The intersections between tissue and stress clusters are shown in Table 3, revealing specific reactions to different stress conditions in distinct cell lineages and developmental stages of the Arabidopsis root. The nature of the genes at the intersection between cell specificity/development in roots and stresses (Table 3) is detailed in Additional data file 9. In the following discussion, we address stress-regulated genes within the context of their expression in a developmental and cell-specific context. Examples highlight root-specific genes that are downregulated by abiotic stress and that are highly expressed in root cap and epidermis of stage 1 roots under optimal conditions (Figure 6), and genes that are upregulated by stress and that, under nonstress conditions, are highly expressed in the vasculature (stele), endodermis, and cortex in stage 3 roots (Figure 7).
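The intersections between stress and tissue clusters amount to a cross-tabulation of two independent cluster assignments. The following is a minimal sketch of that counting step, assuming each probe has already been assigned to one stress cluster and one tissue cluster; the function name, the probe identifiers, and the assignments are hypothetical and are not taken from Table 3.

```python
from collections import Counter


def cluster_intersections(stress_assign, tissue_assign):
    """Count probes shared by each (stress cluster, tissue cluster) pair.

    Both arguments map a probe identifier to a cluster label. Only probes
    present in both assignments contribute to the counts.
    """
    shared = stress_assign.keys() & tissue_assign.keys()
    return Counter((stress_assign[p], tissue_assign[p]) for p in shared)


# Hypothetical assignments, for illustration only.
stress = {"g1": "N18", "g2": "N18", "g3": "N4", "g4": "N12"}
tissue = {"g1": "T3", "g2": "T3", "g3": "T4", "g5": "T0"}

table = cluster_intersections(stress, tissue)
for (stress_cluster, tissue_cluster), n in sorted(table.items()):
    print(stress_cluster, tissue_cluster, n)
```

Probes that appear in only one of the two assignments (here g4 and g5) drop out, which mirrors the restriction of the analysis to root-expressed genes with significant membership in the stress clusters.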
Stress downregulated genes in roots
Three different regulatory patterns for downregulated genes emerged. First, the stress clusters N14 and N18 included mainly genes related to the key machineries of gene expression and protein synthesis, most of which organized in tissue cluster T3 (stage 1 specific) and T7 (stages 1 and 2). Cluster N18 includes genes that mainly function in protein synthesis: more than 190 ribosomal proteins, 5 tRNA synthetases, 13 translation initiation/elongation proteins, 10 chaperonin proteins, and a few genes related to lysine or arginine synthesis. Also included in N18 were DNA replication licensing factors, nucleosome assembly proteins, histones H2A and H3, small nuclear ribonucleoproteins, and signaling G proteins. Several GO categories are enriched in cluster N18: GO:0042254 (ribosome biogenesis and assembly), GO:0043037 (translation), GO:0015450 (protein translocase activity), GO:0046112 (nucleobase biosynthesis), GO:0006333 (chromatin assembly), and GO:0006525 (arginine metabolism). In contrast, cluster N14 collected genes mainly related to transcription, such as 12 DEAD/DEAH box helicases and 11 polymerases (or similar to polymerase functions), with a few genes involved in protein synthesis functions. Several GO categories were over-represented in N14: GO:0016072 (rRNA metabolism), GO:0003899 (DNA-directed RNA polymerase activity), GO:0004527 (exonuclease activity), GO:0008026 (ATP-dependent helicase activity), GO:0042254 (ribosome biogenesis and assembly), and GO:0006396 (RNA processing). Cluster N14 was different from N18 in that N14 was slightly induced by cold stress but N18 was not. The expression of genes in these two clusters can most parsimoniously be rationalized by developmental regulation.
Second, the stress clusters N4 and N7 contained genes that were placed mainly in the tissue clusters T4, T8, and T14, all of which represent stage 3 specific genes. Interestingly, among the downregulated genes were those related to cell wall modification that are specifically expressed in the stele (T4 and T14), such as expansins, extensins, (putative) cellulases, pectinesterases, and peroxidases. Enriched in these two clusters were GO:0006979 (response to oxidative stress), GO:0007047 (cell wall organization and biogenesis), GO:0009653 (morphogenesis), GO:0042545 (cell wall modification), GO:0010054 (trichoblast differentiation), and GO:0005516 (calmodulin binding). Together, these clusters appear to identify the portion of the transcriptome that controls root maturation, which is downregulated under stress treatments of the root system.
Finally, genes in the stress clusters N2 and N5 represented a combination of the previously discussed patterns. These clusters included genes regulated developmentally (in the tissue clusters T3, T6, T7, T12, T13, T15, and T16 [stage 1 or stage 2 specific]), and genes downregulated by stress signaling (mainly in the tissue clusters T0, T1, T2, or T4). Over-represented in cluster N2 were genes involved in metabolism, which included amino acid, cell wall, carbohydrate, lipid, nucleotide, and secondary metabolism biosynthetic functions. The following GO categories were over-represented: GO:0043038 (amino acid activation), GO:0009658 (chloroplast organization and biogenesis), GO:0006779 (porphyrin biosynthesis), GO:0019321 (pentose metabolism), GO:0004312 (fatty-acid synthase activity), GO:0006769 (nicotinamide metabolism), GO:0005528 (FK506 binding), GO:0016117 (carotenoid biosynthesis), GO:0015994 (chlorophyll metabolism), and GO:0009606 (tropism). Genes related to DNA synthesis, chromatin structure, cell cycle, and cell division were abundant in downregulated cluster N5; the following were over-represented: GO:0006260 (DNA replication), GO:0007049 (cell cycle), GO:0000910 (cytokinesis), GO:0007010 (cytoskeleton organization and biogenesis), and GO:0016071 (mRNA metabolism). Cluster N5 also includes some genes related to metabolic processes, as indicated by the over-represented GO categories GO:0009853 (photorespiration), GO:0019758 (glycosinolate biosynthesis), GO:0044272 (sulfur compound biosynthesis), and GO:0009067 (aspartate family amino acid biosynthesis).
In essence, the genes downregulated by different stresses are expressed under ideal growth conditions close to the root meristem and in the region of strongest cell expansion. Furthermore, genes related to the functional categories of mRNA and protein synthesis, cell cycle control, and primary metabolism were separated into differentially repressed clusters. This indicates active regulatory processes rather than a passive repression brought about merely by a general stressed physiologic state.
Stress upregulated genes in roots
A significant difference emerged when genes in stress upregulated clusters were viewed in their tissue-specific or cell-specific context. In the majority, these genes exhibited a high expression level at stage 3 (tissue clusters T0, T2, T5, T14, and T17) or high expression in root cap cells (T1 and T9). Because genes in these tissue clusters appeared with insignificant membership values only in the repressed clusters N14 and N18, we consider them representative of first responders to stress signaling. It appeared significant that these genes were upregulated in cells in the more mature region of the root, coinciding with the region of beginning root hair development.
Merging stress and tissue/cell specificity, a framework became recognizable. Genes in tissue clusters T3 and T7 were significantly downregulated during abiotic stress, and genes in tissue clusters T0, T1, T2, and T4 were either upregulated or downregulated under different stress conditions. In contrast, genes in tissue clusters T5 and T18 were mainly upregulated upon stress. Of particular importance may be the behavior of the genes in clusters T5 and T18. T5 genes were specifically expressed in endodermis cells in stage 3, whereas T18 genes exhibited high expression level in lateral root cap cells.
Cell lineage-specific and development-dependent stress response pathways
Focusing on abiotic stresses alone (cold, osmotic, salinity, drought, and the hormones ABA and JA), the 12,360 probes present in roots were analyzed by fuzzy k-means clustering. The analysis of this smaller set of treatments separated the genes into 66 clusters (Additional data file 10). Intersections of stress specificity and spatial or temporal expression characteristics are illustrated by two examples.
Figure 6 shows root-expressed genes that are strongly downregulated predominantly during osmotic and salt stresses. The identity of the genes with high expression in the stele of stage 3 roots highlights functions that are associated with the decline of root growth. Abundantly represented were peroxidases, extensins, PRP-like proteins, and functionally unknown proteins. A contrasting behavior is shown in Figure 7, which identifies a cluster with osmotic and salt stress upregulated genes. These genes are uniformly upregulated by ABA and, in part, by JA, while ABA upregulation generally also extends into the shoots. This cluster includes many transcripts for functions in signaling and transport, and a number of genes that have been well characterized, such as transcripts for proline synthesis, glutathione-conjugate transport, ferritin, calcineurin phosphoesterase, SEC14, and the ABA-responsive AREB2. The complete set of data is included in Additional data file 10.
Integration of diverse, large-scale datasets into a framework that then describes and explains the functioning of an organism remains an elusive goal of genomics-type approaches. Combining three types of data, we analyzed in context the genome-wide expression profiles modulated by a number of stress conditions, regulatory cis-elements in promoters, and cell-specific and developmental age-specific root transcripts and their reaction to stress in the model crucifer Arabidopsis thaliana. A recent analysis used the AtGenExpress dataset by focusing on responses under nine experimental conditions and identified similarities between conditions, whereas our approach distinguished similarities and differences between genes under all conditions. The fuzzy k-means clustering tool generated reliable clustering results because known stress response genes, originally reported in single-gene analyses, were generally confirmed by their inclusion in appropriate clusters [3, 4, 20, 79]. The tool provided flexibility to arrive at realistic cluster structures that could be adjusted by the choice of different membership values to take into account data from different sources.
Detailed analysis focused on cluster N12, which included genes responsive to most environmental perturbations. This type of analysis is similar to that in a study that identified cellular stress response genes in yeast from global transcript profiles of stress responses. In terms of functional categories a significant overlap is evident, although the yeast analysis identified a larger number of genes involved in carbohydrate metabolism in this group of common stress genes compared with the Arabidopsis list. Many of the genes in our analysis encode stress-responsive functions in animals and yeasts, such as the Snf1 kinase-related, phosphoinositol-related, and Bax-inhibitor related pathways. They may represent the evolutionarily conserved cellular stress response, originating from damage recognition of, for example, lipid membranes, proteins, or DNA, and mediated by signals related to calcium and ROS [11, 81]. In plants, signals may also be communicated by ethylene and are largely independent of ABA. Although responding to many stresses, ROS and ethylene signaling cannot act as a systemic coordinator of gene expression in the way that this is accomplished by ABA.
Hypothetically, ROS or ethylene induce signaling mainly locally in stress responses, and the genes in cluster N12 appear to elicit local responses but have no function in long distance communication. In agreement with this hypothesis, no cluster specific for ACC treatment emerged, and neither was a correlation between ethylene treatments and the stress cluster structure identified in fuzzy k-means analysis.
The correlation of a number of previously studied 5'-regulatory, cis-acting sequences with particular stress conditions, biotic and abiotic alike, was confirmed [12, 13, 70], and the presence of additional 5'-regulatory response elements was identified (Table 2). The ABRE motif ACGTG(G|T)C was over-represented in multiple clusters responding to either light or ABA treatment, indicating that the motif is essential but not sufficient to explain the multiplicity of clusters. Secondary motifs that modify ABA responsiveness are identified. Within these clusters, the I-Box and DRE motifs emerged and others are strongly suggested, although detailed studies have not been conducted on these putatively novel regulatory elements. Another motif, the W-box, was over-represented in several clusters induced by biotic stresses, and the corresponding W-box binding transcription factors, namely WRKYs, were themselves included in these clusters.
The chosen way to integrate datasets revealed relationships between stress regulation and tissue-specific expression in the Arabidopsis root. In particular, the downregulation of genes during osmotic challenge and, although moderately, by ABA in roots identified genes that, under nonstress conditions, are highly expressed in cells of the vascular tissue and in the mature root (Figure 6). The stress-repressed genes in these clusters are responsible for the physiologic effects of stress that result in impeded growth; most of these genes reflect metabolic pathways and functions that signal injury and challenges to organ integrity. In stark contrast, other clusters identified genes that are upregulated by ABA, in part also by JA, and upregulation is not solely confined to the roots (Figure 7). Cell specificity is less pronounced in these clusters but the genes included tend to be more highly expressed in cells of the cortex, endodermis, and vasculature in mature regions of roots. Genes with known and conjectured signaling functions dominate in these clusters. This finding appears to implicate the endodermis and stele of mature roots as playing critical roles in counteracting the effects of many stresses. Included are many unknown genes, whose functions in environmental stress protection have not yet been analyzed.
Our approach represents one way to integrate diverse, independent datasets to enhance understanding of the plant environmental stress transcriptome. Irrespective of the many experimental conditions, the analysis identified many genes that had previously been implicated in plant stress responses in detailed studies that focused on individual genes. In addition, the overall structure of 5'-regulatory sequences that resulted from this study corresponded to the results of other studies, but they also suggested the existence of additional putative cis-elements, which await detailed analysis. The clustering that emerged provides an interpretation for the interdependence and distinction of biotic and abiotic stress factors. It defines an evolutionarily conserved basic set of stress response genes. Genes related to ROS-generating and ROS-detoxifying functions and ethylene action were scattered in virtually all major clusters, which appears to indicate the fundamental roles that these proteins play in diverse sensing and signaling pathways. Finally, the correlation of changes in transcript abundance and the spatial and temporal resolution of expression patterns in Arabidopsis roots add a new dimension. The predictions intrinsic in the cluster structures and their gene compositions are models that should be helpful in designing more detailed analyses.
Materials and methods
Affymetrix microarray data
Transcript profiles that reflect responses of Arabidopsis to different abiotic stress conditions were obtained from Weigel World, which had been processed via gcRMA. For biotic stresses, hormones, different light regimens, and chemical (t-zeatin, tri-iodobenzoic acid, AgNO3, and cycloheximide) treatments, the CEL files of the Affymetrix ATH1 microarray data were downloaded from TAIR, and processed into expression estimates using the gcRMA method with default settings implemented in Bioconductor. For each experiment, the log2 intensities of individual probe sets were averaged across two replicates for both treatments and control, and their differences were used as log2 values of fold changes (treatments/control). Details of the treatment conditions, excerpted from the AtGenExpress project, are included in Additional data file 1, and the processed data are listed in Additional data file 2.
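The fold-change calculation described above can be sketched as follows. This is a minimal illustration, assuming the inputs are already log2 intensities of the kind produced by gcRMA; the function name and the numeric values are hypothetical.

```python
from statistics import mean


def log2_fold_change(treatment_reps, control_reps):
    """Return mean(log2 treatment) minus mean(log2 control).

    Each argument is a list of log2 intensities for one probe set,
    one value per replicate.
    """
    return mean(treatment_reps) - mean(control_reps)


# Hypothetical log2 intensities for one probe set, two replicates each.
treatment = [9.0, 9.4]
control = [7.1, 7.3]

lfc = log2_fold_change(treatment, control)
print(f"log2 fold change: {lfc:.2f}")
```

Because the intensities are on a log2 scale, the difference of the averages corresponds to the log2 of the ratio of geometric means, so a value of 2 would indicate a roughly fourfold increase.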
The microarray data pertaining to cell type-specific and growth stage-specific gene expression in Arabidopsis roots have previously been described. The CEL files for these data were downloaded from the Arabidopsis Gene Expression Database and processed into expression estimates as described above. The log2 intensities of each individual probe set were averaged across three replicates for cell type-specific profiles, or four replicates for stage-specific profiles (Additional data file 6). MAS 5.0 detection calls, which flag each probe set on each array as present or absent, were calculated using the 'affy' package implemented in Bioconductor. Only probe sets called present in all four replicates from at least one of the stage samples were analyzed. Excluded were 356 genes that had been shown to be induced by protoplasting of root cells. The remaining 12,360 probes were analyzed.
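The present-call filter can be illustrated with a short sketch. It assumes the detection calls have already been computed and are available as one list of calls per stage; the data layout, the function name, and the example calls are assumptions made for illustration.

```python
def passes_present_filter(calls_by_stage, present="P"):
    """Return True if all replicates are called present in at least one stage.

    calls_by_stage maps a stage name to the list of detection calls
    (one per replicate) for a single probe set.
    """
    return any(
        all(call == present for call in replicate_calls)
        for replicate_calls in calls_by_stage.values()
    )


# Hypothetical detection calls: P = present, A = absent.
probe_calls = {
    "probe_a": {
        "stage1": ["P", "P", "P", "P"],
        "stage2": ["P", "P", "A", "P"],
        "stage3": ["A", "A", "A", "A"],
    },
    "probe_b": {
        "stage1": ["P", "P", "A", "P"],
        "stage2": ["P", "A", "P", "P"],
        "stage3": ["P", "P", "P", "A"],
    },
}

kept = [p for p, calls in probe_calls.items() if passes_present_filter(calls)]
print(kept)
```

In this example probe_b is dropped even though it is present in most arrays, because no single stage has a present call in every replicate.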
The analysis of stress, hormone, chemical, and light treatments was similar to the procedure described previously. The log2 fold change values (treatment/control) of entire probe sets were analyzed with fuzzy k-means clustering. The parameter was set as k = 300 for the complete set of treatments. In the most economical manner, 180 centroids were identified, and clustered such that any probe was assigned to the cluster in which it had highest membership value (Additional data file 3). By applying a cutoff of 0.035, 10,671 probes were separated into these clusters. For each gene, the sum of its membership values with the 180 centroids is 1. Therefore, the average membership value is 0.006. We considered 0.035, around six times higher than the average value, to be a significant cutoff.
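The assignment rule and the reasoning behind the cutoff can be sketched in a few lines. This is an illustration only: it assumes membership values are already available as one vector per probe, the function name is hypothetical, and whether the cutoff is inclusive is an assumption because the text does not specify it.

```python
def assign_cluster(memberships, cutoff=0.035):
    """Return the index of the centroid with the highest membership,
    or None if that membership falls below the cutoff.

    Treating the cutoff as inclusive (>=) is an assumption.
    """
    best = max(range(len(memberships)), key=memberships.__getitem__)
    return best if memberships[best] >= cutoff else None


n_centroids = 180
average_membership = 1.0 / n_centroids  # memberships of one probe sum to 1

# Hypothetical probe with a clear preference for centroid 42.
strong = [0.70 / (n_centroids - 1)] * n_centroids
strong[42] = 0.30

# Hypothetical probe with no preference for any centroid.
flat = [average_membership] * n_centroids

print(round(average_membership, 3))
print(assign_cluster(strong), assign_cluster(flat))
```

A probe whose membership is spread evenly across all centroids never reaches the cutoff and therefore remains unassigned, which is the behavior the sixfold margin over the average is meant to produce.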
To reveal the cell type-specific and stage-specific gene expression patterns, the relative expression value for each probe in each cell type or stage was calculated, by subtracting the probe's average intensity across the cell type samples or the stage samples from its intensity in that cell type or stage samples (Additional data file 7). The relative expression values were assembled and analyzed by fuzzy k-means, with the parameter k set at 30. Nineteen well defined clusters were recovered, and no cutoff values were applied (Figure 5 and Additional data file 8).
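The relative expression value is a per-probe centering step. A minimal sketch, assuming log2 intensities keyed by sample name; the function name and the values are hypothetical.

```python
def relative_expression(intensities):
    """Subtract a probe's average intensity across samples from each sample.

    intensities maps a sample name (cell type or stage) to the probe's
    log2 intensity in that sample.
    """
    average = sum(intensities.values()) / len(intensities)
    return {sample: value - average for sample, value in intensities.items()}


# Hypothetical stage profile for one probe.
stages = {"stage1": 10.0, "stage2": 8.0, "stage3": 6.0}
rel = relative_expression(stages)
print(rel)
```

Centering removes differences in absolute expression level between probes, so that clustering groups probes by the shape of their profile across cell types or stages rather than by overall abundance.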
To focus on abiotic stress responses, we selected datasets for cold, osmotic, salt and drought stresses, and the hormones ABA and JA, for those probes present in roots. These were clustered separately, as previously, and resulted in 66 clusters. The corresponding Arabidopsis gene locus for each probe set followed the annotation by TAIR.
Differentially expressed genes
The limma method implemented in Bioconductor was used to identify differentially expressed genes. The original expression datasets from all conditions, derived from gcRMA, were used to construct the linear model. Different contrast matrices were utilized to identify the genes that were differentially expressed under at least one condition/time point among all conditions, or among the time course treatments of cold, osmotic, salinity, wounding, or pathogen treatments.
Gene Ontology analysis
The Clench 2.0 program was used to identify over-represented GO categories within a group of genes.
The motifs listed in the PLACE database were collected. Their frequencies of appearance in the promoter regions (1,000 base pairs upstream of the coding region, downloaded from TAIR) of all genes in the entire genome were scanned using the patmatch program. For each motif, its frequency of appearance in any cluster was compared with its frequency in all promoters predicted for the entire genome. A P value was calculated based on the hypergeometric distribution:
where M is the number of promoters within the cluster, m is the number of promoters within the cluster that contain the motif, K is the total number of promoters in the genome, and k is the total number of the promoters in the genome that contain the motif.
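With the symbols defined above, a standard upper-tail hypergeometric P value can be written as follows. This is a sketch of the usual form and may differ in notation from the expression used in the original analysis; the summation index i is introduced here for illustration.

```latex
P \;=\; \sum_{i=m}^{\min(M,\,k)}
  \frac{\dbinom{k}{i}\,\dbinom{K-k}{M-i}}{\dbinom{K}{M}}
```

Each term is the probability that exactly i of the M promoters in a cluster contain the motif when M promoters are drawn without replacement from the K promoters in the genome, k of which contain the motif.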
Over-represented motifs within clusters were identified by their P values. Also included in the analysis was a list of cis-elements identified from a study conducted in mammalian systems .
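The motif scan and the over-representation test can be combined in a short sketch. It is not the patmatch-based procedure itself: it uses Python regular expressions in place of patmatch, scans only the given strand, and works on made-up sequences much shorter than the 1,000 base pair promoters. The function names and all data are hypothetical.

```python
import re
from math import comb


def hypergeom_upper_tail(m, M, k, K):
    """P(X >= m) when M promoters are drawn without replacement from K
    promoters, k of which contain the motif."""
    total = comb(K, M)
    hits = sum(
        comb(k, i) * comb(K - k, M - i)
        for i in range(m, min(M, k) + 1)
    )
    return hits / total


def count_promoters_with_motif(pattern, promoters):
    """Count promoter sequences that contain the motif at least once."""
    regex = re.compile(pattern)
    return sum(1 for seq in promoters if regex.search(seq))


# Hypothetical promoter sequences keyed by gene identifier.
genome = {
    "g1": "TTACGTGGCAA",
    "g2": "GGGACGTGTCAT",
    "g3": "ATATATATAT",
    "g4": "CCCCGGGGTTTT",
    "g5": "AACGTGGCTTA",
    "g6": "TTTTTTTTTT",
}
cluster = ["g1", "g2", "g3"]

# ABRE motif ACGTG(G|T)C written with an equivalent character class.
abre = "ACGTG[GT]C"

K = len(genome)
k = count_promoters_with_motif(abre, genome.values())
M = len(cluster)
m = count_promoters_with_motif(abre, (genome[g] for g in cluster))

p_value = hypergeom_upper_tail(m, M, k, K)
print(m, M, k, K, p_value)
```

Motifs written with alternations such as (A|T) are already valid regular expressions, so the notation used in Table 2 could be passed to the scan unchanged. Exact integer arithmetic with math.comb avoids overflow for genome-sized values of K.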
Additional data files
The following additional data are available with the online version of this paper. Additional data file 1 includes the microarray datasets used for this analysis. Additional data file 2 includes the stress datasets used for fuzzy k-means analysis. Additional data file 3 includes clustering results of all stress datasets. Additional data file 4 includes a comparison between clusters N6 and N53. Additional data file 5 shows all genes in the common stresses response cluster N12. Additional data file 6 provides original data of gene expression in roots. Additional data file 7 shows processed root dataset used for fuzzy k-means analysis. Additional data file 8 shows clustering results for the root dataset. Additional data file 9 shows the intersection between stress clustering and root clustering. Additional data file 10 shows clustering results for abiotic stresses in roots.
Cheong YH, Chang HS, Gupta R, Wang X, Zhu T, Luan S: Transcriptional profiling reveals novel interactions between wounding, pathogen, abiotic stress, and hormonal responses in Arabidopsis. Plant Physiol. 2002, 129: 661-677. 10.1104/pp.002857.
Chinnusamy V, Stevenson B, Lee BH, Zhu JK: Screening for gene regulation mutants by bioluminescence imaging. Sci STKE. 2002, 2002: PL10.
Kreps JA, Wu Y, Chang HS, Zhu T, Wang X, Harper JF: Transcriptome changes for Arabidopsis in response to salt, osmotic, and cold stress. Plant Physiol. 2002, 130: 2129-2141. 10.1104/pp.008532.
Seki M, Narusaka M, Ishida J, Nanjo T, Fujita M, Oono Y, Kamiya A, Nakajima M, Enju A, Sakurai T, et al: Monitoring the expression profiles of 7000 Arabidopsis genes under drought, cold and high-salinity stresses using a full-length cDNA microarray. Plant J. 2002, 31: 279-292. 10.1046/j.1365-313X.2002.01359.x.
Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.
Rhee SY, Beavis W, Berardini TZ, Chen G, Dixon D, Doyle A, Garcia-Hernandez M, Huala E, Lander G, Montoya M, et al: The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community. Nucleic Acids Res. 2003, 31: 224-228. 10.1093/nar/gkg076.
Craigon DJ, James N, Okyere J, Higgins J, Jotham J, May S: NASCArrays: a repository for microarray data generated by NASC's transcriptomics service. Nucleic Acids Res. 2004, 32 (Database issue): D575-D577. 10.1093/nar/gkh133.
Zimmermann P, Hirsch-Hoffmann M, Hennig L, Gruissem W: GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox. Plant Physiol. 2004, 136: 2621-2632. 10.1104/pp.104.046367.
Gough NR: Science's signal transduction knowledge environment: the connections maps database. Ann NY Acad Sci. 2002, 971: 585-587.
Usadel B, Nagel A, Thimm O, Redestig H, Blaesing OE, Palacios-Rojas N, Selbig J, Hannemann J, Piques MC, Steinhauser D, et al: Extension of the visualization tool MapMan to allow statistical analysis of arrays, display of corresponding genes, and comparison with known responses. Plant Physiol. 2005, 138: 1195-1204. 10.1104/pp.105.060459.
Kultz D: Molecular and evolutionary basis of the cellular stress response. Annu Rev Physiol. 2005, 67: 225-257. 10.1146/annurev.physiol.67.040403.103635.
Arguello-Astorga G, Herrera-Estrella L: Evolution of light-regulated plant promoters. Annu Rev Plant Physiol Plant Mol Biol. 1998, 49: 525-555. 10.1146/annurev.arplant.49.1.525.
Yamaguchi-Shinozaki K, Shinozaki K: Organization of cis-acting regulatory elements in osmotic- and cold-stress-responsive promoters. Trends Plant Sci. 2005, 10: 88-94. 10.1016/j.tplants.2004.12.012.
Higo K, Ugawa Y, Iwamoto M, Korenaga T: Plant cis-acting regulatory DNA elements (PLACE) database: 1999. Nucleic Acids Res. 1999, 27: 297-300. 10.1093/nar/27.1.297.
Lescot M, Dehais P, Thijs G, Marchal K, Moreau Y, Van de Peer Y, Rouze P, Rombauts S: PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences. Nucleic Acids Res. 2002, 30: 325-327. 10.1093/nar/30.1.325.
Gasch AP, Eisen MB: Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering. Genome Biol. 2002, 3: RESEARCH0059. 10.1186/gb-2002-3-11-research0059.
TAIR - Gene Expression. [http://www.arabidopsis.org/info/expression/ATGenExpress.jsp]
The Arabidopsis Gene Expression Database. [http://www.arexdb.org/]
Birnbaum K, Shasha DE, Wang JY, Jung JW, Lambert GM, Galbraith DW, Benfey PN: A gene expression map of the Arabidopsis root. Science. 2003, 302: 1956-1960. 10.1126/science.1090022.
Gong Q, Li P, Ma S, Indu Rupassara S, Bohnert HJ: Salinity stress adaptation competence in the extremophile Thellungiella halophila in comparison with its relative Arabidopsis thaliana. Plant J. 2005, 44: 826-839. 10.1111/j.1365-313X.2005.02587.x.
Ma S, Gong Q, Bohnert HJ: Dissecting salt stress pathways. J Exp Bot. 2006, 57: 1097-1107. 10.1093/jxb/erj098.
Smyth GK: Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004, 3: Article3.
Nakagami H, Pitzschke A, Hirt H: Emerging MAP kinase pathways in plant stress signalling. Trends Plant Sci. 2005, 10: 339-346. 10.1016/j.tplants.2005.05.009.
Widmann C, Gibson S, Jarpe MB, Johnson GL: Mitogen-activated protein kinase: conservation of a three-kinase module from yeast to human. Physiol Rev. 1999, 79: 143-180.
Hohmann S: Osmotic stress signaling and osmoadaptation in yeasts. Microbiol Mol Biol Rev. 2002, 66: 300-372. 10.1128/MMBR.66.2.300-372.2002.
Wojda I, Alonso-Monge R, Bebelman JP, Mager WH, Siderius M: Response to high osmotic conditions and elevated temperature in Saccharomyces cerevisiae is controlled by intracellular glycerol and involves coordinate activity of MAP kinase pathways. Microbiology. 2003, 149: 1193-1204. 10.1099/mic.0.26110-0.
Chang C: Ethylene signaling: the MAPK module has finally landed. Trends Plant Sci. 2003, 8: 365-368. 10.1016/S1360-1385(03)00156-0.
Chen YF, Etheridge N, Schaller GE: Ethylene signal transduction. Ann Bot (Lond). 2005, 95: 901-915. 10.1093/aob/mci100.
Ouaked F, Rozhon W, Lecourieux D, Hirt H: A MAPK pathway mediates ethylene signaling in plants. EMBO J. 2003, 22: 1282-1288. 10.1093/emboj/cdg131.
Liu Y, Zhang S: Phosphorylation of 1-aminocyclopropane-1-carboxylic acid synthase by MPK6, a stress-responsive mitogen-activated protein kinase, induces ethylene biosynthesis in Arabidopsis. Plant Cell. 2004, 16: 3386-3399. 10.1105/tpc.104.026609.
De Paepe A, Vuylsteke M, Van Hummelen P, Zabeau M, Van Der Straeten D: Transcriptional profiling by cDNA-AFLP and microarray analysis reveals novel insights into the early response to ethylene in Arabidopsis. Plant J. 2004, 39: 537-559. 10.1111/j.1365-313X.2004.02156.x.
Gong D, Guo Y, Jagendorf AT, Zhu JK: Biochemical characterization of the Arabidopsis protein kinase SOS2 that functions in salt tolerance. Plant Physiol. 2002, 130: 256-264. 10.1104/pp.004507.
Kleinow T, Bhalerao R, Breuer F, Umeda M, Salchert K, Koncz C: Functional identification of an Arabidopsis snf4 ortholog by screening for heterologous multicopy suppressors of snf4 deficiency in yeast. Plant J. 2000, 23: 115-122. 10.1046/j.1365-313x.2000.00809.x.
Sakamoto H, Maruyama K, Sakuma Y, Meshi T, Iwabuchi M, Shinozaki K, Yamaguchi-Shinozaki K: Arabidopsis Cys2/His2-type zinc-finger proteins function as transcription repressors under drought, cold, and high-salinity stress conditions. Plant Physiol. 2004, 136: 2734-2746. 10.1104/pp.104.046599.
Lee H, Guo Y, Ohta M, Xiong L, Stevenson B, Zhu JK: LOS2, a genetic locus required for cold-responsive gene transcription encodes a bi-functional enolase. EMBO J. 2002, 21: 2692-2702. 10.1093/emboj/21.11.2692.
Chae HJ, Kim HR, Xu C, Bailly-Maitre B, Krajewska M, Krajewski S, Banares S, Cui J, Digicaylioglu M, Ke N, et al: BI-1 regulates an apoptosis pathway linked to endoplasmic reticulum stress. Mol Cell. 2004, 15: 355-366. 10.1016/j.molcel.2004.06.038.
Baek D, Nam J, Koo YD, Kim DH, Lee J, Jeong JC, Kwak SS, Chung WS, Lim CO, Bahk JD, et al: Bax-induced cell death of Arabidopsis is meditated through reactive oxygen-dependent and-independent processes. Plant Mol Biol. 2004, 56: 15-27. 10.1007/s11103-004-3096-4.
Kawai-Yamada M, Ohori Y, Uchimiya H: Dissection of Arabidopsis Bax inhibitor-1 suppressing Bax-, hydrogen peroxide-, and salicylic acid-induced cell death. Plant Cell. 2004, 16: 21-32. 10.1105/tpc.014613.
Watanabe N, Lam E: Arabidopsis Bax inhibitor-1 functions as an attenuator of biotic and abiotic types of cell death. Plant J. 2006, 45: 884-894.
Levine A, Belenghi B, Damari-Weisler H, Granot D: Vesicle-associated membrane protein of Arabidopsis suppresses Bax-induced apoptosis in yeast downstream of oxidative burst. J Biol Chem. 2001, 276: 46284-46289. 10.1074/jbc.M107375200.
Mazel A, Leshem Y, Tiwari BS, Levine A: Induction of salt and osmotic stress tolerance by overexpression of an intracellular vesicle trafficking protein AtRab7 (AtRabG3e). Plant Physiol. 2004, 134: 118-128. 10.1104/pp.103.025379.
Williams ME, Torabinejad J, Cohick E, Parker K, Drake EJ, Thompson JE, Hortter M, Dewald DB: Mutations in the Arabidopsis phosphoinositide phosphatase gene SAC9 lead to overaccumulation of PtdIns(4,5)P2 and constitutive expression of the stress-response pathway. Plant Physiol. 2005, 138: 686-700. 10.1104/pp.105.061317.
Zhong R, Burk DH, Nairn CJ, Wood-Jones A, Morrison WH, Ye ZH: Mutation of SAC1, an Arabidopsis SAC domain phosphoinositide phosphatase, causes alterations in cell morphogenesis, cell wall synthesis, and actin organization. Plant Cell. 2005, 17: 1449-1466. 10.1105/tpc.105.031377.
Hua J, Grisafi P, Cheng SH, Fink GR: Plant growth homeostasis is controlled by the Arabidopsis BON1 and BAP1 genes. Genes Dev. 2001, 15: 2263-2272. 10.1101/gad.918101.
Apel K, Hirt H: Reactive oxygen species: metabolism, oxidative stress, and signal transduction. Annu Rev Plant Biol. 2004, 55: 373-399. 10.1146/annurev.arplant.55.031903.141701.
Davletova S, Schlauch K, Coutu J, Mittler R: The zinc-finger protein Zat12 plays a central role in reactive oxygen and abiotic stress signaling in Arabidopsis. Plant Physiol. 2005, 139: 847-856. 10.1104/pp.105.068254.
Mittler R, Vanderauwera S, Gollery M, Van Breusegem F: Reactive oxygen gene network of plants. Trends Plant Sci. 2004, 9: 490-498. 10.1016/j.tplants.2004.08.009.
Borsani O, Zhu J, Verslues PE, Sunkar R, Zhu JK: Endogenous siRNAs derived from a pair of natural cis-antisense transcripts regulate salt tolerance in Arabidopsis. Cell. 2005, 123: 1279-1291. 10.1016/j.cell.2005.11.035.
Tucker M, Valencia-Sanchez MA, Staples RR, Chen J, Denis CL, Parker R: The transcription factor associated Ccr4 and Caf1 proteins are components of the major cytoplasmic mRNA deadenylase in Saccharomyces cerevisiae. Cell. 2001, 104: 377-386. 10.1016/S0092-8674(01)00225-2.
Birse CE, Minvielle-Sebastia L, Lee BA, Keller W, Proudfoot NJ: Coupling termination of transcription to messenger RNA maturation in yeast. Science. 1998, 280: 298-301. 10.1126/science.280.5361.298.
Xu X, Chen C, Fan B, Chen Z: Physical and functional interactions between pathogen-induced Arabidopsis WRKY18, WRKY40, and WRKY60 transcription factors. Plant Cell. 2006, 18: 1310-1326. 10.1105/tpc.105.037523.
Andreasson E, Jenkins T, Brodersen P, Thorgrimsen S, Petersen NH, Zhu S, Qiu JL, Micheelsen P, Rocher A, Petersen M, et al: The MAP kinase substrate MKS1 is a regulator of plant defense responses. EMBO J. 2005, 24: 2579-2589. 10.1038/sj.emboj.7600737.
Zheng Z, Qamar SA, Chen Z, Mengiste T: Arabidopsis WRKY33 transcription factor is required for resistance to necrotrophic fungal pathogens. Plant J. 2006, 48: 592-605. 10.1111/j.1365-313X.2006.02901.x.
Journot-Catalino N, Somssich IE, Roby D, Kroj T: The transcription factors WRKY11 and WRKY17 act as negative regulators of basal resistance in Arabidopsis thaliana. Plant Cell. 2006, 18: 3289-3302. 10.1105/tpc.106.044149.
Torres-Galea P, Huang LF, Chua NH, Bolle C: The GRAS protein SCL13 is a positive regulator of phytochrome-dependent red light signaling, but can also modulate phytochrome A responses. Mol Genet Genomics. 2006, 276: 13-30. 10.1007/s00438-006-0123-y.
Osakabe Y, Maruyama K, Seki M, Satou M, Shinozaki K, Yamaguchi-Shinozaki K: Leucine-rich repeat receptor-like kinase1 is a key membrane-bound regulator of abscisic acid early signaling in Arabidopsis. Plant Cell. 2005, 17: 1105-1119. 10.1105/tpc.104.027474.
Umezawa T, Okamoto M, Kushiro T, Nambara E, Oono Y, Seki M, Kobayashi M, Koshiba T, Kamiya Y, Shinozaki K: CYP707A3, a major ABA 8'-hydroxylase involved in dehydration and rehydration response in Arabidopsis thaliana. Plant J. 2006, 46: 171-182. 10.1111/j.1365-313X.2006.02683.x.
Alcazar R, Garcia-Martinez JL, Cuevas JC, Tiburcio AF, Altabella T: Overexpression of ADC2 in Arabidopsis induces dwarfism and late-flowering through GA deficiency. Plant J. 2005, 43: 425-436. 10.1111/j.1365-313X.2005.02465.x.
Urano K, Yoshiba Y, Nanjo T, Ito T, Yamaguchi-Shinozaki K, Shinozaki K: Arabidopsis stress-inducible gene for arginine decarboxylase AtADC2 is required for accumulation of putrescine in salt tolerance. Biochem Biophys Res Commun. 2004, 313: 369-375. 10.1016/j.bbrc.2003.11.119.
Givens RM, Lin MH, Taylor DJ, Mechold U, Berry JO, Hernandez VJ: Inducible expression, enzymatic activity, and origin of higher plant homologues of bacterial RelA/SpoT stress proteins in Nicotiana tabacum. J Biol Chem. 2004, 279: 7495-7504. 10.1074/jbc.M311573200.
van der Biezen EA, Sun J, Coleman MJ, Bibb MJ, Jones JD: Arabidopsis RelA/SpoT homologs implicate (p)ppGpp in plant signaling. Proc Natl Acad Sci USA. 2000, 97: 3747-3752. 10.1073/pnas.060392397.
Swiderski MR, Innes RW: The Arabidopsis PBS1 resistance gene encodes a member of a novel protein kinase subfamily. Plant J. 2001, 26: 101-112. 10.1046/j.1365-313x.2001.01014.x.
Varet A, Parker J, Tornero P, Nass N, Nurnberger T, Dangl JL, Scheel D, Lee J: NHL25 and NHL3, two NDR1/HIN1-like genes in Arabidopsis thaliana with potential role(s) in plant defense. Mol Plant Microbe Interact. 2002, 15: 608-616.
Varet A, Hause B, Hause G, Scheel D, Lee J: The Arabidopsis NHL3 gene encodes a plasma membrane protein and its overexpression correlates with increased resistance to Pseudomonas syringae pv. tomato DC3000. Plant Physiol. 2003, 132: 2023-2033. 10.1104/pp.103.020438.
Shao F, Golstein C, Ade J, Stoutemyer M, Dixon JE, Innes RW: Cleavage of Arabidopsis PBS1 by a bacterial type III effector. Science. 2003, 301: 1230-1232. 10.1126/science.1085671.
Yang CW, Gonzalez-Lamothe R, Ewan RA, Rowland O, Yoshioka H, Shenton M, Ye H, O'Donnell E, Jones JD, Sadanandom A: The E3 ubiquitin ligase activity of Arabidopsis PLANT U-BOX17 and its functional tobacco homolog ACRE276 are required for cell death and defense. Plant Cell. 2006, 18: 1084-1098. 10.1105/tpc.105.039198.
Ramonell K, Berrocal-Lobo M, Koh S, Wan J, Edwards H, Stacey G, Somerville S: Loss-of-function mutations in chitin responsive genes show increased susceptibility to the powdery mildew pathogen Erysiphe cichoracearum. Plant Physiol. 2005, 138: 1027-1036. 10.1104/pp.105.060947.
Mengiste T, Chen X, Salmeron J, Dietrich R: The BOTRYTIS SUSCEPTIBLE1 gene encodes an R2R3MYB transcription factor protein that is required for biotic and abiotic stress responses in Arabidopsis. Plant Cell. 2003, 15: 2551-2565. 10.1105/tpc.014167.
Hughes JD, Estep PW, Tavazoie S, Church GM: Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol. 2000, 296: 1205-1214. 10.1006/jmbi.2000.3519.
Eulgem T, Rushton PJ, Robatzek S, Somssich IE: The WRKY superfamily of plant transcription factors. Trends Plant Sci. 2000, 5: 199-206. 10.1016/S1360-1385(00)01600-9.
Harmer SL, Hogenesch JB, Straume M, Chang HS, Han B, Zhu T, Wang X, Kreps JA, Kay SA: Orchestrated transcription of key pathways in Arabidopsis by the circadian clock. Science. 2000, 290: 2110-2113. 10.1126/science.290.5499.2110.
Tatematsu K, Ward S, Leyser O, Kamiya Y, Nambara E: Identification of cis-elements that regulate gene expression during initiation of axillary bud outgrowth in Arabidopsis. Plant Physiol. 2005, 138: 757-766. 10.1104/pp.104.057984.
Hudson ME, Quail PH: Identification of promoter motifs involved in the network of phytochrome A-regulated gene expression by combined analysis of genomic sequence and microarray data. Plant Physiol. 2003, 133: 1605-1616. 10.1104/pp.103.030437.
Chaboute ME, Clement B, Sekine M, Philipps G, Chaubet-Gigot N: Cell cycle regulation of the tobacco ribonucleotide reductase small subunit gene is mediated by E2F-like elements. Plant Cell. 2000, 12: 1987-2000. 10.1105/tpc.12.10.1987.
Kosugi S, Ohashi Y: E2F sites that can interact with E2F proteins cloned from rice are required for meristematic tissue-specific expression of rice and tobacco proliferating cell nuclear antigen promoters. Plant J. 2002, 29: 45-59. 10.1046/j.1365-313x.2002.01196.x.
Vandepoele K, Vlieghe K, Florquin K, Hennig L, Beemster GT, Gruissem W, Van de Peer Y, Inze D, De Veylder L: Genome-wide identification of potential plant E2F target genes. Plant Physiol. 2005, 139: 316-328. 10.1104/pp.105.066290.
Pati A, Vasquez-Robinet C, Heath LS, Grene R, Murali TM: XcisClique: analysis of regulatory bicliques. BMC Bioinformatics. 2006, 7: 218. 10.1186/1471-2105-7-218.
Swindell WR: The association among gene expression responses to nine abiotic stress treatments in Arabidopsis thaliana. Genetics. 2006, 174: 1811-1824.
Fowler S, Thomashow MF: Arabidopsis transcriptome profiling indicates that multiple regulatory pathways are activated during cold acclimation in addition to the CBF cold response pathway. Plant Cell. 2002, 14: 1675-1690. 10.1105/tpc.003483.
Chen D, Toone WM, Mata J, Lyne R, Burns G, Kivinen K, Brazma A, Jones N, Bahler J: Global transcriptional responses of fission yeast to environmental stress. Mol Biol Cell. 2003, 14: 214-229. 10.1091/mbc.E02-08-0499.
Fedoroff N: Redox regulatory mechanisms in cellular stress responses. Ann Bot (Lond). 2006, 98: 289-300. 10.1093/aob/mcl128.
Alonso JM, Ecker JR: The ethylene pathway: a paradigm for plant hormone signaling and interaction. Sci STKE. 2001, 2001: RE1.
Weigel World. [http://www.weigelworld.org/resources/microarray/AtGenExpress/]
Wu Z, Irizarry R, Gentleman R, Murillo F, Spencer F: A model based background adjustment for oligonucleotide expression arrays. 2003, Baltimore, MD: Johns Hopkins University, Dept of Biostatistics Working Papers.
TAIR - Microarrays. [ftp://ftp.arabidopsis.org/home/tair/home/tair/Microarrays/Affymetrix/]
Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004, 5: R80. 10.1186/gb-2004-5-10-r80.
Shah NH, Fedoroff NV: CLENCH: a program for calculating Cluster ENriCHment using the Gene Ontology. Bioinformatics. 2004, 20: 1196-1197. 10.1093/bioinformatics/bth056.
Yan T, Yoo D, Berardini TZ, Mueller LA, Weems DC, Weng S, Cherry JM, Rhee SY: PatMatch: a program for finding patterns in peptide and nucleotide sequences. Nucleic Acids Res. 2005, 33 (Web Server issue): W262-W266. 10.1093/nar/gki368.
Xie X, Lu J, Kulbokas EJ, Golub TR, Mootha V, Lindblad-Toh K, Lander ES, Kellis M: Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals. Nature. 2005, 434: 338-345. 10.1038/nature03441.
We thank Dr Qingqiu Gong for many discussions, and the reviewers for their scrutiny and many helpful comments. We are indebted to members of the AtGenExpress consortium for their help. The work was supported by NSF DBI-0223905 and DBI-0211842, and UIUC institutional funds.
Electronic supplementary material
Additional data file 1: Microarray datasets used for this analysis, including the descriptions of the treatments and conditions. The data come from AtGenExpress (abiotic and biotic stresses, elicitor treatments, hormone treatments, organ-specific expression), and transcription data in different cell lineages and developmental stages of the root. (PDF 26 KB)
Additional data file 4: Comparison between clusters N6 and N53 (legend as in Figure 2). (PNG 219 KB)
Cite this article
Ma, S., Bohnert, H.J. Integration of Arabidopsis thaliana stress-related transcript profiles, promoter structures, and cell-specific expression. Genome Biol 8, R49 (2007). https://doi.org/10.1186/gb-2007-8-4-r49