Reverse-engineering the Arabidopsis thaliana transcriptional network under changing environmental conditions
- Javier Carrera†1, 2,
- Guillermo Rodrigo†1,
- Alfonso Jaramillo3, 4 and
- Santiago F Elena1, 5
© Carrera et al.; licensee BioMed Central Ltd. 2009
Received: 10 July 2009
Accepted: 15 September 2009
Published: 15 September 2009
Understanding the molecular mechanisms plants have evolved to adapt their biological activities to a constantly changing environment is an intriguing question and one that requires a systems biology approach. Here we present a network analysis of genome-wide expression data combined with reverse-engineering network modeling to dissect the transcriptional control of Arabidopsis thaliana. The regulatory network is inferred by using an assembly of microarray data containing steady-state RNA expression levels from several growth conditions, developmental stages, biotic and abiotic stresses, and a variety of mutant genotypes.
We show that the A. thaliana regulatory network has the characteristic properties of hierarchical networks. We successfully applied our quantitative network model to predict the full transcriptome of the plant for a set of microarray experiments not included in the training dataset. We also used our model to analyze the robustness in expression levels conferred by network motifs such as the coherent feed-forward loop. In addition, the meta-analysis presented here has allowed us to identify regulatory and robust genetic structures.
These data suggest that A. thaliana has evolved high connectivity in terms of transcriptional regulation among cellular functions involved in response and adaptation to changing environments, while gene networks constitutively expressed or less related to stress response are characterized by a lower connectivity. Taken together, these findings suggest conserved regulatory strategies that have been selected during the evolutionary history of this eukaryote.
Living organisms have evolved molecular circuitries that promote their own development under dynamically changing environments. In particular, plants are not able to evade those changes and have had to evolve robust mechanisms to cope with, and recover from, environmental stress. Genomic sequences specify the context-dependent gene expression programs to render cells, tissues, organs and, finally, organisms. Then, at any moment during the cell cycle and at each stage of an organism's development, and in response to environmental conditions, each cell is the product of specific and well defined programs involving the coordinated transcription of thousands of genes. Thus, the elucidation of such programs in terms of the regulatory interactions involved is pivotal for the understanding of how organisms have evolved and what environments may have conditioned evolutionary trajectories the most. However, we still have little understanding of how this highly tuned process is achieved for most organisms, and the surface of the problem is only just being scratched for a handful of model organisms, such as the bacterium Escherichia coli, the yeast Saccharomyces cerevisiae, the nematode Caenorhabditis elegans, the plant Arabidopsis thaliana [4, 5], and, to a lesser extent, humans.
Meta-analyses of microarray data collections may now be used to construct biological networks that systematically categorize all molecules and describe their functions and interactions. Networks can integrate biological functions of cells, organs, and organisms. During recent years, there has been a tremendous effort in the development and improvement of techniques to infer gene connectivity. Clustering approaches [7–11] and information theory methods [12–16] have been used to infer regulatory networks. Bayesian methods [17–20] can give accurate networks with low coverage but at a high computational cost.
The analysis of the expression of the A. thaliana transcriptome offers the potential to identify prevailing cellular processes, to associate genes with particular biological functions, and to assign otherwise unknown genes to biological responses. Previous attempts to model the A. thaliana gene network used methods such as fuzzy k-means clustering, graphical Gaussian models, and Markov chain graph clustering [5, 15]. The drawback of the first approach is that clustering describes genes based on a characteristic property common to all genes, but it is difficult to deduce a pathway structure from this property alone because pathways would have to be concerned with co-expression features that transcend such cluster structure. The second approach assumes that the number of microarray slides is much larger than the number of genes analyzed; otherwise, approximations must be made (for example, empirical Bayes with bootstrap re-sampling or shrinkage approaches). The last approach is based on Pearson's correlations and is, therefore, strongly sensitive to outliers and to violations of the implicit assumption of linear relationships among genes. In this article, we present a predictive genome model built from a regulatory scaffold inferred by using probabilistic methods and estimate the corresponding kinetic parameters using linear regression [22–25]. We analyze the topological properties and predictive power of the inferred regulatory model. We evaluate the performance of the network by predicting already known transcriptional regulations and assess the functional relevance and reproducibility of the co-expression patterns detected. Finally, we discuss the evolutionary implications of transcriptional control in plants.
High-throughput technologies combined with rigorous and biologically rooted modeling will allow understanding of how simple genetic or environmental perturbations influence the dynamic behavior of cellular genetic and metabolic networks. However, transcriptomic data need to be properly integrated to formulate a model that can be used for making quantitative predictions on how the environment interacts with cellular networks to affect phenotypic responses. Ultimately, the accurate prediction of this quantitative behavior will open the possibility of re-engineering cellular circuits. To reach this end, we have attempted the integration of experimental and computational approaches to construct a predictive gene regulatory network model covering the full transcriptome of the model plant A. thaliana.
Genome-wide transcriptional control in A. thaliana
Table 1. Topological parameters of the inferred transcription network of A. thaliana
Characteristic path length
Number of connected genes
Number of regulations inferred
Network density: 7.78 × 10⁻⁴
Genes regulated by one TF
Genes regulated by two TFs
Genes regulated by three TFs
Genes regulated by four TFs
Genes regulated by five TFs
Genes regulated by more than five TFs
The performance of the inferred topology was scored with the statistic F = 2PS/(P + S), which is the harmonic mean of precision (P) and sensitivity (S). Indeed, precision and sensitivity are negatively correlated performance statistics, and the z-score used as threshold to predict the transcriptional regulations was chosen to maximize global performance (F) by selecting values > 5 (Figures S1 and S2 in Additional data file 1). Figure S3 in Additional data file 1 shows P, S and F as a function of the z-score threshold. Sensitivity is maximized (S = 100%) for z = 0 (that is, a high number of regulations but very low confidence) while precision is maximized (P = 100%) for z = 11 (that is, high confidence but a very low number of regulations). The optimum value is reached for z = 5, a value for which F = 26% (P = 40% and S = 20%). In a recent study, a smaller network topology has been proposed for A. thaliana. This network contains 18,625 regulations and an F = 3.7% (P = 88% but S = 1.8%), relative to the AtRegNet reference dataset.
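As a quick check of the arithmetic behind these figures, the harmonic mean can be evaluated directly from the quoted precision and sensitivity values. The sketch below is only illustrative; the function name `f_score` is an assumption and not part of InferGene.

```python
def f_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity: F = 2PS / (P + S)."""
    total = precision + sensitivity
    if total == 0:
        return 0.0
    return 2 * precision * sensitivity / total


# Operating point reported for the network inferred here (z = 5).
f_this_study = f_score(0.40, 0.20)

# Operating point reported for the smaller, previously proposed network.
f_previous = f_score(0.88, 0.018)

print(f"F (this study)       = {f_this_study:.3f}")
print(f"F (previous network) = {f_previous:.3f}")
```

With the rounded inputs quoted in the text, the first value comes out close to 0.27 and the second close to 0.035; small differences from the percentages given above are presumably due to rounding of P and S.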
InferGene predicts that more than half of the genes are controlled by constitutive promoters (17.89%) or by promoters regulated by fewer than three TFs (Table 1). Also, from a purely topological perspective, the inferred transcriptional network of A. thaliana is a weakly connected directed graph containing 18,169 connected genes (Table 1), while its largest strongly connected component contains only 730 nodes, all of which are TFs. In addition, its density (0.078%; Table 1), that is, the normalized average connectivity of a gene in the network, is high in comparison to values reported in similar studies on other organisms. For example, Lee et al. suggested a network density of 0.0027% for S. cerevisiae, while we previously reported a value of 0.036% for the inferred network for E. coli. Path lengths in the network follow a Gaussian distribution, with a characteristic (average) path length of 5.065 edges (Table 1; Figure S4 in Additional data file 1) and, specifically, the distance between two genes for which a path exists ranges from 1 to 13 edges. In a previous study, we estimated that the characteristic path length for the E. coli network was 1, much smaller than that for A. thaliana. Furthermore, the E. coli inferred network did not contain any strongly connected components and its largest weakly connected subnetwork contained only four TFs.
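To make the path-length statistic concrete, the following minimal sketch computes shortest directed path lengths by breadth-first search and averages them over all ordered gene pairs for which a path exists. The toy graph and the function names are assumptions chosen for illustration; they are not taken from the inferred network.

```python
from collections import deque


def path_lengths(graph):
    """Shortest directed path lengths (in edges) for every reachable ordered pair."""
    lengths = []
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        # Keep distances to every node reached from this source, except itself.
        lengths.extend(d for node, d in dist.items() if node != source)
    return lengths


def characteristic_path_length(graph):
    """Average shortest path length over pairs connected by a directed path."""
    lengths = path_lengths(graph)
    return sum(lengths) / len(lengths) if lengths else 0.0


# Hypothetical toy network: TF1 regulates TF2 and g1, TF2 regulates g2.
toy = {"TF1": ["TF2", "g1"], "TF2": ["g2"], "g1": [], "g2": []}
print(characteristic_path_length(toy), max(path_lengths(toy)))
```

On a genome-scale network the same procedure applies, although a graph library would usually be preferable for speed.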
The ten transcription factors with the most regulatory effects (highest outgoing connectivity)
GO pathways (level 5)
KAN3 (KANADI 3)
Transcription; regulation of cellular metabolic process
Transcription; regulation of cellular metabolic process; RNA metabolic process
ANAC036 (Arabidopsis NAC domain containing protein 36)
Transcription; regulation of cellular metabolic process; RNA metabolic process
Reproductive structure development; regionalization; organ development; cell fate commitment
AtTLP3 (tubby like protein 3)
Transcription; regulation of cellular metabolic process
Transcription; regulation of cellular metabolic process; RNA metabolic process
MYB29 (myb domain protein 29)
Transcription; response to gibberellin stimulus; regulation of cellular metabolic process; RNA metabolic process
Transcription; regulation of cellular metabolic process; RNA metabolic process
ATERF1/ERF1 (ethylene response factor 1)
Response to ethylene stimulus; transcription; regulation of cellular metabolic process; intracellular signaling cascade; two-component signal transduction system; RNA metabolic process
MYB121 (myb domain protein 121)
Response to abscisic acid stimulus; transcription; regulation of cellular metabolic process; RNA metabolic process
We have validated the set of predicted targets for the 25% most highly connected TFs using AtRegNet, recovering 80% of known interactions for the regulatory model and up to 85% for the effective model (that is, the one containing both gene-gene and gene-TF interactions). Figure 2c shows that the average clustering coefficient scales approximately linearly with the number of connections k on a log-log scale, over the range of 1 to 10,000 neighbors, with slope -1.05 (R² = 0.850). Barabási and Oltvai and Ravasz and Barabási have suggested that whenever clustering scales with the number of nodes with slope -1, as in our case, it has to be taken as a strong indication of hierarchical modularity (that is, genes cluster in higher-order units of different modularity), a finding that has been suggested as general for system-level cellular organization in plants. The effective model shows similar results to those for the regulatory model. The outgoing connectivity per gene follows a truncated power law with scale-free behavior up to k_c = 21.341 connections per gene and with an exponent γ = 0.765 (Table S1 in Additional data file 2; R² = 0.998, Akaike's weight > 99.99%; Figure 2e). Figure 2f shows that the incoming connectivity per gene does not present scale-free properties, as it fits a normal distribution (Table S1 in Additional data file 2; R² = 0.998, Akaike's weight > 99.99%).
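The slope quoted for the clustering coefficient is the slope of a straight line fitted in log-log coordinates. A minimal sketch of such a fit by ordinary least squares is given below; the data are synthetic values constructed to follow C(k) proportional to 1/k exactly, and are an assumption for illustration rather than values read from Figure 2c.

```python
import math


def loglog_slope(ks, cs):
    """Least-squares slope and R^2 of log10(C) against log10(k)."""
    xs = [math.log10(k) for k in ks]
    ys = [math.log10(c) for c in cs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared


# Synthetic hierarchical scaling, C(k) = 0.5 / k, so the expected slope is -1.
ks = [1, 10, 100, 1000, 10000]
cs = [0.5 / k for k in ks]
slope, r_squared = loglog_slope(ks, cs)
print(slope, r_squared)
```

A fitted slope near -1, as reported above, is what the hierarchical-modularity argument relies on.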
Clustering coefficient of different Gene Ontology pathways in A. thaliana
Number of connected genes
Number of genes
Auxin metabolic process
Response to heat
Alcohol biosynthetic process
Response to salt stress
Systemic acquired resistance
Response to other organism
Response to bacterium
Response to light stimulus
Transcriptomic profile prediction
The basic premise of our approach is to use transcriptomic data from multiple perturbation experiments (either genetic or environmental) and quantitatively measure steady-state RNA concentrations to assimilate these expression profiles into a network model that can recapitulate all observations. We also developed a test model that excludes 10% of experiments to quantify prediction power. This dataset was randomly split into two subsets. The first, larger subset contained 1,292 experiments and was used as a training set for inferring a transcription network containing 128,422 regulatory interactions. The second, smaller subset contained 144 array experiments and was used for validation purposes.
One may ask whether the predictability of our model was driven by TFs and not by non-TF genes. To test this possibility, we proceeded as follows. First, we selected a random set of 1,187 non-TF genes and used them to construct the corresponding pseudo-transcriptional network. Then we evaluated its performance as described above. The level of precision reached was indistinguishable from that of the previous model, with the distribution of relative expression error obtained fully overlapping with that shown in Figure 4b (data not shown). We conclude from this analysis that TFs do not have stronger predictive power than other genes. This could be rationalized because, in terms of mathematical equations, genes that are coexpressed with the TFs have a priori equal chances to work as regulatory elements. On the other hand, we have also constructed an effective model excluding the TFs from the set of predictors and observed that the relative expression error decreased proportionally to the number of excluded TFs.
As a second step in assessing the predictability of our test model, we computed Pearson correlation coefficients (r) between the experimental and predicted gene expression levels for all microarray experiments and observed that, as expected, genes having high r also have low Δ_g (Figure S6 in Additional data file 1). In addition, we noticed that the predictability of the expression of those genes with high r depends on a reduced set of TFs (Figure S7a in Additional data file 1 shows that the critical mass of points concentrates in a region with high r and a low number of predictors), suggesting that a selective pressure exists to introduce indirect regulations as a way to increase robustness of genetic systems to dynamic environments. Figure S7a in Additional data file 1 also shows that the model does not tend to add large numbers of regulations as a way to minimize expression error and, by contrast, the highest density of values corresponds to a rather low number of regulations (between 0 and 30). The average incoming connectivity values estimated for E. coli and S. cerevisiae were 1.56 and 2.26 regulators, respectively. The comparison of these figures with the data reported here suggests that r does not significantly increase beyond a given number of regulations.
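The per-gene correlation used here is the standard Pearson coefficient between the measured and the predicted expression of one gene across experiments. A minimal sketch follows; the two expression vectors are made-up numbers for a hypothetical gene, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical expression of one gene in four experiments (measured vs. predicted).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.1, 3.9, 6.2, 7.8]
print(round(pearson_r(observed, predicted), 3))
```

Repeating this for every gene yields the distribution of r that is summarized as a histogram in the supplementary figures.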
Nonetheless, a few genes were predicted to have more than 60 regulations. Looking at just the 20 most extremely regulated genes in Figure S7a in Additional data file 1, the results are interesting: the two most extreme cases correspond, respectively, to gypsy- and copia-like retrotransposons (89 and 83 connections to TFs, respectively), nine genes are annotated as unknown proteins, two are annotated as belonging to the F-box family but without any assigned biological process, one has been assigned as a putative protein kinase, five have been loosely assigned to transcription, translation, transport and secondary metabolism, and the only one with a well defined function is the At2g26330 locus, which encodes the ERECTA receptor of protein kinases involved in several developmental roles as well as in response to bacterial infections. Moreover, Figure S7b, c in Additional data file 1 shows a histogram of r per gene over 1,292 experiments in the training set and 144 conditions in the test set, respectively. The average r for the training set was 0.767 and was very similar for the test set (0.759). These values are in the same range as those reported in a study inferring the regulatory network (1,934 genes, including 81 regulators) for Halobacterium salinarum NRC-1 using 266 experimental conditions for the training model and 131 extra experiments as the test set. In this case r = 0.788 for the training set and r = 0.807 for the test set.
Selection of optimality in changing environments
Average incoming connectivity for the Gene Ontology pathways from all levels in A. thaliana
Number of genes
Number of TFs†
Number of TF/Number of genes‡
Number of FFLs§
Top five with the highest total number of TFs
Response to other organisms
Secondary metabolic process
Response to temperature stimulus
Anatomical structure morphogenesis
Response to radiation
Top five with the lowest total number of TFs
Glycerophospholipid metabolic process
Sulfur amino acid biosynthetic process
Cellular morphogenesis in differentiation
Indole and derivative metabolic process
Top five with the highest relative number of TFs
Defense response to fungus
Response to light intensity
Chlorophyll biosynthetic process
Porphyrin biosynthetic process
Top five with the lowest relative number of TFs
Glycerophospholipid metabolic process
Membrane lipid biosynthetic process
Sulfur compound biosynthetic process
Golgi vesicle transport
Biogenic amine metabolic process
Next, we sought to test whether the presence of FFLs indeed contributes to increasing the robustness of the gene expression of the involved genes. To do so, we have computed a score, ρ*, quantifying the robustness of gene expression for all predicted TF-gene interactions involving three nodes (Figure 6c). Figure 6e shows the distribution of the robustness score computed from the inferred regulatory network. Although it may not be apparent after visual inspection of Figure 6e, the distribution is asymmetrical (skewness 1.881 ± 0.007, P < 0.001) and strongly leptokurtic (kurtosis 1,294.051 ± 0.014, P < 0.001), indicating that its tails are much heavier than expected for a normal distribution. The data points in the upper tail correspond to the more robust interactions and, if coherent FFLs are involved in such types of interactions, they may be over-represented in this tail. This is, indeed, the case. If we look at the top 1% of values, 90.7% of them correspond to a coherent FFL. By contrast, if we look at the 1% of interactions around the mean value, only 5.7% correspond to FFLs. Interestingly, 90.2% of motifs within the bottom 1% of the distribution correspond to incoherent FFLs.
We have discussed a reverse-engineered model of the A. thaliana gene regulatory network that will aid future research focused on distinguishing, for example, the molecular targets of a plant virus from the hundreds to thousands of additional gene products that may have modified levels of gene expression as a side-effect. We have used a recent methodology to infer the global topology of transcription regulation from gene expression data to produce a kinetic model able to predict the alterations in gene expression in plants subjected to different external stimuli. Moreover, we have concluded that the A. thaliana inferred transcriptional network presents a hierarchical scale-free architecture where biological functions cluster in modules. We have identified biological functions that are highly controlled by predicted master regulators that could change their operating points in response to dynamic external factors to produce a consistent and robust response upon different stresses at the expense of decreasing the cellular replication rate. We have successfully applied the inferred model to predict the transcriptomic response of A. thaliana under all experimental conditions included in the whole dataset, and also applied the test model to predict the response in a reduced test set, producing errors of 2 to 10% relative to the experimental value (averaging across all test experiments). Thus, we believe this modeling-validation approach constitutes an important step towards understanding an organism's large-scale mode of action to cope with a generally changing environment. The network model suggests that A. thaliana promoters are regulated by multiple TFs (Table 1), a feature that has been shown to be characteristic of eukaryotic gene regulation.
We have discussed a first gene regulatory model based on a transcriptional layer and a second model that enhances this by including gene-gene interactions that provide an even more accurate prediction of gene expression. Future work will consider just the interactions between tissue-specific genes. We have also quantified the presence of network motifs and found that FFLs are overwhelmingly common, thus supporting the above notion that robustness against perturbation has been a major driving force during the evolution of plant lineages. Furthermore, we have confirmed that coherent FFLs are overwhelmingly over-represented among interactions that are robust against the knockout of regulatory TFs (Figure 6e), while incoherent FFLs are among the most sensitive interactions. Figure 6c illustrates a possible mechanism by which FFLs would confer robustness. Imagine that the B product is relevant for cell survival. On one hand, regulatory flow through C is costly because it implies producing a redundant element; on the other hand, if perturbations disrupt the direct edge between A and B, the existence of C still allows the cell to obtain the precious B without incurring a major penalty (Figure 6d). Whether a given regulatory network may be selected to contain this sort of regulatory element depends on the balance between the fitness costs and benefits associated with redundancy [41, 42]. The fact that A. thaliana network topology seems to be rich in these transcriptional regulatory elements suggests that it has been evolutionarily optimized to allow rapid responses to changes in external conditions while maintaining cellular homeostasis, and hence maximizing fitness.
The reconstruction of genome-scale regulatory models constitutes a major step towards understanding cellular behavior, but it is also useful in Synthetic Biology, where predictive models can be applied to engineer synthetic systems for biotechnological applications. InferGene provides a means to predict changes in biological processes when perturbing a cell in order to identify the effects of drugs, viral infection and herbicides on plant interactomes. It may also facilitate optimization of cellular processes for biotechnology applications that utilize the complex regulatory properties of genetic networks.
In this study, we have shown that the A. thaliana regulatory network is scale-free and clustered, both characteristic properties of hierarchical networks. We also used our model to analyze the robustness of expression levels conferred by network motifs such as the coherent FFL. Hence, the meta-analysis presented here has allowed us to identify regulatory and robust genetic structures. These results suggest that A. thaliana has evolved a high connectivity in terms of transcriptional regulations among cellular functions involved in responses and adaptation to changing environments, while gene networks constitutively expressed or less related to stress responses are characterized by a lower connectivity. We successfully applied our quantitative network model to predict the full transcriptome of the plant for a set of microarray experiments, and the quality of the predictions was evaluated by several methods.
Materials and methods
The expression level $y_i$ of gene $i$ is modeled with the ordinary differential equation

$$\frac{dy_i}{dt} = \alpha_i + \sum_j \beta_{ij} y_j - \delta_i y_i,$$

where $\alpha_i$ is its constitutive transcription rate, $\beta_{ij}$ the regulatory effect that gene $j$ has on gene $i$ and $\delta_i$ the degradation coefficient. If $j$ has no effect on the expression of $i$, then $\beta_{ij} = 0$. No cooperation between genes for regulation has been assumed. Time was conveniently scaled such that $\delta_i = 1$ and the model is assumed to be in steady state ($y_i = \alpha_i + \sum_j \beta_{ij} y_j$), since fitting the appropriate mRNA degradation constant would require time series data.
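To illustrate how the steady-state relation turns kinetic parameters into predicted expression levels, the sketch below repeatedly applies the update y_i = α_i + Σ_j β_ij y_j until the values stop changing. The three-gene feed-forward example and all parameter values are hypothetical, and this simple fixed-point iteration is an assumption for illustration: it converges only when the regulatory weights are small enough (or the network is acyclic), and it is not claimed to be the solver used by the authors.

```python
def steady_state(alpha, beta, max_iterations=1000, tol=1e-12):
    """Fixed-point iteration of y_i = alpha_i + sum_j beta_ij * y_j.

    alpha: dict gene -> constitutive rate.
    beta:  dict target gene -> dict regulator gene -> regulatory effect.
    """
    y = dict(alpha)  # start from the constitutive levels
    for _ in range(max_iterations):
        new_y = {
            gene: alpha[gene]
            + sum(effect * y[reg] for reg, effect in beta.get(gene, {}).items())
            for gene in alpha
        }
        change = max(abs(new_y[gene] - y[gene]) for gene in alpha)
        y = new_y
        if change < tol:
            break
    return y


# Hypothetical feed-forward arrangement: A regulates C and B, and C regulates B.
alpha = {"A": 1.0, "C": 0.5, "B": 0.2}
beta = {"C": {"A": 0.5}, "B": {"A": 0.3, "C": 0.5}}
print(steady_state(alpha, beta))
```

For large networks with feedback, solving the linear system (I - β) y = α directly would be the more robust choice.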
Steady-state mRNA expression profiles derived from transcriptional perturbations collected from the TAIR website were used in this study. We found 1,187 TFs by searching for the term 'transcription factor' in the functionally annotated A. thaliana genome from TAIR (version 7). The dataset contains pre-processed expression data from 1,436 hybridization experiments using the 22,810 probe sets spotted on Affymetrix's GeneChip Arabidopsis ATH1 Genome Array. For this study, we consider 22,094 genes. The arrays were obtained from NASCArrays and AtGenExpress. Data were normalized using the robust multi-array average method.
The inference procedure consisted of two nested steps. In the first step, the global network connectivity was inferred using the InferGene algorithm. This method uses mutual information with a local significance (z-score computation) to obtain the genome regulations. Hence, the potential interaction between a regulator and a gene is z-scored, constituting an estimator of the likelihood of mutual information. This approach allows some false correlations and indirect influences to be eliminated. Subsequently, we selected a z-score threshold for a cutoff. In a second step, multiple regressions were obtained to estimate the kinetic parameters of a regulatory model based on ordinary differential equations. Multilayer models were constructed to account for different types of regulations between genes and TFs. We have constructed two different models, one for transcription regulations and another to account for effective (transcription and non-transcription) regulations. In the case of non-transcriptional interactions, the Lasso method was used to avoid over-fitting and the effective interactions between genes giving the non-transcriptional layer were unveiled. To this end, we applied a simple and efficient algorithm based on the Gauss-Seidel method that reduces the number of regulators that exceeded the z-score threshold for a given gene. Note that the Lasso method enriches in TFs among the predictors of the target for 33.21% of the non-constitutive genes of A. thaliana (that is, the ratio between the number of TFs selected and the total number of predictors of a given gene is above a threshold defined as 1,187/22,094 = 0.0537). Finally, one systems biology markup language (SBML) file containing the transcriptional model and a plain text file containing the effective model were constructed and are available as supplementary files in Additional data file 3. These files can be viewed using the Cytoscape viewer for further analysis.
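The Lasso step can be pictured with a generic coordinate-descent solver, which updates one regression coefficient at a time in the manner of a Gauss-Seidel sweep. The sketch below is an assumption for illustration: it minimizes (1/2n)·||y - Xb||² + penalty·||b||₁ without an intercept, uses made-up data, and is not claimed to reproduce the authors' implementation or their choice of penalty.

```python
def soft_threshold(value, penalty):
    """Shrink value towards zero by penalty, returning 0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, sweeps=200):
    """Cyclic coordinate descent for the Lasso (no intercept)."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    residual = list(y)  # equals y - X @ coef while coef is all zeros
    for _ in range(sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue
            # Covariance of predictor j with the residual that ignores predictor j.
            rho = sum(c * (r + c * coef[j]) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, penalty) / norm
            if new != coef[j]:
                residual = [r - c * (new - coef[j]) for c, r in zip(col, residual)]
                coef[j] = new
    return coef


# Hypothetical target explained by the first candidate regulator only.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.0, 2.0, -2.0, -2.0]
print(lasso_coordinate_descent(X, y, penalty=0.1))
```

In this toy case the irrelevant second regulator receives a zero coefficient and the first is shrunk from 2.0 to about 1.9, which is the kind of pruning of candidate regulators described above.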
Notice that the transcriptional model was embedded within the effective one. Networks are constructed by placing genes as nodes and regulations as edges. For the transcriptional model, edges only go from TFs to genes (including those encoding other TFs). For the non-transcriptional model, edges connect two genes, the regulator and the target and, thus, the resulting network is directional.
The performance of the inferred model topology was evaluated using a reference network including genes with known transcriptional regulation. For this, the AtRegNet platform linking cis-regulatory elements and TFs into regulatory networks was used. Only those interactions among genes included in that reference set were evaluated. The fraction of interactions that were correctly predicted by the model (precision, P) and the fraction of all known interactions that were discovered by the model (sensitivity, S) were used to compute a performance statistic defined as F = 2PS/(P + S). We have to note that the number of transcriptional regulations experimentally confirmed and compiled in AtRegNet is quite limited, containing only 448 reported interactions between TFs and genes. Therefore, it is difficult to obtain an accurate value for the performance of the model.
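The evaluation described in this paragraph can be sketched as a comparison between two sets of directed edges. In the example below the edge lists are invented placeholders, and the function name `evaluate` is an assumption; only the definitions of P, S and F follow the text.

```python
def evaluate(predicted, reference):
    """Return (precision, sensitivity, F) of predicted edges against a reference set.

    Edges are (regulator, target) pairs. Predictions involving genes that are
    absent from the reference set are ignored, as described in the text.
    """
    reference_genes = {gene for edge in reference for gene in edge}
    scored = {
        (tf, target)
        for tf, target in predicted
        if tf in reference_genes and target in reference_genes
    }
    true_positives = len(scored & reference)
    precision = true_positives / len(scored) if scored else 0.0
    sensitivity = true_positives / len(reference) if reference else 0.0
    total = precision + sensitivity
    f = 2 * precision * sensitivity / total if total else 0.0
    return precision, sensitivity, f


# Hypothetical reference and predicted interactions.
reference = {("TF1", "g1"), ("TF1", "g2"), ("TF2", "g3"), ("TF2", "g1")}
predicted = {("TF1", "g1"), ("TF2", "g3"), ("TF1", "g3"), ("TF9", "g1")}
print(evaluate(predicted, reference))
```

Here the prediction involving TF9 is not scored because TF9 does not appear in the reference set, so precision is 2/3 and sensitivity is 2/4.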
To validate the predictive power of the methodology, we constructed two transcription models. The first was obtained by using the 1,436 microarrays for training. For the second model (the test model), 1,292 of these 1,436 microarrays were used as a training set (90%) and 144 randomly chosen ones (10%) were retained for validation studies.
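A random 90/10 partition of this kind can be sketched as follows. The integer experiment identifiers and the fixed seed are assumptions made so that the example is reproducible; the source does not state how the random selection was performed.

```python
import random


def split_experiments(experiment_ids, test_fraction=0.10, seed=0):
    """Randomly hold out a fraction of experiments for validation."""
    rng = random.Random(seed)  # fixed seed only for reproducibility of the sketch
    n_test = round(len(experiment_ids) * test_fraction)
    test = set(rng.sample(experiment_ids, n_test))
    train = [e for e in experiment_ids if e not in test]
    return train, sorted(test)


# 1,436 hybridization experiments, identified here by hypothetical integer ids.
train, test = split_experiments(list(range(1436)))
print(len(train), len(test))
```

With 1,436 experiments and a 10% hold-out, the partition sizes match the 1,292 training and 144 validation arrays stated above.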
Motif detection and analysis
The FANDOM program was used to detect motifs of three and four genes in the predicted A. thaliana regulatory model. Statistically significant motifs have z-scores > 2.
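As an illustration of what motif detection involves, the sketch below counts feed-forward loops in a small directed network and converts a count into a z-score against counts from randomized networks. It does not use FANDOM; the edge list, the randomized counts and the use of the sample standard deviation are all assumptions for illustration.

```python
from statistics import mean, stdev


def count_ffls(edges):
    """Count feed-forward loops A->B, A->C, C->B among directed edges."""
    targets = {}
    for source, target in edges:
        targets.setdefault(source, set()).add(target)
    count = 0
    for a, a_targets in targets.items():
        for c in a_targets:
            if c == a:
                continue
            for b in targets.get(c, ()):
                # B must differ from A and C and also be a direct target of A.
                if b != a and b != c and b in a_targets:
                    count += 1
    return count


def motif_z_score(observed, randomized_counts):
    """Standard score of the observed motif count against randomized networks."""
    return (observed - mean(randomized_counts)) / stdev(randomized_counts)


# Hypothetical network with a single feed-forward loop (A, C, B).
edges = {("A", "B"), ("A", "C"), ("C", "B"), ("C", "D")}
print(count_ffls(edges))

# Hypothetical motif counts from three randomized networks.
print(motif_z_score(5, [1, 2, 3]))
```

A motif would be called significant under the criterion above when this score exceeds 2.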
which is now contained in the interval [-1, 1]. Values of ρ* close to 1 would correspond to maximally robust motifs, whereas values close to zero correspond to motifs not contributing to the robustness of the network. Values close to -1 correspond to incoherent motifs, that is, gene circuits implementing antagonistic regulations.
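The distinction between coherent and incoherent motifs can be stated in terms of the signs of the three regulations in a feed-forward loop. The sketch below uses the usual sign convention (+1 for activation, -1 for repression) and the A, B, C labels of the motif in which A regulates B both directly and through C; the function name is an assumption, and the relation between this classification and the ρ* score is not implied.

```python
def ffl_type(sign_ab, sign_ac, sign_cb):
    """Classify a feed-forward loop from the signs (+1 or -1) of its edges.

    The loop is coherent when the direct effect of A on B has the same sign
    as the indirect effect through C, and incoherent (antagonistic) otherwise.
    """
    indirect = sign_ac * sign_cb
    return "coherent" if sign_ab == indirect else "incoherent"


print(ffl_type(+1, +1, +1))  # all activations
print(ffl_type(+1, +1, -1))  # A activates B directly but represses it through C
```

In the kinetic model, these signs would correspond to the signs of the fitted regulatory effects between the three genes.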
Additional data files
- F: performance statistic (harmonic mean of precision and sensitivity)
- P: precision
- S: sensitivity
- TAIR: The Arabidopsis Information Resource
We thank J Forment for help with computer resources and MA Blázquez for critical reading of the manuscript and useful suggestions. This work was supported by grants BFU2006-14819-C02-01/BMC and TIN2006-12860 from the Spanish Ministerio de Ciencia e Innovación to SFE and AJ, respectively; FP6-NESTs 043340 (BioModularH2) and 043338 (Emergence), FP7-KBBE-212894 (Tarpol), the Structural Funds of the European Regional Development Fund (ERDF), the 91-A3405-ATIGE Genopole/UEVE and the MIT-France grants to AJ. JC, GR and AJ acknowledge the HPC-Europa program (RII3-CT-2003-506079). GR was supported by a graduate fellowship from the Generalitat Valenciana and an EMBO Short-term fellowship (reference ASTF-343.00-2007). SFE also acknowledges support from the Santa Fe Institute.
- Gutiérrez-Ríos RM, Rosenblueth DA, Loza JA, Huerta AM, Glasner JD, Blattner FR, Collado-Vives J: Regulatory network of Escherichia coli: consistency between literature knowledge and microarray profiles. Genome Res. 2003, 13: 2435-2443. 10.1101/gr.1387003.PubMedPubMed CentralView ArticleGoogle Scholar
- Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, Zeitlinger J, Jennings EG, Murray HL, Gordon DB, Ren B, Wyrick JJ, Tagne JB, Volkert TL, Fraenkel E, Gifford DK: Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2002, 298: 799-804. 10.1126/science.1075090.PubMedView ArticleGoogle Scholar
- Kim SK, Lund J, Kiraly M, Duke K, Jiang M, Stuart JM, Eizinger A, Wylie BN, Davidson GS: A gene expression map for Caenorhabditis elegans. Science. 2001, 293: 2087-2092. 10.1126/science.1061603.PubMedView ArticleGoogle Scholar
- Ma S, Gong Q, Bohnert HJ: An Arabidopsis gene network based on the graphical Gaussian model. Genome Res. 2007, 17: 1614-1625. 10.1101/gr.6911207.PubMedPubMed CentralView ArticleGoogle Scholar
- Mentzen WI, Wurtele ES: Regulon optimization in Arabidopsis. BMC Plant Biol. 2008, 8: 99-10.1186/1471-2229-8-99.PubMedPubMed CentralView ArticleGoogle Scholar
- Lee HK, Hsu AK, Sajdak J, Qin J, Pavlidis P: Coexpression analysis of human genes across many microarray datasets. Genome Res. 2004, 14: 1085-1094. 10.1101/gr.1910904.
- Eisen M, Spellman P, Brown P, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA. 1998, 95: 14863-14868. 10.1073/pnas.95.25.14863.
- Alon U, Barkai N, Notterman D, Gish K, Ybarra S, Mack D, Levine A: Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA. 1999, 96: 6745-6750. 10.1073/pnas.96.12.6745.
- Ben-Dor A, Shamir R, Yakhini Z: Clustering gene expression patterns. J Comput Biol. 1999, 6: 281-297. 10.1089/106652799318274.
- D'haeseleer P, Liang S, Somogyi R: Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics. 2000, 16: 707-726. 10.1093/bioinformatics/16.8.707.
- Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N: Revealing modular organization in the yeast transcriptional network. Nat Genet. 2002, 31: 370-377.
- Butte A, Kohane I: Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. Pac Symp Biocomput. 2000, 5: 418-429.
- Basso K, Margolin AA, Stolovitzky G, Klein U, Dalla-Favera R, Califano A: Reverse engineering of regulatory networks in human B cells. Nat Genet. 2005, 37: 382-390. 10.1038/ng1532.
- Margolin A, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A: ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics. 2006, 7 Suppl 1: S7. 10.1186/1471-2105-7-S1-S7.
- Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS: Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol. 2007, 5: e8. 10.1371/journal.pbio.0050008.
- Meyer PE, Kontos K, Lafitte F, Bontempi G: Information-theoretic inference of large transcriptional regulatory networks. EURASIP J Bioinform Syst Biol. 2007, 79879.
- Husmeier D: Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics. 2003, 19: 2271-2282. 10.1093/bioinformatics/btg313.
- Yu J, Smith V, Wang P, Hartemink A, Jarvis E: Advances to Bayesian network inference for generating causal networks from observational biological data. Bioinformatics. 2004, 20: 3594-3603. 10.1093/bioinformatics/bth448.
- Fujita A, Sato JR, Garay-Malpartida HM, Yamaguchi R, Miyano S, Sogayar MC, Ferreira CE: Modeling gene expression regulatory networks with the sparse vector autoregressive model. BMC Syst Biol. 2007, 1: 39. 10.1186/1752-0509-1-39.
- Steinke F, Seeger M, Tsuda K: Experimental design for efficient identification of gene regulatory networks using sparse Bayesian models. BMC Syst Biol. 2007, 1: 51. 10.1186/1752-0509-1-51.
- Ma S, Bohnert HJ: Integration of Arabidopsis thaliana stress-related transcript profiles, promoter structures, and cell-specific expression. Genome Biol. 2007, 8: R49. 10.1186/gb-2007-8-4-r49.
- Gardner T, di Bernardo D, Lorenz D, Collins JJ: Inferring genetic networks and identifying compound mode of action via expression profiling. Science. 2003, 301: 102-105. 10.1126/science.1081900.
- di Bernardo D, Thompson MJ, Gardner TS, Chobot SE, Eastwood EL, Wojtovich AP, Elliott SJ, Schaus SE, Collins JJ: Chemogenomic profiling on a genome-wide scale using reverse-engineered gene networks. Nat Biotechnol. 2005, 23: 377-383. 10.1038/nbt1075.
- Bonneau R, Reiss D, Shannon P, Facciotti M, Hood L, Baliga N, Thorsson V: The Inferelator: an algorithm for learning parsimonious regulatory networks from systems-biology datasets de novo. Genome Biol. 2006, 7: R36. 10.1186/gb-2006-7-5-r36.
- Carrera J, Rodrigo G, Jaramillo A: Model-based redesign of global transcription regulation. Nucleic Acids Res. 2009, 37: e38. 10.1093/nar/gkp022.
- Bonneau R: A predictive model for transcriptional control of physiology in a free living cell. Cell. 2007, 131: 1354-1365. 10.1016/j.cell.2007.10.053.
- Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003, 4: 249-264. 10.1093/biostatistics/4.2.249.
- Shannon P, Markiel A, Ozier O, Baliga N, Wang J, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13: 2498-2504. 10.1101/gr.1239303.
- Albert R, Barabási AL: Statistical mechanics of complex networks. Rev Mod Phys. 2002, 74: 47-97. 10.1103/RevModPhys.74.47.
- Albert R: Scale-free networks in cell biology. J Cell Sci. 2005, 118: 4947-4957. 10.1242/jcs.02714.
- Barabási AL, Oltvai ZN: Network biology: understanding the cell's functional organization. Nat Rev Genet. 2004, 5: 101-113. 10.1038/nrg1272.
- Khanin R, Wit E: How scale-free are biological networks. J Comput Biol. 2006, 13: 810-818. 10.1089/cmb.2006.13.810.
- Ravasz E, Barabási AL: Hierarchical organization of complex networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2003, 67: 026112. 10.1103/PhysRevE.67.026112.
- Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabási AL: Hierarchical organization of modularity in metabolic networks. Science. 2002, 297: 1551-1555. 10.1126/science.1073374.
- Oltvai ZN, Barabási AL: Systems biology. Life's complexity pyramid. Science. 2002, 298: 763-764. 10.1126/science.1078563.
- Kashtan N, Itzkovitz S, Milo R, Alon U: Topological generalizations of network motifs. Phys Rev E Stat Nonlin Soft Matter Phys. 2004, 70: 031909. 10.1103/PhysRevE.70.031909.
- Mangan S, Alon U: Structure and function of the feed-forward loop network motif. Proc Natl Acad Sci USA. 2003, 100: 11980-11985. 10.1073/pnas.2133841100.
- Mangan S, Zaslaver A, Alon U: The coherent feedforward loop serves as a sign-sensitive delay element in transcription networks. J Mol Biol. 2003, 334: 197-204. 10.1016/j.jmb.2003.09.049.
- Hayot F, Jayaprakash C: A feedforward loop motif in transcriptional regulation: induction and repression. J Theor Biol. 2005, 234: 133-143. 10.1016/j.jtbi.2004.11.010.
- Alon U: Network motifs: theory and experimental approaches. Nat Rev Genet. 2007, 8: 450-461. 10.1038/nrg2102.
- Sanjuán R, Elena SF: Epistasis correlates to genomic complexity. Proc Natl Acad Sci USA. 2006, 103: 14402-14405. 10.1073/pnas.0604543103.
- Dekel E, Alon U: Optimality and evolutionary tuning of the expression level of a protein. Nature. 2005, 436: 588-592. 10.1038/nature03842.
- Bar-Joseph Z: Analyzing time series gene expression data. Bioinformatics. 2004, 20: 2493-2503. 10.1093/bioinformatics/bth283.
- TAIR. [http://www.arabidopsis.org/]
- ATH1 Genome Array. [http://www.affymetrix.com/products_services/arrays/specific/arab.affx]
- NASCArrays. [http://affymetrix.arabidopsis.info/narrays/experimentbrowse.pl]
- AtGenExpress. [http://www.arabidopsis.org/info/expression/ATGenExpress.jsp]
- Tibshirani R: Regression shrinkage and selection via the Lasso. J R Stat Soc B. 1996, 58: 267-288.
- Shevade SK, Keerthi SS: A simple and efficient algorithm for gene selection using sparse logistic regression. Bioinformatics. 2003, 19: 2246-2253. 10.1093/bioinformatics/btg308.
- Hucka M, Bolouri H, Finney A, Sauro HM, Doyle JC, Kitano H, Arkin AP, Bornstein BJ, Bray D, Cornish-Bowden A, Cuellar AA, Dronov S, Gilles ED, Ginkel M, Gor V, Goryanin II, Hedley WJ, Hodgman TC, Hofmeyr JH, Hunter PJ, Juty NS, Kasberger NS, Kremling S, Kummer U, Le Novère N, Loew LM, Lucio D, Mendes P, Minch E, Mjolsness ED, et al: The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003, 19: 524-531. 10.1093/bioinformatics/btg015.
- AtRegNet. [http://arabidopsis.med.ohio-state.edu/RGNet]
- Wernicke S, Rasche F: FANMOD: a tool for fast network motif detection. Bioinformatics. 2006, 22: 1152-1153. 10.1093/bioinformatics/btl038.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.