We developed a computational systems biology platform, named NetAct, to construct transcription factor (TF)-based GRNs using TF activity. The method uniquely integrates both generic TF-target relationships from literature-based databases and context-specific gene expression data. NetAct also integrates our previously developed mathematical modeling algorithm RACIPE to evaluate whether the constructed network functions properly as a dynamical system. It evaluates the roles of every gene in the network by in silico perturbation analysis. NetAct has three major steps: (1) identifying the core TFs using gene set enrichment analysis (GSEA) [39] with an optimized TF-target gene set database (Fig. 1a), (2) inferring TF activity (Fig. 1b), and (3) constructing a core TF network (Fig. 1c). Then, the network is validated and analyzed by simulating its dynamics using mathematical modeling by RACIPE. Details of each step are given in the “Methods” section and Additional file 1: Supplementary Note 5. Below, we demonstrate how we optimized the NetAct algorithm, compared its performance of activity inference with three existing methods using in silico gene expression data, and applied the network modeling approach to two biological datasets.
Literature-based TF-target relationships facilitate TF inference
To establish a comprehensive gene set database containing TF-target relationships, we considered data from different sources (Additional file 3: Table S1, also see Additional file 1: Supplementary Note 1). They are (D1) a literature-based database, consisting of data from TRRUST [40], RegNetwork [41], TFactS [42], and TRED [43]; (D2) a gene regulatory network database FANTOM5 [44], whose interactions are extracted from networks constructed using RNA expression data from 394 individual tissues; (D3) a database derived from resources of putative TF binding targets, including ChEA [45], TRANSFAC [46], JASPAR [47], and ENCODE [48]; and (D4) a database derived from motif-enrichment analysis, RcisTarget [35]. These databases have been frequently used to study the transcriptional regulations and have already been utilized for network construction [29, 49].
We evaluated the performance of these databases by GSEA on a benchmark gene expression dataset. GSEA is a popular statistical method that can be used to evaluate the significant overlapping between a set of genes and differentially expressed genes between two experimental conditions. Using various types of TF-target databases, our goal is to find the best version of the database, so that GSEA can detect the target gene sets of the relevant TFs to be statistically significant. The benchmark dataset, denoted as set B, consists of a compilation of 12 microarray and 32 RNA-seq gene expression data (Additional file 3: Table S2). Each of these datasets contains at least three samples under the normal condition (control) and three samples under the treatment condition in which a specific TF is treated by knockdown (KD). We applied GSEA (with slight modifications, details in the “Methods” section) on set B to evaluate whether the enrichment analysis can detect the perturbed TFs. The underlying assumption is that, with a better TF-target gene set database, GSEA will be more likely to detect the corresponding perturbed TFs. For each TF-target database and each gene expression data in set B, we calculated the q-values of all the TFs in the database by GSEA to determine whether the target genes of the perturbed TF are enriched in the differentially expressed genes. We found that more significant q-values are usually associated with relatively larger number of targets for each TF; however, too many (e.g., greater than 2000) targets will result in non-significant q-values. The summary statistics, such as the total number of TFs and the average number of target genes per TF, are summarized in Additional file 3: Table S1. Furthermore, these corresponding q-values from all the gene expression data are converted to specificity and sensitivity values (see the “Methods” section), and different databases are compared based on the area under the sensitivity-specificity curves (Fig. 1d). We found that the literature-based database has the best overall performance; thus, we used this database for further analyses. Our results are in line with a previous benchmark study [50] that literature-based TF-target database outperforms others in capturing transcriptional regulation.
Inferring TF activity without using TF expression
NetAct can accurately infer TF activity for an individual sample directly from the expression of genes targeted by the TF (see the “Methods” section). Here, we will illustrate how NetAct infers TF activity on two cases of microarray KD experiments—one case for shRNA KD of FOXM1 and shRNA KD of MYB in lymphoma cells (GEO: GSE17172 [51]), and another case for KD of BCL6 on both OCI-Ly7 and Pfeiffer GCB-DLBCL cell lines (GEO: GSE45838 [2]). NetAct first successfully identified the TFs that undergo knockdown in each case, i.e., FOXM1, MYB, and BCL6, by applying GSEA on the optimized TF-target database (q-value < 0.15).
Next, for each identified TF, NetAct calculates its activity using the mRNA expression of the direct targets of the TF. We first constructed a Spearman correlation matrix from the expression of the targeted genes. As shown in Fig. 2a, the correlation matrix after hierarchical clustering analysis typically consists of two red diagonal blocks, two blue off-diagonal blocks, and the remaining elements with low correlations which will be filtered out subsequently (details in the “Methods” section). Within the red blocks, the expression of any column gene is positively correlated with that of any row gene, while within the blue blocks, the expression of any column gene is negatively correlated with that of any row gene. This indicates that the genes in the two red blocks are anti-correlated in the gene expression with each other. However, if the correlation matrix is constructed from 100 or 200 randomly selected genes (Fig. 2b, c), such a clear pattern disappears. Thus, our observation suggests that genes from one of the red blocks are activated by the TF, whereas genes from the other block are inhibited by the TF. Moreover, filtered genes are not likely to be directly targeted by the TF in this context, or they are regulated by multiple factors simultaneously and are thus likely not a good indicator for the TF activity.
We further evaluated how the filtering step removes noise and retains the important genes in the analysis. We found that, after the filtering step, most of the differentially expressed (DE) genes are retained, as evidenced by Fig. 2d. Here, DE genes from each comparison were retrieved by using limma with a cutoff for the adjusted p-values at 0.05 and a cutoff for the log2 fold changes at 2. Subsequently, for DE TFs, we evaluated the Spearman correlations between the TFs and the corresponding targeted genes. In traditional approaches (such as ARACNe [1], WGCNA [52], and BEST [53]), the co-expression between a TF and its targeted genes is commonly used to identify its association and assign the sign (activation or inhibition) of the regulation. We found that, for each TF, most of the genes in a block either positively correlate with the TF expression (Fig. 2f, g, blue bars), or they negatively correlate with the TF expression (Fig. 2f, g, red bars). The tests demonstrate that, without directly using TF expression, NetAct can successfully identify two groups of important target genes—genes in each group are either activated or inhibited by the TF. These two groups of genes are further used to infer TF activity by a weighted average of their gene expression (Eq. 1 in the “Methods” section). Additionally, we found that the correlations between inferred TF activity and target expression are usually higher than the correlations between TF expression and target expression (Fig. 2h).
Evaluating activity inference and network construction in a simulation benchmark
To evaluate the accuracy and robustness of inferred TF activity, we performed extensive benchmark tests to compare NetAct with other existing methods. We first performed the benchmark tests on simulated data because TF activity is usually not directly measurable. The activity of a TF can be related to its protein level or the level of a particular posttranslational modification, such as phosphorylation. Therefore, it is very difficult to obtain the ground truth of TF activity from an experimental dataset. Thus, in this benchmark test, we rely on mathematical modeling to simulate both the expression and activity of each TF from a synthetic TF-target network. With this simulated data, we benchmark NetAct against other methods.
To establish the simulated benchmark dataset, we first constructed a synthetic TF-target network with a total of 30 TFs. Each TF has 20 target genes randomly selected with replacement from a pool of 1000 genes. In addition, each TF also regulates two (randomly selected) of the 30 TFs. This synthetic network has a hierarchical structure, where a target gene may be co-regulated by multiple TFs. The type of each TF-to-TF regulation is either excitatory, inhibitory, or signaling, with a chance of 25%, 25%, and 50%, respectively; the type of each TF-to-target regulation is either excitatory or inhibitory with a 50% chance for each. Here, the signaling regulation changes the activity of a TF without changing its expression, whereas the excitatory or inhibitory interactions change both the activity and expression. From one realization of the synthetic network generation, the synthetic GRN contains a total of 477 genes (30 TFs, 447 targeted genes) and 660 regulatory links (Fig. 3a). See Additional file 1: Supplementary Note 4 for more details.
To simulate the gene expression of the TF-target network, we applied a generalized version of the mathematical modeling algorithm, RACIPE [37]. Using the network topology as the only input, RACIPE can generate an ensemble of random models, each corresponds to a set of randomly sampled parameters. Here, we used RACIPE to generate simulated data including gene expression and TF activity for the benchmark. Some previous studies have also adopted a similar modeling approach for benchmarking [54, 1]. To consider the effects of a signaling regulatory link, we generalized RACIPE to simulate both expression and activity for each TF. See Additional file 1: Supplementary Note 5 for more details.
In the benchmark test, we used RACIPE to simulate 100 models with randomly generated kinetic parameters. From these 100 models, we obtained 83 stable steady-state gene expression and activity profiles for the 477 genes. As expected, TF activity and target activity from a regulatory link are correlated (1st column, 2nd row in Fig. 3b), TF activity and target expression (3rd column, 2nd row in Fig. 3b) are correlated, and the expression of two target genes (Fig. 3c) are correlated. However, there is no strong correlation between TF expression and target expression (2nd column, 2nd row in Fig. 3b) and, for a signaling regulatory link, between TF activity and target expression (3rd column, 4th row in Fig. 3b). Next, we applied ARACNe to predict the regulon (i.e., the list of targeted genes by a specific TF) using either the simulated expression profiles or the simulated activity profiles. We found that the regulons predicted from the activity profiles are substantially more similar to the predefined ground truth regulons (measured by the Jaccard index [55]) than those predicted from the expression profiles (Fig. 3d). The results indicate the need of using the TF activity, instead of TF expression, to identify TF-target relationships.
Next, we compared the performance of NetAct with several related algorithms, NCA, VIPER, and AUCell, in inferring TF activity using both the simulated expression profiles from the 83 models and a predefined regulon (i.e., the association of each TF with its target genes) (details for the implementation of these algorithms in Additional file 1: Supplementary Note 3). The predicted activity was then compared with the simulated activity (ground truth) to evaluate the performance. To mimic the real-life scenario where the target information may not be complete and accurate, we consider more challenging tests where the regulon data is randomly perturbed. Here, for a specific perturbation level, we generated 100 sets of regulon data by replacing a certain number of target genes for each TF with non-interacting genes. The numbers of replaced genes are 0 (0% level of perturbation), 5 (25%), 10 (50%), and 15 (75%) in different tests. We then evaluated the performance of NetAct, NCA, and VIPER. AUCell protocol advises to include the target genes with only positive interactions in the regulons. To satisfy this criterion, we updated the regulons for both unperturbed and perturbed regulons. For the unperturbed regulons, we retained only the positive interactions; for the perturbed regulons, we retained the positive target genes that were not replaced and a random half of the replaced target genes (assuming that half of the genes are positively regulated by the TF). We then evaluated AUCell performance using these updated regulons (denoted AUCell 1) and non-updated regulons (denoted AUCell 2). As shown in Fig. 4a (also Additional file 2: Figs. S3-S6), NetAct significantly outperforms each of the other methods in reproducing the simulated activity profiles at each perturbation level. As expected, the performance of NetAct is decreased by increasing the perturbation levels of the regulon data; however, NetAct still performs reasonably well even when only 25% of the actual target genes are kept in the regulon data. The results indicate that NetAct can robustly and accurately infer TF activity even with a noisy TF-target database.
Furthermore, we tested another scenario where the test data contains simulated data from two experimental conditions, e.g., one representing an unperturbed condition and the other representing a perturbed condition. Here, we used the same synthetic network but compiled 40 expression and activity data from the abovementioned simulation (unperturbed condition), together with 43 expression and activity data from the simulations in which a specific TF (TF9) is knocked down (perturbed condition). We then performed a similar test as above and found that NetAct outperformed each of the other methods (Additional file 2: Fig. S2, Additional file 2: Fig. S7a). The notable performance gain of NetAct mainly emanates from the removal of incoherent (or noisy) targets of a TF before the activity calculation in NetAct (see the “Methods” section).
In addition, we performed a network construction benchmark of NetAct and a few other network construction algorithms using the in silico simulation dataset, as shown in Fig. 4b–d. NetAct, using the TF activity inferred from the original regulon database, outperforms not only network construction methods using gene expression, such as GENIE3 [56], GRNBoost2 [57], and ppcor [58, 59], but also GENIE3 using the TF activity inferred by AUCell (Fig. 4b). The last approach was presented to mimic a popular method SCENIC. Moreover, we evaluated the performance of NetAct when using a perturbed regulon database. We found that NetAct remains performing well when the perturbation level is as large as 50%, when evaluated by all the ground truth interactions (Fig. 4c) and by those not presented in the regulon database (Fig. 4d). The latter case was designed to evaluate the capability of NetAct in predicting novel interactions. We observed similar outcomes for the case of the second scenario of the simulation data from two conditions (Additional file 2: Fig. S7b-d, see Additional file 1: Supplementary Note 6 for details of the benchmark method). In summary, our in silico benchmark test demonstrates the high performance of NetAct over existing state-of-the-art methods in both inferring TF activity and gene regulatory networks.
Characterizing cellular state transitions by GRN construction and modeling
In the previous sections, we demonstrated the capability of NetAct in identifying the key TFs and predicting TF activity. With these data, NetAct further constructs a TF-based GRN using the mutual information (MI) of the activity from the identified TFs (details in the “Methods” section). We then applied RACIPE to the constructed network to check whether the simulated network dynamics are consistent with the experimental observations. Below, we show the utility of NetAct with two biological examples: epithelial-mechanical transition (EMT) and macrophage polarization.
In the first case (EMT), we analyzed a set of time-series microarray data on A549 epithelial cells undergoing TGF-β-induced epithelial-mesenchymal transition (EMT) (GEO: GSE17708) [60]. According to the overall structure of the transcriptomics profiles, we arranged the samples from different time points into three groups—early stage (time points 0 h, 0.5 h, and 1 h), middle stage (time points 2 h, 4 h, and 8 h), and late stage (time points 16 h, 24 h, and 72 h). We then performed three-way GSEA with our human literature-based TF-target database to identify the enriched TFs that are active between early-middle, early-late, and middle-late time points. Forty-one TFs (q-value cutoff 0.01) were identified including many major transcriptional master regulators, such as BRCA1, CTNNB1, MYC, TWIST1, TWIST2, and ZEB1, and factors that are directly associated with TGF-β signaling pathways, such as SMAD3 [61], FOS, and JUN [62]. The hierarchical clustering analysis (HCA) of the expression and activity profiles for these TFs is shown in Fig. 5a. While the expression profiles are quite noisy, the activities show a clear gradual transition from the epithelial (E) to mesenchymal (M) state. Note that the signs of the activity of a few non-DE TFs were flipped according to experimental evidence of protein-protein interactions and the nature of transcriptional regulation (see the “Methods” section for detailed procedures and Additional file 3: Table S3 for a list of the changes).
We then constructed a TF regulatory network (Fig. 5b) and performed mathematical modeling to simulate the dynamical behavior of the network using RACIPE (Fig. 5c, d). We found that, consistent with the expression and activity profiles (Fig. 5a), the network clearly allows two distinct transcriptional clusters that can be associated with E (the yellow cluster in Fig. 5d) and M states (the blue cluster in Fig. 5d). To assess the role of TGF-β signaling in inducing EMT, we performed a global bifurcation analysis [29] in which the SMAD3 level is used as the control parameter (Fig. 5c). Here, SMAD3 was selected as it is the direct target of TGF-β signaling [61]. As shown in Fig. 5c, when SMAD3 level is either very low or high, the cells reside in E or M states. However, when SMAD3 is at the intermediate level, the cells could be driven into some rare hybrid phenotypes. These results are consistent with our previous studies on the hybrid states of EMT [32, 63]. Using RACIPE, we systematically performed perturbation analyses by knocking down every TF in the network. Our simulation results (Fig. 5e) suggest that knocking down TFs, such as RELA, SP1, EGR1, and CREBBP, has major effects in driving M to E transition (MET), while knocking down TFs, such as TP53, AR, and KLF4, has major effects in driving E to M transition (EMT). These predictions are all consistent with existing experimental evidence (Additional file 3: Table S4).
Compared to a previous model of the EMT network based on an extensive literature survey [19], the GRN constructed by NetAct identified some of the same regulators induced by the TGF-β pathway, such as SMAD3/4, TWIST2, ZEB1, CTNNB1, NFKB1, RELA, FOS, and EGR1. Because of the lack of microRNAs and protein-protein interactions in the database, NetAct did not identify factors like miR200 and signaling molecules like PI3K. Interestingly, the NetAct model identifies STAT1/3, which was connected to other signaling pathways, such as HGF, PDGF, IGF1, and FGR, but not TGF-β in the previous network model. In addition, the NetAct model identified regulators in other important pathways in TGF-β-induced EMT in cancer cells, e.g., cell cycle pathway (RB1 and E2F1) and DNA damage pathway (P53).
In the second case, we studied the macrophage polarization program in mouse bone marrow-derived macrophage cells using time series RNA-seq data (GEO: GSE84517) [64]. In this experiment, macrophage progenitor cells (denoted as UT condition) were treated with (1) IFNγ to induce a transition to the M1 state, (2) IL4 to induce a transition to the M2 state, and (3) both IFNγ and IL4 to induce a transition to a hybrid M state. Here, we reprocessed the raw counts of RNA-seq with a standard protocol (details in Additional file 1: Supplementary Note 2). From principal component analysis (PCA) on the whole transcriptomics (Fig. 6b), we found that the gene expression undergoes distinct trajectories when macrophage cells were treated with either IFNγ (M1 state) or IL4 (M2 state). When both IFNγ and IL4 were administered, the gene expression trajectories are in the middle of the previous two trajectories, suggesting that cells are in a hybrid state (hybrid M state). We aim to use NetAct to elucidate the crosstalk in transcriptional regulation downstream of cytokine-induced signaling pathways during macrophage polarization.
Here, we applied GSEA on six comparisons—untreated versus IFNγ-treated samples (one comparison between the untreated and the treated after 2 h, another between the untreated and the treated after 4 h, same for the other comparisons), untreated versus IL4-treated samples, and untreated versus IFNγ + IL4-treated samples. Using our mouse literature-based TF-target database, we identified 79 TFs (q-value cutoff 0.05 for UT vs IL4-2 h and 0.01 for all others). The expression and activity profiles of these TFs (Fig. 6a–c) capture the essential dynamics of transcriptional state transitions during macrophage polarization as follows. NetAct successfully identified important TFs in these processes, including Stat1, the major target of IFNγ, Stat2, Stat6, Cebpb, Nfkb family members, Hif1a, and Myc [65,66,67]. Myc is known to be induced by IL-4 at later phases of M2 activation and required for early phases of M1 activation [66]. Interestingly, we find Myc has high expression in both IL4 stimulation and its co-stimulation with IFN but its activity is high only in IL4 stimulation. We then constructed a TF regulatory network that connects 60 TFs (Fig. 6d) and simulated the network with RACIPE, from which we found that simulated gene expression (Fig. 6f) matches well with experimental gene expression data (Fig. 6a) (see Additional file 1: Supplementary Note 7). RACIPE simulations display disparate trajectories from UT to IL4 or IFNγ activation and stimulation with both IL4 and IFNγ. Strikingly, we found in the simulation that there is a spectrum of hybrid M states between M1 and M2 (Fig. 6e), which is consistent with the experimental observations of macrophage polarization [65]. Moreover, we also predict from our GRN modeling that the transition from UT to hybrid M is likely to first undergo a transition to either M1 or M2 before a second transition to hybrid M (Fig. 6e). This is because of our observation from the simulation data that there are fewer models connecting UT and hybrid M than any of the other two routes (i.e., UT to M1, and UT to M2) (Additional file 2: Fig. S10). Taken together, we showed that the NetAct-constructed GRN model captures the multiple cellular state transitions during macrophage polarization.
In conclusion, we show that NetAct can identify the core TF-based GRN using both the literature-based TF-target database and the gene expression data. We also demonstrate how RACIPE-based mathematical modeling complements NetAct-based GRN inference in elucidating the dynamical behaviors of the inferred GRNs. Together, these two methods can be applied to infer biologically relevant regulatory interactions and the dynamical behavior of biological processes.