A drought-responsive metabolome landscape of a global diverse maize association panel
To understand how metabolites function in maize drought response and tolerance, we grew a maize set of 385 natural inbred lines in a greenhouse under WW and DS conditions (Additional file 1, Figure S1a; see “Materials and methods”) [24]. This maize set consisted of 135 temperate inbred lines, 153 tropic/subtropic lines, and 97 lines with mixed origin, showing broad, diverse genetic compositions (Additional file 2, Table S1) [24]. Leaf samples were harvested and subjected to liquid chromatography- / high-resolution mass spectrometry (LC-HRMS) analysis with MTBE extracts, which can measure a broad range of metabolites. After the metabolites detected in less than 80% samples were filtered out, a total of remaining 3890 nonredundant metabolites were finally extracted from 770 libraries of the 385 inbred lines grown under both WW and DS conditions, including 353 putatively annotated metabolites (Fig. 1a; Additional file 2, Table S2). We conducted principal component analysis (PCA) for the maize inbred lines with all these 3890 metabolites. The results showed a clear separation between the WW and DS conditions, with that principal component (PC) 1 explaining the largest proportion (16.37%) of the observed phenotypic variance (Fig. 1b). These data indicated that leaf metabolites in the maize population were very sensitive to drought. In line with our results, a previous study showed that drought had a great effect on metabolite production in developing maize kernels [21].
To gain insights into the metabolic responses of maize to drought, we combined different assays to investigate the changes of all the detected 3890 metabolites in the population grown under WW and DS conditions. Orthogonal partial least squares discriminant analysis (OPLS-DA) indicated that 32.5% of the metabolite features contributed to drought response (VIP ≥ 1, 1267/3890). Paired t-test showed that most of the metabolite features were significantly influenced by drought stress, with 37.4% (1301/3483) showing notable changes in accumulation (|Fold Change| ≥ 2, false discovery rate [FDR] < 0.05). In total, 1035 metabolites were detected as drought-responsive metabolites from the intersection part of the results based on the above assays (Fig. 1c; Additional file 1, Figure S1b; Additional file 2, Table S3). Notably, most (83.3%, or 862/1035) metabolites showed upregulation patterns in response to drought (Fig. 1c), indicating that upregulation of genome-wide metabolites tends to have dominant roles in control of maize drought response, which might confer plant drought tolerance. Of these metabolites, 71 were annotated and belonged to broad range of classes, including phenolamides, lipids, sugars, amino acids, benzoxazinoids, polyphenols, organic acids, nucleic acids, flavonoids, and vitamins (Fig. 1d; Additional file 2, Table S2). Interestingly, the drought-responsive metabolites from the phenolamide, lipid, sugar, benzoxazinoids, and nucleic acid families only showed upregulation patterns, while drought-responsive vitamins were only downregulated (Fig. 1d).
Previous studies have reported the drought-tolerant phenotype survival rate of most of our 385 inbred lines (Additional file 2, Table S1) [9]. We further investigated dry masses and fresh weights as drought-tolerant indices of the 385 inbred lines in drought experiments (Additional file 2, Table S1, see “Materials and methods”). Based on these drought-tolerant indices, we performed maize drought tolerance prediction with 1035 drought-responsive metabolites, 3890 total metabolites, and 1035 randomly selected ones via rrBLUP approach (see “Materials and methods”) [25]. The results showed that the prediction accuracy for drought tolerance, especially for survival rate, using 1035 drought-responsive metabolites, was significantly higher than those using the total or randomly selected metabolites (Fig. 1e). In addition, we found that a combination of 15 m-traits could explain more than 60% of the phenotypic variance of survival rate (Additional file 1, Figure S2; Additional file 2, Table S4). Together, these data suggested that the drought-responsive metabolites could efficiently reflect maize response and tolerance to drought, and our subsequent analyses were focused on the 1035 drought-responsive metabolites accordingly.
Genetic basis of the metabolome variations underlying maize drought response
Previous studies have reported ~ 1.25 M single-nucleotide polymorphism (SNP) markers of the maize lines used in this study [24]. To unravel how the drought-responsive metabolites are genetically controlled, we performed GWAS to detect significant associations between SNP markers and the drought-responsive metabolites (or m-traits) with mixed linear model (MLM) controlling population structure (Q) and kinship (K) using TASSEL5 (https://tassel.bitbucket.io/). Around 88% (921/1035) of the m-traits had at least one significantly associated locus (MLM, N = 385, P < 2.1 × 10−6). In total, 7811 distinct significant SNPs associated with 921 m-traits were detected (Fig. 2a; Additional file 2, Table S5). More significant SNPs (50.6% or 3951/7811) were detected by GWAS under DS condition as compared to those under WW condition (38.7%, or 2955/7811), and only 11.6% (905/7811) of the SNPs were statically detected under both conditions (Additional file 1, Figure S3a), indicating that the m-trait-SNP associations were very sensitive to the growth environment. Each SNP marker explained 5.9 ~ 35.8% (median, 7.8%) of the observed phenotypic variance of the m-traits (Additional file 1, Figure S3b). After significant SNPs within 10 kb flanking region were merged, a total of 3415 loci were identified as mQTLs (Additional file 2, Table S6). 21.4% (731/3415) of these QTLs showed co-localization with QTLs reported previously in maize drought studies based on the traditional drought response indices (Additional file 2, Table S6) [26, 27]. Therefore, our m-trait-based GWAS could detect many new putative drought-related QTLs in addition to those of known.
GWAS mapping with this maize population was able to reach single-gene resolution [23]. Based on the peak SNPs (SNP with smallest P value at a locus) and their linkage disequilibrium (LD), a total of 2589 candidate genes were identified based on the 7811 significant SNPs (Additional file 2, Table S5). In total, 190 genes among them were TFs (Additional file 2, Table S5), which showed a significantly enrichment as compared to the total genome-wide TFs (p = 6.6 × 10−4, Fisher’s exact test). These results suggested the importance of transcriptional variations in maize drought tolerance regulation. The candidate genes were more significantly enriched under DS condition than WW condition in several Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways such as “TCA cycle,” “Glycolysis / Gluconeogenesis,” and “Auxin signaling” (Additional file 1, Figure S4a-d; Additional file 2, Table S7), indicating that these enriched variations may contribute to the divergence of the maize drought tolerance by regulating these pathways under DS condition.
Moreover, thirty-five known drought-tolerant genes were included in the candidate gene set (Fig. 2a; Additional file 2, Table S8). For example, ZmSKD1 (suppressor of K+ transport growth defect 1), an AAA-type ATPase encoding gene, was previously reported to play a positive role in drought tolerance [28]. Twelve SNPs of ZmSKD1, which were in the same LD block (R2 > 0.2), were significantly associated with metabolite PN_Group_00629 under both WW and DS conditions (Additional file 1, Figure S5a-d). There were two alleles, C and T, of the significant SNP chr3.S_19462144. Drought promoted the production of PN_Group_00629 in plants with both alleles, but less PN_Group_00629 was produced and a higher survival rate was observed in plants with allele C than in plants with allele T (Additional file 1, Figure S5e and f). Therefore, allele T could be a favorable allele that downregulated PN_Group_00629 but promoted maize drought tolerance. Further, we performed association analysis between survival rate and the candidate genes. Randomly selected genome-wide genes were used as control. The results showed significant enrichment of candidate gene SNPs as compared to those of random genes (Fig. 2b; Additional file 2, Table S5) suggesting that the survival rate-gene associations were not false positive and that they were likely involved in regulation of maize drought tolerance. Next, we ranked the genes based on the p value of the peak SNP of their locus. In the top 100 genes with most significant SNPs, 19 of them were found in our candidate gene set, while the expectation gene number is 8 (Fisher’s exact test, p = 6.4 × 10-4). The number of candidate gene in top 500 gene list is 72, while the expectation gene number is 41 (Fisher’s exact test, p = 3.25 × 10-6), and we observed 126 genes in top 1000 gene list, while the expected genes are 82 (Fisher’s exact test, p = 1.35 × 10−6). These results indicated that our candidate genes might be enriched in the putative drought-tolerant genes.
The transcriptomes of 197 maize natural inbred lines (a core germplasm of the maize population used in the metabolome profiling) grown under WW and DS conditions have been generated (Dai M, unpublished). Based on the centrality-lethality rule [29], correlations were calculated between the 1035 drought-responsive metabolites and gene expression under WW and DS conditions. In total, 4787 metabolite-gene correlations (p < 1 × 10−5) were detected, with 818 metabolites and 2873 genes involved (Additional file 2, Table S9). Of all these metabolite-associated or metabolite-correlated genes, 56 were hub genes (see “Materials and methods”), most of which (82%, or 46/56) were specifically identified under the DS condition (Fig. 2c, d; Additional file 2, Table S10). Three hub genes ZmCS1 (citrate synthase 1, GRMZM2G063909), ZmCS2 (citrate synthase 2, GRMZM2G064023), which are two putative citrate synthase genes [30] (Additional file 1, Figure S6), and ZmMYBR88 (GRMZM2G064328) were statically detected under both WW and DS conditions. Previous studies have reported the associations of ZmCS1 with phenylpropanoid hydroxycitric acids [30], which were also identified in current study (Additional file 1, Figure S7a; Additional file 2, Table S5). In addition, we detected significant associations between ZmCS1 and many other metabolites such as flavonoids, carboxylic acid, benzoxazinoids, and others (Additional file 1, Figure S7a-c). We simultaneously detected ZmCS2 (~ 10 kb next to ZmCS1) that was associated with these metabolites apart from ZmCS1 (Additional file 1, Figure S7a-c). Between ZmCS1 and ZmCS2, only the expression of ZmCS2 was upregulated after DS (P < 9.28 × 10−11, paired t-test), and the associations between metabolites and ZmCS2 were more sensitive to drought than those between metabolites and ZmCS1 (Additional file 1, Figure S7d and e).
Of these hub genes, ZmPP2C-A10 and ZmVPP1 are known to play roles in the maize drought tolerance [9, 10]. These data again demonstrated the strong power and reliability of our analyses in mapping causative genes that regulate maize metabolites in the maize drought response and tolerance. Further, genes co-expressed (P < 4.67 × 10−9) with the hub genes were identified. Pathways (CornCyc, https://corncyc-b73-v3.maizegdb.org/) enriched with the co-expressed genes were detected (Fig. 2c, d). These pathways helped in the functional interpretation of the hub genes in context of metabolic regulation. For example, genes co-expressed with the hub genes Bx12 (GRMZM2G023325, encoding an enzyme for the benzoxazinoid metabolic pathway) and ZmGLK44 (GRMZM2G124540, encoding Golden 2-like TF) identified under DS condition were enriched in DIMBOA-Glc biosynthesis and amino acid metabolism (Fig. 2d), suggesting important roles of Bx12 and ZmGLK44 in regulating these metabolic pathways in maize drought response.
Natural variants that regulated the expression divergences of the candidate genes
Based on the transcriptome of 197 natural inbred lines grown under WW and DS conditions, we investigated the expression QTLs (eQTLs) that regulated the expression of the 2589 candidate genes associated with drought-responsive metabolites. In total, 13,546 distinct eQTLs (MLM, P < 4.2 × 10−8) were identified for 54.8% (1421/2589) of the candidate genes (Additional file 2, Table S11; Additional file 2, Table S11-13), suggesting that transcription variations could play important roles in regulation of metabolites of maize population in drought response or tolerance. An eQTL was defined as a cis eQTL if the lead SNP was located in the 20 kb up or downstream region of an expression trait gene (e-trait gene), but otherwise it was a trans eQTL [31]. Of the 13,546 eQTL, the majority (91.1%, or 12,343/13,546) were trans eQTLs (Fig. 3a; Additional file 2, Table S13), and the remaining (8.9%) were cis eQTLs. These data indicated that there existed complex regulatory networks involved in drought responses or tolerance of the maize population. However, the cis eQTLs generally explained larger phenotypic variations of the expression traits as compared to the trans eQTLs (Fig. 3b).
Most of the eQTLs (84%) were dynamic (identified under either WW or DS growth conditions), only 16.3% were shared (identified under both growth conditions). Similar ratios of dynamic and shared eQTLs were observed for trans eQTLs (Fig. 3c). Interestingly, there were more shared cis eQTLs (54.2%) in contrast with the dynamic ones (45.8%) (Fig. 3c). We looked more deeply into the cis eQTLs as they, on average, had larger effects than trans eQTLs on e-trait gene expression regulation (Fig. 3b). For example, some cis eQTLs were stably and repeatedly detected in genes response to osmotic stress (ara1, rplP10, ZmSLT1, ZmFes1A, ZmRABG3c, ZmSnRK3.16, ZmAMMECR1, ZmNADP-QD, ZmPIMT1, ZmBAG4, ZmCOBL1), glucan metabolism (ZmSS2, ZmDsPTP1, ae1, ZmABI8, ZmXTH30), and cytoskeleton organization (ZmCAP1, ZmTCTP, ZmFACb, ZmMPK20) (Additional file 2, Table S12). In addition, there were many cis eQTLs being involved in regulating the expression of e-trait TF genes (abi14, bhlh106, bhlh162, bzip79, bzip92, c3h12, ca5p9, ereb22, ereb168, farl15, glk6, glk8, myb99, mybr51, mwp1, NF-YA3, wrky29, wrky109) (Fig. 3d). These data suggested strong effects of cis eQTLs on the expression regulation of the genes nearby. For the dynamic cis eQTLs, significant and unique peaks were identified under DS conditions for genes involved in benzoxazinoids synthesis (Bx12), cellular response to stress (gst19, mmp165, ZmGR1, ZmRBP-DR1, ZmRECA3, ZmMCB1), ABA response (ZmLEW3, ZmMCB1, ZmSnRK3.17), phenylpropanoid hydroxycitric acid (ZmCS1, ZmCS2), and transcription regulation (abi36, bhlh81, bzip110, conz1, cpp8, gras43, gras50, hb25, hb123, hox1, mybr26, myb32, smh4, thx15, zmm18) (Fig. 3e; Additional file 2, Table S12). Therefore, these cis variants could be more specifically involved in stress response of these e-trait genes in the maize population.
Bx12 contributing to the divergence of maize drought tolerance via regulating DIMBOA-Glc biosynthesis
In this study, we identified a diversity of new candidate genes associated with metabolite contents in maize drought responses by experimentally characterizing two hub candidate genes, Bx12 and ZmGLK44, for the purpose of showing validation in principle is possible and providing new insights of molecular mechanisms underlying metabolome-mediated maize drought tolerance.
The grass-specific benzoxazinoids play important roles in pest defense and disease resistance [32], but whether and how these metabolites are involved in plant abiotic stress tolerance remain elusive. Various benzoxazinoids, which are catalyzed by different Bx and GLU genes, were detected in maize leaves (Fig. 4a) [32]. In our study, we detected 18 different benzoxazinoids in maize plants grown under WW and DS conditions (Additional file 2, Table S2). Eight of these benzoxazinoids and 7 other drought-responsive metabolites were significantly associated with SNPs in the region of 66–67 Mb on chromosome 1 (Additional file 2, Table S5). Conserved and dynamic associations in this region were observed based on different metabolites. For instance, some SNPs were significantly in association with HBOA contents under both WW and DS conditions, but regarding DIMBOA-Glc contents only under DS conditions (P < 2.1 × 10−6, N = 385), showing two peaks of significant SNPs located in Bx11 (GRMZM2G336824) and Bx12 loci (Fig. 4b–e; Additional file 1, Figure S8a). As a product of DIMBOA-Glc, HDMBOA-Glc also showed significant association with Bx12 loci no matter with or without drought stress (Additional file 1, Figure S8b). Interestingly, HDMBOA-Glc was not a drought-responsive metabolite based on our standard. The significant SNPs from the two peaks showed high linkage (R2 > 0.51) to each other and were located in the same LD block (Additional file 1, Figure S8c-h). Previous reports have shown that an InDel of CACTA TE (TE_InDel) in Bx12 is the causal variant that controls the divergence of Bx12 expression and further DIMBOA-Glc production in maize leaves [33], which is in accordance with our results (Fig. 4f, g). Moreover, we observed that Bx12 expression was enhanced in the maize population after drought stress (Fig. 4f, g; Additional file 2, Table S11), suggesting potential roles of Bx12 in maize drought tolerance.
Next, we focused on the CACTA TE to investigate how Bx12 functions in maize drought tolerance. Among the 385 maize lines, plants with a TE deletion allele had significantly higher survival rate after drought stress than those with a TE insertion allele (Fig. 4h; Additional file 3, Table S14). We further generated three maize linkage populations, two F2 populations derived from CIMBL144 (TE+/+) × CIMBL55 (TE−/−) and B73 (TE+/+) × CIMBL70 (TE−/−), and a heterogeneous inbred family (HIF) population based on the CACTA TE at the Bx12 locus from the previously reported recombinant inbred line (RIL) populations derived from a cross between modern maize Zong3 (TE+/+) and the maize ancestor teosinte (TE−/−) [34] to verify this observation. All these three populations showed significantly higher survival rates for plants with a TE deletion allele than for ones with a TE insertion allele (Fig. 4i–m), suggesting that the TE deletion allele is favorable for drought tolerance. DIMBOA-Glc is the stock form of the active form DIMBOA in the plant cells [32]. Considering both DIMBOA-Glc and DIMBOA have very similar effects on cell responses [35, 36], we used the commercial chemical DIMBOA to evaluate the role of DIMBOA-Glc in maize drought response. Spraying DIMBOA to the maize genotype B73 decreased the survival rates of maize plants after drought treatment, while DIMBOA treatment showed no significant effect on plants grown under well-watered conditions, indicating negative roles of DIMBOA and DIMBOA-Glc in maize drought tolerance (Fig. 4n, o; Additional file 1, Figure S9). Together, these data suggested that Bx12 played positive roles in maize drought tolerance likely through downregulation of DIMBOA-Glc accumulation upon drought stress.
Previous studies have indicated that the TE insertion in Bx12 underwent selection during maize domestication [37]. We further investigated the Pre-Columbian distribution of this TE in the Americas with a published landrace population [38]. The frequencies of this TE in landraces from the USA were significantly higher than those in landraces from the other regions of Americas (Additional file 1, Figure S10a and b; Additional file 3, Table S15). In addition, the TE insertion allele was significantly associated with higher latitudes (Additional file 1, Figure S10c; Additional file 3, Table S15). These data indicated that the TE insertion in Bx12 underwent strong selection during the spread of maize into the US temperate zone. Given the increased tolerance of maize plants with TE deletion allele than with insertion allele, we propose that there could be a selection balance between drought tolerance and other agronomic traits associated with the Bx12 locus during maize domestication and spread.
ZmGLK44 regulates tryptophan accumulation via activating tryptophan biosynthesis gene expression
Golden 2 (G2) and its homologous Golden 2-like (GLK) TFs are plant-specific TFs, belonging to the GARP subfamily of MYB TFs [39, 40]. GLK TFs are vital to plant development [39, 41]. Nonetheless, how these TFs function in plant drought tolerance remains unknown. Based on our data sets, ZmGLK44 is a hub gene correlated / associated with 30 drought-responsive metabolites (Fig. 5a; Additional file 2, Table S10). Eight of these are annotated amino acids (Fig. 5a; Additional file 2, Table S2). To elucidate the roles of ZmGLK44 in regulating the biosynthesis of these drought-responsive metabolites, we generated ZmGLK44 inducible overexpression transgene maize lines (Additional file 1, Figure S11a and b). Metabolic profiling of these transgenic plants and their negative siblings showed that most (60%, or 18/30) of the putatively targeted metabolites were not significantly changed under WW condition, but upregulated to higher levels in positive transgenic plants than in their negative siblings under DS condition (Fig. 5b; Additional file 3, Table S16). The upregulated metabolites included tryptophan (Trp) and two tryptophan-related metabolites achillamide derivatives. We next performed RNA-seq to ZmGLK44-positive and ZmGLK44-negative transgenic plants grown under DS condition. The differentially expressed genes were enriched in several metabolic pathways, including “glycerophosphodiester degradation,” “L-tryptophan biosynthesis,” and “DIBOA-glucoside biosynthesis,” (Fig. 5c Additional file 1, Figure S12 a-c; Additional file 3, Table S17) [42], suggesting a role of ZmGLK44 in regulation of these metabolic pathways.
Both metabolic profiling of the 385 inbred lines and RNA-seq of the transgenic plants indicated a role of ZmGLK44 in regulation of Trp accumulation. We therefore investigated how ZmGLK44 regulates Trp biosynthesis. Three key genes—ASA2, PAT1, and TSB2—that encode synthetic enzymes of the Trp biosynthesis pathway, were upregulated in the positive transgenic plants as compared to the negative siblings (Fig. 5d, e). Consistently, the metabolites indole and Trp were increased to significantly higher levels in positive transgenic plants compared to the negative siblings after DS (Fig. 5f). GLK TFs bind to a cis element with a CCAATC core sequence [39]. To know how ZmGLK44 regulates ASA2, PAT1, and TSB2, the fragments ASA2p, PAT1p, TSB2p1, and TSB2p2 with CCAATC elements in ASA2, PAT1, and TSB2 promoters, respectively, were cloned and used in LUC assays (Fig. 5g). In maize protoplasts, enhanced expression of ZmGLK44 increased the reporter LUC expression levels driven by the fragment TSB2p2, but not ASA2p, PAT1p, nor TSB2p1 (Fig. 5g, h). These data indicated that TSB2 could be a direct target of ZmGLK44 and that the flanking sequences of CCAATC could be important for the binding of GLK TFs to the core cis element.
ZmGLK44 positively regulated maize drought tolerance
Trp is an aromatic amino acid showing responses to water-deficient/drought stresses from very early to late stages [11]. Spraying Trp onto different maize genotypes increased the survival rates of maize plants after drought treatment (Fig. 6a–d), indicating a positive role of Trp promoting maize survival upon drought. Given the higher increased levels of Trp in ZmGLK44 positive transgenic plants compared to the negative siblings after drought stress (Fig. 5f), we hypothesized that ZmGLK44 could function positively in maize drought tolerance. To test this hypothesis, both positive transgenic plants and negative siblings were grown in the soil under WW and DS conditions (Fig. 6e). Under WW condition, the indices of CO2 assimilation (AN), H2O transpiration (E), instantaneous water use efficiency (WUE, AN/E), and chlorophyll made no difference between positive plants and their negative siblings (Fig. 6f). However, following drought treatment, the levels of AN, WUE, and chlorophyll, but not E, were significantly higher, while the levels of ion leakage were significantly lower in the positive plants than the negative siblings (Fig. 6g). As a result, the positive plants had higher survival rates after drought stress (Fig. 6e, g). Taken together, these results demonstrated a positive role of ZmGLK44 in maize drought tolerance.