Sampling strategy and seasonal climate analysis
Vitis vinifera cv Corvina clone 48 berries were harvested from different vineyards, each located in one of the three most important wine production macro-areas of the Verona region (Bardolino, Valpolicella, and Soave). The vineyards were selected on the basis of the site geographical coordinates to maximize differences in environmental conditions (altitude and soil type) and agricultural practices (training system, orientation of the rows, planting layout, vineyard age, and rootstock type) in each of the selected vineyards (Figure 1a; see Additional File 1, Table S1). Berry samples were harvested from all the vineyards on the same day and three biological replicates were taken at each of three different developmental stages (veraison - that is, the term used by viticulturists to indicate the onset of ripening -, mid-ripening, and fully-ripe). A complete list of all samples collected for this study is shown in Additional File 2, Table S2. In brief, sample names are composed by vineyard abbreviations (see Additional File 1, Table S1), followed by the indication of the harvesting year (06, 07, or 08), by the indication of the developmental stage (1, 2, or 3) and by the description of the biological replicate (A, B, or C). The berry ripening stage was verified by measuring three traditional ripening parameters (°Brix, total anthocyanin levels, and total acidity) as well as the ratio between quercetin-3-O-glucoside and quercetin-3-glucoronide, reflecting the fact that ripening Corvina berries progressively lose the former and accumulate the latter [25] (see Additional File 3, Table S3).
The same sampling procedure was repeated over 3 consecutive growth years (2006, 2007, and 2008). In order to obtain samples harvested at a similar phenological phase in the 3 years, the collecting times were advanced or delayed based on seasonal climate conditions and/or agro-meteorological trends. Daily temperature recording suggested that the 2007 season experienced a much warmer spring than the 2006 and 2008 ones (Figure 1b). In a comprehensive study of the relationship between grapevine phenology and climate change in the Veneto region over the period 1964 to 2009, the early spring of 2007 was noted for the highest average temperature (with near-normal precipitation) in the entire 45-year period. The 2007 veraison-to-harvest period was nearly 2 weeks ahead of time compared to the last decade average [26].
Based on the traditional and metabolic parameters discussed above, and with the appropriate inter-annual corrections taken into account, the collected samples were considered homogenously and uniformly ripe among different vineyards and growth years at each developmental stage (see Additional File 3, Table S3).
The impact of season climate on the berry transcriptome
We used the NimbleGen 090918_Vitus_exp_HX12 microarray to investigate the Corvina berry transcriptome at three developmental stages harvested during the 2006-2008 period from four vineyards (AM, CS, MN, and PSP) chosen to maximize climatic and agricultural differences (see Additional File 1, Table S1 and Additional File 2, Table S2). The vineyards therefore represented all three of the macro-areas we considered (Bardolino, Valpolicella, and Soave) and a range of diverse environmental and agricultural parameters including three rootstock types, two altitudes, two vineyard training systems, and rows facing in different directions.
The 108-sample dataset (four vineyards, three developmental stages, three biological replicates, 3 years) was further dissected into three stage-specific 36-sample datasets (four vineyards, one developmental stage, three biological replicates, 3 years). We generated a Pearson's distance correlation matrix for each dataset to compare the transcriptome from each sample. These values were converted into distance coefficients to define the height of a dendrogram.
Berry samples collected at veraison clearly clustered in relation to the growth year and not in relation to the growing sites (Figure 2a). The 2006 and 2008 seasons correlated more closely than either did to the 2007 season, indicating that the high spring temperatures in 2007 had an impact on berry development. To gain insight into the physiological and molecular factors underlying this separation between samples, we carried out a three-group Kruskal-Wallis non-parametric analysis of variance (P <0.01) on the complete first-stage dataset. Hierarchical clustering (HCL) analysis on the resulting 625 genes, whose expression profiles showed a significant difference in modulation in at least 1 year, revealed four major groups (Figure 2b; see Additional File 4, Dataset S1).
Cluster 1 included 373 genes showing higher expression levels in 2008 compared to low levels in 2007. Most of these genes represented the 'DNA/RNA metabolic process' functional category, including several encoding histones, pentatricopeptide proteins, DNA replication proteins, mRNA cap guanidine methyltransferases, and RNA-binding proteins. The 'Transcription' functional category was also strongly represented, including genes encoding bHLH, MYB, bZIP2, and zinc finger transcription factors. The strong representation of these genes suggested a profound remodeling of the transcriptome between the growth years. We also identified stress response genes encoding two thaumatins, a metallothionein [27], and at least four senescence-associated proteins.
Cluster 2 contained 47 genes that were expressed at high levels in 2006 but at low levels in 2008. This included six genes related to hormone metabolism, four of which involved in the response to abscissic acid (ABA), which plays a pivotal role in development, adaptation to dehydration stress [28] and the production of reactive oxygen species (ROS). Given the presence of an early response to dehydration (ERD) protein and of two nudix hydrolases, which have recently been shown to maintain redox homeostasis [29], it is likely that the 2006 season was exposed to greater dehydration stress than the 2008 one.
Cluster 3 comprised 39 genes that were expressed at significantly higher levels in 2006 than 2007. These included genes encoding three expansin proteins directly involved in cell wall expansion [30], and a xyloglucan endotransglucosylase/hydrolase (XTH), which modifies hemicellulose during wall expansion and fruit softening and therefore suggests a direct influence of the growth year condition on cell wall metabolism [31]. Cluster 3 also included four genes related to carbohydrate synthesis, encoding sucrose synthase 2, a transketolase, a phosphomannomutase, and a galactokinase.
Finally, cluster 4 comprised 168 genes expressed at significantly higher levels in 2007 than in 2008. Interestingly, this group included genes encoding at least 10 disease-resistance proteins and heat shock factors. We also identified genes involved in the oxidative burst (two monooxygenases and a respiratory burst oxidative protein B) as well as two alcohol dehydrogenases involved in fermentative metabolism. The upregulation of these genes confirms that severe stress was imposed upon developing berries during the 2007 growing season.
Whereas the veraison berry dendrogram showed predominantly year-specific clustering, the ripening berry dendrograms were organized in a different manner (see Additional File 5, Figure S1a and S1b). Year-specific modulated genes in these samples were identified by normalizing the microarray fluorescence intensity values against the corresponding veraison values, resulting in a dendrogram showing samples clustered according to the growth year (Figure 2c). This indicated that the mid-ripening and late-ripening datasets could be also screened for year-specific modulated transcripts.
To explore the transcriptomic differences between the mid-ripening and late-ripening samples when comparing average climate growth year (2006/2008) and the 2007 season characterized by an exceptionally warm spring, we carried out a paired two-group t-test analysis which revealed 4,775 genes showing significant (P <0.01) differential transcription in one of the two groups (see Additional File 6, Dataset S2). After averaging the fluorescence intensity of all the samples within one group, we used MapMan [32] to visualize genes that were induced either specifically in the 2006/2008 seasons or specifically in 2007 (Figure 2d). We noted that enzymes involved in cell wall structural modifications (especially cellulose synthases, pectinesterases, and xyloglucan endotransglucosylase/hydrolases) were represented to a great extent in the 2006/2008 group, as previously observed in cluster 3 (Figure 2b), suggesting that the expression of these genes is affected by the different season climate. Genes with a role in amino acid metabolism were also induced in 2006/2008, indicating that the management of nitrogen-base substances is impaired under extreme temperatures. However, the major difference between the growth years involved secondary metabolism (Figure 2d), particularly the biosynthesis of phenylpropanoid derivatives in the 2006 and 2008 berries. This was indicated by the induction of genes encoding several phenylpropanoid-related enzymes (for example, phenylalanine ammonia lyase, PAL, and cinnamyl alcohol dehydrogenase (CAD), including a high number of stilbene synthases (STSs), controlling the key step for the synthesis of stilbene compounds (Figure 2e). LC-ESI-MS metabolomic analysis of the same samples used for RNA extraction confirmed that phenylpropanoid-derived compounds such as stilbenes, viniferins, hydroxycinnamic acids, and the flavonoid catechins and epicatechins were less abundant in the 2007 season compared to the 2006/2008 seasons, strongly supporting the transcriptomic data (see Additional File 7, Figure S2). This suggests that the profound reprogramming of the berry transcriptome under diverse meteorological conditions includes metabolic pathways contributing to ripe berry qualitative traits, thus influencing the commercial value of the grapes.
Adaptation of the berry transcriptome to different environments and growing conditions
We focused on the impact of different environments and growing conditions by analyzing berries from the 2008 season, which appeared to be less influenced by the climate than the other growth years (see Additional File 8, Figure S3). We extended the analysis to include all 11 vineyards (see Additional File 2, Table S2). The resulting 99-sample dataset (11 vineyards, three developmental stages, three biological replicates, 1 year) showed a bimodal distribution of fluorescence intensity agreeing with the results of previous investigations [33]. To achieve a unimodal distribution from the whole dataset, we used k-means clustering of the log2 fluorescence intensities (see Additional File 9, Figure S4) applying increasing values of k until only a single cluster displayed bimodal distribution (k = 10) with a low mean expression level. We then grouped the nine unimodal clusters with high mean expression levels, allowing us to select genes providing a unimodal distribution without cutting off low-value expression data (for example, cluster 1, see Additional File 9, Figure S4). We identified 13,752 genes with a unimodal distribution of the fluorescence signal (see Additional File 10, Dataset S3). We carried out a Kruskal-Wallis test (P <0.01) on the reduced dataset from each vineyard to determine the number of genes that were differentially expressed during ripening and found the average number over the 11 vineyards was 8,381. Plastic genes modulated in at least one vineyard during ripening were identified by applying 11-group Kruskal-Wallis analysis to Dataset S3 (Additional File 10), resulting in a reduced set of 1,478 transcripts (P <0.01) (see Additional File 11, Dataset S4). The number of plastic genes appeared remarkably high (approximately 18% of the average number of modulated genes), suggesting that the ripening of Corvina berries can be modified extensively by the growing conditions. This also indicated that approximately 5% of the transcripts represented on the microarray correspond to plastic genes whose expression can vary under diverse growing conditions.
The analysis of transcript functional categories revealed that 21% of the plastic genes were unrecognized ('No Hit') or uncharacterized ('Unknown Protein') suggesting that much remains to be learned about the genes expressed during berry development (Figure 3a). Overall, the 1,478 plastic transcripts were particularly enriched in the functional categories 'Translation', 'Nucleobase, nucleoside, nucleotide and nucleic acid metabolic process', 'Regulation of gene expression, epigenetic', and 'Transport' (see Additional File 12, Figure S5). In particular, at least 86 ribosomal proteins were found in the DNA/RNA metabolic process category (Figure 3b), suggesting that transcriptome reprogramming during ripening involves a shift in protein synthesis. 'Transcription factor activity' function is also well represented, for example, 30 zinc finger genes, including C(2)H(2)-type proteins that regulate stress and hormone response pathways [34] and many C3HC4-type RING zinc fingers that also play a role in abiotic stress responses [35, 36]. We also identified at least eight members of the MYB transcription factor family (see the heat map in Figure 3c, which shows the expression profiles among vineyards and during ripening). Some members of the MYB family have been shown to regulate secondary metabolism in grape berries [37, 38] as well as drought, salinity, and cold stress in Arabidopsis and rice [35, 39].
Genes representing the 'Transport' functional category included those encoding ATP-binding cassette (ABC) proteins (Figure 3c). This is one of the largest and most diverse protein families in plants, and is responsible for transporting many different substances across membranes [40, 41], suggesting a broad reprogramming of intracellular and intercellular transport as a component of phenotypic plasticity in Corvina berries. The glutathione S-transferase (GST) family was also well represented among the plastic genes, with at least 11 tau-class GSTs showing different expression patterns among the 11 vineyards (Figure 3c). Although the function of tau-class GSTs remains poorly understood, they may be involved in stress tolerance and secondary metabolism as well as the detoxification of herbicides [42]. It is noteworthy that many of the 'Response to stress' transcripts we identified are involved in ROS scavenging, such as two glutaredoxins, four ascorbate peroxidases, a nudix hydrolase, two peroxiredoxins, and three superoxide dismutases. Together with the many GSTs that reduce peroxides by controlling the balance between the oxidized and reduced forms of glutathione, the presence of these transcripts suggests that the oxidative burst observed in Pinot Noir berries at veraison [43] could also occur in Corvina and is part of the complex transcriptional rearrangement during berry plasticity. Finally, several of the Corvina plastic transcripts belonged to the 'Developmental process' category, including several homologs of Arabidopsis genes involved in floral transition and flower organ identity, that is, EARLY FLOWERING, CONSTANS, FRIGIDA, and SEPALLATA (see Additional File 11, Dataset S4).
We also investigated whether it is possible to identify groups of vineyards sharing specific pools of plastic transcripts. Principal component analysis (PCA) was applied to the 1,478 plastic genes and we identified five principal components explaining 67.4% of the variability. The resulting dendrogram highlighted four principal clusters of vineyards (Figure 3d). Samples from the same vineyard but from different developmental stages generally clustered in the same group, with the exception of five samples. FA081 and CS081 were outliers possibly because of the significant changes from veraison to later developmental stages. Samples from the GIV vineyard were also outliers, indicating a unique gene expression profile under these particular micro-environmental conditions. Plastic transcripts contributing to the definition of each statistical class were defined by applying a four-class orthogonal projections to latent structures discriminant analysis (O2PLS-DA) model to a 28-sample reduced dataset lacking the outlier samples (Figure 3e). The robustness of the model was tested by calculating the degree of overfitting (100 permutations) of the corresponding three-class PLS-DA model (see Additional File 13, Figure S6). We identified 53, 30, 33, and 29 transcripts specific for each cluster. Remarkably, the vineyards in cluster 1 were all characterized by the intensive transcription of genes encoding ribosomal proteins (almost half of the all the cluster-specific transcripts) (see Additional File 14, Dataset S5).
We next tested whether it was possible to associate specific transcripts to groups of vineyards sharing certain environmental attributes or using specific agricultural practices. We applied the Kruskal-Wallis approach (P <0.01) to the 13,752-unimodal-profiling-transcript dataset (see Additional File 10, Dataset S3) using in each case the appropriate number of groups (for example, two groups for the direction of rows, four groups for the rootstock type). Among all the combinations we tested, only the 'Trelling System' and the 'Geographical Area' categories gave statistically-validated results (see Additional File 15, Figure S7a and S7b). This indicated that the contribution of the four different rootstock genotypes have only a marginal impact on the plastic gene expression of berries compared to the other agricultural parameters and is not appreciable from our experimental design. We found that 373 transcripts (false discovery rate (FDR), 0.25%) were differentially-modulated between vineyards using a replacement cane Guyot system or a parral system. Interestingly, several transcripts encoding heat shock proteins and proteins that maintain membrane integrity were induced among vineyards using the Guyot system but not those using the parral system (see Additional File 15, Figure S7a, and see Additional File 16, Dataset S6). Transcripts associated with the macro-areas had more complex expression profiles. Of the 534 transcripts (FDR, 0.42%) found significant in the statistical test, only the absence of particular transcripts could be specifically assigned to the Soave, Bardolino, or Valpolicella areas (see Additional File 15, Figure S7b, and see Additional File 17, Dataset S7). Thus, the absence of these transcripts in one geographical area (and their presence in the other two) appears to be more important in the definition of transcriptomic plasticity among different cultivation areas.
Transcriptomic grouping at harvest
We next focused on berry harvesting in 2008 because this was the most important from an agronomic perspective and allowed the relationship between transcriptome plasticity and cultivation micro-environment to be investigated in detail. We built a dataset from the fluorescence intensity values of 33 samples (11 vineyards, one developmental stage, three biological replicates, and 1 year) and carried out significance analysis of microarray (SAM) using a FDR of 0.1%. This revealed 11,323 significantly modulated transcripts. We focused on transcripts displaying a ≥2-fold change in at least one vineyard-to-vineyard comparison, narrowing the number of significant transcripts to 8,250 (see Additional File 18, Dataset S8). In order to determine inner dataset dynamics, a cluster dendrogram was built using Pearson's correlation values comparing the transcriptome from each sample, revealing two-cluster partitioning (see Additional File 19, Figure S8a). We then used t-test analysis (α = 0.05) to confirm the transcriptional separation between the two vineyards groups (see Additional File 19, Figure S8b). Functional category distribution analysis uncovered a profound difference in metabolism. Gene expression in the first group of vineyards (VM, GIV, CC, PM, AM, and FA) clearly depicted ripe berry samples (for example, a large number of transcripts related to secondary metabolism) whereas in the second group of vineyards (CS, PSP, BA, BM, and MN) photosynthesis-related genes were still actively transcribed (see Additional File 19, Figure S8c). This metabolic difference, confirmed also by classical berry maturation indexes (total acidity and °Brix/total acidity, see Additional File 19, Figure S8d) strongly indicates a disparity in the degree of ripening at harvesting.
We applied PCA to the 8,250 differentially-modulated transcripts, and the first component, explaining 27.9% of the total dataset variability, was attributed to differences in ripening status as anticipated (Figure 4). This indicated that the plasticity of the berry transcriptome affected the entire berry ripening program, resulting in a diverse range of ripening characteristics at harvest. Overall, these data confirm that the phenotypic variation of grape berry responsible of the diverse qualitative traits a single clone can express in different growing sites reflects a deep plasticity of the berry transcriptome at harvest.
Non-plastic berry genes
The datasets also yielded developmental stage-specific but non-plastic transcripts, that is, those whose expression increases (positive markers) or declines (negative markers) with a constant profile during berry development regardless of the vineyard. This was achieved by applying SAM multiclass analysis (FDR, 0.1%, three groups) to the 99-sample dataset (11 vineyards, three developmental stages, three biological replicates, year 2008 only), revealing 18,190 transcripts that were differentially expressed among the three berry developmental stages but to the same extent in all 11 vineyards. These genes were likewise analyzed by one-way ANOVA (α = 0.01, three groups, standard Bonferroni correction) and the resulting 11,532 genes were grouped into eight k-means clusters of gene expression (Pearson's correlation). The clusters defined by a continuous increase or decline during ripening were further screened for genes with the largest fold change (95th percentile) between the first and last stages, to select those that were more strongly modulated. This yielded 115 upregulated genes (Figure 5a; see Additional File 20, Dataset S9) and 90 downregulated genes (Figure 5b; see Additional File 20, Dataset S9).
The non-plastic upregulated genes included those encoding pathogenesis related (PR) proteins and biotic stress factors such as thaumatins and osmotins, as previously reported [27, 43–45]. The PR10 gene VIT_05s0077g01530 and two PR1 genes (VIT_03s0088g00710 and VIT_03s0088g00690) were previously shown to be differentially modulated during the last stages of berry development stages in Chardonnay grapes [46]. PR proteins are the most abundant proteins in wine and they are expressed across all stages of berry development [47]. The identification of PR-related transcripts as non-plastic developmental markers suggests that they represent a fundamental grapevine disease-prevention strategy that could help to avoid berry infections. We also identified eight non-plastic genes encoding germacrene-D-synthases and seven encoding stilbene synthases (see Additional File 20, Dataset S9) confirming previous reports that the terpene and phenylpropanoid pathways are under strict transcriptional control during ripening [48–50].
The non-plastic downregulated genes included many involved in photosynthesis, which occurs in the early berry until veraison [43, 44, 49]. We identified seven photosynthesis-related transcripts (mainly encoding polyphenol oxidases and photosystem II subunits) showing that the shutting down of photosynthesis can be used to monitor the progress of berry development regardless of the vineyard. Other non-plastic downregulated genes were involved in cell wall structural modifications, including transcripts for expansin A, xyloglucan endotransglucosylase/hydrolase (XTH), and β-D-xylosidase, agreeing with previous investigations of Chardonnay, Cabernet, and Corvina berries [43, 44, 46].
Finally, we identified a number of transcripts that were neither plastic (no variation among the 11 vineyards) nor developmentally modulated (no variation among the three developmental stages) using SAM multiclass analysis (FDR = 0.1%, 11 groups) of three stage-specific datasets, each comprising 33 samples (11 vineyards, one developmental stage, three biological replicates, 2008 season only). The constitutive and non-plastic transcripts were further analyzed by one-way ANOVA (α = 0.01, 11 groups). The 15,841, 14,342, and 13,286 transcripts that were constitutively expressed during veraison, mid-ripening, and ripening, respectively (see Additional File 21, Figure S9), were compared to identify 6,927 transcripts shared among all three developmental stages. These were screened for the lowest (last 99th percentile) standard deviation among samples, resulting in a set of 76 non-plastic genes constitutively expressed during berry development (Figure 5c; see Additional File 22, Dataset S10).
Transcripts scoring the lowest standard deviations included those encoding proteins related to intracellular transport (ADP-ribosylation factor, ABC transporter F member 2 and vacuolar sorting-associated protein), plant cell wall metabolism (xyloglucan endotransglucosylase/hydrolase), DNA and RNA binding and editing (zinc finger A20, AN1 domain-containing stress-associated protein 2 and oligouridylate-binding protein), and cellular metabolism (S-adenosylmethionine synthetase, inorganic pyrophosphatase, and ubiquitin-specific protease). Remarkably, five transcripts with different expression levels and different standard deviations showed constitutive expression also in the transcriptome of all grapevine organs (see Additional File 23, Figure S10), as confirmed in the recent grape gene expression atlas [33]. These 76 non-plastic constitutive genes are candidate reference genes for quantitative gene expression analysis.