Developmental analysis of the wheat leaf reveals stages of cell and chloroplast differentiation
In order to generate a quantitative analysis of chloroplast biogenesis and a simultaneous global gene expression map of the developing wheat leaf, we first carried out a careful selection of biological material. In preliminary experiments on consecutive leaves, we observed, as anticipated, rapid changes in cellular morphology across short physical distances at the base, and very limited differences at more mature stages. It is important to note that while the distance of cells from the leaf base is related to developmental time, the relationship is far from linear. Elegant measurements by Boffey et al. [15], of the first leaf of wheat grown under conditions similar to ours, identified the relationship between distance from the leaf base and cellular age after exit from the shoot meristem. We used this position/age relationship, corrected for the elongation rate observed in our conditions (see “Methods”), to estimate cellular age. The mesophyll cell morphology and the calculated cellular age prompted the need for much denser sampling at the base of leaves than towards the tip.
Cereal leaves comprise two regions: the cylindrical basal sheath, which emerges last from the meristem and envelops younger leaves to provide structural support, and the blade, which has a photosynthetic role. A ligule separates them (see mature leaf in Fig. 1b). Sheath and blade cells inevitably follow distinct developmental paths [17], and it was therefore important to select a leaf developmental stage before they become distinct. Under our conditions this was the case for the first leaf of 6-day-old seedlings, which exhibited an essentially uninterrupted developmental sequence. In order to include the earliest fully proliferating cells, we also collected a sample of the shoot apical meristem with the incipient youngest leaves (plastochron stages P3 to P1), the primordium of leaf 3 being around 1.5 mm in length (Fig. 1a). The meristem produced only leaves, internodes initiating much later in development. To obtain a fully mature photosynthetic stage, but without any signs of senescence, we dissected the middle region of the 2-week-old leaf 1 blade to complete a total of 15 samples (Fig. 1b). We used the same dissected leaf samples to simultaneously obtain materials for quantitative microscopy-based cellular and organelle differentiation analysis, and cell cycle stage identification by flow cytometry and molecular analyses (Fig. 1c).
Our cellular analysis focused on the photosynthetic mesophyll cells, which make up about two thirds of the area of a transverse, mature C3 grass leaf section [18], and therefore of the leaf volume. Meristematic/leaf primordia cells were homogeneously small, prismatic, generally isodiametric and with a central large nucleus (Fig. 1d, sample 1). Cells at the leaf base (sample 2, first 5 mm) remained isodiametric, but increased in size. In the subsequent stage (sample 3, 5–10 mm), cells further enlarged, but remained isodiametric, while after the 10 mm position (samples 4–5) cells began to elongate and after 15 mm started to produce the characteristic lobed shape of wheat mesophyll cells (sample 5 onwards). A large number of proplastids could be observed by varying the focal plane under DIC microscopy already in cells at early stages of leaf development, but greening, indicative of an assembled photosynthetic apparatus, became apparent only at around 30 mm from the meristem (sample 8 onwards), a few millimeters before the point at which leaf 1 emerged from its enveloping coleoptile. Thereafter, green chloroplasts grew rapidly in size and filled the available cellular space, arranged as a single layer sandwiched in a thin cytoplasm sheath between the vacuole and the plasma membrane (Fig. 1d).
Having identified the full range of cellular and organelle differentiation morphologies, representing the entire leaf developmental sequence, we proceeded to quantify the cellular and organellar parameters and the underlying molecular processes through global transcriptome profiling in the same 15 samples.
Biologically informed gene expression map of wheat leaf development demonstrates key differentiation processes
Triplicate RNA samples were subjected to reverse transcription and Illumina-based sequencing. Around 30 million reads were obtained per sample. We made use of the most recent IWGSC genome annotation [19], which encompasses close to 100,000 genes, including homoeologs of the A, B, and D genomes. We used principal component analysis (PCA) to establish, in an unbiased manner, the degree of difference between the samples and their replicates. A plot of the first two principal components (x and y axes, Fig. 2a), or one including the third (x, y, and z axes, Fig. 2b), demonstrated a very short distance between replicate samples and therefore a high degree of reproducibility of the data. PCA also revealed that a broad coverage of the trajectory of the developmental gradient had been achieved, with the largest variance being observed during the earliest stages of cellular development. For example, while greening in sample 14 is over 30-fold greater than in sample 4 (Fig. 1b and see below), the variance between those two samples captured by both the first and second principal components (their distance along those two axes) is not as large as that between samples 1 and 4 (shoot apex and first 15 mm of the leaf base). The first three components accounted for nearly 80% of the total expression variance (Fig. 2c, d). In order to understand the biological processes represented by the principal components, we calculated the load factors of each gene for each of the three components, and identified gene ontology terms enriched in the genes with the top and bottom 5% load factors (see “Methods”). The result (Additional file 1: Figure S1) is summarized in the axes of Fig. 2b and shows that the first component, accounting for over 40% of the variance, effectively represented developmental (pseudo-) time, a gradual shift from early biosynthetic metabolism to photosynthesis.
Interestingly, component 2 (19% of variance) moved forward and back, displaying an intermediate peak which, according to gene load factors, represents both plastid and cell wall organization genes that show maximum expression at samples 4 and 5 (the second cm from the leaf base). The third component involved a departure from DNA synthesis and an eventual peak of expression of mature tissue gene signatures (transport processes).
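The selection of genes with extreme load factors on a component can be illustrated with a minimal sketch. It assumes that the load factors for one component are already available as a mapping from gene identifier to value; the function name, the gene identifiers and the rounding rule are illustrative assumptions, not the procedure described in “Methods”.

```python
def extreme_loading_genes(loadings, fraction=0.05):
    """Return (top, bottom) gene lists for one principal component.

    loadings: dict mapping a gene identifier to its load factor.
    fraction: share of genes taken from each end (0.05 = 5%).
    Both names are hypothetical and used only for illustration.
    """
    ranked = sorted(loadings, key=loadings.get)      # ascending load factor
    k = max(1, round(len(ranked) * fraction))        # at least one gene per end
    bottom = ranked[:k]                              # most negative load factors
    top = ranked[-k:][::-1]                          # most positive, largest first
    return top, bottom


# Hypothetical load factors for 100 made-up genes.
example = {f"gene{i}": i - 50 for i in range(100)}
top, bottom = extreme_loading_genes(example)
print(top)
print(bottom)
```

Each of the two lists would then be passed separately to an enrichment test, so that terms associated with the positive and the negative end of a component can be told apart.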
Following a selection procedure designed to compare the different points across a temporal sequence, with thresholds for minimum expression, fold change, and coefficient of variation (see “Methods”), a total of 42,057 dynamically expressed genes [20] (DYGs, including individual homoeologs) were identified (Additional file 2: Table S1). This constitutes over 40% of all annotated genes. Clustering using WGCNA [21] identified 12 expression modules (Fig. 2c), with peaks of expression covering the full developmental gradient. We used an over- or under-representation procedure of gene sets representing selected biological and molecular functions (Fig. 2d), which allowed both a broad representation of the processes and the ability to visualize sub-modules within the same functional groups (Additional file 1: Figure S2) [22]. The first module, with peak of expression in the shoot apex, is enriched to an extraordinary degree (p < 10^−100) in genes involved in transcriptional control, cell cycle, and translation (including genes encoding ribosomal proteins), i.e., cytoplasmic cell growth. The second module was small but highly specific to the leaf base and is almost uniquely overrepresented in hormone-related functions. The third module showed a broad peak in the early samples and encompasses further ribosomal and mitochondrial build-up. These three modules alone included nearly half (47%) of all DYGs.
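A filter of this kind, combining minimum expression, fold change and coefficient of variation, can be sketched as follows. The threshold values, the pseudocount and the function name are placeholders chosen for illustration only; the values actually used are those given in “Methods”.

```python
from statistics import mean, pstdev


def is_dynamic(profile, min_expr=1.0, min_fold=2.0, min_cv=0.3, pseudo=0.01):
    """Decide whether one gene's expression profile counts as dynamic.

    profile: mean expression per sample along the developmental gradient.
    All thresholds are hypothetical placeholders, not the published ones.
    """
    peak = max(profile)
    if peak < min_expr:                               # never clearly expressed
        return False
    fold = (peak + pseudo) / (min(profile) + pseudo)  # max/min with pseudocount
    cv = pstdev(profile) / mean(profile)              # coefficient of variation
    return fold >= min_fold and cv >= min_cv


print(is_dynamic([5, 5, 5, 5]))          # flat profile
print(is_dynamic([1, 2, 8, 16]))         # rising profile
print(is_dynamic([0.1, 0.2, 0.4, 0.1]))  # below the expression threshold
```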
Examination of the expression of genes selected as part of the functional classification strategy (Additional file 1: Figure S2, Additional file 3: Table S2) provided further, highly informative insight into the processes of construction of the leaf organ. Given that the processes of “transcription and its regulation,” “translation,” and “protein fate” in particular could represent markedly different activities in different cellular compartments, we further indicated through color codes for these three processes whether the individual genes encoded proteins targeted to the mitochondrion, the chloroplast, both or elsewhere (Additional file 1: Figure S2, Additional file 3: Table S2). We observed that while the majority of genes for ribosomal or translation-related proteins were tightly co-expressed and highly active at the shoot meristem and the first leaf base samples, and this was also the case for mitochondrial translation proteins, genes encoding chloroplast ribosomal and other translation-related proteins were particularly abundant among the cohorts of genes peaking in samples 4–11, 10–60 mm from the base. Build-up of the mitochondrial metabolism and respiratory chain peaked in samples 3–4, between 5 and 15 mm, while the majority of photosynthesis-related genes became substantially expressed from sample 8, after 30 mm from the base. Tight cohorts of genes within metabolism or cell wall synthesis and modification also reveal discrete and sequential biogenic activities (Additional file 1: Figure S2, Additional file 3: Table S2). We noted that the unique gene expression signature of the mature leaf sample was not primarily due to the initiation of senescence or of autophagy, as the overrepresentation of specifically these two processes in it was limited, genes involved in many other functions also being altered.
The function of the shoot apical meristem, cell specification processes in leaf primordia, and later differentiation, of both cells and chloroplasts, involve fundamental regulatory events brought about by hormonal action [23, 24]. Genes functionally classified as of hormone action were overrepresented specifically at the leaf base, sample 2 (Fig. 2c, d). We took advantage of our transcriptome data to indirectly examine the broad extent of action of eight plant hormones, visualizing the expression of genes involved in their synthesis/catabolism or signal transduction, and that of genes previously shown to be induced by the relevant hormone (serving as reporters), through a previously used approach [24, 25]. Particularly important roles for auxin were evident in the shoot meristematic region and the base of the leaf (Additional file 1: Figure S3, Additional file 4: Table S3). Auxin action occurs through strikingly distinct cohorts of genes in the shoot apical region, the leaf base, and the regions in which different stages of cell elongation occur (Additional file 1: Figure S3). The data also suggest differential gene functions for auxin receptors, expressed in the shoot apex containing meristematic cells (TIR1), in specified progenitor cells at the leaf base (AFB5) or in early expanding cells (ABP1) (Additional file 4: Table S3). The same applies to various auxin response factors (ARFs), showing sequential expression with peaks ranging from the shoot apex to the region of cell expansion (MONOPTEROUS/ARF6/ARF8/ARF19). In relation to chloroplast differentiation, this approach highlighted a possible role only for cytokinin, given that about half the cytokinin signalling genes displayed showed clearest expression between samples 6 and 13 (Additional file 1: Figure S3) at the time during which greening was most pronounced. These signatures generate a wealth of hypotheses, concerning leaf and organelle development, for further analysis.
Consecutive occurrence of cellular proliferation, cytoplasmic growth, and cell expansion
Multiplication of organelles is essential for their number to be maintained in proliferating cells, when each cell division on average halves it, as well as to increase their number in non-dividing cells as part of cellular differentiation. Thus, the quantitative understanding of chloroplast biogenesis necessitates the simultaneous understanding of cell proliferation. We therefore examined this in detail, using quantitative microscopy (Fig. 1) and simultaneous flow cytometric analysis of cell cycle stages (Fig. 3a, Additional file 1: Figure S4). In sample 1, containing the meristem/leaf primordia, we found a high proportion (around 30%) of nuclei undergoing S phase. Based on data showing that the fastest recorded cell cycle in wheat is 12 h [26] while the duration of S phase is around 3 h [27], it is likely that the totality of meristem and primordia cells in our sample 1 are undergoing cycling and that the cycle is operating at full speed. The S phase proportion declined slightly in sample 2, the first 5 mm of the leaf base, very rapidly diminished to less than 30% of that in the shoot apex in sample 3, 5–10 mm from the leaf base, making the cells’ doubling time close to 2 days, and became barely detectable above background subsequently (Fig. 3a, Additional file 1: Figure S4).
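The arithmetic behind these estimates can be made explicit with a short sketch. It rests on two simplifying assumptions that are ours for illustration: that the S-phase fraction of an asynchronously cycling population is approximately the S-phase duration divided by the cycle duration, and that the S-phase duration stays constant as cells slow down, so that cycle time scales inversely with the S-phase fraction.

```python
def expected_s_fraction(s_phase_h, cycle_h):
    """Approximate S-phase fraction if every cell is cycling."""
    return s_phase_h / cycle_h


def scaled_cycle_time(reference_cycle_h, relative_s_fraction):
    """Cycle time implied by an S-phase fraction given relative to a reference."""
    return reference_cycle_h / relative_s_fraction


# 3 h S phase within a 12 h cycle [26, 27]: the fraction expected when all
# cells cycle at full speed, to compare with the roughly 30% that was measured.
print(expected_s_fraction(3, 12))

# S-phase fraction at 30% of the shoot apex value (the upper bound for sample 3).
print(scaled_cycle_time(12, 0.3))
```

Under these assumptions the expected fraction is 25%, close to the measured value, and a fraction below 30% of that in the apex implies a cycle time of at least about 40 h, in line with a doubling time close to 2 days.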
We selected a number of signature genes representative of both DNA synthesis and mitosis (listed in Additional file 5: Table S4) from our transcriptome data, and represented their relative expression levels as Z-scores (the expression values for these and other genes in this functional class, as those in others, are provided in Additional file 3: Table S2). In agreement with the flow cytometry data, these genes peaked in expression in the shoot apex, and their transcript levels became minimal in sample 3, between 5 and 10 mm of the leaf base (Fig. 3c). These cell cycle genes are known targets of the E2F transcription factors, which are themselves known to become active when the RETINOBLASTOMA-RELATED (RBR) protein is inactivated by phosphorylation [28, 29]. Immunoblot analysis shows that RBR phosphorylation is high only in the meristematic sample and already declines substantially in the leaf base, becoming undetectable subsequently (Fig. 3b). The total level of RBR1 (one of two RBR proteins present in monocots, and against which an antibody is available) is also highest towards the leaf base, but diminishes less rapidly than its degree of phosphorylation, consistent with its role in repression of cell cycle genes at those subsequent stages.
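For reference, a Z-score rescales each gene's profile to zero mean and unit standard deviation across samples, so that genes with very different absolute expression levels can be compared on one scale. The sketch below uses the population standard deviation; whether the population or the sample form was used for the figures is not stated here and is an assumption of the example.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Scale one gene's expression profile to zero mean and unit variance."""
    mu = mean(values)
    sd = pstdev(values)          # population SD (assumed for this sketch)
    if sd == 0:                  # flat profile: no variation to scale
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


# Hypothetical profile of a gene peaking in the first sample.
print(z_scores([12.0, 6.0, 1.0, 1.0]))
```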
Measurements of cell size showed an increase already at the leaf base compared to cells in the shoot apex, indicating that the cell expansion program starts at the base while cells still proliferate (Fig. 3d). The vast majority of cell expansion occurred up to sample 7, 30 mm from the base, when cells are less than 2 days old, making this the region that drives leaf lengthening [15]. A small second bout of expansion of mesophyll cells (likely an increase in width) occurred from sample 12, after 80 mm, accounting for 15–20% of the final cell plan area (Fig. 3d). Signature genes for cell wall synthesis (cellulose synthases, arabinogalactan proteins, Additional file 5: Table S4) showed two distinct early peaks (Fig. 3e), with the main cellulose synthases peaking around sample 6, 20–25 mm from the leaf base. Genes associated with cell expansion or turgor facilitation (expansins, aquaporins) showed a corresponding early expression (Fig. 3f).
In summary, divisions are rapid and continuous in leaf primordia. At the base of the developing leaf, cells only undergo between one and two further rounds of division. In the first half day after leaving the meristem, cells move up about 2–3 mm [15] by the expansion of dividing, preceding cells also leaving the meristem. Between half and 1 day, cells move up to 8–9 mm and essentially cease proliferation. Cells initiate expansion at the leaf base while still proliferating; expansion continues after cells fully exit the division cycle and largely concludes in the following 24 h, ending the first phase of morphological differentiation (Fig. 3d).
A phase of plastid proliferation is followed by the build-up of plastid genomes and transcription and translation machinery
The most important aspect of mesophyll cell differentiation, and arguably of leaf function, is the gradual filling of cells with plastids (Figs. 1d, 4). We used quantitative microscopy [30] to record the number of plastids or chloroplasts, their size and proportion of the cell they occupied, and used quantitative PCR and alternative techniques to measure the number of copies of the plastid genome and of plastid ribosomes in relation to their cytosolic counterparts (Fig. 4).
Plastid number per cell increased in cells located within the basal 20 mm, up to sample 5 (Fig. 4a). Considering that in the first two samples cells halve the plastid number at each division, the calculated frequency of plastid divisions was in fact highest in proliferating cells but declined more slowly than cell proliferation did (Fig. 4b). These data show that once cells become specified at the base of the leaf, their plastids undergo between 4 and 5 rounds of division in total. This is confined largely to the first 24 h and fully to the basal 15 mm segment, samples 2–4, broadly coinciding with cell enlargement but before the lobed cell morphology is attained. The expression of a selection of plastid division signature genes (Additional file 5: Table S4) was consistent with such a pattern: several plastid division genes, e.g., ARC2, are expressed early, mirroring the calculated division frequency and ceasing after 15 mm, although the expression of others extends further (Fig. 4c). The FZL gene has been shown to be involved in thylakoid biogenesis [31], but its mutant phenotype [32], with fewer, larger chloroplasts, is suggestive of a possible role in plastid proliferation; it is interesting to observe its later expression, between samples 6 and 12, 20 to 80 mm from the base (Fig. 4c), compatible with thylakoid development and incompatible with a role at the stage of division. To our surprise, we consistently observed a small, gradual reduction of plastid numbers in the latter region. The reasons for both of these observations will require future investigation.
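One way to express the correction for cell division is sketched below. It assumes that every cell division halves the plastid number per cell, so that the rounds of plastid division between two positions equal the log2 change in plastids per cell plus the number of cell divisions over the same interval. The function name and the example counts are hypothetical and are not measurements from this study.

```python
import math


def plastid_division_rounds(n_start, n_end, cell_divisions=0.0):
    """Rounds of plastid division between two developmental positions.

    n_start, n_end: plastids per cell at the earlier and later position.
    cell_divisions: rounds of cell division in the interval; each is
    assumed to halve the plastid number per cell.
    """
    return math.log2(n_end / n_start) + cell_divisions


# Hypothetical counts: a fourfold rise in plastids per cell across an
# interval in which cells also divided once.
print(plastid_division_rounds(10, 40, cell_divisions=1.0))
```

The example shows why a steady plastid number per cell in proliferating tissue still implies active plastid division: with no net change and one cell division, the function returns one round.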
The size of individual plastids increased continuously from the very early stages, even during plastid proliferation, but reached a transient plateau before greening and then underwent a second phase of rapid enlargement (Fig. 4d). Remarkably, the total area of all plastids in a cell also increased in two distinct stages (Fig. 4e). The rate of growth, corrected for the effect of cell division, even more clearly showed the two distinct phases of build-up of the chloroplast compartment. The first, which we designated “plastid” phase, preceded greening and essentially concluded with the stage in sample 7, at 30 mm from the base (between 1.5 and 2 days after cells left the meristem), while the second, which we consider “chloroplast” phase, began at around sample 10, 40 mm from the leaf base and continued throughout to fill the mesophyll cells (Fig. 4f). This is intriguing since it reveals a spurt of plastid growth activity well before greening. The transition between the two phases broadly coincides with the emergence of the first leaf from its enveloping coleoptile, a translucent, non-photosynthetic, leaf-like structure, which aids leaf emergence through the soil and provides structural support. The proportion of the cell occupied by the organelles, i.e., the cellular chloroplast compartment or “chloroplast index,” was calculated as the total area of chloroplasts (obtained as the product of chloroplast number and average chloroplast plan area), divided by total cell plan area. We found this also to follow a clear biphasic profile (Fig. 4g). The full occupancy of cells by chloroplasts is dependent on the activity of the REC gene family [33]. We found (Fig. 4h) that while FMT showed expression consistent with a role in mitochondrial cellular distribution, REC1 peaked around the time of maximal plastid growth, while the expression of REC2 and REC3 preceded the “chloroplast growth” phase.
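The chloroplast index defined above reduces to a single expression, sketched here with hypothetical input values; areas must share the same unit.

```python
def chloroplast_index(n_chloroplasts, mean_chloroplast_area, cell_area):
    """Fraction of the cell plan area covered by chloroplasts.

    Total chloroplast area is the product of chloroplast number and average
    chloroplast plan area; dividing by cell plan area gives the index.
    """
    if cell_area <= 0:
        raise ValueError("cell_area must be positive")
    return n_chloroplasts * mean_chloroplast_area / cell_area


# Hypothetical cell: 100 chloroplasts of 20 um^2 each in a 4000 um^2 cell.
print(chloroplast_index(100, 20.0, 4000.0))
```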
Concomitant and subsequent to their multiplication, starting from a small initial number of proplastids, the replication of sufficient copies of the plastid genome follows, to support the synthesis of large quantities of the photosynthetic polypeptides it encodes. Indeed, DAPI staining of DNA showed that while the majority of cellular DNA is nuclear, non-nuclear DNA in mature mesophyll cells was associated with chloroplasts (Additional file 1: Figure S5). In agreement, we observed (Fig. 4i) that the detectable but very low initial number of copies of chloroplast DNA (cpDNA) per haploid nuclear genome (gDNA) underwent multiple rounds of replication throughout the “plastid growth” phase, with less than one final round (not all copies of cpDNA replicated) taking place during the “chloroplast growth” stage. Multimers of cpDNA and associated proteins form nucleoids. A number of such polypeptides have been identified as plastid transcriptionally active chromosome proteins (pTACs) [34]. The expression of several of these nucleoid proteins also largely peaked during the plastid phase (Fig. 4j), although that of the wheat homolog of pTAC12/HEMERA (see later) continued for longer and that of pTAC11/WHIRLY3 followed a distinct profile, more akin to that of chloroplast translation-associated proteins (see below). We also observed an apparent final loss of around 50% of the cpDNA in the mature leaf sample. Given the fact that we quantified three different plastid DNA genes, in the so-called large and small single-copy regions and in the inverted repeat (Additional file 1: Figure S6), this decrease cannot be explained by plastid genome rearrangements. The decrease, however, is consistent with a smaller decrease observed in mature maize leaf stages in one study [35], and does not support the near-total loss observed in another study [36].
While the vast majority of plastid-encoded proteins play a photosynthetic role, chloroplast ribosomes contain plastid-encoded rRNA, making an early, active plastid genome essential. Like chloroplast genomes, chloroplast ribosomes, as quantified by the content of 16S cprRNA relative to 18S cytosolic rRNA, were present in very low amounts in shoot apical or leaf base cells, and accumulated largely during the plastid growth phase, by that time achieving already more than 50% of their final content in spite of the small total plastid content (Fig. 4k, Additional file 1: Figure S7). This was corroborated using two separate techniques (see “Methods”). As a result, the investment of cellular translation capacity clearly shifts from almost entirely cytoplasmic (less than 1% plastidial), when cell proliferation is taking place, to more balanced (between 1/5 and 1/3 of total rRNA being plastidial), for almost the entire duration of the greening process. Genes for nuclear-encoded chloroplast proteins constituting part of ribosomes or otherwise associated with chloroplast translation exhibited the broadest profiles in expression, spanning both the plastid and the chloroplast phases (Fig. 4l), rising rapidly in sample 3, after the first 5 mm of leaf base (in cells of under 1 day of age since leaving the meristem) and remaining high until at least 80 mm, in sample 12 (2 days later).
To support chloroplast biogenesis, the capacity for the import of nuclear-encoded proteins into chloroplasts needs building up. Our data show that the early plastid phase coincides with peaks of expression of genes for several protein import translocon components, at the outer and inner plastid envelopes (samples 3–4, 5 to 15 mm from the meristem, Fig. 4m, n, Additional file 5: Table S4). These components include homologs of TOC34, TOC159 (I) and the channel TOC75. Meanwhile, an alternative TOC159 (II), expressed in cells at the shoot apex, reinitiated expression later. Of note, the gene for the SP1 ubiquitin ligase, involved in Arabidopsis in the remodelling of import complexes to switch from an import function for housekeeping to one for photosynthetic polypeptides, or vice versa [37], reached highest, broad levels of expression around the transition point from the plastid to the chloroplast phase (Fig. 4m).
Transcription occurs in plastids at nucleoids. In agreement with previous observations [38], different actors of plastid transcription were synthesized in succession, with the nuclear-encoded RNA polymerase (RPOTp, homologous to the mitochondrial polymerase) being expressed early (Fig. 4o). Our nuclear transcriptome data do not include the expression of the subunits of the multimeric, alternative, chloroplast-encoded RNA polymerase, also known as plastid-encoded polymerase or PEP. However, the SVR4/RCB (regulator of chloroplast biogenesis) protein has recently been shown to play a central role in PEP assembly [39], and its gene expression profile (Fig. 4o) matches those of both the RPOTp and that which had been followed by several pTACs (Fig. 4j) whose loss indeed impacts early PEP function [34]. In a very revealing manner, sigma factors, the nuclear-encoded regulatory subunits of the chloroplast-encoded polymerase, peaked in expression successively, in the order of SIG3, SIG2_1 (plastid stage), SIG2_2, SIG6 (transition) and SIG1, SIG5 (chloroplast stage), suggesting dedicated function in expression of different cohorts of PEP-dependent genes underpinning the phases of chloroplast biogenesis.
The latter phase of chloroplast development involves photosynthetic build-up
We carried out a bulk quantitation of the development of photosynthetic apparatus (chlorophyll-containing reaction centers and antenna proteins) by measuring chlorophyll per unit leaf mass (Fig. 5a). This showed that pigment-containing complexes accumulate gradually but very slowly in young cells undergoing the plastid expansion phase, their rate of accumulation becoming substantial only around sample 7, 30 mm (cell age between 1.5 and 2 days), as chloroplasts initiate their rapid growth phase, starting from roughly 30% of their final individual area.
Signature genes (Additional file 5: Table S4) were used to characterize the development of the photosynthetic apparatus. Their relative expressions were remarkably consistent, but exhibited developmental (pseudo-) time shifts: genes involved in chlorophyll and carotenoid biosynthesis initiated their expression early, in sample 4, between 1 and 1.5 days of cell age, while plastids were still proliferating; the CURT1 gene, whose product plays a structural role in thylakoid membrane development [40], followed an identical profile (Fig. 5b). Genes for nuclear-encoded proteins associated with the reaction centers, or for antenna or electron transport components, followed similar kinetics (antenna transcript levels dropping earlier), but were shifted by a few hours (at this stage by about one sample, Fig. 5c). Carbon fixation-related genes followed almost immediately after (Fig. 5d). Notably, genes for photorespiration-related chloroplast enzymes followed essentially identical patterns to those for carbon fixation (Additional file 5: Table S4, Fig. 5d), consistent with shared regulation. In summary, the distinct plastid and chloroplast phases of organelle biogenesis are underpinned by corresponding, distinct gene expression programs to synthesize and assemble the photosynthetic capacity.
We then set out to determine how transcript levels are reflected in the abundance of plastid-localized proteins during the distinct stages of plastid biogenesis. To this end, we used immunoblots to monitor the levels of three protein products of genes representing each of the two phases. SVR4/RCB, ARC5 and TIC40 represent fundamental agents in PEP assembly, plastid division and protein import respectively. Proteins that are part of the photosystem antenna (LHCB1), are associated with the PSII reaction center (PsbO) or form part of the carbon fixation cycle (SBPase) were also selected. Transcripts for these genes accumulated in cells at the “plastid growth” (RCB, ARC5, TIC40) or “chloroplast” phases (LHCB1, PsbO, SBPase, Fig. 5e). Immunoblots demonstrated a clear plastid- (RCB, ARC5) or chloroplast-phase (LHCB1, PsbO, SBPase) protein accumulation profile, while TIC40 protein, in spite of being the product of a “plastid stage” transcript, was abundant through both stages, presumably a result of low protein turnover and of a role which is also fundamental for photosynthetic chloroplast differentiation (Fig. 5f). Interestingly, TIC40 appeared as two forms of slightly different electrophoretic mobility, the transition between them coinciding almost exactly with the plastid/chloroplast stage transition. The nature and significance of this transition is currently unknown. These protein data provide further support for the distinct phases of plastid biogenesis.
Stage-specific modelled activity of candidate regulators of chloroplast biogenesis
In an effort to associate candidate drivers to the two phases of the chloroplast biogenesis gene expression program, we examined the expression of previously identified proteins with either chloroplast-related transcription factor function, or a range of other functions but which also impinge on the regulation of transcription for chloroplast proteins. Such regulators have been identified by mutants leading to defects in greening (G2/GLK [41]) or the response to light (HY5 [42], HEMERA/pTAC12 [43], RCB/SVR4/MRL7 [39] and NCP/MRL7-L [44]) or cytokinin (GNC [45]). Supporting evidence of the regulatory roles of GLKs and GNC is the fact that, when overexpressed, they promote ectopic greening of excised Arabidopsis roots [46]. CIA2 was identified through its involvement in the expression of chloroplast protein translocon components [47]. While not a direct transcriptional regulator, we also separately monitored expression of the GUN1 gene, whose product is central for chloroplast-to-nucleus (retrograde) communication, which itself has a major impact on nuclear gene expression [48].
GNC, RCB, HEMERA and one CIA2 homolog were found to coincide in expression with the plastid expansion phase (Fig. 6a; for RCB see Fig. 5e). HY5 was expressed early but peaked in expression at the plastid/chloroplast growth transition stage, which occurs approximately at the stage of leaf emergence from the translucent coleoptile into full light (Fig. 6a). Two GLK1 homologs exhibited elevated expression during the chloroplast greening phase, consistent with their known targets, although GLK1_2 also showed a degree of both early and very late expression, possibly suggestive of further plastid development and chloroplast resource mobilization roles. Only a second CIA2 homolog (which we name CIA2_2) and NCP exhibited expression potentially associated with the very active plastid proliferation and growth phase (Fig. 6a).
Of interest, the central “retrograde” communication gene GUN1 was maximally expressed at the transition stage between the plastid and chloroplast phases (Additional file 1: Figure S8). GUN1 encodes a pentatricopeptide repeat (PPR) protein. Members of the PPR family are involved in RNA editing, turnover or other processes of metabolism of RNA for components of the translation machinery, as PPR5 or PPR103, or for individual subunits of photosynthetic complexes, as HCF152 or PPR10 [49] or for both kinds of proteins as PGR3 [50]. The profiles of these PPR genes reflected those functions, but that of GUN1 at the transition stage was unique among them (Additional file 1: Figure S8), and this may have implications for our understanding of its biological role.
Light-dependent chloroplast development is also known to involve the inactivation of PIFs, transcription factors of the bHLH superfamily and negative regulators of chloroplast development in the dark, which are rapidly turned over in the light [51, 52]. Homology searches against known monocot PIFs [53] identified five sets of wheat homoeologous genes, homologous to either PIF1 or PIF3. Their expression (Additional file 1: Figure S9a) was largely restricted to samples 6 onwards (over 20 mm from the meristem), mostly during the chloroplast growth phase. The exception was PIF3.2, whose expression was limited to the plastid phase (Additional file 1: Figure S9a). Other indirect negative regulators of light responses are the COP1 and DET1 genes, COP1 protein being involved in the turnover of positive regulators like HY5 in the dark, or negative regulators like PIF1 in the light [53]. DET1 exhibited a late plastid-phase expression, also the time of maximal plastid growth, while that of COP1 was chloroplast-phase-specific (Additional file 1: Figure S9a).
We examined whether the candidate transcriptional regulators could be associated with the gene expression program underpinning chloroplast biogenesis. To this end, we sought a prediction of potential links between target genes (all genes for plastid-localized proteins) and their candidate regulators using a computational algorithm. We first assembled the target list by filtering the DYGs (Fig. 2c) for those encoding proteins with a predicted or previously observed plastid localization (Fig. 6b, Additional file 1: Figure S9b, Additional file 6: Table S9, see “Methods”). We then used GENIE3, a top-performing gene regulatory network reconstruction tool employing a random forests algorithm [54] to select the most likely candidate regulator from among the above known regulatory genes. We did this together for all candidate regulators, but visualized the result separately for positive and negative ones, as heatmaps of rankings of association calculated by GENIE3, rather than as a classic network. This was because the heatmap display better reflects both possible outcomes, presence or absence of connectivity between target and regulator. Figure 6c represents for candidate positive regulators the result, in which the color reflects the likelihood of regulation of a gene for a chloroplast protein in the corresponding row in Fig. 6b, by the regulator in the corresponding column in Fig. 6c. This result (see also Additional file 7: Table S10) predicted substantial roles for RCB and also (unexpectedly) GLK1_1 homologs during the plastid build-up stage, for HEMERA to the large group of genes which includes, among others, those for plastid ribosomal proteins, and again for GLK1_1 during the chloroplast build-up stage. It also showed very limited connectivity for the only candidates with early expression, homologs of CIA2 and NCP, or for any other candidate regulators, to genes active during the early stages of plastid build-up, for example when proliferation occurs (Fig. 6c). 
As for potential negative regulators (Additional file 1: Figure S9c), if considered taking into account the fact that the regulatory role would be repressive, the result pointed to only a low ranking role for PIF3.2 in the plastid build-up phase.
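The step from association scores to the rankings displayed in such heatmaps can be sketched as follows. This is not GENIE3 itself: the sketch assumes that a score for every regulator-target pair has already been computed, and only converts those scores into per-target ranks (1 = strongest association). The function name, the target identifiers and all score values are hypothetical; only the regulator names are taken from the text.

```python
def rank_regulators(importance):
    """Convert association scores into per-target regulator ranks.

    importance: {target: {regulator: score}}, higher score = stronger link.
    Returns {target: {regulator: rank}}, with rank 1 for the strongest link.
    """
    ranks = {}
    for target, scores in importance.items():
        ordered = sorted(scores, key=scores.get, reverse=True)
        ranks[target] = {reg: i + 1 for i, reg in enumerate(ordered)}
    return ranks


# Hypothetical scores for two made-up plastid-protein genes.
scores = {
    "target_A": {"RCB": 0.40, "HEMERA": 0.10, "GLK1_1": 0.25},
    "target_B": {"RCB": 0.05, "HEMERA": 0.12, "GLK1_1": 0.50},
}
for target, row in rank_regulators(scores).items():
    print(target, row)
```

Ranking within each target makes rows comparable even when absolute scores differ widely between targets, which suits a heatmap display in which the absence of a link is as informative as its presence.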