Single-cell RNA sequencing of Chinese cabbage shoot and leaf cells
To systematically determine gene expression patterns during Chinese cabbage leaf development, we isolated protoplasts from inbreeding line Chinese cabbage A03 shoot apices (0.5 cm in length from the shoot tip) (S) and developing leaves (L) and profiled them using droplet-based scRNA-seq to generate a single-cell transcriptomic atlas (Fig. 1a). After enzymatic digestion, shoot and leaf cells were isolated from 1-week-old and 4-week-old plants grown at 25 °C day/18 °C night, respectively. To monitor the reproducibility of the experiment and reliability of the scRNA-seq results, two replicates were included for both shoot and leaf cell samples. A total of 12,985 individual shoot cells (6392 in S1 and 6592 in S2) and 17,245 individual leaf cells (8350 in L1 and 8895 in L2) were labeled (Fig. 1b). Then, cDNA libraries were generated and sequenced, and the data were filtered at both the cell and gene levels. Approximately 64,000 reads and 1600 median genes per shoot cell and 51,000 reads and 3400 median genes per leaf cell were detected for further analysis (Additional file 2: Table S1).
To generate a cell atlas of Chinese cabbage leaf development, we merged two shoot apex samples (S1 and S2) with two leaf samples (L1 and L2) for cell clustering and annotation. In total, 28,343 single-cell transcriptomes were used to identify distinct cell populations, and they were grouped into 19 distinct clusters (Additional file 3: Table S2; Additional file 1: Figure S1). The t-distributed stochastic neighborhood embedding (t-SNE) method was used to visualize and explore the cell clusters (Fig. 1c). These clusters harbored similar numbers of cells in each replicate but showed differences between shoot and leaf samples (S/L) in terms of the proportion of these two cell types. Clusters 3, 4, 6, 7, 8, 10, 11, 14, 15, 17, and 18 contained significantly more cells from shoot samples (S), while clusters 0, 1, 2, 5, 9, 13, and 16 contained significantly more cells from leaf samples (L) (Additional file 1: Figure S1).
Identification of major cell types in the shoots and leaves
To annotate each cluster, we first identified cluster-enriched genes that were highly expressed in one cluster compared to all the other clusters (the genes must be expressed in 25% of cells within the cluster; q ≤ 0.01; log2|fold change (FC)| ≥ 0.36) (Additional file 4: Table S3). To explore the potential regulators of different clusters, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed (Additional file 5: Table S4). The distribution of cluster-enriched genes ranged from 387 to 2021 per cluster, and GO analysis of these gene sets revealed the potential biological functions of the genes expressed in each cell cluster, which helped us to predict cell types. In addition, the expression of a series of marker genes, including those whose functions have been thoroughly studied or those identified from transcriptomic datasets, was compared across clusters to determine the cell types in those clusters (Additional file 1: Figure S2). Some of these marker genes were highly and specifically enriched in each corresponding cluster and helped us to identify major cell types present in the shoots and leaves, which included meristem cells (four clusters), mesophyll cells (six clusters), proliferating cells (one cluster), epidermal cells and guard cells (two clusters), and vascular cells (three clusters) (Additional file 6: Table S5).
The expression of key meristem development genes, such as SHOOT MERISTEMLESS (STM), KNOTTED 1-LIKE HOMEOBOXs KNAT1, KNAT2 and KNAT6, CYTOCHROME P450 (CYP78A5), and LIGHT SENSITIVE HYPOCOTYLS 3 (LSH3), showed high specificity in clusters 3, 4, 6, and 7, which were shoot meristematic cells [13,14,15]. The proliferating cell marker genes, including 3XHIGH MOBILITY GROUP-BOX 2 (3xHMG-box2) and SYNTAXIN OF PLANTS 111 (SYP111), were expressed in cluster 11 cells [16, 17]. The vascular cells were assigned to clusters 10, 15, and 17, in which the following genes were expressed: the companion cell genes ARATHNICTABA 5 (AN5), SUCROSE-PROTON SYMPORTER 2 (SUC2), TETRASPANIN 6 (TET6), and HEAVY METAL-ASSOCIATED PROTEIN 42 (NAKR1); the phloem-related genes UAS-TAGGED ROOT PATTERNING 3 (URP3) and DNA BINDING WITH ONE FINGER 5.6 (DOF5.6); and the xylem-related genes ACAULIS 5 (ACL5), ABNORMAL SHOOT 5 (ABS5), and FASCICLIN-LIKE ARABINOGALACTAN-PROTEIN 12 (FLA12) [6, 18,19,20,21]. Clusters 5 and 12 comprise epidermal cells and guard cells. The epidermal cell and guard cell marker genes included PROTODERMAL FACTOR 1 (PDF1), FIDDLEHEAD (FDH), MERISTEM LAYER 1 (ATML1), FAMA (FMA), ALUMINUM-ACTIVATED 12 (ALMT12), and STOMATAL CARPENTER 1 (SCAP1) [22,23,24,25,26,27]. Mesophyll cells were assigned to clusters 0, 1, 2, 9, 13, and 14. The light-dependent gene RUBISCO SMALL SUBUNIT 3B (RBCS3B) and chloroplast-related genes CHLORORESPIRATORY REDUCTION 23 (CRR23) and BUNDLE SHEATH DEFECTIVE 2 (BSD2) were highly expressed mainly in clusters 2 and 13 and weakly expressed in clusters 0, 1, and 9. These genes encode chloroplast proteins that are highly expressed in the mesophyll cells [6, 7, 28]. In addition, clusters 0 and 2 were highly enriched for the expression of “chloroplast” signature genes, and clusters 2, 13, and 14 were enriched in genes involved in the “photosynthesis” pathway, suggesting that these mesophyll cell populations play pivotal roles in light capture (Additional file 1: Figure S1). Similarly, correlations between clusters could reveal the cell type organization. We found a strong correlation between clusters 0, 1, and 9 (>0.95). Clusters 16 and 18 showed weak correlations (0.2 and 0.1, respectively) with all the other clusters. Using the known marker genes, we could not determine the cell types for clusters 8, 16, or 18.
Identification of novel cell-type marker genes
Several databases are available for selecting cell types in a few plant species, including Arabidopsis, rice, maize, and peanut, but no vegetable species are represented. However, we found that these marker genes are not expressed exclusively in a single cell type, making it desirable to identify novel genes with cell-type-specific expression in Chinese cabbage.
To explore the potential marker genes for different cell types, we analyzed gene expression profiles in shoot meristematic cells, mesophyll cells, epidermal cells, guard cells, vascular cells, and proliferating cells and in three unknown cell clusters, UK8, UK16, and UK18. We confirmed the genes with high (average expression value in the target cluster > that of the others) and cell-type-specific expression (genes must be expressed in 25% of cells within the target cell type and < 25% of cells in all the other cell types; p value ≤ 0.01; log2FC ≥ 0.5). In total, 24 genes in SCs, 229 genes in mesophyll cells, 78 genes in epidermal cells, 116 genes in guard cells, 72 genes in vascular cells, and 219 genes in proliferating cells as well as 22 genes in UK8, 237 genes in UK16, and 808 genes in UK18 were identified (Additional file 7: Table S6). Some genes with known gene functions were included (Additional file 1: Figure S2): the stomatal guard cell differentiation promote the gene FAMA (BAA01g31960, BAA03g42630) and the putative Na+/H+ antiporter gene CHX20 (BAA04g05930) in guard cells; the plant-specific transmembrane domain-containing protein-encoding gene MLO6 (BAA01g31040) and the wax biosynthesis gene KCS3 (BAA09g69430) in epidermal cells; the cell expansion gene ARL (BAA03g24360) in shoot meristematic cells; the mitotic cell cycle and division control gene SCL28 (BAA09g16200) in proliferating cells; the large subunit of ADP-glucose pyrophosphorylase-encoding gene ADG2 (BAA10g21220) in mesophyll cells; and the phloem development gene APL (BAA07g26030, BAA02g27290) and the Cu-chaperone protein-encoding gene CCH (BAA07g21830) in vascular cells. Moreover, many genes whose functions were unknown were identified from the transcriptomic datasets. The top 20 marker genes with the highest expression in each cell type were selected for display in the t-SNE plots to show the cell type specificity (Additional file 1: Figure S2).
Predominant gene expression from different subgenomes in different cell types
Compared with the model plant species Arabidopsis, Brassica species experienced a WGT event, which has played an important role in the morphotypic diversification of Brassica plants. The genome of Chinese cabbage comprises three subgenomes, namely, LF, MF1 and MF2, which differ in both gene density and gene expression.
The Chinese cabbage A03 line was found to contain 14,470 genes (covering 31% of the A03 genome) in LF, 10,160 genes in MF1 (covering 21%), and 8578 genes in MF2 (covering 18%), as well as 14,605 ungrouped genes (UG) (covering 30%) [29] (Fig. 2a). To characterize the genes that shows predominant expression from a specific subgenome, gene expression was measured for all genes in the populations of mesophyll cells, mesophyll cells, epidermal cells, guard cells, vascular cells, and proliferating cells respectively, and expressed genes were identified as those expressed in at least 5% of cells in the target cell type. In total, 13,011 genes in mesophyll cells, 12,997 genes in proliferating cells, 10,325 genes in vascular cells, 10,798 genes in epidermal cells, 11,684 genes in guard cells, and 9857 genes in SCs were identified as expressed genes. In each cell type, the proportion of expressed genes in the subgenome was similar. The expression percentage/density of the subgenomes was in the order of LF>MF1>MF2>UG; 41% showed expression in the LF subgenome in each cell population, 26% showed expression in MF1, 22% showed expression in MF2, and 10% showed expression in the UG (Fig. 2b). The mean expression values of all expressed genes in LF, MF1, and MF2 were similar across different cell types and significantly higher than those in the UG, which contained the largest number of genes (Fig. 2c). Unlike the B. rapa genes in the three subgenomes, those in the UG were identified as having no syntenic orthologs in Arabidopsis.
The syntenic orthologs between Arabidopsis and Brassica were identified using the method described by Cheng et al. (2012), and 9922 one-to-one copies, 6468 pairs one-to-two copies, and 2083 groups one-to-three copies of Brassica genes were identified in A03. Together with nonsyntenic genes in the UG, we quantified the number of expressed genes from different subgenomes in all six cell types. The UG had the highest number of genes, but the proportion of genes expressed within the group was the lowest—approximately 10%. For the syntenic genes, the proportion of expressed genes within groups was high for one-to-three-copy genes, followed by one-to-two-copy and one-to-one-copy genes (Fig. 2d).
Does a subgenome exhibit dominant expression at the cellular level? An analysis of the transcript levels of genes in different cell types was performed to identify the expression differences among duplicated co-orthologs. We used a twofold change method to evaluate the predominantly expressed genes, or those that were more highly expressed in one subgenome than in the other two subgenomes according to the criteria that at least 5% of cells were expressed in the target cell type and the |log2FC| was ≥0.36 with a p value of ≤0.05 (Additional file 8: Table S7). We found that the genes in the LF subgenome were predominantly expressed compared with the genes in MF1 and MF2, especially in SCs and proliferating cells (Fig. 2f). In addition, more genes in MF1 than in MF2 were predominantly expressed. Not only the difference in the expression level but also the function of the predominantly expressed genes in each subgenome differed across different cell types (Additional file 1: Figure S3). For example, in the mesophyll cells, the LF predominantly expressed genes were enriched in phosphatase regulator activity, organonitrogen compound biosynthetic process, and ribosome-related terms; MF1 predominantly expressed genes were enriched in cellular respiration, translation, and ATPase complex transport; and MF2 predominantly expressed genes were enriched in peptide biosynthetic and metabolic processes, intercellular parts, and organelle membrane systems. Although the expression distribution across the three subgenomes was similar in different cell types, the expression patterns of duplicated genes differed between cell types. An analysis of the transcript levels of genes in different cell types revealed the expression differences between duplicated co-orthologs (Fig. 2e). Interestingly, orthologs of Arabidopsis genes were predominantly expressed in different cell types. For example, BAA06g05630 (LF) and BAA08g34220 (MF1) are homologous to AT1G08050, which encodes a zinc finger (C3HC4-type RING finger) protein, and these genes were expressed in epidermal cells and vascular cells, respectively, which may indicate that the copies of these genes had different functions in different cell types.
Heat stress induced transcriptomic changes vary among cell types
Chinese cabbage is a cool-season leafy vegetable species whose leaf development is greatly influenced by temperature. To study the cellular heterogeneity of Chinese cabbage leaves in response to heat stress, we isolated protoplasts from the third true leaves of 4-week-old plants grown at 40 °C for 12 h (“heat”). Protoplasts from plants grown at 25 °C day/18 °C night were used as controls.
In total, 17,245 “control” individual leaf cells (8350 in L1-C and 8895 in L2-C) and 20,663 “heat-treated” individual leaf cells (11,075 in L1-H and 9588 in L2-H) were obtained (Additional file 1: Figure S4). The medians of the unique molecular identifiers (UMIs) and genes were 9618 and 3208 per cell in LC and 8310 and 3546 per cell in LH, respectively. We classified 34,953 leaf cells into 19 clusters, and the uniform manifold approximation and projection (UMAP) algorithm was used to visualize and explore the datasets. All 19 cell clusters included cells from both control and heat-treated plants, which suggests that the cell type identities were not affected by heat treatment (Fig. 3a). However, the percentage of cells in each cluster was notably different between LC and LH (Additional file 1: Figure S4). The aforementioned marker genes were used to determine the cell types of each cluster. In the central part of the leaves, mesophyll cells were found in the most cell clusters, including clusters 0, 1, 2, 3, 4, 6, 7, 8, 10, and 11. Cluster 15 comprised proliferating cells. Phloem cells, companion cells, and xylem cells contained in the vasculature were identified in clusters 9, 14, 17, and 18. Cluster 5 and neighboring cluster 12 comprised epidermal cells and guard cells. Consequently, the major cell types in the leaves were identified, including mesophyll cells, vascular cells, and epidermal cells.
The UMI reflects the number of transcripts captured by scRNA-seq. After heat treatment, we found that the total UMI significantly decreased in the mesophyll cell, epidermal cell, guard cell, and vascular cell types but not in the proliferating cells (Fig. 3b). We measured the expression of genes affected by heat stress in different cell types. The reference genes UBC10, TUA, TSB, CAC, SNF, SAND, UBC1, PP2A, and ZNF were not differentially expressed in different cell types between the control and heat stress treatment groups [30, 31] (Additional file 9: Table S8). Several groups have reported that gene expression patterns were affected after heat treatment in different cell types.
The differentially expressed genes (DEGs) for each cell type between the control and heat-treated conditions were detected when there was a |log2FC| ≥ 0.36 difference in the average expression level and when P < 0.05 (Additional file 10: Table S9). We found that the DEGs varied in different cell types: mesophyll cells (5924 up, 2143 down) > epidermal cells (4030 up, 1212 down) > vascular cells (2368 up, 934 down) > proliferating cells (999 up, 426 down) > guard cells (770 up, 246 down) (Fig. 3c). Of them, the expression of only 150 upregulated DEGs and 92 downregulated DEGs was altered in all five cell types (Fig. 3d). In contrast, more DEGs were specifically active in a single cell type, including 3202 in mesophyll cells (2261 up, 941 down), 906 in epidermal cells (610 up, 296 down), 457 in vascular cells (297 up, 160 down), 235 in proliferating cells (120 up, 115 down), and 108 in guard cells (73 up, 35 down). These cell type-specific DEGs were enriched in various GO terms and were also different from those in the overall leaf tissues (Additional file 1: Figure S5). Cell type-specific DEGs in the epidermis, the outermost cell layer surrounding the leaves, were mostly enriched in the categories “ribosome,” “channel activity,” and “stress regulation” in epidermal cells and “sucrose” in guard cells. In mesophyll cells and proliferating cells, DEGs showed signatures for “organophosphate,” “histone,” “amino acids catabolic,” “kinase,” and “cellular and intercellular.” In vascular cells, the DEGs were predicted to be significantly enriched in “transport” processes.
Cell type-specific heat stress response marker genes
Expression analysis of the cell type-specific DEGs whose expression changed after heat treatment only in one cell type could be considered potential heat stress response marker genes for different cell populations (Additional file 11: Table S10; Additional file 1: Figure S5). For example, BAA03g00780, which is syntenic to Arabidopsis AT5G02380 (MT2B) and encodes a small, cysteine-rich metal-binding protein active in the response to lead exposure, NaCl stress, and heavy metal tolerance, was upregulated in mesophyll cells after heat treatment; BAA03g43520, which is syntenic to the Arabidopsis salt stress response gene AT3G52590 (UBQ1), was upregulated in epidermal cells after heat treatment; and BAA09g53920, which is syntenic to Arabidopsis AT3G62550, which encodes a protein with an ATP-binding motif, was downregulated in mesophyll cells. Moreover, many Brassica genes with unknown functions, such as BAA09g04920 in epidermal cells, BAA07g19420 in vascular cells, BAA03g14340 in guard cells, and BAA10g23280 in proliferating cells, were specifically down- or upregulated in one cell type after heat treatment (Fig. 3e). These cell-specific corresponding genes with unknown functions need to be studied to explore the cell-specific thermal response mechanism.
Expression patterns of HSPs in different cell types
There were more upregulated genes than downregulated genes in all five cell types after heat stress. By investigating heat-responsive genes at the cellular level, we found differences at the tissue level. The expression of HSP genes, which are well-known target genes of heat stress responsive TFs, was more often downregulated than upregulated under heat stress in this study. We found that high temperature downregulated 38 HSP genes and upregulated eight HSPs of various families. The expression of these HSPs was different in the different cell populations, but not all of them showed significant changes in a particular cell type. For downregulated HSPs, the expression of 23 HSPs was altered in all five cell types, while 15 HSPs showed a cell-type-specific response. For example, HSP23.6 and HSP90-7 were downregulated only in vascular cells; HSP90-6 was downregulated only in epidermal cells; HSP22 was downregulated in epidermal cells and vascular cells; and HSP15.7 was downregulated in mesophyll cells and vascular cells. The eight upregulated HSPs all showed cell-type-specific expression changes. HSP90-5 and HSP70-16 were upregulated in only mesophyll cells, and HSP70-1 was upregulated in mesophyll cells and vascular cells.
There are multiple copies of HSPs in Chinese cabbage, and we found that duplicated pairs of genes exhibit differences in expression profiles in response to heat stress in the same cell type. Three orthologous genes of HSP18.2 were identified between Chinese cabbage (LF-BAA10g17490, MF1-BAA03g11920, MF2-BAA02g11630) and Arabidopsis (AT5G59720). The expression of HSP18.2 was not affected by heat stress in the mesophyll cells, epidermal cells, guard cells, or proliferating cells. In the vascular cells, MF2-BAA02g11630 expression increased, whereas LF-BAA10g17490 and MF1-BAA03g11920 expression decreased. For the genes present in multiple copies in the Chinese cabbage LF, MF1, and MF2 subgenomes, differential expression patterns were also detected. Collectively, the genes whose expression was affected by heat stress were more highly expressed in LF than in MF1 and MF2 (Fig. 3f).
Heat stress resulted in opposite gene expression patterns in different cell types
The heat stress response is essentially a single-cell response. The pattern of gene expression in different cell types under heat stress conditions is different. In particular, some genes showed opposite expression patterns in different cell types in response to heat stress (Additional file 12: Table S11). We found 43 genes in the mesophyll cells, 140 genes in the epidermal cells, 97 genes in the guard cells, 65 genes in the vascular cells, and 54 genes in the proliferating cells that showed opposite expression patterns in the other cell types (Additional file 1: Figure S6). For example, the calcium-binding EF-hand family protein-encoding gene MSS3 was upregulated in mesophyll cells but downregulated in the epidermal cells after heat treatment; the gene encoding a WRKY DNA-binding protein, WRKY8, was upregulated in the vascular cells but downregulated in the mesophyll cells; and many ribosomal protein family genes, including S8e, L2, L22e, L24e, and L36e family members, were upregulated in the epidermal cells but downregulated in the other cell types (Fig. 3g). These observations reveal the divergence of the expression of duplicated genes and highlight a new perspective for studying gene functional diversification.