FlsnRNA-seq: protoplasting-free full-length single-nucleus RNA profiling in plants
Genome Biology volume 22, Article number: 66 (2021)
The broad application of single-cell RNA profiling in plants has been hindered by the prerequisite of protoplasting that requires digesting the cell walls from different types of plant tissues. Here, we present a protoplasting-free approach, flsnRNA-seq, for large-scale full-length RNA profiling at a single-nucleus level in plants using isolated nuclei. Combined with 10x Genomics and Nanopore long-read sequencing, we validate the robustness of this approach in Arabidopsis root cells and the developing endosperm. Sequencing results demonstrate that it allows for uncovering alternative splicing and polyadenylation-related RNA isoform information at the single-cell level, which facilitates characterizing cell identities.
High-throughput single-cell transcriptome studies have thrived in animal and human research in recent years [1,2,3,4,5]. However, despite successful single-cell characterization at a relatively low scale in maize developing germ cells  and rice mesophyll cells  using capillary-based approaches , only a handful of large-scale single-cell RNA studies using high-throughput platforms such as 10x Genomics or Drop-seq  have been published in plants , most of which profiled protoplasts generated from the root of Arabidopsis [11,12,13,14,15,16,17,18,19]. A major reason for this narrow focus of tissue type is that plant cells are naturally confined by cell walls, and protoplasting is required to release individual cells—a procedure that is thoroughly tested for Arabidopsis roots [20,21,22] but remains to be difficult or impractical in many other tissues or species. Moreover, generating protoplasts from all cells uniformly is challenging given the complexity of plant tissues, and the enzymatic digestion and subsequent cleanup process during protoplast isolation may trigger the stress response and influence the transcriptome. Therefore, a protoplasting-free method is urgently needed to broaden the application of large-scale single-cell analysis in plants.
We recently characterized full-length nascent RNAs in Arabidopsis and unexpectedly found a large number of polyadenylated mRNAs that are tightly associated with chromatin . Since it is considerably easier and more widely applicable to perform nucleus isolation on various plant tissues than protoplasting, we set out to test if the polyadenylated RNAs in a single nucleus are sufficient to convey information on cell identity using the 10x Genomics high-throughput single-cell platform. Besides the standard Illumina short-read library which primarily captures abundance information, long-read sequencing has recently been incorporated into single-cell studies [24,25,26]. To access the large number of intron-containing RNAs in plant nuclei, we also constructed a Nanopore-based long-read library and developed a bioinformatic pipeline named “snuupy” (single nucleus utility in python) to characterize mRNA isoforms in each nucleus (Fig. 1a, Additional file 1: Fig. S1). Here, we applied the flsnRNA-seq to root and endosperm, respectively, and demonstrated that the long-read single-nucleus strategy would enable plant biologists to bypass protoplasting and study RNA isoforms derived from alternative splicing and alternative polyadenylation (APA) at the single-cell level and provides additional dimensions of transcriptome complexity that could potentially further improve clustering or characterization of different cell types.
Results and discussion
First, we chose to use the Arabidopsis root to validate the effectiveness of our protoplasting-free single-nucleus RNA sequencing approach because of the well-studied cell types  and the rich resource of single-cell data [11,12,13,14,15,16] of this tissue. We directly isolated nuclei by sorting from homogenized root tips of 10-day-old Arabidopsis seedlings without protoplasting (Additional file 1: Fig. S2). The nuclei were fed to the 10x Genomics Chromium platform to obtain full-length cDNA templates labeled with nucleus-specific barcodes, which are subsequently divided into two equal parts and used for constructing Illumina short-read and Nanopore long-read libraries, respectively (Fig. 1a).
From the Illumina library, we obtained a total of 1186 single-nucleus transcriptomes covering 18,913 genes, with median genes/nucleus at 810 and median UMIs/nucleus at 1131. It is worth noting that the proportion of intron-containing mRNAs is extremely high in plant nucleus—54% compared to less than 2% in total RNAs  (Fig. 1b). After generating the cell-gene abundance matrix from Illumina data, we utilized an unbiased graph-based clustering method Louvain  and identified 14 distinct cell clusters (Fig. 1c). We then applied a set of cell type-specific marker genes provided in a recent massive single-cell study of Arabidopsis roots  to annotate each cluster (see the “Methods” section, Additional file 2: Table S1). We were able to assign cell types to all 14 clusters and identified 10 major root cell types previously reported (Fig. 1c, Additional file 1: Fig. S3), with the signature transcripts for each cell type enriched in the corresponding cluster (Fig. 1d, Additional file 1: Fig. S4). Consistent with previous reports [11,12,13,14,15,16], we also noticed that some cell types from our result are composed of multiple clusters, such as stem cell niche (clusters 1, 4, and 12), mature non-hair (clusters 2 and 6), and endodermis (clusters 5 and 8) (Fig. 1c), demonstrating additional heterogeneity (subcell types) within cell types. Moreover, we found the exact same subcell type marker genes of endodermis are enriched in each of its corresponding subcell types as shown in Zhang et al.  (Additional file 1: Fig. S5), demonstrating the robustness of our single-nucleus data. In addition, we used the Scanorama algorithm  to compare our dataset with several recently published root single-cell datasets from protoplasts [11, 12, 14,15,16]. The expression abundance matrix from our single-nucleus dataset closely resembles the protoplasting-based single-cell dataset generated from the same tissue (10-day seedling, 0.5 mm primary root tips)  (Fig. 2a, b). Taken together, we demonstrated that transcriptomes of the single nucleus are sufficient for cell type identification and can be used as a reliable alternative to protoplasts.
As to the Nanopore data analysis, a key challenge is that the relatively low sequencing accuracy of Nanopore (~ 95% per base) makes it difficult to correctly recognize the cell barcodes and UMI information on each Nanopore read. To solve this problem, Lebrigand et al. developed a method named Sicelore to use Illumina short reads generated from the same cDNA library as the guide to allocate Nanopore reads . Sicelore searches for both polyA and adapter sequence and defines the region between these two as the potential barcode and UMI. However, this algorithm relies on the recognition of polyA tail sequence generated by the Nanopore basecalling software, which tends to severely underestimate the length of polyA tail . We tried to further improve Sicelore by developing a polyA-independent algorithm (named snuupy), which searches for cell barcodes and UMIs in the unmapped region of Nanopore reads (see the “Methods” section and Additional file 1: Fig. S1, Fig. 3a). As the result, snuupy recovers 20% more reads from our Nanopore data compared to using Sicelore  (Fig. 3b). After snuupy processing, we obtained 1169 long-read single-nucleus transcriptomes from Nanopore data (compared to the 1186 from Illumina data). The median UMI counts per nucleus (729) and the median gene counts per nucleus (563) from Nanopore data are ~ 64% and ~ 70% of the Illumina count, respectively, and highly consistent in all nuclei (Fig. 3c). The clustering result using Nanopore abundance matrix closely resembles the one generated by Illumina data (Fig. 3d, e), suggesting that Nanopore data itself is sufficient for cell type classification, consistent with a recent large-scale single-cell analysis in human and mouse cells performed entirely with Nanopore data [24, 25].
The single-nucleus long-read Nanopore library provides isoform-level information such as splicing and APA, compared to Illumina library which only captures abundance information of transcripts. Therefore, we generated two additional isoform matrices to track splicing and APA in single nucleus, respectively (Fig. 4a and Additional file 1: Fig. S6), and combined them with the Illumina abundance matrix for a multilayer clustering, to test if these extra layers of information could improve cell type classification. Indeed, we found that the original cluster 2 (mature non-hair) and cluster 10 (cortex) from Illumina data (Fig. 1c) can be further separated into two subcell type clusters after the multilayer clustering (Fig. 4a). As an example, from the Illumina data, transcripts of AT3G19010 are present in both subcell type 2.1 and 2.2 (Fig. 4b, c), while the Nanopore data revealed a large difference at the splicing level of this gene between the two sub-clusters, with the second intron largely unspliced in subcell type 2.2 (Fig. 4d). It is worth noting that, JAZ7, the top 1 enriched gene in cluster 2.2 (Fig. 4e), can regulate splicing during jasmonate response , implying a fascinating potential of cell type-specific regulation of splicing that could be investigated with flnsRNA-seq.
After establishing flsnRNA-seq using well-documented root tissue, we applied this method to investigate other tissues that have not been previously characterized at the single-cell level due to difficulties in obtaining corresponding protoplasts. In flowering plants, seed development is initiated by double fertilization, during which egg cell and central cell fuse with sperm cells respectively to form embryo and endosperm . The endosperm is embedded in the seed coat and responsible for providing nutrients from maternal parent to developing embryo [34, 35]. In Arabidopsis endosperm, the primary nucleus formed after fertilization undergoes several rounds of rapid nuclear divisions without cytokinesis, resulting in a multinucleate cell termed syncytium, which later cellularized and differentiated into three endosperm domains: the micropylar, central peripheral, and chalazal [36, 37] (Fig. 5a). Cellularization of the syncytium is critical for embryo viability , and this process is initiated when the embryo reaches the heart stage, starting from the micropylar domain and gradually proceeds to the central periphery in a wavelike pattern and eventually reach the chalazal zone . Transcriptomes from various developing stages of bulked endosperm have been well-documented using microarray or RNA-seq [39,40,41,42,43,44,45]; however, endosperm has yet to be characterized at the single-cell level due to technical challenges in generating protoplasts from the endosperm.
Here, we applied flsnRNA-seq to the multinucleate endosperm isolated from the heart-stage ovules of Arabidopsis and generated both the Illumina and Nanopore 10x libraries. We obtained 576 nuclei from Illumina data, with the median genes/nucleus at 645 and the median UMIs/nucleus at 853. All 576 nuclei were captured by the Nanopore library, with the median genes/nucleus at 300 and the median UMIs/nucleus at 362 (Additional file 1: Fig. S7). Based on the Illumina abundance matrix, we identified six clusters using the Louvain method (Fig. 5a). Next, we used Nanopore full-length transcript data to analyze retained introns in each nucleus and found that the nuclei from cluster 4, a major cluster that accounts for 14% of the total nuclei, exhibits a distinct high ratio of incompletely spliced transcripts (Fig. 5b, c). Several previous studies have established that increased accumulation of intronic reads is an indicator of transcription activation [46, 47], and the ratio of unspliced precursor and spliced mature transcripts have been widely used to estimate RNA velocity [48, 49]. The high ratio of incompletely spliced transcripts in this particular cluster of endosperm nuclei may reflect delayed pre-mRNA decay, disrupted intron turnover rate, or a global activation of transcription. Next, we assigned the cell type of each nucleus using previously reported cell type-enriched genes at the heart stage [50, 51] and found that the majority of nuclei in cluster 4 are annotated as micropylar endosperm (Fig. 5d). In addition, gene ontology (GO) analysis of the genes upregulated in cluster 4 found all enriched cellular component terms are membrane related (Fig. 5e, Additional file 2: Table S3), suggesting these nuclei are poised to forming the cellular membrane and entering the cellularization stage, consistent with the previous observation that cellularization of the Arabidopsis endosperm is first initiated at the micropylar endosperm [38, 52]. Hence, our method identified a unique cluster of endosperm nuclei with a high proportion of incompletely spliced transcripts, and further investigation could determine whether this is due to the increase in transcription or delay in splicing.
Transcriptome profiling using single nuclei of neurons or frozen tissues has been well-established in animals, and the result is comparable to using the RNA from single cells [53,54,55,56]. In plants, the transcriptome from isolated nuclei and from total RNA of the endosperm has also been shown to be well-correlated . Here, as a proof-of-concept demonstration in plant, our results showed that protoplasting-free large-scale single-nucleus sequencing is sufficient for cell type classification and marker gene identification in Arabidopsis. The cost per cell for single-cell sequencing using isolated nuclei on the 10x Genomics platform is still relatively higher compared to using protoplast due to lower capturing efficiency of 10x Genomics on nuclei versus cell, possibly due to the much smaller size of nucleus; thus, this step still could benefit from further optimization. As we are preparing this manuscript, several groups have also recently adopted the nucleus-based protoplasting-free strategy independently to investigate a wide range of plant tissues, including root, leaf, seedling, shoot apex, endosperm, and inflorescence [57,58,59,60,61]. Eliminating protoplasting as a prerequisite would enable large-scale single-cell profiling on a wide range of tissues and plant species; nevertheless, isolating nuclei from some cell types remains challenging and requires further developments of the nucleus isolation protocol . Our method uniquely combined the Nanopore-based full-length RNA sequencing method with single-nucleus sequencing to capture isoform diversity at the single-nucleus level, which can facilitate cell type classification and characterization by providing extra layers of information in addition to abundance.
Nucleus isolation from root tip of Arabidopsis
The wild-type Arabidopsis seedlings (Col-0) were grown on 1/2 MS plates at 22 °C (16 h light/8 h dark) for 10 days before harvest. The root tip region (5 mm) of seedlings were cut and transferred immediately into a 1.5-ml RNase-free Eppendorf tube kept in liquid nitrogen and were ground into fine powder by a 1000-μl pipette tip in the tube. The powder was then dissolved in 300 μl ice-cold Extraction Buffer (EB)—0.4 M sucrose, 10 mM Tris-HCl pH 8.0, 10 mM MgCl2, 0.2% (w/v) Triton X-100, 1 mM dithiothreitol (DTT), 1× protease inhibitor (Roche), 0.4 U/μl RNase inhibitor (RNaseOUT, Thermo Fisher Scientific). Nonionic surfactant Triton X-100 is used to release nuclei and avoid aggregation during FACS . After gentle vertexing and inversion, the homogenate was filtered through a 20-μm cell strainer into a new tube. Another 400 μl EB was added to the strainer to wash the remaining nuclei. After centrifugation at 4 °C, 2000g for 5 min, the supernatant was removed carefully to avoid RNA contaminants from the cytoplasmic fraction. The pellet was washed twice at 4 °C, 2000g, 5 min with 1 ml EB, and then resuspended in 500 μl EB. For sorting, the nuclei were stained with 4,6-diamidino-2-phenylindole (DAPI) and loaded into a flow cytometer with a 70-μm nozzle. One-milliliter EB was used as the collection buffer. A total of 40,000 nuclei were sorted based on the DAPI signal and the nuclear size. To avoid aggregation, the sorted nuclei were pelleted at 4 °C, 2000g, 5 min, and then resuspended in 100 μl PBST buffer (1× PBS with a low concentration of 0.025% Triton X-100). After checking the quality of nuclei and counting under a microscope using the DAPI channel, 5000 nuclei were transferred into a new tube with 500 μl PBST buffer and centrifuged at 4 °C, 2000g, 5 min. Then, the pellet was resuspended in 20 μl PBST buffer and diluted to about 1000 nuclei/μl.
Endosperm nucleus isolation
Endosperm was isolated as previously reported [63, 64]. Briefly, 5 DAP ovules at the heart stage of embryogenesis were placed on a slide under a dissecting microscope. The endosperm nuclei were released to the slide from the seed by gentle pressure after piercing by a needle. The nuclei were resuspended in isolation buffer (0.3 M sorbitol, 5 mM MES (pH 5.7), 0.4 U/μl RNaseOUT) and transferred to a low-bound 1.5-ml Eppendorf tube by a 10-μl pipette. Then, the collected nuclei were filtered with a 20-μm cell strainer to a new tube and then washed twice with washing buffer (EB) at 4 °C, 2000g, 5 min to remove free RNA and contaminants. Then, the nuclei were resuspended in 100 μl PBST buffer and diluted to 1000 nuclei/μl.
Single-nucleus RNA-seq library construction for Illumina and Nanopore sequencing
Libraries were constructed according to the standard 10x Genomics protocol (Single Cell 3′ Reagent Kits v2 User Guide) with modifications to accommodate Nanopore long-read sequencing. For root tip, nucleus suspension from the previous step (~ 5000 nuclei) were loaded onto the 10x Genomics ChIP, and libraries were made using a 10x Chromium Single Cell 3′ Solution V2 kit. For endosperm, ~ 10,000 nuclei were loaded and subjected to library construction. To obtain full-length cDNA, we extend the elongation time during cDNA amplification from the standard 1 min to 2 min. Half of the cDNA template was used to construct Illumina library according to the manufacturer’s instruction and sequenced with Illumina NavoSeq (Read1:28 bases + Read2:150 bases); the other half of the template was used to make Nanopore library using the Oxford Nanopore LSK-109 kit and sequenced on a MinION flow cell (R9.4.1).
Illumina single-nucleus data analysis
Raw reads of root were mapped to the TAIR10 reference genome by Cell Ranger (v3.1.0) using the default parameters. Cell Ranger (v3.1.0) only counts reads without introns; to accommodate the high proportion of intron-containing reads in our single-nucleus libraries, we removed the intron regions of each read and re-aligned reads to the reference genome by Cell Ranger to identify the nucleus barcode, UMI, and corresponding gene of each read (Additional file 1: Fig. S1). For quality control purpose, genes expressed in less than three nuclei were discarded, and cells with gene counts more than 2300 or fewer than 350 were removed. The Illumina abundance matrix was subsequently analyzed using Scanpy package (v1.6.0)  with recommended parameters for normalization, log-transformation, and scaling. Then, principal component analysis and Louvain algorithm were used on this abundance matrix for clustering. Next, we used the marker genes for different cell types identified in a massive single-cell root data  (Additional file 2: Table S1) to annotate the cell type of each cluster. We first calculate the cell score of each cell type for all cells based on the enrichment degree of a given marker gene set in a given cell, as previously described method . If the highest score exceeds zero, the cell is assigned to the corresponding cell type; otherwise, it is assigned as unknown (Additional file 1: Fig. S3a). Then, each cluster was annotated as the cell type with the highest proportion (Additional file 1: Fig. S3b), and we used developmental stage-specific genes identified in the massive single-cell root data  (Additional file 1: Table S1) to further annotate the clusters resenting non-hair cells as either mature non-hair or elongating non-hair cells (Additional file 1: Fig. S3c).
Five previously published single-cell RNA-seq data of protoplasted Arabidopsis roots using 10x Genomics platform were collected from public databases [11, 12, 14,15,16]. We use Scanorama  to remove batch effects and calculate the alignment score between different datasets.
For endosperm data, raw Illumina reads were processed by Cell Ranger (v5.0.0) using the “--include-introns” parameter, and only nuclei with gene counts between 400 and 3000 were used for the subsequent analysis. After clustering, we used previously reported heart-stage tissue-enriched genes identified from the microarray data of laser capture microdissection samples [50, 51] to assign cell types (Additional file 2: Table S2). We use the “rank_genes_groups” function of scanpy to perform the Wilcoxon test and used “filter_rank_genes_groups” function with “max_out_group_fraction = 0.25” and “min_ fold_change = 1.50” to identify cluster-enriched genes. And agriGO v2  was used for GO enrichment analysis.
Nanopore single-nucleus data processing and isoform analysis
Raw Nanopore data were basecalled using Guppy (v3.6.0) with the parameters “--c dna_r9.4.1_450bps_hac.cfg --fast5_out.” The basecalled reads were mapped to the TAIR10 genome by minimap2 (v2.17) with the parameters “-ax splice --secondary=no -uf --MD --sam-hit-only,” and the multi-mapped reads as well as potential chimeric reads (either the 5′ or 3′ unmapped region is great than 150 nt) were filtered out. The nucleus barcodes and UMI sequences in Nanopore reads were extracted from the unmapped sequences of each read via aligning against all barcode/UMI combinations identified from the Illumina library made from the same full-length cDNA templates, a strategy inspired by the algorithm Sicelore . To reduce search space, we divided the genome into non-overlapped 500-bp bins, and only matched the Illumina barcode/UMI combinations from the bins overlapping or adjacent to the mapping genome region of specific Nanopore read (Fig. 3a). To speed up the alignment process, we first used the heuristic algorithm Blastn (v2.10.0) to find potential seed regions with parameters “-word_size 7 -gapopen 0 -gapextend 2 -penalty -1 -reward 1” and then re-aligned the seed regions by the more accurate Smith-Waterman local alignment algorithm. Our pipeline assigns the closest barcode-UMI match (i.e., with minimal mismatch/gap) to each Nanopore read, allowing up to three base errors (mismatch/gap) for either barcode or UMI, and remove reads with multiple best matching barcode-UMIs. After the barcode and UMI assignment, the Nanopore reads with the same UMI were used to generate an error-corrected consensus sequence of the original RNA molecule by poaV2  and racon . PAS isoform annotation and the intron splicing status of Nanopore read were determined based on previously described [23, 70]. In brief, we clustered adjacent polyA sites into one polyA site cluster with a distance threshold 24 nt and then count the reads in each polyA site cluster to obtain the APA matrix. And we defined the reads containing at least one intron with a mapping ratio of more than 50% as unspliced read, and the others were defined as spliced read. The resulted APA and splicing matrices for all nuclei were merged with Illumina abundance matrix and analyzed by Scanpy.
The same Cell Ranger result is used as the input file for Sicelore. Except that the maximum edit distance during barcode and UMI assignment is forcibly set to 3, the remaining parameters are the same as the official example (https://github.com/ucagenomix/sicelore/blob/master/quickrun.sh).
Availability of data and materials
Flsn-seq data generated in this study are deposited in NCBI with the accession numbers PRJNA664874 (Root)  and PRJNA685588 (Endosperm) . The preprocessed datasets analyzed in the study and the source code can be downloaded from Zenodo (https://doi.org/10.5281/zenodo.4467583)  or GitHub repository (https://github.com/ZhaiLab-SUSTech/snuupy/tree/master) .
Chen X, Teichmann SA, Meyer KB. From tissues to cell types and back: single-cell gene expression analysis of tissue architecture. Annu Rev Biomed Data Sci. 2018;1:29–51.
Lein E, Borm LE, Linnarsson S. The promise of spatial transcriptomics for neuroscience in the era of molecular cell typing. Science. 2017;358:64–9.
Kelsey G, Stegle O, Reik W. Single-cell epigenomics: recording the past and predicting the future. Science. 2017;358:69–75.
Stubbington MJT, Rozenblatt-Rosen O, Regev A, Teichmann SA. Single-cell transcriptomics to explore the immune system in health and disease. Science. 2017;358:58–63.
Svensson V, Vento-Tormo R, Teichmann SA. Exponential scaling of single-cell RNA-seq in the past decade. Nat Protoc. 2018;13:599–604.
Nelms B, Walbot V. Defining the developmental program leading to meiosis in maize. Science. 2019;364:52–6.
Han Y, Chu X, Yu H, Ma Y-K, Wang X-J, Qian W, Jiao Y. Single-cell transcriptome analysis reveals widespread monoallelic gene expression in individual rice mesophyll cells. Sci Bull. 2017;62:1304–14.
Luo C, Fernie AR, Yan J. Single-cell genomics and epigenomics: technologies and applications in plants. Trends Plant Sci. 2020;25:1030–40.
Macosko Evan Z, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas Allison R, Kamitaki N, Martersteck Emily M, et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell. 2015;161:1202–14.
Rich-Griffin C, Stechemesser A, Finch J, Lucas E, Ott S, Schäfer P. Single-cell transcriptomics: a high-resolution avenue for plant functional genomics. Trends Plant Sci. 2020;25:186–97.
Denyer T, Ma X, Klesen S, Scacchi E, Nieselt K, Timmermans MCP. Spatiotemporal developmental trajectories in the Arabidopsis root revealed using high-throughput single-cell RNA sequencing. Dev. Cell. 2019;48:840–52 e845.
Jean-Baptiste K, McFaline-Figueroa JL, Alexandre CM, Dorrity MW, Saunders L, Bubb KL, Trapnell C, Fields S, Queitsch C, Cuperus JT. Dynamics of gene expression in single root cells of Arabidopsis thaliana. Plant Cell. 2019;31:993–1011.
Shulse CN, Cole BJ, Ciobanu D, Lin J, Yoshinaga Y, Gouran M, Turco GM, Zhu Y, O’Malley RC, Brady SM, Dickel DE. High-throughput single-cell transcriptome profiling of plant cell types. Cell Rep. 2019;27:2241–7 e2244.
Ryu KH, Huang L, Kang HM, Schiefelbein J. Single-cell RNA sequencing resolves molecular relationships among individual plant cells. Plant Physiol. 2019;179:1444–56.
Zhang T-Q, Xu Z-G, Shang G-D, Wang J-W. A single-cell RNA sequencing profiles the developmental landscape of Arabidopsis root. Mol Plant. 2019;12:648–60.
Wendrich JR, Yang B, Vandamme N, Verstaen K, Smet W, Van de Velde C, Minne M, Wybouw B, Mor E, Arents HE, et al. Vascular transcription factors guide plant epidermal responses to limiting phosphate conditions. Science. 2020;370:eaay4970.
Shahan R, Hsu C-W, Nolan TM, Cole BJ, Taylor IW, Vlot AHC, Benfey PN, Ohler U: A single cell Arabidopsis root atlas reveals developmental trajectories in wild type and cell identity mutants. bioRxiv 2020:2020.2006.2029.178863.
Liu Q, Liang Z, Feng D, Jiang S, Wang Y, Du Z, Li R, Hu G, Zhang P, Ma Y, et al. Transcriptional landscape of rice roots at the single cell resolution. Mol Plant. 2020; https://doi.org/10.1016/j.molp.2020.12.014.
Satterlee JW, Strable J, Scanlon MJ. Plant stem-cell organization and differentiation at single-cell resolution. Proc Natl Acad Sci U S A. 2020;117:33689-99.
Birnbaum K, Shasha DE, Wang JY, Jung JW, Lambert GM, Galbraith DW, Benfey PN. A gene expression map of the Arabidopsis root. Science. 2003;302:1956.
Brady SM, Orlando DA, Lee J-Y, Wang JY, Koch J, Dinneny JR, Mace D, Ohler U, Benfey PN. A high-resolution root spatiotemporal map reveals dominant expression patterns. Science. 2007;318:801.
Li S, Yamada M, Han X, Ohler U, Benfey Philip N. High-resolution expression map of the Arabidopsis root reveals alternative splicing and lincRNA regulation. Dev Cell. 2016;39:508–22.
Jia J, Long Y, Zhang H, Li Z, Liu Z, Zhao Y, Lu D, Jin X, Deng X, Xia R, et al. Post-transcriptional splicing of nascent RNA contributes to widespread intron retention in plants. Nat Plants. 2020;6:780–8.
Lebrigand K, Magnone V, Barbry P, Waldmann R. High throughput error corrected Nanopore single cell transcriptome sequencing. Nat Commun. 2020;11:4025.
Volden R, Vollmers C: Highly multiplexed single-cell full-length cDNA sequencing of human immune cells with 10x Genomics and R2C2. bioRxiv 2020:2020.2001.2010.902361.
Gupta I, Collier PG, Haase B, Mahfouz A, Joglekar A, Floyd T, Koopmans F, Barres B, Smit AB, Sloan SA, et al. Single-cell isoform RNA sequencing characterizes isoforms in thousands of cerebellar cells. Nat Biotechnol. 2018;36:1197–202.
Drapek C, Sparks EE, Benfey PN. Uncovering gene regulatory networks controlling plant cell differentiation. Trends Genet. 2017;33:529–39.
Parker MT, Knop K, Sherwood AV, Schurch NJ, Mackinnon K, Gould PD, Hall AJW, Barton GJ, Simpson GG. Nanopore direct RNA sequencing maps the complexity of Arabidopsis mRNA processing and m6A modification. eLife. 2020;9:e49658.
Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech Theory Exp. 2008;2008:P10008.
Hie B, Bryson B, Berger B. Efficient integration of heterogeneous single-cell transcriptomes using Scanorama. Nat Biotechnol. 2019;37:685–91.
Krause M, Niazi AM, Labun K, Torres Cleuren YN, Müller FS, Valen E. tailfindr: alignment-free poly(A) length measurement for Oxford Nanopore RNA and DNA sequencing. RNA. 2019;25:1229–41.
Feng G, Yoo M-J, Davenport R, Boatwright JL, Koh J, Chen S, Barbazuk WB. Jasmonate induced alternative splicing responses in Arabidopsis. Plant Direct. 2020;4:e00245.
Bleckmann A, Alter S, Dresselhaus T. The beginning of a seed: regulatory mechanisms of double fertilization. Front Plant Sci. 2014;5:452.
Lafon-Placette C, Köhler C. Embryo and endosperm, partners in seed development. Curr Opin Plant Biol. 2014;17:64–9.
Gehring M, Choi Y, Fischer RL. Imprinting and seed development. Plant Cell. 2004;16:S203.
Sorensen MB, Mayer U, Lukowitz W, Robert H, Chambrier P, Jurgens G, Somerville C, Lepiniec L, Berger F. Cellularisation in the endosperm of Arabidopsis thaliana is coupled to mitosis and shares multiple components with cytokinesis. Development. 2002;129:5567–76.
Brown RC, Lemmon BE, Nguyen H. Events during the first four rounds of mitosis establish three developmental domains in the syncytial endosperm of Arabidopsis thaliana. Protoplasma. 2003;222:167–74.
Brown RC, Lemmon BE, Nguyen H, Olsen O-A. Development of endosperm in Arabidopsis thaliana. Sex Plant Reprod. 1999;12:32–42.
Gehring M, Missirian V, Henikoff S. Genomic analysis of parent-of-origin allelic expression in Arabidopsis thaliana seeds. PLoS One. 2011;6:e23687.
Hsieh T-F, Shin J, Uzawa R, Silva P, Cohen S, Bauer MJ, Hashimoto M, Kirkbride RC, Harada JJ, Zilberman D, Fischer RL. Regulation of imprinted gene expression in Arabidopsis endosperm. Proc Natl Acad Sci U S A. 2011;108:1755.
Del Toro-De León G, Köhler C. Endosperm-specific transcriptome analysis by applying the INTACT system. Plant Reprod. 2019;32:55–61.
Day RC, Herridge RP, Ambrose BA, Macknight RC. Transcriptome analysis of proliferating Arabidopsis endosperm reveals biological implications for the control of syncytial division, cytokinin signaling, and gene expression regulation. Plant Physiol. 2008;148:1964.
Pignatta D, Erdmann RM, Scheer E, Picard CL, Bell GW, Gehring M. Natural epigenetic polymorphisms lead to intraspecific variation in Arabidopsis gene imprinting. eLife. 2014;3:e03198.
Moreno-Romero J, Santos-González J, Hennig L, Köhler C. Applying the INTACT method to purify endosperm nuclei and to generate parental-specific epigenome profiles. Nat Protoc. 2017;12:238–54.
Moreno-Romero J, Del Toro-De León G, Yadav VK, Santos-González J, Köhler C. Epigenetic signatures associated with imprinted paternally expressed genes in the Arabidopsis endosperm. Genome Biol. 2019;20:41.
Zeisel A, Köstler WJ, Molotski N, Tsai JM, Krauthgamer R, Jacob-Hirsch J, Rechavi G, Soen Y, Jung S, Yarden Y, Domany E. Coupled pre-mRNA and mRNA dynamics unveil operational strategies underlying transcriptional responses to stimuli. Mol Syst Biol. 2011;7:529.
Gaidatzis D, Burger L, Florescu M, Stadler MB. Analysis of intronic and exonic reads in RNA-seq data characterizes transcriptional and post-transcriptional regulation. Nat Biotechnol. 2015;33:722–9.
La Manno G, Soldatov R, Zeisel A, Braun E, Hochgerner H, Petukhov V, Lidschreiber K, Kastriti ME, Lönnerberg P, Furlan A, et al. RNA velocity of single cells. Nature. 2018;560:494–8.
Bergen V, Lange M, Peidli S, Wolf FA, Theis FJ. Generalizing RNA velocity to transient cell states through dynamical modeling. Nat Biotechnol. 2020;38:1408–14.
Schon MA, Nodine MD. Widespread contamination of Arabidopsis embryo and endosperm transcriptome data sets. Plant Cell. 2017;29:608.
Belmonte MF, Kirkbride RC, Stone SL, Pelletier JM, Bui AQ, Yeung EC, Hashimoto M, Fei J, Harada CM, Munoz MD, et al. Comprehensive developmental profiles of gene activity in regions and subregions of the Arabidopsis seed. Proc Natl Acad Sci U S A. 2013;110:E435.
Orozco-Arroyo G, Paolo D, Ezquer I, Colombo L. Networks controlling seed size in Arabidopsis. Plant Reprod. 2015;28:17–32.
Habib N, Li Y, Heidenreich M, Swiech L, Avraham-Davidi I, Trombetta JJ, Hession C, Zhang F, Regev A. Div-Seq: single-nucleus RNA-Seq reveals dynamics of rare adult newborn neurons. Science. 2016;353:925.
Habib N, Avraham-Davidi I, Basu A, Burks T, Shekhar K, Hofree M, Choudhury SR, Aguet F, Gelfand E, Ardlie K, et al. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat Methods. 2017;14:955–8.
Ding J, Adiconis X, Simmons SK, Kowalczyk MS, Hession CC, Marjanovic ND, Hughes TK, Wadsworth MH, Burks T, Nguyen LT, et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat Biotechnol. 2020;38:737–46.
Krienen FM, Goldman M, Zhang Q, C. H. del Rosario R, Florio M, Machold R, Saunders A, Levandowski K, Zaniewski H, Schuman B, et al: Innovations present in the primate interneuron repertoire. Nature 2020, 586:262–269.
Thibivilliers S, Anderson D, Libault M. Isolation of plant root nuclei for single cell RNA sequencing. Curr Protoc Plant Biol. 2020;5:e20120.
Picard CL, Povilus RA, Williams BP, Gehring M: Single nucleus analysis of Arabidopsis seeds reveals new cell types and imprinting dynamics. bioRxiv 2020:2020.2008.2025.267476.
Tian C, Du Q, Xu M, Du F, Jiao Y: Single-nucleus RNA-seq resolves spatiotemporal developmental trajectories in the tomato shoot apex. bioRxiv 2020:2020.2009.2020.305029.
Farmer A, Thibivilliers S, Ryu KH, Schiefelbein J, Libault M. Single-nucleus RNA and ATAC sequencing reveals the impact of chromatin accessibility on gene expression in Arabidopsis roots at the single-cell level. Mol Plant. 2021; https://doi.org/10.1016/j.molp.2021.01.001.
Sunaga-Franze DY, Muino JM, Braeuning C, Xu X, Zong M, Smaczniak C, Yan W, Fischer C, Vidal R, Kliem M, et al: Single-nuclei RNA-sequencing of plants. bioRxiv 2020:2020.2011.2014.382812.
Krishnaswami SR, Grindberg RV, Novotny M, Venepally P, Lacar B, Bhutani K, Linker SB, Pham S, Erwin JA, Miller JA, et al. Using single nuclei for RNA-seq to capture the transcriptome of postmortem neurons. Nat Protoc. 2016;11:499–524.
Gehring M, Huh JH, Hsieh TF, Penterman J, Choi Y, Harada JJ, Goldberg RB, Fischer RL. DEMETER DNA glycosylase establishes MEDEA polycomb gene self-imprinting by allele-specific demethylation. Cell. 2006;124:495–506.
Ibarra CA, Feng X, Schoft VK, Hsieh TF, Uzawa R, Rodrigues JA, Zemach A, Chumak N, Machlicova A, Nishimura T, et al. Active DNA demethylation in plant companion cells reinforces transposon methylation in gametes. Science. 2012;337:1360–4.
Wolf FA, Angerer P, Theis FJ. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 2018;19:15.
Tirosh I, Izar B, Prakadan SM, Wadsworth MH, Treacy D, Trombetta JJ, Rotem A, Rodman C, Lian C, Murphy G, et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science. 2016;352:189.
Tian T, Liu Y, Yan H, You Q, Yi X, Du Z, Xu W, Su Z. agriGO v2.0: a GO analysis toolkit for the agricultural community, 2017 update. Nucleic Acids Res. 2017;45:W122–9.
Lee C, Grasso C, Sharlow MF. Multiple sequence alignment using partial order graphs. Bioinformatics. 2002;18:452–64.
Vaser R, Sović I, Nagarajan N, Šikić M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 2017;27:737–46.
Wu X, Liu M, Downie B, Liang C, Ji G, Li QQ, Hunt AG. Genome-wide landscape of polyadenylation in Arabidopsis provides evidence for extensive alternative polyadenylation. Proc Natl Acad Sci U S A. 2011;108:12533.
Long Y, Liu Z, Jia J, Mo W, Fang L, Lu D, Liu B, Zhang H, Chen W, Zhai J. A protoplasting-free approach for high-throughput full-length single-cell RNA profiling in plants. Datasets. Gene Expression Omnibus. https://www.ncbi.nlm.nih.gov/bioproject/PRJNA664874 (2020). Accessed 21 Sep 2020.
Long Y, Liu Z, Jia J, Mo W, Fang L, Lu D, Liu B, Zhang H, Chen W, Zhai J. Single-nucleus full-length RNA profiling of endosperm. Datasets. Gene Expression Omnibus. https://www.ncbi.nlm.nih.gov/bioproject/PRJNA685588 (2021). Accessed 16 Dec 2020.
Long Y, Liu Z, Jia J, Mo W, Fang L, Lu D, Liu B, Zhang H, Chen W, Zhai J. Single-nucleus nanopore reads processing pipeline. zenode. https://doi.org/10.5281/zenodo.4467583 (2021). Accessed 26 Jan 2021.
Long Y, Liu Z, Jia J, Mo W, Fang L, Lu D, Liu B, Zhang H, Chen W, Zhai J. Single-nucleus nanopore reads processing pipeline. Github. https://github.com/ZhaiLab-SUSTech/snuupy/tree/master (2021). Accessed 26 Jan 2021.
The review history is available as Additional file 3.
Peer review information
Wenjing She was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Group of J.Z. is supported by the National Key R&D Program of China Grant (2019YFA0903903), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06S172), the Shenzhen Sci-Tech Fund (KYTDPT20181011104005), and Key Laboratory of Molecular Design for Plant Cell Factory of Guangdong Higher Education Institutes (2019KSYS006).
Ethics approval and consent to participate
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Schematic diagram of snuupy bioinformatic pipeline. Fig. S2. The sorted nuclei were observed under a microscopy with DAPI staining. Fig. S3. Identification of clusters by a marker-gene-based method. Fig. S4. UMAP visualization of the representative cell-type marker genes for each of the 14 cell clusters. Fig. S5. UMAP visualization showing the abundances of representative marker genes in two subcell types of endodermis. Fig. S6. Scheme for deriving the splicing and APA matrices from Nanopore data. Fig. S7. The gene expression matrices of endosperm generated from the two different libraries were similar to each other. (PPTX 4884 kb)
Cell type-specific genes identified by Shahan et al. are used for cell type annotation. Table S2. Cell type-enriched genes at heart stage identified by Schon and Nodine are used for cell type annotation. Table S3. All enriched genes for cluster 4.
About this article
Cite this article
Long, Y., Liu, Z., Jia, J. et al. FlsnRNA-seq: protoplasting-free full-length single-nucleus RNA profiling in plants. Genome Biol 22, 66 (2021). https://doi.org/10.1186/s13059-021-02288-0