- Open Access
QsRNA-seq: a method for high-throughput profiling and quantifying small RNAs
© The Author(s). 2018
- Received: 6 May 2018
- Accepted: 30 July 2018
- Published: 14 August 2018
The ability to profile and quantify small non-coding RNAs (sRNAs), specifically microRNAs (miRNAs), using high-throughput sequencing is challenging because of their small size. We developed QsRNA-seq, a method for preparation of sRNA libraries for high-throughput sequencing that overcomes this difficulty by enabling a gel-free separation of fragments shorter than 100 nt that differ only by 20 nt in length. The method allows the use of unique molecular identifiers for quantification and is more amenable to automation than gel-based methods. We show that QsRNA-seq gives very accurate, comprehensive, and reproducible results by looking at miRNAs in Caenorhabditis elegans embryos and larvae.
- High-throughput sequencing
- C. elegans
Small RNAs (sRNAs) are non-coding RNA molecules, 20–30 nucleotide (nt) long, that impact diverse biological events through the control of gene expression and genome stability. During the last decade, sRNAs emerged as central players in the regulation of gene expression in all kingdoms of life , and they have been shown to regulate virtually all cellular processes. There are several classes of sRNA , with Dicer-generated microRNAs (miRNAs) being the most extensively studied .
High-throughput sequencing (HTS) is currently the method of choice for identifying and analyzing the cellular repertoire of RNAs because it allows one to investigate the entire transcriptome in an unbiased way. While preparation of mRNA and DNA libraries for HTS has become a routine procedure, preparation of sRNA libraries remains technically challenging. Most protocols for generation of sRNA libraries require the sRNA molecules to be ligated from both sides (5′ and 3′) to oligonucleotide adapters that contain the sequencing primers, reverse transcribed for complementary DNA (cDNA) generation, and amplified by polymerase chain reaction (PCR) . The first challenge that arises at the very beginning of library preparation is separation of the sRNAs from other RNA species close in size, in particular transfer RNA (tRNA). Omission of this step results in sRNA libraries that are highly contaminated. In addition, the adapters readily ligate to each other instead of to the RNA, and this product (adapter dimer), if not removed, is preferably amplified by PCR, resulting in null sequences in the HTS data. Because of the small size difference between sRNAs and tRNA and between the sRNA-ligated product and adapter dimer, it is very difficult to separate them, decreasing library quality. Denaturing polyacrylamide gel (PAGE), commonly used for size-based separation of small fragments, not only is time consuming and requires expertise, but also results in a significant loss of the product and eliminates the possibility of automating the pipeline. Solid-phase reversible immobilization (SPRI) on magnetic beads [5, 6] is widely used for nucleic acid separation based on size; however, this method is not able to discriminate between fragments shorter than 100 nt and thus is not applicable for preparation of sRNA libraries.
The implementation of HTS for sRNA profiling is also hindered by the inability to reliably quantify the output data. Small RNA library construction for HTS involves a PCR amplification step that is prone to bias. Because PCR is not a linear process, it quickly reaches plateau, distorting differences in expression. Moreover, PCR efficiency depends on the length of the fragment and on its sequence; variations in base composition might lead to preferred template-specific amplifications . Other biases can be introduced during library preparation including ligation bias during adapter ligation (reviewed in ). HTS-based miRNA expression data is thus not regarded as quantitative, and other techniques, mostly quantitative real-time PCR (qPCR) and miRNA microarrays, are used instead for quantification of individual miRNA and for large-scale studies, respectively. While these two assays are accurate and sensitive, both require prior knowledge and accurate annotation of the miRNA sequence tested and are not applicable to discovering novel miRNAs.
PCR-derived artifacts can be corrected by counting absolute numbers of molecules using unique molecular identifiers (UMIs)  . This method allows distinguishing between original copies of the sRNA present in cells and their amplification products by marking, prior to the amplification step, each molecule in a population by attaching a UMI, a short random sequence. Following amplification, each one of the UMIs attached to an original copy of the molecule will be observed multiple times; however, the original copy number of a molecule can be determined simply by counting each UMI only once upon analysis of HTS sequencing data. Thus, UMIs in the library act as a molecular memory for the number of molecules in the starting sample. In addition, it was shown that using UMIs reduces ligation bias by randomizing the adapter sequences at the ligation junction [8, 10]. Surprisingly, this method is rarely used in generation of sRNA libraries.
Here, we present QsRNA-seq, a novel method for preparation of sRNA libraries for HTS sequencing that overcomes the above-mentioned shortcomings. Our protocol comprises two innovations: (1) gel-less size-based separation of fragments shorter than 100 nt, differing in length by 20 nt or more, and (2) use of UMIs to enable quantification of sRNA expression data.
A novel method for separating nucleic acids shorter than 100 nucleotides
Binding efficiencies of single-stranded DNA (ssDNA) oligonucleotides to SPRI beads at varying isopropanol concentrations
Oligo size (nt)
Isopropanol concentration (%)
QsRNA-seq: a method for preparation of small RNA libraries
The QsRNA-seq protocol is based on the classic sRNA sequencing PAGE-based method . While the PAGE-based method is reliable and has been used in many studies, it is tedious and results in a very low yield. To estimate the efficiency of QsRNA-seq, we prepared libraries from 20 pmol of 22-nt-long synthetic RNA oligo by both methods (QsRNA-seq and the PAGE-based method, two replicas each) and directly quantified the obtained 3′,5′-ligated sRNA using Bioanalyzer. The average yield from QsRNA-seq was about five times higher than that from the PAGE-based method (Additional file 2: Figure S1). Moreover, both replicas generated by QsRNA-seq in terms of quantities were very similar, while the PAGE-generated replicas were different and needed PCR optimization. In addition, reducing the starting material by 10-fold when generating libraries by QsRNA-seq yielded a similar outcome (see below). Therefore, we conclude that QsRNA-seq is much more efficient and robust than the standard PAGE-based method.
To further test the quality of QsRNA-seq-generated libraries, we used RNA extracted from Caenorhabditis elegans embryos because the PAGE-based method was established in C. elegans and has been widely used to study sRNA in this organism. From the same RNA we generated three technical replicas using the classic PAGE-based method with three different barcodes and three technical replicas using QsRNA-seq with the same barcode set. In all six libraries, the fraction of tRNA, which is very close in size to the sRNAs, was minimal (Additional file 2: Table S2). The small nucleolar RNA (snoRNA) fraction was also very low, while the ribosomal RNA (rRNA) fraction was relatively high, ranging from 20% to 54%. rRNAs are larger in size than tRNA and both are very abundant, suggesting that the high fraction of rRNA is a result of RNA degradation and the quality of the RNA. Indeed, earlier generated libraries (8N_AAGA and PAGE based, Additional file 2: Table S2) contain a lower fraction of rRNA. Comparing the two methods, we found a very significant correlation in miRNA expression (see below and Additional file 2: Figure S5) and very similar read length distribution (see below, Additional file 2: Figure S11) and first nucleotide preference (Additional file 2: Figure S9). Collectively, our results indicate that using QsRNA-seq generates sRNA libraries of equivalent quality and higher yield than those obtained using the classic PAGE-based method.
Small RNA library generation is vulnerable to many biases (reviewed in ). Ligation bias, which results from adapter preference, is one of the major biases in library generation. It was shown that adding a UMI to each RNA molecule can reduce the ligation bias  and that adding the UMI before the PCR step can correct for PCR-induced artifacts and enable quantification. Therefore, we also used 5′ adapters that contain eight random nucleotides that provide UMIs in our QsRNA-seq method. After PCR amplification, we considered identical sRNAs with the same UMI as an amplification product and merged them to one sequence (known as collapsing). Comparing libraries generated from the same RNA with and without the UMI with the same set of barcodes, we found that adding the UMI significantly reduced the biases. Evaluating the expression of miRNAs in RNA extracted from C. elegans embryos, we found that more than half of the miRNAs have an improvement in the adapter bias when UMIs are added (Additional file 2: Figure S2, 0Nvs0N compared to 8Nvs8N). After collapsing the samples with the UMI, two thirds of the miRNAs had a reduction in library construction biases with a 30% average bias reduction (Additional file 2: Figure S2, 0Nvs0N compared to 8Ncolvs8Ncol). In addition, adding a UMI seems to improve miRNA quantification (see also the discussions in subsequent sections).
QsRNA-seq can evaluate miRNA abundance and expression changes accurately
To test the ability of QsRNA-seq to detect sRNAs, we used QsRNA-seq on RNA extracted from wild-type C. elegans synchronized to the embryo or the L4 larval stage and on total RNA obtained from human brain. Human brain total RNA was chosen because miRNAs constitute most of the sRNAs in this sample; thus, we expected it to result in a very uniform library. In contrast, C. elegans contains many types of sRNA, including miRNAs, primary and secondary endogenous small interfering RNAs (siRNAs), and piwi-interacting RNAs (piRNAs). We generated three independent biological samples from each C. elegans developmental stage, embryo and L4, as biological replicas. RNA extracted from one sample from each stage was also subjected to three independent library preparations, as technical replicas, and was also used to prepare three technical replica libraries having no UMI in the 5′ adapter (0 N). All library preparations resulted in very clean products ready for sequencing with a negligible ratio of adapter dimer containing no product (less than 2% of the total reads in each library; see the examples in Additional file 2: Figure S3 and the library information in Additional file 2: Table S3).
To evaluate the quality of the QsRNA-seq output sequences, we aligned the generated sequences to all annotated miRNAs (both miRNA and miRNA*) in C. elegans, miRBase WBcel235, or in human, miRBase GRCh38. In the C. elegans sample, all the annotated miRNAs (100%) were present in our samples by at least one strand (3P or 5P), while 97% of all microRNAs had coverage for both strands. Even rare miRNAs such as lsy-6, which is expressed in only one pair of neurons in the C. elegans head, were present  (Additional file 3: Table S4). QsRNA-seq allows extensive multiplexing of the samples before amplification, which can reduce significantly the amount of starting material required. However, even without multiplexing, reducing the starting material by 10-fold, from 1 μg to 100 ng, produced nearly identical results (Additional file 2: Figure S4). In human brain, the coverage was somewhat lower, with alignment to 80% of annotated miRNAs (Additional file 4: Table S5). The difference probably derives from the large number of samples that we generated from whole worms at two developmental stages while we only generated one sample from human cells from a specific tissue. However, miRNAs known to be enriched in human brain, for example, let-7 family, mir-9, mir-26a, and others , were very abundant in our libraries.
We also compared miRNA expression in the libraries generated by the PAGE-based method and QsRNA-seq method from the same RNA without UMIs and found a very significant correlation (Pearson correlation R2 = 0.88, p value < 2.2e-16, Additional file 2: Figure S5).
The increase in variation between samples, before and after collapsing, correlates with the abundance of miRNA in the initial samples, due to biased amplification of the abundant miRNA by PCR. While the ratio between the number of reads obtained for a single miRNA by collapsing and by non-collapsing is not significant for a low abundance miRNA (Additional file 2: Figure S7A), it increases drastically in direct relation to miRNA abundance. Interestingly, the ratio becomes significant (above twofold) when the initial expression level of a miRNA rises above 100 copies (Additional file 2: Figure S7A). Despite this, the higher read number obtained by non-collapsing does not affect the global picture of differential expression between the L4 and embryo stages, i.e., whether a miRNA is upregulated or downregulated (Additional file 2: Figure S8). However, non-collapsing augments the magnitude of L4/embryo expressional fold change for a subset of miRNAs (Additional file 2: Figure S7B).
The 23-nt-long small RNAs are enriched in L4 larvae and not in embryos in C. elegans
To evaluate whether QsRNA-seq can also capture 5′ triphosphate secondary siRNAs, which are mainly 22 nt in size, we added a step to the QsRNA-seq protocol, in which RppH enzyme (New England Bioloabs (NEB)) was used to remove two 5′ phosphates to enable ligation to the 5′ adapter. Using the modified protocol, we generated three biological replicas from the embryo stage and three biological replicas from the L4 stage. The length distribution of the sequences was as expected (Additional file 2: Figure S13): in both the embryo and L4 developmental stages most of the sequences were 22 nt long. The portion of miRNAs from these sequences was very low compared to sequencing that captures mainly 5' monophosphate sRNA (Fig. 4, Additional file 2: Figure S13), suggesting that these are mainly secondary siRNAs. In addition, we observed the difference in length distribution of sequences between embryo and L4 in this sequencing method as well; the portion of 23-nt-long sequences is much higher in L4 compared to embryo, and these 23-nt-long sequences are mainly miRNAs (Additional file 2: Figure S13).
Interestingly, evaluating miRNA expression using only sequences that are 23 nt long (Fig. 5b), we found a high fraction of 23-nt-long miRNAs that are predominantly expressed in L4 (45%) as compared to miRNA predominantly expressed in the embryo stage (23%) (p value < 0.05 by chi-square test). Evaluating miRNA expression using sequences that are 22 nt long, we did not observe a significant difference in the fraction of miRNAs predominantly expressed in L4 versus embryo stage (Additional file 2: Figure S14). Evaluating the portion of highly expressed miRNAs in the L4 developmental stage that are 23 nt long compared to 22 nt long at the samples enriched for secondary siRNAs (Additional file 2: Figure S13), we observed a very similar result with a significant portion of 23-nt-long miRNAs compared to 22-nt-long miRNAs (p value < 0.05 by chi-square test), which was not observed at the embryo stage. Thus, we conclude that in C. elegans the expression of 23-nt-long miRNAs is developmental stage specific.
Although sRNAs (20–30 nt) constitute a very small fraction of the total cellular RNA, they have a significant role in every aspect of cellular and organismal development and maintenance . Thus, identification, characterization, and quantification of sRNAs are an important part of many studies. miRNAs are also promising clinical diagnostic and predictive biomarkers ; in particular, miRNAs circulating in body fluids are attractive candidates to serve as markers in non-invasive “liquid biopsies” . Unfortunately, because of the difficulty in isolating sRNAs from other nucleic acids very close in size, profiling sRNAs using HTS is often avoided. To overcome this problem, we determined the best conditions to separate short nucleic acid fragments using SPRI beads. Our screen resulted in the development of a method to separate fragments shorter than 100 nt differing in length by as few as 20 nt. The method also allows flexibility based on the tradeoff between purity and quantity (e.g., better separation between the fragments but less recovery of the desired fragment). While we concentrated on the binding efficiency of fragments between 19 and 66 nt, for needs of library preparation, binding conditions for fragments ranging between 60 and 100 nt can be easily determined in the same manner for any other purpose.
We developed a new protocol, QsRNA-seq, substituting PAGE size selection by SPRI beads, which resulted in a much quicker, technically easier protocol that enables parallel processing of samples with higher yield. The protocol comprises all the separation steps required for preparing high-quality sRNA libraries for HTS. The 20-nt separation resolution that we gained is sufficient to significantly reduce the two main contaminants of sRNA libraries: tRNAs and adapter dimers. To further avoid adapter dimer contamination, we also include in the protocol a common method of turning a free 3′ adapter into a double-stranded structure by hybridizing the 3′ adapter with a reverse complementary oligonucleotide . Lack of contaminants resulted, as expected, in an impressive sequencing depth. The sequences that were obtained aligned to 97% of the annotated C. elegans miRNAs. All other sRNA types known in C elegans, such as endogenous primary and secondary siRNA and 21 U-RNA, were present in significant quantities. The presence of short (20–40 nt) degradation products of other RNA species, in particular long and abundant rRNA molecules, cannot be avoided by this method but can be minimized by using high-quality RNA.
While loss of material during QsRNA-seq is significantly reduced since no gel purification is required, an amplification step is still needed to produce enough material for HTS. PCR is usually used for amplification; however, it is not a linear process and is not free of biases . In contrast to the heterogeneity in reads observed in mRNA-seq libraries, due to random fragmentation of the input mRNA, reads obtained from miRNA libraries are very uniform, making quantification methods used in mRNA-seq, such as collapsing reads or fragments per kilobase per million mapped fragments (FPKM), unsuitable for miRNA quantification. To allow quantification of miRNAs, we use adapters containing random sequences (UMIs) to mark each molecule before the amplification step. Using UMIs, we can collapse the sequencing reads similarly to what is done for mRNA-seq , obtaining quantitative global sRNA expression data. Another benefit of using UMIs is that it allows an unlimited number of PCR amplification cycles, enabling library preparation from low amounts of starting material, which is especially beneficial for clinical purposes. A UMI reliably reflects molecule counts only if the number of distinct labels is substantially larger than the copy number of the most abundant target molecule; a copy number/labels ratio greater than 0.2 results in an approximate 10–25% undercounting of collapsed reads . In our case, the use of 8-nt-long UMIs resulted in a random barcode pool being saturated for a fraction of highly overexpressed molecules (Additional file 2: Figure S7). Thus, when highly expressed miRNA quantification is needed or a low amount of starting material that requires many rounds of amplification is used, longer UMIs are preferable. Nevertheless, even with 8-nt-long UMIs, reducing the input material by 10-fold produced very similar results (Additional file 2: Figure S4). QsRNA-seq can be adjusted easily to longer UMIs using Table 1. In addition, the use of UMIs was shown to reduce ligation bias as well , which we also observed in our samples (Additional file 2: Figure S2). It is also possible to further reduce the ligation bias by mixing several barcodes for a specific sample. Sequencing errors in the UMI sequences were shown to reflect on the accuracy of quantification when using UMIs; therefore, it is also possible to use UMI-tools for the analysis . In this analysis we only aligned our sequences to existing miRNA datasets; however, with the sequencing depth our method provides, it is also possible to use the sequences to identify novel miRNAs by tools such as mirTools , ShortStack , and miRDeep .
The sequencing depth allowed by QsRNA-seq led to a surprising finding of an expressional bias of 23-nt-long miRNAs in the L4 larval stage in C. elegans, suggesting a possible connection between the length of miRNA and its role in the organism’s development. Interestingly, a study on endogenous siRNAs in C. elegans showed that targets of 23-nt siRNAs are associated uniquely with post-embryonic development . Generation of both endogenous siRNAs and miRNAs is dependent on the Dicer enzyme, and the competition among these processes on resources was suggested to affect development . It is tempting to speculate that sRNA processing by Dicer changes as development progresses, which might be needed for normal development. Studies combining mRNA-seq and sRNA sequencing might be able to shed more light on the function of these 23-nt-long sRNAs.
After establishing a way to separate very short nucleic acid fragments (shorter than 100 nt) that differ in length by only 20 nt, we developed a new method, QsRNA-seq, to prepare sRNA libraries for HTS. QsRNA-seq is a gel-free, fast, and easy-to-perform method that also utilizes UMI and barcoding, produces high-quality sRNA libraries, and generates high-depth expression data. We discuss why QsRNA-seq will be very useful for sRNA research and for clinical diagnosis. In addition, profiling miRNA in C. elegans using QsRNA-seq suggested that not just miRNA expression varies in different developmental stages but also miRNA sizes. We believe that QsRNA-seq can transform the preparation of sRNA libraries into a routine procedure like the preparation of mRNA libraries.
C. elegans growth and synchronization
Wild-type C. elegans strain Bristol N2 was used in this study and was maintained on OP50-seeded enriched plates at 20 °C as described in . Embryos were isolated from gravid N2 adults by treatment with sodium hypochlorite solution to dissolve animals of all stages but embryos. To obtain synchronized L4 worms, embryos were incubated in M9 media without food at 20 °C for 24 h. Hutched synchronized L1 larvae were grown on OP50-seeded enriched plates at 20 °C until they reached the L4 larval stage.
Synchronized embryos or L4 larval worms were washed several times with M9 to avoid contamination with bacteria, and then snap-frozen in liquid nitrogen and ground to powder with a liquid nitrogen pre-chilled mortar and pestle. High molecular weight and low molecular weight RNA fractions were isolated using an miRVana miRNA isolation kit (Ambion). RNA quantity was measured with a Qubit® Fluorometer using a Qubit® RNA HS Assay Kit (Molecular Probes), and RNA quality was estimated using agarose gel electrophoresis and TapeStation (Agilent genomics). Human Brain RNA was obtained from (FirstChoice Human Brain Total RNA, Life Technologies).
Determining SPRI binding conditions
where V is the total volume, X is the volume of nucleic acid solution, P is the desired concentration (%) of PEG, and Q is the desired concentration (%) of isopropanol.
First, a total volume of binding solution (V) was calculated by substituting P, Q, and X in the equation for the desired concentrations of PEG and isopropanol and the volume of nucleic acid solution. Next, the volumes of 20% PEG and 100% isopropanol needed for the desired concentrations, equal to 5PV/100 and QV/100, respectively, were calculated.
To measure the binding efficiency, a solution of 2 ng/μl of synthetic single-stranded DNA oligonucleotide was aliquoted at 50 μl per tube. SPRI beads in 20% PEG (SPRIselect, Beckman-Coulter) and 100% isopropanol were added to each tube at volumes determined using the calculation method above. Size selection was performed according to the manufacturer’s protocol (Beckman’s AMpureXP, left-side selection). Oligonucleotide concentrations in the input and eluted samples were measured with a Qubit® Fluorometer and a Qubit® ssDNA Assay Kit (Molecular Probes). The binding efficiency was calculated by the percentage of the output oligonucleotide obtained from the input quantity.
Small RNA library preparation
Small RNA libraries were prepared from at least three biological replicas of N2 worms at the embryo or L4 stage. One RNA sample from each stage was selected for preparing two additional libraries, resulting in three technical replicas for each stage.
A step-by-step protocol developed in this study that includes reagents and primers is described in Additional file 1. In short, low molecular weight RNA was ligated to a 5′-adenylated 3′ adapter using T4 RNA Ligase 2, truncated (NEB) in an absence of ATP. For preparation of phosphate-independent libraries after the ligation, the sample was incubated with 5 U of RppH enzyme (NEB) for 30 min at 37 °C. The 3′ adapter-ligated sRNA was separated from free 3′ adapters and longer RNA species and then ligated to a 5′ adapter, containing multiplexing barcode and UMI, using T4 RNA Ligase 1 (NEB). sRNA ligated from both sides was then separated from the adapter dimer and free 5′ adapter to obtain an sRNA library. All the separation steps of the library preparation process were performed using the method described above involving modified SPRI-based size selection of short fragments. The sRNA library was reverse transcribed using a qScript Flex cDNA Synthesis Kit (Quantabio) and amplified using Phusion High-Fidelity DNA Polymerase (NEB). The amplified library was cleaned from primers and irrelevant products below 100 bp and above 200 bp by double-side size selection on SPRI beads (Beckman’s AMpureXP) according to the size-selection conditions recommended by the manual, and its concentration and quality were determined by TapeStation analysis (Agilent Genomics). Libraries prepared by the PAGE-based method were prepared exactly as described in . Libraries were sequenced using 50-bp single-read (SR) sequencing mode on a HiSeq 2500 platform (Illumina). All sequences generated in this study were submitted to the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/)  under accession number GSE96824.
Sequence processing and expression analysis
The RNA sequences obtained were first de-multiplexed according to the 4-nt barcode. Next, the 3’ adapter sequences were trimmed off by scanning from the 3′ end of the sequence the first instance of the adapter sequence in increments of 1 nt. We then either (1) removed the barcode and UMI (8-nucleotide) and considered these sequences as non-collapsed, or (2) merged identical sequences and then removed the barcode and UMI, and considered these sequences as collapsed.
C. elegans sequences were either aligned to the WS220 (Wormbase, www.wormbase.org) genome  using Bowtie  for size distribution analysis, allowing no mismatches with no more than 10 alignments to the genome (the Bowtie parameters are bowtie -v 0 -e 120 -a --strata --best -m 10) or aligned to miRBase WBcel235 (www.mirbase.org) , allowing no mismatches and not more than one alignment (Bowtie parameters are -v 0 -e 120 -a --strata --best -m 1). Human brain sequences were aligned to miRBase GRCh38  with the same Bowtie parameters. We used the full miRBase WBcel235 and not just high-confidence miRNAs to allow a large enough dataset for significant comparison. Size distribution analysis was done on processed sequences before and after alignment to the genome. The DESeq  package in R (http://www.r-project.org)  was used to evaluate miRNA expression, and the estimateDispersions function in DESeq was used to estimate the dispersion between biological replicas and technical replicas.
We thank Professor Emeritus Vladimir Lin, from the Technion Faculty of Mathematics, for his help in developing the equation for calculating the volumes of PEG and isopropanol. We also thank Orna Ben-Naim Zgayer, Nabeel Ganem, and Noa Ben-Asher for critical reading of the manuscript and discussions.
This work was funded by the Israeli Centers of Research Excellence (I-CORE) program, (Center No. 1796/12 to ATL), the Israel Science Foundation (grant No. 644/13 to ATL), and the Israel Cancer Research Fund (ICRF).
Availability of data and materials
All data generated in this study can be accessed from NCBI/Genbank short read archive (GEO)  accession number [GEO:GSE96824]: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE96824 .
AF and ATL designed the experiments and protocol. AF performed the experiments. DL and ATL performed the bioinformatics analysis. AF and ATL wrote the manuscript. All authors read and approved the final manuscript.
PCT Patent Application No. PCT/IL2017/050761, entitled “Methods for Isolation and Quantification of Short Nucleic Acid Molecules,” authored by AF and ATL, was filed by Technion Research and Developmental Foundation on 6 July 2017. The application claims priority from US Provisional Application No. 62/359,245. The patent does not restrict the use of the method for the purpose of academic research. Author DL decares that he has no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Jost D, Nowojewski A, Levine E. Small RNA biology is systems biology. BMB Rep. 2011;44:11–21.View ArticlePubMedGoogle Scholar
- Youngman EM, Claycomb JM. From early lessons to new frontiers: the worm as a treasure trove of small RNA biology. Front Genet. 2014;5:416.View ArticlePubMedPubMed CentralGoogle Scholar
- Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell. 2009;136:215–33.View ArticlePubMedPubMed CentralGoogle Scholar
- Lu C, Meyers BC, Green PJ. Construction of small RNA cDNA libraries for deep sequencing. Methods. 2007;43:110–7.View ArticlePubMedGoogle Scholar
- Borgstrom E, Lundin S, Lundeberg J. Large scale library generation for high throughput sequencing. PLoS One. 2011;6:e19119.View ArticlePubMedPubMed CentralGoogle Scholar
- Lundin S, Stranneheim H, Pettersson E, Klevebring D, Lundeberg J. Increased throughput by parallelization of library preparation for massive sequencing. PLoS One. 2010;5:e10029.View ArticlePubMedPubMed CentralGoogle Scholar
- Dabney J, Meyer M. Length and GC-biases during sequencing library amplification: a comparison of various polymerase-buffer systems with ancient and modern DNA sequencing libraries. Biotechniques. 2012;52:87–94.View ArticlePubMedGoogle Scholar
- Raabe CA, Tang T-H, Brosius J, Rozhdestvensky TS. Biases in small RNA deep sequencing data. Nucleic Acids Res. 2014;42:1414–26.View ArticlePubMedGoogle Scholar
- Kivioja T, Vaharautio A, Karlsson K, Bonke M, Enge M, Linnarsson S, Taipale J. Counting absolute numbers of molecules using unique molecular identifiers. Nat Methods. 2012;9:72–4.View ArticleGoogle Scholar
- Fuchs RT, Sun Z, Zhuang F, Robb GB. Bias in ligation-based small RNA sequencing library construction is determined by adaptor and RNA structure. PLoS One. 2015;10:e0126049.View ArticlePubMedPubMed CentralGoogle Scholar
- Paithankar KR, Prasad KSN. Precipitation of DNA by polyethylene glycol and ethanol. Nucleic Acids Res. 1991;19:1346.View ArticlePubMedPubMed CentralGoogle Scholar
- Lau NC, Lim LP, Weinstein EG, Bartel DP. An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science. 2001;294:858–62.View ArticlePubMedGoogle Scholar
- Lui W-O, Pourmand N, Patterson BK, Fire A. Patterns of known and novel small RNAs in human cervical cancer. Cancer Res. 2007;67:6031–43.View ArticlePubMedGoogle Scholar
- Johnston RJ, Hobert O. A microRNA controlling left/right neuronal asymmetry in Caenorhabditis elegans. Nature. 2003;426:845–9.View ArticlePubMedGoogle Scholar
- Skalsky RL, Cullen BR. Reduced expression of brain-enriched microRNAs in glioblastomas permits targeted regulation of a cell death gene. PLoS One. 2011;6:e24248.View ArticlePubMedPubMed CentralGoogle Scholar
- Grishok A. Biology and mechanisms of short RNAs in Caenorhabditis elegans. Chap. 1. In: Theodore Friedmann TJCD, Stephen FG, editors. Advances in Genetics, vol. 83. Amsterdam: Academic Press; 2013. p. 1–69.Google Scholar
- Ruby JG, Jan C, Player C, Axtell MJ, Lee W, Nusbaum C, Ge H, Bartel DP. Large-scale sequencing reveals 21U-RNAs and additional MicroRNAs and endogenous siRNAs in C. Elegans. Cell. 2006;127:1193–207.View ArticlePubMedGoogle Scholar
- Abbott AL, Alvarez-Saavedra E, Miska EA, Lau NC, Bartel DP, Horvitz HR, Ambros V. The let-7 MicroRNA family members mir-48, mir-84, and mir-241 function together to regulate developmental timing in Caenorhabditis elegans. Dev Cell. 2005;9:403–14.View ArticlePubMedPubMed CentralGoogle Scholar
- Esquela-Kerscher A, Johnson SM, Bai L, Saito K, Partridge J, Reinert KL, Slack FJ. Post-embryonic expression of C. Elegans microRNAs belonging to the lin-4 and let-7 families in the hypodermis and the reproductive system. Developmental Dynamics. 2005;234:868–77.View ArticlePubMedPubMed CentralGoogle Scholar
- Karp X, Hammell M, Ow MC, Ambros V. Effect of life history on microRNA expression during C. Elegans development. RNA. 2011;17:639–51.View ArticlePubMedPubMed CentralGoogle Scholar
- Kato M, de Lencastre A, Pincus Z, Slack FJ. Dynamic expression of small non-coding RNAs, including novel microRNAs and piRNAs/21U-RNAs, during Caenorhabditis elegans development. Genome Biol. 2009;10:R54.View ArticlePubMedPubMed CentralGoogle Scholar
- Faruq O, Vecchione A. microRNA: diagnostic perspective. Frontiers Medicine. 2015;2:51.View ArticleGoogle Scholar
- Zen K, Zhang C-Y. Circulating MicroRNAs: a novel class of biomarkers to diagnose and monitor human cancers. Med Res Rev. 2012;32:326–48.View ArticlePubMedGoogle Scholar
- Vigneault F, Ter-Ovanesyan D, Alon S, Eminaga S, C Christodoulou D, Seidman JG, Eisenberg E, M Church G: High-throughput multiplex sequencing of miRNA. Curr Protoc Hum Genet 2012, Chapter 11:Unit 11.12.11–10.Google Scholar
- Lamm AT, Stadler MR, Zhang H, Gent JI, Fire AZ. Multimodal RNA-seq using single-strand, double-strand, and CircLigase-based capture yields a refined and extended description of the C. elegans transcriptome. Genome Res. 2011;21:265–75.View ArticlePubMedPubMed CentralGoogle Scholar
- Fu GK, Hu J, Wang P-H, Fodor SPA. Counting individual DNA molecules by the stochastic attachment of diverse labels. Proc Natl Acad Sci. 2011;108:9026–31.View ArticlePubMedGoogle Scholar
- Smith T, Heger A, Sudbery I. UMI-tools: modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Res. 2017;27:491–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Wu J, Liu Q, Wang X, Zheng J, Wang T, You M, Sheng Sun Z, Shi Q. mirTools 2.0 for non-coding RNA discovery, profiling, and functional annotation based on high-throughput sequencing. RNA Biol. 2013;10:1087–92.View ArticlePubMedPubMed CentralGoogle Scholar
- Axtell MJ. ShortStack: comprehensive annotation and quantification of small RNA genes. RNA. 2013;19:740–51.View ArticlePubMedPubMed CentralGoogle Scholar
- An J, Lai J, Lehman ML, Nelson CC. miRDeep*: an integrated application tool for miRNA identification from RNA sequencing data. Nucleic Acids Res. 2013;41:727–37.View ArticlePubMedGoogle Scholar
- Asikainen S, Heikkinen L, Wong G, Storvik M. Functional characterization of endogenous siRNA target genes in Caenorhabditis elegans. BMC Genomics. 2008;9:270.View ArticlePubMedPubMed CentralGoogle Scholar
- Zhuang JJ, Hunter CP. The influence of competition among C. Elegans small RNA pathways on development. Genes. 2012;3:671–85.View ArticlePubMedPubMed CentralGoogle Scholar
- Brenner S. The genetics of Caenorhabditis elegans. Genetics. 1974;77:71–94.PubMedPubMed CentralGoogle Scholar
- Gent JI, Lamm AT, Pavelec DM, Maniar JM, Parameswaran P, Tao L, Kennedy S, Fire AZ. Distinct phases of siRNA synthesis in an endogenous RNAi pathway in C. Elegans soma. Mol Cell. 2010;37:679–89.View ArticlePubMedPubMed CentralGoogle Scholar
- Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, et al. NCBI GEO: archive for functional genomics data sets—update. Nucleic Acids Res. 2013;41:D991–5.View ArticlePubMedGoogle Scholar
- Chen N, Harris TW, Antoshechkin I, Bastiani C, Bieri T, Blasiar D, Bradnam K, Canaran P, Chan J, Chen C-K, et al. WormBase: a comprehensive data resource for Caenorhabditis biology and genomics. Nucleic Acids Res. 2005;33:D383–9.View ArticlePubMedGoogle Scholar
- Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25.View ArticlePubMedPubMed CentralGoogle Scholar
- Kozomara A, Griffiths-Jones S. miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res. 2014;42:D68–73.View ArticlePubMedGoogle Scholar
- Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11:R106.View ArticlePubMedPubMed CentralGoogle Scholar
- R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2013. http://www.R-project.org/.Google Scholar
- Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002;30:207–10.View ArticlePubMedPubMed CentralGoogle Scholar
- Fishman A, Light D, Lamm AT: QsRNA-seq: a method for high-throughput profiling and quantifying small RNAs. GEO accession number GSE96824. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE96824.