Fig. 5 | Genome Biology

Fig. 5

From: Long non-coding RNAs display higher natural expression variation than protein-coding genes in healthy humans

GEUVADIS RNA-seq data confirm increased lncRNA expression variability. a Sample processing overview: 462 lymphoblastoid cell lines (LCL) established from healthy donors by EBV transformation were processed by the GEUVADIS RNA-seq Project [50]. b LncRNA identification overview. We picked 20 unrelated donors (total of 522 million uniquely mapped reads) from the 462 donors and processed the raw RNA-seq data through the same pipeline used to annotate lncRNAs in granulocytes (Additional file 1: Figure S24). The resulting LCL lncRNA transcriptome contained 2,611 lncRNA loci formed by 8,560 lncRNA transcripts. c Top: overlap between the LCL and granulocyte de novo transcriptome annotations created in the study. A total of 536 of 2,611 LCL lncRNA loci overlap granulocyte loci. A total of 9,357 of 12,241 LCL de novo mRNA loci overlap granulocyte loci. Bottom: overlap of the de novo lncRNA annotation in LCL with commonly used public annotations (PA): RefSeq, GENCODE-v19, and Cabili [14, 58, 59], and the MiTranscriptome annotation [29]. The comparison identifies 295 new lncRNA loci. Of these, only 18 loci overlap the de novo lncRNA granulocyte annotation. d, e LncRNAs show higher expression variability than mRNAs in LCL. The boxplots show inter-individual variability of LCL lncRNA (green) and mRNA (blue) transcripts (d) and loci (e). Inter-individual variability is estimated as the standard deviation of the expression of each transcript/locus across the 462 donors, normalized to its mean expression. Variability differs significantly (***P < 10⁻¹⁶) between lncRNAs and mRNAs for both transcripts and loci. Median values: lncRNA transcripts: 0.56, mRNA transcripts: 0.24, lncRNA loci: 0.51, mRNA loci: 0.25. f Inter-individual expression variability is higher for newly annotated lncRNA transcripts in LCL. 
The boxplot shows inter-individual expression variability of LCL lncRNA transcripts split according to coverage by public annotations (PA); variability is higher for ‘not in PA’ and ‘isoform not in PA’ lncRNA transcripts than for ‘in PA’ transcripts. Median normalized standard deviation values: not in PA: 0.66, isoform not in PA: 0.58, in PA: 0.46. The blue dashed line indicates the median expression variability of all de novo mRNA transcripts in (d). Remarks to boxplots d, e, f: transcripts or loci not expressed (RPKM < 0.2) in any of the 462 donors were discarded. The boxplots display the full population, but P values are calculated using the Mann–Whitney U test on equalized sample sizes (**P < 10⁻¹⁰, ***P < 10⁻¹⁶). Data from chromosomes X and Y were discarded, and outliers are not displayed.
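
The variability measure described for panels d, e and f can be illustrated with a minimal Python sketch. It is not the authors' pipeline. Several points are assumptions made for illustration: the toy RPKM values and feature names are invented, the sample standard deviation is used, the expression filter is read as "discard features below the cutoff in every donor", and the groups are equalized by random downsampling of the larger group. Only the Mann–Whitney U statistic is computed here; obtaining a P value would need a statistics library.

```python
import random
import statistics

RPKM_THRESHOLD = 0.2  # expression cutoff quoted in the caption


def normalized_sd(values):
    """Standard deviation across donors divided by the mean expression."""
    mean = statistics.fmean(values)
    # Sample SD (n - 1 denominator) is an assumption; the caption does not say.
    return statistics.stdev(values) / mean


def variability(expression):
    """Map feature ID -> normalized SD.

    Features whose RPKM is below the cutoff in every donor are skipped
    (one reading of the caption's filter, assumed here).
    """
    return {
        feature: normalized_sd(rpkm)
        for feature, rpkm in expression.items()
        if max(rpkm) >= RPKM_THRESHOLD
    }


def equalize(a, b, seed=0):
    """Randomly downsample both groups to the size of the smaller one."""
    rng = random.Random(seed)
    n = min(len(a), len(b))
    return rng.sample(a, n), rng.sample(b, n)


def mann_whitney_u(a, b):
    """U statistic for group a: pairs with a > b count 1, ties count 0.5."""
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)


# Hypothetical toy data: feature ID -> RPKM in four donors.
lncrna = {
    "lnc1": [0.5, 2.0, 0.1, 1.4],
    "lnc2": [0.0, 0.1, 0.05, 0.1],  # below cutoff everywhere, so dropped
    "lnc3": [1.0, 3.0, 0.2, 0.8],
}
mrna = {
    "m1": [10.0, 12.0, 9.0, 11.0],
    "m2": [5.0, 6.0, 5.5, 4.5],
}

lnc_var = list(variability(lncrna).values())
mrna_var = list(variability(mrna).values())
lnc_eq, mrna_eq = equalize(lnc_var, mrna_var)

print("median lncRNA variability:", statistics.median(lnc_var))
print("median mRNA variability:", statistics.median(mrna_var))
print("U statistic:", mann_whitney_u(lnc_eq, mrna_eq))
```

In practice a library routine such as `scipy.stats.mannwhitneyu` would replace the hand-written U statistic and also return a P value.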
