Fig. 7 | Genome Biology

Fig. 7

From: Transcriptome-wide RNA processing kinetics revealed using extremely short 4tU labeling


Features associated with splicing speed and comparison of paralogs. a Comparison of secondary structure scores (ΔG; y-axis) for the fastest-splicing and slowest-splicing thirds of 82 ribosomal protein (RP) intron-containing genes (x-axis). The violin plots show the distribution of the features, and the blue dots represent individual RP genes, with dot size corresponding to the splicing speed. The p-value was obtained using Wilcoxon’s test. b Exon 2 length and secondary structure at the 3′ splice site (3′ss) (y-axis) compared against splicing speed (x-axis) for 35 non-ribosomal protein (non-RP) intron-containing genes. The violin plots show the distribution of the features, and the blue dots represent individual genes, with dot size corresponding to the splicing speed. The p-value was obtained using Wilcoxon’s test. c Changes in mRNA proportion for three pairs of paralogs in which both members splice at a similar rate (left panel) and for three pairs of paralogs in which the members splice at different rates (right panel). The proportion of mRNA is estimated from 4-thiouracil (4tU) data using the probabilistic model described in Additional file 1: Data and Methods. The ΔG per nucleotide values (see “Methods” section) between the 5′ splice site (5′ss) and branch point are stated in the inset boxes. d Pearson’s correlations between splicing speed and sequence patterns, showing the features significantly correlated (p < 0.05) with splicing speed for all 117 intron-containing genes (left panel), 35 non-RP intron-containing genes (middle panel), and 82 RP intron-containing genes (right panel). The features are the occurrence of the specific base or bases in the intron. Yellow represents positive correlation with splicing speed and purple represents negative correlation. e Scatterplot of observed and predicted splicing speeds from the associated features. The features are listed in Additional file 1: Tables S8 and S9, and include secondary structures, splice site scores, intron length and exon length. The predictions are obtained by random forest regression with automatic feature selection. AUC, area under the curve
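The kind of calculation summarized in panel d can be illustrated with a minimal sketch: count how often a base pattern occurs in each intron, then compute Pearson's correlation between those counts and splicing speed. This is not the authors' code. The intron sequences and speed values below are hypothetical toy data, the helper names `count_pattern` and `pearson_r` are invented for illustration, and treating "occurrence" as a raw count of overlapping matches is an assumption. The sketch also omits the significance filter (p < 0.05) used in the figure.

```python
import math


def count_pattern(seq, pattern):
    """Count occurrences of `pattern` in `seq`, allowing overlaps."""
    width = len(pattern)
    return sum(
        1 for i in range(len(seq) - width + 1) if seq[i:i + width] == pattern
    )


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data: four made-up intron sequences and made-up
# splicing speeds in arbitrary units. These are NOT values from the study.
introns = ["GTATGTTTTTAG", "GTATGTACAG", "GTACGTTTAACTTAG", "GTATGTCCAG"]
speeds = [0.9, 0.4, 0.7, 0.3]

# Feature: occurrence of the base T in each intron (assumed to be a raw count).
t_counts = [count_pattern(seq, "T") for seq in introns]
r = pearson_r(t_counts, speeds)
print(f"T counts: {t_counts}, Pearson r with speed: {r:.3f}")
```

A positive `r` would correspond to a yellow cell in the panel and a negative `r` to a purple one. Reproducing the figure would additionally require a p-value for each feature and the real intron sequences and splicing speeds.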
