Fig. 7 | Genome Biology

Fig. 7

From: Origin of exon skipping-rich transcriptomes in animals driven by evolution of gene architecture

Prediction of ES incidences in ancestral genomes. a Receiver operating characteristic (ROC) curve of a binomial logistic regression model of ES prediction derived from 11 gene architectural features, 22 species, and a dataset of 18,678 exon skipping events for training and testing (see “Methods”). AU-ROC value (0.752) is reported with 95% confidence interval. The non-pale blue dot marks the optimal probability threshold (0.522) corresponding to the maximum sum of sensitivity (0.617) and specificity (0.760). b ES-positive probability distributions of the test dataset, containing 9318 exons with known ES status (50% positive, in green; 50% negative, in maroon). The dotted line marks the optimal probability threshold (0.522; see above). c Contour plot representing ES incidences (IES, z-axis) across 1600 simulated genomes of varying mean intron sizes (horizontal axis) and intron densities (vertical axis). IES is defined as the fraction of exons in a given simulation with pES > 0.522. Grey marks represent the positions of extant species in the intron size and density dimensions. Red/pink/orange-colored dots represent estimates of ancestral premetazoan and metazoan genomes. Ancestral intron densities are from [26, 27] (Table 1); ancestral intron length distributions are represented by the mean, median, and first and third quartiles (Table 1, Additional file 1: Figure S24; see “Methods”). d 3D projection of c, with IES (z-axis), intron sizes (x-axis), and intron densities (y-axis). Each dot represents an estimate of a genome (intron density and mean intron length). e Correlation between the IES values obtained by analyzing real genome architectures with the predictive classifier (x-axis) and those of their closest equivalents in the spectrum of simulated genomes (y-axis), matched by intron density and mean intron length.
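The two quantities defined in the caption, the optimal probability threshold (maximum sum of sensitivity and specificity) and IES (the fraction of exons with pES above that threshold), can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `optimal_threshold` and `es_incidence` are hypothetical, the toy probabilities are invented for illustration, and the threshold search simply scans the observed probability values.

```python
def optimal_threshold(probs, labels):
    """Return (threshold, sensitivity, specificity) maximizing sensitivity + specificity.

    probs  : predicted ES-positive probabilities, one per exon
    labels : known ES status, 1 = skipped (positive), 0 = not skipped (negative)
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative exon")

    best = None
    # Candidate thresholds are the observed probabilities (an assumption of this sketch).
    for t in sorted(set(probs)):
        # An exon is called ES-positive when its probability is strictly above t.
        tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p > t)
        tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p <= t)
        sens = tp / positives
        spec = tn / negatives
        if best is None or sens + spec > best[1] + best[2]:
            best = (t, sens, spec)
    return best


def es_incidence(probs, threshold=0.522):
    """IES: fraction of exons whose predicted ES probability exceeds the threshold."""
    if not probs:
        raise ValueError("no exons supplied")
    return sum(1 for p in probs if p > threshold) / len(probs)


# Toy data, invented for illustration only.
toy_probs = [0.1, 0.3, 0.6, 0.8]
toy_labels = [0, 0, 1, 1]
threshold, sensitivity, specificity = optimal_threshold(toy_probs, toy_labels)
incidence = es_incidence(toy_probs)  # uses the caption's 0.522 cutoff by default
```

In the setting of panel c, `es_incidence` would be applied once per simulated genome, giving one IES value for each combination of mean intron size and intron density.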
