
Table 5 Predicted proportion of exonic variants that disrupt pre-mRNA splicing in human genetic disease (Inherited disease, that is, germline; and Cancer, that is, somatic) and also identified in the general population (1000 Genomes Project participants)

From: MutPred Splice: machine learning-based prediction of exonic variants that disrupt splicing

Proportion of SAVs in data set (predicted SAVs/total variants)

Data set          | Missense               | Same-sense            | Nonsense              | Total
Inherited disease | 11.0% (5,193/47,228)   | 90.3% (468/518)       | 30.5% (4,130/13,559)  | 16.0% (9,791/61,305)
Cancer            | 9.2% (32,056/347,380)  | 8.6% (9,010/105,094)  | 32.4% (9,141/28,256)  | 10.4% (50,207/480,730)
1000 Genomes      | 6.8% (7,016/103,445)   | 6.7% (5,968/89,396)   | 19.5% (273/1,400)     | 6.8% (13,257/194,241)
The somatic Cancer data set includes both driver and passenger mutations recorded in COSMIC [63]. The 1000 Genomes Project data set was used without any minor allele frequency (MAF) filter, that is, all rare and common variants were included. For each data set, the proportion of predicted SAVs is given as a percentage; the number of predicted SAVs and the total number of variants are shown in parentheses.
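The percentages in the table follow directly from the counts in parentheses, and each Total column is the sum of the three variant classes. The short Python sketch below recomputes them as a consistency check. It is an illustration only, not part of the MutPred Splice software: the names `COUNTS`, `proportion` and `summarise` are hypothetical, and the counts are transcribed by hand from the table above.

```python
# Counts transcribed from Table 5 as (predicted SAVs, total variants).
# All names here are illustrative; nothing below comes from MutPred Splice itself.
COUNTS = {
    "Inherited disease": {
        "Missense": (5193, 47228),
        "Same-sense": (468, 518),
        "Nonsense": (4130, 13559),
    },
    "Cancer": {
        "Missense": (32056, 347380),
        "Same-sense": (9010, 105094),
        "Nonsense": (9141, 28256),
    },
    "1000 Genomes": {
        "Missense": (7016, 103445),
        "Same-sense": (5968, 89396),
        "Nonsense": (273, 1400),
    },
}


def proportion(predicted, total):
    """Return predicted/total as a percentage rounded to one decimal place."""
    return round(100.0 * predicted / total, 1)


def summarise(counts):
    """Compute per-class and pooled (Total) SAV percentages for each data set."""
    rows = {}
    for data_set, by_class in counts.items():
        row = {cls: proportion(p, n) for cls, (p, n) in by_class.items()}
        # The Total column pools all three variant classes before dividing.
        pooled_predicted = sum(p for p, _ in by_class.values())
        pooled_total = sum(n for _, n in by_class.values())
        row["Total"] = proportion(pooled_predicted, pooled_total)
        rows[data_set] = row
    return rows


if __name__ == "__main__":
    for data_set, row in summarise(COUNTS).items():
        cells = ", ".join(f"{cls}: {pct}%" for cls, pct in row.items())
        print(f"{data_set}: {cells}")
```

Note that the Total percentage is a pooled proportion (summed predicted SAVs over summed variants), not the mean of the three per-class percentages, which is why it lies close to the missense figure in the data sets dominated by missense variants.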