Table 3. Summary of features investigated in this study

From: MutPred Splice: machine learning-based prediction of exonic variants that disrupt splicing

| Feature | Type | Description |
| --- | --- | --- |
| Distance to nearest splice site | SNP-based | Distance between a given variant and the nearest 5′ or 3′ splice site in the target exon |
| ESR change | SNP-based | Change in the frequency of ESR elements following a single base substitution. This includes: ESE to neutral (ESE loss); ESE to ESE (no change); neutral to ESE (ESE gain); ESE to ESS (ESE loss and ESS gain); neutral to neutral (no change); neutral to ESS (ESS gain); ESS to neutral (ESS loss); ESS to ESE (ESS loss and ESE gain) |
| In ESE | SNP-based | Frequency of ESE binding sites (in the wild-type) that overlap with the location of the variant |
| In ESS | SNP-based | Frequency of ESS binding sites (in the wild-type) that overlap with the location of the variant |
| ESR hexamer score (ESR-HS) | SNP-based | Hexamer scoring function that captures the differential distributions of disease and neutral variants with respect to loss or gain of an ESE or ESS |
| Spectrum kernel | SNP-based | Frequency of 3-mers and 4-mers over an 11 bp window (wild-type and mutant) |
| Change in natural splice site strength | SNP-based | MaxEnt score of the natural splice site in the mutant allele minus its MaxEnt score in the wild-type allele |
| Maximum cryptic splice site | SNP-based | Maximum score of any cryptic 5′ or 3′ splice site (outside the natural splice site) that overlaps the variant on the mutant allele |
| Evolutionarily conserved element | SNP-based | PhastCons conserved element probability for the substitution site, based on multiple alignments of 46 placental mammals |
| Base-wise evolutionary conservation | SNP-based | PhyloP base-wise sequence conservation score at the site of the single base substitution, based on multiple sequence alignment of 46 placental mammals |
| Natural wild-type splice site strength | Exon-based | MaxEntScan score of the natural 5′ and 3′ splice sites of the wild-type target exon |
| Flanking intron size | Exon-based | Length in base pairs of the upstream and downstream introns flanking the target exon |
| Intronic ESS density | Exon-based | ESS density calculated across the 100 bp upstream and the 100 bp downstream of the target exon |
| Exonic ESS density | Exon-based | ESS density calculated across the first 50 bp and the last 50 bp of the target exon. If the exon is shorter than 100 bp, its full length is used instead |
| Exonic ESE density | Exon-based | Calculated in the same way as exonic ESS density, but for ESEs |
| Internal coding exon | Exon-based | {true, false}: whether the target exon is an internal coding exon (that is, neither the first nor the last coding exon) |
| Exonic GC content | Exon-based | Percentage of nucleotides in the target exon that are either guanine or cytosine |
| Exon size | Exon-based | Size of the target exon |
| Constitutive exon | Exon-based | Whether the target exon is constitutively spliced |
| Exon number | Gene-based | Number of exons in the transcript |
| Transcript number | Gene-based | Number of different reported isoforms that the target gene encodes |
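
Several of the sequence-derived features in the table can be illustrated with a short sketch. The Python below is not the authors' implementation: it is a minimal illustration, under stated assumptions, of how exonic GC content, the spectrum kernel counts and the ESR change categories might be computed. The function names, the 0-based coordinate convention, the `category to category` label strings and the toy hexamer sets `TOY_ESE` and `TOY_ESS` are all hypothetical; a real analysis would load published ESE and ESS hexamer sets.

```python
from collections import Counter

# Hypothetical toy hexamer sets, for illustration only.
TOY_ESE = {"GAAGAA"}
TOY_ESS = {"TTAGGG"}


def gc_content(seq: str) -> float:
    """Percentage of bases in seq that are guanine or cytosine."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def substitute(seq: str, pos: int, alt: str) -> str:
    """Return seq with the base at 0-based index pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def spectrum(seq: str, pos: int, ks=(3, 4), flank: int = 5) -> Counter:
    """Count k-mers in the window of `flank` bases either side of pos.

    With flank=5 the window is 11 bp wide, provided the variant is at
    least 5 bp from either end of seq; near the ends it is truncated.
    """
    window = seq[max(0, pos - flank): pos + flank + 1]
    counts = Counter()
    for k in ks:
        for i in range(len(window) - k + 1):
            counts[window[i:i + k]] += 1
    return counts


def esr_class(hexamer: str, ese: set, ess: set) -> str:
    """Label a hexamer as ESE, ESS or neutral."""
    if hexamer in ese:
        return "ESE"
    if hexamer in ess:
        return "ESS"
    return "neutral"


def esr_changes(wt: str, pos: int, alt: str, ese: set, ess: set) -> Counter:
    """Tally wild-type to mutant ESR transitions over every hexamer
    that overlaps the substituted position."""
    k = 6
    mut = substitute(wt, pos, alt)
    changes = Counter()
    first = max(0, pos - k + 1)       # leftmost hexamer covering pos
    last = min(pos, len(wt) - k)      # rightmost hexamer covering pos
    for start in range(first, last + 1):
        before = esr_class(wt[start:start + k], ese, ess)
        after = esr_class(mut[start:start + k], ese, ess)
        changes[f"{before} to {after}"] += 1
    return changes


if __name__ == "__main__":
    exon = "CCCCCGAAGAACCCCC"  # made-up sequence
    print(gc_content(exon))
    print(spectrum(exon, 8))                                # wild-type counts
    print(spectrum(substitute(exon, 8, "T"), 8))            # mutant counts
    print(esr_changes(exon, 8, "T", TOY_ESE, TOY_ESS))
```

In this sketch the spectrum kernel is computed separately for the wild-type and the mutant sequence, mirroring the "wild-type and mutant" wording in the table; how the two count vectors are combined into model features is not specified here and is left open.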