Case study illustrating the semi-supervised approach employed in this study. The disease-causing (DM) missense mutation CM080465 in the OPA1 gene (NM_015560.2: c.1199C>T; NP_056375.2: p.P400L) was not originally reported to disrupt splicing but was later shown in vitro to disrupt pre-mRNA splicing. CM080465 was therefore included in the negative set in the first iteration (Iter. 1). The Iter. 1 model, however, predicted CM080465 to be a splice-altering variant (SAV). In the next iteration (Iter. 2), CM080465 was excluded from the negative set. The Iter. 2 model still predicted CM080465 to be a SAV and so, in the final iteration (Iter. 3), this variant was included in the positive set. This demonstrates that a semi-supervised approach can, at least in some instances, correctly re-label an incorrectly labeled training example. SAV, splice-altering variant; SNV, splice-neutral variant.
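
The re-labeling schedule described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the names `next_label`, `run_iterations`, and `train_and_predict` are hypothetical, the model is replaced by a stub, and the rule that an unlabeled variant predicted as neutral returns to the negative set is an assumption that the caption does not state.

```python
from typing import Callable, Dict

NEGATIVE, UNLABELED, POSITIVE = "negative", "unlabeled", "positive"


def next_label(current: str, predicted_sav: bool) -> str:
    """Move a variant one step along negative -> unlabeled -> positive."""
    if current == NEGATIVE and predicted_sav:
        # Model disagrees with the negative label: drop it from the negative set.
        return UNLABELED
    if current == UNLABELED:
        # Second SAV call promotes the variant to the positive set.
        # ASSUMPTION: a neutral call sends it back to the negative set.
        return POSITIVE if predicted_sav else NEGATIVE
    return current


def run_iterations(
    labels: Dict[str, str],
    train_and_predict: Callable[[Dict[str, str]], Dict[str, bool]],
    n_iterations: int = 3,
) -> Dict[str, str]:
    """Return the label sets used in the final iteration.

    `train_and_predict` is a hypothetical callable that trains a model on the
    current labels and returns, per variant, whether it is predicted a SAV.
    """
    labels = dict(labels)
    # Labels are updated between iterations, so n iterations give n - 1 updates.
    for _ in range(n_iterations - 1):
        predictions = train_and_predict(labels)
        labels = {
            variant: next_label(state, predictions.get(variant, False))
            for variant, state in labels.items()
        }
    return labels


def stub_model(labels: Dict[str, str]) -> Dict[str, bool]:
    """Stand-in model that always calls CM080465 a SAV (for illustration only)."""
    return {variant: variant == "CM080465" for variant in labels}


# "hypothetical_neutral" is an invented identifier, not a variant from the study.
initial = {"CM080465": NEGATIVE, "hypothetical_neutral": NEGATIVE}
final = run_iterations(initial, stub_model)
print(final)
```

With the stub model, CM080465 moves from the negative set (Iter. 1) to unlabeled (Iter. 2) to the positive set (Iter. 3), while the variant that is never predicted to be a SAV stays in the negative set.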