ROC curves evaluating model performance on the same unseen test set of 352 variants (238 positive and 114 negative). For each of the four training sets (Table 2), three different RF classification models were built (Iter. 1, Iter. 2 and Iter. 3). The percentage AUC for each training set and iteration is shown in parentheses.
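
For readers unfamiliar with the metric, the percentage AUC can be read as the probability, expressed as a percentage, that a randomly chosen positive variant receives a higher classifier score than a randomly chosen negative one. The sketch below illustrates that pairwise definition only; it is not the authors' evaluation code. The function name `auc_percent` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auc_percent(labels, scores):
    """Percentage AUC from the pairwise (Mann-Whitney) definition.

    labels: iterable of 1 (positive) or 0 (negative)
    scores: classifier scores, higher meaning "more likely positive"
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5

    # Fraction of correctly ordered positive/negative pairs, as a percentage.
    return 100.0 * wins / (len(pos) * len(neg))


# Hypothetical toy data: three positives, two negatives.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.1]
print(f"AUC = {auc_percent(toy_labels, toy_scores):.1f}%")
```

The pairwise loop is quadratic in the number of variants, which is negligible for a test set of this size (238 × 114 pairs); a rank-based formulation would give the same value for larger sets.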