The effect of missing values on the accuracy of quantitative GI prediction. (a-c) The three scenarios producing missing values in E-MAP data. (a) In the 'Random' scenario, the GIs of a random subset of gene pairs were hidden. (b) In the 'Submatrix' scenario, a random subset of genes was selected and all the interactions between them were hidden. (c) In the 'Cross' scenario, two random disjoint subsets of genes were selected and all the interactions between the two subsets were hidden. In all three examples, 20% of the gene pairs were hidden. (d, e) Performance for different fractions of missing values in the three scenarios, using only the GSG+MATRIX features (d) or all the features (e). Performance was tested on the ER E-MAP, as it contained the fewest missing values. Imputation was performed by linear regression. The procedure was repeated 30 times, and performance was evaluated as the average Pearson correlation between the hidden and the predicted S-scores. Note that the 'Cross' scenario is not applicable to cases with ≥ 50% missing values.
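
The three hiding scenarios can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the representation of hidden interactions as a set of gene-index pairs `(i, j)` with `i < j`, the rounding of subset sizes, and the choice of two equal-sized subsets in the 'Cross' scenario are all assumptions made for illustration. Under these assumptions, two disjoint subsets can cover at most roughly half of all gene pairs, which is consistent with the 'Cross' scenario being inapplicable at high fractions of missing values.

```python
import itertools
import math
import random


def random_scenario(n_genes, fraction, rng):
    """'Random': hide a random subset of all gene pairs."""
    pairs = list(itertools.combinations(range(n_genes), 2))
    n_hide = round(fraction * len(pairs))
    return set(rng.sample(pairs, n_hide))


def submatrix_scenario(n_genes, fraction, rng):
    """'Submatrix': hide every pair within one random subset of genes."""
    target = fraction * math.comb(n_genes, 2)
    # Solve k * (k - 1) / 2 = target for k, then round (assumed sizing rule).
    k = min(n_genes, round((1 + math.sqrt(1 + 8 * target)) / 2))
    genes = sorted(rng.sample(range(n_genes), k))
    return set(itertools.combinations(genes, 2))


def cross_scenario(n_genes, fraction, rng):
    """'Cross': hide every pair between two random disjoint gene subsets."""
    target = fraction * math.comb(n_genes, 2)
    # Assumption: both subsets have the same size, so size * size ~ target.
    size = round(math.sqrt(target))
    if 2 * size > n_genes:
        raise ValueError("fraction too large for two disjoint subsets")
    genes = rng.sample(range(n_genes), 2 * size)
    first, second = genes[:size], genes[size:]
    return {(min(a, b), max(a, b)) for a in first for b in second}


def hidden_fraction(hidden_pairs, n_genes):
    """Fraction of all gene pairs that are hidden."""
    return len(hidden_pairs) / math.comb(n_genes, 2)
```

Each function returns the set of pairs whose S-scores would be masked before imputation. Because subset sizes are integers, the 'Submatrix' and 'Cross' sketches only approximate the requested fraction.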