Factors affecting prediction performance. (a) Precision at 20% recall (P20R) values evaluated using held-out annotations on all Gene Ontology (GO) terms (vertical axis) within each of the 12 evaluation categories for each submission (left panel) and for a simple guilt-by association using each data set in turn as its sole evidence source (right panel). The number of genes in each evaluation category is shown in parentheses. GO-BP, GO Biological process; GO-CC, GO Cellular component; GO-MF, GO Molecular function; NB, naïve Bayes. Data sets are described in Table 1. (b) Fraction of the 21,603 genes in the data collection with at least one annotated neighbor per data set. (c) Analysis of variance (ANOVA), exploring the effects of various factors on P20R values. (d) Fraction of total variance in P20R values that is explained by each effect. Asterisks in (c, d) indicate interaction between two factors.