Relative performance of different methods on the test set and the novel set for GO biological process terms (size 101 to 300). The relative performance of individual groups differs between the test set and the novel set. In addition, performance on the novel set is generally worse than on the test set. This indicates that cross-validation should be used with care when assessing the relative performance of different algorithms, and that evaluation on novel biology is necessary. Asterisks indicate second-round submissions. GO, Gene Ontology.