Figure 6 | Genome Biology

Figure 6

From: Predicting gene function in a hierarchical context with an ensemble of classifiers

Bayesian combination of diverse datasets. (a) A typical distribution of SVM outputs for a single dataset. In this example (from the Su et al. expression dataset [34]), the distribution of SVM outputs for positive examples is shifted slightly higher than the distribution for negative examples. (b) Schematic of the Bayesian combination of diverse datasets. For each GO term, we constructed a naïve Bayes classifier in which the output of each single-dataset SVM served as one input node (observed node). (c) Improvement in AUC over single-SVM predictions for selected biological process terms with 101 to 300 genes. The Bayesian combination of datasets was selected for terms where the held-out results on the training set showed superior performance over the single SVM. (d) Median improvement in predictions for selected GO terms across different biological process GO term sizes. The Bayesian combination of diverse datasets performs well only for large GO terms. AUC, area under receiver operating characteristic curve; GO, Gene Ontology; SVM, support vector machine.
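
The combination described in panel (b) can be illustrated with a minimal sketch. It is not the authors' implementation: the caption does not say how the continuous SVM outputs enter the naïve Bayes classifier, so the sketch assumes they are discretized into bins with Laplace smoothing. The function names (`fit_naive_bayes`, `posterior_positive`), the bin edges, and the toy scores are all hypothetical.

```python
import math
from bisect import bisect_right


def fit_naive_bayes(scores, labels, edges, alpha=1.0):
    """Fit a binned naive Bayes model for one GO term.

    scores: list of rows, one SVM output per dataset (assumed layout).
    labels: 1 for genes annotated to the term, 0 otherwise.
    edges:  sorted bin edges used to discretize SVM outputs (assumption).
    alpha:  Laplace smoothing pseudo-count (assumption).
    Both classes must occur at least once in labels.
    """
    n_datasets = len(scores[0])
    n_bins = len(edges) + 1
    # counts[c][d][b]: smoothed count of class-c genes whose
    # dataset-d SVM output falls in bin b.
    counts = {c: [[alpha] * n_bins for _ in range(n_datasets)] for c in (0, 1)}
    class_counts = {0: 0, 1: 0}
    for row, y in zip(scores, labels):
        class_counts[y] += 1
        for d, s in enumerate(row):
            counts[y][d][bisect_right(edges, s)] += 1
    total = len(labels)
    log_prior = {c: math.log(class_counts[c] / total) for c in (0, 1)}
    log_lik = {
        c: [[math.log(v / sum(bins)) for v in bins] for bins in counts[c]]
        for c in (0, 1)
    }
    return {"edges": list(edges), "log_prior": log_prior, "log_lik": log_lik}


def posterior_positive(model, row):
    """Return P(annotated | SVM outputs), treating datasets as
    conditionally independent given the class (the naive Bayes assumption)."""
    log_joint = {}
    for c in (0, 1):
        lj = model["log_prior"][c]
        for d, s in enumerate(row):
            lj += model["log_lik"][c][d][bisect_right(model["edges"], s)]
        log_joint[c] = lj
    # Normalize in log space for numerical stability.
    m = max(log_joint.values())
    num = math.exp(log_joint[1] - m)
    return num / (num + math.exp(log_joint[0] - m))


if __name__ == "__main__":
    # Toy SVM outputs for six genes from two hypothetical datasets.
    toy_scores = [[0.8, 0.5], [0.6, 0.9], [0.3, 0.7],
                  [-0.7, -0.2], [-0.9, 0.1], [-0.4, -0.8]]
    toy_labels = [1, 1, 1, 0, 0, 0]
    toy_model = fit_naive_bayes(toy_scores, toy_labels, edges=[0.0])
    print(posterior_positive(toy_model, [0.9, 0.9]))
```

Each dataset contributes one observed node, so adding a dataset only adds one more per-class likelihood table. In practice the bin edges would need to be chosen per dataset, since SVM output ranges differ between datasets.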
