Table 9 AUROC per bacterial species in the validation data set and mean AUROC ± standard deviation for each model

From: Promotech: a general tool for bacterial promoter recognition

| Model | M. smegmatis | L. phytofermentans | B. amyloliquefaciens | R. capsulatus | Mean AUROC |
|---|---|---|---|---|---|
| RF-HOT | **0.939** | 0.591 | 0.640 | 0.660 | 0.708 ± 0.157 |
| RF-TETRA | 0.814 | **0.608** | **0.837** | **0.674** | **0.733 ± 0.110** |
| GRU-0 | 0.630 | 0.488 | 0.496 | 0.577 | 0.548 ± 0.068 |
| GRU-1 | 0.601 | 0.487 | 0.502 | 0.566 | 0.539 ± 0.054 |
| LSTM-3 | 0.622 | 0.489 | 0.481 | 0.553 | 0.536 ± 0.066 |
| LSTM-4 | 0.592 | 0.498 | 0.506 | 0.546 | 0.536 ± 0.043 |
| MULTiPly | 0.684 | 0.470 | 0.700 | 0.593 | 0.612 ± 0.106 |
| iPro70-FMWin | 0.642 | 0.587 | 0.779 | 0.575 | 0.646 ± 0.093 |
| bTSSFinder | (0.272, 0.265) | (0.944, 0.924) | (0, 0) | (0.250, 0.245) | NA |
| G4PromFinder | (0.938, 0.932) | (0.216, 0.269) | (0.339, 0.554) | (0.960, 0.953) | NA |
| BPROM | (0.006, 0.002) | (0.560, 0.398) | (0.421, 0.181) | (0.011, 0.007) | NA |
  1. AUROC is roughly the probability that a randomly chosen positive instance receives a higher predicted probability of being a promoter sequence than a randomly chosen negative instance. These results were obtained on data sets not seen during training, with a 1:1 ratio of positive to negative instances. The numbers in bold indicate the model with the highest AUROC. For BPROM, bTSSFinder, and G4PromFinder, the numbers in parentheses are (true-positive rate, false-positive rate) pairs, reported because these tools do not provide a probability for each instance in the data set.
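
The quantities in this table can be illustrated with a short Python sketch. It is not the authors' evaluation code: the function names `auroc_pairwise` and `tpr_fpr`, the toy scores, and the 0.5 threshold are all assumptions made for illustration. It computes AUROC directly from its pairwise definition, computes a (true-positive rate, false-positive rate) pair from hard predictions such as those the tools without probability outputs produce, and checks that the sample standard deviation (n − 1 denominator) is consistent with the RF-HOT row.

```python
from statistics import mean, stdev


def auroc_pairwise(pos_scores, neg_scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive instance gets the higher score; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def tpr_fpr(labels, predictions):
    """True-positive and false-positive rates from binary labels (1/0)
    and binary predictions (1/0)."""
    tp = sum(1 for y, yhat in zip(labels, predictions) if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in zip(labels, predictions) if y == 0 and yhat == 1)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg


# Toy scores (made up for illustration), with a 1:1 positive:negative ratio.
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.2]
auroc = auroc_pairwise(pos_scores, neg_scores)  # 8 of 9 pairs ranked correctly

# A tool that only outputs hard calls yields one (TPR, FPR) point.
# Here the calls come from thresholding the toy scores at 0.5 (an assumption).
labels = [1, 1, 1, 0, 0, 0]
predictions = [1 if s >= 0.5 else 0 for s in pos_scores + neg_scores]
tpr, fpr = tpr_fpr(labels, predictions)

# RF-HOT per-species AUROCs, copied from the table above.
rf_hot = [0.939, 0.591, 0.640, 0.660]
rf_hot_mean = mean(rf_hot)
rf_hot_sd = stdev(rf_hot)  # sample standard deviation (n - 1 denominator)

print(f"toy AUROC = {auroc:.3f}")
print(f"toy (TPR, FPR) = ({tpr:.3f}, {fpr:.3f})")
print(f"RF-HOT mean ± SD = {rf_hot_mean:.4f} ± {rf_hot_sd:.4f}")
```

A single (TPR, FPR) pair is one point on a ROC curve rather than an area under it, which is why the last three rows report NA for the mean AUROC.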