Table 1 Accuracy, exact match ratio, micro and macro F1 score, macro recall, and macro precision of the different ML models, together with the mean positive label (PL) precision and recall. The models considered were a balanced SVC (feature selection) + neural network classifier using oversampling (Bl. SVC + NN (OS)), two-stage detection + classification neural networks (TS NN), a two-stage detection + classification balanced support vector classifier (TS Bl. SVC), and the majority vote ensemble classifier (MV ensemble). Among the three individual classifiers, TS NN had the highest PL precision and TS Bl. SVC had the highest PL recall, while Bl. SVC + NN (OS) had the best balance between precision and recall. The majority vote ensemble improved on the results of the three classifiers, as conveyed by the high precision and high recall it achieves.

From: SeqScreen: accurate and sensitive functional screening of pathogenic sequences via ensemble learning

| Model | Accuracy | Exact match ratio | Micro F1 score | Macro F1 score | Macro recall | Macro precision | Mean PL precision | Mean PL recall |
|---|---|---|---|---|---|---|---|---|
| Bl. SVC + NN (OS) | 0.9997 | 0.9924 | 0.9859 | 0.8210 | 0.8039 | 0.8716 | 0.8759 | 0.8180 |
| TS NN | 0.9997 | 0.9924 | 0.9359 | 0.6934 | 0.6445 | 0.8011 | 0.8893 | 0.6988 |
| TS Bl. SVC | 0.9996 | 0.9893 | 0.8692 | 0.7047 | 0.8310 | 0.6492 | 0.7382 | 0.8869 |
| MV ensemble | 0.9997 | 0.9934 | 0.9424 | 0.7998 | 0.8016 | 0.8453 | 0.9003 | 0.8273 |
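The metric names in the table can be illustrated with a minimal sketch. The original does not give its exact definitions, so this assumes the standard multi-label ones: accuracy as the fraction of correct label cells, exact match ratio as the fraction of samples whose full label vector is correct, micro scores pooled over all labels, and macro scores averaged per label. The function name `multilabel_metrics` and the toy data are hypothetical and are not taken from SeqScreen.

```python
from typing import Dict, List


def multilabel_metrics(y_true: List[List[int]], y_pred: List[List[int]]) -> Dict[str, float]:
    """Compute common multi-label metrics from 0/1 indicator matrices.

    Assumed definitions (not confirmed by the source): a label with an
    undefined precision, recall, or F1 (zero denominator) scores 0.0.
    """
    n_samples = len(y_true)
    n_labels = len(y_true[0])
    tp = [0] * n_labels
    fp = [0] * n_labels
    fn = [0] * n_labels
    correct_cells = 0
    exact_matches = 0

    for t_row, p_row in zip(y_true, y_pred):
        if t_row == p_row:
            exact_matches += 1
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            if t == p:
                correct_cells += 1
            if t == 1 and p == 1:
                tp[j] += 1
            elif t == 0 and p == 1:
                fp[j] += 1
            elif t == 1 and p == 0:
                fn[j] += 1

    def safe_div(num: float, den: float) -> float:
        return num / den if den else 0.0

    precision = [safe_div(tp[j], tp[j] + fp[j]) for j in range(n_labels)]
    recall = [safe_div(tp[j], tp[j] + fn[j]) for j in range(n_labels)]
    f1 = [safe_div(2 * tp[j], 2 * tp[j] + fp[j] + fn[j]) for j in range(n_labels)]

    return {
        "accuracy": correct_cells / (n_samples * n_labels),
        "exact_match_ratio": exact_matches / n_samples,
        # Micro: pool the counts over all labels, then compute one F1.
        "micro_f1": safe_div(2 * sum(tp), 2 * sum(tp) + sum(fp) + sum(fn)),
        # Macro: compute the score per label, then take the unweighted mean.
        "macro_f1": sum(f1) / n_labels,
        "macro_recall": sum(recall) / n_labels,
        "macro_precision": sum(precision) / n_labels,
    }


# Toy data: 4 samples, 3 labels (purely illustrative).
y_true = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

metrics = multilabel_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed definitions, every correctly predicted negative cell counts toward accuracy, so accuracy can stay close to 1 when most labels are negative. The macro scores weight each label equally and drop when rare labels are missed, which is consistent with the gap between the accuracy and macro columns in the table.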