Table 3 Comparison of SeqFEATURE (at three specificity-based score cutoffs) to Gene3D, Pfam, and HMMPanther (Panther)

From: The SeqFEATURE library of 3D functional site models: comparison to existing methods and applications to protein function annotation

 

|  | Gene3D | Pfam | Panther | SeqF_95 | SeqF_99 | SeqF_100 |
|---|---|---|---|---|---|---|
| True positive sensitivity | 0.998 | 0.937 | 0.919 | 0.862 | 0.821 | 0.492 |
| False negative sensitivity | 0.907 | 0.704 | 0.532 | 0.831 | 0.775 | 0.282 |
| Overall (TP + FN) sensitivity | 0.983 | 0.898 | 0.831 | 0.857 | 0.814 | 0.457 |
| (False positive) specificity | 0.854 | 1.000 | - | 0.452 | 0.603 | 0.973 |
| Positive predictive value | 0.990 | 1.000 | - | 0.948 | 0.960 | 0.995 |
| Sensitivity at <35% seq-ID | 0.925 | 0.761 | 0.639 | 0.769 | 0.699 | 0.316 |
| Sensitivity at <30% seq-ID | 0.869 | 0.738 | 0.618 | 0.869 | 0.783 | 0.400 |
| Sensitivity at <25% seq-ID | 0.600 | 0.467 | 0.250 | 0.933 | 0.786 | 0.429 |

  1. For each method, we calculated sensitivity (on PROSITE true positives (TP), on PROSITE false negatives (FN), and overall), specificity, positive predictive value (PPV), and sensitivity at varying levels of sequence identity, using the portion of the test set corresponding to the coherent patterns for that method. Bold values indicate the top three performing methods for that row. The sequence-based methods are expected to do very well because they have the advantage of abundant sequence data for modeling functional families. SeqFEATURE performs somewhat less well in sensitivity and specificity but is still very competitive, especially with regard to positive predictive value and false negative sensitivity. Among the low-identity subsets, Gene3D outperforms all other methods in the <35% sequence-identity subset, but SeqFEATURE remains robust throughout and, at the 95% cutoff, matches Gene3D and outperforms Pfam when sequence identity is less than 30%. SeqFEATURE is by far the best-performing method when the proteins have less than 25% sequence identity to the training set.
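For readers who want to reproduce figures of this kind, the three headline measures can be sketched from confusion counts using their standard textbook definitions. This is a minimal illustration, not the authors' evaluation code: the function names and the example counts are hypothetical, and the choice to return `None` when a denominator is zero is an assumption made here for convenience.

```python
from typing import Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the measure is undefined."""
    if denominator == 0:
        return None  # assumption: report "undefined" rather than raise
    return numerator / denominator


def sensitivity(tp: int, fn: int) -> Optional[float]:
    """Fraction of actual positives that the method detects: TP / (TP + FN)."""
    return _ratio(tp, tp + fn)


def specificity(tn: int, fp: int) -> Optional[float]:
    """Fraction of actual negatives that the method rejects: TN / (TN + FP)."""
    return _ratio(tn, tn + fp)


def ppv(tp: int, fp: int) -> Optional[float]:
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    return _ratio(tp, tp + fp)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions;
    # they are not taken from the paper's test set.
    tp, fn, tn, fp = 90, 10, 45, 5
    print("sensitivity:", sensitivity(tp, fn))
    print("specificity:", specificity(tn, fp))
    print("PPV:", ppv(tp, fp))
```

Under these definitions, the sensitivity rows restricted to a sequence-identity threshold would apply the same `sensitivity` calculation to the subset of test proteins below that threshold.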