Figure 7 | Genome Biology

Figure 7

From: The SeqFEATURE library of 3D functional site models: comparison to existing methods and applications to protein function annotation

Overview of the SeqFEATURE pipeline. SeqFEATURE forms training sets by (a) extracting sequence (one-dimensional) motifs from PROSITE and (b) identifying the annotated functional amino acids. We extract examples of each one-dimensional motif that have a known three-dimensional structure in the PDB and center FEATURE training sites on each functional atom of each functional amino acid annotated in the PROSITE pattern. Negative sites, which do not contain the function, are chosen randomly from the PDB and matched for atom density. (c) FEATURE then creates a model of the sites by summarizing the biochemical and biophysical features found in concentric shells around the functional atom center. (d) The resulting three-dimensional fingerprint, which represents the model, specifies the properties that are relatively abundant or scarce in the site. (e) A protein of interest is converted into features and scored with the model using a naïve Bayes scoring function; predictions are made using score cutoffs, which can be chosen to achieve desired performance statistics. The scores are calibrated into Z-scores using the training set used to derive each model.
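Steps (c) through (e) can be illustrated with a minimal Python sketch. It is not the FEATURE implementation: the function names (`shell_counts`, `train`, `score`, `z_score`), the shell width and shell count defaults, the reduction of each feature to presence or absence, and the add-one smoothing are all assumptions made for illustration. The caption also does not say which training scores are used for calibration, so `z_score` simply takes a list of reference scores.

```python
import math
from statistics import mean, stdev


def shell_counts(center, atoms, shell_width=1.25, n_shells=6):
    """Count atoms per (shell index, property) around a center point.

    `atoms` is a list of (x, y, z, property) tuples. The shell width and
    number of shells are illustrative defaults, not values from the caption.
    """
    counts = {}
    for x, y, z, prop in atoms:
        distance = math.dist(center, (x, y, z))
        shell = int(distance // shell_width)
        if shell < n_shells:  # ignore atoms beyond the outermost shell
            counts[(shell, prop)] = counts.get((shell, prop), 0) + 1
    return counts


def train(pos_sites, neg_sites, keys):
    """Build per-feature log-likelihood ratios (hypothetical simplification).

    Each feature is reduced to present/absent, and probabilities use
    add-one smoothing so that no ratio is zero or infinite.
    """
    model = {}
    for k in keys:
        p = (sum(1 for s in pos_sites if s.get(k, 0) > 0) + 1) / (len(pos_sites) + 2)
        q = (sum(1 for s in neg_sites if s.get(k, 0) > 0) + 1) / (len(neg_sites) + 2)
        model[k] = (math.log(p / q), math.log((1 - p) / (1 - q)))
    return model


def score(model, site):
    """Naive Bayes score: sum of independent per-feature log ratios."""
    return sum(
        present if site.get(k, 0) > 0 else absent
        for k, (present, absent) in model.items()
    )


def z_score(raw, reference_scores):
    """Calibrate a raw score against the mean and sample standard
    deviation of a set of reference (training) scores."""
    return (raw - mean(reference_scores)) / stdev(reference_scores)
```

In this sketch, a candidate site would be passed through `shell_counts`, scored with `score`, converted with `z_score`, and then compared with a chosen cutoff to make a prediction.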
