Table 7 Characteristics of promoter sequences used by Fprom for identification of TATA+ promoters

From: Automatic annotation of eukaryotic genes, pseudogenes and promoters

Characteristics                                     D² for TATA+ promoters
Hexaplets in region [-200, -45]                     3.1
Hexaplets in region [1, 40]                         4.0
TATA box score in region [-45, -25]                 2.3
TATA box average score in region [-45, -25]         2.2
Triplets in region [-200, -45]                      2.2
Triplets in region [0, 40]                          2.9
Position triplet matrix in region [-50, +30]        7.0
Protein-induced deformability                       2.9
CpG content                                         3.0
Similarity in region [-200, -100]                   1.0
Motif density in region [-200, -100]                4.5
Protein-DNA-twist                                   0.3
Motif density in region [-100, -1] (reverse chain)  2.3
Total Mahalanobis distance                          14.8
Number of promoters/non-promoters                   366/18600
  1. D² is the Mahalanobis distance [26], which measures how strongly each characteristic separates promoter from non-promoter sequences in the test set.
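
As an illustration of the footnote, here is a minimal sketch of how a two-class Mahalanobis distance can be computed. It assumes the common form based on a pooled within-class covariance; the table does not state the exact formula Fprom uses. The function names and the toy feature values are hypothetical and are not taken from the article.

```python
from statistics import mean

def pooled_cov(a, b, i, j):
    """Pooled within-class covariance of features i and j (assumed estimator)."""
    total = 0.0
    for rows in (a, b):
        mi = mean(r[i] for r in rows)
        mj = mean(r[j] for r in rows)
        total += sum((r[i] - mi) * (r[j] - mj) for r in rows)
    return total / (len(a) + len(b) - 2)

def d2_single(a, b, i):
    """Squared Mahalanobis distance between two classes for one feature."""
    diff = mean(r[i] for r in a) - mean(r[i] for r in b)
    return diff * diff / pooled_cov(a, b, i, i)

def d2_pair(a, b, i, j):
    """Squared Mahalanobis distance for two features taken jointly."""
    di = mean(r[i] for r in a) - mean(r[i] for r in b)
    dj = mean(r[j] for r in a) - mean(r[j] for r in b)
    sii = pooled_cov(a, b, i, i)
    sjj = pooled_cov(a, b, j, j)
    sij = pooled_cov(a, b, i, j)
    det = sii * sjj - sij * sij
    # d^T S^-1 d, written out for a 2x2 covariance matrix S
    return (sjj * di * di - 2.0 * sij * di * dj + sii * dj * dj) / det

# Invented toy data: two strongly correlated feature scores per sequence.
promoters = [(2.0, 2.0), (3.0, 3.1), (4.0, 3.9)]
non_promoters = [(0.0, 0.0), (1.0, 1.1), (2.0, 1.9)]

separate = d2_single(promoters, non_promoters, 0) + d2_single(promoters, non_promoters, 1)
joint = d2_pair(promoters, non_promoters, 0, 1)
print(f"sum of single-feature D2: {separate:.2f}, joint D2: {joint:.2f}")
```

In this toy data the two features are correlated, so the joint distance is smaller than the sum of the single-feature distances. The same effect may explain why the total in the table (14.8) is below the sum of the individual rows, though the article does not say so here.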