Table 7 Characteristics of promoter sequences used by Fprom for identification of TATA+ promoters

From: Automatic annotation of eukaryotic genes, pseudogenes and promoters

Characteristics                                       D2 for TATA+ promoters
Hexaplets in region [-200, -45]                       3.1
Hexaplets in region [1, 40]                           4.0
TATA box score in region [-45, -25]                   2.3
TATA box average score in region [-45, -25]           2.2
Triplets in region [-200, -45]                        2.2
Triplets in region [0, 40]                            2.9
Position triplet matrix in region [-50, +30]          7.0
Protein-induced deformability                         2.9
CpG content                                           3.0
Similarity in region [-200, -100]                     1.0
Motif density in region [-200, -100]                  4.5
Protein-DNA-twist                                     0.3
Motif density in region [-100, -1] (reverse chain)    2.3
Total Mahalanobis distance                            14.8
Number of promoters/non-promoters                     366/18600

  1. D2 is the Mahalanobis distance [26], which measures how strongly each characteristic separates promoter sequences from non-promoter sequences in the test set.
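
To illustrate the kind of quantity reported in the D2 column, the following is a minimal Python sketch of a squared Mahalanobis distance between two groups. It is not Fprom's code: the function names, the use of a pooled variance or covariance, and the toy numbers are all assumptions made for illustration, and the source does not state which estimator was used.

```python
from statistics import mean, variance


def pooled_variance(a, b):
    """Pooled sample variance of two groups (assumes a shared variance)."""
    na, nb = len(a), len(b)
    return ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)


def d2_single_feature(promoters, non_promoters):
    """Squared Mahalanobis distance between group means for one feature."""
    diff = mean(promoters) - mean(non_promoters)
    return diff * diff / pooled_variance(promoters, non_promoters)


def d2_two_features(mean_diff, cov):
    """Squared Mahalanobis distance d^T S^-1 d for a 2x2 covariance S."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    x, y = mean_diff
    # The inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    return (x * (d * x - b * y) + y * (-c * x + a * y)) / det


# Toy feature scores (hypothetical values, not taken from the table).
promoter_scores = [2.0, 3.0, 4.0]
background_scores = [0.0, 1.0, 2.0]
single = d2_single_feature(promoter_scores, background_scores)

# Two features with the same mean difference: uncorrelated vs. correlated.
uncorrelated = d2_two_features((2.0, 2.0), ((1.0, 0.0), (0.0, 1.0)))
correlated = d2_two_features((2.0, 2.0), ((1.0, 0.5), (0.5, 1.0)))
```

In this toy case the two uncorrelated features give a joint distance equal to the sum of their individual distances, while the positively correlated pair gives a smaller one. In general, a combined Mahalanobis distance need not equal the sum of the per-characteristic values.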