Computational analysis of core promoters in the Drosophila genome

Table 2. The ten most significant motifs in the core promoter sequences from -60 to +40, as identified by the MEME algorithm.

Motif	Information content (bits)	Consensus	Number of occurrences	E-value
1	15.2	YGGTCACACTR	311	5.1e-415
2 DRE	13.3	WATCGATW	277	1.7e-183
3 TATA	13.2	STATAWAAR	251	2.1e-138
4 INR	11.6	TCAGTYKNNNTYNR	369	3.4e-117
5	15.2	AWCAGCTGWT	125	2.9e-93
6	15.1	KTYRGTATWTTT	107	1.9e-62
7	12.7	KNNCAKCNCTRNY	197	1.9e-63
8	14.7	MKSYGGCARCGSYSS	82	5.1e-29
9 DPE	15.4	CRWMGCGWKCGGTTS	56	1.9e-12
10	15.3	CSARCSSAACGS	40	8.3e-9

We show the identified motifs in pictogram representation, in which the height of each letter corresponds to its frequency relative to the single-nucleotide background used when running MEME. The information content in bits is also calculated with respect to this background. The consensus sequence represents only the highly conserved part of each motif and uses the IUPAC code for ambiguous nucleotides. The number of occurrences is the number of sequences that MEME used to build each motif model. The E-value is the expected number of motifs of the same width that would be found with equal or higher likelihood in the same number of random sequences having the same single-nucleotide frequencies as our promoter set.
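To make the consensus and information-content columns concrete, the following is a minimal Python sketch, not the MEME implementation. It expands an IUPAC consensus string into a regular expression and computes the information content of a single motif column relative to a background distribution. The function names `consensus_to_regex` and `column_information` are hypothetical, and any frequencies passed in are illustrative assumptions rather than values from the promoter set.

```python
import math
import re

# Standard IUPAC nucleotide ambiguity codes and the bases each one covers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    parts = []
    for code in consensus.upper():
        bases = IUPAC[code]
        # Unambiguous codes stay literal; ambiguous ones become a character class.
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


def column_information(freqs, background):
    """Relative entropy (in bits) of one motif column versus the background.

    freqs and background map nucleotides to probabilities; zero-frequency
    nucleotides contribute nothing to the sum.
    """
    return sum(
        f * math.log2(f / background[base])
        for base, f in freqs.items()
        if f > 0
    )


# The TATA consensus from Table 2, expanded to a regular expression.
tata = consensus_to_regex("STATAWAAR")
print(tata.pattern)  # [CG]TATA[AT]AA[AG]
```

Matching a consensus string is a simplification: MEME models each motif as a position-specific probability matrix, so sequences that deviate from the consensus can still contribute to a motif. Summing `column_information` over all columns of such a matrix gives a motif's total information content in bits with respect to the chosen background.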

ISSN: 1474-760X