Table 4 List of the significant blocks detected in the scl dataset

From: A novel approach to identifying regulatory motifs in distantly related genomes

Block: scl 1.1 (-)

Consensus sequence: TTGCCAAATTAAAATGAATCATTTGGCCCATAATGGCCGAGGCGCT

Possible binding sites:

*Conserved sequence identified in [47], GCCAAAT: 3-9 +

*Putative SKN1 site reported in [47], AATGAATCATTT: 13-24 +

CdxA, M00100, MTTTATR: 29-35 - (0.917)

CdxA, M00101, AWTWMTR: 7-13 + (0.901); 8-14 + (0.905); 10-16 + (0.927); 29-35 + (0.927); 29-35 - (0.929); 7-13 - (0.913)

*En-1, M00396, GTANTNN: 30-36 + (0.936)

Cap, M00253, NCANHNNN: 19-26 + (0.932); 10-17 - (0.933)

Pbx-1, M00096, ANCAATCAW: 14-22 + (0.941)

AP-1, M00199, NTGASTCAG: 14-22 + (0.913)

*HOXA3, M00395, CNTANNNKN: 29-37 + (0.927)

Tst-1, M00133, NNKGAATTAVAVTDN: 3-17 + (0.901)

  1. For each block, the consensus sequence is given followed by the possible binding sites situated in this block: motifs previously described in the literature [48] are marked with an asterisk. The motifs are summarized by their motif name (in bold), by their consensus sequence, if known, as described in the original article, by the sequence of the motif instance in our search, by the positions of the motif instance relative to the consensus sequence of the entire block and by the strand (indicated by a '+' or a '-') on which the motif occurred. Motif hits derived by Transfac are indicated by their matrix accession number, the consensus of this binding site and the instances of this motif in our search. These are further characterized by their positions relative to the consensus sequence of the entire block, by the strand on which the motif occurred and by the corresponding MotifLocator score (in parentheses). The blocks identified by the UCSC genome browser as conserved between mammals and Fugu are marked with 'UCSC', while the blocks detected by our two-step methodology but not present in the UCSC genome browser are indicated with a '-'.
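The coordinate convention described in the footnote can be illustrated with a minimal Python sketch. It assumes that positions are 1-based and inclusive, and that a hit on the '-' strand is read as the reverse complement of the same span of the block consensus; neither assumption is stated explicitly in the table. The helper names (`motif_instance`, `iupac_match_count`) are hypothetical and are not part of MotifLocator or Transfac. Because Transfac hits are scored against a matrix, an instance need not match the IUPAC consensus at every position, so the sketch only counts agreeing positions.

```python
# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Consensus sequence of block scl 1.1, copied from the table.
SCL_1_1 = "TTGCCAAATTAAAATGAATCATTTGGCCCATAATGGCCGAGGCGCT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_instance(block, start, end, strand):
    """Extract a motif instance from a block consensus.

    Assumption: start and end are 1-based and inclusive, and '-' means
    the reverse complement of that span.
    """
    segment = block[start - 1:end]
    return segment if strand == "+" else reverse_complement(segment)


def iupac_match_count(pattern, instance):
    """Count the positions where the instance agrees with an IUPAC pattern."""
    return sum(base in IUPAC[code] for code, base in zip(pattern, instance))


# Literature motifs from the table, both on the '+' strand.
print(motif_instance(SCL_1_1, 3, 9, "+"))
print(motif_instance(SCL_1_1, 13, 24, "+"))

# A Transfac hit on the '-' strand: CdxA, M00100, MTTTATR at 29-35.
hit = motif_instance(SCL_1_1, 29, 35, "-")
print(hit, iupac_match_count("MTTTATR", hit), "of", len(hit))
```

Under these assumptions, the two literature motifs are recovered exactly from their listed positions, while the CdxA hit agrees with its consensus at most but not all positions, which fits a MotifLocator score below 1.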