- Deposited research article
- Open Access
Extreme conservation of non-repetitive non-coding regions near HoxD complex of vertebrates
Genome Biology volume 4, Article number: P2 (2003)
Homeotic gene complexes determine the anterior-posterior body axis in animals. The expression pattern and function of hox genes along this axis are colinear with the order in which they are organized in the complex. This 'chromosomal organization and functional correspondence' is conserved in all bilaterians investigated. Although the molecular basis of this 'colinearity' is not yet understood, it is possible that there are control elements within or in the proximity of these complexes that establish and maintain the expression patterns of hox genes in a coordinated fashion. We report here an unprecedented conservation of non-coding DNA sequences adjacent to the HoxD complex of vertebrates. Stretches of hundreds of base pairs in a 7 kb region upstream of the HoxD complex show 100% conservation from fish to human. Using primers designed from these sequences of the human HoxD complex, we amplified the corresponding regions from different vertebrates, including mammals, aves, reptiles, amphibians and pisces. Such a high degree of conservation, where no variation was allowed during ~500 million years of evolution, suggests a critical function for these sequences in the regulation of the HoxD complex. Furthermore, these sequences provide a molecular handle to gain insight into the mechanism of regulation of this complex.
Eukaryotic genomes contain a large excess of non-coding sequences. Conservation of these sequences among species is a strong indication of their functional significance. With the availability of genome sequences, it is possible to identify such sequences by taking a comparative genomics approach [1–3]. Clustering of genes that are regulated in a linked manner has been noticed in several cases [4, 5]. Among the most conserved regions of the vertebrate genome are the clusters of homeotic genes [6, 7]. The homeotic gene complex was first identified in Drosophila melanogaster and was demonstrated to play a major role in anterior-posterior body axis formation. Hox genes in flies, and similarly in vertebrates, are expressed in a coordinated manner along the body axis. The molecular mechanism behind such coordination in regulation, however, is not yet understood. Several mechanisms have been proposed that link the organization of homeotic genes and their spatio-temporally controlled expression [9–11], of which the most attractive implicates higher order chromatin organization in this process. It has been shown that an upstream region spanning up to 20 kb plays an important role in the regulation of this complex. Such studies have led to the speculation that repressive elements in this region may initially silence the complex and then release the genes for expression in a sequential manner. Fine mapping of such sequences and their conservation in other vertebrates have not been reported. The role of higher order chromatin organization in the regulation of homeotic gene complexes is relatively better known in the case of the bithorax complex of Drosophila.
Results and discussion
We compared genomic regions flanking hox complexes in order to identify conserved regions. Here we report that the upstream regions of the HoxD complexes of Homo sapiens (human), Mus musculus (mouse), Rattus norvegicus (rat), Papio hamadryas (sacred baboon), Heterodontus francisci (horn shark), Danio rerio (zebrafish) and Fugu rubripes (Fugu) contain long stretches of extremely conserved sequences. Analysis of a 25 kb region upstream of the HoxD complex from these organisms revealed an extremely conserved region spread over three blocks located within 7 kb of the 3' end of the Evx-2 gene. These conserved regions, designated Conserved Region-1, Conserved Region-2 and Conserved Region-3 (CR-1, CR-2 and CR-3) (Fig. 1a), show a degree of conservation not seen before among distant species. Detailed analysis of each region, spanning several hundred base pairs, shows 100% conservation, in particular in CR-2 (Table 1, Fig. 1b). These sequences are found as single copy and are vertebrate specific. We also noticed longer stretches of conservation among mammals, which gradually shorten towards lower vertebrates, defining the core of each conserved region across the vertebrate classes (Table 1). This, and the fact that in shark, as compared to mammals, the intervening sequences between CR-2 and CR-3, and between CR-1 and Evx-2, are shorter by ~1300 bp and ~600 bp, respectively (Fig. 1a), suggest that, starting from the shorter conserved regions, additional unique sequences have progressively been acquired during the evolution of primates from lower vertebrates. We did not find such a degree of conservation in the flanking regions of the other hox complexes (HoxA, B and C) of vertebrates (data not shown).
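The idea of a conserved "core" shared by all species can be illustrated with a minimal sketch that scans a multiple alignment for runs of columns that are identical in every sequence. The aligned strings below are invented toy data, not sequences from this study, and the function name `conserved_blocks` and the `min_len` threshold are hypothetical choices made only for illustration.

```python
def conserved_blocks(aligned, min_len=5):
    """Return (start, end) pairs, 0-based and end-exclusive, for runs of
    alignment columns that are identical in every sequence and gap-free."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    blocks = []
    run_start = None
    for col in range(length):
        column = {seq[col].upper() for seq in aligned}
        identical = len(column) == 1 and "-" not in column
        if identical and run_start is None:
            run_start = col                      # a conserved run begins
        elif not identical and run_start is not None:
            if col - run_start >= min_len:       # keep only long enough runs
                blocks.append((run_start, col))
            run_start = None
    # Close a run that extends to the last column.
    if run_start is not None and length - run_start >= min_len:
        blocks.append((run_start, length))
    return blocks


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = [
    "ACGTTGCATTAGC-GGATCCA",
    "ACGTTGCATTAGCAGGATCCA",
    "TCGTTGCATTAGCTGGATCCA",
]
blocks = conserved_blocks(toy_alignment, min_len=5)
print(blocks)  # [(1, 13), (14, 21)]
```

Adding a more divergent sequence to the alignment can only shorten or split these runs, which is one way to picture how the fully conserved stretches shrink to a core as more distant vertebrate classes are included.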
Primers designed from all three conserved regions of the human HoxD complex amplified the corresponding regions from different species covering all five classes of vertebrates. The PCR products from different species and Southern hybridization with the human probes are shown in Fig. 2. Sequencing of the PCR products confirmed these observations.
Several recent reports using a comparative genomics approach have identified conserved non-coding regions among different vertebrates [15–17], but none to the degree that we report here. The mechanism that may require such a high degree of conservation is not known. It is therefore not immediately clear what precisely the (regulatory) role of these sequences is. A part of CR-1, 2 or 3 could be the enhancer of the Evx-2 gene or another regulatory element that could be in this region [4, 5]. The size and the extent of conservation, however, rule out such enhancer-type regulatory sequences being the only functional elements associated with these sequences. The conserved sequences fall within the region that has been suggested to organize a repressive complex. Identification of CR-1, 2 and 3, and their 'class specific' extensions (Table 1), will help in the search for molecular components of any such, or any other, mechanism of HoxD regulation.
An EST database search revealed that parts of CR-1 and CR-3 are transcribed, but no EST corresponding to CR-2 or any other part of the 7.5 kb region was found. These transcripts are expressed early in development (Fig. 1). A possible mechanism could involve RNA from this region functioning by base pairing to implement temporal and spatial regulation of the homeotic genes. If that is the case, such high conservation could be expected. A role for transcription in the regulation of the bithorax complex is emerging from recent studies [18–21]. Further studies will be required to determine whether such a process is common to vertebrate Hox complexes as well.
While such extreme conservation of several hundred nucleotides over half a billion years, in a region that does not code for any known protein, certainly implicates an essential role for these sequences, probably in the regulation of the HoxD complex, no known regulatory element requires such extreme conservation extending over hundreds of base pairs. It is therefore likely that these elements are a component of a novel mechanism, common to all vertebrates, that regulates this gene complex. We are tempted to suggest that such a strongly conserved region from fish to human, linked to a gene complex that is known to determine body axis formation, may be the key determinant of the molecular basis of early ontogeny. Early embryos of all vertebrates show striking similarity, and we suggest that these elements may control the early expression pattern of HoxD, which leads to a similar embryo shape. While very speculative, such possibilities can be tested experimentally. The gradient of conservation seen in this region from fish to human may reflect the evolutionary history of this locus. Diversification of the vertebrate classes and the morphological features along the anterior-posterior body axis that have been acquired during evolution [22, 23] could potentially be correlated through extensive molecular analysis of these sequences.
The genomic sequences that contained Evx-2 and any of the Hoxd genes were downloaded and annotated using gene/ORF prediction tools. A similar approach was used for the other hox complexes. Homology searches with the upstream sequence of the human HoxD region (AC009336; nucleotides 56601 to 64095) were carried out using the BLAST program of NCBI. The sequences that showed significant homology were then analyzed with the BLAST 2 program to determine the extent of homology. The conserved regions from each sequence were obtained and subjected to multiple sequence analysis using Clustal X. In order to identify expressed sequences corresponding to the conserved sequences, the conserved sequences along with the unique sequences were BLASTed against EST databases (human, mouse and dbEST).
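The quoted range can be checked with simple arithmetic: if "nucleotide 56601 to 64095" is read as a 1-based, inclusive range (the usual GenBank convention, though the authors do not state it, so this is an assumption), the query spans 64095 - 56601 + 1 = 7495 bp, which matches the roughly 7.5 kb region mentioned in the discussion. The sketch below shows that calculation and the corresponding Python slice; the helper names are hypothetical.

```python
def region_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_region(sequence, start, end):
    """Slice a sequence using 1-based, inclusive coordinates."""
    region_length(start, end)  # validates the coordinates
    if end > len(sequence):
        raise ValueError("range extends past the end of the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Range quoted in the text for the human query (accession AC009336).
QUERY_START, QUERY_END = 56601, 64095
query_bp = region_length(QUERY_START, QUERY_END)
print(f"{query_bp} bp = {query_bp / 1000:.1f} kb")  # 7495 bp = 7.5 kb

# Slicing convention demonstrated on a made-up sequence.
toy_region = extract_region("AACCGGTT", 3, 6)
```

Applying `extract_region` to the full AC009336 record with these coordinates would give the query sequence, provided the record is first loaded as a plain string.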
The contigs that showed significant homology to the upstream sequences of human HoxD were annotated using the tBLASTx program and by searching the translated amino acid sequences in the Swiss-Prot database. The RepeatMasker program was used to look for repeat content. GenBank sequences used in this study are as follows: AC116665 Papio hamadryas, AF224263 Heterodontus francisci, AC015584 Mus musculus, AC009336 Homo sapiens, CAAB01000449 Fugu rubripes and NW_042732 Rattus norvegicus.
We identified Hox complexes by searching for the respective homeotic genes and then downloading the genomic sequences. In this way we were able to study the flanking regions of HoxA, HoxB and HoxC from different vertebrate species. In order to see whether such conserved regions are associated with the other complexes, we took 25 kb of DNA from the human HoxA, HoxB and HoxC complexes and BLASTed it against the 'non redundant' sequence database and also against the genome sequences of mouse, rat, fugu and zebrafish and other eukaryotic genomes available in the public databases.
Genomic DNA isolation, PCR amplification, sequencing and Southern hybridization
For the isolation of genomic DNA, blood samples of human, chick and cobra (Naja naja) were used, while liver tissue of mouse and muscle tissue of frog (Bufo melanostictus) and zebrafish were used. A standard protocol of DNA isolation was followed, which included lysis, RNase A and proteinase K digestions followed by phenol/chloroform extraction and precipitation. Concentration and quality of the genomic DNA were checked by 0.7% agarose gel electrophoresis and UV absorption spectrophotometry. Based on the sequences of the conserved regions, primers were designed to amplify the three regions CR-1, CR-2 and CR-3. Primers used in this study to amplify conserved regions from different vertebrate species were:
CR-1 forward: GAGGCTGTTCACACTGGTGG
CR-1 reverse: ATCATGCTCTCTGATGGACC
CR-2 forward: GCATCGTAATCAGTTCGGTC
CR-2 reverse: TGATACAAGCTGATACCGTC
CR-3 forward: GCTATTCAAAATGTTATTTGAG
CR-3 reverse: CTGTAATGAAGAAAAGATTTATG
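Basic properties of the six primers listed above can be tabulated with a short sketch. The primer sequences are copied from the list; the melting-temperature estimate uses the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) °C, which is a common rule of thumb for short oligonucleotides and is used here purely as an illustration. The text does not say how the authors chose their annealing temperature, and the function names are hypothetical.

```python
PRIMERS = {
    "CR-1 forward": "GAGGCTGTTCACACTGGTGG",
    "CR-1 reverse": "ATCATGCTCTCTGATGGACC",
    "CR-2 forward": "GCATCGTAATCAGTTCGGTC",
    "CR-2 reverse": "TGATACAAGCTGATACCGTC",
    "CR-3 forward": "GCTATTCAAAATGTTATTTGAG",
    "CR-3 reverse": "CTGTAATGAAGAAAAGATTTATG",
}


def gc_percent(seq):
    """Percentage of G and C bases in the primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Wallace-rule estimate: 2 C per A/T plus 4 C per G/C (rough guide only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def primer_summary(primers):
    """Length, GC content and Wallace-rule Tm for each named primer."""
    return {
        name: {
            "length": len(seq),
            "gc_percent": round(gc_percent(seq), 1),
            "wallace_tm_c": wallace_tm(seq),
        }
        for name, seq in primers.items()
    }


summary = primer_summary(PRIMERS)
for name, stats in summary.items():
    print(f"{name}: {stats['length']} nt, "
          f"{stats['gc_percent']}% GC, ~{stats['wallace_tm_c']} C")
```

The Wallace rule ignores salt and primer concentration, so its values should be treated as rough guides; nearest-neighbour methods are generally preferred when an accurate estimate is needed.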
The 25 μl reaction was performed using 100 ng template DNA and 5 pmol each of forward and reverse primers. PCR conditions were: an initial denaturation step at 94°C for 3 min, followed by 35 cycles of 94°C for 1 min, 57°C for 1 min and 72°C for 1.30 min, and a final extension step at 72°C for 7 min. All the PCR products were sequenced on an ABI automated DNA sequencer (Perkin Elmer) using the ABI Big Dye terminator chemistry. For Southern hybridization, the PCR products were separated on 1% agarose gels and transferred to N+ nylon membrane. Purified PCR products amplified from human DNA were labeled by random priming and used as probes. Hybridization and washings were performed at 65°C. We also amplified these sequences using the same primer sets from a variety of animals across the vertebrates (data not shown).
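The reaction set-up quoted above translates into final concentrations and a total programmed time with a few lines of arithmetic. The sketch below takes the volumes, amounts and temperatures from the text; the reading of "1.30 min" as 1 min 30 s is an assumption, and the total ignores ramping between temperatures, which depends on the thermocycler.

```python
REACTION_VOLUME_UL = 25.0   # from the text
PRIMER_PMOL = 5.0           # each primer, from the text
TEMPLATE_NG = 100.0         # from the text

# 1 pmol per microlitre equals 1 micromolar, so pmol/ul gives uM directly.
primer_um = PRIMER_PMOL / REACTION_VOLUME_UL
template_ng_per_ul = TEMPLATE_NG / REACTION_VOLUME_UL

INITIAL_DENATURATION_MIN = 3.0
CYCLES = 35
# ASSUMPTION: "1.30 min" at 72 C is interpreted as 1 min 30 s (1.5 min).
CYCLE_STEPS_MIN = (1.0, 1.0, 1.5)   # 94 C, 57 C, 72 C
FINAL_EXTENSION_MIN = 7.0

total_hold_min = (
    INITIAL_DENATURATION_MIN
    + CYCLES * sum(CYCLE_STEPS_MIN)
    + FINAL_EXTENSION_MIN
)

print(f"primer: {primer_um:.1f} uM each")
print(f"template: {template_ng_per_ul:.0f} ng/ul")
print(f"programmed hold time: {total_hold_min:.1f} min")
```

Under these assumptions each primer is at 0.2 μM, the template is at 4 ng/μl, and the programmed holds add up to 132.5 min.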
Pennacchio LA, Rubin EM: Genomic strategies to identify mammalian regulatory sequences. Nat Rev Genet. 2001, 2: 100-109. 10.1038/35052548.
Kondrashov AS, Shabalina SA: Classification of common conserved sequences in mammalian intergenic regions. Hum Mol Genet. 2002, 11: 669-674. 10.1093/hmg/11.6.669.
Dehal P, Predki P, Olsen AS, Kobayashi A, Folta P, Lucas S, Land M, Terry A, Ecale Zhou C, Rash S, Zhang Q, Gordon L, Kim J, Elkin C, Pollard MJ, Richardson P, Rokhsar D, Uberbacher E, Hawkins T, Branscomb E, Stubbs L: Human Chromosome 19 and Related Regions in Mouse: Conservative and Lineage-Specific Evolution. Science. 2001, 293: 104-111. 10.1126/science.1060310.
Boutanaev AM, Kalmykova AI, Shevelyov YY, Nurminsky DI: Large clusters of co-expressed genes in the Drosophila genome. Nature. 2002, 420: 666-669. 10.1038/nature01216.
Lercher MJ, Urrutia AO, Hurst LD: Clustering of housekeeping genes provides a unified model of gene order in the human genome. Nat Genet. 2002, 31: 180-183. 10.1038/ng887.
McGinnis W, Krumlauf R: Homeobox genes and axial patterning. Cell. 1992, 68: 283-302.
Krumlauf R: Hox genes in vertebrate development. Cell. 1994, 78: 191-201.
Lewis EB: A gene complex controlling segmentation in Drosophila. Nature. 1978, 276: 565-570.
Kmita M, Kondo T, Duboule D: Targeted inversion of a polar silencer within the HoxD complex reallocates domains of enhancer sharing. Nat Genet. 2000, 26: 451-454. 10.1038/82593.
Spitz F, Gonzalez F, Peichel C, Vogt TF, Duboule D, Zákány J: Large scale transgenic and cluster deletion analysis of the HoxD complex separate an ancestral regulatory module from evolutionary innovations. Genes & Dev. 2001, 15: 2209-2214. 10.1101/gad.205701.
Duboule D: Vertebrate hox gene regulation: clustering and/or colinearity? Curr Opin Genet Dev. 1998, 8: 514-518. 10.1016/S0959-437X(98)80004-X.
Kmita M, Fraudeau N, Herault Y, Duboule D: Serial deletions and duplications suggest a mechanism for the collinearity of Hoxd genes in limbs. Nature. 2002, 420: 145-150. 10.1038/nature01189.
Kondo T, Duboule D: Breaking colinearity in the mouse HoxD complex. Cell. 1999, 97: 407-417.
Mihaly J, Hogga I, Barges S, Galloni M, Mishra RK, Hagstrom K, Muller M, Schedl P, Sipos L, Gausz J, Gyurkovics H, Karch F: Chromatin domain boundaries in the Bithorax complex. Cell Mol Life Sci. 1998, 54: 60-70. 10.1007/s000180050125.
Wasserman WW, Palumbo M, Thompson W, Fickett JW, Lawrence CE: Human-mouse genome comparisons to locate regulatory sites. Nat Genet. 2000, 26: 225-228. 10.1038/79965.
Aparicio S, Morrison A, Gould A, Gilthorpe J, Chaudhuri C, Rigby PWJ, Krumlauf R, Brenner S: Detecting conserved regulatory elements with the model genome of the Japanese puffer fish Fugu rubripes. Proc Natl Acad Sci USA. 1995, 92: 1684-1688.
Dermitzakis ET, Reymond A, Lyle R, Scamuffa N, Ucla G, Deutsch S, Stevenson BJ, Flegel V, Bucher P, Jongeneel CV, Antonarakis SE: Numerous potentially functional but non-genic conserved sequences on human chromosome 21. Nature. 2002, 420: 578-582. 10.1038/nature01251.
Drewell RA, Bae E, Burr J, Lewis EB: Transcription defines the embryonic domains of cis-regulatory activity at the Drosophila bithorax complex. Proc Natl Acad Sci USA. 2002, 99: 16853-16858. 10.1073/pnas.222671199.
Rank G, Prestel M, Paro R: Transcription through intergenic chromosomal memory elements of the Drosophila bithorax complex correlates with an epigenetic switch. Mol Cell Biol. 2002, 22: 8026-8034. 10.1128/MCB.22.22.8026-8034.2002.
Hogga I, Karch F: Transcription through the iab-7 cis-regulatory domain of the bithorax complex interferes with the Polycomb-mediated silencing. Development. 2002, 129: 4915-4922.
Bender W, Fitzgerald DP: Transcription activates repressed domain in the Drosophila bithorax complex. Development. 2002, 129: 4923-4930.
Prince V: The hox paradox: more complex(es) than imagined. Dev Biol. 2002, 249: 1-15. 10.1006/dbio.2002.0745.
Manzanares M, Wada H, Itasaki N, Trainor PA, Krumlauf R, Holland PW: Conservation and elaboration of Hox gene regulation during evolution of the vertebrates. Nature. 2000, 408: 854-857. 10.1038/35048570.
We acknowledge the help of the CCMB sequencing facility, the wildlife conservation group for providing DNA samples of various animal species, and Mehar Sultana, A Suresh and Md Idris during this work. This work was supported by a Young Investigators grant (RGY0316/2001-M) from the Human Frontier Science Program to RKM.