Genomic structure of the gene for mouse germ-cell nuclear factor (GCNF). II. Comparison with the genomic structure of the human GCNF gene
© Süsens and Borgmeyer, licensee BioMed Central Ltd 2001
Received: 5 March 2001
Accepted: 9 April 2001
Published: 2 May 2001
Germ-cell nuclear factor (GCNF, NR6A1) is an orphan nuclear receptor. Its expression pattern suggests that it functions during embryogenesis, in the placenta and in germ-cell development. Mouse GCNF cDNA codes for a protein of 495 amino acids, whereas the four reported human cDNA variants code for proteins of 454 to 480 amino acids. Apart from this size difference, there is sequence conservation of up to 98.7%. To elucidate the genomic structure that gives rise to the different human GCNF mRNAs, the sequence information of the human GCNF locus is compared to the previously reported structure of the mouse locus.
The genomic structures of the mouse and human GCNF genes are highly conserved. The comparison reveals that the shorter human protein results from skipping the 45 base-pair third exon. Three different human isoforms (GCNF-1, GCNF-2a and GCNF-2b) are generated by differential usage of alternative splice acceptor sites of the fourth and seventh exons.
By homology with the mouse gene, 11 GCNF coding exons can be defined on human chromosome 9. All human GCNF cDNAs identified so far are, however, derived from mRNAs generated by splicing the fourth to the second exon. Although the genomic sequence is highly conserved, the analysis suggests that alternative splicing generates a higher complexity of human GCNF isoforms compared with the situation in the mouse.
The nuclear receptors comprise a family of transcriptional regulators involved in a wide variety of biological processes such as embryonic development, differentiation and homeostasis [1,2]. The family includes ligand-dependent zinc-finger transcription factors for steroid hormones, estrogens, thyroid hormones, retinoids, vitamin D and other hydrophobic molecules. In addition, several family members are 'orphan receptors' for which ligands have yet to be identified. Nuclear receptors have been assigned to six subfamilies on the basis of evolutionary studies [3]. As the first member of the sixth subfamily, GCNF is also known by its systematic name NR6A1 [4]. On the basis of homology and expression profile, the receptor has been given the alternative name of retinoic acid receptor-related testis-associated receptor (RTR) [5]. GCNF lacks known ligands and is therefore referred to as an orphan receptor. The human gene has been mapped to chromosome 9q33-q34.1 [6]. Transfection experiments reveal that GCNF can act as a constitutive repressor when binding as a homodimer to promoters containing a direct repeat DNA element with zero spacing, 5'-AGGTCAAGGTCA-3' (DR0) [7,8,9,10]. Gene targeting in the mouse shows that GCNF has essential functions during embryogenesis [11]. The mouse receptor (mGCNF) is highly expressed in the developing embryonic nervous system and the labyrinthine layer of the placenta [12,13]. In the adult, high transcript levels are restricted to the developing germ cells [5,14,15,16]. Northern analysis reveals a transcript of 7.5 kilobases (kb) in somatic cells and an additional message of approximately 2.4 kb in male germ cells. This size difference is at least partially due to different polyadenylation sites, and it is therefore assumed that both transcripts code for identical proteins of 495 amino acids. The protein sequence is encoded by 11 exons [17].
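The DR0 element quoted above is two AGGTCA half-sites placed head to tail without a spacer. As a minimal sketch, a scan for such direct repeats could look like the following; the function name, the `spacer` parameter and the promoter fragment are illustrative assumptions and are not taken from the study, and only the given strand is searched.

```python
import re

HALF_SITE = "AGGTCA"  # half-site of the DR0 element 5'-AGGTCAAGGTCA-3'


def find_direct_repeats(sequence: str, spacer: int = 0) -> list:
    """Return 0-based start positions of two half-sites separated by `spacer` bases.

    Only the strand passed in is scanned; the reverse complement is not checked.
    """
    # A lookahead is used so that overlapping occurrences are also reported.
    pattern = re.compile(f"(?={HALF_SITE}[ACGT]{{{spacer}}}{HALF_SITE})")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical promoter fragment, invented purely for illustration.
promoter = "ttgcAGGTCAAGGTCAcgta"
print(find_direct_repeats(promoter))            # DR0 sites (spacer = 0)
print(find_direct_repeats(promoter, spacer=1))  # DR1 sites, none in this fragment
```

Setting `spacer` to 0 corresponds to DR0; other values would describe direct repeats with a spacer, which are not discussed here.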
When differentiation of P19 embryonal carcinoma cells is triggered by retinoic acid, the transcript and the protein are temporarily upregulated and then downregulated [18].
Isolation of a human cDNA coding for a protein (hGCNF) with an identity to the mouse protein of 98.7%, similar regulation in mouse P19 cells and in the human embryonal carcinoma cell line NT2/D1, together with the presence of two mRNAs of approximately 7.5 and 2.2 kb in human testis, suggested similar functions for mouse and human GCNF [18,19,20]. The cloning of human cDNAs that give rise to different hGCNF isoforms, however, suggests a higher complexity in humans. Currently, four different hGCNF cDNAs have been isolated that code for isoforms ranging in size from 454 to 480 amino acids.
We have investigated the genomic structure of mammalian GCNF to determine how the different GCNF isoforms are generated. Here we compare the exon/intron structure of the previously characterized mouse gene with that of the human ortholog [17]. Our study shows that alternative splicing generates at least three of the different GCNF isoforms.
Results and discussion
Structure of the first coding exon
Conserved structure of exons 2 to 11
Two short exons of 42 bp and 45 bp, respectively, follow the first protein-coding exon in the mouse [17]. Such short exons are relatively rare in mammalian genomes. The structure of the second protein-coding exon is conserved (Figure 3a), and the splice donor and acceptor sites are identical in both species. Interestingly, the genomic DNA sequence shows that the third exon is highly conserved as well (Figure 3b). The human splicing apparatus nevertheless skips this putative exon preferentially, or even exclusively. As splicing is highly regulated, one possible explanation is that a splice enhancer present in the mouse genome is absent from the human genome. Consequently, all known human GCNF isoforms lack the amino acids encoded by the putative third exon.
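The effect of skipping the third exon on protein length can be checked with simple arithmetic. The sketch below assumes that exon skipping is the only source of the length difference and that the exon is removed in frame; the function and constant names are illustrative.

```python
def codons_lost_by_skipping(exon_length_bp: int) -> int:
    """Number of whole codons removed when an exon is skipped in frame."""
    if exon_length_bp % 3 != 0:
        # A length that is not a multiple of three would shift the reading frame
        # instead of simply shortening the protein.
        raise ValueError("skipping this exon would shift the reading frame")
    return exon_length_bp // 3


MOUSE_PROTEIN_LENGTH = 495  # amino acids, as stated for mGCNF
THIRD_EXON_BP = 45          # length of the third exon

lost = codons_lost_by_skipping(THIRD_EXON_BP)
print(lost)                         # 15 codons
print(MOUSE_PROTEIN_LENGTH - lost)  # 480 amino acids
```

Under these assumptions the result, 480 residues, matches the longest of the reported human isoforms.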
Of the 243 bp of exon 4, which encodes the core of the DNA-binding domain, 225 bp are identical in both species (Figure 3c). One of the reported sequences (U64876/NM_001489), however, has a C to A transversion that changes a codon for asparagine to one for lysine. Splicing of exon 2 to exon 4 at the position characterized in the mouse results in isoform hGCNF-2. In addition to this splice acceptor position, a splice acceptor site located 12 nucleotides further downstream is used to generate hGCNF-1. Exons 5 and 6 are highly conserved (Figure 3d,e). Two hGCNF-2 variants, hGCNF-2a and hGCNF-2b, which differ by a single amino acid, have been isolated. As speculated, alternative splicing generates isoform 2b, which lacks a serine residue: splicing to an acceptor site of exon 7 located three nucleotides further downstream gives rise to this shorter isoform (Figure 3e,f). The sequence and structure of exons 8 to 11 are also highly conserved (Figure 3g,h,i,j). The comparison of the 11th exon was extended up to the end of the human cDNA sequence of S83309. Highly conserved sequence elements of up to 91 identical nucleotides indicate a regulatory function of the 3'-untranslated sequence following the translational stop codon.
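The consequence of using an acceptor site further downstream can be illustrated with a small sketch. It assumes that the shifted acceptor simply removes the first nucleotides of the exon from the mRNA and that the shift is a multiple of three, so the reading frame is kept; the toy exon sequences are invented and do not correspond to GCNF.

```python
def splice(upstream_exon: str, downstream_exon: str, acceptor_shift: int = 0) -> str:
    """Join two exons, using an acceptor `acceptor_shift` nt further downstream."""
    return upstream_exon + downstream_exon[acceptor_shift:]


def residues_removed(acceptor_shift: int) -> int:
    """Net number of amino acids lost for an in-frame acceptor shift."""
    if acceptor_shift % 3 != 0:
        raise ValueError("this acceptor shift would change the reading frame")
    return acceptor_shift // 3


# Toy exons for illustration only.
exon_a = "ATGGCT"
exon_b = "AGCGATGAACGTCTGTAA"

full = splice(exon_a, exon_b)
short = splice(exon_a, exon_b, acceptor_shift=3)
print(len(full) - len(short))  # 3 nucleotides missing from the mRNA
print(residues_removed(3))     # 1 residue, as for the serine absent from hGCNF-2b
print(residues_removed(12))    # 4 residues for a 12-nucleotide shift
```

The value for the 12-nucleotide shift is derived from the arithmetic alone; when the exon boundary falls within a codon, the residue at the junction may also change.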
All intron-exon boundaries obey the GT/AG rule [22]. Intron sizes were taken from the AceView analysis at the NCBI, which is based on the draft sequence, and from a Blast search with S83309 of the freely accessible Celera Genomics whole-genome sequence data; the two sources gave mostly similar values. Both analyses revealed a large first intron, of 37,652 bp in the public sequence data and 37,157 bp in the private data. The size of the second intron, which separates exon 2 and exon 4, was available only in the NCBI database (14,869 bp). According to the NCBI database, introns 3 to 9 have sizes of 10,486, 3,629, 190,321, 1,963, 2,716, 1,905 and 1,927 bp, respectively; the corresponding Celera values are 10,471, 3,615, 1,708, 1,960, 9,019, 1,912 and 1,928 bp. The deduced sizes of two of the human introns, the fifth (190,321 versus 1,708 bp) and the seventh (2,716 versus 9,019 bp), thus differ greatly between the two analyses. It seems likely that these inconsistencies will be corrected in the final assembly of the human genome.
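The comparison of the two sets of intron sizes can be reproduced with a few lines of code. The sizes below are those quoted in the text; the twofold cutoff used to call a discrepancy is an arbitrary assumption made for this sketch, not a criterion from the analysis.

```python
# Intron number -> (NCBI size, Celera size) in bp; None where no value was available.
INTRON_SIZES = {
    1: (37652, 37157),
    2: (14869, None),
    3: (10486, 10471),
    4: (3629, 3615),
    5: (190321, 1708),
    6: (1963, 1960),
    7: (2716, 9019),
    8: (1905, 1912),
    9: (1927, 1928),
}


def discordant_introns(sizes: dict, max_ratio: float = 2.0) -> list:
    """Return intron numbers whose two size estimates differ by more than `max_ratio`."""
    flagged = []
    for number, (ncbi, celera) in sorted(sizes.items()):
        if ncbi is None or celera is None:
            continue  # cannot compare when one source lacks the intron
        ratio = max(ncbi, celera) / min(ncbi, celera)
        if ratio > max_ratio:
            flagged.append(number)
    return flagged


print(discordant_introns(INTRON_SIZES))  # [5, 7]
```

With this cutoff only the fifth and seventh introns are flagged; all other pairs of estimates agree to within about 1.5%.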
In summary, our analysis reveals a conserved structure for GCNF, allows the verification and systematic analysis of splice variants, and may be the basis of a better understanding of GCNF. The human GCNF gene consists of at least 10 exons. The conservation of the intron-exon boundaries is consistent with the extremely high degree of amino-acid conservation between the human and the mouse proteins. The generation of the proteins hGCNF-1, hGCNF-2a and hGCNF-2b can be explained by alternative splicing of the RNA. The sequence of the third coding mouse exon, including the splice sites, is highly conserved; however, at present no human cDNA has been isolated containing this putative exon. Alternative splicing provides a plausible means for generating diversity and may contribute to a higher instructive complexity in human GCNF.
Materials and methods
Exons of GCNF were identified by a Blast [23] search with the human GCNF cDNA sequence (S83309) in the "unfinished high throughput genomic sequences" and in the Homo sapiens genomic contig sequences at the NCBI [24,25]. Intron sizes given by the AceView analysis [26] were compared with the numbers obtained by a Blast search of Celera's assembled sequence of the human genome [27]. The putative human GCNF exon 3 was identified by a Blast search with the sequence of the third mouse exon.
Sequences were aligned using the Wisconsin Package Version 10.0 of the Genetics Computer Group (GCG), Madison, Wisconsin.
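Identity values such as those reported for the exon comparisons can be derived from a pairwise alignment by counting matching columns. The following is a minimal sketch that assumes two already aligned sequences of equal length with gaps written as "-"; it is not the algorithm of the GCG package, and the example sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical positions among ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # columns containing a gap are not counted
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment: 7 of 8 ungapped columns match.
print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 87.5

# Exon 4: 225 of 243 bp identical, expressed as a percentage.
print(round(100.0 * 225 / 243, 1))  # 92.6
```

Whether gapped columns are excluded or counted as mismatches is a design choice; excluding them, as here, gives the higher of the two possible values.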
1. Mangelsdorf DJ, Thummel C, Beato M, Herrlich P, Schütz G, Umesono K, Blumberg B, Kastner P, Mark M, Chambon P, Evans RM: The nuclear receptor superfamily: the second decade. Cell. 1995, 83: 835-839.
2. Giguère V: Orphan nuclear receptors: from gene to function. Endocr Rev. 1999, 20: 689-725.
3. Laudet V: Evolution of the nuclear receptor superfamily: early diversification from an ancestral orphan receptor. J Mol Endocrinol. 1997, 19: 207-226.
4. The Nuclear Receptor Committee: A unified nomenclature system for the nuclear receptor superfamily. Cell. 1999, 97: 161-163.
5. Hirose T, O'Brien DA, Jetten AM: RTR: a new member of the nuclear receptor superfamily that is highly expressed in murine testis. Gene. 1995, 152: 247-251.
6. Agoulnik IY, Cho Y, Niederberger C, Kieback DG, Cooney AJ: Cloning, expression analysis and chromosomal localization of the human nuclear receptor gene GCNF. FEBS Lett. 1998, 424: 73-78.
7. Bauer U-M, Schneider-Hirsch S, Reinhardt S, Pauly T, Maus A, Wang F, Heiermann R, Rentrop M, Maelicke A: Neuronal cell nuclear factor - a nuclear receptor possibly involved in the control of neurogenesis and neuronal differentiation. Eur J Biochem. 1997, 249: 826-837.
8. Cooney AJ, Hummelke GC, Herman T, Chen F, Jackson KJ: Germ cell nuclear factor is a response element-specific repressor of transcription. Biochem Biophys Res Commun. 1998, 245: 94-100.
9. Greschik H, Wurtz J-M, Hublitz P, Köhler F, Moras D, Schüle R: Characterization of the DNA-binding and dimerization properties of the nuclear orphan receptor germ cell nuclear factor. Mol Cell Biol. 1999, 19: 690-703.
10. Yan Z, Jetten AM: Characterization of the repressor function of the nuclear orphan receptor retinoid receptor-related testis-associated receptor/germ cell nuclear factor. J Biol Chem. 2000, 275: 10565-10572.
11. Chung AC-K, Katz D, Pereira FA, Jackson KJ, DeMayo FJ, Cooney AJ, O'Malley BW: Loss of orphan receptor germ cell nuclear factor function results in ectopic development of the tail bud and a novel posterior truncation. Mol Cell Biol. 2001, 21: 663-677.
12. Süsens U, Aguiluz JB, Evans RM, Borgmeyer U: The germ cell nuclear factor mGCNF is expressed in the developing nervous system. Dev Neurosci. 1997, 19: 410-420.
13. Morasso MI, Grinberg A, Robinson G, Sargent TD, Mahon KA: Placental failure in mice lacking the homeobox gene Dlx3. Proc Natl Acad Sci USA. 1999, 96: 162-167.
14. Chen F, Cooney AJ, Wang Y, Law SW, O'Malley BW: Cloning of a novel orphan receptor (GCNF) expressed during germ cell development. Mol Endocrinol. 1994, 8: 1434-1444.
15. Katz D, Niederberger C, Slaughter GR, Cooney AJ: Characterization of germ cell-specific expression of the orphan nuclear receptor, germ cell nuclear factor. Endocrinology. 1997, 138: 4364-4372.
16. Zhang YL, Akmal KM, Tsuruta JK, Shang Q, Hirose T, Jetten AM, Kim KH, O'Brien DA: Expression of germ cell nuclear factor (GCNF/RTR) during spermatogenesis. Mol Reprod Dev. 1998, 50: 93-102.
17. Süsens U, Borgmeyer U: Genomic structure of the mouse germ cell nuclear factor (GCNF) gene. Genome Biol. 2000, 1: research0006.1-0006.3.
18. Heinzer C, Süsens U, Schmitz TP, Borgmeyer U: Retinoids induce differential expression and DNA binding of the mouse germ cell nuclear factor in P19 embryonal carcinoma cells. Biol Chem. 1998, 379: 349-359.
19. Süsens U, Borgmeyer U: Characterization of the human germ cell nuclear factor gene. Biochim Biophys Acta. 1996, 1309: 179-182.
20. Schmitz TP, Süsens U, Borgmeyer U: DNA binding, protein interaction and differential expression of the human germ cell nuclear factor. Biochim Biophys Acta. 1999, 1446: 173-180.
21. Greschik H, Schüle R: Germ cell nuclear factor: an orphan receptor with unexpected properties. J Mol Med. 1998, 76: 800-810.
22. Shapiro MB, Senapathy P: RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression. Nucleic Acids Res. 1987, 15: 7155-7174.
23. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.
24. NCBI: Basic BLAST. [http://www.ncbi.nlm.nih.gov/blast/blast.cgi]
25. NCBI: Genome Sequencing - BLAST the Human Genome. [http://www.ncbi.nlm.nih.gov/genome/seq/HsBlast.html]
26. The AceView at the NCBI. [http://www.ncbi.nlm.nih.gov/AceView/]
27. CELERA: Consensus Human Genome. [http://public.celera.com]