The GOLD domain, a novel protein module involved in Golgi function and secretion
© Anantharaman and Aravind, licensee BioMed Central Ltd 2002
Received: 3 January 2002
Accepted: 7 March 2002
Published: 24 April 2002
Members of the p24 (p24/gp25L/emp24/Erp) family of proteins have been shown to be critical components of the coated vesicles that are involved in the transportation of cargo molecules from the endoplasmic reticulum to the Golgi complex. The p24 proteins form hetero-oligomeric complexes and are believed to function as receptors for specific secretory cargo.
Using sensitive sequence-profile analysis methods, we identified a novel β-strand-rich domain, the GOLD (Golgi dynamics) domain, in the p24 proteins and several other proteins with roles in Golgi dynamics and secretion. This domain is predicted to mediate diverse protein-protein interactions. Other than in the p24 proteins, the GOLD domain is always found combined with lipid- or membrane-association domains such as the pleckstrin homology (PH), Sec14p and FYVE domains.
The identification of the GOLD domain could aid in directed investigation of the role of the p24 proteins in the secretion process. The newly detected group of GOLD-domain proteins, which might simultaneously bind membranes and other proteins, points to the existence of a novel class of adaptors that could have a role in the assembly of membrane-associated complexes or in regulating assembly of cargo into membranous vesicles.
The Golgi complex is the central secretory organelle of most eukaryotic cells and consists of membranous stacks called cisternae [1,2]. Secreted proteins are synthesized in the endoplasmic reticulum (ER) and are specifically packaged into vesicles that bud off from the ER in a GTP-dependent process [3,4]. These lipid vesicles are coated with the COPII coat-protein complex and are equipped with the ATP-dependent vesicle-fusion apparatus. They carry the secretory cargo to the cis surface of the Golgi complex, with which they fuse, delivering the cargo. A second type of vesicle, coated by the COPI coat-protein complex, is part of a retrograde pathway that buds off the Golgi membrane and returns proteins that are not targeted for secretion to the ER [3,4].
Studies on the secretory system in crown-group eukaryotes (plants, animals and fungi) have uncovered a family of proteins, the p24 (p24/gp25L/emp24/Erp) family, that has an important role in cargo selection and packaging into COPII-coated vesicles [5,6,7,8]. These proteins might also function in excluding secreted proteins from COPI-coated retrograde vesicles [9,10]. Members of the p24 family are type I membrane proteins, with a small carboxy-terminal cytoplasmic tail that interacts with the vesicle coat proteins and a globular lumenal region that probably interacts with the cargo [11,12]. They are abundantly distributed on the membranes of the vesicles budding off the ER and the cis Golgi membranes. The p24 proteins belong to at least four distinct subfamilies [8,12] and form hetero-oligomeric complexes that contain at least one member from each subfamily. This heteromerization of the p24 proteins has been shown to require a coiled-coil stretch at the extreme carboxyl terminus of their lumenal regions.
Improved understanding of the p24 family may shed light on the evolution and function of the Golgi apparatus in eukaryotes. With this objective, we conducted a computational sequence analysis of the p24 proteins and show that they contain a conserved globular domain that is also present in several other Golgi and lipid-traffic proteins. We present evidence that this module is likely to serve as a common denominator in protein-protein interactions in several distinct contexts, such as in secretory vesicles and on the Golgi peripheral membrane. The proliferation of this superfamily appears to have been central to the diversification of the eukaryotic secretory apparatus.
Results and discussion
Identification of a conserved domain in p24 and other Golgi proteins
The presence of the GOLD domain at the extreme amino or carboxyl terminus of the Osh3p and animal Sec14 proteins, respectively, allowed us to establish accurate boundaries for it. The domain is typically between 90 and 150 amino acids long and, in the p24 family, it comprises almost the entire lumenal region, with the exception of an α-helical extension of approximately 50 amino acids that precedes the transmembrane segment. Most of the size difference observed in the GOLD-domain superfamily is traceable to a single large low-complexity insert that is seen in some versions of the domain. A secondary-structure prediction for the domain using the PHD program reveals that it is likely to adopt a compact all-β-fold structure with six to seven strands. Most of the sequence conservation is centered on the hydrophobic cores that support these predicted strands. The predicted secondary-structure elements and the size of the conserved core of the domain suggest that it may form a β-sandwich fold with the strands arranged in two β sheets stacked on each other.
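The length range and strand count quoted above can be turned into a crude sanity check on a predicted secondary-structure string. The sketch below is only an illustration: the three-state encoding (E for strand, H for helix, any other character for loop), the minimum segment lengths, and the function names are assumptions, and it does not parse actual PHD output.

```python
import re

def count_strands(ss: str, min_len: int = 3) -> int:
    """Count runs of 'E' (predicted strand) that are at least min_len long."""
    return sum(1 for m in re.finditer(r"E+", ss) if len(m.group()) >= min_len)

def looks_like_gold(ss: str) -> bool:
    """Crude filter: 90-150 residues, six or seven strands, no helix segment.

    The thresholds follow the description in the text; the helix cutoff
    (four or more consecutive 'H') is an arbitrary illustrative choice.
    """
    has_helix = re.search(r"H{4,}", ss) is not None
    return 90 <= len(ss) <= 150 and count_strands(ss) in (6, 7) and not has_helix

# Hypothetical prediction string: six 8-residue strands separated by loops.
toy_prediction = "L" * 5 + ("E" * 8 + "L" * 8) * 6
print(count_strands(toy_prediction), looks_like_gold(toy_prediction))
```

A filter of this kind would miss versions of the domain that carry the large low-complexity insert, so it is better suited to checking a trimmed core alignment than to screening full-length sequences.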
Experimental studies so far on diverse proteins containing GOLD domains point to a role for it in protein-protein interactions. A region of the GCP60 molecule that rather precisely encompasses the GOLD domain has been shown to bind to the cytoplasmic region of the Golgi membrane protein Giantin. Cross-linking experiments have suggested that the p24 proteins interact directly with the cargo molecules that are present in the lumen of the COPII-coated vesicles and that they are, accordingly, cargo receptors. However, yeast deletion mutants lacking all the p24 proteins grow similarly to wild type, although they show delays in translocation of a subset of cargo molecules such as invertase and Gas1p from the ER to the Golgi, and increased secretion of resident ER proteins. Certain members of the p24 family from vertebrates have also been shown to bind to specific ligands such as the interleukin-1 receptor-like molecule T1/ST2 and might aid its proper expression on the cell surface. These observations suggest that the p24 subset of the GOLD domains probably functions as a set of discriminators that selectively interact with particular proteins to influence their loading into vesicles. The GOLD domains show considerable variability in some of the loops that are predicted to extrude from the core β-sandwich-like structure (Figure 1). These loops might form exposed surfaces that provide the GOLD domains with the discriminatory capacity necessary for their interactions with diverse ligands.
The domain architecture and evolutionary history of GOLD-domain proteins: functional implications
Thus, all GOLD-domain proteins can be divided into two architectural categories: the p24-like category, in which the GOLD domains project into the lumen, anchored in the membrane by the membrane-spanning helix (category 1); and proteins in which the GOLD domain occurs at the extreme amino or carboxyl terminus, with additional domains that are known to interact with lipid membranes (category 2) (Figure 2). GCP60, which is peripherally associated with the Golgi membrane, is one of the proteins in the second category that has been experimentally characterized. It has been shown that overexpression of a region of this protein encompassing the GOLD domain caused disassembly of the Golgi structure and abrogated protein transport from the ER to the Golgi.
These observations can be accommodated by two (not mutually exclusive) hypotheses regarding the functions of these proteins. The GOLD proteins belonging to the second architectural category could function as double-headed adaptors that interact with both a specific protein (via the GOLD domain) and different cellular lipid membranes. Thus, GCP60 and GOLD proteins with analogous architectures could help in the assembly of vesicular or Golgi-membrane-associated protein complexes by tethering specific proteins to the membranes, with the GOLD domain binding the protein targets and the lipid-binding domain binding the membrane. Alternatively, at least some of the category-2 proteins could function as a previously unrecognized class of vesicular cargo-loading molecules that associate with the membrane via their lipid-binding domains and deliver their protein ligands via the GOLD domain. The observation that deletion mutants lacking all the p24 proteins still show normal trafficking of certain proteins such as carboxypeptidase Y suggests that there are some protein-trafficking pathways that are unaffected by their absence. Thus, the GOLD-domain proteins of category 2 may have a specific role in regulating the secretion of molecules that are not affected by the p24 proteins. The hetero-oligomerization of the p24 proteins via the coiled-coil regions carboxy-terminal to the GOLD domain seems to help in generating combinatorial diversity for their interactions with multiple ligands. The presence of extensive coiled-coil segments in some of the category-2 GOLD-domain proteins, such as FYCO1, suggests that they might also form oligomers, like the p24 proteins.
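The two architectural categories described above can be written down as a simple rule over an ordered list of domain labels. The following is a minimal sketch under stated assumptions: the label strings, the set of lipid-binding domains (including the oxysterol-binding domain), and the function name are illustrative choices rather than part of the original analysis.

```python
from typing import List

# Assumed labels for lipid- or membrane-association domains named in the text.
LIPID_DOMAINS = {"PH", "Sec14", "FYVE", "oxysterol_binding"}

def gold_category(domains: List[str]) -> int:
    """Classify an amino-to-carboxyl ordered list of domain labels.

    Returns 1 for the p24-like architecture (GOLD followed by a
    transmembrane helix, no lipid-binding domain), 2 for a terminal GOLD
    domain combined with a lipid-binding domain, and 0 otherwise.
    """
    if "GOLD" not in domains:
        return 0
    idx = domains.index("GOLD")
    has_lipid_domain = bool(LIPID_DOMAINS & set(domains))
    if "TM" in domains[idx + 1:] and not has_lipid_domain:
        return 1
    at_terminus = idx == 0 or idx == len(domains) - 1
    if at_terminus and has_lipid_domain:
        return 2
    return 0

# Architectures as described in the text (label spelling is hypothetical).
print(gold_category(["signal_peptide", "GOLD", "coiled_coil", "TM"]))  # p24-like
print(gold_category(["GOLD", "PH", "oxysterol_binding"]))              # Osh3p-like
```

Returning 0 for anything that fits neither pattern keeps unexpected architectures visible instead of forcing them into one of the two categories.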
Similarity-based clustering and phylogenetic analysis divide the GOLD domains into two primary divisions that precisely mirror the two categories established on the basis of domain architectures (Figure 2). This division was also supported by a synapomorphic (shared derived) feature in the form of two conserved cysteines, which is restricted to the p24 family (category-1 proteins). Likewise, the presence of a specific insert between strands 1 and 2 with a characteristic conserved tryptophan serves as a synapomorphic feature for category-2 GOLD domains (Figure 1). An analysis of the phyletic patterns suggests that the p24 family had already differentiated into at least four distinct subfamilies in the common ancestor of plants, animals and fungi. The detection of multiple members of the p24 family in the early branching eukaryotes such as Cryptosporidium parvum and kinetoplastids suggests that some of this diversification was probably already under way early in eukaryotic evolution. Within the eukaryotic crown group, we obtained evidence of specific instances of duplications and gene losses that are restricted to particular lineages. The most striking case is seen in Arabidopsis thaliana, which appears to have proliferated the Erv25 subfamily (five to six members), but lacks the Erp2p and Erp5p subfamilies. The second major family of GOLD domains (category 2) is so far only attested in the crown group. In fungi, this group is typified by Saccharomyces cerevisiae Osh3p, which combines an amino-terminal GOLD domain with PH and oxysterol-binding domains. The greatest architectural diversity of this group is seen in animals (Figure 2), suggesting that there was increased proliferation and domain shuffling among these proteins concomitant with the evolutionary emergence of the animals. This might correlate with the increased complexity of animal-specific secretory functions.
A novel β-strand-rich domain was identified in numerous eukaryotic proteins, including the p24 proteins, which appear to have a function related to the Golgi complex, secretion or protein sorting. These GOLD domains are predicted to be involved in specific protein-protein interactions. Other than the p24 proteins, GOLD domains are present in several proteins where they occur at the extreme termini and are combined with diverse membrane- or lipid-binding domains. These proteins are predicted to be double-headed adaptors that may help in the assembly of protein complexes on membranes or in the packaging of specific cargo molecules in membranous vesicles. The identification of the GOLD domain may help in a directed dissection of p24-family function and provide novel candidate molecules for experimental studies on secretion and sorting.
Materials and methods
The Non-Redundant (NR) database of protein sequences (National Center for Biotechnology Information, NIH, Bethesda) was searched using the BLASTP program. Profile searches were conducted using the PSI-BLAST program with either a single sequence or an alignment used as the query, with a profile-inclusion expectation (E)-value threshold of 0.01, and were iterated until convergence [13,28]. Previously known conserved protein domains were detected using the corresponding PSI-BLAST-derived position-specific scoring matrices (PSSMs). The PSSMs were prepared by choosing one or more starting queries (seeds) for a set of most frequently encountered domains (see reference for details) and running them against the NR database until convergence, with the -C option of PSI-BLAST used to save the PSSM. We ensured that at convergence no false positives were included in the profiles. This profile database can be downloaded from the IMPALA ftp site (ftp://ftp.ncbi.nih.gov/pub/impala/) or used on the internet via the RPS-BLAST program. All globular segments of proteins that did not map to domains with previously constructed PSSMs were searched individually using PSI-BLAST to detect any additional domains that may have been overlooked.
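The phrase "iterated until convergence" can be made concrete with a small sketch of the control loop: the profile is rebuilt from every sequence that passes the inclusion threshold, and the search stops once a round adds no new sequence. This is not PSI-BLAST itself. The `search` callable, the toy hit table, the sequence identifiers and the E-values are all hypothetical stand-ins for a real database search.

```python
from typing import Callable, Dict, Set

Hits = Dict[str, float]  # sequence identifier -> E-value

def iterate_to_convergence(search: Callable[[Set[str]], Hits],
                           seed: str,
                           inclusion_evalue: float = 0.01,
                           max_rounds: int = 50) -> Set[str]:
    """Repeat a profile search until no new sequence passes the threshold."""
    included = {seed}
    for _ in range(max_rounds):
        hits = search(included)
        passing = {sid for sid, e in hits.items() if e < inclusion_evalue}
        updated = included | passing
        if updated == included:  # converged: nothing new was added
            return included
        included = updated
    raise RuntimeError("search did not converge within max_rounds")

# Hypothetical hit table: which sequences each profile member can detect.
TOY_HITS = {
    "seqA": {"seqB": 0.001, "noise": 5.0},
    "seqB": {"seqC": 0.005},
    "seqC": {},
}

def toy_search(profile_members: Set[str]) -> Hits:
    """Stand-in for a database search: best E-value seen by any member."""
    hits: Hits = {}
    for member in profile_members:
        for sid, evalue in TOY_HITS.get(member, {}).items():
            hits[sid] = min(evalue, hits.get(sid, float("inf")))
    return hits

print(sorted(iterate_to_convergence(toy_search, "seqA")))
```

In the toy run, seqC is only reachable after seqB has joined the profile, which is the behaviour that makes iterative profile searches more sensitive than a single pass.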
Multiple alignments were constructed using the T-Coffee program, followed by manual correction based on the PSI-BLAST results. Protein secondary structure was predicted using a multiple alignment as the input for the PHD program. Signal peptides were predicted using the SIGNALP program [33,34] and the transmembrane regions were predicted using the TOPRED program. Phylogenetic analysis was carried out using the maximum likelihood, neighbor-joining and least-squares methods [36,37]. Briefly, this process involved the construction of a least-squares tree using the FITCH program or a neighbor-joining tree using the NEIGHBOR program (both from the Phylip package), followed by local rearrangement using the Protml program of the Molphy package to arrive at the maximum likelihood (ML) tree. The statistical significance of various nodes of this ML tree was assessed using the relative estimate of logarithmic likelihood bootstrap (Protml RELL-BP) with 10,000 replicates.
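The idea behind a RELL bootstrap is to resample precomputed per-site log-likelihoods rather than re-estimating each tree for every replicate. The sketch below illustrates that resampling step for whole candidate trees. It is not the Protml implementation, which reports support for individual nodes, and the function name, input layout and seed handling are assumptions.

```python
import random
from typing import List

def rell_bootstrap(site_lnl: List[List[float]],
                   replicates: int = 10000,
                   seed: int = 0) -> List[float]:
    """Estimate RELL bootstrap proportions for a set of candidate trees.

    site_lnl[t][s] is the log-likelihood of alignment site s under tree t.
    Each replicate resamples sites with replacement, sums the resampled
    log-likelihoods per tree, and credits the tree with the highest total.
    """
    rng = random.Random(seed)
    n_trees = len(site_lnl)
    n_sites = len(site_lnl[0])
    wins = [0] * n_trees
    for _ in range(replicates):
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        totals = [sum(tree[s] for s in sample) for tree in site_lnl]
        best = max(range(n_trees), key=totals.__getitem__)
        wins[best] += 1
    return [w / replicates for w in wins]

# Hypothetical per-site log-likelihoods for two trees over three sites.
support = rell_bootstrap([[-1.0, -2.0, -1.5],
                          [-1.5, -2.5, -2.0]], replicates=200)
print(support)
```

Because no likelihood is recomputed inside the loop, the cost per replicate is a few sums, which is what makes a replicate count such as 10,000 practical.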
We thank Eugene Koonin for providing useful comments on the manuscript.
References
- Alberts B, Bray D, Lewis J, Raff M, Roberts K, Watson JD: Molecular Biology of the Cell. New York and London: Garland Publishing; 1999.
- Hong W: Protein transport from the endoplasmic reticulum to the Golgi apparatus. J Cell Sci. 1998, 111: 2831-2839.
- Bannykh SI, Nishimura N, Balch WE: Getting into the Golgi. Trends Cell Biol. 1998, 8: 21-25. 10.1016/S0962-8924(97)01184-7.
- Bannykh SI, Rowe T, Balch WE: The organization of endoplasmic reticulum export complexes. J Cell Biol. 1996, 135: 19-35.
- Wada I, Rindress D, Cameron PH, Ou WJ, Doherty JJ, Louvard D, Bell AW, Dignard D, Thomas DY, Bergeron JJ: SSR alpha and associated calnexin are major calcium binding proteins of the endoplasmic reticulum membrane. J Biol Chem. 1991, 266: 19599-19610.
- Schimmoller F, Singer-Kruger B, Schroder S, Kruger U, Barlowe C, Riezman H: The absence of Emp24p, a component of ER-derived COPII-coated vesicles, causes a defect in transport of selected proteins to the Golgi. EMBO J. 1995, 14: 1329-1339.
- Stamnes MA, Craighead MW, Hoe MH, Lampen N, Geromanos S, Tempst P, Rothman JE: An integral membrane component of coatomer-coated transport vesicles defines a family of proteins involved in budding. Proc Natl Acad Sci USA. 1995, 92: 8011-8015.
- Marzioch M, Henthorn DC, Herrmann JM, Wilson R, Thomas DY, Bergeron JJ, Solari RC, Rowley A: Erp1p and Erp2p, partners for Emp24p and Erv25p in a yeast p24 complex. Mol Biol Cell. 1999, 10: 1923-1938.
- Fiedler K, Veit M, Stamnes MA, Rothman JE: Bimodal interaction of coatomer with the p24 family of putative cargo receptors. Science. 1996, 273: 1396-1399.
- Ciufo LF, Boyd A: Identification of a lumenal sequence specifying the assembly of Emp24p into p24 complexes in the yeast secretory pathway. J Biol Chem. 2000, 275: 8382-8388. 10.1074/jbc.275.12.8382.
- Kuehn MJ, Herrmann JM, Schekman R: COPII-cargo interactions direct protein sorting into ER-derived transport vesicles. Nature. 1998, 391: 187-190. 10.1038/34438.
- Dominguez M, Dejgaard K, Fullekrug J, Dahan S, Fazel A, Paccaud JP, Thomas DY, Bergeron JJ, Nilsson T: gp25L/emp24/p24 protein family members of the cis-Golgi network bind both COP I and II coatomer. J Cell Biol. 1998, 140: 751-765. 10.1083/jcb.140.4.751.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
- Li H, Degenhardt B, Tobin D, Yao ZX, Tasken K, Papadopoulos V: Identification, localization, and function in steroidogenesis of PAP7: a peripheral-type benzodiazepine receptor- and PKA (RIalpha)-associated protein. Mol Endocrinol. 2001, 15: 2211-2228.
- Rost B, Sander C: Prediction of protein secondary structure at better than 70% accuracy. J Mol Biol. 1993, 232: 584-599. 10.1006/jmbi.1993.1413.
- Sohda M, Misumi Y, Yamamoto A, Yano A, Nakamura N, Ikehara Y: Identification and characterization of a novel Golgi protein, GCP60, that interacts with the integral membrane protein Giantin. J Biol Chem. 2001, 276: 45298-45306. 10.1074/jbc.M108961200.
- Muniz M, Nuoffer C, Hauri HP, Riezman H: The Emp24 complex recruits a specific cargo molecule into endoplasmic reticulum-derived vesicles. J Cell Biol. 2000, 148: 925-930. 10.1083/jcb.148.5.925.
- Springer S, Chen E, Duden R, Marzioch M, Rowley A, Hamamoto S, Merchant S, Schekman R: The p24 proteins are not essential for vesicular transport in Saccharomyces cerevisiae. Proc Natl Acad Sci USA. 2000, 97: 4034-4039. 10.1073/pnas.070044097.
- Gayle MA, Slack JL, Bonnert TP, Renshaw BR, Sonoda G, Taguchi T, Testa JR, Dower SK, Sims JE: Cloning of a putative ligand for the T1/ST2 receptor. J Biol Chem. 1996, 271: 5784-5789. 10.1074/jbc.271.10.5784.
- Haslam RJ, Koide HB, Hemmings BA: Pleckstrin domain homology. Nature. 1993, 363: 309-310. 10.1038/363309b0.
- Mayer BJ, Ren R, Clark KL, Baltimore D: A putative modular domain present in diverse signaling proteins. Cell. 1993, 73: 629-630.
- Aravind L, Neuwald AF, Ponting CP: Sec14p-like domains in NF1 and Dbl-like proteins indicate lipid regulation of Ras and Rho signaling. Curr Biol. 1999, 9: R195-R197. 10.1016/S0960-9822(00)80039-1.
- Stenmark H, Aasland R: FYVE-finger proteins - effectors of an inositol lipid. J Cell Sci. 1999, 112: 4175-4183.
- Callebaut I, de Gunzburg J, Goud B, Mornon JP: RUN domains: a new family of domains involved in Ras-like GTPase signaling. Trends Biochem Sci. 2001, 26: 79-83. 10.1016/S0968-0004(00)01730-8.
- Mari M, Macia E, Le Marchand-Brustel Y, Cormont M: Role of the FYVE finger and the RUN domain for the subcellular localization of Rabip4. J Biol Chem. 2001, 276: 42501-42508. 10.1074/jbc.M104885200.
- Guzman-Rojas L, Sims JC, Rangel R, Guret C, Sun Y, Alcocer JM, Martinez-Valdez H: PRELI, the human homologue of the avian px19, is expressed by germinal center B lymphocytes. Int Immunol. 2000, 12: 607-612. 10.1093/intimm/12.5.607.
- Nakai M, Takada T, Endo T: Cloning of the YAP19 gene encoding a putative yeast homolog of AP19, the mammalian small chain of the clathrin-assembly proteins. Biochim Biophys Acta. 1993, 1174: 282-284. 10.1016/0167-4781(93)90198-M.
- Aravind L, Koonin EV: Gleaning non-trivial structural, functional and evolutionary information about proteins by iterative database searches. J Mol Biol. 1999, 287: 1023-1040. 10.1006/jmbi.1999.2653.
- Schaffer AA, Wolf YI, Ponting CP, Koonin EV, Aravind L, Altschul SF: IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices. Bioinformatics. 1999, 15: 1000-1011. 10.1093/bioinformatics/15.12.1000.
- IMPALA ftp site. [ftp://ftp.ncbi.nih.gov/pub/impala/]
- NCBI Conserved domain database search. [http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi]
- Notredame C, Higgins DG, Heringa J: T-Coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol. 2000, 302: 205-217. 10.1006/jmbi.2000.4042.
- Nielsen H, Engelbrecht J, Brunak S, von Heijne G: A neural network method for identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Int J Neural Syst. 1997, 8: 581-599. 10.1142/S0129065797000537.
- Nielsen H, Engelbrecht J, Brunak S, von Heijne G: Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 1997, 10: 1-6. 10.1093/protein/10.1.1.
- von Heijne G: Membrane protein structure prediction: hydrophobicity analysis and the 'positive inside' rule. J Mol Biol. 1992, 225: 487-494.
- Felsenstein J: Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods. Methods Enzymol. 1996, 266: 418-427.
- Hasegawa M, Kishino H, Saitou N: On the maximum likelihood method in molecular phylogenetics. J Mol Evol. 1991, 32: 443-445.
- Felsenstein J: PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics. 1989, 5: 164-166.