Characterizing glycosylation pathways
Genome Biology volume 2, Article number: reviews0004.1 (2001)
Numerous factors that influence cell-surface carbohydrate composition remain to be elucidated. The combination of novel biochemical and metabolism-based approaches with emerging genomic methods promises to accelerate efforts to understand glycosylation.
The surface of a mammalian cell is decorated with complex carbohydrates. These sugars, known individually as glycans and collectively as the glycocalyx, are biosynthetically assembled from simple monosaccharides into a diverse array of oligo- and polysaccharides (Figure 1). Glycans mediate a cell's communications with the outside world [1,2] and play a crucial role in the events at fertilization that initiate the life of a multicellular organism. Carbohydrates continue to play a critical role throughout development and contribute to the healthy life of the mature organism [4,5]. Abnormalities in glycan expression are implicated as causative or incidental factors in both relatively rare congenital diseases and widespread acquired diseases, such as cancer. The recent revelation that the human genome comprises fewer genes than originally thought has further highlighted the importance of post-translational modifications, such as glycosylation, as determinants of higher eukaryotic functions [8,9].
Deciphering the molecular details of oligosaccharide synthesis and biological activity is one of the major challenges now confronting the cell biologist. Unlike other structural biomolecules such as proteins and nucleic acids, whose synthesis is template-driven and well defined at a molecular level, oligosaccharides are not primary gene products, and an understanding of their biosynthesis remains rudimentary. This review briefly describes current understanding of glycan biosynthesis and the methods that have been used to garner this information. It then addresses the exciting prospects that emerging genomic and metabolic techniques, coupled with established methodologies, offer for rapid discovery of the glycosylation processes of a cell.
Conversion of monosaccharides into complex oligosaccharides
The common sugars, requisite co-substrates, and many of the enzymes necessary for the synthesis of complex carbohydrates are already known. The availability of the complete human genome sequence ensures that the remaining enzymes involved will be identified soon. What remains mysterious is how these molecular players work in concert to convert a few simple monosaccharides into the exact pattern of complex cell-surface glycans that gives each cell type a unique and reproducible identity (Figure 1). This gap in knowledge not only precludes a detailed understanding of how these molecules are regulated during the healthy lifespan of an organism but, more importantly, also hinders our ability to intervene in pathological situations. Answering questions about glycan biosynthesis will lead to insights into basic biological processes and will also open the door to therapeutic intervention in disease processes.
For the purposes of this review, the metabolic pathways responsible for endowing each cell with its unique complement of oligosaccharides are divided into two stages (Figure 2a). The early steps involve the conversion of monosaccharides obtained by the cell from dietary sources, or from recycling and salvage processes (Figure 1), into nucleotide-sugar donors. This stage typically entails the phosphorylation of one or more of the hydroxyl groups of the monosaccharides. In addition, it often involves the inversion of stereocenters to convert one sugar to a related epimer. In some cases, the sequential action of several enzymes is required to transform one monosaccharide (such as ManNAc; see Figure 1 for abbreviations used in sugar names) into a considerably different sugar (such as sialic acid; see Figure 2b).
Once a phosphorylated monosaccharide of the desired stereochemical configuration is achieved, conversion to the UDP, GDP, or CMP analog (depending on the sugar) creates a nucleotide-sugar donor. The nucleotide functions as a high-energy leaving group to facilitate the assembly of the monosaccharides into complex carbohydrates by the stepwise action of a group of enzymes, known as glycosyltransferases, most of which reside in the Golgi apparatus (Figure 2c). Each step, although seemingly straightforward, involves the correct choice of one of several possible glycosyltransferases with similar, but subtly different, substrate specificities. The exact composition of the final product is determined, at least in part, by the route of transit the growing oligosaccharide follows through the secretory pathway, allowing (or avoiding) contacts with particular glycosyltransferases. As a result, one protein can be decorated by a particular oligosaccharide while a different protein, another copy of the first protein, or even an alternate glycosylation site on the same protein, can be endowed with distinct oligosaccharides (Figures 1 and 2c).
A fruitful method for determining the internal workings of a cell is the isolation and analysis of subpopulations of cells with distinct properties. Molecular and genetic characterization of such cells has illuminated the function of enzymes responsible for a wide range of cellular characteristics, including oligosaccharide biosynthesis. Continued progress in unraveling glycosylation depends on judicious selection of cell populations with variations in surface glycan expression. Variation in the output of the glycosylation pathways can result from endogenous regulatory mechanisms, abnormalities associated with specific diseases, or external inputs (Figure 3). We next discuss the benefits and limitations of using each of these sources of variation in oligosaccharide composition for further study of glycan biosynthesis.
Endogenous regulatory mechanisms
The glycoconjugate diversity found on different cell types in a healthy organism is a consequence of intracellular metabolic mechanisms responding, in many cases, to external stimuli (Figure 3). In theory, the entire human genome sequence, coupled with methods to measure the expression level of each gene, provides tools for thorough study of glycan biosynthesis. For example, microarray analysis of cell populations selected at various stages in development, or derived from different tissues, promises to explain the distinct oligosaccharide expression pattern of each cell type. Despite the fact that different cell populations also experience gross morphological changes involving many genes that are not related to glycosylation, the development of sophisticated computational algorithms should allow glycosylation-specific effects on gene expression to be deconvoluted from irrelevant secondary effects. This type of genome-wide approach will provide an elegant picture of how the internal glycosylation processes of a cell are reflected on the cell surface as a diverse array of oligosaccharides. Nevertheless, such an approach, by itself, may leave a number of key issues in glycan biosynthesis unresolved.
Microarray studies have the remarkable ability to describe the interplay of numerous enzymes comprising multiple metabolic pathways and reveal system-wide perturbations in gene expression. But, unfortunately, the overwhelming volume of data generated from these studies presents difficulties in distinguishing a primary molecular defect from ensuing secondary (and higher order) effects. As a consequence, subtle changes affecting enzymes that play key regulatory roles may be obscured amid quantitatively larger, but functionally less significant, alterations. The molecular defects responsible for a pathology may be similarly concealed. If so, it will be difficult to identify candidate enzymes as targets for pharmaceutical intervention, or for replacement by gene therapy. Accordingly, a more traditional study of disease states leading to the exact identification of a molecular defect responsible for a pathological phenotype remains a sensible experimental route to complement more recent methods to assess genome-wide changes in gene expression.
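The ranking problem described above can be illustrated with a minimal sketch. The gene names and expression values below are hypothetical and chosen only for illustration; the point is that ordering genes by the magnitude of their log2 fold change places a modestly altered regulatory enzyme below larger, secondary changes.

```python
import math


def log2_fold_change(control: float, sample: float) -> float:
    """Return the log2 ratio of sample to control expression."""
    return math.log2(sample / control)


def rank_by_magnitude(expression: dict) -> list:
    """Sort genes by absolute log2 fold change, largest first.

    `expression` maps a gene name to a (control, sample) pair of
    expression levels in arbitrary units.
    """
    changes = {
        gene: log2_fold_change(control, sample)
        for gene, (control, sample) in expression.items()
    }
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical measurements (not taken from any real data set).
expression = {
    "secondary_response_gene_A": (100.0, 800.0),
    "secondary_response_gene_B": (400.0, 100.0),
    "key_regulatory_enzyme": (200.0, 260.0),
}

ranked = rank_by_magnitude(expression)
for gene, change in ranked:
    print(f"{gene}: {change:+.2f}")
```

In this toy data set the regulatory enzyme, whose altered expression might be the primary defect, appears last in the ranking, which is why a magnitude-based analysis alone may not point to it.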
Glycosylation changes associated with disease states
Abnormalities associated with disease states are major contributors to diversity in glycan expression (Figure 3c). These defects, in many cases, ultimately arise from the dysfunction of an individual enzyme, often as the consequence of a single amino-acid mutation. Determination of the molecular defect responsible for a disease phenotype can therefore reveal the normal function of a particular gene product. A fundamental limitation, however, of relying on human-disease or whole-animal models is that the critical role of glycans in fertilization and development means that many glycosylation defects are embryonic-lethal. Although fortunate from the perspective of human health, this situation prevents potentially interesting metabolic perturbations from being observed clinically (or in animal models) and results in an incomplete understanding of the glycosylation processes of a cell.
To circumvent limitations inherent in the study of animal disease models, researchers have found that it is possible to mutagenize large cell populations and select rare-event mutant cell lines that often mimic disease states at a molecular level [12,14]. The experimental advantage of such cell-based 'forward genetics' screens is that requirements for cell-surface glycosylation are significantly less stringent in cell monocultures than for whole organisms. Consequently, cells with potentially embryonic-lethal mutations can survive to be isolated and studied, resulting in an enhanced picture of the glycosylation machinery.
The most straightforward method for selecting a rare cell harboring a glycosylation defect is through the use of toxic lectins. Lectins are proteins that recognize and bind to specific sugar residues when presented in certain conformational contexts within an oligosaccharide. Certain lectins are bifunctional: in addition to a sugar-binding domain they also contain a cytotoxic (usually ribosome-inactivating) domain. Incubation of a large, mutagenized cell population with a toxic lectin rapidly results in the isolation of rare mutant cell subpopulations that lack the targeted binding motif and therefore escape death (Figure 4a). As in animal disease models, molecular characterization leads to the identification of the particular molecular defect responsible for the cell-surface aberration.
The use of toxic lectins, while experimentally easy, can lead to unintended selection outcomes. For example, most of these toxins require retrograde transport into the Golgi or endoplasmic reticulum before they can be translocated into the cytoplasm. Not surprisingly, selection outcomes therefore include transport defects that have little relevance to glycosylation processes. A superior cell selection method, with mutational outcomes more closely targeted to the intended glycosylation pathways, is the use of fluorescently labeled, non-toxic lectins (or carbohydrate-specific antibodies) coupled with a cell-sorting method based on flow cytometry or magnetic particles (Figure 4b). These sophisticated, high-throughput methods allow the selection and subsequent propagation of living cells, thereby avoiding the multiple mechanisms that cells can exploit to reduce the deleterious effects of exposure to toxins. Also, such methods permit both positive and negative selection for the desired glycosylation phenotype (Figure 4b).
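The arithmetic behind such an enrichment can be sketched as follows. This is a simplified illustration rather than a description of any published protocol: the survival probabilities and the starting mutant frequency are assumed values, and the model ignores cell growth between rounds.

```python
def mutant_fraction_after_selection(
    initial_fraction: float,
    wild_type_survival: float,
    mutant_survival: float,
    rounds: int = 1,
) -> float:
    """Return the fraction of mutant cells after repeated lectin selection.

    Each round, every cell survives independently with a probability that
    depends only on whether it displays the lectin-binding motif
    (wild type) or lacks it (mutant).
    """
    fraction = initial_fraction
    for _ in range(rounds):
        surviving_mutants = fraction * mutant_survival
        surviving_wild_type = (1.0 - fraction) * wild_type_survival
        fraction = surviving_mutants / (surviving_mutants + surviving_wild_type)
    return fraction


# Assumed values: one mutant per million cells, 0.01% of wild-type cells
# escape the toxin, and 90% of mutants survive handling.
for n in (1, 2, 3):
    enriched = mutant_fraction_after_selection(1e-6, 1e-4, 0.9, rounds=n)
    print(f"after {n} round(s): mutant fraction = {enriched:.4f}")
```

Under these assumptions a single round leaves the mutants as a small minority of survivors, whereas a second round makes them the dominant population, which is consistent with the rapid isolation described above.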
A final concern with current selection techniques is that mutational outcomes are biased toward the later steps in glycoconjugate biosynthesis, typically either the Golgi transporter or glycosyltransferase stages (Figure 2c) [16,17]. As illustrated in Figure 2b, the earlier stages of glycosylation involve multiple interconversions between structurally related monosaccharides. Many early-stage defects can be 'masked' by re-adjustment of metabolic flux through these intersecting pathways. Clearly, experimental access to such masked mutations is crucial to gaining a complete understanding of the glycosylation processes of a cell. An emerging metabolism-based method that we have developed has the potential to illuminate this class of masked molecular abnormalities.
Externally supplied substrates can illuminate 'masked' metabolic defects
External inputs can modulate the glycosylation pathways of a cell (Figure 3c). As mentioned above, extracellular signals can trigger changes in oligosaccharide biosynthesis by as-yet poorly understood regulatory processes. External manipulation of glycosylation can also be achieved by the direct delivery of small-molecule metabolites into biosynthetic pathways [18,19]. For example, compounds as simple as ammonia can have a profound effect on the display of cell-surface carbohydrate epitopes. A substrate-based approach can establish linkages, many of which may be unanticipated, between various components of metabolic pathways. Also, in contrast to changes effected by the manipulation of enzymes, substrate-based intervention is easily reversible, and it is amenable to quantitative analysis by altering the concentration of the externally delivered agent.
The substrate-based approach that we have pursued involves the interception of a pathway with an unnatural analog of a metabolic intermediate. Specifically, the sialic acid pathway is capable of enzymatically processing unnatural analogs (Figure 5). N-acetylmannosamine (ManNAc) derivatives can be converted to their corresponding cell-surface-displayed sialic acid counterparts. Utilization of an analog such as N-levulinoylmannosamine (ManLev) results in the incorporation of a unique chemical tag, the ketone functionality, into the cell-surface glycoconjugates in the form of the modified sialic acid SiaLev (Figure 5). The ability to react ketone-specific probes, such as biotin hydrazide, with the engineered sialic acid allows the development of cell selection schemes that exploit the ability of ManLev to successfully transit the entire pathway (Figure 5). Consequently, mutations that either enhance or diminish the ability of the unnatural substrate to gain cell-surface expression are detectable throughout the entire length of, or even upstream of, the portion of the pathway traversed by the analog.
The sialic acid pathway provides an illustration of the molecular nature of early-stage 'masked' mutations that can be illuminated by an unnatural substrate-based approach (Figure 6). One of the mutational outcomes yielded by ManLev-based selection in human Jurkat (T-lymphoma derived) cells is a mutant form of the UDP-GlcNAc 2-epimerase. The mutant enzyme is refractory to allosteric feedback inhibition because of loss of binding of the downstream metabolite, CMP-Sia (Figure 6, 1). The causative molecular defect, a single amino-acid substitution identical to that found in the inborn human disease sialuria, results in overproduction of ManNAc, thereby competitively excluding ManLev from the pathway and abolishing SiaLev expression on the cell surface. Because flux of the natural substrate, ManNAc, continues through the pathway, there is no change in cell-surface glycan expression in the absence of ManLev. Consequently, this molecular defect could not have been isolated using established lectin or antibody approaches, and it demonstrates one advantage of the unnatural substrate-based method.
Another advantage of substrate-based metabolic selection is the ability to rapidly compile comprehensive libraries of mutations. To continue the sialuria example, only three mutations have been characterized from human patients, because the disease is rare. In the absence of structural characterization, this limited data set led to the tentative suggestion that the mutation affected a regulatory domain in UDP-GlcNAc 2-epimerase. The high-throughput nature of the substrate-based approach allowed a large set of mutations to be obtained rapidly, and the resulting array of mutations has firmly established the existence of the putative regulatory domain.
Another early-stage masked molecular abnormality that is only accessible by using the metabolic selection method is a defect in sialic acid synthase (Figure 6, 2). In normal cell culture conditions, this mutation would be complemented by intake of sialic acid from the serum in culture media. In an animal model, the inability to produce sialic acid is likely to be embryonic-lethal because of the crucial role of this sugar in development. A substrate-based approach is therefore the only strategy capable of detecting this abnormality in a 'forward genetics' screen. Another important feature of the substrate-based approach is its suitability for selection of gain-of-function or overexpression mutants. To give a specific example, ManLev-based selection produced mutant Jurkat cells that had begun to express polysialic acid (poly-Sia), as a consequence of upregulation of NCAM expression (Figure 6c). This abnormality parallels changes in cell-surface carbohydrates observed in highly metastatic cancers.
In the future, application of metabolic substrate-based forward genetics methods will extend beyond the sialic acid biosynthetic pathway. Our recent demonstration that cells can incorporate 'ketoGal', a ketone-containing analog of GalNAc, into cell-surface glycans has opened the door for selection strategies that explore molecular defects affecting internal positions of a complex polysaccharide chain. Together, selection strategies based on unnatural substrates that target the terminal sialic acid residues (accessed by ManLev) and the position proximal to the peptide backbone in O-linked glycoproteins (accessed by ketoGal) promise to facilitate the identification of new genes and regulatory control points that impinge on the target pathways.
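The logic of this 'masked' defect can be made concrete with a toy model. The sketch below is an illustrative simplification, not a measured kinetic scheme: it assumes a simple hyperbolic form for feedback inhibition of the epimerase by CMP-Sia, treats the endogenous ManNAc supply as proportional to the epimerase rate, and assumes that downstream enzymes do not discriminate between ManNAc and ManLev. All numerical values are arbitrary.

```python
import math


def epimerase_rate(vmax: float, cmp_sia: float, ki: float) -> float:
    """ManNAc production rate under feedback inhibition by CMP-Sia.

    Passing ki = math.inf models a mutant enzyme that no longer binds
    CMP-Sia, so the rate stays at vmax regardless of CMP-Sia level.
    """
    return vmax / (1.0 + cmp_sia / ki)


def sialev_fraction(manlev_supply: float, mannac_supply: float) -> float:
    """Fraction of surface sialic acid carrying the ketone tag, assuming
    both precursors compete equally for entry into the pathway."""
    return manlev_supply / (manlev_supply + mannac_supply)


# Arbitrary units, assumed for illustration only.
VMAX, CMP_SIA, MANLEV = 100.0, 9.0, 10.0

wild_type_mannac = epimerase_rate(VMAX, CMP_SIA, ki=1.0)
mutant_mannac = epimerase_rate(VMAX, CMP_SIA, ki=math.inf)

wild_type_tag = sialev_fraction(MANLEV, wild_type_mannac)
mutant_tag = sialev_fraction(MANLEV, mutant_mannac)

print(f"wild type: ManNAc supply {wild_type_mannac:.0f}, SiaLev fraction {wild_type_tag:.2f}")
print(f"mutant:    ManNAc supply {mutant_mannac:.0f}, SiaLev fraction {mutant_tag:.2f}")
```

In this sketch the mutant displays far less of the ketone tag when ManLev is supplied, yet with no ManLev present both cell types display only natural sialic acid, which is why a lectin- or antibody-based screen would not distinguish them.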
Currently, characterization of molecular abnormalities uncovered by genetic screens proceeds largely by using classical molecular biological and biochemical techniques. This process has been compared to a search for a needle in a haystack, requiring a large input of manpower and resources per gene that is hard to reconcile with the wealth of genetic information now available. Now that the entire human genome is (nearly) in hand, the exact sequence of a gene suspected of harboring a glycosylation-related defect can be determined with relative ease by direct sequencing methods. Nevertheless, with dozens, and potentially hundreds, of enzymes involved in oligosaccharide biosynthesis, a direct sequencing approach remains daunting. Even more problematic are situations in which the cause of a glycosylation abnormality is distant from the enzymes directly involved in carbohydrate metabolism.
The upregulation of NCAM in ManLev-selected Jurkat cells exemplifies molecular defects that do not occur directly in the oligosaccharide biosynthetic pathways yet nevertheless affect the composition of cell-surface sugars. Neo-expression of NCAM, leading to cell-surface polysialic acid display, may result from regulatory perturbations in one or more currently unidentified transcription factors. In this case, no single gene can be readily implicated as a causative factor. Instead, molecular resolution awaits the exploitation of genome-wide methods that have yet to reach maturation. For example, microarray analysis based on comprehensive libraries of single-nucleotide polymorphisms (SNPs) that are now being compiled, and developing proteomic approaches, will shed light on how the molecular players physically interact with each other and their environment.
A final necessity is to translate the effects of the primary molecular defect to the overall metabolism of the cell and functioning of the entire organism. As discussed previously, a genome-wide approach, when used alone, faces the challenge of identifying a primary metabolic defect amidst an overwhelming number of compensatory secondary and tertiary effects. Conversely, traditional approaches used to identify a primary, causative defect typically ignore overall effects on cellular metabolism. As an illustration, we again consider the 'sialuria' cells (Figure 6a). The effects of a single amino-acid mutation in the UDP-GlcNAc 2-epimerase are likely to reverberate throughout the cell, affecting metabolic events far removed from the glycosylation pathways. For example, the increased sialic acid synthesis in these cells requires a large diversion of phosphoenolpyruvate (PEP) from its normal role in energy production. Similarly, production of CMP-Sia diverts CTP from its normal cellular functions that include RNA and DNA synthesis. Nevertheless, these mutant cells continue to thrive, probably as a consequence of rearrangement of flux throughout multiple metabolic pathways, a type of change that can only be assessed through a genome-wide approach.
In conclusion, the ultimate goal of studying glycosylation is to learn the role of glycans in a whole animal. Methods that extend lessons learned from cell models to multicellular organisms are therefore crucial. In the future, with the entire genomes of higher organisms becoming known, the molecular defects observed in single cells can be introduced into the analogous enzymes in stem cells. Already these 'reverse genetic' methods are feasible in the nematode Caenorhabditis elegans [31,32]. With efforts to sequence the mouse genome well underway and the rat genome now in progress, such methods will soon be extended to mammalian systems. When necessary, sophisticated temporal and tissue-specific gene expression systems [34,35] can be used to avoid the problem of embryonic lethality. We conclude that the integration of constantly improving molecular biology techniques with emerging substrate-based and genome-wide approaches promises rapid progress in determining the molecular forces that govern oligosaccharide biosynthesis.
Crocker PR, Feizi T: Carbohydrate recognition systems: functional triads in cell-cell interactions. Curr Opin Struct Biol. 1996, 6: 679-691. 10.1016/S0959-440X(96)80036-4.
Solis D, Jimenez-Barbero J, Kaltner H, Romero A, Siebert HC, von der Lieth CW, Gabius HJ: Towards defining the role of glycans as hardware in information storage and transfer: basic principles, experimental approaches and recent progress. Cells Tissues Organs. 2001, 168: 5-23. 10.1159/000016802.
Rosati F, Capone A, Giovampaola CD, Brettoni C, Focarelli R: Sperm-egg interaction at fertilization: glycans as recognition signals. Int J Dev Biol. 2000, 44: 609-618.
Dennis JW, Granovsky M, Warren CE: Protein glycosylation in development and disease. BioEssays. 1999, 21: 412-421. 10.1002/(SICI)1521-1878(199905)21:5<412::AID-BIES8>3.3.CO;2-X.
Rudd PM, Elliott T, Cresswell P, Wilson IA, Dwek RA: Glycosylation and the immune system. Science. 2001, 291: 2370-2376. 10.1126/science.291.5512.2370.
Schachter H: Diseases with deficiencies in asparagine-linked glycosylation. In Molecular and Cellular Glycobiology. Edited by Fukuda M, Hindsgaul O. Oxford: Oxford University Press. 2000, 1-61.
Sell S: Cancer-associated carbohydrates identified by monoclonal antibodies. Hum Pathol. 1990, 21: 1003-1009.
International Human Genome Sequencing Consortium: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. 10.1038/35057062.
Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al: The sequence of the human genome. Science. 2001, 291: 1304-1351. 10.1126/science.1058040.
Fukuda M: Cell surface carbohydrates: cell type-specific expression. In Molecular and Cellular Glycobiology. Edited by Fukuda M, Hindsgaul O. Oxford: Oxford University Press. 2000, 1-61.
Colley KJ: Golgi localization of glycosyltransferases: more questions than answers. Glycobiology. 1997, 7: 1-13.
Stark GR, Gudkov AV: Forward genetics in mammalian cells: functional approaches to gene discovery. Hum Mol Genet. 1999, 8: 1925-1938. 10.1093/hmg/8.10.1925.
Stanley P, Raju TS, Bhaumik M: CHO cells provide access to novel N-glycans and developmentally regulated glycosyltransferases. Glycobiology. 1996, 6: 695-699.
Stanley P, Ioffe E: Glycosyltransferase mutants: key to new insights in glycobiology. FASEB J. 1995, 9: 1436-1444.
Sandvig K, van Deurs B: Entry of ricin and Shiga toxin into cells: molecular mechanisms and medical perspectives. EMBO J. 2000, 19: 5943-5950. 10.1093/emboj/19.22.5943.
Potvin B, Raju TS, Stanley P: Lec32 is a new mutation in Chinese hamster ovary cells that essentially abrogates CMP-N-acetylneuraminic acid synthetase activity. J Biol Chem. 1995, 270: 30415-30421. 10.1074/jbc.270.27.16107.
Vischer P, Hughes RC: Glycosyl transferases of baby-hamster-kidney (BHK) cells and ricin-resistant mutants. N-glycan biosynthesis. Eur J Biochem. 1981, 117: 275-284.
Bertozzi CR, Kiessling LL: Chemical glycobiology. Science. 2001, 291: 2357-2364. 10.1126/science.1059820.
Goon S, Bertozzi CR: Metabolic substrate engineering as a tool for glycobiology. In Glycobiology: Principles, Synthesis, and Applications. Edited by Wang PG, Bertozzi CR. New York: Marcel Dekker. 2001, 641-674.
Zanghi JA, Mendoza TP, Knop RH, Miller WM: Ammonia inhibits neural cell adhesion molecule polysialation in Chinese hamster ovary and small cell lung cancer cells. J Cell Physiol. 1998, 177: 248-263. 10.1002/(SICI)1097-4652(199811)177:2<248::AID-JCP7>3.0.CO;2-N.
Kayser H, Zeitler R, Kannicht C, Grunow D, Nuck R, Reutter W: Biosynthesis of a nonphysiological sialic acid in different rat organs using N-propanoyl-D-hexosamines as precursors. J Biol Chem. 1992, 267: 16934-16938.
Mahal LK, Yarema KJ, Bertozzi CR: Engineering chemical reactivity on cell surfaces through oligosaccharide biosynthesis. Science. 1997, 276: 1125-1128. 10.1126/science.276.5315.1125.
Yarema KJ, Mahal LK, Bruehl RE, Rodriguez EC, Bertozzi CR: Metabolic delivery of ketone groups to sialic acid residues. Application to cell surface glycoform engineering. J Biol Chem. 1998, 273: 31168-31179. 10.1074/jbc.273.47.31168.
Yarema KJ, Goon S, Bertozzi CR: Metabolic selection of glycosylation mutations in human cells. Nat Biotechnol. 2001,
Seppala R, Lehto VP, Gahl WA: Mutations in the human UDP-N-acetylglucosamine 2-epimerase gene define the disease sialuria and the allosteric site of the enzyme. Am J Hum Genet. 1999, 64: 1563-1569. 10.1086/302411.
Seppala R, Tietze F, Krasnewich D, Weiss P, Ashwell G, Barsh G, Thomas GH, Packman S, Gahl WA: Sialic acid metabolism in sialuria fibroblasts. J Biol Chem. 1991, 266: 7456-7461.
Hang HC, Bertozzi CR: Ketone isosteres of 2-N-acetamidosugars as substrates for metabolic cell surface engineering. J Am Chem Soc. 2001, 123: 1242-1243. 10.1021/ja002962b.
Evans MJ, Carlton MBL, Russ AP: Gene trapping and functional genomics. Trends Genet. 1997, 13: 370-374. 10.1016/S0168-9525(97)01240-7.
Chakravarti A: Single nucleotide polymorphisms: . . . to a future of genetic medicine. Nature. 2001, 409: 822-823. 10.1086/172712.
Stanley P: Functions of carbohydrates revealed by transgenic technology. In Molecular and Cellular Glycobiology. Edited by Fukuda M, Hindsgaul O. Oxford: Oxford University Press. 2000, 169-198.
Ahringer J: Turn to the worm!. Curr Opin Genet Dev. 1997, 7: 410-415. 10.1016/S0959-437X(97)80157-8.
The C. elegans Sequencing Consortium: Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. 1998, 282: 2012-2018. 10.1126/science.282.5396.2012.
Marshall E: Rat genome spurs an unusual partnership. Science. 2001, 291: 1872-1872. 10.1126/science.291.5510.1872.
Ghersa P, Gobert RP, Sattonnet-Roche P, Richards CA, Merlo Pich E, Hooft van Huijsduijnen R: Highly controlled gene expression using combinations of a tissue-specific promoter, recombinant adenovirus and a tetracycline-regulatable transcription factor. Gene Ther. 1998, 5: 1213-1220. 10.1038/sj.gt.3300713.
Vivian JL, Klein WH, Hasty P: Temporal, spatial and tissue-specific expression of a myogenin-lacZ transgene targeted to the Hprt locus in mice. Biotechniques. 1999, 27: 154-162.
Varki A, Cummings R, Esko J, Freeze H, Hart G, Marth J: Essentials of Glycobiology. Cold Spring Harbor: Cold Spring Harbor Laboratory Press. 1999.