Synthetic Glycobiology: Parts, Systems, and Applications

Protein glycosylation, the attachment of sugars to amino acid side chains, can endow proteins with a wide variety of properties of great interest to the engineering biology community. However, natural glycosylation systems are limited in the diversity of glycoproteins they can synthesize, the scale at which they can be harnessed for biotechnology, and the homogeneity of glycoprotein structures they can produce. Here we provide an overview of the emerging field of synthetic glycobiology, the application of synthetic biology tools and design principles to better understand and engineer glycosylation. Specifically, we focus on how the biosynthetic and analytical tools of synthetic biology have been used to redesign glycosylation systems to obtain defined glycosylation structures on proteins for diverse applications in medicine, materials, and diagnostics. We review the key biological parts available to synthetic biologists interested in engineering glycoproteins to solve compelling problems in glycoscience, describe recent efforts to construct synthetic glycoprotein synthesis systems, and outline exemplary applications as well as new opportunities in this emerging space.


■ MOTIVATION AND SCOPE
Synthetic biology has made great strides in engineering living systems for desired purposes and in creating novel biological processes with compositions and properties not found in nature. 1−4 While the field is historically rooted in the development of methods to better read, write, edit, and design DNA, synthetic biology has since leveraged these tools to impact a wide variety of applications which require understanding and harnessing cellular processes mediated by post-translational modifications (PTMs), 5−7 a task that remains one of the key challenges of the postgenomic era. Protein glycosylation, the attachment of complex sugar moieties (glycans) to amino acid side chains, is among the most diverse, abundant, and important PTMs, making it of particular interest to the academic and industrial research communities. 8 Glycosylation is present in all domains of life 9−12 and over half of eukaryotic proteins, 13 playing important roles in protein folding and function. 12,14,15 Secreted and cell-surface proteins are glycosylated at particularly high rates, making glycans important for cell−cell signaling, 16,17 host−pathogen interactions, 18,19 and immune responses. 20−22 In addition, 70% of approved or preclinical protein therapeutics 23 are glycosylated, having profound effects on protein stability, 24,25 immunogenicity, 26 and activity. 27 Biopharmaceutical glycosylation patterns must be rigorously controlled during development and production and can be intentionally engineered to produce desired properties in protein therapeutics and vaccines. 24,27−30 Taken together, these factors make it clear that fulfilling the vision of synthetic biology to precisely control and construct novel biological systems will require the design and understanding of protein glycosylation.
Drawn by new opportunities to understand fundamental biology as well as compelling applications in medicine and materials, researchers have begun to use tools originally developed for metabolic engineering, genetic editing, protein engineering, and chemical biology to manipulate glycosylation. These lines of inquiry have recently coalesced to form the field of synthetic glycobiology 31−33 which, broadly defined, seeks to apply the tools of synthetic biology to the engineering and design of glycosylation systems. Because this field has advanced rapidly over recent years and involves a unique set of biological parts and methods compared to more traditional applications of synthetic biology, a systematic review is warranted.
Here, we review the exciting area of synthetic glycobiology with a focus on useful abstractions, tools, and methods regularly employed by the synthetic biology community at large. Specifically, we outline the functional parts required to manipulate protein glycosylation and how they are organized within natural systems. We then describe how these parts have been assembled to construct synthetic glycosylation systems in mammalian, insect, plant, and bacterial cells, as well as cell-free systems. Finally, we review select applications of synthetic protein glycosylation systems and present outstanding opportunities to use synthetic glycosylation systems to solve compelling problems in medicine, materials, and beyond (Figure 1).
We note that glycobiology and therefore synthetic glycobiology is a broad field, and there are important innovations in the areas of glycolipids, 34 glycosylated natural products, 35 glycomimetic systems, 36−38 free oligosaccharides, 39 and cell-surface (glycocalyx) engineering 31 that have been recently reviewed elsewhere. However, this review focuses on protein glycosylation because of its relevance to techniques (DNA assembly, transcription/translation control, genetic editing, metabolic engineering, etc.) and applications (therapeutics, vaccines, diagnostics, materials, etc.) that are often of interest to synthetic biologists. As bioprocessing methods to control glycosylation within their native hosts 40 and methods for complete chemical synthesis of glycans and glycoproteins 41, 42 were recently reviewed elsewhere, this work focuses instead on a more detailed description of highly engineered biosynthetic systems where synthetic biology tools are most readily applicable.
■ THE "PARTS" OF SYNTHETIC GLYCOBIOLOGY: AN ENGINEER'S GUIDE TO PROTEIN GLYCOSYLATION
In order to design and build a biological process for a desired function, one must first understand the parts available for its construction. Here, we briefly review the key mechanisms of protein glycosylation found in nature as well as important characteristics for the construction of synthetic glycosylation systems (Figure 2). The reader may consult more exhaustive reviews of protein glycosylation systems in bacteria, 43 archaea, 44,45 and eukaryotes 46 and how they compare 47 for further information.
For the construction of synthetic pathways, it is useful to abstract glycosylation systems into a set of five functional parts: glycosyltransferase enzymes (GTs), sugar donors, sequons, glycosidases, and lectins. GTs covalently attach glycans to target proteins at a glycosylation site, which is found within a sequence of amino acids known as a sequon. GTs use activated sugar donors as substrates, which are made up of saccharides linked to lipids or to nucleotide phosphates such as uridine diphosphate (UDP-), guanosine diphosphate (GDP-), cytidine monophosphate (CMP-), thymidine diphosphate (TDP-), or adenosine diphosphate (ADP-). In the protein modification step, a polypeptide glycosyltransferase (ppGT) transfers one or more saccharides from a sugar donor to a sequon within a target protein. While a sequon is the sequence of amino acids required for glycosylation to occur, it is important to note that overall target protein structure and folding can also influence modification efficiency. 48−50 Following protein modification, glycans can be further modified by elaborating GTs and trimmed by glycosidases. There are two major classes of glycosidases: exoglycosidases remove sugars from the termini of glycans, while endoglycosidases hydrolyze glycosidic bonds within glycan chains. Glycosidases have been employed and engineered for glycoprotein analysis, 51,52 remodeling, 53−55 and even therapeutics. 56,57 Once constructed, glycoproteins often interact with lectins, which specifically bind to certain glycan structural motifs. Although this review focuses primarily on the synthesis of glycoproteins, knowledge of lectin specificities has often been leveraged in the field of synthetic glycobiology to design glycan-based selection schemes, 58−60 develop new approaches to fight infectious and autoimmune disease, 61−63 produce functional biomaterials, 64−66 and understand and manipulate protein trafficking within the human body. 67−71 Key resources for the identification of relevant enzymes, glycans, and glycan-binding proteins include the Carbohydrate-Active enZYmes (CAZy) database (exhaustive list of genetically identified GTs and lectins), 72 the GlyCosmos Portal (especially the GlyTouCan glycan search and the Lectin Frontier database 73 ), and the GlycoGene Database (a curated list of key classes of GTs 74 ).
Figure 1. Parts, systems, and applications of synthetic glycobiology. Glycosylation is mediated by five key parts: sequons, glycosyltransferases, glycosidases, sugar donors, and lectins, which accept, add, trim, supply, and bind sugars, respectively. Synthetic glycobiology repurposes, recombines, and engineers these parts to construct biosynthetic systems that produce designer glycoproteins for compelling applications in therapeutics, vaccines, diagnostics, and materials.
Figure 2. Selected naturally occurring glycosylation systems and the parts they supply for synthetic glycobiology. The parts of sequons, sugar donors, glycosyltransferases, glycosidases, and lectins are arranged as they interact in OST-dependent and OST-independent glycosylation systems in eukaryotes or bacteria. (a) OST-dependent glycosylation systems in eukaryotes transfer sugars en bloc from lipid-linked oligosaccharides (LLOs) to asparagine (N-linked) residues within proteins in the endoplasmic reticulum. These sugars are then trimmed down by glycosidases and elaborated by other GTs in the Golgi. (b) OST-dependent glycosylation systems in bacteria work similarly, but rely on single-subunit OSTs in the periplasm: N-linked OSTs (N-OSTs) transfer sugars to asparagine, and O-linked OSTs (O-OSTs) transfer sugars to serine and threonine (O-linked) residues. (c) Eukaryotic OST-independent glycosylation systems of interest for glycoengineering include mucin-type glycosylation (O-linked GalNAc) and glycosaminoglycan (GAG) glycosylation (O-linked xylose) in the Golgi as well as O-linked GlcNAc glycosylation in the cytoplasm. (d) Bacterial OST-independent glycosylation systems of interest for glycoengineering primarily glycosylate autotransporter/adhesion proteins in the cytoplasm and are initiated by N-glycosyltransferases (N-linked Glc), the GtfA/B complex (O-linked GlcNAc), or BAHT (O-linked Heptose). Bacterial effector toxins including SetA (O-linked Glc), NleB (arginine or R-linked GlcNAc), and EarP (R-linked Rhamnose) that are secreted into eukaryotic host cells are also of interest. Both prokaryotic and eukaryotic cells are surrounded by a glycocalyx layer and use lectins to selectively bind to glycans in their environment. (e) Symbol key for parts of synthetic glycobiology. The Consortium for Functional Glycomics symbol nomenclature is used for sugar monomers. Polypeptide GTs are color-coded to correspond to the sugars that they conjugate to proteins.
The five functional parts outlined in Figure 1 are assembled in a multitude of naturally occurring glycosylation systems across the three domains of life. These glycosylation pathways can generally be classified in terms of the topology, chemical bond, and specificity of their polypeptide modification step (Table 1). Two major glycosylation system topologies differ in the type of ppGT that performs the critical polypeptide modification step, which controls the location and diversity of the installed glycan. 47,75 The first topology is oligosaccharyltransferase (OST)-dependent, in which prebuilt glycans are transferred en bloc from a lipid-linked oligosaccharide (LLO) onto a specifically targeted sequon by an OST. The second topology is OST-independent, in which the glycans are built in a sequential fashion on a sequon within a target protein. 47
The most common protein conjugation bonds are N-linked (most often on Asn residues but can also include Arg) and O-linked (most often on Ser or Thr residues but can also include Tyr, hydroxylysine, and hydroxyproline). Enzymes that modify nitrogen within Arg side chains are commonly referred to as R-linked. Other conjugation bonds include S-linked (Cys residues) and C-linked (Trp residues). 44,46,76,77 The level of specificity of ppGTs to both sugar donors and protein acceptor sequences (i.e., sequons) is important in determining the potential utility of a glycosylation system for engineering. 78
Glycans conjugated to proteins have directionality, defined from the reducing end sugar (attached to the amino acid side chain) to the nonreducing end (termini). These sugars can be conjugated in a variety of linkages between saccharides, which are described by the configuration at the anomeric carbon (α-linkage or β-linkage) and by the carbons on each sugar involved in the linkage (notated as, for example, β1−4 linkages or α2−3 linkages). Linkage differences can change the physical and biological properties of glycans and can be important in GT, glycosidase, and lectin specificities. 79 PpGTs are particularly specific for the sugar at the reducing end of a sugar donor.
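The linkage notation described above (anomeric configuration plus the two carbons involved) can be made concrete with a small parser. This is only an illustrative sketch: the `Linkage` type and `parse_linkage` function are hypothetical names introduced here, and the parser assumes single-digit carbon numbers written in the forms used in this review (for example β1−4, α2-3, or β1,3).

```python
import re
from typing import NamedTuple


class Linkage(NamedTuple):
    """One glycosidic linkage, e.g. β1−4 (hypothetical helper type)."""
    anomer: str            # "α" or "β" configuration at the anomeric carbon
    anomeric_carbon: int   # carbon on the sugar nearer the nonreducing end
    acceptor_carbon: int   # carbon on the sugar nearer the reducing end


# Accept a hyphen, a minus sign (U+2212), or a comma between the carbons.
_LINKAGE_RE = re.compile(r"^([αβ])(\d)[-−,](\d)$")


def parse_linkage(text: str) -> Linkage:
    """Parse strings such as 'β1−4' or 'α2-3' into a Linkage record."""
    match = _LINKAGE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized linkage: {text!r}")
    anomer, first, second = match.groups()
    return Linkage(anomer, int(first), int(second))


if __name__ == "__main__":
    for example in ("β1−4", "α2-3", "β1,3"):
        print(example, "->", parse_linkage(example))
```

Keeping the anomer and carbon positions as separate fields makes it straightforward to compare linkages when reasoning about GT, glycosidase, or lectin specificity.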
PpGTs also have specificity for the glycosylation site or sequon. The modification of a sequon by a given ppGT is highly dependent on neighboring amino acids 80 and/or its structural context. 49 In fact, some ppGTs are dedicated to the modification of a single protein in their natural systems, 78 while others are more general. PpGTs with more relaxed specificities can be used to modify diverse target proteins by introducing an engineered sequence of amino acids, known as a glycosylation tag (GlycTag), into the target protein sequence. 80−83 GlycTags can refer to native sequons that are engineered to optimize glycosylation efficiency or to sequons that are introduced into diverse nonnative target proteins to enable glycosylation, similar to the addition of an affinity tag for protein purification. This protein specificity highlights the importance of the design and understanding of acceptor sequons to the bottom-up construction of synthetic glycosylation pathways. Overall, the classification of glycosylation systems by the topology, bond, and specificity of their polypeptide modification step is useful for synthetic biologists to design the site-specific attachment of diverse glycans to proteins, a key advantage of biosynthetic glycoprotein systems over purely chemical methods. 84 Here we describe the mechanisms of protein glycosylation systems found in various domains of life with the goal of defining the parts that they supply to engineers for the construction of synthetic systems.
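As a rough illustration of this classification, each ppGT can be stored as a record with a topology, a bond type, and a recognition motif, and then filtered when choosing parts for a design. The sketch below is not a reproduction of Table 1: the `PpGT` class, the `CATALOG` list, and the `select` function are hypothetical names, and the handful of entries only restates motifs quoted elsewhere in this review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PpGT:
    """Hypothetical record for one polypeptide glycosyltransferase."""
    name: str
    topology: str   # "OST-dependent" or "OST-independent"
    bond: str       # e.g. "N-linked" or "O-linked"
    motif: str      # minimal recognition motif as quoted in the text
    domain: str     # where the enzyme naturally occurs


# A few illustrative entries taken from motifs stated in this review.
CATALOG: List[PpGT] = [
    PpGT("Eukaryotic OST", "OST-dependent", "N-linked", "N-X-S/T", "eukaryotes"),
    PpGT("CjPglB", "OST-dependent", "N-linked", "D/E-X-N-X-S/T", "bacteria"),
    PpGT("NmPglL", "OST-dependent", "O-linked", "WPAAASAP", "bacteria"),
    PpGT("ApNGT", "OST-independent", "N-linked", "N-X-S/T", "bacteria"),
    PpGT("OGT", "OST-independent", "O-linked", "PPVSR", "eukaryotes"),
]


def select(catalog: List[PpGT],
           topology: Optional[str] = None,
           bond: Optional[str] = None) -> List[PpGT]:
    """Return the entries matching every criterion that was supplied."""
    return [
        entry for entry in catalog
        if (topology is None or entry.topology == topology)
        and (bond is None or entry.bond == bond)
    ]


if __name__ == "__main__":
    for entry in select(CATALOG, topology="OST-independent"):
        print(entry.name, entry.bond, entry.motif)
```

A flat list of immutable records is enough at this scale; a larger catalog would more likely be drawn from a curated resource such as the databases named earlier.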
OST-Dependent Glycosylation. N-linked OST-dependent glycosylation is the most well-studied type of glycosylation in both eukaryotes and prokaryotes. 75 Notably, most eukaryotic OSTs are composed of multiple subunits with the STT3 integral membrane protein forming the catalytic core. However, single-subunit OSTs have been discovered in some parasites such as Trypanosoma 105,106 and Leishmania, 107 which contain multiple STT3-like proteins with distinct specificities. Similarly, bacterial OSTs, such as Campylobacter jejuni PglB (CjPglB), are generally composed of a single subunit which is homologous to the STT3 catalytic domain of eukaryotic OSTs. 108 There is a strong topological resemblance between bacterial and eukaryotic OST-dependent glycosylation, as they both involve the cytoplasmic construction of an LLO that is flipped into an oxidative compartment (the periplasm in bacteria and the endoplasmic reticulum (ER) in eukaryotes) before being transferred to an acceptor sequon. 75 The colocalization of both the LLO and the OST in the membrane means that polypeptide modification is only dependent on 2D diffusion and enables cotranslational modification in eukaryotic systems. The fact that bacterial OSTs are not as closely coupled to the translocon as eukaryotic OST complexes 109 makes bacterial OST-dependent glycosylation more dependent on structural context and generally requires the placement of glycosylation sites in flexible regions of the protein. 110 That said, recent studies suggest that the efficiency of glycosylation can be impacted by the secretory pathway (Sec or Tat) used to secrete the target protein, 50,82 indicating that glycosylation can also occur before complete folding in bacterial systems. Glycans transferred by OSTs are often complex, as they are first built up by multiple GTs on a lipid before transfer to a protein. Therefore, the en bloc transfer mechanism employed by OSTs has made OST-dependent glycosylation systems promising engineering methods to transfer large glycan structures.
[Table 1 caption] Here, we categorize protein glycosylation systems by the topology, chemical bond, and specificity of their polypeptide modification step. The specificity of the polypeptide glycosyltransferase (ppGT) is described by its reducing end sugar substrate requirements and its optimized minimal amino acid recognition motif (if known). The domains of life in which each ppGT naturally occurs are also listed. Enzymes that modify nitrogen within Arg side chains are commonly referred to as R-linked (*).
In eukaryotes, GTs use nucleotide-activated forms of N-acetylglucosamine (GlcNAc) and mannose (Man) sugar donors to assemble a Man5GlcNAc2 LLO that is linked to a dolichol pyrophosphate lipid on the cytoplasmic side of the endoplasmic reticulum (ER) membrane (Figure 2). This LLO is then flipped into the ER lumen by a flippase enzyme, elaborated by GTs using dolichol-phosphate-linked Man and glucose (Glc) sugar donors, and then transferred to a nascent polypeptide chain by the OST. 75 Except in a few rare cases, 111 the sequon for N-linked OSTs in eukaryotes is N-X-S/T, where N is the glycosylated asparagine and X is any amino acid except proline. 46,75 The glycan initially transferred by the OST from the LLO may be as complex as Glc3Man9GlcNAc2, but it is then processed in the ER and Golgi by glycosidases and GTs to create a myriad of structures that vary across protein identity, glycosylation sites on the same protein, cell type, disease state, and time, 75,112 such that only an N-linked Man3GlcNAc2 core structure is conserved among all N-linked eukaryotic glycans. 75 In humans, the Man3GlcNAc2 core is generally elaborated by GTs utilizing nucleotide-activated sugar donors with GlcNAc, galactose (Gal), fucose (Fuc), and sialic acid (Sia) to form many branched, complex glycans resembling the biantennary, N-linked glycan in Figure 2. 75 This dynamic process of glycan trimming and elaboration also serves as a protein proofreading system that directs misfolded proteins to an ER-associated degradation (ERAD) pathway. 113
Once thought to exist only in eukaryotes, OST-dependent, general glycosylation pathways are now known to be abundant and far more diverse in bacteria 9,10,75 and archaea. 11,114 For example, bacteria possess both N- and O-linked OSTs. Generally, bacterial glycans are assembled in the cytoplasm by GTs and then flipped into the periplasm before being transferred in their final form by the OST to an acceptor protein.
The best characterized and most commonly engineered prokaryotic glycosylation system is from the bacterium Campylobacter jejuni, 9,75,115 in which an N-linked OST, called CjPglB, installs an N-linked heptasaccharide (Figure 2). The glycosylation systems in C. jejuni and many other bacteria are associated with virulence and host−pathogen interactions. 116 There are three key differences between bacterial and eukaryotic OST-dependent glycosylation systems that are important to keep in mind for engineering strategies. First, bacterial LLOs are generally assembled on undecaprenyl (rather than dolichol) pyrophosphate lipids, and the glycan linked to this LLO is generally not extensively trimmed and elaborated once leaving the cytoplasm. Second, the simplicity of single-subunit bacterial OSTs makes them easier to purify and recapitulate outside of natural systems and facilitates post-translational modification of folded proteins. 117 Finally, bacterial OSTs possess unique specificities for acceptor sequons and LLOs compared to eukaryotic OSTs. 108,118 Acceptor sequons for bacterial N-linked OSTs do resemble the eukaryotic N-X-S/T motif; however, some bacterial OSTs additionally require a negatively charged residue (D/E) at the X−2 position relative to the glycosylated asparagine. 59,85−87 For example, an optimized acceptor sequence, D-Q-N-A-T, has been identified for CjPglB 81 and has been implemented as a GlycTag to direct glycosylation to flexible regions of proteins of interest. 82 In terms of LLO specificity, bacterial N-linked OSTs are known to transfer a broader array of glycan structures than their eukaryotic counterparts, but they do still possess unique LLO specificities that limit the transfer of some glycans. For example, naturally occurring N-linked OSTs generally require acetylation at the C2 position of the reducing sugar. 105,106,108,118−121
Compared to N-linked OSTs, O-linked OSTs generally possess less stringent specificities for glycans and more stringent specificities for peptide acceptors. Three main classes of bacterial O-linked OSTs with clear applicability to synthetic glycobiology have been described: PilO, PglL, and PglS, which were first identified in Pseudomonas aeruginosa, Neisseria meningitidis, and Acinetobacter baylyi, respectively. 122,123 Each of these classes is known to glycosylate pilin proteins within its native host. The acceptor sequences of the PilO from P. aeruginosa and the PglL from N. meningitidis have been reduced to GlycTags of a C-terminal TAWKPNYAPANAPKS 89 sequence and the so-called minimal optimal O-linked recognition (MOOR) motif WPAAASAP, 88 respectively. PglS glycosylation has only been demonstrated to target its native pilin-like ComP. 90 While these complex GlycTag sequence and structure requirements make it more difficult to direct glycosylation by O-linked OSTs onto recombinant proteins, these enzymes still hold great promise for engineering due to their promiscuity in the sugars that they can attach to proteins. 124 For example, PglS is the only OST known to be able to transfer LLOs with glucose at the reducing end, 90 and PglL has been shown to transfer a single N,N′-diacetylbacillosamine from a nucleotide-activated sugar. 125
While archaea possess both N- and O-linked protein glycosylation systems, most research has been dedicated to the N-linked OST-dependent glycosylation systems in these organisms. Interestingly, archaeal N-linked OST-dependent systems use both dolichol-phosphate and dolichol-pyrophosphate LLOs, attach a greater variety of sugars than bacteria and eukaryotes, and are even known to attach multiple distinct N-glycans to defined positions on a single protein. 126,127 While the diversity of tools offered by archaeal glycosylation systems holds great theoretical potential for biosynthesis, the difficulties associated with culturing and manipulating these organisms have prevented the engineering of those systems until very recently. 128−130 Several previous works provide systematic descriptions of archaeal glycosylation systems 44,45,114 and the full diversity of known prokaryotic protein glycosylation systems. 44
OST-Independent Glycosylation. Much progress has been made in the last two decades in elucidating the diversity, importance, and utility of OST-independent glycosylation systems in both eukaryotes and bacteria (Figure 2). For synthetic glycobiology, OST-independent pathways provide three key advantages that make them complementary to OST-dependent systems. 78 First, most OST-independent systems do not require lipid-associated GTs or sugar donors, making them easier to synthesize and manipulate outside of their native hosts. Second, OST-independent systems generally do not require transporting target proteins or sugar donors across membranes, enabling the synthesis of glycoproteins in the cytoplasm of Escherichia coli. 131 Third, OST-independent systems install sugars in a stepwise fashion by sequentially transferring monosaccharides from sugar donors, allowing for greater modularity and freedom of design that is unconstrained by OST specificities for LLOs. Compared with OST-dependent pathways, OST-independent pathways are more diverse in their topologies, sugar constituents, and possible amino acid linkages (including Asn, Arg, Thr, Ser, Tyr, hydroxylysine, hydroxyproline, Trp, and Cys). Several systematic reviews 44,46 and useful visualizations 76,77 of the diversity of glycosylation systems are available. Glycosylation systems of greatest interest to synthetic glycobiology are discussed below, including: O-GalNAc (mucin-type) glycosylation, O-GlcNAc glycosylation, glycosaminoglycan (GAG) biosynthesis, cytoplasmic bacterial glycosylation systems (such as N-glycosyltransferases or NGTs), and bacterial effector toxin GTs (Figure 2).
The most characterized OST-independent pathway is the O-GalNAc glycosylation system found in higher eukaryotes that modifies Ser and Thr residues of proteins. 46 In humans, a family of 20 polypeptide N-acetylgalactosaminyltransferases (GalNAcTs) located in the ER and Golgi utilize nucleotide-activated sugar donors to glycosylate Ser and Thr residues on specific protein substrates, including the extensively modified mucin family of glycoproteins. 46,132 A combination of quantitative glycoproteomics and genetic knockouts 91−93 as well as in vitro characterization methods 80,94,95 have revealed that these GalNAcTs possess unique, but partially overlapping, polypeptide acceptor specificities that depend on primary amino acid sequence, presence of nearby glycans, colocalization in the Golgi, and protein structure. These unique specificities provide cells with the ability to dynamically control the glycoproteome 91−93,133,134 and present synthetic glycobiologists with a diverse toolkit to construct glycoproteins. After initiation by GalNAcTs, O-GalNAc residues are often sequentially elaborated to a wide variety of structures containing Gal, Sia, GalNAc, Fuc, and GlcNAc 46,76 that play critical roles in human biology and can affect protein stability, 8 proteolytic processing, 134,135 immunogenicity, 89 and trafficking. 133,136
The synthesis of O-linked glycosaminoglycans (GAGs) also takes place within the ER and Golgi of higher eukaryotes. GAGs are long, linear polysaccharides that form the glycan moieties of proteoglycans found on cell surfaces or secreted into the extracellular matrix. GAGs modulate cell-signaling, tissue growth, cytokines, and chemokines, but much of the interest in GAGs for engineering has been due to the anticoagulant properties of heparan sulfate (a GAG structure), which binds to and activates antithrombin. 137 GAG synthesis is initiated by one of up to two O-xylosyltransferases (O-XylTs) whose specificities are not fully understood, but are known to prefer serine residues immediately flanked by glycines with nearby acidic residues in the X−2 to X−4 positions. 96 This xylose (Xyl) residue is then sequentially elaborated by three GTs, producing a tetrasaccharide linker of the form glucuronic acid (GlcA)-β1,3-Gal-β1,3-Gal-β1,4-Xyl-β1-O-Ser, where the proximal Gal residue must be phosphorylated by a glycan-modifying enzyme to permit extension. 138 This linker can then be further extended to form heparan sulfate, chondroitin sulfate, or dermatan sulfate, which are composed of sulfated disaccharide repeat units of (GlcNAc-α1,4-GlcA-β1,4-), (GalNAc-β1,4-GlcA-β1,3-), and (GalNAc-β1,4-IdoA-β1,3-), respectively, 138 where IdoA is iduronic acid. There are two other GAG structures synthesized in vertebrates: keratan sulfate (which can be linked to oligosaccharide N-linked glycans, O-GalNAc-type glycans, and single O-Man residues) and hyaluronic acid (which is not covalently attached to proteins). 137 Glycosylation machinery producing GAG or GAG-like polymer backbones has also been discovered in bacteria, providing promising enzymes for GAG synthesis in microbes, particularly when exact sulfation patterns are not required. 139
Eukaryotes also possess a soluble O-linked N-acetylglucosamine transferase (OGT) which installs GlcNAc moieties onto Ser and Thr residues of diverse target proteins, playing important roles in stress response and disease states including cancer, diabetes, and neurodegeneration. 140−145 The three splice variants of OGT in humans, sOGT, ncOGT, and mOGT, are found in the cytoplasm, nucleus, and mitochondria, respectively. 146 The OGT glycosylation system is somewhat unique because its polypeptide modification step is regularly reversed by the O-GlcNAcase (OGA) enzyme, which removes O-GlcNAc residues installed by OGT. 146 The dynamic interplay between OGT and OGA as well as protein kinases and phosphatases for occupation of Ser and Thr residues allows cells to modulate complex signaling cascades. 146 Many structural, proteomic, and biochemical studies have endeavored to characterize the peptide acceptor specificity of OGT, revealing a complex set of rules and interactions that determine O-GlcNAc modification. 80,97,147−149 An optimal recognition motif of PPVSR has been identified; 97 however, the complexity of O-GlcNAc recognition means that the modification of a given sequence still requires empirical measurement or at least the application of computational techniques, reviewed here. 150 The promiscuity of OGT for azido-sugars or the derivatization of O-GlcNAc sugars with azido-sugars has been exploited to learn much about the functions of these systems in their native cellular contexts, 151−154 reviewed here. 155 In addition to its O-GlcNAc transferase activity, OGT is also known to catalyze the addition of O-linked glucose 151 and S-linked GlcNAc 156 as well as the proteolytic cleavage of the human protein HCF-1. 157
Recently, several N- and O-linked glycosylation systems that function in the bacterial cytoplasm have been discovered. These systems often glycosylate extracellular adhesion and autotransporter proteins that facilitate adherence of pathogenic bacteria to human cells. 158−160 N-glycosyltransferases (NGTs) are one such class of enzymes that have been recently characterized 98,159,161−169 and have elicited great interest from the glycoengineering community for their ability to initiate N-linked glycosylation in the bacterial cytoplasm when heterologously expressed in E. coli. 49,78,80,83,131,160,170−175 NGTs bear structural homology to eukaryotic OGTs, but were first identified as part of an extracellular adhesion operon in Haemophilus influenzae, 167 founding a new functional class of GTs that install monosaccharides onto asparagine residues in the cytoplasm using UDP-Glc or UDP-Gal as soluble sugar donors. 169 In some species, the single glucose residues installed by NGTs are extended into a dextran polymer by a glucose polymerase (α1,6 GlcT). 98 Despite their lack of homology to OSTs, NGTs share the same general acceptor motif, N-X-S/T. 98 Rigorous characterization of the acceptor specificity of NGTs using glycoproteomics and in vitro as well as cell-free methods 49,80,168,171,176 has illuminated detailed rules for the prediction and design of sequons for various NGTs. So far, the NGT from Actinobacillus pleuropneumoniae (ApNGT) has been the most extensively characterized and most often used for glycoengineering efforts, 78 discussed below.
ACS Synthetic Biology pubs.acs.org/synthbio Review
[Partial figure caption] Much effort has been directed toward knocking out or supplementing GTs and enzymes involved in sugar donor metabolism to tune glycosylation structures and produce more homogeneous structures. 28,192 More dramatically, a highly simplified trisaccharide glycan known as GlycoDelete has been generated using these methods. 53 Remodeling mammalian pathways has also generated libraries of cells displaying various glycosylation structures. 138,193 (b) Insect cell and insect cell-based baculovirus glycosylation systems have been remodeled to obtain full-length biantennary N-linked glycans without α1,3 fucose residues. 194
Other OST-independent glycosylation systems that also act on adhesins and autotransporters but have little homology to NGTs continue to emerge and may be of interest for future applications in synthetic glycobiology. For example, the O-linked autotransporter heptosyltransferase (BAHT) GTs, which glycosylate autotransporter proteins with heptose residues in Gram-negative bacteria, have been shown to target a 13 amino acid structural motif that could be used to direct modification for glycoconjugate vaccines. 78,99,177 Another O-linked cytoplasmic glycosylation system, initiated by a dimeric GT called GtfA-GtfB, modifies serine-rich repeat (SRRP) adhesion proteins with α-linked GlcNAc in streptococci and staphylococci; it has been shown to modify a 25 amino acid tag and could provide methods to display various glycans on bacterial surfaces. 78,178,179 Finally, effector GT toxins that are secreted into host cells by bacteria to facilitate infection and pathogenesis may provide GTs of interest for synthetic systems. 18,180,181 For example, O-linked effector glucosyltransferases from Clostridium and SetA from Legionella have recently been characterized and used to modify recombinant proteins using nine amino acid (YAPTVFDAY) 101 and seven amino acid (GKTTLTA) 102 GlycTag sequences, respectively. Other arginine (R)-linked effector N-acetylglucosaminyltransferases, SseK in Salmonella or NleB in E. coli and Citrobacter rodentium, modify eukaryotic proteins involved in metabolism and cell signaling. 103,182 However, these R-linked effector GTs, as well as the R-linked EarP glycosyltransferase that modulates polyproline synthesis by modification of EF-P in Neisseria, Pseudomonas, and Shewanella, 104 appear to be dedicated to the modification of a single substrate or a few substrates and are of greater interest for antibiotic intervention 183 than for use in synthetic protein glycosylation systems.
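The acceptor motifs named in this section lend themselves to simple pattern matching. The following Python sketch scans a protein sequence for the general N-X-S/T sequon and for the two effector GlycTag sequences quoted above. It is a minimal illustration, not a predictor of glycosylation: the exclusion of proline at the X position is an assumption taken from the textbook sequon definition rather than from the text, the function names are hypothetical, and the test sequences are invented.

```python
import re

# General N-linked acceptor motif N-X-S/T. Excluding proline at X is an
# assumption (textbook sequon definition); the text only states "N-X-S/T".
# The lookahead makes overlapping sequons (e.g. "NNTS") all visible.
SEQUON = re.compile(r"(?=(N[^P][ST]))")

# GlycTag sequences quoted in the text for two O-linked effector GTs.
GLYC_TAGS = {
    "YAPTVFDAY": "Clostridium effector glucosyltransferase GlycTag",
    "GKTTLTA": "Legionella SetA GlycTag",
}


def find_sequons(seq):
    """Return (1-based Asn position, tripeptide) for every N-X-S/T match."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq.upper())]


def find_glyctags(seq):
    """Return sorted (1-based start, tag, description) for each exact tag match."""
    seq = seq.upper()
    hits = []
    for tag, label in GLYC_TAGS.items():
        start = seq.find(tag)
        while start != -1:
            hits.append((start + 1, tag, label))
            start = seq.find(tag, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Invented test sequences, used only to exercise the functions.
    print(find_sequons("MGGNWTTANPSNAT"))
    print(find_glyctags("AAGKTTLTAGGYAPTVFDAY"))
```

A motif match only marks a candidate site. As the text notes for OGT and NGTs, whether a given sequence is actually modified still has to be measured or predicted with more detailed specificity rules.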

■ SYNTHETIC GLYCOSYLATION SYSTEMS
In this section, we describe key paradigms and examples of how the parts of synthetic glycobiology outlined above have been assembled, repurposed, and engineered to produce glycoproteins. Because the host organisms in which these glycosylation pathways are constructed strongly affect their challenges, advantages, and applications, we describe examples of synthetic glycosylation pathways developed in mammalian, insect, plant, yeast, and bacterial cells, cell-free, and chemoenzymatic backgrounds. This order represents a spectrum from the remodeling of natural systems that already function similarly to human glycosylation pathways where genes must generally be knocked out to obtain structures generally desired for therapeutics (eukaryotic systems shown in Figure 3) to the bottom-up construction of highly engineered synthetic glycosylation systems where many new parts must be assembled (bacterial, cell-free, and chemoenzymatic systems shown in Figure 4).
Synthetic Glycosylation Systems in Mammalian Cells. Despite the many efforts to characterize and harness microbial protein glycosylation systems during the last two decades, the majority of glycobiology and glycoengineering efforts still focus on mammalian systems. Nearly all glycoprotein therapeutics are currently produced at the industrial scale in Chinese Hamster Ovary (CHO) cells, 184 due in large part to the similarity between CHO glycosylation structures and those in the human body. 185,186 The importance of the glycosylation structure located at Asn297 on the constant region of human immunoglobulin G (IgG) antibodies for antibody-dependent cell-mediated cytotoxicity (ADCC), protein trafficking, and circulation time 187,188 makes the N-linked glycosylation systems in mammalian cells the most extensively studied and engineered protein glycosylation systems. For decades, glycosylation patterns in CHO cells have been closely monitored and controlled during development and production of protein therapeutics through the use of specific culture conditions and proprietary cell lines. 186,189 The first methods for genetically controlled glycosylation in CHO cells were based on lectin screens that identified random mutants of cultured mammalian cells. 60 However, the advent of improved gene editing strategies and increased knowledge of glycosylation pathways have substantially increased the ability to genetically define glycosylation structures in mammalian cells by the introduction of new glycosylation sites 24 as well as the knock-in and knockout of specific glycosylation-related genes such as GTs, metabolic enzymes, and glycosidases. 190,191 A key aim for genetic glycoengineering of mammalian cell lines has been to produce more homogeneous glycosylation patterns. Proteins derived from natural systems and nonengineered cell lines are generally composed of a heterogeneous mixture of glycosylation structures. 
This heterogeneity complicates drug approvals and the optimization of glycosylation structures for desired purposes. 198 Three exemplary engineering studies have recently addressed this problem by engineering CHO cells to produce more homogeneous glycans. The first study knocked out MGAT1, which adds a β1−2 linked GlcNAc to the α1−3 arm of the trimannose core, and introduced an endoglycosidase (EndoT) to truncate multiantennary human glycans to a single GlcNAc residue which can then be elaborated to a much more homogeneous Siaα2−3-Galβ1−4-GlcNAc trisaccharide (called GlycoDelete). 53 While this structure cannot fully recapitulate the ADCC binding of full-length human glycans, it does promise to simplify approval of antigen-neutralizing antibodies. Another pair of studies 28,192 have used large zinc-finger nucleases and CRISPR-Cas9 genetic editing libraries to strategically introduce GT knockouts and knock-ins to achieve more homogeneous, full-length, human-like glycosylation structures in CHO cells for applications in IgGs and enzyme replacement therapies. While the engineering of N-linked glycoproteins has received the most attention in mammalian systems, platforms have been developed to produce and display O-linked glycoproteins using mammalian cells. There is gathering evidence that O-GalNAc glycosylation structures can be important for glycoprotein therapeutic efficacy. 199−201 However, the engineering of O-GalNAc pathways in mammalian cells thus far has been primarily limited to the development of research tools to study natural glycosylation pathways. 91,202 Another area of research has involved the display of N-linked, 193 O-GalNAc, 193 and GAG 138,193 pathways on the surface of mammalian cells. These cells can then be used to study the function of glycosylation biosynthesis genes and to characterize the biological function and properties of certain glycosylation structures. Now that these research tools and the design rules they have generated are established, it is expected that future glycoengineering efforts will involve greater engineering of O-linked glycoproteins.

Figure 4 (caption fragment). [...] 237 the rigorous characterization of ppGT specificities, 49,80 the rapid discovery of new synthetic glycosylation pathways, 174 and the on-demand production of glycosylated therapeutics and vaccines by cell-free glycoprotein synthesis (CFGpS). 238,239 (c) Chemoenzymatic methods have been developed to install full-length human glycans. Primary strategies include: (i) endoglycosidase-mediated transglycosylation 206 for remodeling glycans produced in yeast or CHO cells; (ii) enzymatic "tag and modify" approaches which use engineered bacteria or purified enzymes to install O-linked GlcNAc, 240 N-linked GlcNAc from an exoglycosidase-treated C. jejuni heptasaccharide, 241 N-linked Glc installed by NGT, 170 or an N-linked GlcNAc installed by NGT and acetyltransferase GlmA 172 which can then be elaborated to full-length N-linked glycans using transglycosylation; (iii) chemical "tag and modify" methods that directly modify cysteine or noncanonical amino acids within proteins to install glycan handles that can be further elaborated by transglycosylation; 242−244 and (iv) total chemical synthesis approaches that use solid-phase peptide synthesis to directly incorporate glycosylated amino acids into peptides which can then be linked together using native chemical ligation approaches. 25,41,245
Despite advances in the engineering of mammalian glycosylation systems, limitations remain in the variety of glycosylation structures that can be generated in these systems (due to the limited set of nucleotide sugars and the inability to knockout some essential glycosylation pathways while maintaining cell viability), the ability to obtain homogeneous products, and the high cost and development time associated with mammalian cell culture. 78,203−206 These limitations have led to the exploration of alternative organisms and the construction of synthetic glycosylation pathways, described below.
Synthetic Glycosylation Systems in Insect Cells. Insect cell lines (S2, High Five, and Sf9 derived from Drosophila melanogaster, Trichoplusia ni, and Spodoptera frugiperda, respectively), as well as insect-based baculovirus expression vector systems (BEVSs), have long been of interest for the production of glycoproteins as they have the potential to offer more flexibility in glycosylation system design and lower costs than mammalian cells. 194 Though the vast majority of biologics are made in CHO cells, two vaccines protective against cervical cancer and influenza, as well as an adenovirus gene therapy treating familial lipoprotein lipase deficiency, all produced in insect cells, have already been approved for clinical use. 194 Thus, it is possible that the glycoengineering of insect cells could unlock the production of traditional protein therapeutics in this desirable expression host. While insect cells do contain sufficient enzymatic machinery to produce full-length sialylated N-glycans, the reliable production of human-like glycoproteins generally requires several glycoengineering strategies (reviewed here 194,207 ) including the knockout of the β-hexosaminidase FDL; inhibition of endogenous α1−3 fucosylation machinery; 208 and addition of machinery to install GlcNAc, 209 Gal, 210 and sialic acids 211 onto the N-linked Man3GlcNAc2 core. O-linked glycosylation has not yet been extensively engineered in insect cells; however, insect cells do contain the endogenous machinery to make human-like O-GalNAc glycans. 190 While BEVSs achieve high yields and enable faster production and development timelines, they present other challenges including genetic instability as well as the additional process complexity and contamination risk associated with using a live virus. 
194 Improvements in genetic engineering methods may enable further customization of stable insect cell lines and expedite glycoengineering efforts, thereby increasing the reliability and adoption of insect-cell based systems for glycoprotein production. 194

Synthetic Glycosylation Systems in Plants. Plants may offer a promising low-cost glycoprotein manufacturing host that is more compatible with distributed manufacturing than traditional fermentation-based production methods. 195 Plants can generally produce correctly folded human proteins and contain similar glycosylation systems to those found in mammalian cells. Despite containing nonhuman glycan modifications, an approved enzyme replacement therapy, glucocerebrosidase (taliglucerase alfa), is currently produced in carrot cells. 212 However, it is likely that the wide adoption of plant-based glycoprotein therapeutic production will require glycoengineering plant cells to humanize their glycosylation patterns. 195 Notably, the analogous glycosylation pathways in plants are considerably simplified compared to mammals. There is no O-GalNAc glycosylation in plants and N-glycans generally terminate with N-linked Man3GlcNAc2 that may be modified with biantennary GlcNAc residues. 195 These simplified pathways and the apparent tolerance of plants for heterologous glycosylation pathways offer excellent opportunities for de novo construction of desired glycosylation systems with a freedom of design and homogeneity that may be more difficult to achieve in mammalian systems. 
195 Thus far, glycoprotein engineering in plants (reviewed thoroughly here 195,213 ) has focused on (i) ensuring homogeneous expression of N-linked GlcNAcylated trimannose by removal of β-hexosaminidases; 214 (ii) the removal of nonhuman sugar linkages including β1−2 xylose, α1−3 fucose, 215 arabinosylated hydroxyproline, 216 and Lewis A structures; 217 and (iii) the addition of metabolic machinery and human GTs to obtain human-like, sialylated N- and O-glycans. 218−222 Similarly to the GlycoDelete strategy in mammalian cells, plants were also recently engineered to generate a minimal trisaccharide. 223 The end result of these works is the ability to produce glycoprotein therapeutics in a number of model plant and plant cell systems (such as Nicotiana benthamiana, Arabidopsis thaliana, and Nicotiana tabacum) with highly similar glycosylation to mammalian systems. 195 Key remaining challenges lie in the optimization of homogeneity and production levels without affecting plant fitness and control of potentially immunogenic nonhuman hydroxyproline modifications. 190,195

Synthetic Glycosylation Systems in Yeast. Due to their low fermentation costs, fast doubling times, ability to secrete products at high titers, and genetic tractability, yeast strains are in widespread use in industrial biotechnology to produce small molecules as well as approved protein therapeutics, including insulin and glucagon. There have been many efforts to expand yeast production methods (usually in the strains Pichia pastoris and Saccharomyces cerevisiae) to glycoprotein therapeutics in academia and industry. While early steps in the N-glycosylation pathways of yeast and mammalian cells are topologically similar, yeast lack much of the machinery to trim down and elaborate the mannose glycans transferred by the OST that is required to arrive at human-like biantennary glycans terminated in sialic acid (see Figure 2). 
196 Furthermore, essential O-linked glycosylation pathways in yeast and mammalian cells are very different, with yeast constructing mannose chains rather than mucin-type O-GalNAc glycans. 224,225 As in insect and plant-based systems, yeast glycoengineering efforts (reviewed here 196,197 ) have focused on the removal of endogenous machinery producing potentially immunogenic glycosylation structures and knocking in heterologous glycosylation enzymes to construct human-like glycan motifs. Specifically, the hypermannosylation of N-glycans can be removed by the knockout of mannosyltransferases 226 and O-mannosylation can be partially reduced (but not fully eliminated) by knockout of PMT genes and addition of small-molecule inhibitors. 227 A combinatorial approach was used by Gerngross and colleagues to knock in mannosidases as well as human galactosyltransferases and sialic acid installation machinery in order to create "humanized" yeast that can, in some cases, produce homogeneous, sialic acid-capped, human-like N-glycans on protein therapeutics. 203,226,228−230 Human-like O-GalNAc pathways have also been introduced into yeast. 231,232 Interestingly, the introduction of the STT3D OST from Leishmania major into yeast successfully increased N-glycan occupancy, likely by augmenting the endogenous yeast OST activity and specificity. 233 While yeast-based glycoprotein production systems have continued to receive significant investment and are nearing commercialization, some concerns remain regarding the presence of O-mannosylation structures that cannot be eliminated while maintaining cell viability, and FDA approval of molecules produced in glycoengineered yeast platforms has not yet occurred. 196

Synthetic Glycosylation Systems in Bacteria. Since the functional recapitulation of the C. jejuni N-glycosylation system in E. coli, 115 the field of bacterial glycoengineering has grown rapidly. 204 Laboratory E. 
coli strains lack native glycosylation machinery, 204 providing a blank canvas for the modular construction and control of glycosylation pathways. This bypasses the heterogeneity and design limitations imposed by the endogenous and often essential glycosylation pathways of eukaryotic expression systems for the production of novel and homogeneous glycoforms. 204,246 As bacterial glycoengineering continues to advance, it is now possible to imagine developing E. coli as a low-cost, high-titer, and fast-growing expression host to produce glycoprotein therapeutics, 185,204,247−249 motivating the development of new synthetic glycosylation systems and biosynthetic parts for the construction of therapeutically relevant glycans in bacteria. 204 Most bacterial glycoengineering efforts so far have focused on the use of the bacterial OSTs to transfer glycans in living E. coli by hijacking its lipopolysaccharide (LPS) synthesis system 115,119 (Figure 4). E. coli and many other bacteria naturally synthesize LPS by building diverse polysaccharide structures on LLOs within the cytoplasm which are then flipped into the periplasm by the flippase Wzx. 250 The sugar structures on these LLOs can then be polymerized by the enzyme Wzy to form a larger undecaprenyl-linked O-antigen. This O-antigen is then transferred onto a lipid A carrier by the enzyme WaaL before being displayed on the outer membrane. 250 This process can be engineered in laboratory strains of E. coli by heterologously expressing an LLO biosynthesis pathway and a bacterial OST. This OST will transfer glycans from these LLOs onto target proteins bearing GlycTag acceptor sequences. 119 This process can be optimized by knocking out WaaL in the host strain 119 so that LLOs accumulate on the periplasmic membrane.
This strategy for constructing synthetic OST-dependent glycosylation systems has proven to be a powerful technology, enabling the site-specific installment of diverse glycans onto diverse heterologous proteins both in vitro and in vivo. 122 By overexpressing different naturally occurring or synthetic bacterial O-antigen biosynthesis gene clusters, a wide variety of glycans can be installed using this method. For example, a single study demonstrated the transfer of nine unique glycans by the bacterial O-linked OST PglL. 124 Due to the inherent compatibility of bacterial O-antigen pathways with this system and the somewhat relaxed sugar specificity of bacterial OSTs, most applications of OST-dependent bacterial glycosylation systems have sought to synthesize vaccines against pathogenic bacteria, with vaccines against Shigella and E. coli in clinical trials. 122 The discovery and engineering of N-linked OST variants with greater promiscuity for acceptor sequons 59,86,87 (not requiring a negatively charged residue at the X−2 position) or LLO donors 59 (not requiring an acetyl group at the C2 position of the reducing sugar) has expanded the set of glycoproteins that can be generated using this strategy. In pioneering work, the eukaryotic core Man3GlcNAc2 glycan has also been successfully transferred by overexpressing part of the yeast LLO biosynthesis pathway, 234 opening the door to the production of glycoproteins with human-like glycosylation. Unfortunately, even after optimization, 251 current bacterial N-linked OSTs still exhibit low turnover rates with LLOs containing the GlcNAcβ1,4GlcNAc chitobiose core (found in all eukaryotic N-linked glycans) at the reducing end. 118 Future protein engineering and phylogenetic screening efforts are expected to reveal new N-linked OSTs that can enable the more efficient synthesis of eukaryotic glycoproteins using bacterial systems.
OST-independent glycosylation systems such as NGTs, OGTs, and GalNAcTs have been far less explored for bacterial glycoengineering than OST-dependent systems. As previously described, the stepwise and lipid-independent nature of these systems may provide complementary technologies to OST-dependent techniques. 78 NGTs are particularly promising glycoengineering tools because they are the only known cytoplasmic enzyme class capable of installing glycans onto asparagine residues at eukaryotic-like N-X-S/T sequons. 44,46,168,169 For example, ApNGT has been functionally expressed in E. coli where it was found to glycosylate several autotransporter proteins, some native E. coli proteins, and recombinant human erythropoietin (EPO). 168 Other studies have developed short, optimized GlycTag sequences for NGT 80 and have shown that the modification of a target protein with these GlycTags (such as GGNWTT) can successfully direct efficient NGT glycosylation of diverse recombinant proteins in vivo and in vitro. 80,160,173 Later studies have found that the single glucose residue installed by NGT can be elaborated to a dextran polymer 160 (which could be useful for vaccines against pathogenic bacteria that use NGTs to adhere to human cells), polysialic acids 131 (which may prolong the serum half-life of small therapeutic proteins), N-acetyllactosamine (LacNAc), 174,175 and other fucosylated and sialylated forms of lactose 174,175 by overexpression of elaborating GTs within the cell. 131 This sequential elaboration technique may also allow an NGT-based system to circumvent the limits on glycan structure found in OST systems. However, the inability of NGTs to utilize UDP-GlcNAc or UDP-GalNAc sugar donors has complicated their application to the production of authentic N-linked and O-linked human glycans which have GlcNAc and GalNAc as their reducing end sugars, respectively. 
Thus far, naturally occurring and engineered NGTs have been shown to utilize UDP-glucosamine (GlcN), 171 UDP-Glc, UDP-Gal, UDP-Xyl, GDP-Glc, and GDP-Man. 169,252 The discovery or engineering of NGTs capable of transferring the acetylated sugars GlcNAc and GalNAc remains an active area of research. 80,171,252 Aside from NGTs, human O-GalNAcTs and OGTs have also been transferred to E. coli in order to produce glycoproteins in bacterial systems. Specifically, GalNAcT2 has been transferred to E. coli strains with oxidizing cytoplasms to enable modification with O-GalNAc. 253 This system was later improved to enable the modification of proteins with Core 1 (Gal-GalNAc-Ser/Thr) within cells. 235 O-GlcNAc modified proteins have also been produced in E. coli by coexpression of OGT with a target protein. 236

Cell-Free Synthetic Glycosylation Systems. Cell-free protein synthesis (CFPS) systems use cell lysates, amino acids, nucleic acids, and cofactors to produce proteins without intact cells. 2,254 First used to decipher the genetic code in the 1960s 255 and throughout the late 20th century for fundamental biology studies, 256,257 E. coli crude lysate-based CFPS technologies experienced a technical renaissance in the mid-2000s 254,258−260 with the ability to use less costly reagents, 261 sustain synthesis for days, 262 produce protein in g/L quantities, 263,264 and make far more diverse products including integral membrane proteins, 265 268 CFPS reactions are scalable over 6 orders of magnitude. 254 The compatibility of CFPS with 96-well plates, liquid handling robots, and microfluidic platforms provides an attractive high-throughput protein expression platform. 
254,289 While no FDA-approved protein therapies have been made in CFPS so far, cell-free systems still hold great promise for glycoengineering because they serve as an intermediate point between bacterial systems and completely purified in vitro synthesis, enabling the production and study of complex biological molecules with greater control and simplicity of handling. Although certain CFPS systems based on mammalian cell lines allow for some level of glycosylation that can be increased by the addition of microsomes, 290−293 CFPS systems based on bacterial lysates (the most well-described, economically viable, and highest-yielding CFPS system) were unable to produce glycoproteins until recently.
Bacterial cell-free protein glycosylation systems introduce glycosylation machineries from across the domains of life into bacterial lysates. In 2011, the first bacterial cell-free glycoprotein production system was developed by adding purified CjPglB and LLOs to a completed E. coli-based CFPS reaction. 294 Building upon this work, a single-pot, Cell-free Glycoprotein Synthesis (CFGpS) platform was developed that simultaneously synthesized and glycosylated target proteins in vitro. 239 In this study, CFGpS was used to install a variety of glycans including the C. jejuni heptasaccharide and the eukaryotic core Man3GlcNAc2 onto glycoproteins by overexpressing plasmids encoding CjPglB and the LLO biosynthesis pathways in the bacterial chassis strain before lysis and then expressing the target protein in CFPS reactions containing these lysates. 239 By overexpressing various bacterial O-antigen gene clusters, this all-in-one CFGpS platform has recently been used to synthesize a variety of glycoconjugate vaccines using freeze-dried lysates that can be rehydrated at the point of care. 238 Whereas the CFGpS method utilizes enzymes and LLOs synthesized in living cells to produce preparative quantities of glycoproteins in vitro, other efforts in cell-free systems have sought to use the flexibility and throughput of CFPS to better understand and engineer synthetic glycosylation pathways. For example, one study overcame the difficulties associated with expressing OSTs (which are integral membrane proteins containing 13 transmembrane helices) in living bacterial cells by expressing several active bacterial N-linked OST homologues in CFPS by supplementing extracts with protein−lipid nanodiscs. 237 Other works have focused on the development of OST-independent cell-free glycosylation systems based on NGTs, OGTs, GalNAcTs, etc. 
to completely decouple glycosylation pathway construction from living cells by using enzymes generated in CFPS to build glycans step-by-step from sugar donors. A recent study of OST-independent glycosylation systems used CFPS and high-throughput mass spectrometry of self-assembled monolayers to develop a platform for Glycosylation Sequence Characterization by Rapid Expression and Screening (GlycoSCORES). 80,83 GlycoSCORES has been used to rigorously characterize the acceptor sequence specificity of NGTs, GalNAcTs, and human OGT and then leverage this information to design GlycTags that were more efficiently modified by ApNGT than naturally occurring glycosylation sites, both in vitro and in the E. coli cytoplasm. The GlycoSCORES method has also been adapted to analyze intact glycoproteins, enabling the high-throughput synthesis and analysis of target protein variants with glycosylation sites at different positions. 49 While GlycoSCORES enabled optimization of the initiating step of glycosylation, CFPS has also been used to develop a method for multienzyme Glycosylation Pathway assembly by Rapid In vitro Mixing and Expression (GlycoPRIME). 174 The GlycoPRIME system uses CFPS to enrich crude bacterial lysates with GTs which are then combined in a mix-and-match fashion to construct new glycosylation pathways. In this way, 37 putative synthetic glycosylation pathways initiated by ApNGT were rapidly tested in vitro, leading to the development of biosynthetic routes to 23 distinct glycosylation structures. These pathways were then translated to the cytoplasm of living bacteria to produce sialylated IgG Fc or to a one-pot CFPS-driven CFGpS system where all enzymes and the target protein were simultaneously synthesized in vitro. 174 The continued development of cell-free glycosylation systems will enable new applications in GT characterization and engineering, biosynthetic pathway prototyping, and on-demand production of therapeutics and vaccines.
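The mix-and-match logic of pathway prototyping can be illustrated with a small enumeration. The Python sketch below starts from the single N-linked glucose installed by an initiating NGT and lists which linear glycans each combination of elaborating enzymes could in principle build. Everything in it is a hypothetical simplification: the enzyme labels, the acceptor rules, and the restriction to linear glycans are assumptions made for illustration and are not the enzyme set, specificities, or pathway counts of the GlycoPRIME study.

```python
from itertools import combinations

# Hypothetical, highly simplified elaboration rules (not the enzymes used in
# the study): enzyme label -> (terminal sugar it requires, sugar it adds).
RULES = {
    "GalT": ("Glc", "Gal"),
    "SiaT": ("Gal", "Sia"),
    "FucT": ("Gal", "Fuc"),
    "GlcNAcT": ("Gal", "GlcNAc"),
}

# Glycans are tuples read from the protein-linked sugar outward.
START = ("Glc",)  # single N-linked glucose installed by the initiating NGT


def reachable(enzymes, start=START, max_len=4):
    """Return every linear glycan an enzyme mix can build from `start`."""
    seen = {start}
    frontier = [start]
    while frontier:
        glycan = frontier.pop()
        if len(glycan) >= max_len:
            continue
        for enzyme in enzymes:
            needs, adds = RULES[enzyme]
            if glycan[-1] == needs:
                extended = glycan + (adds,)
                if extended not in seen:
                    seen.add(extended)
                    frontier.append(extended)
    return seen


def screen(max_enzymes=2):
    """Map every mix of up to `max_enzymes` elaborating enzymes to its products."""
    results = {}
    for k in range(1, max_enzymes + 1):
        for mix in combinations(sorted(RULES), k):
            results[mix] = reachable(mix)
    return results


if __name__ == "__main__":
    for mix, glycans in screen().items():
        products = sorted("-".join(g) for g in glycans)
        print("+".join(mix), "->", products)
```

In this toy model, a mix lacking the enzyme that acts on glucose yields no new structures, which mirrors why sequential pathways must be assembled in a compatible order. Real prototyping replaces the rule table with experimental readouts for each enzyme combination.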
Chemoenzymatic Protein Glycosylation Methods. While biosynthetic methods for glycoprotein production can be operated at large scales and take advantage of endogenous protein synthesis machinery, they often result in heterogeneous mixtures of various glycoforms. These heterogeneous mixtures complicate structural and functional studies as well as the characterization and approval of therapeutics. 206 To address this problem, many chemical and chemoenzymatic synthesis strategies have been developed to produce structurally homogeneous glycoproteins. This section discusses key methodologies employing chemical synthesis methods for glycoprotein research and production. The reader can find more detailed reviews elsewhere. 84,206 One way to synthesize homogeneous glycoproteins is to remodel native glycan structures, typically N-glycans, found on recombinantly produced proteins (usually derived from CHO or yeast cells). Glycans can be "polished" in vitro by adding exoglycosidases and/or GTs 295 to edit glycans in a user-defined way. An advantage of performing these polishing steps in vitro is the ability to incorporate abiological or modified sugar monomers or PEGylation as a strategy for functionalization. 296,297 However, achieving homogeneous, human N-glycosylation structures generally requires that the native glycan is enzymatically trimmed to the reducing end GlcNAc residue and then built back up to create the desired uniform structure using glycosyltransferases to sequentially add sugars 298 or by transferring a chemically synthesized glycan en bloc using an endoglycosidase. 299 Specifically, a class of endoglycosidases called endo-β-N-acetylglucosaminidases (ENGases) that naturally cleave N-glycans from proteins between the reducing end GlcNAcs have been repurposed to catalyze the reverse reaction to form a glycosidic bond between the released N-glycan and the GlcNAc residue on the protein. 
299 One particular benefit of this synthetic method is the conservation of the native sugar linkages. This technology, known as transglycosylation, has become an increasingly efficient synthesis strategy through the use of synthetic sugar oxazolines as improved glycosyl donors 300,301 and the discovery of mutant ENGases with more specific activities. 302,303 A similar, but more "bottom-up" application of the transglycosylation approach (i.e., enzymatic "tag and modify") is to obtain the protein-linked monosaccharide substrate for ENGases from bacterial cells or directly from an in vitro enzymatic reaction rather than by truncating a eukaryotic N-glycan. For example, CjPglB can be used to install a single N-GlcNAc (using synthetic lipid substrates 121 or trimming down a larger glycan installed in living E. coli 241 ) which is then elaborated to a eukaryotic glycan using transglycosylation methods. Transglycosylation has been used to elaborate a protein-linked O-GlcNAc residue installed by OGT in living bacteria 240 and a peptide-linked N-Glc installed by ApNGT to generate eukaryotic-like N-glycans. 170 A variation of this method using an engineered NGT (ApNGT Q469A ) to install GlcN along with an acetyltransferase (GlmA) enabled the synthesis of an authentic human N-linked glycopeptide with GlcNAc at the reducing end. 172 The discovery of NGT homologues with unique and conditionally orthogonal peptide acceptor specificities combined with transglycosylation strategies has recently enabled the sequential, site-specific installation of multiple distinct glycans on a single target protein. 83 While further efforts are needed to enhance efficiency of such an approach, this advances a new concept for synthesizing defined glycoproteins for research and therapeutic applications.

Figure 5 (caption fragment). [...] 309 and increasing the circulation times of enzyme replacement therapeutics by precise manipulation of terminal glycosylation structures, 28 erythropoietin by introduction of additional glycosylation sites, 24 and Factor IX by GlycoPEGylation. 296,310 (b) Synthetic glycosylation systems have produced bacterial, fungal, and viral vaccines carrying glycan epitopes specific for these infectious diseases; 311,312 cancer vaccines carrying tumor-associated carbohydrate antigens; 313 as well as protein and nanoparticle vaccines adjuvanted by glycan structures such as the αGal motif. 22,314−318 (c) Synthetic glycosylation systems have also been used to generate diagnostic assays to detect bacterial infections 122 and cancer. 319 (d) Finally, glycoengineering has enabled the production of functional glycomaterials including biomaterials that control and promote tissue growth, 64,66,320 self-assembling glycopeptides that form nanofibers 321,322 and bind to galectins, 65 and virus-like particle vaccines. 175,323 Engineering of the glycocalyx as a glycomaterial by overexpression of mucin proteins has generated mammalian expression hosts with decreased aggregation. 324

The incorporation of specific natural and noncanonical amino acids at desired glycosylation sites can also provide chemical handles for modification of proteins. This chemical "tag and modify" strategy has been used with a wide variety of chemistries and reactive amino acids, including cysteine residues or noncanonical amino acids carrying azide−alkyne click chemistry handles. 242 In one particularly compelling example, dehydroalanine (Dha) residues inserted using an orthogonal translation system in E. coli were harnessed to generate stabilized radicals that could be used to introduce many post-translational modifications including both N- and O-linked GlcNAc residues that differ only by one carbon from natural structures. 
6 The modification of both natural and noncanonical amino acid handles has permitted the site-specific installation of multiple distinct glycans. 304 In addition to modifying recombinant proteins, these glycan remodeling tools can be interfaced with chemical peptide synthesis methods. For a few glycoproteins, complete chemical synthesis of homogeneous glycoproteins has been demonstrated using ligation and modification of peptides produced by solid-phase peptide synthesis (SPPS). 25,245 The types of glycans that can be generated by chemical or chemoenzymatic synthesis have been greatly expanded by the development of automated glycan assembly (AGA) platforms and commercially available synthesizers. 305,306 At present, these systems can generate increasingly complex structures ranging from GAGs 307 to biantennary glycans 308 that could be used for many different glycoengineering applications. However, the site-specific coupling of these glycans onto proteins remains a key challenge that must be overcome in the various ways discussed above. While chemical and chemoenzymatic synthesis are promising routes to homogeneous glycoproteins for study, these approaches require large quantities of purified enzyme and nucleotide-activated sugar donor substrates or many protection and reaction steps. Further development will be required to simplify and scale these reactions before they can be widely adopted as practical means for industrial-scale production of glycoproteins. 204

■ APPLICATIONS OF SYNTHETIC GLYCOSYLATION SYSTEMS
Synthetic glycobiology has been used in a wide variety of applications. This section describes selected applications of the synthetic glycoprotein production systems described above to solve compelling problems in the fields of therapeutics, vaccines, diagnostics, and glycomaterials (Figure 5).
Glycoprotein Therapeutics. Synthetic protein glycosylation systems, particularly those in mammalian cells, have been applied in numerous ways to the production of glycoprotein therapeutics. Here, we highlight three key application areas: the study and modulation of antibody therapeutic ADCC activities, the improvement of protein therapeutic delivery and circulation time, and the development of portable or on-demand protein therapeutic production systems. More complete reviews of the application of glycoengineering to protein therapeutics can be found elsewhere. 23,325,326 Many antibody-based therapeutics, like those used to treat cancers, direct the patient's immune system to attack targeted cells by antibody-dependent cell-mediated cytotoxicity (ADCC). 187 ADCC activity requires the binding of FcγRIIIa receptors present on natural killer (NK) cells to the Fc region of the antibody therapeutic. In 2002, a pivotal study showed that antibodies derived from Lec13 CHO cells (which produce IgG antibodies with significantly reduced levels of α1−6 fucosylation on the reducing end GlcNAc of the N-glycan present at Asn297 of the Fc domain) bind FcγRIIIa 50 times more tightly than IgGs produced in standard CHO cells. 327 Further testing confirmed that this tighter binding is only observed when the FcγRIIIa receptor itself is glycosylated, indicating the importance of glycan−glycan interactions. 328 Many later studies have used chemoenzymatic transglycosylation methods to generate homogeneous IgG glycosylation structures for functional analysis, providing critical design rules for optimizing ADCC activity. 187 Since these pioneering studies, the number of clinical trials investigating antibodies lacking core fucosylation has grown rapidly. As described in a recent review, 309,26 afucosylated antibodies have been investigated in clinical trials, and three have already been approved with indications in lymphoma and severe asthma. 
These three approved antibodies are produced either by overexpression of the bisecting GlcNAc transferase GnT-III and αMan-II, which prevents core fucosylation by Fut8, or by direct knockout of Fut8 in CHO cells. 309 The intentional engineering of protein glycosylation structures has also been shown to increase the stability and circulation time of protein therapeutics. 329 While the effect of glycosylation on each protein may be different, studies have generally concluded that the stabilizing effect of glycoengineering for therapeutics is achieved by (i) preventing denaturation, aggregation, and degradation by shielding protein regions that are unstructured, hydrophobic, or susceptible to proteases; 329 (ii) increasing the molecular weight and hydrodynamic radius of the molecule to prevent kidney filtration; 8 (iii) removing immunogenic glycan motifs to prevent clearance by the immune system; 26 and (iv) capping or removing terminal motifs that are selectively cleared by human lectins. 28 Several key examples showing how these mechanisms have been used to increase glycoprotein therapeutic stability are described below.
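To make mechanism (ii) concrete, the following minimal Python sketch estimates how much mass a set of attached glycans adds to a protein. It is an illustration under stated assumptions, not a method from the studies cited here: the function names, the 18 kDa polypeptide mass, and the sialylated, core-fucosylated biantennary composition are hypothetical choices, and the calculation ignores hydrodynamic radius and charge, which also influence kidney filtration. The monosaccharide values are standard average residue masses.

```python
# Average residue masses (Da) of monosaccharides as they occur inside a glycan
# (free sugar minus one water, lost on forming each glycosidic bond).
RESIDUE_MASS_DA = {
    "Hex": 162.1406,     # hexose, e.g. mannose or galactose
    "HexNAc": 203.1925,  # N-acetylhexosamine, e.g. GlcNAc or GalNAc
    "dHex": 146.1412,    # deoxyhexose, e.g. fucose
    "NeuAc": 291.2546,   # N-acetylneuraminic (sialic) acid
}


def glycan_added_mass(composition):
    """Return the mass (Da) that one attached glycan adds to a protein.

    `composition` maps residue names to counts. Because attachment to the
    amino acid side chain is itself a condensation, the added mass is simply
    the sum of the residue masses.
    """
    return sum(RESIDUE_MASS_DA[name] * count for name, count in composition.items())


def glycoprotein_mass(protein_mass_da, composition, n_sites):
    """Return the mass (Da) of a protein carrying `n_sites` identical glycans."""
    return protein_mass_da + n_sites * glycan_added_mass(composition)


# Hypothetical inputs, chosen only for illustration.
BIANTENNARY_SIALYLATED = {"HexNAc": 4, "Hex": 5, "NeuAc": 2, "dHex": 1}
PROTEIN_MASS_DA = 18_000.0

if __name__ == "__main__":
    for n_sites in (0, 3, 5):
        total = glycoprotein_mass(PROTEIN_MASS_DA, BIANTENNARY_SIALYLATED, n_sites)
        print(f"{n_sites} glycans: {total / 1000:.1f} kDa")
```

Under these assumptions each glycan adds roughly 2.35 kDa, so every additional occupied site raises the mass of a small protein by more than 10%. Real therapeutics carry heterogeneous and often larger glycans, so this should be read as a lower-bound style estimate rather than a prediction for any specific drug.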
In a landmark study in 2003, the introduction of two additional glycosylation sites into human erythropoietin (EPO) by mutation of the native amino acid sequence and expression in CHO cells provided increased in vivo activity and prolonged serum half-life, eventually leading to the development of the drug darbepoetin alfa. 24 This study, 24 along with later works using chemoenzymatic synthesis, 245 indicates that the glycans in EPO cover hydrophobic patches on the protein and increase the molecular weight of the overall molecule, preventing aggregation and clearance. Glycans containing sialic acids have been shown to be particularly effective at stabilizing EPO and other therapies. 245,330 The negative charge of sialic acids is thought to prevent aggregation by creating a repulsive force between therapeutic molecules and to reduce kidney filtration. 330 Accordingly, polysialylation of therapeutics has been shown to significantly increase half-life. 331,332 Similar increases in half-life can be obtained by the conjugation of polyethylene glycol (PEG) to therapeutics. 333 While most methods of PEGylation involve direct modification of amino acids, this can also be accomplished using glycans as a conjugation point. 296,297 This "glycoPEGylation" method has been implemented by modifying Factor IX in the cytoplasm of bacteria and then using a sialyltransferase to conjugate a PEGylated sialic acid moiety in vitro, leading to the approved therapy Rebinyn. 310 In contrast to the general stabilization provided by the glycans on EPO, other examples of glycoengineering involve the removal of specific glycan motifs that cause an immune response or clearance. For example, the presence of α-galactose motifs at the glycan termini of the antibody therapeutic cetuximab expressed in murine cells was shown to generate a strong immune response and even anaphylaxis. 
26 In this case, expression in CHO cells (which do not express large amounts of the α-1,3 galactosyltransferase) produced therapeutics without this immunogenic motif. 26 Other glycoengineering efforts seek to remove or cap glycan motifs which are not immunogenic but are selectively cleared by human lectins, leading to shorter circulation times. 8 Specifically, terminal galactose or mannose residues are often associated with clearance as they are bound by asialoglycoprotein receptors and mannose receptors. 8,28 A recent study systematically compared the properties of α-galactosidase A (a lysosomal replacement enzyme for Fabry disease) with a wide variety of glycosylation structures 28 generated using CRISPR/Cas9 glycoengineered CHO cell lines. Previous enzyme replacement therapies have been glycoengineered to contain terminal mannose or mannose-6-phosphate for cellular targeting. However, the presence of these terminal mannose residues also shortens half-life and directs the protein therapeutic immediately to the liver and spleen. In this study, the researchers generated dozens of unique CHO cell lines (knocking out 46 genes individually or in parallel) to generate different glycoforms of α-galactosidase A, which they then tested in a mouse model to determine the optimal glycan for the desired biodistribution profile. They found that a biantennary glycan terminated with α2−3 sialic acids (rather than terminal mannoses or α2−6 sialic acids) increased circulation time and enabled drug delivery to harder-to-reach organs such as the heart. 28 In addition to optimizing the molecular structure of protein therapeutics, the development of synthetic glycosylation systems in alternative (nonmammalian) hosts holds great potential in facilitating distributed, on-demand, and more cost-effective production of therapeutics. Most development for these applications has focused on plant, yeast, bacterial, and cell-free expression systems. 
For example, a recent study reported the use of glycoengineered N. benthamiana plants to produce an antibody cocktail protective against Ebola virus. 334 The plants were engineered to avoid nonmammalian α1,3 Fuc and β1,2 Xyl epitopes and produced approximately 80% afucosylated complex-type glycans. After purification, a cocktail of three IgGs produced in these plants was effective in preventing Ebola infection. In fact, these IgGs were more effective than similar IgGs produced in CHO cells (likely because of the lack of core fucosylation on the IgGs produced in plants). 334 Another study showed that dried IgA glycoproteins produced in engineered P. pastoris yeast cells and administered orally without purification were effective in preventing gastrointestinal infection in a pig model. 335 Glycoengineered yeast have also been directly integrated with an on-demand protein production, purification, and formulation system. 336 Due to their low cost and relative simplicity, bacterial glycoprotein production strategies using OST-dependent and OST-independent synthetic glycosylation systems may be useful in the more cost-effective and distributed production of therapeutics. 174,175,238,239 Cell-free glycoprotein production systems may be especially amenable to distributed manufacturing as they can be freeze-dried and reactivated to produce glycoproteins at the point of care. 2,238,337,338 For example, freeze-dried CHO cell lysates have been implemented to synthesize, purify, and formulate various therapeutics on demand. 339 One-pot bacterial cell-free glycoprotein production systems have been shown to generate glycoproteins with the eukaryotic trimannose core glycan, 239 glycoconjugate vaccines with O-antigen bacterial glycans, 238 a vaccine candidate with an adjuvanting α-galactose glycan, 174 and proteins modified with minimal sialic acid motifs with possible utility in stabilizing therapeutics. 174
Glycoprotein Vaccines. Glycoprotein vaccines leverage the roles of carbohydrates in disease to train the immune system to respond when it encounters specific glycans. A glycoconjugate vaccine comprises three main parts: the carrier protein, the glycan antigen, and the adjuvant. While glycans have been developed as vaccine candidates, a polysaccharide antigen alone has poor immunogenicity and results in a T-cell-independent immune response that does not generate an IgM-to-IgG transition. When the polysaccharide is covalently conjugated to a carrier protein, however, the body is able to generate long-term B-cell memory of the vaccine and protect the recipient, which is particularly important for vaccine efficacy in infants. 340 Commercially approved carrier proteins are typically inactivated toxins that can improve immunogenicity of the vaccine. 341 An adjuvant molecule is then usually coformulated with the vaccine or covalently attached for immune system stimulation. Since the first antibacterial glycoconjugate vaccine was approved in the 1980s, 342 great strides have been made to enable protection against a wide range of diseases.
All currently licensed glycoconjugate vaccines protect against bacterial infections, with targets including Haemophilus influenzae type B, multiple serotypes of Streptococcus pneumoniae, and Neisseria meningitidis. 311 The corresponding antigens are typically either capsular or O-antigen polysaccharides, which decorate the cell surface of the pathogenic bacteria and are presented to the body during infection. 343 Current industrial processes involve culturing pathogenic bacteria and extracting the LLOs. The LLO is then chemically linked to recombinantly produced carrier proteins following additional chemical priming and processing. In addition to requiring the use of pathogenic bacteria, this process is expensive and typically employs nonspecific conjugation, resulting in heterogeneous products. 344 Thus, there has been a compelling opportunity to use glycoengineering solutions to improve the process and enable future generations of glycoconjugate vaccine molecules.
In vivo production in E. coli is the primary glycoengineering strategy to produce antibacterial vaccines. As the bacterial polysaccharide antigens of interest are typically large structures consisting of multiple repeating units of smaller sugar motifs, en bloc transfer by OST-dependent glycosylation systems has been employed for protein modification. This bioconjugation or protein glycan coupling technology (PGCT) involves expression of the LLO biosynthesis pathway, carrier protein, and OST to create a glycoconjugate product in vivo that can then be purified. 311 The N-linked OST PglB has been the most commonly used enzyme for this purpose, successfully producing vaccine candidates against Shigella flexneri 2a, 345 extraintestinal pathogenic E. coli, 346 Burkholderia pseudomallei, 347 E. coli O157, 348 Francisella tularensis, 349,350 Staphylococcus aureus, 351 and Streptococcus pneumoniae. 352,353 As discussed previously, limitations in the diversity of sugar donor substrates that can be utilized by PglB have been circumvented by using O-linked OSTs to produce glycoconjugate vaccines in vivo. Specifically, PglL has been used to produce vaccine candidates against Shigella flexneri 2a 88 and Salmonella enterica serovar Paratyphi, 354 while PglS has been used to recombinantly produce vaccine candidates against Streptococcus pneumoniae 90 as well as hypervirulent Klebsiella pneumoniae. 355 Vaccines protecting against fungi, parasites, and viruses, which are commonly decorated with glycoproteins or glycans, have also been developed, primarily with chemical synthesis strategies. Antifungal conjugate vaccines have been developed to protect against C. neoformans using the major natural capsular polysaccharide, glucuronoxylomannan (GXM). 356 Due to challenges with natural polysaccharide structures, shorter synthetic antigens have also been developed for antifungal vaccines and shown to protect against C. neoformans 357 and Candida species. 
358−360 Beta-glucan conjugates have also been investigated as a potential broad-spectrum antifungal vaccine. 361 There are also examples of glycoconjugates protecting against HIV, 362−364 which has a high concentration of oligomannose glycans on its surface, but identifying a successful vaccine that elicits neutralizing antibodies has proven difficult. While parasitic mechanisms of infection are still poorly understood, a Leishmania conjugate vaccine utilizing the lipophosphoglycan cap has also been investigated. 365,366 A recent review has discussed developments for vaccines against these targets. 312 Glycoconjugate vaccines can also be used to direct the immune system against cancers, which specifically display aberrant glycosylation patterns called tumor-associated carbohydrate antigens (TACAs) on their cell surface. 367 While chemical extraction of natural LLOs has been common for the production of antibacterial vaccines, isolation of TACAs is difficult due to expression and glycan heterogeneity. 341 Research on cancer glycoconjugate vaccines has been greatly enabled by novel chemical synthesis strategies. TACAs are found either on glycoproteins such as mucins (Tn, TF, STn, Globo-H, and Lewis Y (Le y)) or on glycolipids such as gangliosides (GM2, GD2, GD3, fucosyl-GM1, Globo-H, Le y). 313 As these lists indicate, some antigens such as Le y can be displayed on either glycoproteins or glycolipids. TACAs have poor immunogenicity, making it even more important to conjugate them to a carrier protein such as keyhole limpet hemocyanin (KLH) 368 that increases the recognition and memory of the presented antigen.
Initial development of cancer vaccines focused on synthetic monomeric vaccines including ganglioside-based antigens GM2, 369 GD3, 368,370,371 and GM3 372 conjugated to a KLH carrier to treat melanoma. Mimicry of the natural presentation of TACAs, which cluster on the cell surface, has been advantageous, particularly for mucin-based vaccines. Multivalent vaccines that present glycopeptide clusters of either Tn, STn, or TF antigens conjugated to KLH have improved immunogenicity over a single presented antigen. 373−375 Multivalent vaccines have also been developed to mimic specific cancer types by combining a range of characteristic antigens in a single vaccine. 376−378 Additional information on cancer vaccines and strategies for engineering TACA presentation on carrier proteins is available in recent reviews. 312,313 The use of adjuvants is critical for eliciting strong immune responses to both protein and glycoconjugate vaccines. However, most adjuvanted vaccines contain simple coformulations of immunostimulatory molecules with antigens, meaning that once these molecules separate in the body, the effect of the adjuvant may be lost. Recently, several glycans have been shown to have adjuvanting effects which could enable site-specifically modified glycoprotein conjugates with self-adjuvanting properties. For example, the αGal motif is an effective self:nonself discrimination epitope in humans and has been shown to confer adjuvant properties when associated with various peptide, protein, whole-cell, and nanoparticle-based immunogens. 22,314−318 The Lewis X motif has been shown to specifically target vaccine antigens to DC-SIGN receptors on dendritic cells, which then present the antigen via the major histocompatibility complex class I-restricted and class II-restricted systems, ultimately leading to increased antigen-specific antibody titers. 
379 A Siaα2−3Gal structure has been shown to enable selective targeting and endocytosis of antigens by binding to Siglec-1 (Sn, CD169) on the surface of macrophages, resulting in increased antigen presentation to T-cells. 380 Ultimately, the ability to produce defined glycoproteins with these self-adjuvanting groups could increase vaccine effectiveness or lead to the development of new vaccines.
Glycoprotein Diagnostics. The important carbohydrate interactions discussed so far have also been leveraged for diagnostics in the form of lectin arrays, 381 glycan arrays, 382 and glycoprotein arrays. 383 This section discusses the synthetic glycoprotein approaches that have been employed to diagnose infectious diseases and to detect cancer biomarkers. Additional glycan diagnostic tools and applications have been recently reviewed elsewhere. 384 Antibodies generated during an adaptive immune response to a bacterial infection have specificity for glycan structures, which is leveraged in conjugate vaccine production. This relationship can also be used to detect the presence of antibodies generated in infected patients. Multiple approaches have used ELISA-based systems for rapid diagnosis, built on glycoproteins made with the PglB OST and the native AcrA acceptor protein. 122 These works have used glycoproteins decorated with E. coli O157, O145, and O121 glycan antigens to diagnose hemolytic uremic syndrome (HUS, an illness caused by Shiga toxin-producing E. coli) 385 and the Yersinia enterocolitica O3 antigen to detect Brucella infections (a common bacterial zoonosis) through specific antibody binding. 386−388 Autoantibodies generated in response to cancer glycoproteins are a promising biomarker for early cancer detection 389 and can also be analyzed via glycoprotein diagnostics displaying cancer glycopeptides. In recent work from Pederson et al., a glycopeptide array was printed using synthetic O-glycosylated mucin fragments. 319 Two different methods were pursued: chemoenzymatic synthesis of short glycopeptides and enzymatic production of larger mucin fusion proteins in E. coli followed by in vitro O-linked glycosylation using GalNAcTs. 319 These works showcase the opportunity to harness multiple glycoprotein synthesis platforms for use in diagnostic applications.
Functional Glycomaterials. Glycomaterials are synthetic molecules including, but not limited to, lipids, polymers, supramolecular structures, and nanoparticles that have been decorated with glycans for use as therapeutics, vaccines, biomimetic materials, and adaptive or nonadaptive infection prophylaxis. In this section, we focus specifically on examples of protein-based materials. Other types of glycomaterials have been reviewed elsewhere. 323 Synthetic glycobiology can enable the design of glycomaterials by providing additional control over glycan spacing, valency, and organization on unique structures not accessible using traditional protein expression or synthetic chemistry approaches. This precise control over glycan display can be useful for recapitulating natural properties, countering challenges faced by current therapeutics (such as the weak affinity of protein-carbohydrate interactions), 38 and providing control over self-assembly properties of nanomaterials. Recent works using synthetic glycosylation systems to generate glycoprotein materials with unique or beneficial properties, many of which cannot be found in naturally glycosylated products or traditional protein scaffolds, are discussed below. On the nanoscale, glycans and glycoproteins are useful for the creation of self-assembling functional materials. Recent work leveraged self-assembling glycopeptides to create nanofibers to control galectin activity, an important consideration for multiple therapeutic applications. 65 A similar strategy employing self-assembling MUC1 glycopeptides to form β-sheet nanofibers has been used to generate a self-adjuvanting anticancer vaccine. 321,322 Sulfated glycopeptide nanostructures can mimic GAG structures and bind and increase bioactivity of glycan-binding proteins such as growth factors. 390 In addition, glycopeptides have been used in self-assembling active polymersomes for drug delivery. 
391 Another glycoprotein material strategy utilizes virus-like particles (VLPs) as supramolecular carrier proteins for vaccine antigens. In recent work, up to 340 copies of the Tn antigen (a common monosaccharide TACA) have been displayed on Q-beta bacteriophage capsids by attachment with click chemistry. 392 On the microscale, engineering cell surfaces as a glycomaterial is emerging as a useful approach to control and study cellular behavior. 31,36 Cellular surfaces are coated in a thick layer of saccharides tethered to glycoproteins and glycolipids called the glycocalyx. Engineering the glycocalyx can be accomplished by chemoenzymatic remodeling of the cellular surface, 393 direct addition of glycomaterial substrates to cells, 394,395 or by engineering the cell to produce and display various glycoproteins. 36,396 Tuning the glycocalyx of mammalian cells has been shown to extensively modulate cellular behavior and responses to mechanical perturbation, which plays a particularly important role in cancer. 397−399 An exemplary application of cellular glycomaterial engineering is the prevention of mammalian cell aggregation in a bioreactor by overexpressing heavily glycosylated mucin proteins on the surface of HEK cells. 324 Diverse cellular functions such as adhesion and, by extension, replication can be similarly modulated using glycocalyx engineering. 400,401 Other efforts in glycocalyx engineering have been recently reviewed. 36,396,402 Finally, a macroscale application of glycoprotein materials involves surface functionalization of biomimetic materials. For example, specific sialoside epitopes chemically incorporated into a collagen biomaterial have selectively directed the fates of mesenchymal stem cells toward osteogenic or chondrogenic states. 
66 ECM proteins decorated with poly-LacNAc glycans are known to interact with several important human galectins (notably Galectins 1, 3, and 8) which mediate cross-linking events that promote and modulate cell growth and adhesion. 64 Thus, glycosylation has also been investigated as a means to create biomimetic materials or smart biomaterial scaffolds for use in regenerative medicine. 320 Heavily modified glycoproteins produced in human cells have also recently been shown to provide a promising glycomaterial lubricant (lubricin). 403

■ FUTURE DIRECTIONS
Driven by a rapidly increasing toolkit of natural and engineered biological parts, improved biosynthetic and analytical methods for testing designs of novel glycosylation systems, and an increasing appreciation for the unique biophysical and immunomodulatory properties that can be obtained using protein glycosylation, the field of synthetic glycobiology has a bright future. Key areas of focus in the upcoming years are likely to be (i) commercialization of highly engineered CHO cell systems for producing therapeutically relevant, homogeneous human glycans, (ii) methods to synthesize diverse glycoproteins in bacteria and in vitro (particularly for vaccines), (iii) the study and application of minimal protein glycosylation structures for stability or immunomodulation, (iv) the development of new therapeutic modalities based on the modulation or targeting of glycan structures in the human body, and (v) the development of glycoprotein-based materials, diagnostics, and other ex vivo applications which become viable with lower-cost, nonmammalian production systems.
The field of synthetic glycobiology is at an important inflection point. Thus far, limitations on our knowledge of glycosyltransferases and low-throughput methods for protein glycosylation pathway construction have led to the engineering of biological systems to contain nearly exact replicas of natural glycosylation systems. While this is certainly an important approach because it can help ensure that obtained structures and biological activities match those in nature, it also constrains the simplicity, robustness, and available design space of structures and pathways that can be exploited for societal and commercial benefit. We believe that increases in fundamental understanding of natural systems as well as improved methods to build and test glycoproteins for desired properties will drive the field toward a new generation of glycoengineering strategies that move beyond recapitulating pathways found in nature to the simplified and tailored design of glycoproteins with desired properties.

■ KEY CONCEPTS
Sugar donors: Activated sugar donors are made up of saccharides that are linked to lipids or to nucleoside mono- or diphosphates of uridine (UDP-), guanosine (GDP-), cytidine (CMP-), thymidine (TDP-), or adenosine (ADP-). The sugar donor can either be a simple monosaccharide or a more complex polysaccharide structure built by multiple elaborating glycosyltransferases. More complex sugar donors are typically built on lipids before polypeptide modification and are referred to as lipid-linked oligosaccharides (LLOs).
Polypeptide glycosyltransferases (ppGTs): Glycosyltransferases that conjugate sugar donors to amino acid side chains within proteins.
Oligosaccharyltransferases (OSTs): A class of membrane-bound glycosyltransferases that conjugate lipid-linked oligosaccharides (LLOs) onto a protein by an en bloc transfer mechanism.
Elaborating glycosyltransferases: Glycosyltransferases that transfer monosaccharides from sugar donors to other sugars. These glycosyltransferases build sugar structures either on proteins (following polypeptide modification by a polypeptide glycosyltransferase) or lipids (prior to transfer by an oligosaccharyltransferase).
Sequon: A sequence of amino acids within a target protein that contains a glycosylation site and is necessary for glycosylation. Sequons that are intentionally introduced into proteins by altering primary amino acid sequences are known as Glycosylation Tags (GlycTags).
Glycosidases: Enzymes that hydrolyze glycosidic bonds between sugar monomers.
Lectins: Proteins that bind to specific sugar structures.
Synthetic glycobiology: The application of synthetic biology tools and design principles to better understand and engineer glycosylation.

■ REFERENCES
ACS Synthetic Biology pubs.acs.org/synthbio Review