Jacob F. Wardman (a, b) and Stephen G. Withers* (a, b, c)
a Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada. E-mail: withers@chem.ubc.ca
b Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
c Department of Chemistry, University of British Columbia, Vancouver, BC V6T 1Z1, Canada
First published on 23rd May 2024
Carbohydrate-active enzymes (CAZymes) constitute a diverse set of enzymes that catalyze the assembly, degradation, and modification of carbohydrates. These enzymes have been fashioned into potent, selective catalysts by millennia of evolution, and yet are also highly adaptable and readily evolved in the laboratory. To identify and engineer CAZymes for different purposes, (ultra)high-throughput screening campaigns have been frequently utilized with great success. This review provides an overview of the different approaches taken in screening for CAZymes and how mechanistic understandings of CAZymes can enable new approaches to screening. Within, we also cover how cutting-edge techniques such as microfluidics, advances in computational approaches and synthetic biology, as well as novel assay designs are leading the field towards more informative and effective screening approaches.
CAZymes have found a number of applications beyond what Nature has developed them for. At an industrial level, CAZymes can play roles in biofuel production,6 textile processing,7 and the production of prebiotics8 and other designer food sugars.9 Applications in biomedical science include the re-modelling of blood and organ antigens,10–13 development of longer lasting biopharmaceuticals,14,15 and the manipulation and analysis of protein glycosylation patterns to provide more potent biopharmaceuticals.16,17 In addition, CAZymes often provide useful fundamental research tools for deciphering the roles of specific carbohydrate and glycoconjugate structures in biological systems (often called the glyco-code).4,18,19
The central role of carbohydrates in cellular structure, signaling and metabolism at all levels of life has led to an incredible diversity of enzymes for glycan biosynthesis and degradation. Consequently, Nature can often provide us with ready-made enzymatic solutions for our specific needs since, in many instances, organisms have spent millennia evolving enzymes to carry out the exact reactions that a researcher is interested in. Such enzymatic diversity can be accessed through creation of synthetic gene libraries or through methods such as functional metagenomics (in which random fragments of bacterial DNA are cloned into laboratory strains of bacteria for expression and screening).20 However, enzymes obtained from Nature may not always have the exact desired activity or specificity. Directed evolution – for which the Nobel Prize in Chemistry was awarded in 2018 – has offered a useful way of modifying these enzymatic activities for different purposes.21 In principle, directed evolution subjects genes to accelerated evolution through mutagenesis and subsequent screening/selection for the transformations or properties of interest to the researcher.
In order to find an enzyme for a desired purpose it is typically necessary to screen many different clones containing sequence variants. High-throughput screening is thus essential in CAZyme discovery and engineering campaigns. While differing in the source material, the screens applied in directed evolution, interrogation of synthetic gene libraries, and functional metagenomics are similar in their design and often interchangeable. This review will cover both recent advances and general principles in the field of (ultra)high-throughput screening for CAZymes. In so doing it will highlight both the scientific principles behind the approaches and the creativity of the authors in the development of these methods. Additionally, we will highlight underexplored reaction spaces and anticipated next steps in the field.
Outside of these primary classifications within the CAZy database are enzymes such as transglycosylases (TGs) and glycoside phosphorylases (GPs). TGs are mechanistically similar to GHs and classified within GH families. Rather than hydrolysis of glycosidic linkages, TGs preferentially catalyze the transfer of sugars to different non-water acceptors.27 TGs have been of great interest for carbohydrate synthesis as they do not require the same costly sugar donors as GTs since they can often use relatively inexpensive feedstocks such as sucrose.9,27,28 GPs catalyze the reversible phosphorolysis of carbohydrates and are found in several GH and GT families.29 The reversibility of this reaction is of synthetic utility as one can drive glycan synthesis through use of an excess of relatively accessible sugar-1-phosphates and/or removal of phosphate or glycan polymer product.30
While GHs are useful for a number of purposes – principally glycan degradation – the engineering of GHs as biosynthetic catalysts bestows additional utility. Towards this end, classes of redesigned enzymes such as glycosynthases31 and (thio)glycoligases32 have been developed (Fig. 2a and b). Both of these classes of enzymes were developed by the Withers laboratory based upon a mechanistic understanding of GHs in order to create mutants that are crippled hydrolytically but can perform glycan synthesis with the appropriate substrates.31,32 Use of these engineered enzymes may be preferred over GTs – the natural enzymes for glycan synthesis – since these mutant GHs are typically easily expressed and do not need expensive nucleotide/lipid-linked donor sugars for glycan biosynthesis. While this may also be true of TGs, an advantage of these engineered enzymes is that they do not degrade their reaction products – and can thus offer improved yields.28 Classically, glycosynthases and (thio)glycoligases can be developed from GHs through mutagenesis of the catalytic nucleophile and the catalytic acid/base residues respectively.31,32
Glycosynthases accomplish glycoside bond synthesis by using a glycosyl fluoride (1-fluoro sugar) of the opposite anomeric configuration to that of their native substrate (Fig. 2a).28,31 This activated sugar mimics the glycosyl-enzyme covalent intermediate found in the mechanism for retaining GHs (Fig. 1b), setting the enzyme up to accept different nucleophiles and form new glycosidic linkages.28 The glycosynthase concept has found widespread use for variants of Koshland GHs that carry out their reaction through a neighbouring group participation mechanism involving an oxazoline intermediate (Fig. 1c and 2b). Mutants of such enzymes lacking the oxazoline-stabilising residue can carry out glycosyl transfer using an oxazoline-bearing glycan donor, but are unable to hydrolyse the product so formed.28 These oxazoline donors can be readily prepared from glycans bearing an N-acetyl sugar at the reducing end using a mild and selective chemical reagent.33 And so, in this way, complex glycan donors – often derived from N-glycans that have been synthesised or isolated from natural sources – can be prepared and used in the synthesis of a variety of products, most notably glycoproteins.28
Thioglycoligases are mutant retaining glycosidases missing their acid/base residue. When reacted with a glycan donor containing a leaving group such as DNP or F that does not need acid catalysis, they rapidly form the glycosyl-enzyme intermediate (Fig. 2c).32,34–37 In the absence of the catalytic base, however, these enzymes are unable to efficiently turn over the covalent intermediate through hydrolysis. Introduction of a good nucleophile – such as a thiol – that does not need base catalysis instead results in ready turnover and formation of new thioglycosidic bonds.32,34–37 A further variation on this theme is that of the O-glycoligases, which are again mutants at the acid/base position.38,39 To date only mutants of α-glycosidases have been shown to react usefully with oxygen nucleophiles, and thus to function as O-glycoligases. This is thought to be due to the higher reactivity of the β-glycosyl-enzyme intermediate than its α counterpart.38 Further combining these approaches, thioglycosynthases have also been developed through mutation of both catalytic residues.40 However, their application has been relatively limited due to their low reactivity. While this review covers efforts to evolve and improve these enzymes, interested readers are directed to other recent reviews for a more thorough treatment of the transformations which can be catalyzed by these enzymes.18,28
In the absence of a colorimetric screening method, HPLC offers a near universal (albeit time intensive) detection method for carbohydrate modification. Desmet and colleagues have shown the power of this technique for monitoring the effects of modification of TGs in the formation of rare sugars from sucrose.9,48 These products are inherently difficult to screen for as the detection method must be able to distinguish the different stereo- and regioisomers which can be formed.9,48
Agar plate assays provide an accessible method for semi-quantitative screening. Many researchers will be familiar with the use of compounds such as X-Gal to detect β-galactosidase activity as part of blue-white screening. However, this approach can be extended for other CAZymes. As a nice example, an entirely new family of sialidases (GH156) was discovered through functional metagenomic screening of a hot spring library using X-Neu5Ac-containing agar plates.49 Other options exist such as halo assays with indicators such as Congo Red to detect polysaccharide degradation,50,51 or through incorporation of dyed carbohydrates into the media (often carbohydrates with conjugated azo dyes).52–54 An advantage of such screens is that they enable use of a near native substrate. While many approaches are purely qualitative, more quantitative screens have also been carried out using these methods. For example, transglycosylation activities have been engineered using screens based upon a comparison of reaction velocities in the presence/absence of a suitable acceptor.55 While these agar plate screens tend to use a large amount of material, they do enable screening of many variants relatively easily.
With the availability and expertise of suitable core facilities in many institutes, FACS offers an accessible method for ultrahigh-throughput screening of many enzymatic activities. The throughputs of FACS are typically on the order of >10³ s⁻¹, enabling ready screening of very large libraries.56 Further benefits of FACS include an ability to easily multiplex reactions, a limited need for specialized equipment, and quantitative results. A typical FACS screen will involve production or trapping of a fluorescent signal either within the cell or at its surface.56 While this makes FACS amenable to many purposes, it complicates certain screens, as it can require efficient uptake of substrates or surface display and capture of enzymes/substrates, which can be non-trivial.
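To put the quoted sorting rate in perspective, the following is a minimal back-of-the-envelope sketch of how long it would take to sample a library by FACS. The function name, the 3-fold oversampling factor and the example library sizes are illustrative assumptions and do not come from the cited studies; only the 10³ events per second lower bound is taken from the text.

```python
def sort_time_hours(library_size, events_per_second=1e3, oversampling=3.0):
    """Estimate the hours needed to sample a library on a cell sorter.

    events_per_second: 1e3 is the lower bound quoted in the text.
    oversampling: assumed fold-coverage of the library (illustrative only).
    """
    if events_per_second <= 0:
        raise ValueError("events_per_second must be positive")
    total_events = library_size * oversampling
    return total_events / events_per_second / 3600.0


if __name__ == "__main__":
    # Hypothetical library sizes, chosen only to show the scaling.
    for size in (1e5, 1e6, 1e7):
        print(f"{size:.0e} variants: {sort_time_hours(size):.2f} h")
```

Under these assumptions the estimate scales linearly with library size and inversely with sorting rate, so faster instruments or lower oversampling shorten the run proportionally.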
Of considerable recent interest in the CAZyme engineering space has been the use of microfluidic technologies, of which droplet-based microfluidics has been the most widely adopted.57 In most studies, aqueous reaction mixtures containing single clones are encapsulated within an oil medium that acts in the same manner as the walls of a microtiter plate (Fig. 3a).58 The assay is then allowed to proceed in the droplet with subsequent sorting of hits based upon formation of a fluorescent signal (Fig. 3a). A number of other methods (e.g. absorbance, fluorescence anisotropy, mass spectrometry) to measure activity within droplets have also been developed.57 As well, technologies such as pico-injection and controlled droplet merging allow for further manipulation of droplets and the performance of complex reactions and screens (Fig. 3b).57 Thus, at least part of the promise of droplet microfluidics is that the plethora of assays developed for plate-based screening can be readily adapted for microfluidics and thus conducted at much higher throughputs and with substantially less substrate.59 A particular advantage of droplet screening is that entry of the substrate into the cell is not necessarily required if the cell can be lysed within the droplet. As an example of the assay complexity possible and the manipulations that can be performed with droplet microfluidics, in work by Holstein et al., a cytotoxic protease was evolved via a multistep process (Fig. 3b). This involved encapsulation of single DNA molecules into droplets, followed by rolling circle amplification and in vitro transcription and translation steps, and then pico-injection of the substrate with subsequent sorting by fluorescence-activated droplet sorting (FADS) (Fig. 3b).60 As a key point for screen adaptation, in general, it seems that charged substrates and products are required for effective retention within droplets.61,62 While the exact mechanism of fluorophore exchange is not known, certain commonly used hydrophobic, uncharged fluorophores (e.g. hydroxycoumarin and resorufin derivatives) can diffuse into the oil or between droplets under certain conditions – thus leading to loss of genotype-phenotype linkage.61,62 Besides changes to the fluorophore itself, optimization of screening conditions (e.g. pH, surfactant concentration, inclusion of other additives) can also improve fluorophore retention.57,59
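Encapsulating single clones in droplets relies on diluting the cells so that few droplets hold more than one. A minimal sketch of this trade-off follows, assuming that cells are distributed randomly among droplets and therefore follow Poisson statistics. That assumption, the function names and the example mean occupancies are illustrative and are not taken from the cited work.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k cells in a droplet at mean occupancy lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def droplet_occupancy(lam):
    """Fractions of empty, singly and multiply occupied droplets.

    Assumes random (Poisson) loading; lam is the mean number of cells
    per droplet and is a free parameter of this sketch.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Of the droplets that contain any cell, the share with exactly one.
        "single_given_occupied": single / (1.0 - empty),
    }


if __name__ == "__main__":
    for lam in (0.05, 0.1, 0.3, 1.0):
        occ = droplet_occupancy(lam)
        print(f"lam={lam}: empty={occ['empty']:.3f}, "
              f"single={occ['single']:.3f}, multiple={occ['multiple']:.3f}")
```

Under this model, lowering the mean occupancy improves the genotype-phenotype linkage of occupied droplets at the cost of more empty droplets, which is one reason high droplet generation rates matter.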
Fig. 3 A selection of droplet microfluidic workflows used to screen for enzymatic activities. (a) Typical workflow employed for droplet microfluidic screening. Note that FADS requires specialized equipment. (b) Droplets can be readily manipulated as demonstrated in the workflow employed by Holstein et al. to screen for improved protease variants.60 (c) A common workflow for generation of double emulsion droplets for subsequent sorting on widely available FACS instruments.59,66
It should be noted that, while microfluidic screens require more technical expertise and set-up compared to FACS assays, they also offer considerable flexibility and open the door to entirely new ways of screening and characterisation. For example, in a recent study, Markin et al. were able to carry out the near simultaneous high-throughput kinetic characterization (kcat, KM, Ki) of thousands of enzyme variants with different substrates (albeit this was achieved using a complex microfluidic chip rather than droplets for compartmentalization).63 Even absorbance-based assays, which cannot be done by FACS but are widely used in plate-based screens, are possible with microfluidics.64,65
While microfluidics does require some specialized expertise, strides towards democratization of the methodology and more widespread use have been made with the development of double-emulsion (water/oil/water) droplet systems (Fig. 3c).59,66 These double emulsions are relatively straightforward to produce and can be sorted on widely available conventional FACS instruments – thus making droplet microfluidics available to those without experience in droplet sorter design or engineering.66 A nice example of this being used for CAZyme discovery is the work by Tauzin et al. wherein they applied the approach to the discovery of new β-GalNAcases.59 There are many further interesting examples of the applications of droplet microfluidics which are beyond the scope of this review. Interested readers are thus directed to the excellent recent review by Gantz et al. on the use of microfluidics for enzyme discovery and engineering.57
Other differences in enzyme mechanism can also be exploited in screening. Glycoside phosphorylases (GPs) differ from GHs in that inorganic phosphate acts as a nucleophile rather than H2O (Fig. 4b). Screens have been successfully applied for enhancements in the rate of substrate cleavage in the presence of phosphate in order to distinguish GPs from GHs.44,70 In a remarkable demonstration of the power of this technique, Franceus and colleagues were able to readily convert a GH to a GP via site-saturation libraries at “gatekeeping” residues with screens for rate enhancement upon addition of Pi.70
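The phosphate-dependence screen described above amounts to comparing two rates per clone. A minimal sketch of such hit calling is given below; the function names, the 2-fold cut-off and the example rates are hypothetical and are not values from the cited screens.

```python
def phosphate_enhancement(rate_without_pi, rate_with_pi, floor=1e-9):
    """Fold change in substrate cleavage rate on addition of phosphate.

    floor guards against division by zero for clones with no
    background activity (an arbitrary choice for this sketch).
    """
    return rate_with_pi / max(rate_without_pi, floor)


def call_phosphorylase_like(rates, min_fold=2.0):
    """Return clones whose rate rises at least min_fold with phosphate.

    rates: dict mapping clone name -> (rate_without_pi, rate_with_pi).
    min_fold: assumed threshold; a real screen would set this from controls.
    """
    hits = {}
    for clone, (v_no_pi, v_pi) in rates.items():
        fold = phosphate_enhancement(v_no_pi, v_pi)
        if fold >= min_fold:
            hits[clone] = fold
    return hits


if __name__ == "__main__":
    # Hypothetical initial rates in arbitrary units.
    example = {"clone_A": (1.0, 1.1), "clone_B": (0.5, 5.0)}
    print(call_phosphorylase_like(example))
```

A hydrolase-like clone shows little change on phosphate addition, whereas a phosphorylase-like clone shows a marked increase, so a simple ratio is enough to rank candidates for follow-up.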
Owing to the near equal free energies of sugar-1-phosphates and oligosaccharides, reactions catalyzed by GPs are also readily reversible (Fig. 4b). This reversibility also allows for screening by measurement of the release of inorganic phosphate from sugar phosphate substrates using the molybdenum blue assay (Fig. 4c).71,72 This approach has been successfully applied to the directed evolution of a cellobiose phosphorylase to expand acceptor specificity,72 as well as in the screening of metagenomic libraries71 and synthetic gene libraries73 for GP activity. A key benefit of screening for reverse phosphorolysis is that the acceptor and donor pairs can be readily varied to rapidly probe substrate specificities.73 This approach enabled the discovery of the GlcNAc-β1,3-GlcNAc biopolymer – acholetin – from the activity of its corresponding GP from Acholeplasma laidlawii.73
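Colorimetric phosphate readouts such as the molybdenum blue assay are typically converted to concentrations through a standard curve. The following is a minimal sketch of that conversion, assuming a linear response over the range of the standards. The function names and all numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def phosphate_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate released phosphate."""
    return (absorbance - intercept) / slope


if __name__ == "__main__":
    # Hypothetical phosphate standards (micromolar) and their absorbances.
    standards_um = [0.0, 50.0, 100.0, 200.0]
    absorbances = [0.05, 0.30, 0.55, 1.05]
    slope, intercept = fit_line(standards_um, absorbances)
    sample = phosphate_from_absorbance(0.80, slope, intercept)
    print(f"Estimated phosphate released: {sample:.1f} uM")
```

Readings outside the range of the standards should be treated with caution, since linearity is only assumed within it.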
PLs are a class of CAZyme that degrades uronic acid-containing substrates via a unique catalytic mechanism involving eliminative cleavage of the C4-oxygen bond to the sugar on the non-reducing side. However, they have been largely unexplored in protein engineering. PL activity can be assayed by monitoring the formation of the unsaturated uronic acid product via its absorbance at 235 nm (Fig. 1d).7 This method, while allowing use of natural substrates, suffers from low sensitivity, and is likely susceptible to interference from other assay components. Agar plate-based assays based upon halo formation in carbohydrate-containing media are also frequently used, though these are not specific for PLs.50,54,74 Nevertheless, combinations of these techniques have been used successfully. This is nicely demonstrated by Solbak et al.'s work in which they used azo-rhamnogalacturonan to detect pectinase and pectate lyase activity in an agar plate-based screen of a metagenomic library.54 They subsequently applied directed evolution to the best pectate lyase to enhance its thermostability and activity by monitoring activity on polygalacturonic acid after heat treatment. This process yielded variants with up to 16 °C increases in protein melting temperature (Tm) and 20 °C increases in temperature optima (from 50 °C to 70 °C).54 Thermostable PLs could be useful in applications such as textile processing where harsh, environmentally damaging, alkali solutions are otherwise used.54 Curiously, there has been limited application of artificial chromogenic substrates of PLs which might offer easier detection. Such substrates have been described by Rye et al. and were used in the determination of the catalytic mechanisms of these enzymes.75,76 Moreover, these substrates are selective for PLs over GHs as the chromophore is attached to the C4 hydroxyl rather than at the anomeric position.75,76
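The 235 nm readout for PL activity can be converted to product concentration with the Beer-Lambert law, A = ε·c·l. A minimal sketch follows; the extinction coefficient used in the example is a placeholder assumption, not a value from the text, and should be replaced with the coefficient appropriate to the actual unsaturated product and buffer.

```python
def product_concentration_molar(delta_a235, epsilon_m_cm, path_cm=1.0):
    """Beer-Lambert conversion of an absorbance change to concentration.

    delta_a235: increase in absorbance at 235 nm (blank-subtracted).
    epsilon_m_cm: molar extinction coefficient in M^-1 cm^-1 (user supplied).
    path_cm: optical path length in cm; microplates are often below 1 cm.
    """
    if epsilon_m_cm <= 0 or path_cm <= 0:
        raise ValueError("epsilon_m_cm and path_cm must be positive")
    return delta_a235 / (epsilon_m_cm * path_cm)


if __name__ == "__main__":
    # Placeholder extinction coefficient, assumed for illustration only.
    assumed_epsilon = 5000.0
    conc = product_concentration_molar(0.5, assumed_epsilon, path_cm=1.0)
    print(f"Unsaturated product: {conc * 1e6:.0f} uM")
```

Applying the same conversion to an absorbance change per minute gives an initial rate, which is the quantity a plate-based screen would usually compare between variants.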
Fig. 5 Coupled reactions enable high-throughput screening with highly relevant substrates. (a) Overview of a common coupled assay in which glucose is subsequently oxidized by glucose oxidase to form gluconolactone and H2O2. H2O2 can then be detected by use as a substrate for horseradish peroxidase along with a dye co-substrate. Alternatively, the produced H2O2 can also be used to initiate polymerization of a fluorescent hydrogel around the enzyme-producing cells for subsequent sorting.79,81 (b) Coupled assay used to screen for blood group converting enzymes. In this, once the initial GalNAc is removed, thereby converting the A antigen to the H antigen found in O-type blood, a series of coupling enzymes sequentially degrade the remaining trisaccharide to produce a fluorescent signal. (c) A coupled assay for use in screening for glycosynthases. As shown, a glucosynthase is used to produce a chromogenic cellobioside substrate. This cellobioside, but not the parent glucoside, can then be cleaved by an endo-acting cellulase to produce a fluorescent signal.
More recent work by the Hollfelder laboratory has expanded this approach by using sugar dehydrogenases to enable measurement of the release of free sugars such as glucuronic acid, xylose, and galactose.78,86 This method uses the reduction of NAD(P)+ to NAD(P)H by these enzymes during sugar oxidation to initiate an enzymatic cascade that results in reduction of a tetrazolium derivative (WST-1) to form a UV/Vis absorbent formazan.78 Ladeveze et al. show in their proof of concept that enzymes active on beechwood xylan, wheat arabinoxylan and xylobiose can be detected and that absorbance-activated droplet sorting can be used to sort active clones.78 Innovative work from the same laboratory offers further improvement through development of a highly sensitive fluorescence assay which operates under similar principles. In brief, the production of NAD(P)H by the sugar dehydrogenase is coupled to the reduction of glutathione disulfide to glutathione by glutathione reductase.86 This free reduced glutathione can then release a quenched carboxy-coumarin fluorophore through nucleophilic cleavage of a sulfonic ester.86 The resulting fluorescent signal offers a much lower limit of detection compared to the absorbance-based assay.86 This approach should prove useful in enzyme discovery campaigns via methods such as functional metagenomics where low enzyme expression levels are common.
Many GHs derive specificity from key interactions with both the components at the non-reducing end of the glycan and the reducing end (within their +1, +2, etc. subsites). Consequently, certain enzymes do not effectively recognize common chromogenic substrates, even if the enzymes are exo-acting.10,87 Such difficulties can be readily overcome through use of coupled assays with oligosaccharide substrates. An excellent example is in a screen utilized during the discovery of the blood group-cleaving GalNAc deacetylase and GH36 exo-α-GalNase by Rahfeld and colleagues (Fig. 5b).10 In this coupled assay, the terminal GalNAc is removed to expose a trisaccharide substrate which can then be degraded by added enzymes, in this case a fucosidase, galactosidase, and N-acetylhexosaminidase, to release the fluorophore. A similar approach had been previously applied by Kwan and colleagues for the directed evolution of a broad specificity blood group A cleaving enzyme from a Streptococcus pneumoniae GH98.11 Notably for the GH36 GalNase, this enzyme has almost 1000-fold lower activity against GalN-pNP as compared to a more natural substrate based on a blood group antigen tetrasaccharide.10
Coupled assays have also found use in screening for enzymes that assemble glycosides. In the general scheme introduced by Mayer and colleagues, the chromogenic substrate is cleaved by the coupling enzyme(s) only when the desired product – with both correct stereo- and regio-chemistry – is produced (Fig. 5c).88 This stands in contrast to indirect measurements such as enzymatic turnover or generalized product formation. This approach has been applied for the directed evolution of glycosynthases,89 and thioglycoligases.34 In both of these examples, the biosynthetic exo-β-glucosidase mutant produces a substrate for an endo-cellulase, resulting in ultimate fluorophore release (Fig. 5c).34,89 Interestingly, during the directed evolution of the Abg exo-β-glucosynthase by this approach, the identified mutations recapitulated interactions lost in creation of the glycosynthase mutant.89 It seems that by removing the catalytic nucleophile for creation of the glycosynthase, key interactions with the C2 hydroxyl (which are typically quite strong90) were also lost and only restored upon directed evolution.89 Excitingly, the best mutant from this screen turned over the glycosyl fluoride substrate with Glc-β-pNP as an acceptor with a kcat value over 1000-fold higher than the starting mutant and comparable to that for the native enzyme's hydrolysis of Glc-β-pNP.89,91 Such work highlights the ability of directed evolution campaigns to provide new understanding of how enzymes work.
Fundamental work on the screening of CAZymes by FACS includes the generalizable GT assay developed by Aharoni and colleagues.99 In this method, a glycoside conjugated to a fluorophore is transported into the cell via a sugar transporter (Fig. 6b). The fluorescent substrate is then glycosylated in the cell by the GT of interest and, once so modified, is no longer recognized by the transporter, resulting in accumulation of fluorescent substrate within cells containing active enzymes, which can then be sorted by FACS.8,99,100 This method has been applied to the directed evolution of sialyltransferases, galactosyltransferases, and fucosyltransferases.8,99,100 In an example of “you get what you screen for”, the initial iteration of this methodology resulted in the development of a hydrophobic binding pocket on the enzyme to which the dye moiety could bind, thereby improving reaction rates.99 This has been avoided in subsequent studies through the simultaneous use of two chemically distinct dyes such that the enzyme is not selected just through improved dye binding.100 While thus far this method has only been applied towards galactoside substrates (as transported by LacY), expression of different sugar transporters may further enhance the substrate scope of this method.
In mammalian cells, processing steps in the protein export pathway can be exploited for screening. In the approach developed by the Lindstedt laboratory, loss of O-glycosylation can be detected by furin cleavage of the aglycosylated protein within the trans Golgi network to reveal a fluorogen-activating protein (Fig. 6c).101,102 This has been used to probe compartmentalization of enzymes within the Golgi apparatus, the roles of different glycan biosynthesis genes, as well as in the identification of inhibitors of mucin-type O-glycosylation.101,102 While this method has not been used for directed evolution – and directed evolution is typically much more difficult in mammalian cells compared to bacteria or yeast103 – this does offer a useful method for engineering enzymes in a meaningful environment. This is especially the case since it offers access to enzymes which may be recalcitrant to recombinant expression in microbial systems.
The DeLisa group has also developed a method for improvement of bacterial oligosaccharyltransferases (OSTs) wherein secreted glycoproteins are detected by immunoblotting.107 This offers a potential improvement over cell-based screens as the assay is not confounded by the build-up of lipid-linked oligosaccharides which may be present in the cell membrane. Bacterial OSTs are useful for en bloc transfer of a number of different glycan structures, including eukaryotic N-glycans,108 to Asn residues within a D/E-X-N-X-S/T consensus sequence.109 Relaxation of this specificity is desirable as it allows for use of OSTs with more diverse protein targets including eukaryotic proteins. Ollis et al. were able to apply this immunoblot-based method for the directed evolution of an OST from Campylobacter jejuni (PglB) such that it had a more relaxed sequence specificity.107 This was done through site-saturation mutagenesis of two Arg residues thought to form salt bridges with the D/E in the consensus sequence.107 In applying this, they were able to engineer PglB such that it recognized the eukaryotic N-X-S/T sequon and could carry out effective glycosylation of RNase A with limited activity against the canonical bacterial sequon.107 This same immunoblotting approach has since been used to examine in depth how insertion of different glycosylation sites (via inclusion of short sequons) affected the function and stability of a number of diverse proteins, in a process the authors termed “shotgun scanning glycomutagenesis”.110 Intriguingly, Li et al. showed that when shotgun scanning glycomutagenesis was applied to the scFv of an anti-HER2 antibody, a number of sites could be glycosylated in a manner that readily increased the affinity of the scFv for HER2 and/or increased the stability of the protein.110 Such results and methods open the door to new avenues in the glyco-optimization of therapeutics.
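The difference between the bacterial D/E-X-N-X-S/T sequon and the shorter eukaryotic N-X-S/T sequon can be made concrete with a small sequence scan. The sketch below is illustrative only: the exclusion of proline at the X positions reflects the commonly cited rule and is an assumption here, since the text does not state it, and the example sequence and function names are hypothetical.

```python
import re

# Lookahead patterns so that overlapping sequons are all reported.
# Assumption: X is any residue except proline (not stated in the text).
SEQUONS = {
    # name: (compiled pattern, offset of the Asn within the motif)
    "bacterial": (re.compile(r"(?=([DE][^P]N[^P][ST]))"), 2),
    "eukaryotic": (re.compile(r"(?=(N[^P][ST]))"), 0),
}


def find_sequons(sequence, kind="bacterial"):
    """Return (1-based Asn position, motif) for each sequon of the given kind."""
    pattern, asn_offset = SEQUONS[kind]
    return [
        (match.start() + asn_offset + 1, match.group(1))
        for match in pattern.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    # Hypothetical sequence, not a real protein.
    seq = "MKDQNATAANGSPNPT"
    print("bacterial: ", find_sequons(seq, "bacterial"))
    print("eukaryotic:", find_sequons(seq, "eukaryotic"))
```

Every bacterial sequon also contains a eukaryotic one, but not the reverse, which is why relaxing the requirement for the acidic residue widens the range of sites an OST can modify.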
In an interesting case of combining different approaches, Keys and colleagues engineered a polysialyltransferase for the synthesis of well-defined polysialylated proteins with low dispersity.111 Decreased product dispersity is a difficult enzymatic property to screen for as time-intensive HPLC analysis of the products is the only method with sufficient resolution for the task. To overcome this, Keys et al. employed an ingenious neutral drift approach which combined high-throughput screening with lower-throughput characterization by HPLC.111 Neutral drift is a method for protein engineering in which, after the introduction of mutations (often, but not necessarily, at a relatively high rate111,112), the experimenter screens only for variants with similar activity to the WT enzyme rather than for variants with higher activity.112 Through rounds of neutral drift, one can thus generate a pool of variants that are both highly divergent and still functional – enabling efficient evolution with decreased screening burden.112 As well, the produced protein scaffolds are often highly evolvable as the high mutational load frequently results in the enrichment of stabilizing mutations.111–113 For each round of their campaign, Keys et al. employed a previously developed colony immunoblotting assay to identify active polysialyltransferases from screens of >10⁴ clones.114 This method does not allow for differentiation of polydispersity, and so for each round, they then analyzed the product profiles of just 100–200 clones by HPLC.111 Ultimately, in their third round of screening, they characterized just 51 clones by HPLC and identified a single residue which was able to dictate whether the polysialyltransferases produced products with small or large dispersity.
This single residue switch was key in modulating the ability of the enzyme to bind elongated polysialic acid polymers – thus shifting the enzyme between processive and distributive modes of action (and high dispersity and low dispersity product profiles respectively).111
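The two-tier logic of such a neutral drift round can be sketched as a simple filter: retain every variant that keeps roughly wild-type activity in the high-throughput assay, then pass a small budgeted subset to the low-throughput analysis. The function name, the 80% activity threshold, the way the subset is chosen and the example data are all assumptions made for illustration and do not describe how Keys et al. selected clones.

```python
def neutral_drift_round(activities, wt_activity, min_fraction=0.8, hplc_budget=100):
    """Select variants for the next round and a subset for HPLC.

    activities: dict mapping variant name -> signal in the primary screen.
    wt_activity: signal of the wild-type enzyme in the same assay.
    min_fraction: assumed fraction of WT activity required to be retained.
    hplc_budget: number of retained variants sent to low-throughput analysis.
    """
    threshold = min_fraction * wt_activity
    # Retain on function only; no preference is given to higher activity.
    retained = [name for name, signal in activities.items() if signal >= threshold]
    # Arbitrary choice for this sketch: take the first variants in input order.
    for_hplc = retained[:hplc_budget]
    return retained, for_hplc


if __name__ == "__main__":
    # Hypothetical primary-screen signals relative to WT = 1.0.
    screen = {"v1": 1.0, "v2": 0.5, "v3": 0.9, "v4": 1.3}
    retained, for_hplc = neutral_drift_round(screen, wt_activity=1.0, hplc_budget=2)
    print("retained:", retained)
    print("for HPLC:", for_hplc)
```

The point of the design is that the expensive measurement is only ever applied to variants already known to be functional, which keeps the HPLC workload small relative to the library size.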
Screens using glycan-binding proteins have also been utilized in microtitre plate-based formats. For example, Hancock et al. carried out the directed evolution of a glycosphingolipid synthase with product detection by use of an ELISA.115 In this work, they sought to expand the lipid specificity of an endoglycoceramidase glycosynthase mutant originally derived from Rhodococcus sp. strain M-777 (EGC-II). While the naïve glycosynthase is largely sphingosine-specific, they sought to expand this activity towards phytosphingosine, for which the native enzyme has 10 000-fold less activity. This enhancement in activity could then enable use of the enzyme for synthesis of diverse gangliosides. The authors used the GM1-specific Cholera toxin B subunit as the binding protein in the assay. The Cholera toxin B subunit, while specific for the glycan portion of the ganglioside, is promiscuous with respect to the lipid and so ideal for this purpose. Binding of the Cholera toxin B subunit could then be detected by antibody binding coupled to horseradish peroxidase activity. From a screen of 10 000 clones, a single mutation adjacent to the active site drove an 8100-fold increase in its activity towards phytosphingosine while maintaining activity against its native substrate. As a result, the mutant enzyme had near equal activity against sphingosine and phytosphingosine.115
A useful method for identifying preferred substrates of retaining GHs is through the use of 2F-sugar GH inactivators (Fig. 8a).116,117 In this approach, introduction of electronegative fluorine at the C2 position of the substrate results in a great reduction in the rates of both steps of the reaction through destabilization of the oxocarbenium ion-like transition states via inductive effects.117 When an activated leaving group is also incorporated to accelerate the first step, reaction of these inactivators with a GH results in accumulation of the covalent glycosyl-enzyme intermediate, and thus inactivation of the enzyme. This intermediate is resistant to hydrolysis but, remarkably, in the presence of a productively bound glycan acceptor, can be reactivated via transglycosylation.116 The preferred reducing end substituent can thus be identified by screening for (rates of) reactivation with libraries of different potential glycan acceptors, in a method that is readily carried out in a plate-based format.116 Moreover, the identity of the linkage can also be determined through subsequent analysis of the rescue product.116 The initial study profiled the aglycone specificities of a number of GHs against a panel of glycosides and free sugars with the goal of using this information to guide GH-mediated glycoside synthesis.116 The approach has also been applied to study the substrate scope of E. coli glucuronidase and to better understand which non-carbohydrate small molecule substrates it acts on.118 Taking this even further, a synthetic gene library of GH1s was screened for their ability to cleave a library of glycosides modified individually with azide and amine substituents at the 3-, 4- and 6-positions.119 The aglycone specificities of the top eight hits were then profiled at high throughput using a total of 83 different potential acceptors, allowing the identification of optimal enzyme/substrate combinations for synthesis of specific glycosides using the glycosynthase technology.119
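To illustrate how plate-based reactivation data of this kind might be reduced to a ranking of acceptors, the sketch below fits a rate constant to each well. It assumes that recovered activity follows a simple first-order approach to the activity of uninhibited enzyme, which is a modelling choice of ours and not something stated in the cited studies. The function names and plate readings are hypothetical.

```python
import math

def reactivation_rate(times, activities, full_activity):
    """Estimate k from A(t) = A_full * (1 - exp(-k t)).

    Linearises to -ln(1 - A/A_full) = k t and fits the slope through the origin.
    """
    num = den = 0.0
    for t, a in zip(times, activities):
        frac = a / full_activity
        if t > 0 and 0.0 <= frac < 1.0:   # skip points the linearisation cannot use
            num += t * -math.log(1.0 - frac)
            den += t * t
    return num / den if den else 0.0

def rank_acceptors(plate, times, full_activity):
    """Return (acceptor, k) pairs sorted from fastest to slowest reactivation."""
    rates = {name: reactivation_rate(times, acts, full_activity)
             for name, acts in plate.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)

# Made-up plate readings (arbitrary activity units) at 10, 20 and 40 min.
times = [10.0, 20.0, 40.0]
plate = {
    "acceptor_A": [39.0, 63.0, 86.0],
    "acceptor_B": [10.0, 18.0, 33.0],
    "buffer_only": [1.0, 2.0, 4.0],
}
for name, k in rank_acceptors(plate, times, full_activity=100.0):
    print(f"{name}: k = {k:.4f} per min")
```

Including a buffer-only well, as in this made-up plate, gives a baseline for spontaneous hydrolysis of the intermediate against which acceptor-dependent reactivation can be judged.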
The substrate preferences of GTs that act on polypeptides are of interest, especially with respect to the production of glycoproteins. Of particular interest have been the polypeptide N-acetylgalactosaminyltransferases (ppGalNAcTs), O-linked GlcNAc transferase (OGT), OSTs, and N-glycosyltransferases (NGTs). In seminal studies on ppGalNAcTs, sequence specificities were assessed by incubation of the enzymes with UDP-GalNAc and randomized pools of peptides. Subsequent purification of the modified glycopeptides by lectin affinity chromatography followed by Edman sequencing enabled facile identification of preferred substrate sequences (Fig. 8b).120–126 Results obtained have informed the software, IsoGlyP, which predicts ppGalNAcT glycosylation for a given sequence.127 Correspondingly, IsoGlyP has found considerable use in the design of synthetic sequons to target ppGalNAcT activity.128,129
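As a rough illustration of how sequence preferences can be extracted from such pooled-peptide experiments, the sketch below compares residue frequencies at each position between glycosylated (selected) and starting (background) peptides. It is a simplification: it treats the data as lists of individual, equal-length sequences, whereas Edman sequencing of a pool reports per-cycle residue yields, and it is not the algorithm used by the prediction software mentioned above. The function name, pseudocount scheme, and example peptides are all hypothetical.

```python
from collections import Counter

ALPHABET_SIZE = 20  # standard amino acids, used for the pseudocount

def position_enrichment(selected, background):
    """Per-position ratio of residue frequency in selected vs. background peptides."""
    length = len(selected[0])
    if any(len(p) != length for p in selected + background):
        raise ValueError("all peptides must have the same length")
    table = []
    for i in range(length):
        sel = Counter(p[i] for p in selected)
        bg = Counter(p[i] for p in background)
        scores = {}
        for residue, count in sel.items():
            # Add-one pseudocounts keep the ratio finite for unseen residues.
            f_sel = (count + 1) / (len(selected) + ALPHABET_SIZE)
            f_bg = (bg[residue] + 1) / (len(background) + ALPHABET_SIZE)
            scores[residue] = f_sel / f_bg
        table.append(scores)
    return table

# Made-up peptides, for illustration only.
selected = ["TPAP", "TPGP", "SPAP"]
background = ["TPAP", "GKLE", "AVTD", "PSGR"]
for pos, scores in enumerate(position_enrichment(selected, background)):
    best = max(scores, key=scores.get)
    print(f"position {pos}: most enriched residue {best} ({scores[best]:.2f}x)")
```

Ratios above 1 indicate residues over-represented among glycosylated peptides at that position; with real data, far larger pools would be needed for such ratios to be meaningful.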
Phage and mRNA display offer new pathways for determining sequence preferences and engineering bacterial N-glycosylation.130,131 In 2010, the Aebi group developed a method known as “GlycoPhage” (Fig. 8c).130 In this method, a bacterial N-glycoprotein (AcrA from Campylobacter jejuni) is genetically linked to a phage coat protein (pIII). Upon expression, the AcrA-pIII fusion is exported to the periplasm, where it can be glycosylated by OST. It is then incorporated into the phage progeny to make glycophage. The glycophage can then be detected and/or pulled down by a number of different methods. In the proof-of-concept study, the authors were able to enrich for sequences that could be glycosylated by C. jejuni OST by varying the N-glycosylation sequons and panning for binding using a glycan-specific antibody.130 This method should, in principle, be applicable to evolving any enzyme involved in glycan biosynthesis.130 In 2022, the DeLisa group described an mRNA display screen for glycoprotein sequon engineering (Fig. 8d).131 In this method, the protein of interest is fused to a short sequence which stalls the ribosome such that the nascent polypeptide and mRNA are still bound within the ribosome.131 The polypeptide can then be glycosylated through cell-free glycoprotein synthesis, the glycoprotein enriched by affinity purification, and the sequence of the RNA within the stalled ribosome determined by sequencing of the RT-PCR product.131 While not described within the paper, this method should be applicable to engineering/determination of glycosylation sequons and other aspects of protein glycosylation and affords a much higher throughput than cell-based methods (>10⁹ clones per screen).131
A number of studies have applied self-assembled monolayers for matrix-assisted laser desorption/ionization mass spectrometry (SAMDI-MS) for determination of sequence specificities (Fig. 8e). The workflows for the campaigns outlined here are as follows: glycosylation reactions are carried out in microtitre plates, and the reaction mixtures are applied to a plate with a coating that immobilizes the substrate/products (e.g. reactive maleimide to couple to Cys-containing peptides or Ni-NTA for His-tagged protein binding).132–134 Following this, the plates are washed and then analyzed by MALDI-TOF MS.132–134 A series of studies by the Jewett and Mrksich groups have shown the utility of SAMDI for characterizing the specificities of OGT, ppGalNAcTs, and NGTs.132–134 These experiments have enabled the identification of a number of “GlycTags” for modification using NGTs.133 In fact, by profiling a number of different NGTs and determining the sequence preferences of these enzymes, semi-orthogonal GlycTags were identified. With this knowledge in hand, site-selective protein glycosylation was then carried out using these NGT-GlycTag pairs.133 Such an approach is particularly useful since introduction of the Glc or GlcN handles by NGTs can allow for subsequent extension to produce a glycan of interest.28,133,135 In this way, through cycles of handle attachment and glycan installation, proteins with defined glycosylation at different sites can be readily prepared.133
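The read-out from a workflow of this type is typically reduced to a fractional conversion per well. The sketch below shows one way this could be computed from a peak list, assuming singly charged ions (so that the m/z shift equals the added mass) and equal ionization efficiency of substrate and product. Both are simplifying assumptions of ours, and the function names, tolerance, and example spectrum are hypothetical.

```python
HEXOSE_SHIFT = 162.0528   # monoisotopic mass added by one hexose (C6H10O5), Da
HEXNAC_SHIFT = 203.0794   # monoisotopic mass added by one HexNAc (C8H13NO5), Da

def peak_intensity(spectrum, mz, tol=0.5):
    """Sum the intensities of all peaks within tol of mz.

    spectrum is a list of (m/z, intensity) pairs.
    """
    return sum(intensity for m, intensity in spectrum if abs(m - mz) <= tol)

def conversion(spectrum, substrate_mz, shift=HEXOSE_SHIFT, tol=0.5):
    """Fraction of peptide signal found at the glycosylated mass."""
    substrate = peak_intensity(spectrum, substrate_mz, tol)
    product = peak_intensity(spectrum, substrate_mz + shift, tol)
    total = substrate + product
    return product / total if total > 0 else float("nan")

# Made-up peak list: substrate at m/z 1000.50, glucosylated product 162.05 higher.
spectrum = [(1000.50, 300.0), (1100.00, 50.0), (1162.55, 700.0)]
print(f"conversion: {conversion(spectrum, 1000.50):.2f}")
```

Passing `shift=HEXNAC_SHIFT` adapts the same calculation to GalNAc or GlcNAc transfer, as would be relevant for ppGalNAcTs or OGT.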
However, there have been a number of creative implementations of selections in the engineering of CAZymes beyond simple growth screens. As an example, the Cornish laboratory has shown the utility of chemical complementation for engineering of glycosynthases144 and cellulases.145 In brief, this method relies on the use of proteins that can recognize different chemical moieties. These are then fused to DNA-binding proteins such that gene expression is either activated or blocked when the small molecule of interest is present. While this approach has been used for glycosynthases and cellulases, and should be highly scalable, it has seen limited use since its first implementations.144,145 Of recent note is also the use of a combined antibiotic selection and auxotroph complementation strategy by Bennet and colleagues for the evolution of a sialidase to specifically cleave deaminated neuraminic acid (Kdn).146 In this, activity against Kdn was selected for in a tyrosine auxotroph line of E. coli by supplementation of the culture media with Kdn-caged tyrosine. Meanwhile, the native Neu5Acase activity of the sialidase was selected against by inclusion of Neu5Ac-α2,6-Gal-chloramphenicol. In applying this they were able to identify variants with up to 20-fold enhancements in Kdnase activity and up to 60-fold change in specificity.146
Artificial intelligence has yielded CAZymes that are not as yet found in Nature.149 In a remarkable case, Madani et al. used a large language model – trained on sequence data from >19000 PFAM families with further tuning on lysozyme sequences from CAZy families GH24, GH73, GH108, and others not yet classified within CAZyDB – to design a suite of lysozymes. These designed lysozymes were found to have similar activity profiles – both in the number of enzymes with activity (73% of designed enzymes) and the level of activity – to those of a set of random natural enzymes from the lysozyme families used for training. These screens used fluorescence-quenched fluorescein-labeled cell wall from Micrococcus lysodeikticus as a substrate. Further characterization showed that sequences with identity as low as 31% to any known enzyme were found to have activities and structural folds similar to those of known lysozymes.149 While this does not yet offer the ability to explicitly encode/discover new activities, it does offer a means for exploration of sequences not found in Nature.
The inherent reactivities of metals have been utilized in a number of natural and engineered enzymatic activities.148,150,151 For many hydrolytic enzymes, metals such as Zn2+ act as Lewis acids and activate a water molecule for nucleophilic attack.152 However, metals are not generally used for catalysis in this manner by GHs. In the closest instance, families GH127 and GH146 utilize a Zn2+ ion to activate the thiol of a Cys residue that then acts as the nucleophile.153,154 This leaves one to wonder whether there is a reason why Nature has not utilized “metalloglycosidases” more broadly. In recent work, β-glucosidase activity was introduced into OmpF (a porin) via introduction of a Zn2+ binding site with an open coordination site.155 Through rounds of directed evolution using a simple X-Glc and cellobiose-growth dependent agar plate assay, the authors were able to improve the activity of this enzyme against β-glucosides by two orders of magnitude.155 Curiously, these enzymes show a strong but incomplete dependence upon Zn2+ for hydrolysis, with a competing non-metal dependent reaction.155 While the best enzymes were still relatively slow (kcat/KM of 10 min−1 M−1), it will be interesting to see how well metalloenzymes can be adapted to glycoside hydrolysis. Given the “late” transition states associated with glycosyl transfer via oxocarbenium ion-like transition states,5 it may be that there is little advantage to increasing the nucleophilicity of water in this way. Analogous work on phosphotriesterases has shown that promiscuous enzymes that operate by different mechanisms than those found in Nature can be evolved to have the same catalytic efficiency as the natural enzymes,156 although in that case the natural enzymes are metal-dependent while the enzyme evolved in the laboratory does not use a metal for catalysis.156
Directed evolution and deep characterization of GHs can also be carried out in droplets.161 Romero and colleagues performed deep mutational scanning of a GH1 from Streptomyces sp. (Bgl3) in droplets and characterized millions of mutants in terms of activity and thermostability.161 This was the first deep mutational scan of this class of enzyme. Curiously, it revealed residues essential for function which were distant from the active site and also non-conserved in other GH1 family members.161 Such results provide clear indications that, even though highly conserved sites are indispensable for function, proteins acquire local adaptations and networks of residues required for enzyme function over the course of evolution.63 The authors further show the utility of droplet microfluidics for characterizing variants by heat-treating the droplets and then screening for thermostable variants.161
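Deep mutational scanning data of this kind are commonly summarized as an enrichment score per variant, comparing sequencing counts before and after sorting. The sketch below shows one widely used form of that calculation (a log2 frequency ratio normalized to wild type). It is not the analysis pipeline of the cited study, and the function name, pseudocount, variant labels, and counts are hypothetical.

```python
import math

def enrichment_scores(input_counts, sorted_counts, wild_type="WT", pseudocount=0.5):
    """log2 enrichment of each variant after sorting, relative to wild type."""
    n_in = sum(input_counts.values())
    n_out = sum(sorted_counts.values())

    def log_ratio(variant):
        # The pseudocount avoids log(0) for variants absent from one sample.
        f_in = (input_counts.get(variant, 0) + pseudocount) / n_in
        f_out = (sorted_counts.get(variant, 0) + pseudocount) / n_out
        return math.log2(f_out / f_in)

    wt = log_ratio(wild_type)
    return {variant: log_ratio(variant) - wt for variant in input_counts}

# Made-up read counts before and after sorting for active droplets.
input_counts = {"WT": 1000, "A10G": 1000, "H120A": 1000}
sorted_counts = {"WT": 1000, "A10G": 2000, "H120A": 10}
for variant, score in enrichment_scores(input_counts, sorted_counts).items():
    print(f"{variant}: {score:+.2f}")
```

Positive scores mark variants enriched relative to wild type and negative scores mark depleted ones. Applying the same calculation to a heat-treated sort would, in principle, give a corresponding thermostability score.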
In 2023, Lipsh-Sokolik et al. were able to show the utility of activity-based protein profiling for identification of highly active designed GH10 xylanases.162 Activity-based probes based upon epoxide and aziridine derivatives of glycosides have been developed extensively by the Overkleeft group.163 These probes enable capture of retaining Koshland GHs via formation of an irreversibly linked cyclitol upon attack of the epoxide/aziridine by the enzyme's catalytic nucleophile (Fig. 9a).163 In this screen, the probes carried a fluorophore and thus linkage of the probes to a yeast surface displayed enzyme allowed for separation of active variants via FACS (Fig. 9b).162 Diverse xylanases were generated in silico via recombination of natural GH10 sequences with further re-design and optimization to eliminate incompatibilities due to epistasis. By applying this FACS-based approach, libraries of >10⁵ variants could be readily screened, with 10³–10⁴ active designed xylanases being identified in these screens. While the screen itself only requires the enzymes to carry out the first step in the retaining Koshland mechanism, most of the identified enzymes were capable of carrying out the full catalytic cycle. Moreover, by using different probes, the authors were able to further separate variants with activity on xylan, cellulose, or both. Further work using these activity-based probes for CAZyme engineering seems likely in addition to their previously established utility in identification of enzymes from complex mixtures,164 and inhibitor discovery,165 amongst a number of other applications.163 While there may be concerns about the ability of the method to identify highly active variants – as only a single, partial turnover can be achieved – a number of other successfully applied methodologies for enzyme engineering also only allow for a single turnover of the screening substrate.166,167
Phage-assisted continuous evolution (PACE) has been a largely unexplored method for CAZyme engineering. One of the great benefits of continuous evolution strategies such as PACE is that they enable rapid, unsupervised exploration of sequence space.168 Indeed, depending upon the experimental setup, almost 200 rounds of evolution can be carried out in a single week whereas many directed evolution campaigns would only finish a single round in that time.169 The basis on which PACE was developed is that the activity of interest must be linked to successful synthesis of the pIII coat protein (Fig. 9c).170 In brief, E. coli are used as hosts for phage which lack the pIII gene (gIII) but carry the gene of interest.170 These E. coli host cells carry gIII under the control of a genetic circuit that is activated by the activity of interest.170 Replication of the phage is thus dependent upon the activity of interest.170 The E. coli also carry a plasmid for increased rate of mutagenesis that enables continual diversification of the library.170 And so, over time, phage-encoded enzyme variants capable of driving pIII expression (and thus phage replication) are continually selected for and continually diversified.170 PACE has been shown to be useful for evolution of a number of enzymes including proteases, base editors, RNA polymerases, and aminoacyl-tRNA synthetases.170 With respect to CAZymes, in the closest available example, PACE has been shown to theoretically enable screening for CEs capable of de-esterifying IPTG (Fig. 9c).171 Note though that, while esterases with improved activity could be found by PACE, the authors actually had greater success with a non-continuous evolution approach (phage assisted continuous selection (PACS)). In PACS, mutants are mutagenized in vitro prior to introduction into phage rather than relying upon a mutagenesis-enhancing plasmid for continuous mutagenesis in vivo.171 The study also identified a number of different esterified substrates with which one could readily apply PACE or PACS.171 In principle, many more substrates could be employed as long as transcription factor-activator pairs with requisite specificities are known. And so, better understandings of the transcriptional activation in microbial carbohydrate degradation pathways (such as polysaccharide utilization loci) could offer further avenues for exploration.172
The Withers laboratory has developed the mucinase/O-glycopeptidase enabled linking of O-glycosylation and related activities (MELiORA) method for use in engineering of enzymes active on mucin-type O-glycoproteins (Fig. 9d and e).173 With respect to CAZyme engineering, MELiORA uses O-glycopeptidases (a type of peptidase with absolute specificity for mucin-type O-glycosylated protein substrates174) to identify the glycosylation state of a fluorescent protein FRET probe. Using this, enzymes that add or remove glycans, such as GTs (Fig. 9d) and GHs (Fig. 9e), can be assayed with a read-out based upon the change in FRET arising from O-glycopeptidase-catalysed cleavage of the probe. Because all of the assay components can be expressed within the cell (i.e. the probe, the glycosyltransferases and other enzymes required for glycan synthesis,128,175 the O-glycopeptidase, etc.), enzyme variants can also be readily screened at ultrahigh-throughput via FACS. Even for plate-based assays, the approach still enables screening using complex mucin-type O-glycoprotein substrates with minimal costs, as the glycoprotein substrates – with diverse, defined glycan structures – can be readily produced in E. coli with high yields.128,173,175 In many ways MELiORA is ideal as the protein substrate is likely similar, at least locally, to that on which one would want the enzymes to ultimately act. When used in screens for glycan biosynthetic enzymes, MELiORA also offers the same benefits as in the coupled enzyme screen devised by Mayer and co-workers.88 That is, with the use of the appropriate O-glycopeptidase, it is only when the glycan of interest is produced that there is any change in fluorescence. Moreover, the method is not limited to a FRET probe as initially described. Rather, almost any screening system for peptidase activities should be readily amenable to MELiORA, and thus can be converted to enable screening for CAZymes.176 While MELiORA was only applied to the directed evolution of an O-glycopeptidase in the initial study, the broader applicability of the approach should be borne out by further work in the field.173
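To make the FRET read-out concrete, the sketch below converts donor and acceptor emission readings into an estimated fraction of cleaved probe and then into a hit call. It assumes a ratiometric read-out calibrated against intact and fully cleaved probe controls, which is a generic treatment of FRET protease assays rather than the published MELiORA analysis. Because the direction of the signal change depends on the O-glycopeptidase used and on whether the enzyme being screened creates or removes its recognition glycan, that direction is left as an explicit parameter. All names, thresholds, and numbers are hypothetical.

```python
def cleavage_fraction(donor, acceptor, ratio_intact, ratio_cleaved):
    """Estimate the fraction of probe cleaved from the acceptor/donor emission ratio.

    ratio_intact and ratio_cleaved are control ratios for uncleaved and fully
    cleaved probe; the result is clamped to the range 0..1.
    """
    ratio = acceptor / donor
    fraction = (ratio_intact - ratio) / (ratio_intact - ratio_cleaved)
    return min(1.0, max(0.0, fraction))

def is_hit(fraction, hit_when_cleaved, threshold=0.5):
    """Call a hit depending on whether activity leads to cleavage or protection."""
    if hit_when_cleaved:
        return fraction >= threshold
    return fraction <= 1.0 - threshold

# Made-up readings: intact control ratio 2.0, fully cleaved control ratio 0.5.
wells = {"variant_1": (100.0, 80.0), "variant_2": (100.0, 190.0)}
for name, (donor, acceptor) in wells.items():
    frac = cleavage_fraction(donor, acceptor, ratio_intact=2.0, ratio_cleaved=0.5)
    print(f"{name}: cleaved fraction {frac:.2f}, hit = {is_hit(frac, hit_when_cleaved=True)}")
```

For a screen in which the desired activity protects the probe from cleavage, the same wells would be scored with `hit_when_cleaved=False`.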
A great benefit of using natural substrates in screening is that the observed activity improvements should be readily translatable to the desired task and conditions. However, many of these screens suffer from the need for processing steps (e.g. addition of developing reagents, additional incubation steps, or elevated incubation temperatures to drive chemical reactions) that limit throughput and add difficulty. The creativity and ingenuity displayed within the field, particularly in the use of coupled assays and highly relevant substrates (or ideally the desired substrate) to detect activity, stand as highlights and portend future developments. In moving this forward there is likely to be synergy between the engineering of CAZymes and the engineering of the coupling enzymes used (whether they be oxidases,79,81 dehydrogenases,78,86 O-glycopeptidases173 or others).
As is evident from their absence from this review, despite their widespread use within the field, there are a number of enzymes for which little high-throughput engineering has been carried out. Prominent examples of this are the ENGases – incredibly powerful enzymes for remodeling of N-glycans to produce designer glycoproteins.178 The workflow for such glycoengineering often employs both a hydrolytic enzyme (to remove the native N-glycans) and then glycosynthase mutants to install the desired glycan structure. Serious engineering work could generate new enzymes of improved specificity, perhaps allowing simple site-selective glycosylations to be performed. In other enzyme classes, limited work has been performed on engineering enzymes such as PLs and lytic polysaccharide monooxygenases (LPMOs) (which are classified in the CAZy database in a number of auxiliary activity families179 and have attracted attention due to their ability to initiate degradation of otherwise recalcitrant crystalline cellulose and chitin6,179). Indeed, there are only two reported LPMO directed evolution campaigns.180,181 In a rather impressive study, Jensen et al. employed high-throughput mass spectrometry-based screening to almost completely shift an LPMO's substrate specificity from cellulose to chitin.180 In light of these examples, it is clear that further work is needed within the field to expand the available assays for these enzymes and thereby enable improvement of what are already highly useful and in-demand enzymes.
This journal is © The Royal Society of Chemistry 2024