Robert A. Shepherd a, Conrad A. Fihn a, Alex J. Tabag ab, Shaun M. K. McKinnie a and Laura M. Sanchez *a
aDepartment of Chemistry and Biochemistry, University of California Santa Cruz, Santa Cruz, CA 95064, USA +1 831-459-4676. E-mail: lmsanche@ucsc.edu
bPharmaceutical Sciences, University of Kentucky, Lexington, KY 40506, USA
First published on 27th February 2025
Covering: 2015 to 2024
Mass spectrometry (MS)-based methods have been implemented extensively for enzyme engineering due to their label-free nature, making them suitable for screening a wide range of biochemical systems. Over the past decade, advancements in mass spectrometry, separation science, and the implementation of hyphenated methods have allowed for more streamlined analysis of large volumes of samples while maximizing the richness and dimensionality of the data collected. In this review we highlight recent advancements in mass spectrometry that have allowed for more efficient, robust, and rigorous enzyme engineering for various applications relating to natural products chemistry.
An increasingly popular approach to creating new chemical diversity from natural product enzymes is to employ directed evolution (DE). DE is a protein engineering method that mimics evolution in a laboratory setting over a short period of time to drive the activity of enzymes toward an intended, non-native biochemical function.11,12 For example, ‘new-to-nature’ chemistries enabled by heme protein DE efforts have led to the development of efficient biocatalysts for performing complex chemistries, such as cyclopropanation and the formation of carbon–silicon bonds.12,13 DE entails performing iterative rounds of mutagenesis on a specific gene, screening of genetic mutants (enzyme variants), and selection of variants that better perform the desired function.8 These functional genes are then replicated and amplified, serving as the starting point for subsequent rounds of genetic diversification, screening, and selection. Random mutagenesis methods (e.g. error-prone PCR, tandem repeat insertion, and rolling circle amplification) have enabled rapid generation of thousands-to-millions of genetic mutants (upwards of 10¹⁰ mutants), thus introducing adequate genetic diversity in short amounts of time.8,14,15 One major limitation to DE efforts is that screening large mutant libraries can be time-consuming and laborious, presenting a need for high throughput screening (HTS) methods that increase efficiency while lowering costs.8,14,16
Many conventional DE screening efforts have relied on visual and/or optical methods for screening and selection of improved enzyme variants. An early example of this comes from Chen and Arnold, where subtilisin E, a serine protease isolated from Bacillus subtilis, was assessed for its activity in dimethylformamide by measuring the halo formed around mutant B. subtilis colonies resulting from the hydrolysis of agar-embedded casein by engineered subtilisin E variants.17 Additionally, colorimetric assays employing chromogenic substrates have been developed to obtain a visual readout of enzyme activity. A popular example is the use of X-galactose (X-Gal) as a substrate for measuring β-galactosidase activity on solid media. The prefix ‘X’ is an abbreviation for the 5-bromo-4-chloro-3-indoxyl moiety on the anomeric carbon of (X-)galactose, which dimerizes post-hydrolyzation, forming a blue precipitate.18 Colonies with a darker blue color are thus selected for subsequent rounds of DE.19 These visual screening methods are generally cost effective and easy to employ, but are often limited to specific enzymatic systems, rendering them inapplicable to a broader range of experimental scenarios.
To further increase the limited throughput offered by these manual screening and selection approaches, researchers have employed automated screening and selection technologies such as digital imaging (DI),20,21 microtiter plate workflows,22–26 and microfluidic (“lab-on-a-chip”) sorting devices in combination with various optical assays to measure the readout of individual enzyme mutants in an automated, high-throughput manner. Currently, microfluidic sorting devices, such as those used in fluorescence-activated droplet sorting (FADS),27,28 offer the highest throughput for mutant enzyme selection, with sorting rates as high as 30000 droplets per second (1 droplet contains 1 enzyme variant). Multichannel microfluidic devices are used to suspend single cells expressing variant enzymes within aqueous microdroplets containing the respective reaction substrate and components. This allows for the enzyme reaction to occur within each droplet during a short incubation period. The microdroplets are emulsified within a steady flow of biocompatible oil and ultimately dielectrophoretically (DEP) sorted based on the fluorescence of the reaction products. Droplets that induce fluorescent signals that meet a specified fluorescence threshold are sorted, and the variant enzymes they harbor are prioritized for further rounds of DE.
While the discussed methods address the problem of throughput, they all require the use of chromogenic, absorbent, or fluorescent reaction components, which severely limits the range of their applicability.14,29–31 Thus, label-free approaches to DE screening have become a necessity in recent years. Over the last decade, mass spectrometry (MS)-based methods have gained significant traction for enzymatic screening due to their label-free nature, making them suitable for screening a wide range of biochemical systems without the need for extensive optimization.4
In this review we discuss different MS-based approaches that have been applied to screen the chemical products resulting from various DE screening campaigns. A discussion of the advantages, limitations, and throughput capabilities of MS-based DE screening workflows will be a major highlight throughout. Additionally, we briefly discuss alternative MS-based HTS techniques employed for enzymatic reaction screening in general, and highlight their potential for high-throughput natural product DE screening (Fig. 1 and Table 1).
Technique | Speed (seconds per sample) | Advantages | Limitations | Application examples (references) |
---|---|---|---|---|
Visual techniques | Not reported | Inexpensive; data is easy to interpret | Relies on human selection of ‘hits’; lower throughput compared to other methods mentioned below; not easily automated; relies on a visibly observable change in phenotype | 17 and 19 |
Colorimetric microplates | ∼8 | Automated data collection and analysis; minimal human intervention | Reaction scope limited to those with fluorescent products; limited screening capacity; requires access to a plate reader and sometimes robotics | 84 |
Digital imaging | ∼1.2 | Relatively inexpensive; data is easy to interpret | Reaction scope limited to those with fluorescent products; X-gal methods where side product detection is employed have higher risk for false positives | 21 and 85 |
Microfluidics sorting | ∼3.6 × 10⁻⁴ | Extremely fast; well suited for extremely large mutant libraries | Reaction scope limited to those with fluorescent products; screening devices must be adapted/customized for every biological system screened | 9 |
LC-MS | 600–1200 for standard LC-MS | Label-free; high sensitivity; does not require chromogenic substrate or products | Slow, often <10 variants per hour of screening; the above raises challenges in shared instrument settings; access to expensive equipment (i.e. MS instrumentation) | 86 |
Direct infusion ESI-MS | 10–20 | Label-free; high sensitivity; does not require chromogenic substrate or products | No online separation of analytes; sensitive to ion suppression from salts and buffers; access to expensive equipment (i.e. MS instrumentation) | 30, 87 and 45 |
LDI-MS | 1–5 | Label-free; high sensitivity; addresses throughput limitation of LC-MS; does not require chromogenic substrate or products | Matrix effects; sample heterogeneity makes reliable quantitation more challenging; no online separation of analytes; sensitive to ion suppression from salts and buffers; access to expensive equipment (i.e. MS instrumentation) | 14 and 60 |
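To put the per-sample speeds in Table 1 into perspective, the following minimal sketch converts them into approximate instrument time for a whole library. The library size of 10 000 variants and the function name `screening_hours` are illustrative assumptions, and the calculation ignores sample preparation, blanks, and replicate injections.

```python
# Approximate per-sample acquisition times (seconds) taken from Table 1.
# Ranges are stored as (low, high); single estimates repeat the same value.
SECONDS_PER_SAMPLE = {
    "Colorimetric microplates": (8.0, 8.0),
    "Digital imaging": (1.2, 1.2),
    "Microfluidics sorting": (3.6e-4, 3.6e-4),
    "LC-MS": (600.0, 1200.0),
    "Direct infusion ESI-MS": (10.0, 20.0),
    "LDI-MS": (1.0, 5.0),
}


def screening_hours(technique, library_size):
    """Return (best, worst) hours needed to acquire every library member once."""
    low, high = SECONDS_PER_SAMPLE[technique]
    return (low * library_size / 3600.0, high * library_size / 3600.0)


if __name__ == "__main__":
    library_size = 10_000  # hypothetical library size, chosen for illustration
    for name in SECONDS_PER_SAMPLE:
        best, worst = screening_hours(name, library_size)
        print(f"{name:26s} {best:12.3f} to {worst:12.3f} h")
```

Under these assumptions, the difference between chromatography-based and LDI-based acquisition is roughly two orders of magnitude in instrument time, which is the gap the methods in this review aim to close.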
After genetic perturbation (mutagenesis), mutant genes are transformed into a microbial host (typically E. coli or yeast) that subsequently acts as an ‘enzyme factory’ after protein expression is induced, producing high amounts of the enzyme of interest. Once a sufficient level of protein expression has been achieved, the enzyme reaction is performed either in whole-cell lysates, or in vitro with purified enzymes. Enzyme activity is then most commonly assessed using LC-MS, where the reaction substrate(s) and product(s) are monitored using extracted ion chromatograms (EICs), extracting solely m/z values of interest from complex reaction mixtures (Fig. 2A). If the change in the chemotype is desirable, such as an improved substrate/product ratio, the gene responsible for production of the engineered enzyme variant is then sequenced, identifying specific changes to the enzyme structure responsible for the favorable change in activity. The gene is then isolated, amplified, and reintroduced into the DE cycle, acting as the starting point (template) for further mutagenesis, screening, and selection (Fig. 2B). While this process is robust, it is time-consuming to perform lysate reactions on large mutant libraries, and even more so to perform in vitro reactions. The additional time required for chromatographic separation and MS analysis of hundreds-to-thousands of samples is also a notable challenge;14,15,32,33 however, several methods discussed in the following sections address this.
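The EIC-based readout described above can be sketched in a few lines. This is a simplified illustration, not the workflow of any cited study: the scan layout, the 10 ppm tolerance, and the function names are assumptions, and the product fraction treats substrate and product as if they ionize equally well, which real assays would need to verify or correct with standards.

```python
def extract_eic(scans, target_mz, tol_ppm=10.0):
    """Build an extracted ion chromatogram for one m/z value.

    scans: list of (retention_time_s, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time_s, summed_intensity) points.
    """
    tol = target_mz * tol_ppm * 1e-6  # absolute tolerance in m/z units
    return [
        (rt, sum(i for mz, i in peaks if abs(mz - target_mz) <= tol))
        for rt, peaks in scans
    ]


def peak_area(eic):
    """Integrate an EIC over retention time with the trapezoidal rule."""
    area = 0.0
    for (t0, i0), (t1, i1) in zip(eic, eic[1:]):
        area += 0.5 * (i0 + i1) * (t1 - t0)
    return area


def product_fraction(scans, substrate_mz, product_mz, tol_ppm=10.0):
    """Return product area / (substrate area + product area)."""
    s = peak_area(extract_eic(scans, substrate_mz, tol_ppm))
    p = peak_area(extract_eic(scans, product_mz, tol_ppm))
    total = s + p
    return p / total if total > 0 else 0.0


if __name__ == "__main__":
    # Synthetic three-scan run with hypothetical m/z values.
    scans = [
        (0.0, [(200.1000, 0.0), (216.0950, 0.0)]),
        (1.0, [(200.1000, 100.0), (216.0950, 300.0), (250.0000, 999.0)]),
        (2.0, [(200.1000, 0.0), (216.0950, 0.0)]),
    ]
    print(product_fraction(scans, substrate_mz=200.1000, product_mz=216.0950))
```

Signals outside the m/z tolerance (the m/z 250 peak in the synthetic data) do not contribute, which is what allows an EIC to isolate analytes of interest from a complex reaction mixture.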
It should be noted that Fig. 2 highlights the simplest scenario depicting the detection of a single expected product. A unique advantage of MS-based screening methods is the flexibility to monitor multiple unique molecular species simultaneously within a single analysis. This sets MS apart from many of the screening methods discussed in the Introduction, which are often limited to monitoring a single analyte in a given experiment. As long as resolution of the analytes can be achieved (by m/z, retention time, and/or collisional cross section in the case of ion mobility spectrometry), multiple analytes can be monitored within a single MS experiment. One approach coined ‘Substrate Multiplexed Screening’ (SUMS) leverages this advantage of MS to select for catalytically promiscuous enzyme variants generated via DE. SUMS involves screening enzyme variants with a mixture of substrates to directly assess mutations that affect substrate preference and scope.34,35 DE campaigns utilizing SUMS have relied on LC-MS for analysis, which impacts overall throughput. However, it is worth highlighting that the compatibility of MS with multidimensional, multiplexed analyses has allowed for significantly increased data richness from a single experiment (monitoring of multiple reaction products) when compared to conventional DE screening techniques, and has fostered the implementation of unique screening approaches that extend outside the conventional theme of improving enzyme activity for a single substrate. Finally, MS-based screening approaches lend themselves to monitoring the formation of unexpected side products, which represents a notable possibility when working with variant enzymes. In this scenario, relative abundance of side products could be factored into the selection criteria (i.e. a desirable chemotype), allowing for more informed decisions to be made related to the relative efficiency and/or promiscuity of variant enzymes for a specific (i.e. intended) transformation.
Fig. 3 Truncated workflow for screening biocatalytic reactions and identifying improved phenylalanine ammonia lyase (PAL) variants. Reprinted and adapted with permission from Kempa et al.44 Copyright © 2021 American Chemical Society. Created with http://BioRender.com.
Kempa et al. later expanded the application of DiBT-IMMS to a HTS campaign of mutant phenylalanine ammonia lyases (PALs) to enhance their activity toward the conversion of electron-rich cinnamic acid derivatives to their noncanonical phenylalanine counterparts via the enantioselective addition of ammonia.44 A metagenomic PAL, AL-11 (GenBank accession number: MW026687), was subjected to DE based on its promiscuity toward a variety of substituted cinnamic acid derivatives, namely di- and trimethoxycinnamic acids, which are generally considered inactive with most PALs. Employment of DiBT-IMMS for biocatalytic screening enabled an analysis rate of ∼40 s per sample. While this is a longer acquisition time than other ESI-MS techniques discussed below, this is primarily due to the nature of the samples themselves and how they are presented to the mass spectrometer. For DiBT-MS, biotransformation reaction solutions are spotted onto membranes, and directly analyzed using DESI-MS in a spatially-resolved fashion omitting automated liquid handling and sample-cleanup steps. This decreases sample handling and preparation time pre-acquisition. Additionally, by acquiring data as a spatially-resolved MS ‘image’, data visualization as a heatmap of relative product abundance is effectively incorporated into the data acquisition itself, saving time post-screening while also providing adequate sample coverage. These factors shorten overall DE campaign length, despite the slightly increased sample acquisition time required for DiBT-MS. This facilitated rapid and comprehensive screening of three mutagenic PAL libraries to identify key amino acid residues necessary for both successful and enhanced conversion to the L-amino acid products. Engineering of AL-11 began with a point mutation of the large polar ligand Q84 to several amino acids of varying size and polarity. 
The mutation Q84V improved conversion of the substrate to the target amino acid product by nearly 7-fold compared to WT AL-11. To further expand the genetic diversity of this engineering approach, three additional combinatorial libraries were constructed using a degenerate codon set RBT (coding for T, S, I, G, A, V), iteratively targeting neighboring active site residues (N199, L196, L148, I400) in an activity-driven manner. The authors were ultimately successful at engineering AL-11 variants with enhanced activities, some capable of >99% conversion of multi-substituted cinnamic acid substrates to their corresponding L-phenylalanine derivative products. Most hits harbored the L196T mutation in combination with mutated residue N199I, N199G, or N199V, which ultimately gave >95% conversion of 2,3-dimethoxycinnamic acid to the respective enantioselective amino acids. Additionally, Kempa et al. highlight the use of IMS-MS for increasing confidence in their screening workflow using an example from a separate panel of mutagenized imine reductases (IREDs). While IMS can be used to determine an analyte's collisional cross section (CCS), further increasing confidence in the identification of specific analytes, it can alternatively be used as an additional filter when coupled to MS systems. If the IMS device is tuned to monitor within a specific mobility range, only ions that fall within this specified range will be detected. Ultimately, this allows molecules that have the same or similar m/z value (isomers vs. isobars, respectively), but different CCS values from the analyte of interest, to be disregarded during the data acquisition process. Doing so prevents false positives resulting from the detection of m/z values that correspond to constitutional isomers or isobaric species. In this example, Kempa et al. used traveling wave ion mobility spectrometry (TWIMS) to filter a signal corresponding to an IRED substrate's second 13C isotopologue peak (m/z 208; [M + 2]+ of the substrate). This approach prevented false positives resulting from the detection of an isobaric m/z value that did not correspond to the target analyte.
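The idea of using ion mobility as a second filter can be illustrated with a minimal sketch. The `Ion` record, the drift-time window, and all numerical values below are hypothetical and chosen only to show the logic; they are not taken from Kempa et al., and real TWIMS data would be gated on calibrated arrival times or CCS values.

```python
from dataclasses import dataclass


@dataclass
class Ion:
    mz: float             # mass-to-charge ratio
    drift_time_ms: float  # ion mobility arrival time (hypothetical units)
    intensity: float


def gated_intensity(ions, target_mz, mz_tol, drift_window_ms=None):
    """Sum intensity for ions matching the target m/z.

    When drift_window_ms = (low, high) is given, ions must also fall
    inside that mobility window, which removes isobaric or isomeric
    interferences that share the m/z but differ in mobility.
    """
    total = 0.0
    for ion in ions:
        if abs(ion.mz - target_mz) > mz_tol:
            continue
        if drift_window_ms is not None:
            low, high = drift_window_ms
            if not (low <= ion.drift_time_ms <= high):
                continue
        total += ion.intensity
    return total


if __name__ == "__main__":
    ions = [
        Ion(208.1, 3.2, 500.0),  # interfering species, e.g. a substrate isotopologue
        Ion(208.1, 4.1, 120.0),  # target analyte at the same nominal m/z
        Ion(150.0, 3.2, 900.0),  # unrelated ion
    ]
    print(gated_intensity(ions, 208.1, 0.05))              # m/z filter only
    print(gated_intensity(ions, 208.1, 0.05, (3.9, 4.3)))  # m/z plus mobility gate
```

Without the mobility gate, the interfering signal inflates the apparent product intensity and could register as a false positive; with the gate, only the ion in the expected mobility window is counted.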
With the aid of RapidFire MS screening, Zetzsche et al. developed an engineered biocatalytic platform for the selective cross coupling of phenolic substrates through oxidative C–C bond formation.45 Biaryl compounds have many uses pertaining to drug design, materials science, and asymmetric catalysis, but there are many challenges in their synthesis. Specifically, metal-catalyzed cross-coupling offering tunable site-selectivity comes at the expense of extra synthetic steps and arduous optimization of reaction conditions, with the formation of tetra-ortho-substituted biaryl bonds remaining a notable synthetic challenge.45,47,48 Several classes of oxidative enzymes that mediate the dimerization of biaryl natural products have been observed from various natural sources including, but not limited to, bacteria and fungi.45,49,50 Notable examples fall within the cytochrome P450 enzyme superfamily, with many mediating highly selective oxidation reactions, and a smaller subset catalyzing site-selective and atroposelective dimerizations. Following optimization, Zetzsche et al.45 were successful at expressing an Aspergillus oxidative P450 enzyme, KtnC, in Pichia pastoris (now classified as Komagataella phaffii). Using this optimized biosynthetic system, successful dimerization of the native coumarin substrate was observed with increased formation of the expected biaryl product (Fig. 4).
The authors later expanded the scope of reactions toward unnatural (non-native) biaryl cross couplings between non-equivalent phenolic substrates with various patterns of substitution. It was observed that WT KtnC catalyzed the formation of various unnatural cross-coupled products with maximized yields over dimerized products, so long as the stoichiometry was tuned. KtnC was also shown to tolerate a range of coumarins substituted with various electron-rich and electron-deficient functional groups. However, as the substrates varied further in structure from that of the native coumarin scaffold, a notable decrease in cross-coupling activity and site-selectivity was observed.
Zetzsche et al. next sought to use DE to expand the activity of WT KtnC and address the limitations of chemo-, site-, and atroposelectivity associated with its biosynthetic capabilities. Using a semi-rational approach, thousands of KtnC enzyme variants were generated containing one or more substitutions within 12 Å of the active site. Following transformation of the mutant genes into S. cerevisiae (strain BY4742) on histidine dropout agar plates (2% glucose), the resulting colonies were inoculated into 96-well plates containing histidine dropout minimal media (4% glucose). Following overnight incubation, the cells were pelleted by centrifugation and the media removed, followed by resuspension into histidine dropout minimal media containing the reaction substrates. After incubation for 2–3 days, reactions were quenched with methanol and centrifuged. The resulting supernatants were subsequently screened for enzymatic product formation using RapidFire MS at a rate of ∼12 s per sample, not including blank injections performed between each new sample to minimize carryover. Peak areas of EIC's for internal standard, substrate, and cross-coupled products were collected for each sample and the relative percent conversions were calculated. The average conversion resulting from template reactions for each plate was set to 1.0, and the conversions for each variant were normalized to that of the template, providing a readout of relative fold improvement. Variants with increased conversion relative to WT KtnC were selected for further DE. Over five rounds of evolution, nine active site amino acid substitutions were performed (P142R, D322E, E329M, C331R, F336Y, G396W, R401Q, S513R, V516M), generating the engineered KtnC variant, LxC5, which was observed to have a 92-fold improvement in both activity and site-selectivity toward the target product. However, this did come at the expense of decreased atroposelectivity as determined by orthogonal chiral reversed phase chromatography. 
This limitation was addressed through two additional rounds of DE to create the KtnC variant LxC7, which catalyzed the cross-coupling to the target product in a 77:23 er, improving atroposelectivity relative to LxC5 (52:48 er), but decreasing overall reaction yield.
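The plate-wise normalization described for the RapidFire MS screen (template conversion set to 1.0, variants reported as relative fold improvement) can be sketched as follows. The conversion formula used here, product area divided by the sum of substrate and product areas, is an assumption, since the text does not state how the internal standard enters the calculation; the well IDs and peak areas are invented for illustration.

```python
from statistics import mean


def percent_conversion(substrate_area, product_area):
    """Assumed conversion metric: product / (substrate + product), in percent."""
    total = substrate_area + product_area
    return 100.0 * product_area / total if total > 0 else 0.0


def fold_improvement(plate, template_wells):
    """Normalize each well's conversion to the mean of the template wells.

    plate: dict mapping well ID -> (substrate_area, product_area).
    template_wells: well IDs containing the template (parent) enzyme.
    """
    conversions = {w: percent_conversion(s, p) for w, (s, p) in plate.items()}
    template_mean = mean(conversions[w] for w in template_wells)
    if template_mean == 0:
        raise ValueError("template wells show no conversion; cannot normalize")
    return {w: c / template_mean for w, c in conversions.items()}


def select_hits(folds, template_wells, threshold=1.0):
    """Return variant wells above the threshold, best first."""
    hits = [w for w, f in folds.items() if w not in template_wells and f > threshold]
    return sorted(hits, key=lambda w: folds[w], reverse=True)


if __name__ == "__main__":
    plate = {"A1": (90.0, 10.0), "A2": (80.0, 20.0), "B1": (40.0, 60.0), "B2": (95.0, 5.0)}
    templates = ["A1", "A2"]
    folds = fold_improvement(plate, templates)
    print(folds)
    print(select_hits(folds, templates))
```

Normalizing within each plate, rather than across the whole campaign, compensates for plate-to-plate variation in expression and instrument response, which is presumably why template controls are included on every plate.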
Zhang et al. implemented an automated sample preparation workflow coupled to MALDI-TOF MS analysis to detect the biocatalytic reaction products at a rate of ∼5 seconds per sample.60 The generality of their sample preparation procedure suggests this HTS workflow could be applied to a variety of biocatalytic systems with diverse substrate scopes so long as a suitable matrix was chosen and the proper equipment was available. In their study, Zhang et al. aimed to use DE to engineer a cyclodipeptide synthase (CDPS) to produce novel diketopiperazine (DKP) products; CDPSs are recognized as challenging targets for rational engineering approaches, and DKPs can be useful pharmacophores.64–67 The CDPS AlbC was central to the assay and utilizes phenylalanyl-tRNAPhe (Phe-tRNAPhe) and leucyl-tRNALeu (Leu-tRNALeu) as substrates to synthesize cyclo(L-Phe-L-Leu) (cFL) in its native host Streptomyces noursei. They successfully screened a random mutagenesis library consisting of ∼4500 CDPS variants within a week using unlabeled MALDI-TOF MS analysis coupled to an integrated, robotic workcell (Fig. 5).
Fig. 5 (A) Biosynthetic scheme of diketopiperazines by cyclodipeptide synthase (CDPS). (B) Truncated workflow of the MALDI-MS-based HTS method for directed evolution of AlbC in E. coli by Zhang et al.60 Copyright © 2022 Royal Society of Chemistry (RSC). Created with http://BioRender.com.
This ultimately led to the discovery of an AlbC mutant that produced a new cyclopeptide product (cFV), which revealed a previously unknown residue (F186) that had a substantial impact on the substrate specificity of AlbC. This result was confirmed by further LC-MS and computational analyses of AlbC clones harboring the same mutation (F186L). Additionally, by generating and screening site-saturation mutagenesis (SSM) libraries of AlbC, Zhang et al. were able to confirm the impact of known mutations and reveal new specificity-modulating mutations (T206F) in AlbC, further highlighting the utility of random engineering approaches like DE, and the importance of broadly applicable (unlabeled/untargeted) HTS methods for screening challenging biochemical systems (Fig. 6).
Fig. 6 Structural illustration of the substrate binding pocket of the wildtype AlbC and the mutants F186L and T206F. Note: Panel (A) superimposition of the WT with T206F. Panels (B and C) schematic representation of the hydrogen-bond and hydrophobic interactions between cFL and WT (B) and between cFV and F186L (C). Dashed lines indicate hydrogen bond (yellow) interactions. The key amino acid residues of WT, F186L, and T206F are drawn as sticks in green, yellow and orange, respectively. Figure and caption reprinted from Zhang et al.60 Copyright © 2022 Royal Society of Chemistry (RSC).
With a goal of continuing to evolve this catalyst in a high-throughput manner, Pluchinsky et al. sought the use of an LDI-based methodology, ‘SAMDI’, to overcome the need for chromatography while monitoring this reaction.14 SAMDI works by applying MALDI to self-assembled monolayers (SAMs) that assemble on a surface (gold) by adsorption. When the surface is submerged in a solution of alkane thiolates, the sulfur atoms coordinate to the gold in a densely packed array. The alkane portion extends from the gold and can be functionalized to provide surfaces with defined chemical reactivities.69 Pluchinsky et al.14 utilized a gold-coated MALDI plate, which was soaked in a solution of disulfide alkanes functionalized with a mixture of maleimide and tri-(ethylene glycol) functional groups, chosen based on the chemistry available on the reaction substrate and products. This generated a MALDI plate that, when lysate reaction mixtures were spotted, selectively immobilized the reaction substrate and product on the surface via a conjugate/Michael addition. This allowed subsequent analysis of the analyte-alkane thiolate conjugates by MALDI-MS.70 The researchers identified the products by a corresponding change in mass of 86 Da and integrated the peaks for the substrate and product to obtain a reaction yield. For each of the libraries, heat maps were generated to exhibit the relative activities of the variants (Fig. 8). Promising library members were run at an analytical scale and their activities were validated using GC-MS. The variant exhibiting the highest conversion to the desired product was then chosen to be the template for the succeeding round of evolution.14
Fig. 8 SAMDI employed for HTS of cytochrome P411 variants. (A) Schematic representation of SAMDI. The thiol headgroup of functionalized alkane thiolates coordinates to the gold surface of the MALDI plate, forming a SAM. The reaction substrate and product are subsequently immobilized on the functionalized surface, followed by matrix application and MALDI-MS analysis. (B) Exposed thiols resulting from HCl induced deprotection of (E)-S-(7-methoxyhept-5-en-1-yl) ethanethioate (substrate) and ethyl (E)-9-(acetylthio)-3-methoxynon-4-enoate (product) are immobilized (covalently tethered) to the SAM via a Michael addition to the maleimide functional group. (C) MALDI analysis was performed, specifically monitoring for the respective m/z values corresponding to the thiolated substrate ([M + Na]+ = 1033.5861) and product ([M + Na]+ = 1119.6229). A mass shift of 86 Da is indicative of successful insertion of the ethyl acetate moiety onto the substrate via cytochrome P411. Reprinted (adapted) with permission from Pluchinsky et al.14 Copyright © 2020 American Chemical Society. (D) High throughput visualization of the MALDI-MS spectra represented as heat maps displaying mutants shaded and organized by relative fold improvement normalized by the average of template controls on each respective 96-well plate. Specifically, reaction yield was calculated (yield = [product area]/[substrate area + product area]) for each enzyme variant, and visualized relative to the average yield observed by the template control enzymes for each plate screened. Reprinted and adapted with permission from Pluchinsky et al. Copyright © 2020 American Chemical Society.
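The SAMDI readout can be sketched from the values given in the Fig. 8 caption: the two [M + Na]+ ions differ by about 86 Da, and yield is product area divided by the sum of substrate and product areas. In the minimal sketch below, summing intensities within a fixed m/z window stands in for peak integration; the 0.1 Da tolerances, the function names, and the example spectrum are assumptions made for illustration.

```python
SUBSTRATE_MZ = 1033.5861   # [M + Na]+ of the immobilized substrate (Fig. 8 caption)
PRODUCT_MZ = 1119.6229     # [M + Na]+ of the immobilized product (Fig. 8 caption)
EXPECTED_SHIFT_DA = 86.0   # nominal mass shift reported in the text


def shift_matches(substrate_mz, product_mz, expected=EXPECTED_SHIFT_DA, tol_da=0.1):
    """Check that the product ion sits at the expected mass shift from the substrate."""
    return abs((product_mz - substrate_mz) - expected) <= tol_da


def window_intensity(spectrum, target_mz, tol_da=0.1):
    """Sum intensities within +/- tol_da of target_mz (stand-in for peak area)."""
    return sum(i for mz, i in spectrum if abs(mz - target_mz) <= tol_da)


def samdi_yield(spectrum, substrate_mz=SUBSTRATE_MZ, product_mz=PRODUCT_MZ, tol_da=0.1):
    """yield = product area / (substrate area + product area)."""
    s = window_intensity(spectrum, substrate_mz, tol_da)
    p = window_intensity(spectrum, product_mz, tol_da)
    total = s + p
    return p / total if total > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical centroided spectrum: (m/z, intensity) pairs.
    spectrum = [(900.00, 50.0), (1033.59, 300.0), (1119.62, 100.0)]
    print(shift_matches(SUBSTRATE_MZ, PRODUCT_MZ))
    print(samdi_yield(spectrum))
```

Because both analytes are tethered to the same monolayer chemistry, the ratio of the two signals is a convenient relative measure, although it does not by itself correct for differences in ionization efficiency between substrate and product conjugates.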
To identify enzymes with increased activity, iterative rounds of random mutagenesis and screening were performed in E. coli. Over three rounds of evolution, data for ∼5000 variants were acquired. SAMDI screening was estimated to be ∼140-fold faster than using GC-MS as the primary screening method. The reproducibility of the technique was demonstrated by selecting and scaling up one variant from the final round of evolution to be repeatedly screened using SAMDI. Overall, Pluchinsky et al. showed that SAMDI-MS analysis can be applied as an effective and efficient DE screening technique.14 Given SAMDI's previous applications in HTS and optimization of both synthetic71 and enzymatic reactions,72–74 it is likely that SAMDI could be employed to monitor a diverse range of biochemical transformations relevant to NP enzyme engineering that are not limited to fluorescent probes or downstream signaling molecules. The caveat to this is that the transformation being observed must exhibit a corresponding change in mass between the substrate and the product(s), and must contain a selectively reactive functional group for covalent attachment to the reactive headgroup of the disulfide alkanes. Lastly, SAMDI does require the use of MALDI plates specifically modified for constructing SAMs, however these have become commercially available in recent years (Charles River Laboratories).
This method was first applied to examine the substrate tolerance of the biosynthesis of the antibiotic plantazolicin (PZN), a linear azol(in)e-containing peptide member of the ribosomally synthesized and post-translationally modified peptides (RiPPs) family of natural products produced by Bacillus velezensis. PZN is first synthesized as an unmodified peptide via translation at the ribosome. A trimeric heterocycle synthetase (composed of the enzymes PznB, PznC, and PznD) then converts select cysteine, serine, and threonine residues in the C-terminal (core) region of the precursor peptide into thiazol(in)e and (methyl)oxazol(in)e heterocycles through a series of cyclodehydration and dehydrogenation reactions. These heterocycles are then stabilized through a series of post-translational modifications, including additional oxidations catalyzed by the dehydrogenase subunit, which converts thiazoline and oxazoline intermediates into aromatic thiazole and oxazole rings. This step is followed by dimethylation by PznL and leader peptide cleavage mediated by the protease PznE, culminating in the production of the bioactive natural product.79 Analogues of PZN were assembled by site-saturation mutagenesis of the precursor peptide gene at two non-cyclized positions (Fig. 9a). After collection, the data were analyzed by t-distributed stochastic neighbor embedding (t-SNE) for unsupervised clustering of spectra, followed by manual examination, and the positions of mutants were mapped to the plate. t-SNE is a dimensionality reduction technique used to visualize high-dimensional data in two or three dimensions. It captures the structure of the data by preserving local relationships, which makes it practical to explore clusters or patterns in complex datasets. Thirteen PZN analogues, previously isolated by liquid cultivation and extraction, and ten previously unreported variants were detected.
After success with PZN, the screening method was used to quantify changes in the relative abundance of rhamnolipid (RhL) products of rhamnosyltransferase reactions after modifying enzymatic specificities in the RhL biosynthesis pathway (Fig. 9b). Directed protein evolution of a two-enzyme pathway for the biosynthesis of RhLs was carried out, and two mutant strains, identified by MALDI-TOF analysis, were confirmed to produce significantly different ratios of RhL products than WT. The authors showed that this method is reproducible in multiple contexts and improves upon the speed and cost with which mutants can be screened relative to traditional methods such as automated liquid handling (5 s and $0.0065 versus 15 s and $0.86, respectively).78
Another example of MALDI MSI for enzymatic screening is its use for metabolic biosensing to map enzyme activity of a G2PS1 type-3 polyketide synthase (PKS) from Gerbera hybrida.80 Through their unique screening approach, Xu et al. combine the scalability of microfluidics technology with the generalizability of MALDI MSI for assessing enzyme activity (Fig. 10). G2PS1 natively catalyzes the biosynthesis of triacetic acid lactone (TAL) via condensation of a starter acetyl-CoA unit with two malonyl-CoA units, followed by cyclization of the triketide chain. Importantly, it has been shown that active site mutations in type-3 PKSs can drastically alter the kinetics and scope of polyketide products, allowing access to potentially novel product species.80–82 Since the G2PS1 PKS enzyme of interest in this study acts on molecules associated with primary metabolism, the activity of the enzyme has a direct influence on the metabolite profile of the host, which can be leveraged to detect enzyme activity even if the target enzyme product is not directly observed. As such, this approach can also be used as a general method to characterize a range of products produced by engineered enzymes of interest.
Fig. 10 Truncated schematic of the printed droplet microfluidics (PDM) – MALDI MSI sample preparation and analysis workflow used for enzymatic screening of G2PS1 mutants by Xu et al.80 (A) Yeast cells harboring mutant G2PS1 genes are suspended in aqueous yeast nitrogen base minimal media solution. (B) Individual cells are encapsulated into aqueous droplets suspended in fluorocarbon oil. (C) The encapsulated single cells are collected into a 5 mL syringe and stored vertically in a shaking 30 °C incubator for 5 days to form isogenic colonies within each droplet. (D) Droplets containing isogenic colonies are scanned at multiple wavelengths and sorted based on cell density. (E) Droplets containing a uniform cell density are printed onto the custom glass MALDI slide into individual wells and allowed to dry. 2,5-dihydroxybenzoic acid (DHB) MALDI matrix is subsequently sprayed over the printed MALDI slide. (F) Once dry, the slide containing 10![]() |
First, a semi-rational G2PS1 type-3 PKS mutant library was constructed, varying four key amino acid positions within or near the substrate binding pocket (T199, L202, M259, L261). Ultimately, the library consisted of 1960 codon-shuffled members that were subsequently synthesized into a plasmid backbone and transformed into Yarrowia lipolytica. Single cells were encapsulated and cultured in 300 pL droplets to generate isogenic colonies. This cultivation produced additional material compared to a single cell, thus boosting MS signal and aiding in the collection of reliable metabolomic data. The resulting isogenic colonies were individually dispensed using PDM (∼30 minutes for 10 000 colonies) onto custom glass slides etched with 10 000 wells (80 μm in diameter) with rounded bottoms to concentrate the dispensed material to the center of each well (Fig. 10E). With PDM, the encapsulated colonies are individually scanned at multiple wavelengths, allowing only those within a narrow cell density range (corresponding to a specified optical threshold processed by a field-programmable gate array) to be dispensed onto the glass slide (Fig. 10D). Finally, the glass slide is dried and spray-coated with matrix, ultimately being subjected to MALDI MSI analysis between m/z 30–630 (Fig. 10F). The authors note that higher capacity slides with the dimensions of a standard MALDI target plate can also be constructed, allowing for up to 100 000 wells.
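The density gate and dispensing throughput described above can be illustrated with a short sketch. This is only a schematic under stated assumptions: each droplet is reduced to a single hypothetical optical reading, and the gate limits (0.4 and 0.6) are invented for illustration. The actual wavelengths, thresholds, and FPGA logic used by Xu et al. are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Droplet:
    droplet_id: int
    optical_signal: float  # hypothetical scalar proxy for cell density


def gate_droplets(droplets, low, high):
    """Keep only droplets whose optical signal lies within [low, high]."""
    return [d for d in droplets if low <= d.optical_signal <= high]


def dispense_rate_per_second(n_colonies, minutes):
    """Average dispensing rate implied by a colony count and a run time."""
    return n_colonies / (minutes * 60)


# Toy readings (assumed values, not experimental data).
droplets = [Droplet(0, 0.12), Droplet(1, 0.48), Droplet(2, 0.55), Droplet(3, 0.91)]
selected = gate_droplets(droplets, low=0.4, high=0.6)
print([d.droplet_id for d in selected])

# ~30 minutes for 10 000 colonies, as stated in the text.
print(round(dispense_rate_per_second(10_000, 30), 2), "colonies per second")
```

With the figures quoted in the text, the implied average rate is between five and six colonies dispensed per second.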
Since MS is used for analysis, information on many molecules within the host cells can be obtained, enabling discovery of unexpected enzyme activities. To visualize and identify these varying enzyme activities, Xu et al. plot variations in the metabolome of the cells as a Uniform Manifold Approximation and Projection (UMAP), which visualizes high-dimensional data within a single plane while preserving clustering information. Through UMAP analysis, four clusters were observed, each representing groups of cells with distinct, yet related, metabolite profiles. Upon mapping the m/z for the target polyketide (TAL, m/z 127), it was observed that one large cluster represented productive G2PS1 mutants biosynthesizing varying amounts of the TAL product. Through further analysis of the generated UMAP, cells within a separate prominent cluster were shown to produce high amounts of another metabolite corresponding to m/z 169. Using LC-MS/MS, this mass was confirmed to correspond to another reported alternative product of G2PS1, 6-acetonyl-4-hydroxy-2-pyrone (AHP). Approximately 50 mutants from each cluster were selected for sequencing and results were validated by re-transforming and testing several mutants in bulk analyses using LC-MS. From their initial mutagenesis campaign, mutant TLMV (differing from WT G2PS1 by the mutation L261V) was observed to have high selectivity for the TAL product, with a 1.62-fold improvement in activity over WT G2PS1. Conversely, mutant TLIG (differing from WT G2PS1 by the mutations M259I and L261G) was observed to have high selectivity for the AHP product, with a 52.1-fold improvement in activity over WT G2PS1. 
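To make the step of mapping product ions onto individual wells more concrete, the following sketch extracts the TAL (m/z 127) and AHP (m/z 169) signals from per-well peak lists and computes a fold change relative to WT. It is not the authors' pipeline and does not reproduce the UMAP embedding; the peak lists, the ±0.2 m/z tolerance, and all function names are assumptions made for illustration.

```python
TAL_MZ = 127.0  # nominal m/z given for TAL in the text
AHP_MZ = 169.0  # nominal m/z given for AHP in the text


def ion_intensity(peaks, target_mz, tol=0.2):
    """Sum intensities of peaks within +/- tol of target_mz (tol is assumed)."""
    return sum(intensity for mz, intensity in peaks if abs(mz - target_mz) <= tol)


def product_profile(peaks):
    """Summarize TAL and AHP signal for one well."""
    tal = ion_intensity(peaks, TAL_MZ)
    ahp = ion_intensity(peaks, AHP_MZ)
    total = tal + ahp
    # Fraction of product signal assigned to AHP; None if neither ion is seen.
    ahp_fraction = ahp / total if total > 0 else None
    return {"TAL": tal, "AHP": ahp, "AHP_fraction": ahp_fraction}


def fold_change(variant_signal, wt_signal):
    """Ratio of a variant's signal to the WT signal."""
    if wt_signal <= 0:
        raise ValueError("WT signal must be positive")
    return variant_signal / wt_signal


# Toy peak lists as (m/z, intensity) pairs; values are invented.
wells = {
    "well_A": [(127.04, 800.0), (169.05, 50.0), (203.10, 300.0)],
    "well_B": [(127.03, 100.0), (169.06, 900.0)],
    "well_C": [(203.10, 250.0)],
}

for name, peaks in wells.items():
    print(name, product_profile(peaks))
```

In practice the clustering reported by Xu et al. was performed on the full metabolite profile rather than on these two ions alone, which is what allowed product formation to be inferred indirectly.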
Taking the observed mutations and their relative production of TAL and AHP into account, Xu et al.80 used a consensus design approach83 to engineer the TLLL mutant (identical to the WT at all key active site residues except for the mutation M259L), which showed a 1.98-fold increase in production of the TAL product and a 17.4-fold increase in production of the AHP product relative to the WT (Fig. 11). Ultimately, these results support the idea that the generation of specific enzymatic products can be inferred by clustering metabolomic profiles of the hosts, without directly clustering the m/z values of the products themselves. This also allows for a more generalized metabolomic analysis that could potentially be expanded to assess promiscuous (engineered) enzyme activity without the use of time-consuming analytical screening technologies. It is important to note that this approach may not yield the same results when analyzing enzymes that do not use endogenous substrates associated with the host's primary metabolism. Finally, it should be highlighted that this approach requires specialized equipment beyond a MALDI-MS instrument alone. As stated previously, custom glass slides must be constructed, and specialized software must be used to select droplets containing adequate cell density. As with many microfluidic devices, the physical preparation of the device itself requires highly specific conditions and parameters for adequate sorting of droplets, which could present further challenges to both access and implementation.
Fig. 11 The smallest condensation/cyclization products typically produced by type III polyketide synthases include triacetic acid lactone (TAL, the native product of G2PS1), formed from one acetyl-CoA and two malonyl-CoA, and 6-acetonyl-4-hydroxy-2-pyrone (AHP), formed from one acetyl-CoA and three malonyl-CoA. Schema adapted from Xu et al.80
While MADS has not yet been reported for de novo DE screening and selection, Payne et al. assessed its application for the selection of a previously engineered variant of a 4-hydroxyhydrodipicolinate synthase (DapA E84T) producing increased amounts of lysine in vivo relative to WT DapA (Fig. 12).91 From their test dataset, MADS offered a 2.9% false positive rate (i.e. 2.9% of droplets sorted into the ‘winning’ pool contained cells expressing WT DapA), and an acceptably low false negative rate of 4.3% (i.e. 4.3% of droplets sorted into the ‘waste’ pool contained cells expressing DapA E84T). This level of accuracy is consistent with widely used fluorescence-based droplet sorting techniques, such as FADS.27,28,91
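The two error rates above are defined per sorted pool, which can be written out explicitly. The sketch below follows the definitions given in the text; the droplet counts are hypothetical and were chosen only so that the arithmetic reproduces the reported 2.9% and 4.3% values. They are not the counts from Payne et al.

```python
def sorting_error_rates(winners_wt, winners_variant, waste_wt, waste_variant):
    """Error rates as defined in the text.

    false positive rate: fraction of 'winning' droplets that contain WT cells
    false negative rate: fraction of 'waste' droplets that contain the variant
    """
    winners_total = winners_wt + winners_variant
    waste_total = waste_wt + waste_variant
    if winners_total == 0 or waste_total == 0:
        raise ValueError("both pools must contain at least one droplet")
    return {
        "false_positive_rate": winners_wt / winners_total,
        "false_negative_rate": waste_variant / waste_total,
    }


# Hypothetical counts (1000 droplets per pool), for illustration only.
rates = sorting_error_rates(
    winners_wt=29, winners_variant=971, waste_wt=957, waste_variant=43
)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```

Note that both rates are normalized to the size of the sorted pool rather than to the number of WT or variant droplets entering the sorter, so they depend on the starting ratio of the two populations.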
Fig. 12 Schematic representation of the MADS enzyme screening pipeline as outlined by Payne et al.91 E. coli cells expressing either WT DapA or DapA E84T are suspended into water-in-oil droplets and subsequently incubated to allow the formation of droplet-encapsulated isogenic colonies. The droplets are then split, allowing one portion to be analyzed by ESI-MS. The other portion is sorted by DEP based on the corresponding MS signal intensity. Selected droplets (“winners”) are resuspended in fresh cell media for bulk culture, allowing subsequent phenotyping of the gene(s) and validation of activity by LC-MS/MS. Reprinted (adapted) with permission from Payne et al.91 Copyright © 2023 American Chemical Society.
These results suggest that utilizing mass spectrometry as the detection/selection method in microfluidic sorting campaigns may be a viable option for screening challenging biocatalytic transformations in a label-free, high-throughput manner. Though just one example, MADS highlights the capabilities and importance of leveraging the combined strength(s) of existing analytical technologies to address the ever-evolving challenges associated with HTS and DE campaigns.
Previously, we discussed the applications of IMS-MS in DiBT-IMMS as a means of removing false positives.44 Beyond this, IMS-MS has potential as an efficient dereplication tool, for example for quickly differentiating isomeric compounds in DE screens. Prodiginines are a family of natural pigments produced by certain bacteria. These pigments and their biosynthesis have been widely studied owing to their promising range of biological activities, including antibacterial, antifungal, anticancer, and immunosuppressive properties.92 Because of their complex structures, prodiginines can exist in various isomeric and tautomeric forms, which typically require lengthy liquid chromatographic separations, together with fragmentation and NMR analysis, for differentiation. Ramachandra et al. recently demonstrated the differentiation of these products using such methods.93
Marshall et al. highlighted IMS-MS as a rapid alternative for differentiating the isomeric forms of prodiginines. Collisional cross-section (CCS) values were employed to differentiate isomeric cyclic prodiginines and confirm their identities. The measured CCS values for synthetic standards matched theoretical CCS calculations (<2% difference), suggesting that these types of differentiations could be made in the absence of standards.94 This work demonstrates the power of ion mobility for rapid natural product identification and conformational analysis, as well as its HTS potential relative to traditional techniques.
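The CCS matching criterion can be expressed as a simple percent-difference check. The sketch below is illustrative only: the CCS values (in Å²) and isomer labels are invented, and normalizing the difference to the theoretical value is an assumption, since the convention used by Marshall et al. is not stated here.

```python
def ccs_percent_difference(measured, theoretical):
    """Absolute difference as a percentage of the theoretical CCS (assumed convention)."""
    return abs(measured - theoretical) / theoretical * 100


def assign_isomer(measured, candidates, max_percent=2.0):
    """Return the candidate with the closest theoretical CCS within tolerance.

    candidates maps an isomer label to its theoretical CCS.
    Returns None when no candidate falls within max_percent.
    """
    best_name, best_diff = None, None
    for name, theoretical in candidates.items():
        diff = ccs_percent_difference(measured, theoretical)
        if diff <= max_percent and (best_diff is None or diff < best_diff):
            best_name, best_diff = name, diff
    return best_name


# Hypothetical theoretical CCS values for two isomers (not literature values).
candidates = {"isomer_A": 180.5, "isomer_B": 190.0}

print(assign_isomer(182.0, candidates))
print(assign_isomer(200.0, candidates))
```

An assignment of this kind is only meaningful when the theoretical CCS values of the candidate isomers differ by more than the chosen tolerance.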
The increasing advantages of using native and engineered enzymes in chemical synthesis have made the field of biocatalysis a vibrant, exciting, and dynamic scientific subdiscipline.95,96 As the chemical industry continues to adapt to the growing presence of biocatalysis and how it can complement traditional chemical methods,97,98 it becomes increasingly important to improve technologies that identify and characterize biocatalysts and enable their downstream applications. Advancements in microfluidics,99 elegant multi-enzyme cascade assays,100,101 the fusion of biocatalysis with photocatalysis,102,103 and the de novo design of novel biocatalysts104 are a handful of the landmark technological achievements in protein engineering and biocatalysis in recent years. However, as emphasized throughout this article, advances in MS techniques and modalities can provide nearly universal support for DE campaigns while also accommodating their individual needs. The versatility of MS ionization, detection, and resolution modalities, alongside compatibility across different biological and chemical matrices, allows this technique to be broadly applicable across DE screening approaches. MS-based DE screens are most beneficial for biocatalytic transformations that generate obvious differences in m/z features or incorporate moieties with diagnostic isotope patterns (such as those introduced by halogenases). However, the balance between rapid screening and depth of sampling is more challenging when assessing closely related chemical structures, including regioisomers, atropisomers, or enantiomers with identical m/z values. Orthogonal validation is frequently required when assessing these chemical products, showing that there is not a ‘one size fits all’ model for the application of MS-based techniques in DE campaigns. As MS technologies continue to evolve, it will be exciting to see how they further enhance the discovery of novel biocatalysts and their reactions.
Finally, in surveying the literature associated with these different screening approaches, a noted bottleneck for many technologies is the lack of integrated data analysis capabilities for visualizing the output of these assays. This complicates how the data are processed and visualized, with each individual campaign setting its own thresholds for hits. Creating a vendor-agnostic analysis approach would be highly beneficial to the continued development and implementation of MS-based HTS DE assays. Taking into account best-practice examples from pharmaceutical HTS, the use of internal standards to normalize MS-based readouts is highly beneficial in the design of these assays. In our own efforts, we have found the addition of heavy internal standards to be critical for normalizing the analysis when comparing reactant-to-product ratios, as well as for clearly defining positive and negative controls for each assay. However, synthesis of heavy internal standards for natural products may still present a challenge to implementation. Overall, the number of high throughput approaches is growing, and the incorporation of orthogonal separations, such as ion mobility, that add dimensionality on time scales compatible with MS will facilitate measurements in DE campaigns that keep pace with genetics-based approaches.
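As a simple illustration of internal standard normalization and control-based hit calling, consider the sketch below. It is not a description of any specific published workflow: the intensities are invented, and the threshold of the negative control mean plus three standard deviations is a commonly used HTS convention adopted here as an assumption.

```python
from statistics import mean, stdev


def normalize(analyte_intensity, internal_standard_intensity):
    """Ratio of analyte signal to the heavy internal standard signal."""
    if internal_standard_intensity <= 0:
        raise ValueError("internal standard signal must be positive")
    return analyte_intensity / internal_standard_intensity


def hit_threshold(negative_control_ratios, n_sd=3.0):
    """Mean of the negative controls plus n_sd standard deviations (assumed rule)."""
    return mean(negative_control_ratios) + n_sd * stdev(negative_control_ratios)


# Hypothetical (product, internal standard) intensities for negative control wells.
negative_controls = [
    normalize(product, standard)
    for product, standard in [(200, 10_000), (300, 10_000), (250, 10_000)]
]
threshold = hit_threshold(negative_controls)

# Hypothetical variant wells, measured with the same internal standard spike.
variants = {"variant_1": (5_000, 10_000), "variant_2": (300, 10_000)}
hits = [
    name
    for name, (product, standard) in variants.items()
    if normalize(product, standard) > threshold
]
print(hits)
```

Because every well is expressed relative to the same spiked standard, a threshold defined this way can be compared across plates and instruments more readily than one based on raw intensities.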
This journal is © The Royal Society of Chemistry 2025