Irene M. Francino-Urdaniz and Timothy A. Whitehead*
Department of Chemical and Biological Engineering, University of Colorado, JSC Biotechnology Building, 3415 Colorado Avenue, Boulder, CO 80305, USA. E-mail: timothy.whitehead@colorado.edu; Tel: +1 303-735-2145
First published on 29th September 2021
This mini-review presents a critical survey of techniques used for epitope mapping on the SARS-CoV-2 Spike protein. The sequence and structures for common neutralizing and non-neutralizing epitopes on the Spike protein are described as determined by X-ray crystallography, electron microscopy and linear peptide epitope mapping, among other methods. An additional focus of this mini-review is an analytical appraisal of different deep mutational scanning workflows for conformational epitope mapping and identification of mutants on the Spike protein which escape antibody neutralization. Such a focus is necessary as a critical review of deep mutational scanning for conformational epitope mapping has not been published. A perspective is presented on the use of different epitope determination methods for development of broadly potent antibody therapies and vaccines against SARS-CoV-2.
Irene M. Francino-Urdaniz obtained her BS in Chemical Engineering in 2019 at Universitat de Barcelona, Spain, and her MSc in Chemical Engineering in 2021 at the University of Colorado Boulder. She is currently a third-year PhD student at the University of Colorado Boulder. Her research focuses on identifying antibody escape mutants and developing broadly neutralizing antibodies using yeast surface display.
Tim Whitehead trained in Chemical Engineering at Vanderbilt University (BE) and the University of California-Berkeley (PhD). After a short postdoctoral position at the University of Washington under David Baker, he started his independent group in 2011 at Michigan State University. Since 2019 he has been at the University of Colorado Boulder, where his lab focuses on solving data-driven protein engineering and design challenges in the fields of antibody engineering, enzyme engineering, and biosensor development.
An important class of protein–protein interactions is that of antibodies with antigens. Here, the epitope is defined as the antigenic surface recognized by a given antibody. Identifying the structures, sequences, and sequence constraints on such antigen epitopes is essential for solving difficult problems in basic and applied immunology. For example, a key idea in modern vaccine design has been that antigen structures can be modified rationally to present critical epitopes that elicit antibodies that neutralize infection (neutralizing antibodies or nAbs) and that, in turn, confer long-lasting protection. The first proof of concept demonstration of such structure-based vaccine design in Phase I clinical trials was published4 for an immunogen mimicking a key conformational epitope of a viral protein in respiratory syncytial virus. Similarly, the search for a universal influenza A vaccine was jump-started by the structural and sequence identification of a conserved epitope on the influenza surface protein haemagglutinin.5–7 Antibodies targeting this haemagglutinin epitope are able to neutralize broadly across different influenza A subtypes. This structural definition of an epitope led to immunogen designs that elicit high levels of broadly neutralizing antibody titers in a recently completed Phase I clinical trial.8 Thus, therapeutic and prophylactic strategies are informed by, and often start with, a sequence and structural definition of an antigenic epitope.
There exist several relatively mature technologies available to delineate the sequences, structures, or sequence constraints of epitopes. In fact, several comprehensive reviews of individual methods have been published in this century.9–16 Table 1 lists common experimental methods for epitope mapping. There are two major classifications of epitopes primarily based on the experimental method used for their identification. Linear epitopes are those that involve sequential residues in the primary amino acid sequence and can be identified using techniques like peptide microarrays, phage, or bacterial display. By contrast, conformational epitopes involve surfaces recognized by antibodies only when a protein is folded in its tertiary or quaternary state. Such conformationally sensitive epitopes are typically resolved by structural determination using X-ray crystallography or electron microscopy (EM). Less commonly, hydrogen–deuterium exchange coupled to mass spectrometry (HDX-MS)16 or deep mutational scanning17 can be employed. All methods have their relative strengths and drawbacks, but generally it has been difficult to compare directly between methods as not all are typically performed on the same set of proteins.
Category | Technique | Information obtained | Comparative advantage | Comprehensive review |
---|---|---|---|---|
Linear epitopes | Peptide arrays | Linear peptide sequence recognized by antibody | Massive parallelization allows proteome-size scalability | Katz et al.12 |
Linear epitopes | Phage and bacterial display | Linear peptide sequence recognized by antibody | Can use linear and constrained peptides in a high throughput format | Pande et al.14 |
Conformational epitopes | Electron microscopy (cryo-/negative stain) | Atomic structure of an antigen-antibody complex | Structural determination of large, complex complexes with only small amounts of material needed | Renaud et al.11 |
Conformational epitopes | X-ray crystallography | Atomic structure of an antigen-antibody complex | Highest quality atomic structural determination | Malito et al.10 |
Conformational epitopes | HDX-MS | Antigenic surfaces shielded from solvent in presence of antibody | Description of dynamic conformations | Sun et al.16 |
Conformational epitopes | Deep mutational scanning | Comprehensive antigenic sequence determinants to binding/competitive inhibition | High resolution sequence constraints on antigenic epitopes and evaluation of point mutants | |
The emergence of SARS-CoV-218 has led to intense research on its virology, epidemiology, and therapeutic and prophylactic interventions.19 During this time, dozens of research groups around the world identified antibodies raised against natural SARS-CoV-2 infection.20–24 This outpouring of research represents a natural experiment for the relative strengths, weaknesses, and types of information inherent in different epitope mapping methods. Thus, in this review we critically survey techniques used for epitope mapping on SARS-CoV-2. We do not intend an in-depth explanation of all the methods, since exhaustive modern reviews already exist and are cited. An additional focus of this mini-review is epitope mapping and identification of mutants which escape antibody neutralization using deep mutational scanning,17 as to our knowledge no comprehensive review exists. Thus, the second half of this review is given to the critical appraisal of different deep mutational scanning strategies. Because the effect of individual mutations on binding can be studied, deep mutational scanning is especially relevant when developing antibodies against evolving viruses.
Given that well over a hundred thousand papers have been published on SARS-CoV-2,19 a comprehensive review is impractical for this short mini-review format. We apologize to colleagues whose work we have failed to cite.
The RBD is a major target for neutralizing antibodies since it is responsible for binding ACE2.27 In the early days of the COVID-19 pandemic, antibodies from SARS-CoV convalescent patients were screened against SARS-CoV-2 S RBD. An early cross-reactive antibody is CR3022,20 and this antibody defines one non-neutralizing and broadly conserved epitope on RBD distal to its RBM (Fig. 1b). Another early described broadly conserved epitope is the one recognized by mAb S30931 (Fig. 1b), which recognizes an epitope defined by a conserved N-linked glycan at Asn343. In contrast to the CR3022 epitope, antibodies at this S309 epitope neutralize both SARS-CoV and SARS-CoV-2. Further into the pandemic, SARS-CoV-2 specific nAbs were identified from convalescent patients and, for some, their epitopes were structurally determined by X-ray crystallography. Examples include P2B-2F6,37 P4A1,38 and CB6;39 other antibodies, such as PR107740, were isolated from immunized mice. A large fraction of these nAbs bind at or adjacent to the ACE2 binding site. In particular, P4A138 covers the majority of the ACE2 footprint. As another example, nAbs from the IGHV3-53 germline class represent the most common antibodies elicited by natural infection with the original Wuhan-Hu-1 strain.32 Structures of IGHV3-53 nAbs CC12.1, CC12.3, and B38 define the basis of neutralization by competitive inhibition of ACE2 recognition32,33 (Fig. 1b).
Antibodies can also neutralize SARS-CoV-2 by binding at the NTD, with several crystallographic studies pinpointing the key epitopes.34,35 There are conserved epitopes between SARS-CoV and SARS-CoV-2 NTD but all are non-neutralizing; conversely, the key non-conserved epitope is neutralizing and has been named ‘supersite’34,36 (Fig. 1c). Most of the NTD surface is covered by a glycan shield, and the supersite is one of the only exposed proteinaceous surfaces on NTD. Structural studies show that antibodies from different germline classes bind this key aglycosylated epitope.35 Unfortunately, this supersite undergoes extensive antigenic variation, and many variants of concern (VoC) are no longer neutralized using supersite nAbs elicited from the original Wuhan-Hu-1 strain.34
Overall, X-ray crystallography has been a key technique in the study of SARS-CoV-2 epitopes as it was used to define individual conserved and non-conserved epitopes on the RBD and NTD of the SARS-CoV-2 S. A key limitation of this technique is the difficulty of preparing crystals of the full-length S ectodomain with high diffraction quality, which limits determination of epitopes to those entirely contained within individual RBD and NTD domains.
Dozens of cryo-EM and, less commonly, negative stain-EM structures41 of potent neutralizing antibodies in complex with S have been reported. We list here a few antibodies, grouped into two representative examples of the types of epitopes that can be analyzed using electron microscopy. In the first example, a study led by Adimab scientists used cryo-EM to determine the epitope of a broadly neutralizing antibody developed by Adimab that binds to S RBD.42 Regeneron too has developed an antibody cocktail binding to S RBD whose epitope has been mapped using this same technique.43 These specific complexes could also have been determined by X-ray crystallography, since the epitope is entirely contained within an S RBD monomer. Novel epitopes such as the one of H014 on RBD44 and the anti-S NTD antibody 4A836 can also be determined using EM. In the second example, a different group used cryo-EM to characterize the epitope for a nAb that binds simultaneously to two of the three RBDs contained in the S trimeric complex.45 This specific complex would be difficult to determine by X-ray crystallography. Thus, cryo-EM can be used for complexes both amenable and refractory to solution by X-ray crystallography.
Cryo-EM and X-ray crystallography can be combined to define the structural epitopes recognized by antibodies elicited from natural infection. An excellent example of a joint study was reported by Barnes et al., who defined the four major classes of antibodies binding to RBD epitopes46 (see definitions in Fig. 1d).
Synthetic peptide arrays have been used by several groups to study epitopes of monoclonal antibodies and convalescent patient serum on the whole S protein.47–52 Although this review focuses on S protein epitope mapping, one group has used synthetic peptide arrays to identify proteome-wide epitopes for SARS-CoV-2 and other coronaviruses,47 highlighting the advantage of scale for synthetic peptide arrays.
The identified linear epitopes on S are clustered in defined regions (Fig. 1e) mainly at cleavage sites or sites necessary for conformational changes for viral entry, like the S1/S2 cleavage site,32 the S2′ cleavage site, and the CTD.48,49 While the majority of the linear epitopes are found outside of the RBD, several have also been identified on the RBD.50 These combined studies highlight the diversity of the antibody response on the entire S protein and pinpoint immunodominant epitopes as well as epitopes that are relatively occluded from antibody recognition. However, there is a lack of information on the correlates of protection for these identified epitopes, and the structural basis for recognition must be inferred by structural information given by cryo-EM and X-ray crystallography.
While HDX-MS can facilitate the understanding of the conformational dynamics of binding, it may give recurrent false positives, and the experimental design must fulfill an exacting list of requirements to obtain good results.16 Thus, HDX-MS is usually coupled to methods like cryo-EM to marry conformational dynamics with structural insight.
Fig. 2 Overview of independent deep mutational scanning workflows for conformational epitope mapping.
In deep mutational scanning, the antigenic sequence dependence on binding can be assessed for nearly every single point mutant in the protein sequence. This information is used to identify conformational epitopes under the assumption that epitope positions are less tolerant of mutations than non-epitope positions. Deep mutational scanning workflows for conformational epitope mapping are similar at a superficial level. The antigen of choice is displayed on the surface of a eukaryotic cell. Next, binding to an antibody or receptor is monitored using a flow cytometer after cell labeling with appropriate fluorophores. Comprehensive mutagenesis of the antigen gene is performed thus generating a library of antigen mutants that can be transformed into the relevant cell type. A population of cells, where each cell displays a distinct antigen mutant, is split and incubated in several different binding conditions. For example, each reaction could contain a different amount (or none) of antibody. After fluorophore labeling, the cells are screened using a cell sorter. Different populations are distinguished using gates on different light scattering or fluorescent values. For example, a gate is typically set to identify cells maintaining high antibody binding as inferred by a high fluorescence signal in the appropriate channel. Populations of cells are sorted according to these gates by fluorescence activated cell sorting (FACS), regrown, plasmids harvested and prepared for deep sequencing, and then sequenced. For each sorted population the frequency of each variant is enumerated; along with other information about sorting conditions, this information is processed either qualitatively or quantitatively to identify the effect of each introduced mutation on the binding considered in the assay.
The Procko group used deep mutational scanning to identify ACE2 mutations that increase binding to SARS-CoV-2 S RBD60 in order to develop a receptor trap prophylactic and therapeutic against SARS-CoV-2. Key mutations found to increase ACE2 binding to S were those removing N-linked glycans that partially shield the ACE2 surface recognized by the S RBD. The best engineered soluble ACE2 (sACE2) variant can outcompete natural ACE2 for binding to S RBD. Further, the authors showed that sACE2 can neutralize different coronaviruses, including SARS-CoV and SARS-CoV-2.63 To engineer this receptor, ACE2 was displayed on the surface of mammalian cells and incubated with soluble S RBD. The variants that bind tighter to the S RBD than native ACE2 were collected and identified by an increase in frequency in the binding population relative to a control.56
The Bloom research group used deep mutational scanning for the quantitative assessment of the sequence dependence of S RBD on ACE2 binding affinity.61 This same platform was also used to map epitopes and escape mutants for several monoclonal antibodies,26,62 predicting in advance the N501Y mutation observed in several VoC. S RBD is displayed on the surface of yeast and labeled either with soluble ACE2 or mAb at multiple different concentrations. Cell populations collected depend on whether epitopes or escape mutants are identified, and sequence data is processed using a quantitative maximum likelihood estimation method.64
The Whitehead group has developed a method that identifies the near-comprehensive set of escape mutants on S RBD for neutralizing antibodies that directly compete with ACE2 for binding.3 Several antibodies can be tested in parallel. Most escape mutations identified in the study are located adjacent to but not directly on the ACE2 binding footprint. Most intriguingly, many escape mutants map to K417, including K417N, which is present in the circulating Delta plus VoC (B.1.617.2 + K417N) and in the Beta VoC, and K417T, which is present in the Gamma VoC.65,66 To identify escape mutants, an aglycosylated S RBD construct is displayed on the surface of yeast and a competitive binding experiment is performed between a given antibody and soluble ACE2. Cells harboring RBD variants able to maintain ACE2 binding in the presence of a nAb are collected, and a novel algorithm is used to identify escape mutants.
The above studies used different strategies, shown in Fig. 2, and these differences are instructive for those setting up a deep mutational scanning experiment. One major difference between groups is the display technique. One group displayed bona fide ACE2, including its membrane-spanning pass, on mammalian cells, while the other groups used an artificial genetic fusion of S RBD to a yeast cell surface protein. The yeast display set-up has several advantages for deep mutational scanning: relatively fast growth rates, excellent genetics and high transformation efficiency, robust cells, and validated protocols.67 In our hands 11 of 12 tested antibodies targeting S RBD maintained binding to the engineered construct on yeast,3 attesting to the fidelity of the platform. Still, it remains difficult to display complicated glycoproteins in the active form,3 and yeast has different N-linked glycosylation patterns involving heavy mannosylation relative to mammalian cells.68 Therefore, antibodies that target across S protomers, that involve glycan recognition, or that bind on the S2 protein cannot be considered using yeast display. While mammalian cell display has several disadvantages relative to yeast display, its key advantage is displaying a membrane protein in its native context. In the Procko case, using the native ACE2 conformation was essential to identify that the removal of the glycans increases the binding affinity to RBD.
The next two steps in deep mutational scanning are (i.) performing comprehensive mutagenesis of the gene to be scanned; and (ii.) transforming the resulting DNA libraries into cells. Comprehensive mutagenesis on plasmid DNA can be performed using several methods like PFunkel,69 nicking,70 or overlap extension PCR mutagenesis.60 Illumina sequencing platforms typically utilize 250 base pair DNA sequencing, which typically limits the linear stretch of the gene that contains mutations to 250–350 bp. Covering an entire gene like ACE2 or S RBD, which are both larger than 350 bp, requires multiple libraries. These libraries are colloquially referred to as ‘tiles’. Both the Procko and Whitehead groups used this tiling strategy (Fig. 2). The main disadvantage of tiling is handling each library independently: separate labeling, sorting, and DNA prep steps must be performed for each tile. In contrast, the Bloom group encoded all mutations on S RBD in a single library. Then, they utilized PacBio long read sequencing to haplotype each set of mutants on S RBD to a unique barcode (Fig. 2). Illumina short read sequencing of the short barcode could then be used to identify frequencies of each mutant. This approach has a higher upfront cost of library haplotyping (the PacBio step) but has more streamlined downstream steps with less expensive sequencing on the backend.
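The tiling arithmetic can be illustrated with a minimal sketch. It assumes a 300 bp maximum tile length (a value chosen from within the 250–350 bp range quoted above) and codon-aligned tile boundaries; the function name `plan_tiles` and the example gene length are hypothetical and are not taken from any of the cited studies.

```python
import math
from typing import List, Tuple


def plan_tiles(gene_length_bp: int, max_tile_bp: int = 300) -> List[Tuple[int, int]]:
    """Split a coding sequence into contiguous, codon-aligned tiles.

    Returns 0-based, half-open (start, end) nucleotide intervals.
    The 300 bp default is an assumption for illustration only.
    """
    if gene_length_bp <= 0 or gene_length_bp % 3 != 0:
        raise ValueError("gene length must be a positive multiple of 3")
    if max_tile_bp < 3:
        raise ValueError("a tile must hold at least one codon")

    n_codons = gene_length_bp // 3
    max_codons_per_tile = max_tile_bp // 3
    n_tiles = math.ceil(n_codons / max_codons_per_tile)

    # Spread codons as evenly as possible so no tile is much shorter than the rest.
    base, extra = divmod(n_codons, n_tiles)
    tiles = []
    start_codon = 0
    for i in range(n_tiles):
        size = base + (1 if i < extra else 0)
        tiles.append((start_codon * 3, (start_codon + size) * 3))
        start_codon += size
    return tiles


# Hypothetical 603 bp coding region: three tiles of 201 bp each.
print(plan_tiles(603))
```

Each tile returned here would correspond to one independently prepared, sorted, and sequenced library, which is where the handling cost of tiling comes from.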
All groups used FACS to screen cell populations. Both Procko and Bloom groups used direct labeling either with antigen or antibody. In contrast, the Whitehead group developed a competitive ACE2 binding screening assay for a neutralizing antibody to infer the set of escape mutants. All groups also used Illumina for next generation sequencing of library DNA. The Procko and Whitehead groups screened and sequenced each tile separately, while the Bloom group sequenced library barcodes only. Best practices for these screening steps involve making true biological replicates (DNA libraries prepared and transformed independently) and sorting replicate libraries on different days.
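For the barcode-only route, a minimal sketch of the counting step is given below. It assumes that a lookup table from barcode to variant has already been built from long-read haplotyping; the function name and inputs are hypothetical, and error correction of barcode reads, which a real pipeline would need, is omitted.

```python
from collections import Counter
from typing import Dict, Iterable


def variant_frequencies(
    barcode_reads: Iterable[str],
    barcode_to_variant: Dict[str, str],
) -> Dict[str, float]:
    """Tally variant frequencies in one sorted population from barcode reads.

    Reads whose barcode is absent from the lookup table are skipped.
    """
    counts = Counter()
    for barcode in barcode_reads:
        variant = barcode_to_variant.get(barcode)
        if variant is not None:
            counts[variant] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {variant: n / total for variant, n in counts.items()}


# Illustrative (made-up) barcodes and variant labels.
lookup = {"AAA": "variant_1", "CCC": "variant_2"}
print(variant_frequencies(["AAA", "AAA", "CCC", "GGG"], lookup))
```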
In the final step, sequencing results are analyzed with a method appropriate for each approach. The analysis results are qualitative or quantitative and depend on factors in the experimental approach like the choice of display format, the type of mutagenesis performed, and screening strategy. In deep mutational scanning workflows the first step is to enumerate the frequency of each variant for each sequenced population. The simplest qualitative analysis is to compare the frequency of a selected population with a reference population that has passed through the cell sorter but is otherwise not screened for binding. The log transform of this frequency change between populations is called an 'enrichment ratio'. The Procko group used this qualitative analysis to determine the relative binding for their ACE2 variants. Such qualitative analyses are simple to perform and suitable for engineering goals like developing superior ACE2 receptor traps. However, this enrichment ratio analysis is subject to considerable noise resulting from complexity bottlenecks in the FACS screening, DNA preparation, and sequencing steps. Thus, one drawback of a qualitative analysis is hit identification: how does one distinguish high enrichment ratios that result from binding events from ones that occur by chance? The Whitehead group solved this problem by independently sorting a control population subject to the same screening criteria as their competitively inhibited yeast cells. This control population was then used to set an empirical False Discovery Rate at which an enrichment ratio is not expected to occur by chance in a population of a given size.
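The enrichment ratio and a control-based cutoff can be sketched as follows. This is a simplified illustration and not the algorithm used by any of the groups discussed: the base-2 logarithm, the pseudocount for unobserved variants, and the tail-fraction rule for deriving a cutoff from control enrichment ratios are all assumptions, as are the function names.

```python
import math
from typing import Dict, List


def enrichment_ratios(
    selected_counts: Dict[str, int],
    reference_counts: Dict[str, int],
    pseudocount: float = 0.5,
) -> Dict[str, float]:
    """log2(frequency in selected / frequency in reference) for each variant.

    The pseudocount (an assumption) avoids division by zero for variants
    missing from one population.
    """
    variants = set(selected_counts) | set(reference_counts)
    sel_total = sum(selected_counts.values()) + pseudocount * len(variants)
    ref_total = sum(reference_counts.values()) + pseudocount * len(variants)
    ratios = {}
    for v in variants:
        f_sel = (selected_counts.get(v, 0) + pseudocount) / sel_total
        f_ref = (reference_counts.get(v, 0) + pseudocount) / ref_total
        ratios[v] = math.log2(f_sel / f_ref)
    return ratios


def control_cutoff(control_ratios: List[float], tail_fraction: float = 0.01) -> float:
    """Cutoff that at most `tail_fraction` of control ratios strictly exceed."""
    if not control_ratios:
        raise ValueError("need at least one control enrichment ratio")
    ordered = sorted(control_ratios)
    n_allowed = int(tail_fraction * len(ordered))
    return ordered[len(ordered) - 1 - n_allowed]


def call_hits(ratios: Dict[str, float], cutoff: float) -> List[str]:
    """Variants whose enrichment ratio is strictly above the control cutoff."""
    return sorted(v for v, r in ratios.items() if r > cutoff)


# Made-up counts for illustration only.
ratios = enrichment_ratios({"mut1": 90, "wt": 10}, {"mut1": 10, "wt": 90})
cutoff = control_cutoff([0.1, 0.5, -0.2, 1.0], tail_fraction=0.25)
print(call_hits(ratios, cutoff))
```

The point of the control population is visible in `call_hits`: a variant is reported only if its enrichment exceeds what the control sort produces by chance, rather than exceeding an arbitrary fixed value.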
The most sophisticated approach for analysis came from the Bloom group, who sought to quantitatively estimate binding dissociation constants for S RBD mutants. Their approach involved sorting using many different labeling concentrations of ACE2 or antibody and using a maximum likelihood estimation approach to infer dissociation constants.64 This protocol is very exhaustive, with precision coming at the expense of throughput. Thus, this is a suitable protocol to analyze a few antibodies in depth.
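To make the idea of inferring a dissociation constant from a titration concrete, the sketch below fits a one-site binding isotherm by least squares over a grid of candidate values, which coincides with maximum likelihood only under the assumption of independent Gaussian noise of equal variance. It is not the Bloom group's procedure; the binding model, the normalized signal, the grid bounds, and the function names are all assumptions made for illustration.

```python
import math
from typing import Sequence


def fraction_bound(conc: float, kd: float) -> float:
    """One-site binding isotherm: fraction of antigen bound at ligand concentration conc."""
    return conc / (conc + kd)


def fit_kd(
    concs: Sequence[float],
    signals: Sequence[float],
    log10_kd_min: float = -12.0,
    log10_kd_max: float = -6.0,
    steps: int = 601,
) -> float:
    """Grid search for the Kd (in the same units as concs) minimizing squared error.

    Signals are assumed to be normalized to the 0-1 range.
    """
    if len(concs) != len(signals) or not concs:
        raise ValueError("concs and signals must be non-empty and of equal length")
    best_kd, best_sse = float("nan"), math.inf
    for i in range(steps):
        log_kd = log10_kd_min + i * (log10_kd_max - log10_kd_min) / (steps - 1)
        kd = 10.0 ** log_kd
        sse = sum((s - fraction_bound(c, kd)) ** 2 for c, s in zip(concs, signals))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Noise-free synthetic titration with a true Kd of 1 nM (made-up values).
concs = [1e-11, 1e-10, 1e-9, 1e-8, 1e-7]
signals = [fraction_bound(c, 1e-9) for c in concs]
print(fit_kd(concs, signals))
```

The need for a signal at every labeling concentration, for every variant, is what makes this kind of quantitative analysis more laborious than a single-condition enrichment ratio.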
In summary, these three groups’ contributions show how different experimental observables result from different experimental strategies.
Our mini-review described at length different conformational epitope mapping methods by deep mutational scanning as no in-depth review for this methodology exists. We are especially excited about the ability to delineate the sequence constraints on binding by both ACE2 and nAbs, as these constraints dictate the boundaries of the emerging arms race between future mutations on SARS-CoV-2 VoC and the ability of the humoral response in the vaccinated and naturally infected population to respond. It remains to be seen whether deep mutational scanning can inform the next generation of design of monoclonal antibody therapies and vaccine candidates.
This journal is © The Royal Society of Chemistry 2021