Ting Xue‡
a,
Weikun Wu‡b,
Ning Guob,
Chengyong Wua,
Jian Huangb,
Lipeng Laib,
Hong Liua,
Yalun Lia,
Tianyuan Wang*b and
Yuxi Wang*a
aTargeted Tracer Research and Development Laboratory, Precision Medicine Research Center, Department of Respiratory and Critical Care Medicine, West China Hospital, Sichuan University, Chengdu 610041, P. R. China. E-mail: yuxiwang@scu.edu.cn.com
bXtalPi AI Research Center, 7F, Tower B, Dongsheng Building, No. 8, Zhongguancun East Road, Haidian District, Beijing 100083, P. R. China. E-mail: tianyuan.wang@xtalpi.com
First published on 10th May 2021
The RBD (receptor binding domain) of the SARS-CoV-2 virus S (spike) protein mediates viral cell attachment and serves as a promising target for therapeutics development. Mutations on the S-RBD may alter its affinity to the cell receptor and affect the potency of vaccines and antibodies. Here we used an in silico approach to predict how mutations on RBD affect its binding affinity to hACE2 (human angiotensin-converting enzyme2). The effect of all single point mutations on the interface was predicted. SPR assay results show that 6 out of 9 selected mutations can strengthen binding affinity. Our prediction has reasonable agreement with the previous deep mutational scan results and recently reported mutants. Our work demonstrated the in silico method as a powerful tool to forecast more powerful virus mutants, which will significantly benefit the development of broadly neutralizing vaccine and antibody.
The adhesion of the virus to the target cell and the following membrane fusion process are essential steps in virus infection thus those two processes are promising targets for drug development. As is typical for coronavirus, the homo-trimeric spike glycoprotein (S protein, comprising S1 and S2 subunit in each monomer) on the envelope of SARS-CoV-2 is responsible for the cell adhesion process.8 SARS-CoV-2 uses hACE2 as the receptor for host cell entry, and8 the dissociation constant KD of S protein RBD binding to hACE2 was determined to be in the scale of nM. X-ray crystal structures of RBD was resolved in complex with the hACE2 receptor,9 revealing essential interactions on the binding surface.
As cell entry is the very first step of virus infection, blocking the binding of S protein to hACE2 can potentially prevent virus transmission. Many developing therapeutics are based on this concept.5 Monoclonal neutralizing antibodies separated from convalescent patient showed complete competition with RBD and can reduce virus titers in infected lungs in a mouse model.10 Recombinant vaccine that comprises residues from the S protein RBD was shown to induce functional antibody response in immunized animals.6,7 What's more, de novo protein inhibitors with KD of picomolar-level has been designed to neutralize the virus by targeting S protein RBD.
While therapeutics relying on the RBD binding surface require the surface to be consistent enough, SARS-CoV-2 is an RNA virus that is known to have high mutation rate. According to reported data,11 196 missense mutations in the gene encoding the RBD domain have been discovered. Although most mutants on the RBD domains were determined to be less infectious,12 some variants indeed became more resistant to neutralizing antibodies. We are interested in finding RBD mutants with higher affinity, which may escape future treatments. Thus inspiring us to make precautionary therapeutics in the near future.
13In silico affinity maturation technology is widely used in protein engineering and antibody discovery. Rosetta Flex ddG method is a ddG estimation method developed within the Rosetta macromolecule modeling suite. The ddG represents the difference in protein–protein interaction strength upon mutation. The protocol generates an ensemble of models to include conformational plasticity around a specific/given mutation site and then calculates the average ddG over the ensemble. This method has been shown to outperform previous methods,14 and considerable improvement on predicting binding-stabilizing mutations was observed.
Since the Rosetta Flex ddG protocol was published, some researcher had applied the algorithm to help their project.
20Sophia et al. used this protocol to identify hot spot interactions in the αIIb and β 3 stalks that regulate αIIb β 3 function, and found that the correlation between prediction and the extent of αIIb β 3 activation reached 0.59. The Pearson correlation coefficient matched well with the original benchmark. Moreover,22 Huy et al. used powerful technology (MRBLE-pep) with Rosetta Flex ddG protocol to correctly predict that I3 and I5 mutations decrease affinity of CN-binding peptides.
In spite of the fact that Rosetta Flex ddG protocol was developed to predict the mutations in protein–protein interface, Matteo et al. found that combing Rosetta Flex ddG protocol with Molecular Dynamics simulation was able to quantitatively predict changes of ligand binding affinity upon protein mutations.21 The data is a close match to the experimentally determined ΔΔG values, with a root-meansquare error of 1.2 kcal mol−1.
In this article, Rosetta Flex ddG protocol was used to predict the binding affinity change of point mutations on the RBD binding surface. Candidate mutants with large negative predicted ddG score were selected for further experimental validation. 6 of the 9 recommended mutants showed improved affinity to hACE2 in SPR affinity assay.
The mutated RBD proteins were prepared using the Expi293 Expression System (code: A14635). The specific operation was performed following the kit instruction. Expi293F cells were maintained in serum-free Expi293 expression medium, and the expression plasmid transfected into Expi293 cells by using an ExpiFectamine 293 Transfection Kit (all from Gibco). Five days after transfection the medium was collected, and the protein was purified by Ni-NTA (QIAGEN) column chromatography (nonspecifically binding contaminants were washed on Ni-NTA column using PBS, pH 7.4 containing 20 mM imidazole and eluted with PBS, pH 7.4 supplemented with 300 mM imidazole.). The eluted fractions were pooled and dialyzed against PBS (pH7.4) to remove the imidazole. Purified proteins were checked by SDS-PAGE and protein concentrations were quantified using Nano-Drop.
Mutant | Flex ddG-ΔΔGbindinga | SPR-KDb | SPR-ΔΔGbindingc |
---|---|---|---|
a ΔΔGbinding calculated by Rosetta Flex ddG. dG_cross is used as ddG score, which means the binding energy of the interface calculated with cross-interface energy terms. The unit is REU. Uncertainties were estimated from the standard deviations of 48 replicated trajectories.b KD (dissociation constant) assayed by SPR. The unit is nM. For WT, uncertainties were estimated from the standard deviations of 3 replicated experiments. For other mutants, the experiments were carried out only once.c ΔΔGbinding calculated from KD measured by SPR. The unit is kcal mol−1. | |||
WT | 0.00 | 21.08 ± 3.01 | 0.00 |
Q498W | −3.66 ± 1.80 | 7.10 | −2.70 |
Q498R | −2.04 ± 1.34 | 11.60 | −1.48 |
T500W | −1.90 ± 0.56 | 21.80 | 0.08 |
S477H | −1.39 ± 1.16 | 13.90 | −1.03 |
Y505W | −1.23 ± 0.41 | 16.10 | −0.67 |
T500R | −1.21 ± 1.38 | 12.20 | −1.36 |
N501V | −1.02 ± 1.09 | 158.50 | 5.00 |
Y489W | −1.01 ± 0.50 | 38.90 | 1.52 |
Q493M | −0.82 ± 1.39 | 6.90 | −2.77 |
Next, we carried out SPR affinity measurement for WT RBD and 9 potentially stabilizing mutants to validate the prediction result. Fig. 1 shows the SPR sensorgram of WT and all mutants, the SPR sensorgram are shown.
The experimental KD are reported in Table 1. 6 of the 9 selected mutants with predicted elevated binding affinity have lower KD to hACE2 than WT RBD in SPR assay. The prediction accuracy is comparable to that reported in previous work.14 Among the 9 selected mutants, the best binder is mutation Q493M with a KD of 6.90 nM, showing a 3-fold improvement in affinity compared with that of WT. There is a modest Pearson correlation coefficient of 0.41 between predicted ddG and SPR measured KD (Fig. 2).
Fig. 2 ddG calculated from SPR result versus flex ddG predicted ddG value. The Pearson correlation coefficient of the regression is 0.41. |
Among the 5 RBD sites (477, 493, 498, 500 and 505) which give affinity-increased mutants,15 4 of them were identified as key residues interacting with hACE2 (493, 498, 500 and 505)9 and 3 of them were SARS-CoV-2 unique mutants different from SARS-CoV-1 (477, 493 and 498) in previous research. The increased affinity coupled with those mutations shows that the key residues in natural binding pattern aren't necessarily optimized for binding. And even after evolution towards more potent binding, there is still great capacity for increased binding affinity.
Mutant | mut_ddG | SPR result | hbonds_int | dSASA_int | dSASA_hphobic | dSASA_polar |
---|---|---|---|---|---|---|
a hbonds_int: the change of hydrogen bond numbers at the interface. dSASA_int: the change of solvent accessible area buried at the interface, in square Angstroms. dSASA_hphobic: the change of the hydrophobic part of solvent accessible area buried at the interface, in square Angstroms. dSASA_polar: the change of the polar part of solvent accessible area buried at the interface, in square Angstroms. | ||||||
Q498W | −3.66 | Stabilize | 0.31 | 39.61 | 58.39 | −18.78 |
Q498R | −2.04 | Stabilize | 1.65 | 27.11 | −8.08 | 35.20 |
S477H | −1.39 | Stabilize | 0.77 | 52.13 | 17.81 | 34.32 |
Y505W | −1.23 | Stabilize | 0.02 | 16.04 | 31.05 | −15.01 |
T500R | −1.21 | Stabilize | 0.71 | 43.28 | −5.05 | 48.33 |
Q493M | −0.82 | Stabilize | −0.92 | −4.67 | 71.23 | −75.90 |
T500W | −1.90 | Destabilize | −0.08 | 48.38 | 34.08 | 14.30 |
N501V | −1.02 | Destabilize | 0.00 | 0.05 | 12.30 | −12.25 |
Y489W | −1.01 | Destabilize | 0.04 | 49.17 | 50.73 | −1.56 |
In the crystal structure, the WT Q498 forms a new hydrogen bond with nearby hACE2 42N (Fig. 3a). According to the best Q498W mutant model, the 498W residue forms a hydrogen bond with hACE2 38D (Fig. 3b). The Q498W mutation also increases hydrophobic contact with nearby residues, as indicated by the increment of hydrophobic SASA surface. For the Q498R mutant, in most of the mutant models, 498R forms hydrogen bond with hACE2 42N (Fig. 3c), and in some of the models also with hACE2 38D (Fig. 3d). This phenomenon could be explained by the fact that the longer hydrophobic part of arginine compared with glutamine could afford increased hydrophobic interaction. On average, there are 1.65 additional interface hydrogen bonds formed after mutation which may contribute to the increasing affinity. S477H mutation forms a new hydrogen bond with the terminal NH2 group of hACE2 19S (Fig. 3e) according to the mutant model. Calculation result shows that additional 0.77 hydrogen bond formed in average along with increased hydrophobic interaction. For the Y505W mutant, the model shows hydrogen bond has no contribution. In the original crystal structure of WT, 505Y doesn't form any hydrogen bond with other residues. The mutation Y505W is likely increasing the affinity by evolving to bear better hydrophobic contact (Fig. 3f), as shown in the increase of hydrophobic interactions. For the T500R mutant, the dominant factor is newly formed hydrogen bond interaction which is similar to Q498R mutant. The WT 500T forms a hydrogen bond with 41Y (Fig. 3g). In the best mutant model, T500R forms hydrogen bond with hACE2 329E and 330N (Fig. 3h), while in some other model T500R only forms hydrogen bonds to 330N (Fig. 3i). The Q493M mutant is the only affinity-increase mutant with significant decrease in hydrogen bond count. 493Q in WT originally forms a hydrogen bond with hACE2 35E (Fig. 3j). The van de waals contact increases significantly when mutating to a hydrophobic residue like methionine (Fig. 3k), indicated by the large dSASA_hphobic in Table 2.
According to above analysis, the enhanced interaction mainly come from additional hydrogen bonds and hydrophobic interactions. The Q493M mutant also reveal that the hydrophobic term itself can be sufficient for leaning the equilibrium to more potent binding.
By analyzing the models of false positive predictions, we illustrated that the main error may come from the broken of hydrogen bonds which are not considered in the ddG calculation. The missing hydrogen bond may be existing hydrogen bond in the scoring step and the hydrogen bond destroyed in the previous relaxing step. Visual examination of the model structure to see whether the original hydrogen bond is disrupted and comparing with the predicted surface properties may help spot those inaccuracies and exclude some false positive instance before wet-lab experiments.
Using the result measured by DMS as the ground truth, the prediction performance of Rosetta Flex ddG method is calculated. Remarkably, the precision on predicting the stabilizing mutation is 0.18, and the recall is 0.50 given a ratio of 40/432 stabilized and destabilizing mutations, which shows the potential of the Flex ddG method to suggest stabilized mutations for enhanced binding affinity (Table 3).
DMS stabilizing | DMS neutral | DMS destabilizing | Precision | |
---|---|---|---|---|
Predicted stabilizing | 20 | 1 | 93 | 0.18 |
Predicted neutral | 0 | 0 | 0 | 0 |
Predicted destabilizing | 20 | 2 | 339 | 0.94 |
Recall | 0.50 | 0 | 0.78 |
17,18The SARS-CoV-2 mutants carrying the mutation Y453F were discovered in Denmark to circulate between human and minks and have brought up concerns recently. Our prediction (−0.30 REU, in ESI file 2†) and16 the deep mutational scan results both shows that the Y453F mutation will strengthen the interaction between RBD to hACE2, which is consistent with the maintained transmissibility and pathogenicity. Although more studies on the potential effect on treatment, diagnostic test and virus antigenicity are still ongoing, preliminary results showed that the cluster-5 strain which carries another 3 mutations outside the RBD domains was more difficult to be recognized by convalescent sera.19 It is also a concern that spreading of the virus in minks may bring up more fatal variants.
The still-mutating SARS-CoV-2 alerts us of the need of an efficient mechanism in dealing with newly discovered pathogens. For such RNA virus under constant mutagenesis, it is necessary to quickly spot possible mutants with increased pathogenicity or infectivity. Given the relatively low cost and short period of computational methods, we can build a computational forecast system on virus pathogenicity in the early stage of a virus breakout. The Flex ddG method used in this work would be a constructive part in this system, predicting pathogenicity by evaluating mutant binding affinity to cell receptor. The mutations on the binding surface are especially essential in vaccine or antibody development and the predicted mutations can be considered in designing multi valent vaccine to prevent possible immune escape.
Furthermore, in silico affinity maturation itself is a promising technology in the development of macromolecule drugs. For instance, it can help reduce the scale of experimental screening in optimizing the affinity of peptide or protein drugs. Our work demonstrated the ability of the previously reported Flex ddG method to serve as an efficient affinity maturation method in predicting stabilizing mutants with no need for the time-consuming MD process. We anticipate that combined with machine learning and deep learning technology, Rosetta Flex ddG-based in silico affinity maturation can be improved to be faster and more accurate in the near future.
Footnotes |
† Electronic supplementary information (ESI) available: ESI 1 and ESI file 2. See DOI: 10.1039/d1ra00426c |
‡ Co-first author. |
This journal is © The Royal Society of Chemistry 2021 |