Ke
Zuo
abcd,
Agata
Kranjc
a,
Riccardo
Capelli
e,
Giulia
Rossetti
afg,
Rachel
Nechushtai
c and
Paolo
Carloni
*abh
aComputational Biomedicine, Institute of Advanced Simulation IAS-5 and Institute of Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52425, Germany. E-mail: p.carloni@fz-juelich.de
bDepartment of Physics, RWTH Aachen University, Aachen 52074, Germany
cThe Alexander Silberman Institute of Life Science, The Hebrew University of Jerusalem, Edmond J. Safra Campus at Givat Ram, Jerusalem 91904, Israel
dDepartment of Physics, Università degli Studi di Ferrara, Ferrara 44121, Italy
eDepartment of Biosciences, Università degli Studi di Milano, Via Celoria 26, Milan 20133, Italy
fJülich Supercomputing Center (JSC), Forschungszentrum Jülich GmbH, Jülich 52425, Germany
gDepartment of Neurology, Faculty of Medicine, RWTH Aachen University, Aachen 52074, Germany
hJARA Institute: Molecular Neuroscience and Imaging, Institute of Neuroscience and Medicine INM-11, Forschungszentrum Jülich GmbH, Jülich 52425, Germany
First published on 5th May 2023
Structure-based drug design protocols may encounter difficulties to investigate poses when the biomolecular targets do not exhibit typical binding pockets. In this study, by providing two concrete examples from our labs, we suggest that the combination of metadynamics free energy methods (validated against affinity measurements), along with experimental structural information (by X-ray crystallography and NMR), can help to identify the poses of ligands on protein surfaces. The simulation workflow proposed here was implemented in a widely used code, namely GROMACS, and it could straightforwardly be applied to various drug-design campaigns targeting ligands’ binding to protein surfaces.
The NMR structures of LTPS/protein complexes in the PDB are only a fraction of those solved by X-ray (less than 0.01% of the PDB). However, NMR may also provide useful information at the qualitative level in case it is not able to determine the structure; furthermore, the data coming from NMR experiments intrinsically contain the fluctuations of the relevant movements of the system. This information can be rationalized, for instance, by monitoring the changes in chemical shifts of a protein upon ligand binding in aqueous solution.17–19 Finally, computer-aided drug design also presents difficulties when addressing LTPS/protein complexes: the lack of deep binding pockets makes it difficult to efficiently use molecular docking protocols and high-throughput screening - the first steps of the “in silico” design of new potential therapeutic ligands.20
Here we underline a general way to address the unique binding mode of LTPS. We propose a combination of advanced molecular simulation methods, such as metadynamics,21–23 as a possible route to accurately identify LTPS poses on the target proteins in an aqueous solution, if the X-ray structures or NMR information is available. Metadynamics is an excellent method to be used here as it allows efficiently predicting the complex free energy landscape associated with ligand binding as a function of several collective variables (CVs) (for instance, Section 2 uses three CVs). There are several other excellent methods able to calculate free energy profiles (such as umbrella sampling), but metadynamics is more cost-effective than umbrella sampling, and as a purely explorative approach (especially in ligand binding), it is one of the few viable techniques.
In our proposed protocol, the simulation predicts a ligand's pose and potency, while the experiment validates the model. To illustrate how this works in practice, we consider two examples from our research. First, we compare our predictions with accurate X-ray crystallography data, to provide an insight into the drug binding modes on passing from the solid state to solution. We focus here on the human nutrient deprivation autophagy factor-1 (NAF-1) protein target for a variety of diseases, including cancer (work performed here). The second exploits the qualitative NMR information to predict the poses of a ligand on the prion protein, as described in ref. 24. The results are validated by comparing the calculated free energy of binding with the corresponding experimental values. In both cases, our prediction suggested multiple poses for the association of the ligands with the proteins in an aqueous solution. The latter resulted in a quite complex free energy landscape of ligand binding, with multiple minima separated by small barriers. The protocol can be straightforwardly extended to other ligand/proteins complexes having similar characteristics of ligand binding and available experimental structural information.
Fig. 1 X-Ray structure of NAF-1, and its complex with ligand 1. (a) The protein (PDB ID: 4OO732) is a homodimer. Each monomer (shown in deep blue and magenta, respectively) contains a [2Fe–2S] cluster with a 3Cys:1His coordination (Cys99, Cys101, Cys110, and His114) and a beta cap. (b) Chemical structure of ligand 1. (c) 1·NAF-1 complex X-ray structure (PDB ID: 7P0P33). The ligand, shown as a wheat stick model, is sandwiched between two adjacent proteins in the crystal unit cell, coloured as in (a). |
The NAF-1 protein is anchored by two α-helices to the outer membrane of mitochondria (OMM), the endoplasmic reticulum (ER) membrane, and the ER-mitochondria-associated membrane (MAM).30–32 The protein is a homodimer, with each monomer composed of a β-cap and a [2Fe–2S] cluster binding domain. The topology of the second structural domain is in the order of L1-β1-L2-β2-L3-α-L4-β3 (L: loop, β: β sheet). Each monomer also contains a [2Fe–2S] cluster, of which one Fe is coordinated by two cysteines (Cys99 and Cys101) buried inside the protein and the other one is coordinated by one cysteine (Cys110) and one histidine (His114) on the surface of each monomer (Fig. 1a).32 The two monomers also form hydrophobic interactions with each other.32 The iron-bound histidine His114 can exist in a protonated or deprotonated form according to the protonation state of the Nε of the imidazole ring. Thus, NAF-1 can exist in four different states at physiological pH (states I–IV described in Table 1). For state I, the experimental ligand binding free energy has been measured by isothermal titration calorimetry at pH 8.0.33
Ligand 1 (2-benzamido-4-(1,2,3,4-tetrahydronaphthalen-2-yl)-thiophene-3-carboxylate), binds to the NAF-1 surface with a relatively low binding affinity (Kd = 7.20 μM, Fig. 1b)‡. The X-ray structure of the 1·NAF-1 complex33 shows that (i) the ligand's tetralin ring forms hydrophobic interactions with Ala97, Ile129 of one protein, and with Val83′ of another, adjacent protein – its residues are indicated with a prime (′) symbol; (ii) the ligand's phenyl forms hydrophobic interactions with Pro80, Ala97′, and Ile129; (iii) the carboxylate group forms salt bridges with Lys81 and Lys81′ (Fig. 1c).
We used volume-based well-tempered metadynamics,21 an exact method to reconstruct the free energy landscape associated with 1 bound to human NAF-1 in an aqueous solution at physiological temperature (310 K). The free energy was calculated as a function of three collective variables, using the same computational protocol as in ref. 11 (see ESI† for computational details). Our calculations were based on an educated guess of the 1·NAF-1 complex structure and then compared with the experimental structure of the complex.33
The calculated lowest free energy of binding in the state I, where the Fe ions are in their oxidized states (Table 1), was in good agreement with the experimental value (−6.6 kcal mol−133), considering that the latter was obtained at a lower temperature. In the global minimum 1 (Fig. 2), ligand 1's phenyl moiety forms hydrophobic interactions with Ile86 and Leu91, and polar van der Waals (vdW) interactions with Asp90, Asn87, and Glu85. The carbonyl oxygen forms an H-bond with Tyr98, whereas the thiophene ring and tetralin ring form hydrophobic interactions with Ala109, Pro108, and Phe107, Thr106, respectively. In minimum 2 (−4.7 ± 0.8 kcal mol−1, Fig. 2), higher by 2.7 kcal mol−1 from the minimum 1, the tetralin group forms a π-cation interaction with Lys116, the thiophene ring forms hydrophobic interactions with Leu120, and the carboxylate oxygens interacts with Glu119 through a water molecule. In minimum 3 (−3.5 ± 0.8 kcal mol−1, Fig. 2), higher than minimum 1 by 3.9 kcal mol−1, ligand 1 binds to the L2 loop. Its phenyl group forms metastable interactions with Ser92, the thiophene ring with Leu93, Thr94, and the tetralin ring forms metastable interactions with Lys95 (see Fig. 2). The binding of the other three states is reported in the ESI.† Multiple binding poses could be identified in all cases, with a free energy of binding slightly lower (in absolute values) than that of state I (see ESI†).
In summary, our simulations could fairly reproduce the binding affinity of ligand 1 to human NAF-1 in aqueous solutions. Ligand 1 turns out to bind to the [2Fe–2S] cluster region differently from the X-ray structure. It also binds to the L1/L3 and L2 loops. We attribute this difference, at least in part, to the different binding properties of the drug on passing from the solid state to the solution, where roughly half of the drug's contacts are replaced by the solvent.§ Finally, the redox state of the [2Fe–2S] clusters and the protonation states of the Fe-bound histidine residues have a significant impact on the ligand binding free energies and poses (See ESI†).
Fig. 3 Structure of human prion protein and ligand 2. (a) The structured part of the human PrPC protein (Leu125-Arg228) is composed of three α-helices and a β-sheet (PDB ID: 1HJM41). Residues, shown by NMR experiments as being involved in binding are depicted in orange sticks. Coloured spots next to the residues show which representative 2 protomer (shown in (b)) interacts with them after the metadynamics simulations. (b) The representative binding poses of the three 2 protomers with human PrPC protein resulting from metadynamics. Uncharged protomer 2 is shown as yellow sticks; 2+ and 22+ are shown in red and blue colours, respectively. (c) Chemical structure of ligand 2. At acidic pH, at which the NMR experiments were performed,37 except uncharged 2, the ligand can also be present in two other ionic forms, 2+ and 22+. Fig. 3a and b were adapted and reprinted, respectively, from ref. 24. Copyright 2009, American Chemical Society (ACS). |
Ligands binding to the protein may interfere with the interconversion to the scrapie form. Changes in 1H-15N HSQC (heteronuclear single quantum coherence) NMR spectra of the protein have established that ligands such as ligand 2 (2-pyrrolidin-1-yl-N-[4-[4-(2-pyrrolidin-1-yl-acetylamino)-benzyl]-phenyl]-acetamide, Fig. 3c)¶ bind to the surface of the mouse prion protein (mPrPC, which has 98% sequence similarity and 86% sequence identity with the human protein).37 Ligand 2 can exist in three protomers under the conditions of the NMR experiments: uncharged 2, and positive 2+ and 22+ (Fig. 3c). Although one could suggest that the ligand binds to different PrP regions, based on the NMR data, only one binding site could be identified.37
Bias-exchange metadynamics simulations based on the AMBER99 force field38,39 provided a ligand binding/unbinding free energy (7.8 ± 0.9 kcal mol−1) that agreed well with the experimental value37 (7.5 kcal mol−1). The calculations identified different poses for the three protomers (Fig. 3b)||24, including the one suggested in ref. 37. Taken together, the poses were consistent with all the ligand/protein contacts deduced by the NMR data (Fig. 3a and b). Approaches combining NMR and enhanced sampling, such as that presented here, could be applied to other proteins undergoing fibril formation in neurodegenerative diseases, e.g. amyloid-beta in Alzheimer's disease, α-synuclein in Parkinson's disease, and huntingtin in Huntington's disease.40 All of them feature no binding cavities.
Our proposed combined experimental/simulation protocol is by no means the only way to foresee how to overcome the current limitations. During the last decade, a plethora of methods devoted to the study of protein–ligand binding have been developed, such as funnel metadynamics,49 lambda-dependent umbrella sampling,50 and ligand Gaussian accelerated molecular dynamics (LiGaMD).51 One of them has been presented in this perspective (Section 2). Here we presented two possible flavours of metadynamics to tackle this problem, while, as already said, a large number of techniques (also not metadynamics-related) can be effectively applied. In addition, and most importantly, ML-based techniques are expected to significantly increase the predictive power with the expansion/growth of experimental structural information (along with affinity data). Finally, the increasing number of Cryo-EM protein/LTPS complex structures’ (currently ∼0.08% of the PDB) is expected to boost a variety of rational LTPS design campaigns,52 possibly assisted by molecular simulation and/or ML.
Footnotes |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3cp01388j |
‡ 1 accelerates the cluster release relative to the protein without ligands in vitro.33 |
§ Some of the residues at the protein surface might also change their conformations in solution as they may form contacts with protein images in the solid state. |
¶ 2 features three different protomers at pH 4.5, the pH at which the experiments were carried out37 (Fig. 3). |
|| This contrast with standard molecular docking protocol, which was not consistent with all the ligand/protein contacts identified by NMR. |
** The codes to be used are readily available in PLUMED42–44 (https://www.plumed.org//) which can be used in codes such as GROMACS45,46 AMBER,47etc. |
This journal is © the Owner Societies 2023 |