José Rogério A. Silva*a,
Hendrik G. Krugerb and
Fábio A. Molfetta*c
aLaboratório de Planejamento e Desenvolvimento de Fármacos, Instituto de Ciências Exatas e Naturais, Universidade Federal do Pará, Belém, Pará 66075-110, Brazil. E-mail: rogerio@ufpa.br
bCatalysis and Peptide Research Unit, University of KwaZulu-Natal, Durban 4000, South Africa
cLaboratório de Modelagem Molecular, Instituto de Ciências Exatas e Naturais, Universidade Federal do Pará, CP 11101, 60075-110 Belém, PA, Brazil. E-mail: fabioam@ufpa.br
First published on 2nd July 2021
The main protease (Mpro or 3CLpro) is a conserved cysteine protease from the coronaviruses and started to be considered an important drug target for developing antivirals, as it produced a deadly outbreak of COVID-19. Herein, we used a combination of drug reposition and computational modeling approaches including molecular docking, molecular dynamics (MD) simulations, and the calculated binding free energy to evaluate a set of drugs in complex with the Mpro enzyme. Particularly, our results show that darunavir and triptorelin drugs have favorable binding free energy (−63.70 and −77.28 kcal mol−1, respectively) in complex with the Mpro enzyme. Based on the results, the structural and energetic features that explain why some drugs can be repositioned to inhibit Mpro from SARS-CoV-2 were exposed. These features should be considered for the design of novel Mpro inhibitors.
COVID-19 (COronaVIrus Disease 2019) is a disease caused by the SARS-CoV-2 coronavirus that has spread rapidly around the world since it was reported in late 2019 in Wuhan, China.3 SARS-CoV-2 is the seventh coronavirus known to infect humans and that can cause serious diseases, such as SARS-CoV and the Middle East respiratory syndrome (MERS-CoV),4 while HKU1,5 NL63,6 OC43 (ref. 7) and 229E8 are associated with mild symptoms. SARS-CoV-2 is the one with the highest transmission rate, as it includes asymptomatic carriers, a long latency period, and a high infection rate.9 COVID-19 causes fever, fatigue, dry cough, muscle pain, shortness of breath, and in some cases lead to pneumonia.10 In addition, a small proportion of those infected (∼5%) develop diseases that can progress to severe acute respiratory syndrome (SARS), which can lead to death.9 Since the initial SARS-CoV-2 outbreak, the World Health Organization (WHO) declared on March 11, 2020, COVID-19 as a pandemic disease due to the spread of the coronavirus.11,12 Until April 26, 2021, the total number of confirmed cases of COVID-19 was 146841882 with over 3104743 deaths in at least 223 countries,13 which indicates the epidemic is a serious public health problem.
The 3-chymotrypsin-like protease enzyme (3CLpro), also known as the “main protease” (Mpro), is considered an attractive target in drug design for the treatment of coronavirus infection.14,15 The role of this protease involves the proteolytic processing of viral polyproteins, being considered essential in the process of replication and maturation of the virus.16 The Mpro cysteine protease contains three domains, domain I being formed by residues 1 to 101, and domain II consisting of antiparallel barrels that are formed by residues 102 to 184, and for enzyme activity domain III, composed mainly of alpha helices, formed by residues 201 to 301.17,18 The substrate-binding region is located in the gap between domains I and II and consists of a catalytic dyad formed by the residues Hys41 and Cys145, in which cysteine acts as a nucleophile and histidine as a proton acceptor.14,19 In addition, as Mpro does not have a homologous human enzyme, it presents itself as an ideal target for drug design.15,19,20
A traditional drug discovery process is expensive and time-consuming and can take decades to complete.21 Currently, there are no drugs available for the treatment of COVID-19, and in the absence of effective treatment, the redirection of drugs becomes an attractive solution, due to the reduction in development time and cost, as the drugs have already been tested to what concerns safety in humans, which means they should quickly enter the most advanced clinical stages.22,23
Structure-Based Virtual Screening (SBVS) is a method used to identify and prioritize ligands for subsequent in vitro and in vivo profiling and, therefore, is an attractive way to identify new compounds for more effective treatment of COVID-19.18,24 Therefore, in this study, a virtual screen based on the structure of the receptor in a set of molecules of approved drugs was used to identify promising candidates that could be tested against the SARS-CoV-2 enzyme. Then, molecular dynamics and binding free energy calculations were carried out to evaluate the inhibition potential of the selected commercial drugs against the Mpro enzyme of SARS-CoV-2. Overall, this work shines a light on the Mpro–inhibitor complex interactions, which is paramount for the development of new drugs against COVID-19.
Before starting the virtual screen with all compounds chosen from the database, a re-docking of the N3 crystallographic ligand was carried out. Thus, the best conformation obtained by the program is compared with that of the ligand co-crystallized in the enzyme. In this way, an analysis is made by comparing the values of the root-mean-square deviation (RMSD). RMSD values below 2 Å show that the re-docking was successful according to data in the literature.29 The PoseView online server30 was used to analyze the interactions of hydrogens as well as hydrophobic interactions between the selected ligands and the amino acids of the enzyme, derived from molecular docking.
The two-dimensional structures of the 2998 compounds that were retrieved from the Repurposing hub (http://www.broadinstitute.org/repurposing) and built-in the MarvinSketch program,31 as well as all structures, were optimized in the OpenBabel program.32 For the selection of compounds, the total drug-score parameter calculated by the OSIRIS Property Explorer program was also evaluated.33 This measurement is a score generated by the combination of the parameters of drug-similarity, hydrophobicity (clogP), aqueous solubility (logS), molar mass, and toxicity risk data.34
Each resulting system was minimized using 10000 steps of the steepest descent method followed by 5000 steps of the conjugate-gradient method until the root-mean-square of the gradient was below 10−4 kcal (mol−1·Å−1). After, each minimized system was heated up from 0 to 300 K with solute restrained using a harmonic potential with a force constant of 100 kcal (mol−1·Å−2) during 500 ps. Along the equilibration stage, the positional restraint force was reduced from 50 to 2 kcal (mol−1·Å−2) during 1.2 ns of MD simulations. Then, 200 ps of density equilibration without any restraint under the NPT was performed. Finally, 100 ns of NPT simulation at 300 K was performed with a 2 fs time step. The GPU version of the pmemd45 module of the Amber18 package was used to run all MM MD simulations. The structural analysis of each system was performed by computing the root-mean-square deviations (RMSD) of the main chain atoms concerning average coordinates structures. Similarly, the root-mean-square fluctuations (RMSF) of Cα atoms of each amino acid residue were computed. Besides, MD runs for Mpro-triptorelin system were also performed with a different starting structure (random seed). In addition to that, additional replicas starting from different sets of atomic coordinates and velocities were also chosen to generate completely independent trajectories. Details are presented in the ESI† files. The enabled us to calculate the error for these free energy results.
ΔGbind = ΔEMM + ΔGSOLV − TΔS | (1) |
ΔEMM = ΔEint + ΔEele + ΔEvdW | (2) |
The ΔGSOLV is the sum of the electrostatic solvation energy (ΔGGB) (polar contribution) and the nonpolar contribution (ΔGSA), according to eqn (3):
ΔGSOLV = ΔGGB + ΔGSA | (3) |
Particularly, ΔGSA is estimated by using the solvent-accessible surface area (SASA) method.50 As the entropic term (−TΔS) has a heavy computational cost, it has been neglected for binding free energy calculations.51
Finally, since the MM/PBSA48,50 approach is computationally time-consuming,52 MM/GBSA has been used to compute the contributions from individual residues by free energy decomposition analysis.51 This method has been frequently applied in drug design studies.53–58
Structure-based virtual screening was carried out in the gold program 2020.1, which uses a genetic algorithm to generate and select conformers of the database of compounds. From the results of re-docking, it was found that re-docking of the crystallographic ligand with the gold program gives an RMSD value of 2 Å. Thus, it is observed that the GOLD program was able to reproduce the conformation of the crystallographic ligand and the main interactions obtained at the Mpro enzyme site.29 Besides, molecular docking calculations have been successfully applied in recent SARS-CoV-2 Mpro studies.71–75
Initially, 75 of the nearly 3000 drug molecules that had the best scores in the GOLD program were selected. It is noteworthy that the selection criterion was based on the score of remdesivir, as this has a potential effect in the treatment of COVID-19.9,29,76 After, a visual inspection of the docking results was carried out on the Poseview online server,30 taking into account mainly the interactions with the catalytic dyad residues of the Mpro enzyme, in addition to the drug-score values to qualify the best drug candidates against COVID-19.77 Thus, from the 75 structures, ten hit compounds were selected for further analysis (Table 1), considering the highest gold score values with Mpro enzyme. Besides, they were selected using as criteria hydrogen interactions, complementarity through visual analysis at the active site of the Mpro enzyme and the drug-score values.
Compound | Gold score | Hydrogen interactions | Hydrophobic interactions | Drug score | Pharmacological activity |
---|---|---|---|---|---|
Afamelanotide | 111.02 | Asn119, 2 × (Asn142), His164, Gln189 | His41, Leu141, Asn142, Gly143, Met165, Gln189 | 0.31 | Erythropoietic protoporphyria |
Triptorelin | 110.38 | 2 × (Asn142), Glu166, Pro168, Asp187, Gln189 | Thr25, Ser46, Met49, Asn142, Cys145, His163, Met165, Glu166, Gln189 | 0.33 | Antineoplastic, synthetic analog of gonadotropin-releasing hormone palliative treatment of advanced prostate cancer. Stimulates the release of luteinizing hormone |
Pentagastrin | 98.06 | Gly143, Ser144, His163, Glu166, Gln189 | Asn142, Met165, Pro168, Gln189, Thr190, Ala191 | 0.18 | Evaluation of gastric acid secretory function |
Terlipressin | 97.51 | Thr24, Ser139, Phe140, 2 × (Asn142), Gly143, Met165, 2 × (Glu166), Gly170 | Thr25, Ser46, Met49, His172 | 0.46 | Vasoactive drug in the management of hypotension |
Adaptavir | 92.78 | Thr24, Thr26, Ser46, Thr190, Gly143, 2 × (Cys145), Thr190, Gln192 | Met165 | 0.50 | Chemokine receptor antagonist |
Pepstatin | 90.27 | Thr24, 2 × (Thr26), His41, Cys145, Gln189 | Cys145 | 0.44 | Inhibitor of aspartic proteinases |
Octreotide | 88 | Ser46, Met49, Cys145, Glu166 | His41, Ser46, Met49, Glu166, Pro168, Gln189, Thr190, | 0.21 | Potent inhibitor of growth hormone, glucagon, and insulin |
Darunavir | 85.46 | Cys145, 2 × (Ser144), 2 × (Glu166), Gln189 | His41, Met49, Leu141, Asn142, Met165, Gln189 | 0.29 | Protease inhibitor used as a treatment of human immunodeficiency virus (HIV) |
Afatinib | 84.43 | His41, 2 × (Glu166), Gln189 | His41, Cys145, Met165, Asp187 | 0.24 | Therapy of selected forms of metastatic non-small cell lung cancer |
Foretinib | 80.31 | Thr26, Cys145 | Thr25, Thr26, His41, Met49, Leu50, Gly143, Met165, Gln189 | 0.25 | Treatment of breast cancer |
In addition to the dyad, the enzyme Mpro has a catalytic region formed by the residues Phe140, Asn142, Gly143, Ser144, Cys145, Met165, Glu166, Gln189 and Thr190.78 As a protease, the catalytic site contains different subsites where interactions occur with the amino acids of the substrate whose peptide bond will be hydrolyzed. The S1 subsite is formed by the residues Phe140, Leu141, Asn142, His163 and Glu166, the S2 subsite is formed by the residues Met49, Tyr54, His164, Asp187 and Arg188, the S3 subsite is formed by the residues Met165, Leu167, Gln189, Thr190 and Gln192, in addition to the S1′, formed by the residues His41, Gly143, Ser144 and Cys145.75
Analyzing Table 1, it was possible to observe that the selected ligands interact with the enzyme through hydrogen bonds with the Thr24, Thr26, Hys41, Ser46, Met49, Asn119, Ser139, Phe140, Asn142, Gly143, Ser144, Cys145, His163, His164, Met165, Glu166, Pro168, Gly170, Asp187, Gln189, Thr190 and Gln192. In addition, among the ten main compounds selected, the interactions of residues His41, Asn142, Cys145, Glu166 and Gln189 can be highlighted, in which they make two, three, five, six and six hydrogen bonds, respectively. As mentioned earlier, His41 and Cys145 are part of the catalytic dyad, and a computational study of Wang's drug repurposing showed that His41 and Gln189 residues were considered important.79 These residues are mainly part of the S1 subsite (Asn142, Cys145 e Glu166) and S2 (His41 e Gln189), in addition, studies by Koulgi and collaborators suggest that the inhibitory effect is observed when interactions with these key residues occur at the active site of the enzyme.80
Apart from remdesivir, application of darunavir against Covid-19 should be highlighted, especially since it has been studied as a possible treatment for SARS-CoV-2, and due to in vitro evidence that supports its ability to fight the disease.81
Fig. 1 Overlay of all Mpro systems after 100 ns of MD simulations. Mpro is shown in the cartoon model while repurposed drugs are in the stick model. All 3D structures of representative snapshots of Mpro systems are available as ESI† (PDB format). |
The root-mean-square deviations (RMSDs) of each system concerning the average structure were plotted in Fig. 2. As the plots show, the RMSDs of each system tend to converge after 20 ns simulation time, indicating the system is stable and equilibrated. Besides, all systems show very small RMSDs, which change from 0.97 ± 0.23 Å (pentagastrin) to 1.40 ± 0.20 Å (octeotride), which suggests that they are in a similar conformation. Additionally, for the Mpro-triptorelin, which shows the best binding free energy (as shown below), two more 100 ns of MD runs were performed using random seed with different atomic velocities. These results suggest that their average structures have very high 3D conformations and similar RMSD plots (Fig. S1 in the ESI† file).
Fig. 2 Graphical representation of the RMSD values (Å) during 100 ns of MD simulation time for the selected repurposed drug complexes. |
To explore the fluctuations of the residue of system-wise, the root-mean-square fluctuation (RMSF) analysis was carried out. The RMSF of the Cα atoms of the protein for each residue during whole MD simulations for the complexes is shown in Fig. 3. The highest RMSF values are attributed to the N- and C-terminals, and loop regions. As shown in Fig. 3, the region closer to the active site formed by the Cys44–Asn53 residues (represented in red), that presented greater fluctuation to the terlipressin compound. Based on this observation, Bzówka et al. suggested that the presence of an inhibitor in this region might stabilize the loops surrounding the active site.72 Another highlight is that residues are formed by the Asp153–Cys156 (represented in orange). In addition, a study by Shekh and colleagues describes the importance of Cys156 in the accessibility of external agents in the Mpro enzyme.83 Another region that presented high values of RMSF (Fig. 3), for all compounds, was the region formed by the Gly275–Thr292 residues (represented in yellow).
Inhibitor | ΔEele | ΔEvdW | ΔGGB | ΔGSA | ΔGbind |
---|---|---|---|---|---|
Triptorelin | −121.9 ± 0.3 | −92.4 ± 0.2 | 147.9 ± 0.3 | −10.8 ± 0.01 | −77.3 ± 0.2 |
Darunavir | −207.7 ± 0.7 | −67.7 ± 0.2 | 221.9 ± 0.6 | −10.2 ± 0.01 | −63.7 ± 0.3 |
Foretinib | −104.1 ± 0.4 | −63.6 ± 0.2 | 122.2 ± 0.4 | −7.0 ± 0.01 | −52.5 ± 0.2 |
Pentagastrin | 24.8 ± 0.5 | −57.9 ± 0.2 | −7.9 ± 0.4 | −7.3 ± 0.02 | −48.2 ± 0.2 |
Adaptavir | −34.5 ± 0.3 | −53.3 ± 0.2 | 51.6 ± 0.3 | −6.1 ± 0.02 | −42.2 ± 0.2 |
Afamelanotide | −185.0 ± 1.4 | −57.8 ± 0.2 | 211.2 ± 1.3 | −7.7 ± 0.03 | −39.4 ± 0.3 |
Afatinib | −115.7 ± 0.5 | −51.3 ± 0.1 | 134.6 ± 0.5 | −6.6 ± 0.01 | −38.9 ± 0.1 |
Octeotride | −243.1 ± 1.0 | −43.1 ± 0.2 | 261.7 ± 0.9 | −6.1 ± 0.02 | −30.5 ± 0.2 |
Pepstatin | 42.9 ± 0.4 | −49.8 ± 0.2 | −15.9 ± 0.3 | −6.4 ± 0.02 | −29.2 ± 0.2 |
Terlipressin | −220.2 ± 1.4 | −54.2 ± 0.3 | 256.5 ± 1.6 | −7.4 ± 0.04 | −25.4 ± 0.2 |
The energy components of the binding free energies were also listed in Table 2. As can be seen from Table 2, the ΔEele, ΔEvdW and ΔGSAterms are favorable contributors to most of the repurposed drugs, except for the pentagastrin and pepstatin drugs. Interestingly, in these systems, the term ΔGGB is favorable, which does not occur on other systems. Then, in most of the systems, the favorable electrostatic interactions are equilibrated by the unfavorable polar solvation upon binding. As a result, the total electrostatic interaction contributions have detrimental effects on the binding of repurposed drugs. Except for the pentagastrin and pepstatin, the positive solvation contribution indicates that the repurposed drugs were exposed to water interaction in their respective complexes. If we consider the contributions of different binding free energy components of Mpro–ligand binding, the most important determinant of difference in the binding affinity are ΔEvdW and ΔGGB contributions.
Interestingly, according to molecular docking results, the triptorelin was pointed with the second-highest value of gold score (Table 1) and after 100 ns of MD simulations has become the most promising Mpro inhibitor among the selected compounds. On the other hand, terlipressin, which has the fourth-highest value of gold score (Table 1), has been identified as the weakest Mpro inhibitor. This evidence can be observed by comparing the binding free energy values of the selected repurposed drugs (Table 2) it is possible to observe two compounds with distinct energy values. The calculated values of the terlipressin and triptorelin compounds have ΔGbind (−25.39 ± 0.21 and −77.28 ± 0.20 kcal mol−1, respectively) and ΔEvdW contribution (−54.24 ± 0.30 and −92.42 ± 0.19 kcal mol−1, respectively). These results suggest the importance of applied computational strategies as highlighted in previous studies.25
To better understand the individual contribution of each residue for the binding free energy for the lowest ΔGbind, a residual decomposition analysis of ΔGbind was performed using MM/GBSA approach.51 Here, any residue that contributes to the binding free energy values below −1.50 kcal mol−1 was considered an important residue to the binding process. The energy contribution of each residue for the complexes formed by terlipressin (weakest binding free energy) and triptorelin (strongest binding free energy) obtained by the MM/GBSA method are shown in Fig. 4, once they have shown the different binding free energy values. The residue contributions to binding energies using the plugin CHEWD84 with Chimera85 software are shown on the right side of Fig. 4. From this, the favourable interactions are at the blue of the colour scale, that is, those that contribute to the stabilization of ligand into complex, while the red colour represents interaction by residue with positive values and corresponds to unfavourable interaction values.
Fig. 4 Per-residue binding free energy decomposition (in kcal mol−1) (left) and representative snapshot (right) of Mpro systems after 100 ns of MD simulations. (A) Terlipressin and (B) triptorelin. The carbon atoms of each inhibitor are shown in gray color. The per-residue results for other systems are available in the ESI file (Fig. S2†). |
For the Mpro–terlipressin complex (Fig. 4A) the residues that most significantly contribute to the total energy and, therefore, to the stabilization of the complex are His41, Cys44, Met49, Met165, Pro168, Pro188, Gln189, Thr190 and Ala191, that displayed energy values of −2.46, −1.79, −1.65, −1.53, −1.77, −1.86, −1.75, −3.25 and −1.82 kcal mol−1, respectively. The binding free energy decomposition of Mpro–triptorelin complex (Fig. 4B) revealed that the residues Thr25, His41, Ser46, Asn142, Cys145, Met165, Glu166, Pro168, Asp187, Gln189 and Ala191 contribute greatly to the complex stability, exhibiting the energy values −1.74, −2.85, −1.65, −2.78, −1.69, −3.14, −1.91, −3.18, −1.90, −1.53 and −1.54 kcal mol−1, respectively. From these results, we can observe that the enzyme–triptorelin complex present more favorable contributions to the binding free energy when compared to those formed by the terlipressin complex. Moreover, we should pay more attention to those residues with a relatively large difference in the contribution to the total binding free energies. For the terlipressin system, we can highlight the residues His41 and Thr190, while for the triptorelin system, the most relevant residues are His41, Asn142, Met165, and Pro168.
It is worth highlighting that His41, Met165, Pro168, Gln189 and Ala191 residues were important for both ligands. Interestingly, that triptorelin compound has interactions with catalytic dyad formed by His41 and Cys145 residues.15 Besides, the interactions with residuals Thr25, Cys145, Met165, Asp187, Gln189 and Gln192 had already been observed in the molecular docking results (Table 1).
Particularly, we can highlight the results for the darunavir, which shows a moderate binding free energy value (−63.70 ± 0.25 kcal mol−1). This drug has shown evidence in vitro studies involving a possible treatment for SARS-CoV-2.86 Therefore, our results suggest that Mpro would be a potential target for the darunavir drug.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d1ra03956c |
This journal is © The Royal Society of Chemistry 2021 |