Profiling the interaction mechanism of indole-based derivatives targeting the HIV-1 gp120 receptor

Jinghui Wang; Yan Li; Yinfeng Yang; Jingxiao Zhang; Jian Du; Shuwei Zhang; Ling Yang

doi:10.1039/C5RA04299B

View PDF VersionPrevious ArticleNext Article

DOI: 10.1039/C5RA04299B (Paper) RSC Adv., 2015, 5, 78278-78298

Profiling the interaction mechanism of indole-based derivatives targeting the HIV-1 gp120 receptor†

Jinghui Wang^ab, Yan Li*^a, Yinfeng Yang^a, Jingxiao Zhang^a, Jian Du^b, Shuwei Zhang^a and Ling Yang^c
^aKey Laboratory of Industrial Ecology and Environmental Engineering (MOE), Department of Materials Sciences and Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China. E-mail: jhwang_dlut@163.com; yinfengyang@yeah.net; d11107022@mail.dlut.edu.cn; zswei@dlut.edu.cn; yanli@dlut.edu.cn; Tel: +86-0411-84986062
^bInstitute of Chemical Process Systems Engineering, Dalian University of Technology, Dalian 116024, Liaoning, China. E-mail: dujian@dlut.edu.cn
^cLaboratory of Pharmaceutical Resource Discovery, Dalian Institute of Chemical Physics, Graduate School of the Chinese Academy of Sciences, Dalian, Liaoning 116023, China. E-mail: yling@dicp.ac.cn

Received 11th March 2015 , Accepted 10th September 2015

First published on 10th September 2015

Abstract

A glycoprotein exposed on a viral surface, human immunodeficiency virus type 1 (HIV-1) gp120 is essential for virus entry into cells as it plays a vital role in seeking out specific cell surface receptors for entry. The CD4 binding site of gp120 is known to be highly conserved among the different circulating subtypes and therefore constitutes particularly interesting targets for drug design. The rational selection of the training and test sets is a crucial step in the process of establishing three-dimensional quantitative structure–activity relationship (3D-QSAR) models. In the present study, a novel rational division methodology based on a genetic algorithm (GA) was employed for building QSAR models. In order to verify the rationality of the GA splitting approach, the modeling set was subdivided into a training set (80% of the modeling set) and a test set (20% of the modeling set) using a self-organizing map (SOM) and random division for comparison. The models of SOM and random splitting exhibit almost the same proper statistical results (Table 5) indicating that models based on these three division methods generate significant statistical results for the test sets, proving the reasonability of the GA division method for building the models. In addition, molecular docking, molecular dynamics (MD) and Molecular Mechanics/Poisson–Boltzmann Surface Area (MM-PBSA) were performed for further extending the study and confirming the goodness of the CoMFA and CoMSIA models. The corresponding 3D contour maps along with the docking and MD simulation results have also identified the key structural requirements of HIV-1 inhibitors responsible for the activity: (1) bulky substituent at position-3 of the oxazole ring increases the biological activity; (2) electrostatic groups at positions-2,3 of the oxazole ring and positions-14,15 of the indole ring are helpful for the HIV-1 inhibition; (3) hydrophobic groups at position-12 of indole are favored. (4) Asp368, Asn425, Thr257, Ser375, Asp474, Gly473 form several H-bonds which are crucial for the ligand–target interactions. The decomposition of free energies by MM-PBSA indicates the polar salvation contribution terms are the major driving force for the interaction between the inhibitor and HIV-1 protein. The present work provides useful guidelines for future structural modifications of this class of compounds towards the development of superior HIV-1 inhibitors.

1. Introduction

The acquired immunodeficiency syndrome (AIDS) is caused by the human immunodeficiency virus type 1 (HIV-1) afflicting the immune system of the organism.¹ Despite enormous costs on effective HIV-1/AIDS prevention campaigns, every day more than 7000 novel HIV-1 infections are occurring and approximately 34 million people were estimated to live with HIV-1 in 2011.² Thus for the treatment of the disease, intensive studies for the development of novel antiviral agents and multidrug combination therapies have been carried out which resulted in a significant increase in life expectancy for those with access to therapy.³ However unfortunately, though highly active antiretroviral therapy (HAART) cocktails consisting of reverse transcriptase and protease inhibitors have been proven successful in reducing the viral load in AIDS patients, they fail in the eradication of HIV up to date.^4,5 What's worse, the rapid emergence of drug resistance has rendered many treatments ineffective, which will continue to be a formidable challenge for molecular design and drug discovery.⁶ Thus, today the need for novel therapeutic agents with improved dosing regimes that are better tolerated with a reduced side effect burden is still urgent.

The HIV-1 entry process involves multiple stages, namely, four sequential steps including the attachment, CD4 binding, co-receptor binding, and membrane fusion, with several targets including two viral glycoproteins (gp120, gp41) and two cellular receptors (CD4, CCR5/CXCR4) involved.⁷ The entry steps seem to be attractive targets because of the opportunities for stopping the virus from gaining entry to cells in the first place.⁷ In fact, the interaction of HIV-1 surface glycoprotein gp120 with CD4, a glycoprotein receptor expressed on mammalian cells, is the first critical step of the series of carefully choreographed events that allow virus access to host cells.^8,9 In the second step, the binding of gp120 to CD4 induces a conformational change in the virus glycoprotein that exposes binding sites recognized by co-receptors expressed on the surface of host cells, either CCR5 or CXCR4, depending on virus tropism.⁸ CCR5-using virus strains (R5-tropic) predominate during the primary infection and the early asymptomatic phase whereas CXCR4-using (X4-tropic) viruses appear during later disease stages in about 50% of HIV-1-infected individuals.¹⁰ In the last two steps, co-receptor binding subsequently triggers additional structural changes that lead to exposure of the envelope gp41 fusion protein that allows its insertion into the cell membrane and mediates fusion of the viral envelope with the host cell.¹¹ As gp120 dissociates from gp41, the heptad repeat 2 in each of the three gp41 units condense with their respective heptad repeat 1 domains, forming a 6-helix bundle and bringing viral and host membrane into close proximity.¹⁰ Therefore, blocking the binding of CD4 with gp120, and more specifically, preventing the CD4-induced conformational isomerization that promotes the co-receptor binding and viral cell fusion, may have potential therapeutic value for the treatment of HIV infection and AIDS.¹²

The virus envelope glycoprotein gp120 contains five conserved regions (C1–C5) and five protein domains (V1–V5), identified by comparison of the primary amino acids sequences of gp120 proteins derived from different HIV-1 isolates.⁷ The conserved domains contribute to the core of gp120, while numerous N-linked glycosylation sites are located near the surface of the molecule.¹³ The X-ray crystal structures also reveal a water-filled channel that connects the Phe43 cavity to the gp120 surface and is flanked by the inner domain and bridging sheet. Research into gp120 inhibitors able to block the gp120–CD4 interaction has received ongoing attention in recent years which led to the discovery of active small-molecules characterized by a high degree of chemical diversity.² The binding of gp120 and CD4 creates a roughly spherical 152 Å³ cavity (the Phe43 cavity) at the interface between gp120 and CD4; the Phe43 cavity is bounded by highly conserved residues from all three gp120 domains and by a single CD4 residue, Phe43.¹⁴ As a consequence, the CD4–gp120 cavity has been suggested to be a potential target for small molecules and the inhibition of HIV-1 entry that block gp120–CD4 interactions. Presently, a computational protocol based on molecular docking was employed with the aim of identifying ligands targeting Phe43 cavity.¹⁵

To date, most gp120 entry inhibitors reported are proteins or peptides. Due to their low oral bioavailability and the high price, their applications in clinical trials are limited. Therefore, blocking the gp120–CD4 interaction has attracted much attention on the development of active small-molecule gp120 inhibitors in recent years.² Recent publications and patent disclosures indicate that there has been an increased effort to identify small-molecule HIV-1 gp120 inhibitors that can block HIV-1 gp120–CD4 interactions.¹⁵ BMS-378806, BMS-488043, BMS-663068, BMS-599793, NBD-556 and NBD-557 (Fig. 1) together with their analogs are mainly potent small-molecule inhibitors that are reported to prevent the binding of HIV-1 gp120–CD4 receptors in nanomolar range. BMS-378806 is the first small-molecule CD4-attachment inhibitor that enters into clinical trials, but failed to exhibit sufficient exposure to merit proceeding into an efficacy trial.¹⁶ It potently inhibited the laboratory-adapted HIV-1 strains and retained high activity against primary isolates.¹⁷ During the exploration of the resistant viral strain derived from BMS-378806, a mechanism was confirmed, which involves the targeting of viral entry by inhibition of the binding of HIV-1 envelope gp120 protein to CD4 receptors via a specific and competitive mechanism with a 1 [thin space (1/6-em)] :1 stoichiometry, similar to that of the soluble CD4.^18,19 BMS-488043 is BMS-378806's successor²⁰ as a low-molecular-weight attaching inhibitor of HIV-1 entry.²¹ With improved pharmacokinetic properties, it demonstrates antiviral efficacy and favorable safety profiling in HIV-infected subjects, prohibiting HIV-1's infection via envelope conformational alterations and primarily interfering with cellular CD4 binding.²¹ BMS-488043 has entered into phase II clinical trial, but due to the large dose (1800 mg), it has been discontinued and finally got replaced by BMS-663068. By comparison, BMS-663068 is a lead compound in therapeutic clinical trials with an improved potency, a seemingly higher barrier for resistance as well as a good safety profile in humans.²² In addition, two N-phenyl-N V-(2,2,6,6-tetramethyl-piperidin-4-yl)-oxalamide analogs, NBD-556 and NBD-557, have been identified by Debnath et al. Both compounds show micromolar potency against HIV-1. Mechanism of action study reveals that NBD-557 and NBD-556 bind gp120 with large favorable enthalpy and unfavorable entropy changes as well as a large negative heat capacity change, resembling CD4 itself.²³ However, according to the results of Zhao et al., NBD-556 and NBD-557 inhibited the CD4-dependent virus in a dose-dependent manner with IC₅₀ values of 22.6 μM and 13.4 μM, respectively, but neither of the compounds inhibited the infection of the target cells by the CD4-independent virus at concentrations up to 118 μM indicating that these compounds inhibit HIV-1 entry and infection by primarily blocking the gp120–CD4 interaction.²⁴ Thus, resistance to HIV-1 inhibitors is a serious threat to efficient HIV treatment. Moreover, many of the HIV-1 protein inhibitors in the market suffer from poor pharmacokinetic properties due to poor aqueous solubility, low metabolic stability, high protein binding, and poor membrane permeability.²⁵ The development of new HIV-1 protein inhibitors addressing these issues is therefore of high importance.


	Fig. 1 Chemical structures of BMS-378806, BMS-488043, BMS-663068, BMS-599793, NBD-556 and NBD-557.

In discovery of rational small-molecule inhibitors, one of the major problems is the understanding of the interaction between drugs and their receptors.^26–28 As an artificial tool to design novel bioactive compounds, computer simulation technology provides more effective drug design.² Over the last few years, three-dimensional quantitative structure–activity relationship (3D-QSAR) methods including especially the approaches of comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) have been successfully employed to aid the design of new drug candidates, ranging from enzyme inhibitors to receptor ligands.²⁹ Presently, a series of 279 novel indole-based derivatives has been reported as HIV-1 attachment inhibitors exhibiting potent HIV-1 inhibitory activities and antiviral activities.^8,16,30–36 To explore the binding mode of this series of inhibitors against HIV-1, as well as to predict the inhibitory activities of unknown compounds, we constructed the 3D-QSAR models. Docking and molecular dynamics (MD) simulations were also carried out to investigate the possible binding mode between HIV-1 proteins with these inhibitors as well as to find their key structural features affecting the inhibiting activities. The binding free energies of compound 149 to protein were calculated by accurate molecular mechanics combining with Poisson–Boltzmann surface area (MM-PBSA) methods. The established models not only predict the biological activity of the derivatives rapidly and accurately but also provide valuable information for structural modifications to design novel potent inhibitors.^37,38

The selection of training and test sets is the most important and the also most difficult step which often consumes much time in building QSAR models. Random division is a widely used approach in establishing the QSAR model robustness. However, the disadvantage of dividing the overall set randomly is that it does not provide any rationale for selecting test set chemicals. In addition, it is time-consuming, especially for lager database (containing more than two hundreds compounds in the dataset). To achieve the optimal model, the selection of training and test sets should be based on some rational algorithms; otherwise, poor predictive ability of QSAR models may be obtained. Rational division algorithms which attempt to “intelligently” divide data sets into training and test sets with the goal of producing more predictive models.³⁹ Genetic algorithm (GA), derived from Darwin's theory of natural selection and evolution, is a highly efficient optimization algorithm which has already been successfully applied in many QSAR analyses.⁴⁰ Based on the Darwinian principle of survival of the fittest, GA obtains optimal solutions after a series of iterative computations (i.e., selection or reproduction, crossover or recombination, and mutation). In view of this, we design a new rational splitting method based on GA by using SYBYL Programming Language (SPL) which makes sure that the spitted the training and test sets were distributed within the whole descriptor space occupied by the entire dataset. For comparison, the modeling set was also subdivided into a training set (80% of the modeling set) and a test set (20% of the modeling set) using self-organizing maps (SOM) and random division. The purpose of these two methods is to determine whether GA splitting method lead to more predictive models compared to SOM and random division. When compared with the results obtained by the SOM and random selection methods, the GA splitting exhibits almost the same proper statistical results which indicate that the reasonability of GA division methods for building the models.

2. Materials and methods

2.1 Computational workflow

As aforementioned, a computational workflow was followed in order to predict the inhibitory activity of newly designed analogues. The whole work is summarized in the hierarchical chart (Fig. 2).


	Fig. 2 Hierarchical flow-chart of the computational workflow.

2.2 Chemical and biological data

In this work, a total of 279 HIV-1 inhibitors were collected from the Meanwell's group.^8,16,30–36 The chemical structures of these compounds are unequivocally known, which were clustered into seven subgroups (structures in ESI Fig. S1 and Table S1†) according to the molecular descriptors and structure difference. Each subgroup is represented by several compounds. For modeling purposes, the EC₅₀ values of the data set were converted into corresponding pEC₅₀ by taking their inverse logarithms (−log [thin space (1/6-em)]

EC₅₀) and used as a dependent variable for further investigation. For the consideration of activity gradients, the compounds were chosen from the above seven categories with an approximate ratio of 4 [thin space (1/6-em)]

1 of the training set and the test set. Table 1 shows the representative structures of HIV-1 inhibitors and their pEC₅₀ values.

Table 1 Representative structures of HIV-1 inhibitors and their pEC₅₀ values

No.	pEC₅₀	No.	pEC₅₀
a Test set.
007	4.70	125	8.12
054^a	6.41	149	11.27

2.3 GA splitting method

In this study, in order to rationalize the division process, a new rational GA-based splitting program was designed by using SPL (Fig. S3 in ESI†) to make sure that the splitted training and test sets were distributed within the whole descriptor space occupied by the entire dataset. This program aims to select training and test sets rationally from a preferable database set. The detailed methodology about (GA)-based splitting approach is depicted in Fig. 3.


	Fig. 3 System architecture of building QSAR models workflow. Div (= 2, 3, 4), the whole compounds can be divisible by div (= 2, 3, 4) as initial test set. N, the number of compounds selected, which have been added to the training set from the test set when the ratio of training to test compounds is ranging from 5:1 to 3:1. The algorithm steps are mainly as follows: (1) the whole compounds can be divisible by div (= 2, 3, 4) as initial test set, the rest of compounds are considered as the training test. (2) Add N compounds to the training set from the test set, calculate the model parameters and record the log file. (3) The best compounds (Q² > Q_threshold², R_ncv² > R_ncvthreshold², R_pred² > R_{predthreshold}²) are added to the training test. (4) The best model is chosen from the log file.

Like most rational design algorithms, the new splitting is designed to meet following criteria: (1) the chemicals in the training set must be structurally diverse enough to cover the whole descriptor space of the overall data set. (2) The compounds in the training and test sets should be close to each other.⁴¹ Fig. S4 (in ESI†) depicts an example of the GA splitting which spites the 11 compounds into the training and test sets.

2.4 Splitting the dataset using SOM and random approaches

To verify the rationality of the GA splitting and QSAR models, the entire dataset was divided into a training set and a test using two different methods: SOM and random selection method. Self-organizing maps⁴² creates a set of prototype vectors representing the dataset and carries out a topology preserving projection of the prototypes from the d-dimensional input space onto a low-dimensional grid, which is a convenient visualization space for showing the cluster structure of the data.⁴³

Random division is a widely used method to establish the model robustness. It consists of rebuilding the models using randomized activities of the modeling set and subsequent assessment of the model statistics.⁴⁴ This method guarantees that the prediction set spans the entire range of the experimental measurements and is numerically representative of the dataset.^45,46

2.5 Chemical structure optimization and alignment

Molecular modeling and database alignment were performed by using the SYBYL 6.9 program package.⁴⁷ 3D structures of all compounds were constructed using the Sketch Molecule module. Energy minimizations were performed using the Tripos force field⁴⁸ with a distance-dependent dielectric and the Powell conjugate gradient algorithm with a convergence criterion of 0.01 kcal mol⁻¹. Gasteiger–Hückel charges were applied to all the molecules used for 3D-QSAR studies.

During the process, the most active compound 149 was used as the template for alignment and the rest of molecules aligned to fit the rest inhibitors with the common substructure as shown in blue in Fig. 4A. To derive the best possible 3D-QSAR statistical model, three different alignment rules were employed in this study. Alignment-I (Fig. 4B) is a ligand-based method, where the most potent compound 149 was used as a template to fit the remaining training and test sets of molecules by using substructure-alignment function available in SYBYL software. Alignment-II (Fig. 4C) is the receptor-based alignment, in which all molecules were docked into the receptor firstly. Then those docking solutions with the highest scores and conformations similar to the ligand in the crystal structure were overlaid to derive the receptor-based molecular alignment. With respect to Alignment-III (Fig. 4D), it is also a receptor-based method, in which all molecular conformations were first obtained from Alignment-II and were subsequently subjected to the process as Alignment-I, i.e., the skeleton of compound 149 was used as the template to fit all remaining compounds adopting the receptor-based conformations.


	Fig. 4 (A) Compound 149 was used as a template for alignments. The common substructure is shown in blue. Alignment-I, -II and -III of all the compounds are shown in panels (B), (C) and (D), respectively. Molecules are colored in white for common C, blue for N, red for O, yellow for S, cyan for H atoms, respectively.

2.6 CoMFA and CoMSIA descriptors

The CoMFA and CoMSIA descriptors were used as independent variables, and pEC₅₀ values were used as dependent variables in partial least squares (PLS) regression analysis to derive 3D-QSAR models using the standard implementation. The cross-validation analysis was performed using the leave-one-out (LOO) method in which one molecule was removed from the data set and its activity was predicted using the model derived from the rest of the dataset. A cross-validated coefficient, Q², was then obtained and provided a glimpse of model predictive power. The predictive correlation coefficient (R_pred²), based on the molecules of test set, was calculated according to the equation shown as below:


	(1)

where SD is the sum of squared deviations between the inhibitory activities of the test set and mean activities of the training molecules and PRESS is the sum of squared deviations between predicted and actual activity values for each molecule in the test set.

2.7 Protonation state assignment

The X-ray crystal structure of HIV-1 was taken from PDB (code: 3TGS) with the resolution of 2.7 Å in the current docking process. The crystal unit contains a dimer (chains A and B). Since there are two relatively strong H-bonds in chain B, which are formed by NBD-556 with Asn425 (–O⋯HN, 2.91 Å) and Gly473 (–O⋯HN, 2.77 Å), we chose one of the monomer (3TGS_B) for the docking study. His, Gln, and Asn residues were manually checked for possible flipped orientation, protonation, and tautomeric states with Pymol 0.99 (DeLano Scientific, San Carlos, USA) side-chain wizard script. The receptor structures was generated by using default values to determine the protonation states of the titratable groups. As such, amines were protonated, carboxylate groups were negatively charged, and hydroxyl groups are considered to be neutral.^49–54 The active site was defined with a 10 Å radius around the ligand present in the crystal structure to any of the inhibitor atoms were specified and a core sub-pocket was mentioned in order to obtain the appropriate bioactive conformation of the ligand within the active site.⁵⁵ Hydrogen atoms were then added to the protein, and the structures of protein and ligand were combined in a single Macromodel (Schrodinger Inc, San Diego, CA, USA) file.

2.8 Molecular docking

To locate the appropriate binding orientations and conformations of the inhibitors interacting with HIV-1, a powerful computational searching method was needed. The advanced molecular docking program GOLD (version 5.1), with a powerful GA method for conformational search and docking programs,^56,57 was employed to generate an ensemble of docked conformations.

Water-mediated H-bonds are reported sometimes playing a crucial role in the stabilization of the ligand at the active site.⁵⁸ Therefore, all water molecules in the crystal structure were retained. The maximum number of GA runs was set to 10 for each compound. At the end of the computation, each conformer of all 279 inhibitors was docked into the binding pocket, and those docked conformations of every ligand were saved. Having produced viable poses of the ligands bound to HIV-1 with GOLD, those top ranked structures were extracted together for further analysis.

2.9 Molecular dynamics simulation

The MD simulations of all the molecules in the data set were performed using the AMBER 10.0 software package.⁵⁹ The docked structures of gp120 with compound 149 systems were used as the initial structures for MD simulations. It was carried out in a water solvated simulation box, with periodic boundary conditions (PBC), the force field parameters for the ligand was generated by the general AMBER force field (GAFF) using the Antechamber program. The complex was first placed in a cubic box with a minimum 10 Å distance from the protein surface. The full system was subjected to energy minimization using steepest descent algorithm for 10 [thin space (1/6-em)]

000 steps, and then the system was equilibrated by a 200 ps MD simulation at 300 K.

For each protein–ligand complex, the binding free energy (ΔG_bind) was calculated by the Molecular Mechanics/Poisson–Boltzmann Surface Area (MM-PBSA) approach implemented in the Amber.⁶⁰ The difference in binding free energy is computed for each molecular species including complex, ligand, and protein. Binding free energy is calculated using the following equations:


ΔG_bind = ΔE_MM + ΔG_sol	(2)


ΔG_MM = ΔE_internal + ΔE_{electrostatic} + ΔE_vdw	(3)


ΔG_sol = ΔG_PB/GB + ΔG_SA	(4)

where ΔG_MM is the interaction energy between protein and the inhibitor and ΔG_sol is the solvation energy. The molecular mechanical (MM) free energy (ΔE_MM) is the sum of internal energy (ΔE_internal), the electrostatic interaction energy (ΔE_{electrostatic}), van der Waals interaction energy (ΔE_vdw), and internal energy of bonds, angles and torsions. ΔG_sol is the change of the solvation free energy upon binding, and includes the electrostatic solvation free energy ΔG_PB (polar contribution calculated using generalized Born model), and the nonelectrostatic solvation component ΔG_SA (nonpolar contribution estimated by solvent accessible surface area).⁶¹ Moreover, several earlier literatures have shown that entropy does not contribute much to the binding free energy to the same protein, so the entropy contribution was not calculated in this work.^62–65

3. Results

3.1 Reliability of the 3D-QSAR models

To judge whether a QSAR model is reliable fitting for prediction of unknown molecules, PLS analysis along with the LOO cross-validation method was used. Several statistical parameters were analyzed, including the Q², R_ncv², and standard error of estimate (SEE), F-statistic values and the optimum number of components (OPN) as well as the R_pred². For the HIV-1 3D-QSAR studies, by using GA method based on genetic algorithm for splitting the dataset, good correlations are observed in the obtained CoMFA and CoMSIA models demonstrated by the high values of Q² and other statistical results.

It is well known that the CoMFA and CoMSIA models are alignment sensitive, with the quality and the predictive ability of the models directly dependent on the alignment rules, and differences in the statistical values are observed with different alignments.⁶⁶ Table 2 summarizes the statistical results of the 3D-QSAR analyses using different superimpositions. Obviously, in both the CoMFA and CoMSIA analyses, the ligand-based alignment (Alignment-I) leads to the models with larger Q², R_ncv², R_pred² and lower SEE, standard error of prediction (SEP) values than those obtained from other two alignments (Alignments-II and III). Hence, the ligand-based conformational alignment was further used for systematic CoMFA and CoMSIA analyses.

Table 2 Summary of CoMFA and CoMSIA results^a

PLS statistics	Alignment-I		Alignment-II		Alignment-III
PLS statistics	CoMFA	CoMSIA	CoMFA	CoMSIA	CoMFA	CoMSIA
a Q², cross-validated correlation coefficient after the leave-one-out procedure; R_ncv², non-cross-validated correlation coefficient; SEE, standard error of estimate; F, ratio of R_ncv² explained to be unexplained = R_ncv²/(1 − R_ncv²); R_pred², predicted correlation coefficient for the test set of compounds; SEP, standard error of prediction; OPN, the optimal number of principal components. Alignment-I, a ligand-based method, where the most potent compound 149 was used as a template to fit the remaining training and test sets of molecules by using substructure-alignment function available in SYBYL software. Alignment-II, the receptor-based alignment, in which all molecules were docked into the receptor firstly. Then those docking solutions with the highest scores and conformations similar to the ligand in the crystal structure were overlaid to derive the receptor-based molecular alignment. Alignment-III, it is also a receptor-based method, in which all molecular conformations were first obtained from Alignment-II and were subsequently subjected to the process as Alignment-I, i.e., the skeleton of compound 149 was used as the template to fit all remaining compounds adopting the receptor-based conformations.
Q²	0.53	0.54	0.23	0.21	0.25	0.33
R_ncv²	0.89	0.83	0.65	0.68	0.41	0.72
SEE	0.57	0.72	0.98	0.93	1.26	0.88
F	169.81	141.21	62.40	71.60	47.12	85.12
R_pred²	0.85	0.83	0.56	0.58	0.49	0.70
SEP	1.20	1.17	1.44	1.47	1.42	1.35
OPN	10	7	6	6	3	6

Contribution (%)
Steric	67.60	14.80	39.30	11.00	47.80	11.70
Electrostatic	32.40	24.70	60.70	40.50	52.20	40.90
Hydrophobic	—	38.30	—	28.60	—	25.80
HB acceptor	—	22.20	—	19.90	—	21.50

The best CoMFA model is obtained by using steric and electrostatic fields and shows a Q² of 0.53 with an optimal number of components of 10, indicating a good internal prediction. This model also exhibits a high R_ncv² of 0.89 with relatively lower SEE of 0.57 and higher F value of 169.81 in the final non-cross-validation model. The steric component of these indole analogs on the inhibitory potency described by this model is 67.60%, whereas the electrostatic portion is 32.40%. All the above values reveal that the CoMFA model possesses a statistical significance and good internal predictive ability.

CoMSIA is an alternative molecular field analysis method to CoMFA. It is thought to be less affected by changes in molecular alignment and provides smoother and interpretable contour maps as a result of employing Gaussian type distance dependence with the molecular similarity indices.⁶⁷ The CoMSIA models were generated considering all possible combinations of CoMSIA field descriptors viz. steric, electrostatic and hydrophobic, H-bond donor and acceptor. Therefore, all 31 possible descriptors' combinations of different CoMSIA fields were calculated with their respective Q² value and optimum number of components obtained. Fig. 5 depicts the obtained Q², R_ncv² and R_pred² values for all combinations of the fields considered in CoMSIA analysis. Obviously, the combination of steric, electrostatic, hydrophobic and H-bond acceptor leads to the best CoMSIA model with larger Q², R_ncv², R_pred² values and thus was selected for further analysis. It gives an optimal number of components of 7 with a Q² of 0.54 which is almost consistent well with the CoMFA model (0.53). In addition, the high R_ncv² value of 0.83 and F value of 141.21 along with the low SEE value of 0.72 indicate that the CoMSIA model is statistically satisfactory. The steric, electrostatic, hydrophobic and H-bond acceptor field descriptors contributed 14.80%, 24.70%, 38.30% and 22.20% of the variance in anti-HIV-1 activity, respectively.


	Fig. 5 Graph of all 31 CoMSIA descriptors' combinations with their respective Q², R_ncv² and R_pred² values. The S, E, H, D and A represent the steric, electrostatic, hydrophobic, H-bond donor and acceptor property fields, respectively.

A crucial test for the predictability of a QSAR model is to check its ability of a model to predict the biological activity of compounds that have not been included in the training set.⁶⁸ In order to further validate the predictive ability of the obtained models, a test set of 69 compounds excluded in the construction of the models was employed by calculating their predicted activities to validate the reliabilities of the built models. The predicted R_pred² values for the test set from CoMFA and CoMSIA were 0.85 and 0.83, respectively, indicating good predictive ability for our models. The correlation between the calculated and experimental values pEC₅₀ for the training and test set molecules based on the optimal CoMFA and CoMSIA models is shown in Fig. 6. Fig. S5 (ESI†) illustrates the radar plot of observed versus predicted biological activity values for both training and test sets of the developed CoMSIA model. These statistical data and graphs indicate that the predicted pEC₅₀ values agree well with the experimental values in a tolerable error range. Overall, our models exhibit high precision both in regular cross validation and in prediction of activity of test set molecules. The high predictive powers of the CoMFA and CoMSIA training models for sets of diverse structural scaffolds suggest that these models possess high accommodating capacities and wide applicability in the development of novel HIV-1 inhibitors.


	Fig. 6 The ligand-based correlation plots of the predicted versus the actual pIC₅₀ values using the training (filled red triangles) and the test (filled black diamonds) set compounds based on (A) CoMFA and (B) CoMSIA models, respectively.

3.2 Graphical interpretation of CoMFA and CoMSIA models

To visualize the information content of the derived 3D-QSAR models, CoMFA and CoMSIA contour maps were generated. These maps provide information on factors affecting the activity of the compounds studied. This is particularly important when increasing or reducing the activity of a compound by changing its molecular structural features contributes to the interaction between the ligand and the active site region of a receptor. To aid visualization, the most active compound 149 in the whole dataset is shown as a template molecule superimposed with the 3D-QSAR contour maps. Fig. 7 and 8 depict the contours of CoMFA (steric and electrostatic) and CoMSIA (steric, electrostatic, hydrophobic, and H-bond acceptor fields), respectively.


	Fig. 7 CoMFA StDev*Coeff contour plots. (A) Steric (green/yellow) contour map in combination with compound 149. Green contours indicate regions where bulky groups increase the activity; yellow contours indicate regions where bulky groups decrease the activity; (B) electrostatic contour map (red/blue) in combination with compound 149. Red contours indicate regions where negative charged groups increase the activity; blue contours indicate regions where positive charged groups increase the activity.

As shown in Fig. 7A, the steric interactions are represented by green and yellow contour maps, where bulky groups near the green regions increase the inhibitory activity but cause opposite effect near the yellow regions. It has been recognized that a large positive green region near position-3 of oxazole ring shows a favorable steric interaction, indicating that molecules carrying a bulky substituent near this position should be more active than those without or with smaller substituents at the same location. For instance, compound 261 (pEC₅₀ = 9.83) fits into the green big contour around oxazole ring and thus shows significantly improved activity compared to another structural analog 254 (pEC₅₀ = 9.12). This is also in agreement with the fact that compounds 41, 106 and 118 with increased bulky substituents exhibit higher biological activities. Thus to improve the inhibitory effect of the compound, new analogs with increasing steric substituents in these regions should be developed. Meanwhile, a large yellow contour exists around position-14 of the indole ring, indicating that the substitution of too large group may decrease the activity. This is consistent with the reported experimental results that compounds 14 (pEC₅₀ = 4.68) and 16 (pEC₅₀ = 3.92) exhibit lower activities than molecules 8 (pEC₅₀ = 9.28) and 9 (pEC₅₀ = 9.35). If the inhibitor structure is too close or directly in contact with the yellow contours, there will be an overall decrease of the inhibitory activity.

Fig. 7B presents the electrostatic contour maps, which clearly show the areas of positive and negative electrostatic distributions. The blue and red contours represent positive and negative favored regions, respectively. In the positive electrostatic field, a big blue contour around the position-3 of oxazole ring and a small blue contour near position-14 of the indole ring reveal that the electron-donating substituents at these positions are essential for the inhibitory activity. Take compounds 102, 121, 122, 141, 158, 159, 161, 160 for examples, the strong and electron-donating –NH₂, –NHCH₃, –NCH₃CH₃ and –OCH₃ groups at the position-3 of oxazole ring and position-14 of indole ring in these compounds result in significantly increased activity. Moreover, two medium sized red contours at position-2 of the oxazole ring and the position-15 show that the electron-withdraw substituents at these positions are favored for activity. This phenomenon may have something to do with the atoms N and O (electronegative atom) of compounds 149–153, 157, 164, 167 at the position-2, which may be the reason for the small red contour showing around here. The function and location of atoms O at positions-5 and -16 may lead to the red contour appearing around position-15.

In our study, the CoMSIA model not only calculates the steric and electrostatic fields, but also uses the hydrophobic and HB acceptor fields to correlate with the inhibitors activity. The steric contour maps from the CoMSIA analysis (Fig. 8A) are generally in accordance with the field distributions of CoMFA maps. Only a subtle difference exists in the electrostatic contours of the CoMFA and CoMSIA models. As shown in Fig. 8B, a large size polyhedron in the vicinity of position-5 indicates that the electron-withdraw substituents at this position are favored for activity. As both CoMFA and CoMSIA models were built based on the same alignment database, this discrepancy may be explained by the different implementations of the fields for CoMFA and CoMSIA.⁶⁹ Thus, we assume that both steric and electrostatic fields play important roles in the binding affinity and should be given equal attention.


	Fig. 8 CoMSIA StDev*Coeff contour maps combined with compound 149. (A) Steric contour maps: green regions indicate where bulky groups increase the activity, yellow regions where small groups improve the activity. (B) Electrostatic contour maps: blue, favored for electropositive groups; red, disfavored for electronegative groups. (C) Hydrophobic contour maps: yellow contours indicate regions where hydrophobic substituents enhance the activity; gray contours indicate regions where hydrophilic substituents enhance the activity; (D) HB acceptor contour maps: magenta contours indicate regions where HB acceptors on the receptor improve the activity; yellow contours indicate regions where HB acceptors on the receptor demote the affinity.

Fig. 8C depicts the hydrophobic-favored and hydrophilic-favored regions which were shown as yellow and gray contours, respectively. One big yellow contour near position-12 of the indole ring and one small yellow embedded in the oxazole ring suggest that the occupancy of these regions by hydrophobic groups would increase the inhibitory activity. Indeed, most of the derivatives involved in this study possess a hydrophobic group at these areas, falling into these contours. This can serve as an explanation for the higher activities of compounds 3, 4 and 45 bearing hydrophobic Cl, CF₃, Br and CH₃, respectively. Additionally, the large gray contour near position-4 of oxazole ring and the small one located at position-19 of piperazinyl ring suggest the favor for hydrophilic groups at these locations.

The H-bond acceptor contours of the CoMSIA model are depicted in Fig. 8D, where H-bond acceptor favored and disfavored regions are shown as magenta and yellow contours, respectively. A magenta isopleth close to the carbonyl oxygen of the position-23 suggests that H-bond acceptor group may benefit the potency at this position, which is in agreement with the observations that most active compounds in the data set contain an H-bond acceptor group here. Similarly, a medium-sized magenta contour appears at the position-3 of oxazole ring, suggesting the favor of H-bond acceptor groups for the HIV-1 inhibitory potency, which is well illustrated by the N atom at position-3 of oxazole ring here.

In summary, the observations described above report the requisite structure for potent activities of the inhibitors of this class.

3.3 Polarizability contour plots

In a manner analogous to the 3D-QSAR algorithm, the polarizability fields were calculated at each lattice intersection on a regularly spaced grid.⁷⁰ The steric (Lennard-Jones potential) and electrostatic (Coulomb potential) fields were calculated at each grid point using an sp³ carbon probe atom with +1 charge 0.2 nm grid spacing and 0.15 nm van der Waals radium.^71–73 To minimize the domination by large steric and electrostatic energies, an energy cutoff of 30 kcal mol⁻¹ for both the steric and electrostatic contributions was set as threshold. The CoMSIA method, incorporating steric, electrostatic, hydrophobic, H-bond donor and acceptor fields, was carried out using a probe atom with radius 2.0 Å, +1.0 charge and hydrophobic and H-bond properties of +1. The attenuation factor was set to the default value of 0.3.

As shown in Fig. 9, the blue and red contours represent areas near the compound 149 that desire enhanced and decrease polarizability. In the positive polarizability field, a big blue contour around the position-3 of oxazole ring and a small blue contour near position-14 of the indole ring reveal that the electron-donating substituents at these positions are essential for the inhibitory activity. Two big blue contours are seen around the position-3 of oxazole ring. For the majority of the compounds in the data set, halogenation is required at these positions in order to facilitate binding. For the indole derivatives, it is hypothesized that the aromatic nature of the lateral phenyl rings acts as a surrogate for the electron rich halogens. These regions are seen near the lateral positions of the various molecular cores. For the majority of the compounds in the training set, halogenation is required at these positions in order to facilitate binding. The strong and electron-donating groups at this position are significantly increased activity. This result is very consistent with our existing knowledge of the structure–activity relationships for these molecules in that it has been previously hypothesized that increasing the polarizability along the long axis of the molecule is beneficial for binding.⁷⁴ Regions of space that require a corresponding decrease in the polarizability are found closer to the medial positions and are denoted by red polyhedral.⁷⁴ Two medium sized red contours at position-2 of the oxazole ring and the position-15 show that the electron-withdraw substituents at these positions are less polarizability for activity (Fig. 9).


	Fig. 9 CoMFA StDev*Coeff contour plots from the analysis of the polarizability field combined with compound 149. Red contours indicate regions where less polarizability desired; blue contours indicate regions where greater polarizability desired.

3.4 Docking analysis

As an effective computational method for predicting the binding structure of substrate in its receptor, molecular docking has been successfully used in many applications especially in exploring the target–ligand interaction mechanism.⁷⁵ Presently, to investigate the interaction mechanism of the HIV-1 attachment inhibitors with their target, all the obtained compounds were docked into the binding site of gp120 protein.

3.4.1 Validation of HIV-1 gp120 model. Before docking analysis, verifying the rationality of the docking parameters as well as the accuracy is necessary. The co-crystallized NBD-556 extracted from the X-ray structure of the complex was flexibly re-docked to the active binding site of HIV-1 gp120 (Fig. 10A). As shown in Fig. 10A, redocked NBD-556 was almost in the same orientation with its original co-crystallized conformation at the HIV-1 gp120 binding pocket, and the root mean square deviation (RMSD) between the redocked conformation and the crystallographic conformation of compound NBD-556 was 0.23 Å, suggesting an acceptable docking reliability of docking procedure. Thus it is assumed that the docking project can be extended to search the binding conformations to HIV-1 gp120 for other inhibitors. In addition, a superimposition of the simulated and experimental poses of NBD-556 within the Phe43 cavity shows only slight difference in the position of the solvent-exposed tetramethylpiperidine group (Fig. 10B). The good superposition of the docking and experimental conformations proves another time that our docked model is reliable.


	Fig. 10 (A) Binding conformations of co-crystallized (cyan) and re-docked (green) compound NBD-556. (B) Superimposition of the predicted (magenta) compound 149 and experimental (green; PDB code: 3TGS) conformation of NBD-556 within the Phe43 cavity.

3.4.2 Binding mode analysis obtained from molecular docking. In this section, the results derived from the docking of the HIV-1 attachment inhibitors in the active site of gp120 protein will be discussed. The attention has been focused on the most characteristic receptor–ligand interactions for a few key inhibitor structures and particularly for compound 149, the most active one in the data set. Fig. 11 shows a view of compound 149 in the active site of gp120 protein and the binding mode reveals the main interactions with the protein active site. It clearly characterizes that the hydrophobic interaction at this cavity plays a critical role in binding of inhibitors to gp120 receptor. The resulting binding pose indicates that the indole group is located at the entrance of the binding pocket containing the protein residues Gly367, Asp368, Asn425 and Ile371. The central piperazinyl moiety is involved in extensive hydrophobic contacts with residues Met426, Trp427, Met475 and Ile424, while the phenyl group linked to the piperazine ring makes hydrophobic interactions with the side chain of Tyr251, Val255, Phe376 and Phe382. It is also interesting to note that polar binding areas close to Trp427 and Met426 flank the entrance of or are embedded in a water channel.⁷⁶


	Fig. 11 Docked conformation of compound 149 into HIV-1 gp120. The projection highlights the structure of the active site with compound 149 which is displayed in sticks.

Aside from the hydrophobic interactions described above, H-bonds are also analyzed during the simulation. In general, the geometric criterion for the formation of H-bonds is an H-acceptor distance less than 3.5 Å (the distance between 3.2 and 4.0 Å is a weak H-bond) and the donor–H-acceptor angle larger than 120°. Although the indole group of compound 149 is too large to insert into the cavity, a strong H-bond is observed between NH of the indole ring and residue Asp368 (–O⋯HN, 2.68 Å). This interaction is crucial for the high affinity binding of CD4 and makes indole group perfectly fixed at the entrance of gp120 binding pocket. In addition, two H-bonds are formed by the oxygen atom at position-23 of compound 149 with the backbone carbonyl group of Thr257 (–O⋯HN, 3.66 Å) as well as the main chain amino group of Ser375 (–O⋯HN, 3.03 Å), which are in good agreement with the results in the studies of Kong et al.⁷ Moreover, the predicted complex is stabilized by additional water-mediated hydrogen bonding interactions between the oxygen atom at position-16 and residue Met426, forming an original H-bond. On the basis of these interactions, it was summarized that molecules with hydrophobic groups that can be accommodated within the hydrophobic cavity or with some H-bond donor groups that can interact with the hydrophilic region at the entry of the cavity comprising of residues Trp427, Asp368 would act as potential inhibitors.¹⁵ These results perfectly fill the bottom of the Phe43 cavity, and emerge as essential in determining the potency of the inhibitors, which might explain the best inhibitory activity exhibited by compound 149.

3.5 Molecular dynamics simulation

In order to evaluate the plausibility of the binding mode we predicted by GOLD, MD simulations, a well-established method to computationally probe the structure and dynamics of biological macromolecules were performed. Generally, it is considered more reliable to employ an average structure of the last 1 ns in the simulation compared to the use of a single crystal structure in an MD simulation. Further analyses of the MD simulation reveal a variable active site pocket and provide an attractive alternative for structural refinement for the treatment of both the ligand and protein as elastic, which allows the adjustment of the conformation upon the ligand binding for the whole system. Therefore, presently, 5 ns simulations of the docked complex structure of gp120 with ligand 149 were carried out to obtain a dynamical picture of the conformational changes that take place in aqueous solution with the purpose of investigating the conformational alterations that happened in both the ligand 149 and gp120 (Fig. 12).


	Fig. 12 (A) Plot of RMSD of the heavy atoms of the complex in the MD-simulated structures. (B) View of superimposed backbone atoms of the average structure for the MD simulations and the initial structure of the docking for the complex. Compound 149 is represented as carbon-chain in green for the initial complex and carbon-chain in orange for the average structure, respectively.

To check the accuracy of docking and the quality of the chosen conformations, we calculated the heavy atom RMSD between the docked and cocrystallized conformations from the existing X-ray structures. RMSD provides a direct measurement of the structural drift from the initial coordinates as well as the atomic fluctuation over the course of a MD simulation. Fig. 12A depicts the RMSD of the trajectory with respect to the initial structure of the gp120 protein and the small molecule complex ranging from 0 to 0.35 Å. During the initial 3 ns, the RMSD raises and the protein still adapts towards its optimal conformation. After 3 ns, there is no significant difference in the three dimensional structures of the protein and the RMSD of the complex reaches about 0.32 Å which value remains unchanged throughout the simulation. This clearly indicates a metastable conformation after the 3 ns of simulation for docked complex structure. The RMSD analysis confirms that the complex reaches equilibrium after about 3 ns of simulation.

Fig. 12B shows an overlay of the docked structure and the average structure for the last 1 ns, where the green ribbon represents the initial structure of the docked complex, the orange ribbon shows the average structure of the MD simulations, with compound 149 represented as stick and line for the initial complex and the average complex, respectively. According to the analyzed interactions between compound 149 and the receptor, it is noticed that only little difference is observed between the average structure extracted from the MD simulations and the docked model of the complex. Both models are in good agreement with each other, indicating that the MD-simulated low-energy conformation of compound 149 has similar orientation with the docking model.

A statistical analysis of the hydrogen bonding interactions during the course of the MD simulation reveals that some atoms of the ligand and the protein moved closer to or further from each other due to the dynamics in the binding process. As shown in Fig. 13, the majority of key amino acids (within 4.5 Å distance from the ligand) from the average complex structure of MD simulation are quite similar to those obtained from the docking results. Compound 149 anchors into the binding pocket constituted by Asn425, Met426, Glu370, Ile424, Trp427, Phe382, Ser375, Ser256, Thr257, Met475, Ile371, Gly473, Asp474 and Asp368. The important H-bond interactions from MD are shown in Fig. 13. Obviously, three H-bonds are formed in both the docking and MD simulations, i.e., one formed by the NH of indole ring with Asp368 (–O⋯HN, 3.92 Å), other two H-bonds between the oxygen atom at position-23 of compound 149 and Thr257 (–N⋯HN, 3.81 Å) as well as Ser375 (–O⋯HN, 2.56 Å). In addition, due to the rotations of the indole ring and oxygen atoms at 15,16-postions of the ligand, a new H-bond between the oxygen atom at 15-postion and Asn425 is observed. Furthermore, three water-mediated H-bonds are formed in the system and play essential roles in the stabilization of the ligand at the active site. One water-mediated H-bond is formed between the oxygen atom of position-15 of compound 149 and residue Met426. Additionally, the oxygen atom at position-16 is suggested to be involved in several salt bridges with Asp474 and Gly473 of gp120 as well as establish polar contacts with two water molecules. All these interactions indicate that the compound moves much closer to the residues in the binding site and in this way goes deeper into the pocket at the end of the MD simulation.


	Fig. 13 Plot of the MD-simulated structures of the binding site with compound 149. H-bonds are shown as dotted black lines; active site amino acid residues are represented as sticks.

3.6 Analysis of the components of binding free energy

In order to quantitatively evaluate the difference of the affinities of the studied inhibitor to protein, the relationship between the components of binding free energy and the structures of inhibitors needs to be established. To do this, we applied the MM-PBSA, a tool allowing the decomposition of electrostatic solvation free energy into atomic contributions in a straightforward way, to study the energetic implications of the contributions of the ligand to the binding of receptor in the presence of solvent and thermal fluctuations. The binding free energy consists of four parts, namely the vdW energy ΔE_vdw, the electrostatic energy ΔE_{electrostatic}, the non-polar energy ΔG_SA, and the polar energy ΔG_PB/GB.

The binding free energy was decomposed into identifiable energy components, and the detailed contribution of various energy components is also shown in Table 3. As observed in drug receptor complex simulations, the total binding free energies of binding system is −20.92 kcal mol⁻¹. This result reveals that the presence of the ligand favors the binding of gp120. The intermolecular van der Waals interactions were significant contributions for inhibitor binding to protein while the non-polar solvation (ΔE_vdw + ΔG_SA) term is unfavorable for binding in a complex. Thus, the favorable electrostatic contribution associated with solvation is not fully compensated by the unfavorable contribution to the MM energy. When we compared the contribution between the predicted binding free energies and each energy component, it is found that the contribution of ΔE_{electrostatic} is greater than that of ΔE_vdw to ΔE_gas(MM) (ΔE_{electrostatic} + ΔE_vdw) because the variance of ΔE_{electrostatic} (−24.93) is approximately ten larger than that of ΔE_vdw (−44.46). This suggests that the complex formation is driven predominantly by the electrostatic interaction. Therefore, for discussed above, we can conclude that the favorable binding free energy for receptor/ligand complex formation stems predominantly from the polar salvation contribution terms (ΔE_{electrostatic} + ΔG_PB/GB), which correspond to the burial of solvent accessible surface area upon binding, contributed slightly favorably, while nonpolar (ΔE_vdw + ΔG_SA) interactions provide most of all the directional constraints for the complexation, that is, the relative positions of the molecules and so the binding of ligand into gp120 are mainly driven by ΔE_{electrostatic} and ΔG_PB/GB, which makes the single largest contribution in absolute value.

Table 3 Binding free energy and its components for the studied inhibitor with HIV-1 gp120 protein^a

Component	Mean (kcal mol⁻¹)	Std.
a ΔE_vdw, energies shown as contributions from van der Waals energy; ΔE_{electrostatic}, electrostatic energy; ΔG_sol, solvation energy; ΔG_bind, bind free energy; Std., standard error of mean values.
ΔE_vdw	−44.46	0.36
ΔE_{electrostatic}	−24.93	0.62
ΔG_PB/GB	−52.04	0.45
ΔG_SA	3.57	0.01
ΔE_gas(E_MM)	−69.39	0.54
ΔG_sol	48.47	0.45
ΔG_bind	−20.92	0.42

3.7 MD simulations for NBD-556 and NBD-557

NBD-556 and NBD-557 identified by a HIV syncytium formation assay on a drug-like small-molecule chemical library of 33 [thin space (1/6-em)]

000 compounds have been reported exhibiting single digit micromolar potency against selected HIV-1 laboratory strains with minimal cytotoxicity.¹⁵ In order to investigate the binding process of the two novel small molecules NBD-556 and NBD-557, MD simulations were performed on the docking complexes of the two compounds to further investigate ligand–receptor interactions in the binding process. As shown in Fig. 14, it is found that both compounds are situated deep inside the phe43 cavity. The NBD-556 at the binding pocket is in line with its original co-crystallized (3TGS) conformation, indicating an acceptable MD reliability of the procedure. In addition, in order to understand the dynamical flexibilities of gp120 and the interactions between protein and ligand, hydrogen bonds for compounds NBD-556 and NBD-557 are also analyzed. The binding mode for compound NBD-557 is similar to that of NBD-556. Two H-bonds are formed in both the NBD-556 and NBD-557 with the Gly473 and Asn425, respectively. Additionally, one H-bond is found between the oxygen atoms at NBD-556 with Asn425, while this oxygen atom at NBD-557 is found to be involved in several salt bridges between Met426/Glu429 and a water molecule. By comparing entire MD simulation trajectories for the two compounds NBD-556 and NBD-557 with compound 149 of our results in the MD simulation, quite similar interaction mechanisms are found between the three binding modes. This shows that they complement and validate each other. In spite of the discrepancies shown between these three modes, there is some generality in binding interactions: a salt bridge with Met426, several H-bonds with Gly473 and Asn425.


	Fig. 14 Superimposition of the MD-simulated structures of the binding site with compounds NBD-556 (green) and NBD-557 (violet) into HIV-1 gp120 (PDB code: 3TGS).

The binding affinities of both compounds with HIV-1 gp120 were calculated using the MM-PBSA method. Table 4 shows the binding free energies and energy components, i.e., electrostatic, van der Waals, nonpolar solvation, and polar solvation components. It can be seen that there is a slight difference between the calculated binding free energies of NBD-556 and NBD-557, which well corresponds to the difference in their binding affinities.⁷⁷ A quick look at Table 4 indicates that the contribution of ΔE_vdw is higher than ΔE_{electrostatic} on the final estimated ΔG_bind. The van der Waals and electrostatic terms provide major favorable contributions to the inhibitor binding, whereas polar solvation terms are a disadvantageous ingredient for the two compounds. Nonpolar solvation terms mainly contribute for the ΔG_bind. By comparison of the two compounds, it shows that the sum of electrostatic energy including electrostatic interaction and nonpolar contribution to solvation, is similar in both systems. In addition, the sum of van der Waals interaction and the nonpolar solvation contribution of the NBD-556 is less than that of NBD-557, and the difference can partly explain why the H-bonds change of NBD-556 at binding pocket leads to its reduced binding affinity.

Table 4 Binding free energy and its components for the NBD-556 and NBD-557 with HIV-1 gp120 protein^a

Component	NBD-556	NBD-557
a ΔE_vdw, energies shown as contributions from van der Waals energy; ΔE_{electrostatic}, electrostatic energy; ΔG_sol, solvation energy; ΔG_bind, bind free energy.
ΔE_vdw	−41.06	−44.24
ΔE_{electrostatic}	−14.06	−14.95
ΔG_PB/GB	40.80	37.27
ΔG_SA	−3.13	−2.89
ΔE_gas(E_MM)	−55.12	−59.19
ΔG_sol	37.67	34.38
ΔG_bind	−17.45	−24.81

Experimental studies clearly demonstrated that the IC₅₀ value of NBD-556 is higher than that of NBD-557.²⁴ At the same time, the total binding free energy between NBD-557 and gp120 is −24.81 kcal mol⁻¹ which is bigger than that in NBD-556 (−17.45 kcal mol⁻¹). Thus, the calculated binding free energies are greatly compatible with the experimental ones. Although prime MM-PBSA did not accurately predict the binding affinities, it was successful in ranking the compounds according to relative activity.

In 3D-QSAR model, electrostatic field is the largest contribution to the activity, and in MD simulations, electrostatic interactions also provided significant contribution. MD simulation could not only verify the reliability of molecular docking but also give more detailed ligand–receptor interactions. These results give an atomic insight to the possible binding mode of the small molecule inhibitors, which may benefit the understanding of the mechanism of this kind of inhibitors and may be useful for rational drug design.

4. Discussion

4.1 GA division versus SOM and random approach

We have also generated 3D-QSAR models using SOM and random division for comparison. In the current work, to investigate the descriptor space, totally 1664 2D descriptors were calculated using the Dragon software package (Dragon Professional, version 5.4) for the inhibitors. The total 279 compounds were divided into a training set of 210 molecules for generating the subsequent QSAR models and a test set of 69 molecules for evaluating the predictive quality of the models. Fig. 15 demonstrates the distribution of the molecules, where the black font denotes the training set and red font stands for the test set, respectively. Inspecting this figure, the training set molecules present a sufficient coverage of the 2D-space occupied by the whole data set, which provides a sound proof for the effectiveness of diversity sampling. It is also worth to mention that the descriptors involved in diversity sampling and principal component analysis are different, which implies sufficient diversity and representativeness of the training set molecules in various descriptor spaces. All these in formations confirm that: (1) representative points of the test set are close to those of the training set. (2) The training and test sets uniformly fill the whole chemical space, indicating that the rationality of the training and test compounds by SOM splitting.


	Fig. 15 SOM analysis of the structural descriptors for the human HIV-1 inhibitors. Black and red fonts denote the training and test set compounds, respectively.

The random selection was also used for separating the dataset into a training set and a test set. The advantage of this approach is that the test set compounds are “unknown” to the models since they are excluded from the model development procedure, especially the variable selection.⁴¹ The selection was performed maintaining the class proportions, that is, the number of test molecules of each class is proportional to the number of training molecules of that class. The CoMFA and CoMSIA models exhibit good predictivity with low residuals for these chemicals, indicating that the random division is reliable.

Similar to GA splitting, the ligand-based alignment (Alignment-I) leads to models with larger Q², R_ncv², R_pred² than those obtained from the other two alignments in both the random and SOM division. Hence, we focus on the ligand-based model for further research. The results from the SOM and random analysis are given in Table 5. In SOM model, the CoMFA model yields the cross-validated Q² = 0.53 with 8 components, non-cross-validated R_ncv² of 0.84 and a predictive R_pred² of 0.84, with steric and electrostatic field contributions of 68.20% and 31.80%, respectively; the CoMSIA model yields the Q² = 0.54 with 7 components, R_ncv² = 0.82 and a predictive R_pred² = 0.83, with steric, electrostatic, hydrophobic and field contributions of 14.60%, 26.50%, 35.10% and 23.80%, respectively. In random selection, the CoMFA and CoMSIA Q² were found to be 0.56 and 0.56, respectively, suggesting that the CoMFA and CoMSIA models are statistically significant. The non-cross validated regression gave an R_ncv² of 0.90 and 0.81, and is able to predict the affinity of the compounds with an R_pred² of 0.75 and 0.83, respectively, suggesting that the CoMFA and CoMSIA models are able to accurately predict the affinity of compounds of ∼75% and ∼83%. When compared with the results obtained by the GA selection method, the SOM and random splitting exhibit almost equal statistical confidence and productivity. The distribution of experimental and predicted pEC₅₀ values for training and test sets according to the CoMFA and CoMSIA model from SOM and random are represented (Fig. 16). According to these plots, the predicted pEC₅₀ values are in correlation with the experimental ones within the tolerable error range.

Table 5 Summary of ligand-based CoMFA and CoMSIA models built based on GA splitting, random division and SOM splitting methods

PLS statistics	GA splitting		SOM		Random division
PLS statistics	CoMFA	CoMSIA	CoMFA	CoMSIA	CoMFA	CoMSIA
N	10	7	8	7	10	6
Q²	0.53	0.54	0.53	0.54	0.56	0.56
SEP	1.20	1.17	1.14	1.12	1.15	1.14
R_ncv²	0.89	0.83	0.84	0.82	0.90	0.81
SEE	0.57	0.72	0.66	0.69	0.56	0.75
F	169.81	141.21	131.69	135.39	176.32	147.52
R_pred²	0.85	0.83	0.84	0.83	0.75	0.83

Contribution (%)
Steric	67.60	14.80	68.20	14.60	67.30	15.00
Electrostatic	32.40	24.70	31.80	26.50	32.70	25.90
Hydrophobic	—	38.30	—	35.10	—	37.00
HB acceptor	—	22.20	—	23.80	—	22.20


	Fig. 16 The experimental versus the predicted pEC₅₀ values for ligand-based CoMFA and CoMSIA models from SOM and random division.

The models derived from GA, SOM and random are all quite satisfactory with respect to statistical significance and actual predictive ability, indicating that models based on GA, SOM and random division methods generate significant statistical results for the test sets, and proving the reasonability of GA division method for building the models (Table 5). Close observation of the results in fitting to the statistical parameters in three splitting methods provide the conclusion that the three models developed have a respectable predictive ability based on the structural perception of known HIV-1 inhibitors.⁶⁸

4.2 Comparison of 3D contour maps with MD results

The CoMFA and CoMSIA steric contour maps show a green contour that favors bulky substitutions near the position-2 of oxazole ring and position-19 of piperazinyl ring (Fig. 7A and 8A), which is confirmed by the docking and MD results that bulky substitutions at this position will have an unfavorable steric interaction with Asp368, Asn425, Met426, Trp427 and Ile371. It seems that the pocket is deep enough and thus bulkier groups at the positions of the piperazinyl and oxazole ring would be able to enhance the interactions with the residues of the pocket.

The active site of HIV-1 is mostly hydrophobic and this is in good agreement with the CoMSIA model. As shown in Fig. 11 and 13, the binding pocket is basically hydrophobic, which correlates well with our previous contour map. For example, the presence of several hydrophilic residues (Ile371, Ile424, Met475, Trp427, Leu116 and Val255) near the position-12 of indole ring coincides well with the white contours in Fig. 8C. Likewise, hydrophilic amino acid residues Gly473, Asn425, Gln428, Ser375 and Thr257 around the position-19 of piperazinyl ring indicate that compounds with hydrophobic groups in this position may reduce the activity. These conclusions are in well agreement with our previous CoMSIA hydrophobic contour maps.

The CoMSIA H-bond acceptor contours show a magenta contour around the nitrogen atom of the pyridine ring which favors the H-bond acceptor atoms. According to the MD results, two H-bonds are formed by the oxygen atom at position-23 (acceptor) with the backbone –NH of Ser375 (–O⋯HN, 2.56 Å) and Thr257 (–N⋯HN, 3.81 Å), which are consistent with the magenta contour located around the position-26 (the HB acceptor favored region) in previous CoMSIA H-bond acceptor contour maps (Fig. 8D). Another magenta contour is also observed near the methyl group of the pyrimidine substitution which points toward the exterior of the active site, suggesting that H-bond acceptor atoms would be favorable for the inhibitory activity. This observation correlates well with the red contour located around the position-1 in previous CoMSIA H-bond acceptor contours. It is also in agreement with the red contour near the same methyl group that favors electronegative groups (Fig. 8B).

All in all, the docking and MD analyses reveal that the steric, electrostatic, hydrophobic and H-bond acceptor interactions between the ligand and these key amino acid residues in the binding pocket of HIV-1 gp120 correlate well with the contour maps, indicating that the QSAR model is reasonable and can offer constructive suggestions for further design of potent HIV-1 inhibitors.

4.3 Docking comparison

The application of the docking method has been accompanied with the development of HIV-1 inhibitors. Several docking studies^{15,22,78–84} on HIV-1 gp120 receptor have been reported contributing to the exploitation and optimization of potent small inhibitors molecules. To gain more insights into the binding mode of the HIV-1 inhibitors and clarify whether they bind to HIV-1 gp120 receptor through similar mechanism with other binding mode, a comparison of the binding mechanism of with other HIV-1 inhibitors with the docking model obtained from our results was conducted presently. Recently, several crystal structures^{15,22,78–84} of HIV-1 gp120 receptor with similar compounds to those we are studying have been developed and 10 representative binding modes are summarized in Table 6. The detailed results are as follows:

Table 6 Important information of docking studies performed by other researchers

Researchers	PDB ID	Binding site	Binding interactions	Crucial residues
Caporuscio et al.⁷⁸	2B4C	Phe43 cavity	H-bond, hydrophobic effect	Tyr384
Uttekar et al.⁷⁹	2B4C	V3 loop	H-bond, hydrophobic effect	Gly312, Gln328, Arg298
LaLonde et al.⁸⁰	1G9M	Phe43 cavity	Bridging hydrogen bond, hydrophobic effect	Water, Gly473, Trp427, Ser375
Tintori et al.⁸¹	3TGS	Phe43 cavity	H-bond, hydrophobic effect	Water, Asp368, Gly473
Tintori et al.⁸¹	1G9M	Phe43 cavity	H-bond, hydrophobic effect	Water, Asp368, Gly473, Asn425
Balupuri et al.⁸²	4DKR	Phe43 cavity	H-bond, hydrophobic effect	Val255
Schader et al.²²	3SE8	Phe43 cavity	H-bond, hydrophobic effect	Trp427, Thr283, Thr455
Balupuri et al.⁸³	4DKR	Phe43 cavity	Hydrophobic effect	Ser375, Phe376, Asn377, Phe382, Asn425
Sivan et al.¹⁵	1RZJ	Phe43 cavity	H-bond, hydrophobic effect	Asp368
Gadhe et al.⁸⁴	1G9N	Phe43 cavity	H-bond, hydrophobic effect	Ser375

To the best of our knowledge so far,^{15,22,78–84} two HIV-1 gp120 binding pockets (listed as Phe43 cavity and V3 loop, respectively) have been proposed. The position of the Phe43 cavity is between the inner domain and bridging sheet (the detail is illustrated in the methods) while the V3 loop (residues 296–330) is beside the bridging sheet. Phe43 cavity represents the conventional binding cavity that almost all docking researches have referred to where a typical interaction of H-bond and hydrophobic are found with several crucial residues. With regard to V3 loop, it was introduced by Uttekar et al.⁷⁹ in 2012. Docking studies revealed that residues Gly321, Gln328 and Arg298 are participated in the interaction with inhibitors as well forming hydrogen bonds indicating that they are crucial in the biding mode.⁷⁹

In addition, with regard to the majority of docking results that show binding effect in Phe43 cavity, the docking modes mainly can be sorted into four classes according to various configurations of the small ligands. We use these three modes here for further analysis and compare them with our own binding model. Fig. 17 shows the binding mode of compounds NBD-556, BMS-599793, SU1 (BMS derivative), compound 149 (our docking results) as representatives of the four promising scaffolds identified. In fact, common characteristics share among the four binding modes that the prominent interaction is H-bonds and hydrophobic effect. All the compounds have been proven to occupy approximately the same region in the deep groove into the domains of Phe43 cavity.


	Fig. 17 Binding modes of HIV-1 gp120 selected from docking results. (A–C) The interaction maps of Phe43 cavity obtained by Tintori et al.,⁸¹ Schader et al.,²² and Sivan et al.¹⁵ (D) Features obtained from our present work.

The differences between the four binding modes are mainly the H-bond network including the residues referred to the pattern of interaction. Analysis of the binding mode of NBD-556 (Fig. 17A) showed that the oxalamide moiety made two H-bonds with the backbone carbonyl oxygen atoms of Asn425 and Gly473.⁸¹ The tetramethylpiperidine group was suggested to be involved in a salt bridge with Asp368 of gp120 and established polar contacts with a water molecule.⁸¹ As illustrated in Fig. 17B, the proposed binding mode of BMS-599793 encompassed three H-bonds with Trp427, Thr283 and Thr455.²² Binding pose of most potent derivative of BMS is shown in Fig. 17C, it clearly characterizes that the hydrophobic interaction at this cavity plays a major role in binding of inhibitors to gp120 receptor.¹⁵ Hydrogen bond interaction is seen with Asp368. We do not find the H-bonds between the Asn425, Gly473, Trp427, Thr283 and Thr455, which exist in the NBD-556 and BMS-599793. The result may be attributed to the fact that the limit volume of the binding cavity is hard to accommodate the larger group of the ligand. As shown in Fig. 17D, three H-bonds are observed with the residue Asp368, Thr257 and Ser375. Moreover, the predicted complex is stabilized by additional water-mediated hydrogen bonding interactions between the oxygen atom at position-16 and residue Met426, forming an original H-bond. By comparing the three binding mode with our docking results, it is found that their interaction mechanisms are quite similar to each other, showing that they complement and validate each other. The dock pose analysis of the existing inhibitors of HIV-1 gp120 revealed important binding requirements, these include the need of a hydrophobic group that can interact with the hydrophobic cavity of Phe43 and also have hydrogen bonds that can interact with residues Trp427, Gly473, Ser375 and Asp368. These may give an insight into the binding of different types of HIV-1 inhibitors and may help provide a new strategy for a ligand based technique exploiting fragment features in combination with structure-guided identification of novel scaffolds.

5. Conclusion

In this study, several computational methods including 3D-QSAR, molecular docking and MD simulations were performed to identify the structural features of 279 HIV-1 attachment inhibitors. The SPL code of GA-based splitting was also applied to create reasonable models. The combined analysis has led to the development of two satisfactory i.e., the CoMFA (Q² = 0.53, R_ncv² = 0.89, R_pred² = 0.85) and CoMSIA models (Q² = 0.54, R_ncv² = 0.83, R_pred² = 0.83), which show both significant internal and external predictive powers. Additionally, the 3D contour maps generated from the optimal models provide useful concepts about the types of the protein environment surrounding the inhibitors. Furthermore, a good consistency is also observed among the QSAR models, docking and the MD results. Overall, our findings are summarized as follows (Fig. 18):


	Fig. 18 The interaction features of ligand 149 with HIV-1 receptor from the MD results, which displays the key structural features impacting the activity obtained from our present work.

(1) A rational division based on GA was employed for building QSAR models. In order to verify the rationality of GA methods, the modeling set was built using SOM and random division for comparison. The three models exhibit almost the same proper statistical results, indicating that models based on GA methods generate significant statistical results for the test sets, proving the reasonability of GA division method for building the models.

(2) The contour maps have identified the key structural requirements of HIV-1 inhibitors for the activity: bulky substituents at position-3, electrostatic groups at positions-2, -3, -14, and -15, hydrophobic groups at position-12 can increase the biological activities.

(3) Based on these interactions form MD, it is summarized that molecules with hydrophobic groups that can be accommodated within the hydrophobic cavity and having some H-bond donor groups that can interact with the hydrophilic region at the entry of the cavity comprising of residues Asp368, Asn425, Thr257, Ser375, Asp474, Gly473 would act as potential inhibitors.

(4) MM-PBSA results show that the binding of the considered compound 149 to HIV-1 protein was mainly driven by polar salvation contribution terms (ΔE_{electrostatic} + ΔG_PB/GB).

Fig. 18 displays the key structural features impacting the activity obtained from our present work. All these results obtained from this study yield reliable and precious informations for further structure-based and ligand-based drug-design optimizations.

Conflict of interest

The author(s) confirm that this article content has no conflicts of interest.

Acknowledgements

Thanks for the financial support given by the National Natural Science Foundation of China (Grant No. 11201049). The research is also supported by the high-performance computing platform of Northwest A&F University, with financial support given by the National Natural Science Foundation of China (Grant No. 10801025 and 30973590), the National High Technology Research and Development Program (“863”) of China (No. 2009AA02Z205).

References

S. P. Korolev, Y. Y. Agapkina and M. B. Gottikh, Acta Naturae, 2011, 2, 12–28 Search PubMed.
G. Férira, A. Hänchenb, K. O. Françoisa, B. Hoorelbekea, D. Huskensa, F. Dettnerb, R. D. Süssmuthb and D. Scholsa, Virology, 2012, 433, 308–319 CrossRef PubMed.
A. Finzi, S. H. Xiang, B. Pacheco, L. P. Wang, J. Haight, A. Kassa, B. Danek, M. Pancera, P. D. Kwong and J. Sodroski, Mol. Cell, 2010, 37, 656–667 CrossRef CAS PubMed.
H. Sharma, X. Cheng and J. K. Buolamwini, J. Chem. Inf. Model., 2012, 52, 515–544 CrossRef CAS PubMed.
Z. K. Sweeney and K. Klumpp, Curr. Opin. Drug Discovery Dev., 2008, 11, 458–470 CAS.
A. K. Ghosh, D. D. Anderson, I. T. Weber and H. Mitsuya, Angew. Chem., Int. Ed., 2012, 51, 1778–1802 CrossRef CAS PubMed.
R. Kong, J. J. Tan, X. Ma, W. Z. Chen and C. X. Wang, Biochim. Biophys. Acta, Proteins Proteomics, 2006, 1764, 766–772 CrossRef CAS PubMed.
N. A. Meanwell, O. B. Wallace, H. Q. Fang, H. Wang, M. Deshpande, T. Wang, Z. W. Yin, Z. G. Zhang, B. C. Pearce, J. James, K. S. Yeung, Z. Qiu, J. J. K. Wright, Z. Yang, L. Zadjura, D. L. Tweedie, S. Yeola, F. Zhao, S. Ranadive, B. A. Robinson, Y. F. Gong, H. G. H. Wang, W. S. Blair, P. Y. Shi, R. J. Colonno and P. F. Lin, Bioorg. Med. Chem. Lett., 2009, 19, 1977–1981 CrossRef CAS PubMed.
R. W. Doms and D. Trono, Genes Dev., 2000, 14, 2677 CrossRef CAS.
A. Jekle, E. Chow, E. Kopetzki, C. Ji, M. J. Yan, R. Nguyen, S. Sankuratri, N. Cammack and G. Heilek, Antiviral Res., 2009, 83, 257–266 CrossRef CAS PubMed.
J. Auwerx, O. Isacsson, J. Söderlund, J. Balzarini, M. Johansson and M. Lundberg, Int. J. Biochem. Cell Biol., 2009, 41, 1269–1275 CrossRef CAS PubMed.
W. T. Zhang, G. Canziani, C. Plugariu, R. Wyatt, J. Sodroski, R. Sweet, P. Kwong, W. Hendrickson and I. Chaiken, Biochemistry, 1999, 38, 9405–9416 CrossRef CAS PubMed.
M. S. Bahia, S. K. Gunda, S. R. Gade, S. Mahmood, R. Muttineni and O. Silakari, J. Mol. Model., 2011, 17, 9–19 CrossRef CAS PubMed.
N. Madani, A. L. Perdigoto, K. Srinivasan, J. M. Cox, J. J. Chruma, J. LaLonde, M. Head, A. B. Smith III and J. G. Sodroski, J. Virol., 2004, 78, 3742–3752 CrossRef CAS.
S. K. Sivan, R. Vangala and V. Manga, Bioorg. Med. Chem., 2013, 21, 4591–4599 CrossRef CAS PubMed.
T. Wang, Z. Yang, Z. X. Zhang, Y. F. Gong, K. A. Riccardi, P. F. Lin, D. D. Parker, S. Rahematpura, M. Mathew, M. Zheng, N. A. Meanwell, J. F. Kadow and J. A. Bender, Bioorg. Med. Chem. Lett., 2013, 23, 213–217 CrossRef CAS PubMed.
T. Wang, Z. Zhang, O. B. Wallace, M. Deshpande, H. Fang, Z. Yang, L. M. Zadjura, D. L. Tweedie, S. Huang, F. Zhao, S. Ranadive, B. S. Robinson, Y. F. Gong, K. Ricarrdi, T. P. Spicer, C. Deminie, R. Rose, H. G. Wang, W. S. Blair, P. Y. Shi, P. F. Lin, R. J. Colonno and N. A. Meanwell, J. Med. Chem., 2003, 46, 4236 CrossRef CAS PubMed.
H. G. Wang, R. E. Williams and P. F. Lin, Curr. Pharm. Des., 2004, 10, 1785–1793 CrossRef CAS.
L. K. Tsou, C. H. Chen, G. E. Dutschman, Y. C. Cheng and A. D. Hamilton, Bioorg. Med. Chem. Lett., 2012, 22, 3358–3361 CrossRef CAS PubMed.
Q. Guo, H. T. Ho, I. Dicker, L. Fan, N. Zhou, J. Friborg, T. Wang, B. V. McAuliffe, H. H. Wang and R. E. Rose, J. Virol., 2003, 77, 10528–10536 CrossRef CAS.
L. Li, H. Chen, R. N. Zhao and J. G. Han, J. Mol. Model., 2013, 19, 905–917 CrossRef CAS PubMed.
S. M. Schader, S. P. Colby-Germinario, P. K. Quashie, M. Oliveira, R. Ibanescu, D. Moisi, T. Mespléde and M. A. Wainberg, Antimicrob. Agents Chemother., 2012, 56, 4257–4267 CrossRef CAS PubMed.
A. Schön, N. Madani, J. C. Klein, A. Hubicki, D. Ng, X. Yang, A. B. Smith III, J. Sodroski and E. Freire, Biochemistry, 2006, 45, 10973–10980 CrossRef PubMed.
Q. Zhao, L. Ma, S. Jiang, H. Lu, S. Liu, Y. X. He, N. Strick, N. Neamati and A. K. Debnath, Virology, 2005, 339, 213–225 CrossRef CAS PubMed.
D. Yadav, S. Paliwal, R. Yadav, M. Pal and A. Pandey, PLoS One, 2012, 7, 1–18 Search PubMed.
V. P. Zambre, V. A. Hambarde, N. N. Petkar, C. N. Patel and S. D. Sawant, RSC Adv., 2015, 5, 23922–23940 RSC.
P. Li, J. X. Chen, J. N. Wang, W. Zhou, X. Wang, B. H. Li, W. Y. Tao, W. Wang, Y. H. Wang and L. Yang, J. Ethnopharmacol., 2014, 151, 93–107 CrossRef CAS PubMed.
J. L. Ru, P. Li, J. N. Wang, W. Zhou, B. H. Li, C. Huang, P. D. Li, Z. H. Guo, W. Y. Tao, Y. Y. Yang, X. Xu, Y. Li, Y. H. Wang and L. Yang, J. Cheminf., 2014, 16, 6–13 Search PubMed.
Y. Yao, X. D. Zhang, Z. Z. Wang, C. L. Zheng, P. Li, C. Huang, W. Y. Tao, W. Xiao, Y. H. Wang, L. Q. Huang and L. Yang, J. Ethnopharmacol., 2013, 25, 619–638 CrossRef PubMed.
N. A. Meanwell, O. B. Wallace, H. Wang, M. Deshpande, B. C. Pearce, A. Trehan, K. S. Yeung, Z. l. Qiu, J. J. K. Wright, B. A. Robinson, Y. F. Gong, H. W. Gene, H. D. Wang, W. S. Blair, P. Y. Shi and P. F. Lin, Bioorg. Med. Chem. Lett., 2009, 19, 5136–5139 CrossRef CAS PubMed.
T. Wang, J. F. Kadow, Z. X. Zhang, Z. W. Yin, Q. Gao, D. D. Wu, D. D. Parker, Z. Yang, L. Zadjura, B. A. Robinson, Y. F. Gong, W. S. Blair, P. Y. Shi, G. Yamanaka, P. F. Lin and N. A. Meanwell, Bioorg. Med. Chem. Lett., 2009, 19, 5140–5145 CrossRef CAS PubMed.
T. Wang, Z. W. Yin, Z. X. Zhang, J. A. Bender, Z. Yang, G. Johnson, Z. Yang, L. M. Zadjura, C. J. D'Arienzo, D. D. Parker, C. Gesenberg, G. A. Yamanaka, Y. F. Gong, H. T. Ho, H. Fang, N. Zhou, B. V. McAuliffe, B. J. Eggers, L. Fan, B. Nowicka-Sans, I. B. Dicker, Q. Gao, R. J. Colonno, P. F. Lin, N. A. Meanwell and J. F. Kadow, J. Med. Chem., 2009, 52, 7778–7787 CrossRef CAS PubMed.
K. S. Yeung, Z. L. Qiu, Q. F. Xue, H. Q. Fang, Z. Yang, L. Zadjura, C. J. D'Arienzo, B. J. Eggers, K. Riccardi, P. Y. Shi, Y. F. Gong, M. R. Browning, Q. Gao, S. Hansel, K. Santone, P. F. Lin, N. A. Meanwell and J. F. Kadow, Bioorg. Med. Chem. Lett., 2013, 23, 198–202 CrossRef CAS PubMed.
K. S. Yeung, Z. L. Qiu, Z. W. Yin, A. Trehan, H. Q. Fang, B. Pearce, Z. Yang, L. Zadjura, C. J. D'Arienzo, K. Riccardi, P. Y. Shi, T. P. Spicer, Y. F. Gong, M. R. Browning, S. Hansel, K. Santone, J. Barker, T. Coulter, P. F. Lin, N. A. Meanwell and J. F. Kadow, Bioorg. Med. Chem. Lett., 2013, 23, 203–208 CrossRef CAS PubMed.
K. S. Yeung, Z. L. Qiu, Z. Yang, L. Zadjura, C. J. D'Arienzo, M. R. Browning, S. Hansel, X. S. Huang, B. J. Eggers, K. Riccardi, P. F. Lin, N. A. Meanwell and J. F. Kadow, Bioorg. Med. Chem. Lett., 2013, 23, 209–212 CrossRef CAS PubMed.
J. A. Bender, Y. Zhong, B. Eggers, Y. F. Gong, P. F. Lin, D. D. Parker, S. Rahematpura, M. Zheng, N. A. Meanwell and J. F. Kadow, Bioorg. Med. Chem. Lett., 2013, 23, 218–222 CrossRef CAS PubMed.
H. Yu, J. X. Chen, X. Xu, Y. Li, H. H. Zhao, Y. P. Fang, X. X. Li, W. Zhou, W. Wang and Y. H. Wang, PLoS One, 2012, 7, e37608 CAS.
W. Zhou, C. Huang, Y. Li, J. Y. Duan, Y. H. Wang and L. Yang, Toxicology, 2013, 304, 173–184 CrossRef CAS PubMed.
P. Ambure and K. Roy, RSC Adv., 2014, 4, 6702–6709 RSC.
A. Golbraikh, M. Shen, Z. Y. Xiao, Y. D. Xiao, K. H. Lee and A. Tropsha, J. Comput.-Aided Mol. Des., 2003, 17, 241–253 CrossRef CAS.
Y. P. Castillo, C. Lazar, J. Taminau, M. Froeyen, M. Á. Cabrera-Pérez and A. Nowé, J. Chem. Inf. Model., 2012, 52, 2366–2386 CrossRef PubMed.
M. Taha, A. Qandil, D. Zaki and M. AlDamen, Eur. J. Med. Chem., 2005, 40, 701–727 CrossRef CAS PubMed.
T. Kohonen, Biol. Cybern., 1982, 43, 59–69 CrossRef.
J. Vesanto and E. Alhoniemi, IEEE Trans. Neural Network, 2000, 11, 586–600 CrossRef CAS PubMed.
P. Shah, S. Kumar, S. Tiwari and M. I. Siddiqi, J. Chem. Biol., 2012, 5, 91–103 CrossRef PubMed.
E. Papa and P. Gramatica, SAR QSAR Environ. Res., 2008, 19, 655–668 CrossRef CAS PubMed.
A. C. Good and M. A. Hermsmeier, J. Chem. Inf. Model., 2007, 47, 110–114 CrossRef CAS PubMed.
J. Gasteiger and M. Marsili, Tetrahedron, 1980, 36, 3219–3228 CrossRef CAS.
A. Warshel and A. Dryga, Proteins: Struct., Funct., Bioinf., 2011, 79, 3469–3484 CrossRef CAS PubMed.
R. S. McCoy and S. B. Braun-Sand, Theor. Chem. Acc., 2012, 131, 1293 CrossRef.
A. Warshel and C. N. Schutz, Proteins: Struct., Funct., Genet., 2001, 44, 400–417 CrossRef PubMed.
A. Warshel, P. K. Sharma, M. Kato and W. W. Parson, Biochim. Biophys. Acta, Proteins Proteomics, 2006, 1764, 1647–1676 CrossRef CAS PubMed.
R. Borštnar, M. Repič, S. C. L. Kamerlin, R. Vianello and J. Mavri, J. Chem. Theory Comput., 2012, 8, 3864–3870 CrossRef.
M. Repič, M. Purg, R. Vianello and J. Mavri, J. Phys. Chem. B, 2014, 118, 4326–4332 CrossRef PubMed.
Y. Guo, J. Xiao, Z. Guo, F. Chu, Y. Cheng and S. Wu, Bioorg. Med. Chem. Lett., 2005, 13, 5424–5434 CrossRef CAS PubMed.
M. Clark, R. D. Cramer and N. Vanopdenbosch, J. Comput. Chem., 1989, 10, 982–1012 CrossRef CAS PubMed.
G. Jones, P. R. Willett and C. Glen, J. Mol. Biol., 1995, 245, 43–53 CrossRef CAS.
G. Jones, P. Willett, R. C. Glen, A. R. Leach and R. Taylor, J. Mol. Biol., 1997, 267, 727–748 CrossRef CAS PubMed.
Y. D. Kwon, A. Finzi, X. L. Wu, C. Dogo-Isonagie, L. K. Lee, L. R. Moore, S. D. Schmidt, J. Stuckey, Y. P. Yang, T. Q. Zhou, J. Zhu, D. A. Vicic, A. K. Debnath, L. Shapiro, C. A. Bewley, J. R. Mascola, J. G. Sodroski and P. D. Kwong, Proc. Natl. Acad. Sci. U. S. A., 2012, 109, 5663–5668 CrossRef CAS PubMed.
P. A. Kollman, I. Massova, C. Reyes, B. Kuhn, S. H. Huo, L. Chong, M. Lee, T. Lee, Y. Duan, W. Wang, O. Donini, P. Cieplak, J. Srinivasan, D. A. Case and T. E. Cheatham, Acc. Chem. Res., 2000, 33, 889–897 CrossRef CAS PubMed.
H. Tzoupis, G. Leonis, S. Durdagi, V. Mouchlis, T. Mavromoustakos and M. G. Papadopoulos, J. Comput.-Aided Mol. Des., 2011, 25, 959–976 CrossRef CAS PubMed.
Z. G. Zhou, M. Madrid, J. D. Evanseck and J. D. Madura, J. Am. Chem. Soc., 2005, 127, 17253–17260 CrossRef CAS PubMed.
Y. Zhao, W. Li, J. Zeng, G. Liu and Y. Tang, Proteins, 2008, 72, 635–645 CrossRef CAS PubMed.
X. L. Shen, M. Takimoto-Kamimura, J. Wei and Q. Z. Gao, J. Mol. Model., 2012, 18, 203–212 CrossRef CAS PubMed.
X. Li, L. Ye, X. Wang, W. Shi, H. Liu, X. Qian, Y. Zhu and H. Yu, Chemosphere, 2013, 92, 795–802 CrossRef CAS PubMed.
C. Muñoz, F. Adasme, J. H. Alzate-Morales, A. Vergara-Jaque, T. Kniess and J. Caballero, J. Mol. Graphics Modell., 2012, 32, 39–48 CrossRef PubMed.
Z. Y. Zhang, L. Y. An, W. X. Hu and Y. H. Xiang, J. Comput.-Aided Mol. Des., 2007, 21, 145–153 CrossRef CAS PubMed.
R. S. Nayana, S. K. Bommisetty, K. Singh, S. K. Bairy, S. Nunna, A. Pramod and R. Muttineni, J. Chem. Inf. Model., 2009, 49, 53–67 CrossRef CAS PubMed.
S. Lu, H. C. Liu, Y. D. Chen, H. L. Yuan, S. L. Sun, Y. P. Gao, P. Yang, L. Zhang and T. Lu, Int. J. Mol. Sci., 2011, 12, 8713–8739 CrossRef CAS PubMed.
A. Warshel, Angew. Chem., 2014, 53, 10020–10031 CrossRef CAS PubMed.
A. Warshel and S. C. L. Kamerlin, Proteins: Struct., Funct., Bioinf., 2010, 78, 1339–1375 Search PubMed.
A. Warshel and S. C. L. Kamerlin, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2011, 1, 30–45 CrossRef PubMed.
A. Warshel, Computer Modeling of Chemical Reactions in Enzymes and Solutions, Wiley, 1997 Search PubMed.
M. Bradley, J. Chem. Inf. Comput. Sci., 2001, 41, 1301–1307 CrossRef CAS PubMed.
S. K. Tripathi, C. Selvaraj, S. K. Singh and K. K. Reddy, Med. Chem. Res., 2012, 21, 4239–4251 CrossRef CAS.
N. Madani, A. Schön, A. M. Princiotto, J. M. LaLonde, J. R. Courter, T. Soeta, D. Ng, L. P. Wang, E. T. Brower, S. H. Xiang, Y. D. Kwon, C. C. Huang, R. Wyatt, P. D. Kwong, E. Freire, A. B. Smith III and J. Sodroski, Structure, 2008, 16, 1689–1701 CrossRef CAS PubMed.
Y. Yang, J. Qin, H. Liu and X. Yao, J. Chem. Inf. Model., 2011, 51, 680–692 CrossRef CAS PubMed.
F. Caporuscio, A. Tafi, E. González, F. Manetti, J. A. Esté and M. Botta, Bioorg. Med. Chem. Lett., 2009, 19, 6087–6091 CrossRef CAS PubMed.
M. M. Uttekar, T. Das, R. S. Pawar, B. Bhandari, V. Menon, G. S. K. Nutan and S. V. Bhat, Eur. J. Med. Chem., 2012, 56, 368–374 CrossRef CAS PubMed.
J. M. Lalonde, M. A. Elban, J. R. Courter, A. Sugawara, T. Soeta, N. Madani, A. M. Princiotto, Y. D. Kwon, P. D. Kwong, A. Schön, E. Freire, J. Sodroski and A. B. Smith III, Bioorg. Med. Chem., 2011, 19, 91–101 CrossRef CAS PubMed.
C. Tintori, M. Selvaraj, R. Badia, B. Clotet, J. A. Esté and M. Botta, ChemMedChem, 2013, 8, 475–483 CrossRef CAS PubMed.
A. Balupuri and S. J. Cho, J. Chosun Natural Sci., 2013, 3, 138–142 CrossRef.
A. Balupuri, C. G. Gadhe, P. K. Balasubramanian, G. Kothandan and S. J. Cho, Arch. Pharm. Res., 2014, 37, 1001–1015 CrossRef CAS PubMed.
C. G. Gadhe, G. Kothandan, T. Madhavan and S. J. Cho, Med. Chem. Res., 2012, 21, 1892–1904 CrossRef CAS.

Footnote

† Electronic supplementary information (ESI) available. See DOI: 10.1039/c5ra04299b

Click here to see how this site uses Cookies. View our privacy policy here.