In silico de novo design of novel NNRTIs: a bio-molecular modelling approach

Nilanjana Jain (Pancholi)a, Swagata Guptab, Neelima Saprec and Nitin S. Sapre*a
aDepartment of Applied Chemistry, SGSITS, Indore, India. E-mail: sukusap@yahoo.com; Tel: +91 9826607444
bDepartment of Chemistry, Govt. BLPPG College, MHOW, India
cDepartment of Mathematics and Computational Sc., SGSITS, Indore, India

Received 29th November 2014 , Accepted 8th January 2015

First published on 8th January 2015


Abstract

Six novel non-nucleoside reverse transcriptase inhibitors exhibiting high efficacy are designed using in silico mathematical modelling techniques, and the results are validated using a docking technique. An in silico assessment of the interaction potential and structural requirements of 5-alkyl-2-alkylamino-6-(2,6-difluorophenylalkyl)-3,4-dihydropyrimidin-4(3H)-one (DABO) analogues in the non-nucleoside inhibitor binding pocket is also performed. Efficient use of 3D-pharmacophoric (SALL, HDALL, HAALL and RALL) and 3D-averaged alignment (C log P and dipole moment) descriptors is made in this study. Chemometric analyses using support vector machine (SVM), back propagation neural network (BPNN) and multiple linear regression (MLR) are performed. The relative potential of these chemometric methods is also assessed, and the results, SVM (r = 0.939, MSE = 0.071, q2 = 0.876), BPNN (r = 0.923, MSE = 0.104, q2 = 0.818) and MLR (r = 0.912, MSE = 0.096, q2 = 0.832), indicate that SVM describes the relationship between the descriptors and inhibitory activity best. The results also suggest that there is a non-linear relationship between the descriptors and inhibitory activity. The study further suggests that isopropyl/propenyl groups as R and R′, an oxobutyl group as X and di- or tri-substitution as R′′ are the best suited substituents for exhibiting better inhibitory activity.


Introduction

Extensive research is under way to develop a cure for infection by human immunodeficiency virus type-1 (HIV-1), one of the most lethal viruses.1–4 The non-nucleoside reverse transcriptase inhibitors (NNRTIs) are prominent members of current combination drug therapy against HIV-1 infection; they exhibit significant potency and are relatively less toxic.5–7 However, the rapid emergence of drug-resistant viral strains has diminished the therapeutic efficacy of these inhibitors.8–10 Recent advances in ligand-based and structure-based drug design (LBDD and SBDD) approaches, coupled with virtual screening, provide robust tools for the design of newer compounds acting against HIV-1 infection.11–15

The reverse transcriptase (RT) enzyme is essential for the conversion of genomic RNA into DNA, and is thus an important target in the drug discovery pipeline to combat HIV-1 infection.16,17 The RT enzyme is a heterodimer made of p66 and p51 subunits, where each subunit contains thumb, palm and finger domains. The p66 subunit contains the functional active site that binds the nucleic acid template primer to the nucleotide triphosphate.18,19 NNRTIs are highly potent and moderately toxic compared to nucleoside reverse transcriptase inhibitors (NRTIs) and do not require cellular activation to inhibit HIV-1 RT.20 These are non-competitive inhibitors that bind to a hydrophobic “pocket” in the p66 subunit of HIV-1 RT, approximately 10 Å away from the polymerase binding site.21 X-ray crystallographic studies have revealed the prominent interactions of the inhibitor within the non-nucleoside inhibitor binding pocket (NNIBP) of the protein and have facilitated the design of more effective inhibitors.22 The NNIBP does not exist in the unliganded RT and is produced by binding of the ligand with the side chains of aromatic (including Y181 and Y188) and hydrophobic amino-acid residues of the viral protein.23,24 NNRTI resistance mutations influence the binding of the inhibitors to their binding pocket either by changing the size, shape and polarity of the NNIBP or by affecting the entry of NNRTIs into this pocket.25

Among the various FDA-approved NNRTIs, nevirapine (Viramune/Viramune XR) is a highly effective inhibitor, which has emerged as a key drug for the prevention of viral transmission. Recent studies have suggested that nevirapine is more effective in crossing the blood–brain barrier.26 Another NNRTI, delavirdine, which is bulkier than nevirapine, exhibits better interactions with RT, viz. hydrogen bond interactions with K103 and hydrophobic interactions with P236.27 Efavirenz (Sustiva, Stocrin, EFV, DMP-266) is a potent NNRTI that binds to HIV-1 RT at a site distinct from the polymerase catalytic site, and it has also been found to be effective when combined with nevirapine, nelfinavir or indinavir.28,29 Etravirine (ETR/TMC125), a second-generation NNRTI, exhibits an enhanced barrier to resistance and is found to be extremely effective in achieving viral suppression as well as improving immunity in treatment-experienced HIV-infected patients.30 Among the newly discovered NNRTIs, rilpivirine is an antiretroviral exhibiting better bioavailability and easier formulation and administration compared to etravirine.31,32 The FDA-approved NNRTIs are listed in Table 1.

Table 1 Chronological approval status of NNRTIsa
S. no. | Name | Structure | Approval status
1 | Nevirapine (BI-RG-587)/Viramune/Viramune XR | 11-cyclopropyl-5,11-dihydro-4-methyl-6H-dipyrido[3,2-b:2′,3′-e][1,4]diazepin-6-one | Approved in 1996/extended release 2011
2 | Delavirdine/DLV/Rescriptor | N-[2-({4-[3-(propan-2-ylamino)pyridin-2-yl]piperazin-1-yl}carbonyl)-1H-indol-5-yl]methanesulfonamide | Approved in 1997
3 | Efavirenz (DMP266)/Stocrin™/Sustiva™ | (S)-6-chloro-4-(cyclopropylethynyl)-1,4-dihydro-4-(trifluoromethyl)-2H-3,1-benzoxazin-2-one | Approved in 1998
4 | R165335/Etravirine/TMC125/Intelence™ | 4-[[6-amino-5-bromo-2-[(4-cyanophenyl)amino]-4-pyrimidinyl]oxy]-3,5-dimethylbenzonitrile | Approved in 2008
5 | R278474/Rilpivirine/TMC278/Edurant | 4-[[4-[[4-[(1E)-2-cyanoethenyl]-2,6-dimethylphenyl]amino]-2-pyrimidinyl]amino]benzonitrile | Approved in 2011
a http://aidsinfo.nih.gov/education-materials/fact-sheets/21/58/fda-approved-hiv-medicines, updated 28 November 2014.


ATP analogs are also reported to be used for inhibition of HIV transcription.33 The latest entrant in the armamentarium of anti-retrovirals, currently undergoing clinical trials, is doravirine (MK-1439), which has robust antiviral activity and better tolerability.34 The 5-alkyl-2-alkylamino-6-(2,6-difluorophenylalkyl)-3,4-dihydropyrimidin-4(3H)-one (DABO) analogs effectively inhibit the replication of a variety of HIV-1 strains at the reverse transcriptase step.35 Efforts are ongoing to make structural changes so that the DABO analogs demonstrate enhanced potency. Several classes of rationally designed broad-spectrum NNRTI derivatives, such as dihydroalkyloxybenzyloxopyrimidines (O-DABOs), dihydroalkylthiobenzyloxopyrimidines (S-DABOs), dihydroalkylamino difluorobenzyloxopyrimidines (NH-DABOs), N,N-disubstituted amino(2,6-difluorophenylalkyl)-3,4-dihydropyrimidin-4(3H)-ones (F2-N,N-DABOs) and dihydro(alkylthio)(naphthylmethyl)oxopyrimidines (DATNOs), have been synthesized.36–42

The common structure of DABO consists of a pyrimidinone ring and a difluoro-substituted aromatic ring attached through a substituted CH bridging group. The substituents R and X are attached to the pyrimidinone ring, while R′ is attached to the bridging CH group. Earlier studies on DABO are based on various SAR analyses, where the structural requirements for enhancing the biological activity have been quantitatively analyzed.43–46 Three-dimensional crystal structure analysis of the RT–DABO complex has added new dimensions to the interpretation of the drug–receptor interaction profile, and this has certainly aided in substantiating the SAR analyses for enhanced structural refinement.47–52

The present study deals with the design of novel NNRTIs by performing chemometric analyses of two important types of descriptors, namely (a) 3D pharmacophoric [dipole moment, SALL, HDALL, RALL and HAALL] and (b) 2D average alignment [octanol–water partition coefficient (C log P)] descriptors, in order to understand the interaction potential of 5-alkyl-2-alkylamino-6-(2,6-difluorophenylalkyl)-3,4-dihydropyrimidin-4(3H)-one (DABO) analogs in the NNIBP.53 It is reported that docking templates can be very useful if the 3D structural conformation of a ligand is known.54 These templates account for the vital contribution of the pharmacophores to the structural alignment of the ligand within the NNIBP. These pharmacophore-based templates assist in extracting the requisite information based on the similarity overlap for each individual template point. Ligand flexibility is also considered during the template alignment procedure. The docking engine, guided by the pharmacophoric template groups, searches for poses similar to the reference ligand. The validation procedure focuses on the atomic overlap of each template group's centre with the ligand concerned.

A Gaussian function is used to evaluate how well an atom matches a group definition, based on its distance from the group centres. The four template groups incorporated in the present work are steric (SALL), hydrogen donor (HDALL), hydrogen acceptor (HAALL) and ring (RALL). In addition, two extremely relevant descriptors, the octanol–water partition coefficient (C log P) (an ‘alignment-average’ descriptor) and the dipole moment (μ), a 3D parameter, are included for better assessment of anti-viral activity in the non-nucleoside inhibitor binding pocket (NNIBP).55 Optimization of physicochemical properties such as lipophilicity, together with an increase in ligand-lipophilicity efficiency (LLE), is reported by Mowbray et al.56 These descriptors are selected to characterize the molecular architecture because of their capability to correlate diverse biochemical phenomena of the molecular series concerned. The information obtained is assessed chemometrically to correlate the structural features with the anti-viral activity of the DABO derivatives.57

Chemometric models are well established for quantifying the correlation between selected molecular aspects and their impact on the biological response.58–61 The development of these rational SAR models focuses on the chemical features necessary for a better pharmaco-toxico-kinetic profile of a lead candidate, curtailing irrelevant experimental determinations.62,63 These statistical validation methodologies form the basis of SAR studies, though there are still some limitations.64 The quantitative relationships between a molecular entity and its physicochemical and biological properties appear to be rather complex, and nonlinearity also prevails in many cases. Extracting pivotal information from a SAR model is therefore an aspect that needs careful attention.65,66

In the present study, non-linear (BPNN [back propagation neural network] and SVM [support vector machine]) and linear (MLR [multiple linear regression]) techniques are used for chemometric assessment of the training dataset. The results are validated using a test set. Finally, an external dataset (compounds with approximate inhibitory activity) and an in silico designed virtual dataset (VDS) of molecules are assessed and reported with their predicted activities. External and structural validation of the in silico designed ligands of the VDS is performed using a docking technique. The complete work performed is presented in Scheme 1.


Scheme 1 A scheme presenting the flow of the work.

Materials and methods

Molecular structural data set

The structures of 53 DABO analogues are drawn using ChemDraw Ultra 7.0 (ref. 67) and optimized using the MM2 force field. The pharmacophoric 3D descriptors based on group centre overlap are calculated using Molegro Virtual Docker (MVD),68 while C log P and dipole moment are calculated using the Sybyl-X 2.0 software suite.69

Methodology

Conformational analysis. A conformational search for all the DABO derivatives (36-training, 9-test, 1-outlier, 7-appx., 6-VDS) has been performed using the Sybyl-X 2.0 software suite.69 A random search method is used for the conformational analysis, as is suggested for complex cases. Randomization involves the Cartesian as well as the internal coordinates. The method locates the energy minima of a molecule by making random torsion changes to selected bonds and then minimizing the energy of the resulting geometry. The parameters used for random conformational sampling are: maximum cycles: 200, energy cut-off: 10 kcal mol−1, RMS threshold: 0.10, convergence threshold: 0.05, maximum hits: 6, force field: Tripos (using default settings), charges: Gasteiger–Marsili, with symmetry checking of the conformers. The details of the structures of the compounds, their conformers and energies are given separately (in the ESI).
Generation of 3D descriptors. 3D descriptors (hydrogen donor, hydrogen acceptor, steric and ring) based on group centre overlap are generated using the known 3D conformation of the most active ligand. A template represents a collection of specific chemical features associated with an atom (in a molecule) that are crucial for binding interactions. The molecules can thus be aligned with the template, and crucial information can be deduced from the similarity overlap.

The Gaussian formula given in eqn (1) is used to determine the amount of overlap for a specific group centre:

$$e = \omega \exp\left(-\frac{d^{2}}{r_{0}^{2}}\right) \qquad (1)$$

where d is the distance from the position of the atom to the centre in the group, ω is a weight factor for the template group and r0 is a distance parameter.70,54
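The overlap calculation described above can be sketched in a few lines. This is a minimal illustration, assuming the Gaussian has the form ω·exp(−d²/r0²); the function names, the default values of `weight` and `r0`, and the averaging over template centres in `group_similarity` are hypothetical choices made for illustration and are not taken from Molegro Virtual Docker.

```python
import math

def gaussian_overlap(d, weight=1.0, r0=1.0):
    """Overlap contribution of one atom at distance d (in Å) from a template centre.

    Assumed form: weight * exp(-d^2 / r0^2). weight and r0 are illustrative defaults.
    """
    return weight * math.exp(-(d * d) / (r0 * r0))

def group_similarity(atom_positions, centres, weight=1.0, r0=1.0):
    """Hypothetical normalised score: for each template centre take the best-matching
    atom, then average over centres, so a perfect superposition scores `weight`."""
    total = 0.0
    for centre in centres:
        total += max(
            gaussian_overlap(math.dist(atom, centre), weight, r0)
            for atom in atom_positions
        )
    return total / len(centres)

# A ligand whose atoms sit exactly on the template centres scores 1.0;
# shifting every atom by 0.5 Å lowers the score.
centres = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
shifted = [(x + 0.5, y, z) for (x, y, z) in centres]
print(group_similarity(centres, centres))
print(group_similarity(shifted, centres))
```

With this form, the contribution of an atom falls to 1/e of its maximum when d equals r0, which is one way to read r0 as a characteristic distance for the template group.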

Dipole moment (μ). The presence of polarity in a molecule and its magnitude are crucial parameters in determining specific binding interactions within the binding pocket of an enzyme. Dipole moment is a 3D electrostatic descriptor represented by the vector μ, reported in Debye units (D) and calculated using eqn (2).71,72

$$\vec{\mu} = -\sum_{i} \langle \phi_{i} \,|\, \hat{r} \,|\, \phi_{i} \rangle + \sum_{a} Z_{a} \vec{R}_{a} \qquad (2)$$

where ϕi are the molecular orbitals, r̂ is the electron position operator, Za is the nuclear charge of the a-th atom and R⃗a is the position vector of the a-th atomic nucleus.
Octanol–water partition coefficient (C log P). The octanol–water partition coefficient is the ratio of the equilibrium concentration of a solute in a non-polar solvent to the concentration of the same solute in a polar solvent. The logarithm of the partition coefficient, log P, has been treated as a hydrophobic parameter in the “extra-thermodynamic” Hammett methodology. Generally, 1-octanol is suggested as the non-polar phase. The log P (octanol–water) of a structure is calculated by a substitution method, in which the contribution of a substituent X is defined as given in eqn (3).73,74

πX = log P(C6H5X) − log P(C6H6) (3)
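As a worked illustration of the substituent constant, the chloro substituent can be evaluated from the log P values of chlorobenzene and benzene. The numbers used here are commonly quoted experimental values and are not taken from the present study.

```latex
\pi_{\mathrm{Cl}}
  = \log P(\mathrm{C_6H_5Cl}) - \log P(\mathrm{C_6H_6})
  \approx 2.84 - 2.13 = 0.71
```

A positive π value of this kind indicates that the substituent makes the parent molecule more lipophilic.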

Rekker used a reductionist approach (eqn (4)) to derive the constants for carbon, hydrogen and polar fragments.75 The fragment values (f) and interaction factors (F) had to be identified and evaluated by the relation reported by Hansch.76

$$\log P = \sum_{n} a_{n} f_{n} + \sum_{m} b_{m} F_{m} \qquad (4)$$

Chou and Jurs used a constructionist approach to calculate log P, considering the hydrophobic portions of solutes as “hydrocarbon-like” and defining the carbon and hydrogen fragment values as being truly constant; the value calculated in this way is reported as C log P.77

Chemometric validation methods. To date, a variety of chemometric methods have been developed and used to handle multivariate data analyses, on which the onus of reliable QSAR interpretation lies.78–80 The robustness of a derived model is extremely important when validating the utility of descriptors in deducing the pharmacophoric qualities of the inhibitors. Three techniques, namely SVM, BPNN and MLR, are used in the present study. A brief account of these methods is presented here.
Support vector machine (SVM). SVM is based on the structural risk minimization (SRM) principle, which makes it less sensitive to overfitting of the data.81 The method can be applied to linear as well as nonlinear problems and can be trained quickly.82 SVM has been successful in correlating various quantitative structure–activity/property relationships in computer-aided drug design.83–87 It is a supervised learning method, in which support vectors are used with suitable kernel functions.88–90 For the present study, ν-support vector regression and ε-support vector regression based on LIBSVM are considered, and in each case linear, polynomial, sigmoid and radial basis function kernels are used.
Back propagation neural network (BPNN). Neural networks resemble the network of neurons in the human brain; they can handle complex and non-linear data and thus extract hidden relationships between the dependent and independent variables.91 Rumelhart et al.92 developed the back-propagation neural network (BPNN) as a solution to the problem of training multi-layer perceptrons.

The molecular descriptors are encoded as input neurons, each of which is multiplied by its connection weight. A sigmoid non-linear transfer function is then applied, with a bias term that shifts the transfer function to either side. The results are then multiplied by the output weights, transformed and interpreted.93 The residual error in this supervised learning method, i.e. the difference between the experimental quantity and the network's predicted quantity, is evaluated. This error is propagated backward through the network and the weights are adjusted so that the residual error for the same input pattern is reduced. The cycle is repeated until a relationship is derived or it becomes clear that none exists.94–97 Neural network models are very efficient in handling and extracting non-linear relationships. The most crucial choice in setting up a neural network is the number of neurons in the hidden layer, which is governed by the ratio rho (ρ); this ratio is maintained within the range 1.0–3.0 to get sensible results.98,99
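The forward pass and weight update described above can be sketched as follows. This is a minimal illustration in plain Python, not the software used in the study: the class name `TinyBPNN`, the single learning rate, the absence of momentum, the sigmoid output and the scaling of pEC50 by 10 are all simplifying assumptions. The sketch also counts the adjustable weights so that ρ can be computed, assuming the common definition of ρ as the number of training compounds divided by the number of adjustable weights, and assuming that six descriptor inputs plus one bias unit feed three hidden neurons.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyBPNN:
    """Illustrative one-hidden-layer network trained by back-propagation."""

    def __init__(self, n_in=6, n_hidden=3, seed=0, init=0.20):
        rng = random.Random(seed)
        # Each weight row has one extra entry for the bias unit (constant input 1.0).
        self.w_h = [[rng.uniform(-init, init) for _ in range(n_in + 1)]
                    for _ in range(n_hidden)]
        self.w_o = [rng.uniform(-init, init) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w_h] + [1.0]
        y = sigmoid(sum(w * v for w, v in zip(self.w_o, hb)))
        return xb, hb, y

    def train_step(self, x, target, lr=0.33):
        """One gradient step on the squared error 0.5 * (y - target)^2; returns that error."""
        xb, hb, y = self.forward(x)
        delta_o = (y - target) * y * (1.0 - y)
        # Hidden deltas use the output weights as they were before this update.
        delta_h = [delta_o * self.w_o[j] * hb[j] * (1.0 - hb[j])
                   for j in range(len(self.w_h))]
        self.w_o = [w - lr * delta_o * v for w, v in zip(self.w_o, hb)]
        for j, row in enumerate(self.w_h):
            self.w_h[j] = [w - lr * delta_h[j] * v for w, v in zip(row, xb)]
        return 0.5 * (y - target) ** 2

    def n_weights(self):
        return sum(len(row) for row in self.w_h) + len(self.w_o)

# Descriptors of compound 26 (C log P, SALL, HDALL, HAALL, RALL, mu) from Table 2;
# the target pEC50 is divided by 10 so that it lies inside the sigmoid output range.
x26 = [3.410, 1.000, 1.000, 1.000, 1.000, 2.651]
net = TinyBPNN()
first_error = net.train_step(x26, 8.301 / 10)
for _ in range(200):
    last_error = net.train_step(x26, 8.301 / 10)
rho = 36 / net.n_weights()   # 36 training compounds, 25 adjustable weights
print(first_error, last_error, rho)
```

Under these assumptions the network has 25 adjustable weights, so ρ = 36/25 = 1.44, which lies inside the 1.0–3.0 range quoted above.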

Multiple linear regression (MLR). Multiple linear regression (MLR) is a method in which the values of the regression coefficients (bn) are evaluated using the least squares curve fitting method.100,101
y = b1x1 + b2x2 + b3x3 + … + bnxn + c
where ‘y’ is the dependent variable, ‘x1, x2, …, xn’ are the independent variables, ‘b1, b2, …, bn’ are the regression coefficients and ‘c’ is the constant.

MLR is the most widely used method owing to its speed and easy interpretability. However, for complex systems such as a biological system, a linear combination of descriptor information can often lead to a model with limited accuracy, simply because of the assumption of linearity in the data.
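For reference, the least squares estimate of the regression coefficients can be written in matrix form. This is the standard textbook result rather than a formula quoted from the source; here X is assumed to be the N × (n + 1) matrix of descriptor values with a column of ones appended for the constant c, y is the vector of observed activities and b collects b1, …, bn and c.

```latex
\hat{\mathbf{b}}
  = \arg\min_{\mathbf{b}} \sum_{i=1}^{N} \left( y_i - \mathbf{x}_i^{\top}\mathbf{b} \right)^{2}
  = \left( \mathbf{X}^{\top}\mathbf{X} \right)^{-1} \mathbf{X}^{\top}\mathbf{y}
```

The closed form exists only when the columns of X are linearly independent, which is one reason why strongly inter-correlated descriptors are usually avoided in MLR models.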

Results and discussion

The main aim of the present study is to design novel NNRTIs of HIV-1. Two types of descriptors, (a) 3D pharmacophoric [dipole moment, SALL, HDALL, RALL and HAALL] and (b) 2D average alignment [octanol–water partition coefficient (C log P)], are used to extract their relationship with anti-viral activity using SVM, BPNN and MLR techniques. The inhibitory activity of virtually designed compounds is predicted using the derived relationship. The compounds that exhibit activity higher than the most active reference compound (cpd no. 26, pEC50 = 8.30) are extracted from this virtual dataset.

The ligand selected as reference for template alignment in the docking wizard is compound no. 26 with highest activity (pEC50 = 8.30) in the dataset. Four template groups are taken for comparing the reference ligand with the rest of the ligands of the dataset. These are classified into (i) steric: consisting of all 22 non-hydrogen atoms and are used for shape matching only, (ii) hydrogen donor: consisting of 3 hydrogen donor functional groups, (iii) hydrogen acceptor: consisting of 1 hydrogen bond acceptor group and (iv) ring: constituting 12 atoms, which are part of rings. Any atom closer than 2.0 Å to the existing centre of a template group is accepted as equal to that centre, and the optimal match corresponds to a value of 1.0.

Fig. 1A–D shows all the template groups derived from compound number 26 (pEC50 = 8.30) and the contribution of each specific atom to a template group is also evident from this figure.


Fig. 1 (A–D) Template groups derived from the reference ligand [compound no. 26, (pEC50 = 8.3010)] of DABO derivatives.

A training set of 36 ligands is considered for the 3D similarity-based alignment using template groups and is presented in Table 2. Table 2 presents the substituents (R, R′, X and Y) on the parent structure, the experimental and calculated anti-HIV-1 activities (effective concentration pEC50) and all the descriptors, namely the octanol–water partition coefficient (C log P), the dipole moment (μ) in debye units and the four 3D descriptors based on group centre overlap [steric (SALL), hydrogen donor (HDALL), hydrogen acceptor (HAALL) and ring (RALL)]. Table 2 also presents the compounds of the test set and the outlier. The structure shown in blue colour is the most active ligand, used as a template for estimation of the 3D pharmacophoric descriptors.

Table 2 Substituents (R, R′, X and Y) on the parent structure, anti-HIV-1 activity (effective concentration pEC50), octanol–water partition coefficient (C log P), dipole moment (μ) in debye units, steric (SALL), hydrogen donor (HDALL), hydrogen acceptor (HAALL), ring (RALL) and calculated pEC50 using MLR, BPNN and SVM of DABO analoguesa

a pEC50 = −log EC50 (where EC50 is the effective concentration of a compound required to achieve 50% protection of MT-4 cells against the cytopathic effect of HIV-1). Rows marked α belong to the test set and the row marked β is the outlier.
S. no. R R′ X Y Exp-pEC50 C log P SALL HDALL HAALL RALL μ Calc-pEC50(MLR) Calc-pEC50(BPNN) Calc-pEC50(SVM)
1 H H Iso-propyl H 6.398 2.562 0.699 0.758 0.050 0.718 2.768 6.284 6.063 6.400
2 H H sec-Butyl H 6.886 3.091 0.731 0.715 0.161 0.726 3.348 6.712 6.573 6.760
3 H H Methoxyethyl H 6.097 1.754 0.657 0.510 0.045 0.592 4.692 5.881 5.873 6.090
4 H H Methylthioethyl H 6.523 2.472 0.617 0.429 0.035 0.546 4.191 6.027 6.009 6.187
5 H H Cyclohexyl H 6.854 3.755 0.743 0.879 0.378 0.803 3.342 7.339 7.190 7.384
6 H H Phenyl H 6.538 3.703 0.711 0.667 0.337 0.770 3.060 6.762 6.655 6.750
7 Me H n-Propyl H 7.699 3.231 0.846 0.890 0.713 0.868 2.680 7.409 7.317 7.404
8 Me H Iso-propyl H 7.523 3.011 0.733 0.907 0.215 0.750 3.269 7.058 6.936 7.159
9 Me H n-Butyl H 7.699 3.760 0.669 0.763 0.167 0.658 3.259 7.200 7.057 7.308
10 Me H sec-Butyl H 7.222 3.540 0.706 0.858 0.157 0.657 3.262 7.448 7.285 7.565
11 Me H Cyclohexyl H 7.523 4.204 0.753 0.920 0.384 0.767 3.261 7.848 7.588 7.890
12 Me H Phenyl H 7.222 4.152 0.742 0.748 0.449 0.780 3.000 7.344 7.192 7.311
13 Me H 2,6-F2-phenyl H 6.854 4.488 0.474 0.068 0.490 0.326 1.672 6.619 6.632 6.798
14 Me H 2-Cl-phenyl H 6.886 4.902 0.440 0.064 0.385 0.308 1.799 6.699 6.683 6.929
15 Me H 3-Cl-phenyl H 6.699 4.902 0.463 0.054 0.448 0.322 2.749 6.878 6.772 6.910
16 Me H 4-Cl-phenyl H 7.000 4.902 0.462 0.079 0.487 0.325 3.442 7.038 6.835 6.993
17 Me H 4-Me-phenyl H 6.745 4.651 0.424 0.108 0.497 0.338 2.733 6.691 6.662 6.739
18 Me H Phenylmethyl H 6.301 3.621 0.650 0.524 0.106 0.584 2.699 6.684 6.566 6.733
19 H Me Methyl H 6.959 2.123 0.867 0.940 0.987 0.952 2.591 6.893 6.823 7.044
20 H Me n-Propyl H 7.699 3.181 0.829 0.848 0.745 0.800 2.585 7.511 7.423 7.444
21 H Me Iso-propyl H 7.523 2.961 0.946 0.964 0.901 0.967 2.550 7.557 7.453 7.723
22 H Me n-Butyl H 7.523 3.710 0.697 0.692 0.477 0.564 2.578 7.663 7.527 7.517
23 H Me sec-Butyl H 7.398 3.490 0.870 0.847 0.718 0.808 2.581 7.775 7.615 7.713
24 Me Me Methyl H 7.523 2.572 0.861 0.892 0.888 0.865 2.694 7.282 7.232 7.277
25 Me Me n-Propyl H 8.222 3.630 0.910 0.936 0.875 0.896 2.684 7.986 7.772 8.035
26 Me Me Iso-propyl H 8.301 3.410 1.000 1.000 1.000 1.000 2.651 8.041 7.813 8.286
27 Me Me n-Butyl H 8.000 4.159 0.782 0.811 0.646 0.687 2.676 8.146 7.847 8.004
28 Me Me Methylthioethyl H 6.854 3.320 0.735 0.768 0.582 0.636 3.686 7.713 7.517 7.491
29 Me Me Cyclohexyl H 7.585 4.603 0.834 0.685 0.618 0.856 2.708 7.576 7.369 7.557
30 Me Me Phenyl H 7.620 4.551 0.807 0.694 0.503 0.766 2.404 7.705 7.496 7.672
31 H H Me 4.260 1.149 0.635 0.531 0.123 0.674 2.803 5.017 4.869 4.954
32 H Me Me 5.620 1.548 0.741 0.484 0.905 0.784 2.092 5.605 5.387 5.640
33 H H Cyclopentyl H 7.046 3.196 0.743 0.870 0.331 0.805 3.356 6.984 6.870 7.033
34 Me H Cyclopentyl H 7.699 3.645 0.758 0.923 0.349 0.775 3.268 7.525 7.359 7.572
35 H Me Cyclopentyl H 7.523 3.595 0.557 0.773 0.232 0.506 3.348 7.386 7.271 7.502
36 Me Me Cyclopentyl H 7.699 4.044 0.601 0.696 0.116 0.509 2.748 7.433 7.291 7.580
α37 H H Ethyl H 6.097 2.253 0.692 0.776 0.081 0.738 2.770 6.081 5.830 6.178
α38 H H n-Propyl H 6.959 2.782 0.708 0.630 0.106 0.718 2.768 6.181 5.991 6.216
α39 H H Cyclopropyl H 5.499 2.078 0.732 0.909 0.287 0.778 3.293 6.520 6.366 6.549
α40 H H n-Butyl H 7.000 3.311 0.640 0.471 0.039 0.570 3.328 6.445 6.341 6.536
α41 Me H Methyl H 6.398 2.173 0.794 0.818 0.711 0.804 2.689 6.752 6.658 6.655
α42 Me H Methylthioethyl H 7.796 2.921 0.669 0.752 0.192 0.665 3.799 6.799 6.694 6.882
α43 H Me Methylthioethyl H 7.886 2.871 0.651 0.651 0.423 0.514 3.584 7.242 7.153 7.070
α44 H Me Cyclohexyl H 7.658 4.154 0.765 0.640 0.501 0.794 2.616 7.120 6.998 7.077
α45 H H H H 3.967 0.912 0.570 0.770 0.016 0.554 2.962 5.622 5.328 5.757
β46 H H Methyl H 5.824 1.724 0.416 0.089 0.134 0.367 2.729 4.692 4.942 4.582


Correlation analyses

Various regression models (nonlinear and linear) are generated to evaluate the behaviour of the descriptors. The uni-variate linear correlation matrix for the correlation of all the descriptors with pEC50 and their individual impact coefficient (IC) for DABO analogues is presented in Table 3.

Table 3 The uni-variate correlation (r, r2) and impact (IC) coefficients of 3D and 2D descriptors with pEC50 and linear equation for DABO analogues (training set)
Descriptor | r | r2 | Impact coefficient (IC) | Equation
C log P | 0.502 | 0.252 | 0.413 | 0.413 × C log P + 5.664
SALL | 0.426 | 0.182 | 2.290 | 2.290 × SALL + 5.469
HDALL | 0.466 | 0.217 | 1.264 | 1.264 × HDALL + 6.250
HAALL | 0.388 | 0.151 | 1.031 | 1.031 × HAALL + 6.631
RALL | 0.293 | 0.085 | 1.178 | 1.178 × RALL + 6.296
μ | 0.107 | 0.011 | −0.141 | −0.141 × μ + 7.515


Perusal of the coefficients of the descriptors suggests that C log P exhibits the highest correlation potential while the dipole moment shows the lowest one under linearity conditions. Considering the uni-variate relationship of the descriptors with the antiviral activity of the training set of the DABO analogues, the following order of correlation (r2) is observed.

C log P > HDALL > SALL > HAALL > RALL > μ

However, the impact (IC) of the descriptors follows the order:

SALL > HDALL > RALL > HAALL > C log P > μ

The uni-variate linear correlation analyses show that although C log P has the highest r2 (0.252), indicating the strongest linear relationship among the descriptors, its impact (coefficient of the descriptor = 0.413) on the biological response is not the highest. SALL shows a weaker linear relationship with the biological response (r2 = 0.182), yet it has the highest impact (coefficient of the descriptor = 2.290) on the antiviral activity.
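The distinction between correlation and impact can be illustrated with a short sketch. The function below fits a uni-variate least squares line and returns its slope (the impact coefficient in the sense used above), intercept and correlation coefficient; the function name and the toy numbers are hypothetical and are not taken from Table 2. Rescaling the descriptor changes the slope but leaves r unchanged, which is one reason the two rankings need not agree when descriptors span different numerical ranges.

```python
import math

def univariate_fit(xs, ys):
    """Least squares line y = slope * x + intercept and Pearson correlation r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Toy data: a descriptor on a 0-1 scale and the same descriptor multiplied by 10.
activity = [6.0, 7.0, 6.5, 8.0]
descriptor = [0.1, 0.2, 0.3, 0.4]
rescaled = [10.0 * x for x in descriptor]

slope_a, intercept_a, r_a = univariate_fit(descriptor, activity)
slope_b, intercept_b, r_b = univariate_fit(rescaled, activity)
print(slope_a, r_a)   # larger slope
print(slope_b, r_b)   # slope ten times smaller, identical r
```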

Table 4 presents detailed analyses of the non-linear and linear chemometric methods used in the present investigation. In Table 4, ‘K’ is the number of descriptors, ‘r’ is the correlation coefficient, ‘r2’ is the coefficient of determination, ‘q2’ is the cross-validated ‘r2’ from the leave-one-out (LOO) and N-fold cross-validation (N-CV) procedures, rho (ρ) is the Spearman rank correlation coefficient, MSE is the mean squared error and PRESS is the predictive sum of squares.
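The validation statistics named above can be sketched as follows, assuming the usual definitions PRESS = Σ(y − ŷ)², MSE = PRESS/n and q2 = 1 − PRESS/Σ(y − ȳ)²; the source does not print these formulas, so they are an assumption, and the function names and the `fit_mean` helper are hypothetical. The leave-one-out loop refits the model with each compound removed in turn and predicts the removed compound.

```python
def press(observed, predicted):
    """Predictive residual sum of squares."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

def mse(observed, predicted):
    """Mean squared error, i.e. PRESS divided by the number of compounds."""
    return press(observed, predicted) / len(observed)

def q2(observed, predicted):
    """Cross-validated r2: 1 - PRESS / total sum of squares about the mean."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press(observed, predicted) / ss_tot

def loo_predictions(xs, ys, fit):
    """Leave-one-out: fit on all points but one, predict the one left out."""
    preds = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(model(xs[i]))
    return preds

def fit_mean(train_x, train_y):
    """Trivial placeholder model that always predicts the training mean."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y

ys = [1.0, 2.0, 3.0]
preds = loo_predictions([0.0, 1.0, 2.0], ys, fit_mean)
print(preds, press(ys, preds), mse(ys, preds), q2(ys, preds))
```

These definitions are at least consistent with the reported numbers: for the MLR leave-one-out model, PRESS = 5.791 over 36 compounds gives 5.791/36 ≈ 0.161, which matches the MSE listed in Table 4.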

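Table 4 Comparative analyses of models built by multiple linear regression (MLR), back propagation neural network (BPNN) and support vector machine (SVM) techniques for the training seta
S. no. | Method | Model | K | r | r2 | radj2 | Spearman (rho) | PRESS | MSE | q2
1 | MLR | MS | 6 | 0.912 | 0.832 | 0.798 | 0.813 | n/a | 0.096 | 0.832
2 | MLR | LOO | 6 | 0.850 | 0.723 | 0.666 | 0.773 | 5.791 | 0.161 | 0.719
3 | MLR | NCV (N = 10) | 6 | 0.859 | 0.739 | 0.684 | 0.789 | 5.465 | 0.152 | 0.735
1 | BPNN | MS | 6 | 0.923 | 0.852 | n/a | 0.827 | n/a | 0.104 | 0.818
2 | BPNN | LOO | 6 | 0.833 | 0.693 | n/a | 0.749 | 6.540 | 0.182 | 0.683
3 | BPNN | NCV (N = 10) | 6 | 0.799 | 0.639 | n/a | 0.727 | 8.263 | 0.230 | 0.600
1 | SVM (epsilon-radial) 36SV:RBFK | MS | 6 | 0.939 | 0.883 | n/a | 0.849 | n/a | 0.071 | 0.876
2 | SVM (epsilon-radial) 36SV:RBFK | LOO | 6 | 0.895 | 0.802 | n/a | 0.789 | n/a | 0.117 | 0.796
3 | SVM (epsilon-radial) 36SV:RBFK | NCV (N = 10) | 6 | 0.899 | 0.809 | n/a | 0.785 | n/a | 0.112 | 0.805
1 | SVM (epsilon-polynomial) 34SV:PK | MS | 6 | 0.332 | 0.111 | n/a | 0.676 | n/a | 0.587 | −0.023
2 | SVM (epsilon-polynomial) 34SV:PK | LOO | 6 | −0.643 | 0.414 | n/a | −0.674 | n/a | 0.589 | −0.027
3 | SVM (epsilon-polynomial) 34SV:PK | NCV (N = 10) | 6 | −0.390 | 0.157 | n/a | −0.340 | n/a | 0.666 | −0.161
1 | SVM (epsilon-sigmoid) 36SV:SK | MS | 6 | 0.908 | 0.824 | n/a | 0.803 | n/a | 0.103 | 0.820
2 | SVM (epsilon-sigmoid) 36SV:SK | LOO | 6 | 0.846 | 0.716 | n/a | 0.751 | n/a | 0.170 | 0.704
3 | SVM (epsilon-sigmoid) 36SV:SK | NCV (N = 10) | 6 | 0.818 | 0.669 | n/a | 0.731 | n/a | 0.199 | 0.653
1 | SVM (epsilon-linear) 36SV:LK | MS | 6 | 0.906 | 0.820 | n/a | 0.800 | n/a | 0.105 | 0.817
2 | SVM (epsilon-linear) 36SV:LK | LOO | 6 | 0.832 | 0.692 | n/a | 0.739 | n/a | 0.184 | 0.678
3 | SVM (epsilon-linear) 36SV:LK | NCV (N = 10) | 6 | 0.813 | 0.660 | n/a | 0.710 | n/a | 0.203 | 0.646
1 | SVM (nu-radial) 22SV:RBFK | MS | 6 | 0.930 | 0.864 | n/a | 0.820 | n/a | 0.080 | 0.861
2 | SVM (nu-radial) 22SV:RBFK | LOO | 6 | 0.872 | 0.760 | n/a | 0.768 | n/a | 0.140 | 0.757
3 | SVM (nu-radial) 22SV:RBFK | NCV (N = 10) | 6 | 0.878 | 0.771 | n/a | 0.772 | n/a | 0.132 | 0.770
1 | SVM (nu-polynomial) 18SV:PK | MS | 6 | 0.406 | 0.165 | n/a | 0.689 | n/a | 0.583 | −0.016
2 | SVM (nu-polynomial) 18SV:PK | LOO | 6 | −0.804 | 0.646 | n/a | −0.791 | n/a | 0.615 | −0.072
3 | SVM (nu-polynomial) 18SV:PK | NCV (N = 10) | 6 | −0.423 | 0.179 | n/a | −0.372 | n/a | 0.609 | −0.062
1 | SVM (nu-sigmoid) 22SV:SK | MS | 6 | 0.910 | 0.828 | n/a | 0.818 | n/a | 0.099 | 0.828
2 | SVM (nu-sigmoid) 22SV:SK | LOO | 6 | 0.884 | 0.782 | n/a | 0.793 | n/a | 0.125 | 0.781
3 | SVM (nu-sigmoid) 22SV:SK | NCV (N = 10) | 6 | 0.851 | 0.724 | n/a | 0.768 | n/a | 0.160 | 0.721
1 | SVM (nu-linear) 22SV:LK | MS | 6 | 0.911 | 0.829 | n/a | 0.814 | n/a | 0.098 | 0.829
2 | SVM (nu-linear) 22SV:LK | LOO | 6 | 0.876 | 0.767 | n/a | 0.788 | n/a | 0.134 | 0.766
3 | SVM (nu-linear) 22SV:LK | NCV (N = 10) | 6 | 0.850 | 0.723 | n/a | 0.761 | n/a | 0.161 | 0.719
a MS = manual selection, RBFK = radial basis function kernel, PK = polynomial kernel, SK = sigmoid kernel, LK = linear kernel.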


Support vector machine (SVM) analyses

In the present study, ε-support vector regression and ν-support vector regression with variable kernels [linear (SVM-LK), polynomial (SVM-PK), sigmoid (SVM-SK) and radial basis function (SVM-RBFK)] are considered, and eight models are generated using the random seed 3386003317. The parameter settings are fine-tuned and the following values are chosen for the analyses: cost = 100 000, gamma = 0.00038, epsilon (ε) = 0.001 or nu (ν) = 0.5, and termination criterion tolerance = 0.001. Table 4 also presents all the statistical details of the eight SVM models. In both the ε and ν techniques, the radial basis function kernel performs best, followed by the sigmoid and linear kernels, whose correlation coefficients are comparable, while the polynomial kernel performs worst. The SVM regression models (ε-RBFK and ν-RBFK) are better (higher correlation coefficients and lower mean squared errors) than the MLR and BPNN regression models. The better results obtained using the SVM method can be attributed to non-linearity among the various parameters and also signify the robustness of the derived models.
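The four kernel types and the ε-insensitive loss can be written out explicitly. The sketch below uses the kernel forms documented for LIBSVM; the polynomial degree of 3 and the offset coef0 = 0 are LIBSVM defaults and are an assumption here, since the source does not report them. The descriptor vectors are those of compounds 26 and 1 from Table 2 (C log P, SALL, HDALL, HAALL, RALL, μ), used unscaled purely for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    return dot(u, v)

def polynomial_kernel(u, v, gamma, coef0=0.0, degree=3):
    # degree and coef0 are assumed LIBSVM defaults, not values from the source.
    return (gamma * dot(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def sigmoid_kernel(u, v, gamma, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.001):
    """Residuals smaller than epsilon cost nothing in epsilon-SVR."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

GAMMA = 0.00038  # value quoted in the text
cpd26 = [3.410, 1.000, 1.000, 1.000, 1.000, 2.651]
cpd1 = [2.562, 0.699, 0.758, 0.050, 0.718, 2.768]

print(linear_kernel(cpd26, cpd1))
print(polynomial_kernel(cpd26, cpd1, GAMMA))
print(rbf_kernel(cpd26, cpd1, GAMMA))
print(sigmoid_kernel(cpd26, cpd1, GAMMA))
```

With such a small gamma the radial basis function kernel stays close to 1 for any pair of compounds, whereas the polynomial kernel under the assumed defaults returns values far below 1. Whether this contributes to the weak polynomial-kernel results in Table 4 cannot be decided from the source alone.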

Back propagation neural network (BPNN) analyses

In the back-propagation method considered for training the neural network, same random seed (3386003317) as in the case of SVM is used. For training the network, the following parameters: max training epoch = 733, learning rate = 0.33, output learning rate = 0.44, momentum = 0.20, neural architecture = (7I, 3H, 1O) and initial weight (±) = 0.20 are considered. The best BPNN model (r = 0.923), points towards significant non-linearity. The relevance scores of the respective 3D descriptors, as suggested by the best model relating biological activity using BPNN method is presented in Fig. 2.
Fig. 2 Relevance scores of the 3D descriptors of DABO derivatives as obtained from the BPNN regression analysis.

(n = 36, r = 0.923, r2 = 0.852, Spearman (rho) = 0.827, MSE = 0.104).

Of these descriptors, RALL has the most prominent enhancing effect on the antiviral activity, followed by lipophilicity (C log P). HDALL shows a better effect on antiviral activity than HAALL. Dipole moment (μ), a 3D electrostatic descriptor, exhibits the lowest impact on the antiviral activity. Leave-one-out (LOO) and N-fold cross-validation (N-CV) methods are used to validate the results.
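The leave-one-out procedure can be sketched as follows: each compound is removed in turn, the model is refitted on the rest, and the held-out compound is predicted. The cross-validated q2 is then 1 − PRESS/SS, where PRESS is the sum of squared prediction errors. For brevity the sketch uses a one-descriptor least-squares line as a stand-in model and hypothetical data; SS is taken about the mean of the full data set, which is one common convention and an assumption here.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.0, 2.5, 2.9, 4.2, 4.8]
print(loo_q2(descriptor, activity))
```

N-fold cross-validation follows the same pattern, except that a block of compounds rather than a single compound is held out at each step.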

Multiple linear regression (MLR) analyses

MLR analyses are also performed on data with the same random seed (3386003317) as that in SVM and BPNN. Of the various MLR analyses, the backward elimination method gave the best model (MLR: r = 0.912) and the multivariate relationship of descriptors with antiviral activity is presented herewith in eqn (5):
 
$$\mathrm{pEC_{50}} = 0.519\,(\pm 0.631)\,C\log P + 3.908\,(\pm 0.728)\,\mathrm{SALL} + 2.320\,(\pm 0.854)\,\mathrm{HDALL} + 0.789\,(\pm 0.297)\,\mathrm{HAALL} - 4.099\,(\pm 1.018)\,\mathrm{RALL} + 0.126\,(\pm 0.096)\,\mu + 3.018 \tag{5}$$

(n = 36, r = 0.912, r2 = 0.832, radj2 = 0.798, Spearman (rho) = 0.813, MSE = 0.096).
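Eqn (5) can be evaluated directly for any compound whose six descriptors are known. The sketch below uses the coefficients as printed in eqn (5) and the descriptor values listed for compound no. 57 in Table 5; the dictionary keys and function name are illustrative assumptions.

```python
# Coefficients and intercept of eqn (5) as printed in the text.
COEFFICIENTS = {
    "ClogP": 0.519,
    "SALL": 3.908,
    "HDALL": 2.320,
    "HAALL": 0.789,
    "RALL": -4.099,
    "mu": 0.126,
}
INTERCEPT = 3.018


def predict_pec50(descriptors):
    """Evaluate the MLR model of eqn (5) for one set of descriptors."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * descriptors[name] for name in COEFFICIENTS
    )


# Descriptor values of compound no. 57 as listed in Table 5.
compound_57 = {
    "ClogP": 6.300,
    "SALL": 0.505,
    "HDALL": 0.251,
    "HAALL": 0.013,
    "RALL": 0.245,
    "mu": 1.655,
}
print(round(predict_pec50(compound_57), 3))
```

The value obtained agrees, within rounding of the printed coefficients, with the MLR prediction of 8.058 tabulated for compound no. 57 in Table 5.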

The high positive coefficients of SALL and HDALL show that these descriptors enhance the anti-HIV-1 activity, whereas the high negative coefficient of RALL shows that it has an adverse effect. C log P, HAALL and dipole moment (μ) have comparatively smaller enhancing effects.

Fig. 3 presents the graph of experimental versus calculated activity (pEC50) derived from the best of the MLR, BPNN and SVM models for the training set. The calculated activities fit the observed values well, which reconfirms the robustness of the methods used.


Fig. 3 A graph of comparative analyses of observed and calculated pEC50 (MLR, BPNN and SVM) values for the training set of DABO derivatives.

Compound no. 46 is observed as an outlier. It has an unsubstituted R, a bridged R′ and a methyl moiety at X, and has a moderate value of antiviral activity.

Cross-validation

A test set of DABO derivatives with antiviral activity spanning the complete range is considered for external validation of the aforementioned chemometric regression models. A comparison of the validated results, shown in Fig. 4, suggests good agreement between the experimental and estimated activity. The results of the regression analyses for the test set are encouraging for all the methods: SVM: r = 0.839, r2 = 0.703, Spearman (rho) = 0.767; BPNN: r = 0.822, r2 = 0.676, Spearman (rho) = 0.800; and MLR: r = 0.816, r2 = 0.666, Spearman (rho) = 0.800. For the training set as well as the test set, the SVM model gives the highest correlation coefficient.
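The statistics quoted for the test set (r, r2, Spearman rho) and the mean squared error reported elsewhere can be computed from paired observed and predicted activities as sketched below. The helper names and the sample values are hypothetical; ties in the Spearman ranking are handled with average ranks, which is an assumption about the convention used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


def mean_squared_error(observed, predicted):
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical observed and predicted pEC50 values, for illustration only.
observed = [5.2, 6.1, 6.8, 7.4, 8.0]
predicted = [5.5, 5.9, 7.0, 7.1, 8.2]
r = pearson_r(observed, predicted)
print(r, r ** 2, spearman_rho(observed, predicted),
      mean_squared_error(observed, predicted))
```

Because r measures linear agreement and rho only the ranking, the two need not order the models identically, as the test-set figures above show.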
Fig. 4 A graph of comparative analyses of observed and calculated pEC50 (MLR, BPNN and SVM) values for the test set of DABO derivatives.

The steric attributes are important in elucidating the interaction mechanisms of the DABO derivatives, and the present findings clearly establish the role of steric interactions. The hydrogen donor attribute has a more favourable effect than the hydrogen acceptor attribute. It is again evident that the hydrogen donor ability of 3-NH is crucial to the antiviral activity of the DABO derivatives.102

Estimation of anti-viral activity for approximate dataset and virtual dataset (VDS)

The SVM (ε-RBFK), BPNN and MLR methods are used to predict the antiviral activities of previously synthesized DABOs whose antiviral activities are reported only in approximate terms. Table 5 records the details of these seven compounds (comp. nos 47–53) and their predicted antiviral activities. In certain instances, there is a wide gap between the reported and the predicted activity values.
Table 5 Descriptors, anti-HIV-1 activity (approximate) and predicted anti-viral activity (SVM, BPNN and MLR) for the compounds with approximate activity and the virtual data set


Comp. no. R R′ X Y R′′ pEC50 (exp) C log P SALL HDALL HAALL RALL μ pEC50 (predicted) SVM pEC50 (predicted) BPNN pEC50 (predicted) MLR
a Compounds with activity reported in approximate terms.b Virtually designed compounds.
47a Me H 1-Naphthyl H >3.699 5.326 0.559 0.450 0.373 0.419 3.599 7.971 7.790 8.042
48a Me Me H Me >4.032 1.997 0.718 0.383 0.731 0.658 2.196 5.783 6.001 5.906
49a H H CN H >3.699 0.807 0.624 0.819 0.025 0.626 1.218 5.515 5.057 5.383
50a Me H CN H >3.699 1.256 0.813 0.816 0.757 0.818 2.718 6.218 6.466 6.328
51a H H NO2 H >3.699 −2.228 0.664 0.777 0.185 0.638 3.031 3.942 4.251 4.174
52a Me H NO2 H >3.699 1.779 0.837 0.811 0.745 0.796 5.221 6.973 7.194 7.079
53a H H COCH2CH3 H >3.699 2.201 0.718 0.871 0.061 0.730 3.998 6.790 6.706 6.548
54b i-Propenyl Me 2-Methylbutyl H 2,6-di-Cl 6.370 0.543 0.187 0.010 0.257 1.528 8.452 7.628 8.026
55b i-Propenyl H 2-Methylbutyl H 2-Br,4-F,6-Cl 6.270 0.505 0.116 0.002 0.226 1.104 8.288 7.462 7.730
56b i-Propenyl i-Propyl i-Pentyl H 2,6-Di-Cl 6.370 0.473 0.148 0.005 0.226 1.612 8.307 7.460 7.796
57b i-Propyl i-Propyl 2-Oxobutyl H 2,4,6-Tri-Cl 6.300 0.505 0.251 0.013 0.245 1.655 8.508 7.659 8.058
58b i-Propyl i-Propyl 2-Oxobutyl H 2,6-Di-Br 5.880 0.553 0.283 0.013 0.279 1.912 8.328 7.620 7.997
59b i-Propenyl i-Propyl 2-Oxobutyl H 2,6-Di-Br,4-F 5.770 0.565 0.316 0.010 0.280 2.526 8.402 7.655 8.132


A virtual dataset of 150 compounds that are structurally similar to the DABO derivatives is created, and their antiviral activities are predicted using the best derived SVM (ε-RBFK) model; for comparison, the activities predicted using BPNN and MLR are also shown. Care is taken to obtain molecules with not only better activities but also favourable substitution(s). Finally, six compounds (comp. nos 54–59) with predicted antiviral activity (pEC50) higher than 8.25 are extracted, which could be earmarked for synthesis and subsequent development as lead or probable drug derivatives. The results suggest that replacing the difluoro groups at the 2 and 6 positions of ring ‘B’ by chloro and/or bromo groups, tri-substitution at the 2, 4 and 6 positions by Br/Cl/F, and smaller modifications in R and X yield better inhibitory activity (pEC50). Therefore, compounds designed in silico may reduce the time frame in the search for better drug-like candidates. Table 5 records the six structures that are virtually generated using the DABO-like template, the template used for deriving the virtual data set, and all the descriptors and predicted pEC50 values for the virtual data set.
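The selection step described above amounts to a threshold filter on the predicted activities. The sketch below applies the 8.25 cut-off to the SVM predictions listed in Table 5 (comp. nos 47–59 only, since the full 150-compound virtual dataset is not tabulated); the function and variable names are illustrative assumptions.

```python
PEC50_THRESHOLD = 8.25  # cut-off quoted in the text

# SVM-predicted pEC50 values transcribed from Table 5, keyed by compound no.
svm_predictions = {
    47: 7.971, 48: 5.783, 49: 5.515, 50: 6.218, 51: 3.942, 52: 6.973,
    53: 6.790, 54: 8.452, 55: 8.288, 56: 8.307, 57: 8.508, 58: 8.328,
    59: 8.402,
}


def select_leads(predictions, threshold=PEC50_THRESHOLD):
    """Return compound numbers whose predicted pEC50 exceeds the threshold."""
    return sorted(
        compound for compound, value in predictions.items()
        if value > threshold
    )


leads = select_leads(svm_predictions)
best = max(svm_predictions, key=svm_predictions.get)
print(leads, best)
```

Only the six virtually designed compounds pass the filter, and compound no. 57 carries the highest SVM prediction among them.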

Fig. 5 presents the graphical comparison of predicted values of pEC50, estimated using the SVM technique for the compounds with approximate activity and that of the virtual dataset.


Fig. 5 A graph depicting the estimated pEC50 values for compounds with approximate activity and the VDS of DABO analogues.

The structures of the virtually designed compounds, exhibiting pEC50 values > 8.25, that are identified as lead/drug molecules from the virtual data set are presented in Fig. 6.


Fig. 6 Structures of lead compounds (Virtual Data Set) with their predicted pEC50.

Validation of results for VDS by docking and virtual screening

The docking protocol involving the receptor–ligand crystal structure (1JLA)103 is highly satisfactory (RMSD = 0.37 Å) for running the docking simulations of the virtual compounds. Fig. 7a shows a snapshot of the overall superimposition of compound no. 26 (the most active ligand, with pEC50 = 8.301) with all the VDS compounds, while Fig. 7b shows the secondary view. From Fig. 7a it can be observed that compound no. 26 and all the VDS compounds are closely surrounded by the amino acids GLY190, LYS101, VAL106, TYR318, PHE227, LEU234, CYS181, TYR188, VAL189, VAL175, LYS103, SER105, PRO236, PRO225, HIS235 and TRP229.
Fig. 7 (a) Superimposed structures of comp. no. 26 and all VDS compounds (comp. nos 54–59) showing interactions with the surrounding amino acids, (b) secondary view of superimposed structures of comp. no. 26 and all VDS compounds (comp. nos 54–59).
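The RMSD value quoted for the docking protocol is a root-mean-square deviation between two sets of atomic coordinates. The text does not state how it was obtained; the sketch below assumes the common validation practice of comparing a docked pose with the crystallographic pose, atom by atom, in the same receptor frame and without further superposition. The coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both poses must contain the same number of atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical three-atom poses (in Å), for illustration only.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]
docked_pose = [(0.1, 0.0, 0.0), (1.4, 0.1, 0.0), (1.6, 1.2, 0.2)]
print(rmsd(crystal_pose, docked_pose))
```

A value well below 1 Å, as reported here, is generally read as the docking protocol reproducing the reference pose closely.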

Table 6 presents the respective distances (in Å) from the nearest amino acid residues of the binding pocket for TNK-651, the most active ligand (26), compound no. 47 and the virtually designed compounds (54–59). From Table 6, it is clear that compound no. 26 is better bound than TNK-651 in the NNIBP, owing to closer interaction with the surrounding amino acid residues. Compound no. 47 (with the highest predicted activity amongst the compounds whose activity is reported in approximate terms) and compound no. 26 exhibit similar interactions based on their respective distances from the referred amino acids. The comparison suggests that all the VDS compounds lie in closer proximity to the surrounding amino acids, resulting in better interactions. Thus, the order of proximity, and thereby the possibility of closer interaction in the NNIBP, is as follows: VDS compounds > 26 ≅ 47 > TNK-651. Among the VDS compounds, compound no. 57 (which also has the highest predicted activity, 8.508) lies relatively nearer than the other VDS compounds, and its better binding can thus be interpreted in terms of more favourable interactions. Compared to compound no. 26, five of the VDS compounds show better interactions with the amino acids, while compound no. 55, although its predicted activity is a little lower, shows nearly similar binding.

Table 6 Distances (in Å) of TNK-651, highest active ligand (26), compound 47a and virtually designed compounds (54–59) from the nearest amino acid residues
Amino acid residue  TNK-651  26     47a    54     55     56     57     58     59
LYS101              2.606    1.626  1.997  1.619  1.643  1.643  1.466  1.449  1.879
LYS103              4.310    3.868  3.768  4.195  4.096  3.817  3.578  4.169  3.558
LEU234              3.576    4.236  2.930  4.626  3.701  4.256  3.252  3.299  3.300
PRO236              3.610    3.590  4.718  3.538  3.581  4.094  3.630  3.596  3.753
TRP229              5.026    4.747  3.812  4.367  4.945  4.229  3.321  4.101  4.750
TYR188              4.156    3.700  4.054  2.594  3.327  2.489  3.568  3.032  3.298
TYR318              2.839    3.302  3.297  3.636  1.618  3.379  3.613  1.787  3.701
a The compound whose activity is reported in approximate terms.
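One simple way to summarise the distances in Table 6 is the mean distance of each compound to the seven listed residues. This averaging is an assumption made here for illustration; the text compares the distances residue by residue and does not report such a mean. The values are transcribed from Table 6.

```python
# Distances (in Å) from Table 6, one list per compound, ordered as
# LYS101, LYS103, LEU234, PRO236, TRP229, TYR188, TYR318.
DISTANCES = {
    "TNK-651": [2.606, 4.310, 3.576, 3.610, 5.026, 4.156, 2.839],
    "26": [1.626, 3.868, 4.236, 3.590, 4.747, 3.700, 3.302],
    "47": [1.997, 3.768, 2.930, 4.718, 3.812, 4.054, 3.297],
    "54": [1.619, 4.195, 4.626, 3.538, 4.367, 2.594, 3.636],
    "55": [1.643, 4.096, 3.701, 3.581, 4.945, 3.327, 1.618],
    "56": [1.643, 3.817, 4.256, 4.094, 4.229, 2.489, 3.379],
    "57": [1.466, 3.578, 3.252, 3.630, 3.321, 3.568, 3.613],
    "58": [1.449, 4.169, 3.299, 3.596, 4.101, 3.032, 1.787],
    "59": [1.879, 3.558, 3.300, 3.753, 4.750, 3.298, 3.701],
}


def mean_distances(distances):
    """Mean residue distance per compound (assumed summary statistic)."""
    return {name: sum(values) / len(values) for name, values in distances.items()}


means = mean_distances(DISTANCES)
for name in sorted(means, key=means.get):
    print(f"{name:8s} {means[name]:.3f}")
```

On this measure TNK-651 has the largest mean distance and every virtually designed compound lies closer on average than compound no. 26, which is in line with the proximity order stated in the text; the ranking within the VDS depends on which residues are weighted most.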


The interactions of the docked pose of compound nos 26, 47 and 57 with the 1JLA molecule are presented in Fig. 8A–I. Electrostatic contour maps (Fig. 8B, E and H) indicate the favourable polar interactions between the protein and the ligand.


Fig. 8 (A–I) Hydrogen bond, electrostatic and hydrophobic interactions of comp. no. 26 (A–C), comp. no. 47 (D–F) and comp. no. 57 (G–I) with surrounding amino acid residues.

The following observations are made: compounds with bulkier substitution at R′′, isopropyl and isopropenyl groups at R and R′, and an oxobutyl group as X show better activity. The blue contour enclosing most of the molecular region represents positive interactions favouring nucleophilic assortments. The hydrogen bond interaction observed for virtual compound no. 57 is similar to that of compound no. 26. The isopropyl groups (R and R′) and the chlorine atoms attached to ring B are inserted into the blue pocket, exhibiting favourable hydrophobic interactions. The oxobutyl group inserted into the red pocket shows favourable hydrophilic interactions. Ring B also shows π–π stacking with TYR188, suggesting the stability of the ligand–protein complex.

Conclusions

The present paper reports six novel virtually designed DABO derivatives. The new compounds have been obtained by the combination of peculiar chemical features of a well-known family of pyrimidinone-containing NNRTIs (DABOs) and showed a discrete, characteristic structure–activity relationship profile.

The chemometric analyses, performed for a better understanding of the interactive capability of DABO derivatives within the NNIBP, have yielded surprising results. The biological response of the antiviral compounds is enhanced mainly by steric attributes, hydrogen donor capabilities and lipophilicity, and bulkier groups at R, R′ and X are suggested for a better biological response. On the basis of the results obtained from the VDS, it can be concluded that compounds with 2,6-dichloro and/or dibromo and tri-halo substitution on ring B and minor modifications of R and X are better performers. The better performance of the SVM and BPNN models over the MLR model suggests that there is some degree of non-linear relationship between the antiviral activity and the descriptors used. In addition, among the chemometric tools, SVM has better applicability as well as interpretability, because its varied kernel settings allow a search spanning linear to non-linear (radial, sigmoid, polynomial) relationships. The statistical results of the training set are validated and complemented by the test set. A comparison of all the VDS structures indicates that the chain length of X affects activity: the chain should be five bonds or fewer for enhanced activity, and the presence of an O atom is still more beneficial. Thus, the better activity of compound no. 57 can be attributed to the presence of the oxobutyl group, which exhibits favourable polar interactions with the PRO236, TYR318 and VAL106 residues in the binding pocket. The relative orientation of the aromatic moieties of the ligand with respect to those of the receptor, and their involvement in stacking-type interactions, guide the polar interactions. The predicted antiviral activities of the synthesized DABOs whose activities are reported in approximate terms differ widely from the reported values.
The six virtual compounds generated in silico show a high predicted biological response; they can therefore be used as precursors or lead compounds and can be synthesized and tested.

Acknowledgements

The authors Nilanjana Jain (Pancholi) and Swagata Gupta thank DST, New Delhi, for the grant of a Woman Scientist Fellowship (WOS-A) and the Principal, GBLPPG College, MHOW, respectively.

Notes and references

  1. B. M. Mathers, L. Degenhardt, C. Bucello, J. Lemon, L. Wiessing and M. Hickman, Bull. W. H. O., 2013, 91(2), 102–123.
  2. L. Genovese, M. Nebuloni and M. Alfano, Front. Immunol., 2013, 4(86), 1–12, DOI:10.3389/fimmu.2013.00086, eCollection 2013.
  3. H. Okatch, B. Ngwenya, K. M. Raletamo and K. Andrae-Marobela, Anal. Chim. Acta, 2012, 7(30), 42–48.
  4. V. Repunte-Canonigo, C. Lefebvre, O. George, T. Kawamura, M. Morales, G. Koob, A. Califano, E. Masliah and P. P. Sanna, Mol. Neurodegener., 2014, 9(1), 26.
  5. P. Zhan, X. Chen, D. Li, Z. Fang, E. De Clercq and X. Liu, Med. Res. Rev., 2013, 33, E1–E72.
  6. L. Schneider, N. Ktorza, S. Fourati, L. Assoumou, E. Courbon, F. Caby, C. Blanc, M. Tindel, R. Agher, A. G. Marcelin, V. Calvez, G. Peytavin and C. Katlama, HIV Clin. Trials, 2012, 13(5), 284–288.
  7. C. Reynolds, C. B. de Koning, S. C. Pelly, W. A. van Otterlo and M. L. Bode, Chem. Soc. Rev., 2012, 41(13), 4657–4670.
  8. R. Paredes and B. Clotet, Antiviral Res., 2010, 85, 245–265.
  9. M. P. de Béthune, Antiviral Res., 2010, 85(1), 75–90.
  10. R. Gupta, A. Hill, A. W. Sawyer and D. Pillay, Clin. Infect. Dis., 2008, 47(5), 712–722.
  11. T. Homma, Nippon Rinsho, 2012, 70, 326–330.
  12. A. M. Almerico, M. Tutone and A. Lauria, J. Comput.-Aided Mol. Des., 2008, 22, 287–297.
  13. N. Jain, S. Gupta, N. Sapre and N. S. Sapre, Mol. BioSyst., 2014, 10(2), 313–325.
  14. N. S. Sapre, N. Jain (Pancholi), S. Gupta and N. Sapre, RSC Adv., 2013, 3, 10442–10451.
  15. N. S. Sapre, N. Pancholi, S. Gupta and N. Sapre, J. Comput. Chem., 2008, 29(11), 1699–1706.
  16. S. E. Galembeck, F. M. Bickelhaupt, F. C. Guerra and E. Galembeck, J. Mol. Model., 2014, 20(7), 2332, DOI:10.1007/s00894-014-2332-3.
  17. H. Huang, R. Chopra, G. L. Verdine and S. C. Harrison, Science, 1998, 282, 1669–1675.
  18. M. T. Christen, L. Menon, N. S. Myshakina, J. Ahn, M. A. Parniak and R. Ishima, Chem. Biol. Drug Des., 2012, 80(5), 706–716.
  19. Y. Hsiou, J. Ding, K. Das, A. D. Clark Jr, S. H. Hughes and E. Arnold, Structure, 1996, 4, 853–860.
  20. A. L. Hopkins, J. Ren, R. M. Esnouf, B. E. Willcox, E. Y. Jones, C. Ross, T. Miyasaka, R. T. Walker, H. Tanaka, D. K. Stammers and D. I. Stuart, J. Med. Chem., 1996, 39, 1589–1600.
  21. J. Ding, K. Das, C. Tantillo, W. Zhang, A. D. Clark Jr, S. Jessen, X. Lu, Y. Hsiou, A. Jacobo-Molina, K. Andries, R. Pauwels, H. Moereels, L. Koymans, P. A. J. Janssen, R. H. Smith Jr, M. Kroeger Koepke, C. J. Michejda, S. H. Hughes and E. Arnold, Structure, 1995, 3, 365–379.
  22. J. Ren and D. K. Stammers, Virus Res., 2008, 134(1–2), 157–170.
  23. G. Maga, M. Radi, M. A. Gerard, M. Botta and E. Ennifar, Viruses, 2010, 2(4), 880–899.
  24. N. S. Sapre, S. Gupta, N. Pancholi and N. Sapre, J. Comput.-Aided Mol. Des., 2008, 22, 69–80.
  25. K. Das, A. D. Clark, P. J. Lewi, J. Heeres, M. R. De Jonge, L. M. Koymans, H. M. Vinkers, F. Daeyaert, D. W. Ludovici, M. J. Kukla, B. De Corte, R. W. Kavash, C. Y. Ho, H. Ye, M. A. Lichtenstein, K. Andries, R. Pauwels, M. P. De Béthune, P. L. Boyer, P. Clark, S. H. Hughes, P. A. Janssen and E. Arnold, J. Med. Chem., 2004, 47, 2550–2560.
  26. E. De Clercq, Chem. Biodiversity, 2004, 1, 44–64.
  27. G. Meng, Y. Liu, A. Zheng, F. Chen, W. Chen, E. De Clercq, C. Pannecouque and J. Balzarini, Eur. J. Med. Chem., 2014, 82, 600–611.
  28. V. Braz, A. L. Holladay and D. M. Barkley, Biochemistry, 2010, 49(3), 601–610.
  29. S. Brück, S. Witte, J. Brust, D. Schuster, F. Mosthaf, M. Procaccianti, J. A. Rump, H. Klinker, D. Petzold and M. Hartmann, Eur. J. Med. Res., 2008, 13(7), 343–348.
  30. J. D. Croxtall, Drugs, 2012, 72(6), 847–869.
  31. A. C. Achhra, M. A. Boyd, M. G. Law, G. V. Matthews, A. D. Kelleher and D. A. Cooper, PLoS One, 2014, 9(6), e99530.
  32. J. Vingerhoets, L. Rimsky, V. Van Eygen, S. Nijs, S. Vanveggel, K. Boven and G. Picchio, Antiviral Ther., 2013, 18(2), 253–256.
  33. A. Narayanan, G. Sampey, R. Van Duyne, I. Guendel, K. Kehn-Hall, J. Roman, R. Currer, H. Galons, N. Oumata, B. Joseph, L. Meijer, M. Caputi, S. Nekhai and F. Kashanchi, Virology, 2012, 432(1), 219–231.
  34. M. T. Lai, M. Feng, J. P. Falgueyret, P. Tawa, M. Witmer, D. DiStefano, Y. Li, J. Burch, N. Sachs, M. Lu, E. Cauchon, L. C. Campeau, J. Grobler, Y. Yan, Y. Ducharme, B. Côté, E. Asante-Appiah, D. J. Hazuda and M. D. Miller, Antimicrob. Agents Chemother., 2014, 58(3), 1652–1663.
  35. M. Botta, M. Artico, S. Massa, A. Gambacorta, M. E. Marongiu, A. Pani and P. La Colla, Eur. J. Med. Chem., 1992, 27(3), 251–257.
  36. Y. He, F. Chen, X. Yu, Y. Wang, E. De Clercq, J. Balzarini and C. Pannecouque, Bioorg. Chem., 2004, 32(6), 536–548.
  37. Y. He, F. Chen, G. Sun, Y. Wang, E. De Clercq, J. Balzarini and C. Pannecouque, Bioorg. Med. Chem. Lett., 2004, 14(12), 3173–3176.
  38. M. Yu, E. Fan, J. Wu and X. Liu, Curr. Med. Chem., 2011, 18(16), 2376–2385.
  39. S. Yang, F. E. Chen and E. De Clercq, Curr. Med. Chem., 2012, 19(2), 152–162.
  40. R. Ragno, A. Mai, G. Sbardella, M. Artico, S. Massa, C. Musiu, M. Mura, F. Marturana, A. Cadeddu and P. La Colla, J. Med. Chem., 2004, 47(4), 928–934.
  41. Y. Wang, F. Chen, E. Clercq, J. Balzarini and C. Pannecouque, Eur. J. Med. Chem., 2009, 44(3), 1016–1023.
  42. M. Radi, M. Pagano, L. Franchi, D. Castagnolo, S. Schenone, G. Casaluce, C. Zamperini, E. Dreassi, G. Maga, A. Samuele, E. Gonzalo, B. Clotet, J. A. Esté and M. Botta, ChemMedChem, 2012, 7(5), 883–896.
  43. M. A. de Brito, C. R. Rodrigues, J. J. Cirino, R. B. de Alencastro, H. C. Castro and M. G. Albuquerque, J. Chem. Inf. Model., 2008, 48(8), 1706–1715.
  44. M. A. de Brito, C. R. Rodrigues, J. J. Cirino, J. Q. Araújo, T. Honório, L. M. Cabral, R. B. de Alencastro, H. C. Castro and M. G. Albuquerque, Molecules, 2012, 17(7), 7666–7694.
  45. N. S. Sapre, T. Bhati, S. Gupta, N. Pancholi, U. Raghuvanshi, D. Dubey, V. Rajopadhyaya and N. Sapre, J. Biophys. Chem., 2011, 2(3), 361–372.
  46. R. Costi, R. Di Santo, M. Artico, S. Massa, A. Lavecchia, T. Marceddu, L. Sanna, P. La Colla and M. E. Marongiu, Antiviral Chem. Chemother., 2000, 11(2), 117–133.
  47. Y. Mao, Y. Li, M. Hao, S. Zhang and C. Ai, J. Mol. Model., 2012, 18(5), 2185–2198.
  48. D. Rotili, M. Tarantino, M. B. Artico, E. Nawrozkij, B. Gonzalez-Ortega, A. Clotet, J. Samuele, A. Esté, G. Maga and A. Mai, J. Med. Chem., 2011, 54(8), 3091–3096.
  49. M. Yu, Z. Li, S. Liu, E. Fan, C. Pannecouque, E. De Clercq and X. Liu, ChemMedChem, 2011, 6(5), 826–833.
  50. M. Radi, C. Falciani, L. Contemori, E. Petricci, G. Maga, A. Samuele, S. Zanoli, M. Terrazas, M. Castria, A. Togninelli, J. A. Esté, I. Clotet-Codina, M. Armand-Ugón and M. Botta, ChemMedChem, 2008, 3(4), 573–593.
  51. Y. P. He, J. Long, S. S. Zhang, C. Li, C. C. Lai, C. S. Zhang, D. X. Li, D. H. Zhang, H. Wang, Q. Q. Cai and Y. T. Zheng, Bioorg. Med. Chem. Lett., 2011, 21(2), 694–697.
  52. M. Radi, L. Angeli, L. Franchi, L. Contemori, G. Maga, A. Samuele, S. Zanoli, M. Armand-Ugon, E. Gonzalez, A. Llano, J. A. Esté and M. Botta, Bioorg. Med. Chem. Lett., 2008, 18(21), 5777–5780.
  53. A. Mai, M. Artico, R. Ragno, G. Sbardella, S. Massa, C. Musiu, M. Mura, F. Marturana, A. Cadeddu, G. Maga and P. La Colla, Bioorg. Med. Chem., 2005, 13, 2065–2077.
  54. N. S. Sapre, S. Gupta, N. Pancholi and N. Sapre, J. Comput. Chem., 2009, 30(6), 922–933.
  55. R. D. Cramer and B. Wendt, J. Comput.-Aided Mol. Des., 2007, 21, 23–32.
  56. C. E. Mowbray, R. Corbau, M. Hawes, L. H. Jones, J. E. Mills, M. Perros, M. D. Selby, P. A. Stupple, R. Webster and A. Wood, Bioorg. Med. Chem. Lett., 2009, 19(19), 5603–5606.
  57. S. Gritsch, S. Guccione, R. Hoffmann, A. Cambria, G. Raciti and T. Langer, J. Enzyme Inhib., 2001, 16(3), 199–215.
  58. H. Van de Waterbeemd, Drug Des. Discovery, 1993, 9(3–4), 277–285.
  59. Y. Brito-Sánchez, J. A. Castillo-Garit, H. Le-Thi-Thu, Y. González-Madariaga, F. Torrens, Y. Marrero-Ponce and J. E. Rodríguez-Borges, SAR QSAR Environ. Res., 2013, 24(3), 235–251.
  60. Y. Zhou, J. Jiang, W. Lin, H. Zou, H. Wu, G. Shen and R. Yu, Eur. J. Pharm. Sci., 2006, 28, 344–353.
  61. P. Liu and W. Long, Int. J. Mol. Sci., 2009, 10(5), 1978–1998.
  62. J. Li, S. Li, B. Lei, H. Liu, X. Yao, M. Liu and P. Gramatica, J. Comput. Chem., 2010, 31(5), 973–985.
  63. F. Gharagheizi, B. Tirandazi and R. Barzin, Ind. Eng. Chem. Res., 2009, 48, 1678–1682.
  64. K. Roy and P. P. Roy, Eur. J. Med. Chem., 2009, 44(7), 2913–2922.
  65. X. J. Yao, A. Panaye, J. P. Doucet, R. S. Zhang, H. F. Chen, M. C. Liu, Z. D. Hu and B. T. Fan, J. Chem. Inf. Comput. Sci., 2004, 44(4), 1257–1266.
  66. H. Tanaka, M. Baba, M. Ubasawa, H. Takashima, K. Sekiya, I. Nitta, S. Shigeta, R. T. Walker, E. De Clercq and T. Miyasaka, J. Med. Chem., 1991, 34(4), 1394–1399.
  67. ChemDraw Ultra 7.0.0, http://www.cambridgesoft.com.
  68. Molegro Virtual Docker, V. 6.0.0, 2013, http://www.molegro.com.
  69. Sybyl-X, 2.1 suite, 2013.
  70. A. Tavlarakis and R. H. Zhou, Mol. Simul., 2009, 35, 1224–1241.
  71. K. Osmialowski, J. Halkiewicz, A. Radecki and R. J. Kaliszan, J. Chromatogr., 1985, 346, 53–60.
  72. P. W. Atkins, Quanta, A Handbook of Concepts, Oxford University Press, New York, 2nd edn, 1991.
  73. T. Fujita, J. Iwasa and C. Hansch, J. Am. Chem. Soc., 1964, 86, 5175–5180.
  74. R. Smith, C. Hansch and M. Ames, J. Pharm. Sci., 1975, 64(4), 599–606.
  75. R. Mannhold and R. Rekker, Perspect. Drug Discovery Des., 2000, 18(1), 1–18.
  76. C. Hansch and A. Leo, Substituent Constants for Correlation Analysis in Chemistry and Biology, Wiley Interscience, New York, 1979.
  77. J. Chou and P. Jurs, J. Chem. Inf. Comput. Sci., 1979, 19, 172–178.
  78. G. N. Elliott, H. Worgan, D. Broadhurst, J. Draper and J. Scullion, Soil Biol. Biochem., 2007, 39, 2888–2896.
  79. K. Roy and J. T. Leonard, Indian J. Chem., Sect. A: Inorg., Bio-inorg., Phys., Theor. Anal. Chem., 2006, 45, 126–137.
  80. A. Niazi, S. Jameh-Bozorghi and D. Nori-Shargh, J. Hazard. Mater., 2008, 151, 603–609.
  81. E. Pourbasheer, S. Riahi, M. R. Ganjali and P. Norouzi, Eur. J. Med. Chem., 2009, 44, 5023–5028.
  82. C. C. Chang and C. J. Lin, Neural. Comput., 2002, 14(8), 1959–1977.
  83. C. Cortes and V. Vapnik, Mach. Learn., 1995, 20, 273–297.
  84. G. Liang and Z. Li, J. Mol. Graphics Modell., 2007, 26, 269–281.
  85. R. Darnag, E. L. Mostapha Mazouz, A. Schmitzer, D. Villemin, A. Jarid and D. Cherqaoui, Eur. J. Med. Chem., 2010, 45, 1590–1597.
  86. Y. Cong, X. Yang, W. Lv and Y. Xue, J. Mol. Graphics Modell., 2009, 28, 236–244.
  87. Z. Shi, X. H. Ma, C. Qin, J. Jia, Y. Y. Jiang, C. Y. Tan and Y. Z. Chen, J. Mol. Graphics Modell., 2012, 32, 49–66.
  88. H. Golmohammadi, Z. Dashtbozorgi and W. E. Acree, Eur. J. Pharm. Sci., 2012, 47, 421–429.
  89. R. K. Prasoona, A. Jyoti, Y. Mukesh, S. Nishant, N. S. Anuraj and J. Shobha, Interdiscip. Sci.: Comput. Life Sci., 2013, 5(1), 45–52.
  90. N. Segata and E. Blanzieri, J. Mach. Learn. Res., 2010, 11, 1883–1926.
  91. Y. Li, Y. Qin, X. Chen and W. Li, PLoS One, 2013, 8(9), e73186.
  92. D. Rumelhart, G. Hinton and R. Williams, Nature, 1986, 323, 533–536.
  93. V. Akman and P. Blackburn, J. Logic Lang. Inform., 2000, 9, 391–395.
  94. C. Wang, L. Li, L. Wang, Z. Ping, M. T. Flory, G. Wang, Y. Xi and W. Li, Diabetes Res. Clin. Pract., 2013, 100(1), 111–118.
  95. M. Szaleniec, R. Tadeusiewicz and M. Witko, Neurocomputing, 2008, 72, 241–256.
  96. S. So and G. Richards, J. Med. Chem., 1992, 35, 3201–3207.
  97. V. Maniezzo, IEEE Trans. Neural Networks, 1994, 5, 39–53.
  98. T. A. Andrea and H. Kalayeh, J. Med. Chem., 1991, 34, 2824–2836.
  99. V. W. Porto, D. B. Fogel and L. J. Fogel, IEEE Expert., 1995, 10, 16–22.
  100. D. L. Selwood, D. J. Livingstone, J. C. W. Comley, A. B. O'Dowd, A. T. Hudson, P. Jackson, K. S. Jandu, V. S. Rose and J. N. Stables, J. Med. Chem., 1990, 33(1), 136–142.
  101. L. Eriksson, E. Johansson, M. Muller and S. Wold, J. Chemom., 2000, 14, 599–616.
  102. N. S. Sapre, S. Gupta, N. Pancholi, A. Sikarwar and N. Sapre, Acta Chim. Slov., 2007, 54, 797–804.
  103. http://www.rcsb.org/pdb/explore.do?structureId=1jla.

Footnote

Electronic supplementary information (ESI) available. See DOI: 10.1039/c4ra15478a

This journal is © The Royal Society of Chemistry 2015