Nilanjana Jain (Pancholi)a,
Swagata Guptab,
Neelima Saprec and
Nitin S. Sapre*a
aDepartment of Applied Chemistry, SGSITS, Indore, India. E-mail: sukusap@yahoo.com; Tel: +91 9826607444
bDepartment of Chemistry, Govt. BLPPG College, MHOW, India
cDepartment of Mathematics and Computational Sc., SGSITS, Indore, India
First published on 8th January 2015
Six novel non-nucleoside reverse transcriptase inhibitors exhibiting high efficacy are designed using in silico mathematical modelling techniques and the results are validated using a docking technique. An in silico assessment of interaction potential and structural requirements of 5-alkyl-2-alkylamino-6-(2,6-difluorophenylalkyl)-3,4-dihydropyrimidin-4(3H)-one (DABO) analogues in the non-nucleoside inhibitor binding pocket is also performed. Efficient use of 3D-pharmacophoric (SALL, HDALL, HAALL and RALL) and 3D-averaged alignment (ClogP and dipole moment) descriptors is made in this study. The chemometric analyses are performed using support vector machine (SVM), back propagation neural network (BPNN) and multiple linear regression (MLR). The relative potentials of these chemometric methods are also assessed, and the results, SVM (r = 0.939, MSE = 0.071, q2 = 0.876), BPNN (r = 0.923, MSE = 0.104, q2 = 0.818) and MLR (r = 0.912, MSE = 0.096, q2 = 0.832), indicate that SVM best describes the relationship between the descriptors and inhibitory activity. The results also suggest that there is a non-linear relationship between the descriptors and inhibitory activity. The study further suggests that isopropyl/propenyl groups as R and R′, an oxobutyl group as X and di- or tri-substitution as R′′ are the best suited substituents for exhibiting better inhibitory activity.
The reverse transcriptase (RT) enzyme is essential for the conversion of genetic RNA into DNA, and thus plays a significant role in the drug discovery pipeline to combat HIV-1 infection.16,17 The RT enzyme is a heterodimer made of p66 and p51 subunits, where each subunit contains thumb, palm and finger domains. The p66 subunit contains the functional active site that binds the nucleic acid template primer to the nucleotide triphosphate.18,19 NNRTIs are highly potent and moderately toxic compared to nucleoside reverse transcriptase inhibitors (NRTIs) and do not require cellular activation to inhibit HIV-1 RT.20 These are non-competitive inhibitors that bind to a hydrophobic “pocket” in the p66 subunit of HIV-1 RT, approximately 10 Å away from the polymerase binding site.21 X-ray crystallographic studies have revealed the prominent interactions of the inhibitor within the non-nucleoside inhibitor binding pocket (NNIBP) of the protein and have facilitated the design of more effective inhibitors.22 The NNIBP does not exist in the unliganded RT and is produced by binding of the ligand with the side chains of aromatic (including Y181 and Y188) and hydrophobic amino-acid residues of the viral protein.23,24 NNRTI resistance mutations influence the binding of the inhibitors to their binding pocket either by changing the size, shape and polarity of the NNIBP or by affecting the entry of NNRTIs into this pocket.25 Among the various FDA-approved NNRTIs, nevirapine (Viramune/Viramune XR) is a highly effective inhibitor, which has emerged as a key drug for the prevention of viral transmission. Recent studies have suggested that nevirapine is more effective in crossing the blood–brain barrier.26 Another NNRTI, delavirdine, which is bulkier than nevirapine, exhibits better interactions with RT, viz. hydrogen bond interactions with K103 and hydrophobic interactions with P236.27 Efavirenz (Sustiva, Stocrin, EFV, DMP-266) is a potent NNRTI that binds to HIV-1 RT at a site distinct from the polymerase catalytic site, and it has also been found to be effective when combined with either nevirapine, nelfinavir or indinavir.28,29 Etravirine (ETR/TMC125), a second-generation NNRTI, exhibits an enhanced barrier to resistance and is found to be extremely effective in achieving viral suppression as well as improving immunity in treatment-experienced HIV-infected patients.30 Among the newly discovered NNRTIs, rilpivirine is an antiretroviral exhibiting better bioavailability and easier formulation and administration compared to etravirine.31,32 The FDA-approved NNRTIs are given in Table 1.
S. no. | Name | Structure | Approval status |
---|---|---|---|
1. | Nevirapine (BI-RG-587)/Viramune/Viramune XR; 11-cyclopropyl-5,11-dihydro-4-methyl-6H-dipyrido[3,2-b:2′,3′-e][1,4]diazepin-6-one | | Approved in 1996/extended release 2011 |
2. | Delavirdine/DLV/Rescriptor; N-[2-({4-[3-(propan-2-ylamino)pyridin-2-yl]piperazin-1-yl}carbonyl)-1H-indol-5-yl]methanesulfonamide | | Approved in 1997 |
3. | Efavirenz (DMP266)/Stocrin™/Sustiva™; (S)-6-chloro-4-(cyclopropylethynyl)-1,4-dihydro-4-(trifluoromethyl)-2H-3,1-benzoxazin-2-one | | Approved in 1998 |
4. | R165335/Etravirine/TMC125/Intelence™; 4-[[6-amino-5-bromo-2-[(4-cyanophenyl)amino]-4-pyrimidinyl]oxy]-3,5-dimethylbenzonitrile | | Approved in 2008 |
5. | R278474/Rilpivirine/TMC278/Edurant; 4-[[4-[4-[(1E)-2-cyanoethenyl]-2,6-dimethylphenyl]amino]-2-pyrimidinyl]amino]benzonitrile | | Approved in 2011 |
a http://aidsinfo.nih.gov/education-materials/fact-sheets/21/58/fda-approved-hiv-medicines, updated 28 November 2014.
ATP analogs have also been reported to inhibit HIV transcription.33 The latest entrant in the armamentarium of anti-retrovirals, currently undergoing clinical trials, is doravirine (MK-1439), which has robust antiviral activity and better tolerability.34 The 5-alkyl-2-alkylamino-6-(2,6-difluorophenylalkyl)-3,4-dihydropyrimidin-4(3H)-one (DABO) analogs effectively inhibit the replication of a variety of HIV-1 strains at the reverse transcriptase step.35 Efforts are ongoing to make structural changes so that the DABO analogs demonstrate enhanced potency. Several derivatives belonging to the rationally designed broad-spectrum NNRTIs have been synthesized, including dihydroalkyloxybenzyloxopyrimidines (O-DABOs), dihydroalkylthiobenzyloxopyrimidines (S-DABOs), dihydroalkylamino difluorobenzyloxopyrimidines (NH-DABOs), N,N-disubstituted amino(2,6-difluorophenylalkyl)-3,4-dihydropyrimidin-4(3H)-ones (F2-N,N-DABOs) and dihydro(alkyl thio)(naphthylmethyl)oxopyrimidines (DATNOS).36–42
The common structure of DABO consists of a pyrimidinone ring and a di-fluoro substituted aromatic ring attached through a substituted CH bridging group. The three substituents R and X are attached to the pyrimidinone ring while R′ is attached to the bridging CH group. Earlier studies on DABO are based on various SAR analyses, where the structural requirements for enhancing the biological activity have been quantitatively analyzed.43–46 The 3-dimensional RT complex-DABO crystal structure analysis has provided newer dimensions to interpretation of the drug–receptor interaction profile, and this has certainly aided in substantiating the SAR analyses for enhanced structural refinement.47–52
The present study deals with the design of novel NNRTIs by performing chemometric analyses of two important types of descriptors, namely (a) 3D pharmacophoric [dipole moment, SALL, HDALL, RALL and HAALL] and (b) 2D average alignment [octanol–water partition coefficient (ClogP)] descriptors, in order to understand the interaction potential of 5-alkyl-2-alkylamino-6-(2,6-difluorophenylalkyl)-3,4-dihydropyrimidin-4(3H)-one (DABO) analogs in the NNIBP.53 It is reported that docking templates can be very useful if the 3D structural conformation of a ligand is evident.54 These templates account for the vital contribution of the pharmacophores to the structural alignment of the ligand within the NNIBP. These pharmacophore-based templates assist in extracting the requisite information based on similarity overlap for each individual template point. Ligand flexibility is also considered during the template alignment procedure. Guided by the pharmacophoric template groups, the docking engine searches for poses similar to the reference ligand. The validation procedure focuses on the atomic overlap of each template group's centre with the ligand concerned.
The Gaussian function is used to evaluate how well an atom matches a group definition, based on its distance from the group centres. The four template groups incorporated in the present work are steric (SALL), hydrogen donor (HDALL), hydrogen acceptor (HAALL) and ring (RALL). In addition, two extremely relevant descriptors, the octanol–water partition coefficient (ClogP) (an ‘alignment-average’ descriptor) and the dipole moment (μ), a 3D parameter, are included for better assessment of anti-viral activity in the non-nucleoside inhibitor binding pocket (NNIBP).55 Optimization of physicochemical properties, such as lipophilicity, to increase ligand-lipophilicity efficiency (LLE) has been reported by Mowbray et al.56 These descriptors are selected to characterize the molecular architecture because of their capability to correlate diverse biochemical phenomena of the molecular series concerned. The relevant information obtained is assessed chemometrically to correlate the structural features with the anti-viral activity of the DABO derivatives.57
The chemometric models have established themselves soundly in quantifying the correlation between selected molecular aspects and their impact on the biological response.58–61 The development of these rational SAR models focuses on the necessary chemical features leading to a better pharmaco-toxico-kinetic profile of a lead candidate, curtailing irrelevant experimental determinations.62,63 These statistical validation methodologies form the basis of SAR studies, though there are still some limitations.64 The quantitative relationships between the molecular entity and its physicochemical and biological properties appear to be rather complex, and nonlinearity also prevails in many cases. Extracting pivotal information from a SAR model is therefore an aspect that needs serious attention.65,66
In the present study, non-linear (BPNN [back propagation neural network] and SVM [support vector machine]) and linear (MLR [multiple linear regression]) techniques are used for chemometric assessment of the training dataset. The results are validated using a test set. Finally, an external dataset (compounds with approximate inhibitory activity) and an in silico designed virtual dataset (VDS) of molecules are assessed and reported with their predicted activities. External and structural validation of the in silico designed ligands of the VDS is performed using a docking technique. The complete work performed is presented in Scheme 1.
The Gaussian formula, given in eqn (1), is used to determine the amount of overlap for the specific group centre:
(1) |
(2) |
πX = logP(C6H5X) − logP(C6H6) | (3) |
Rekker used a reductionist approach (eqn (4)), to derive the constants for carbon, hydrogen and for polar fragments.75 The fragment values (f) and interaction factors (F) had to be identified and evaluated by the relation reported by Hansch.76
(4) |
Chou and Jurs used a constructionist approach to calculate logP, considering the hydrophobic portions of solutes as “hydrocarbon-like” and defining these carbon and hydrogen fragment values as being truly constant; the value calculated in this way is reported as ClogP.77
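For orientation, fragment-based schemes of this kind are usually written as a sum of fragment constants plus correction terms. The expression below is the commonly quoted fragment-additive form, offered as an assumption about the general shape of such a relation and not as a reproduction of eqn (4). The fragment values $f_n$ and interaction factors $F_m$ are the quantities named in the text; the occurrence counts $a_n$ and $b_m$ are symbols introduced here for illustration.

```latex
\log P \;=\; \sum_{n} a_n\, f_n \;+\; \sum_{m} b_m\, F_m
```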
The molecular descriptors are encoded as input neurons, and each input is multiplied by the weight of its connection. A sigmoid non-linear transfer function is then applied, together with a suitable bias that shifts the transfer function to either side. The resulting values are then multiplied by the output weights, transformed and interpreted.93 In this supervised learning method, the residual error, i.e. the difference between the experimental quantity and the network's predicted quantity, is evaluated. This calculated error is propagated backward through the network and the weights are adjusted accordingly, so that the residual error is minimized when the same input pattern is presented again. This procedure is repeated until a relationship is derived or it is established that none exists.94–97 Neural network models are very efficient in handling and extracting non-linear relationships. The most crucial choice in setting up a neural network is the number of neurons in the hidden layer, which is governed by the ratio rho (ρ); this ratio is maintained within the range 1.0–3.0 to get sensible results.98,99
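The forward pass and the ρ criterion described above can be sketched as follows. This is a minimal illustration, not the network used in the study: the definition of ρ as the number of training samples divided by the number of adjustable weights is an assumption (the text does not define it), the output neuron is kept linear for simplicity, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    """Sigmoid transfer function applied in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: inputs -> sigmoid hidden layer -> linear output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def n_weights(n_inputs, n_hidden, n_outputs=1):
    """Adjustable parameters, counting one bias per hidden and output neuron."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs

def rho(n_samples, n_inputs, n_hidden):
    # Assumed definition: training samples per adjustable weight.
    return n_samples / n_weights(n_inputs, n_hidden)

# 36 training compounds and 6 descriptors, as in this study.
allowed = [h for h in range(1, 11) if 1.0 <= rho(36, 6, h) <= 3.0]
```

Under these assumptions, `allowed` lists the hidden-layer sizes that keep ρ inside the 1.0–3.0 window; the hidden-layer size actually used by the authors is not stated in this section.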
y = b1x1 + b2x2 + b3x3 + … + bnxn + c |
MLR is the most widely used technique owing to its speed and easy interpretability. However, for complex systems such as a biological system, a linear combination of descriptor information can often lead to a model with limited accuracy, simply because linearity is assumed in the data.
The ligand selected as the reference for template alignment in the docking wizard is compound no. 26, which has the highest activity (pEC50 = 8.30) in the dataset. Four template groups are used for comparing the reference ligand with the rest of the ligands of the dataset: (i) steric, consisting of all 22 non-hydrogen atoms, which are used for shape matching only; (ii) hydrogen donor, consisting of 3 hydrogen donor functional groups; (iii) hydrogen acceptor, consisting of 1 hydrogen bond acceptor group; and (iv) ring, consisting of the 12 atoms that are part of rings. Any atom closer than 2.0 Å to the existing centre of a template group is accepted as equal to that centre, and the optimal match corresponds to a value of 1.0.
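A Gaussian overlap score of this general kind can be sketched as below. This is an illustrative assumption, not the docking software's actual scoring function: the width parameter `r0` simply borrows the 2.0 Å figure from the text, the score is averaged over template centres so that the reference ligand matched against itself gives 1.0, and the coordinates are made up.

```python
import math

def gaussian_overlap(distance, r0=2.0):
    """Gaussian weight for an atom at `distance` (in Å) from a group centre."""
    return math.exp(-(distance / r0) ** 2)

def template_score(centres, atoms, r0=2.0):
    """Mean, over template centres, of the best Gaussian match among ligand atoms."""
    total = 0.0
    for centre in centres:
        total += max(gaussian_overlap(math.dist(centre, atom), r0) for atom in atoms)
    return total / len(centres)

# Made-up coordinates standing in for one template group of the reference ligand.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.4, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]

self_score = template_score(reference, reference)   # perfect overlap with itself
shifted_score = template_score(reference, shifted)  # partial overlap after a 1 Å shift
```

The normalisation mirrors the descriptor values reported for the reference compound, for which all four template scores equal 1.000; a displaced ligand scores strictly between 0 and 1.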
Fig. 1A–D shows all the template groups derived from compound number 26 (pEC50 = 8.30) and the contribution of each specific atom to a template group is also evident from this figure.
Fig. 1 (A–D) Template groups derived from the reference ligand [compound no. 26, (pEC50 = 8.3010)] of DABO derivatives. |
A training set of 36 ligands is considered for the 3D similarity-based alignment using template groups and is presented in Table 2. Table 2 lists the substituents (R, R′, X and Y) on the parent structure, the experimental and calculated anti-HIV-1 activities (pEC50), and all the descriptors, namely the octanol–water partition coefficient (ClogP), the dipole moment (μ) in debye units and the four 3D descriptors obtained from the group-centre overlap approach [steric (SALL), hydrogen donor (HDALL), hydrogen acceptor (HAALL) and ring (RALL)]. Table 2 also presents the compounds of the test set and the outlier. The structure shown in blue is the most active ligand, which is used as the template for estimation of the 3D pharmacophoric descriptors.
a pEC50 = −logEC50 (where EC50 is the effective concentration of a compound required to achieve 50% protection of MT-4 cells against the cytopathic effect of HIV-1). The data points marked α represent the test set and β the outlier.

S. no. | R | R′ | X | Y | Exp-pEC50 | ClogP | SALL | HDALL | HAALL | RALL | μ | Calculated pEC50 (MLR) | Calculated pEC50 (BPNN) | Calculated pEC50 (SVM) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | H | H | Iso-propyl | H | 6.398 | 2.562 | 0.699 | 0.758 | 0.050 | 0.718 | 2.768 | 6.284 | 6.063 | 6.400 |
2 | H | H | sec-Butyl | H | 6.886 | 3.091 | 0.731 | 0.715 | 0.161 | 0.726 | 3.348 | 6.712 | 6.573 | 6.760 |
3 | H | H | Methoxyethyl | H | 6.097 | 1.754 | 0.657 | 0.510 | 0.045 | 0.592 | 4.692 | 5.881 | 5.873 | 6.090 |
4 | H | H | Methylthioethyl | H | 6.523 | 2.472 | 0.617 | 0.429 | 0.035 | 0.546 | 4.191 | 6.027 | 6.009 | 6.187 |
5 | H | H | Cyclohexyl | H | 6.854 | 3.755 | 0.743 | 0.879 | 0.378 | 0.803 | 3.342 | 7.339 | 7.190 | 7.384 |
6 | H | H | Phenyl | H | 6.538 | 3.703 | 0.711 | 0.667 | 0.337 | 0.770 | 3.060 | 6.762 | 6.655 | 6.750 |
7 | Me | H | n-Propyl | H | 7.699 | 3.231 | 0.846 | 0.890 | 0.713 | 0.868 | 2.680 | 7.409 | 7.317 | 7.404 |
8 | Me | H | Iso-propyl | H | 7.523 | 3.011 | 0.733 | 0.907 | 0.215 | 0.750 | 3.269 | 7.058 | 6.936 | 7.159 |
9 | Me | H | n-Butyl | H | 7.699 | 3.760 | 0.669 | 0.763 | 0.167 | 0.658 | 3.259 | 7.200 | 7.057 | 7.308 |
10 | Me | H | sec-Butyl | H | 7.222 | 3.540 | 0.706 | 0.858 | 0.157 | 0.657 | 3.262 | 7.448 | 7.285 | 7.565 |
11 | Me | H | Cyclohexyl | H | 7.523 | 4.204 | 0.753 | 0.920 | 0.384 | 0.767 | 3.261 | 7.848 | 7.588 | 7.890 |
12 | Me | H | Phenyl | H | 7.222 | 4.152 | 0.742 | 0.748 | 0.449 | 0.780 | 3.000 | 7.344 | 7.192 | 7.311 |
13 | Me | H | 2,6-F2-phenyl | H | 6.854 | 4.488 | 0.474 | 0.068 | 0.490 | 0.326 | 1.672 | 6.619 | 6.632 | 6.798 |
14 | Me | H | 2-Cl-phenyl | H | 6.886 | 4.902 | 0.440 | 0.064 | 0.385 | 0.308 | 1.799 | 6.699 | 6.683 | 6.929 |
15 | Me | H | 3-Cl-phenyl | H | 6.699 | 4.902 | 0.463 | 0.054 | 0.448 | 0.322 | 2.749 | 6.878 | 6.772 | 6.910 |
16 | Me | H | 4-Cl-phenyl | H | 7.000 | 4.902 | 0.462 | 0.079 | 0.487 | 0.325 | 3.442 | 7.038 | 6.835 | 6.993 |
17 | Me | H | 4-Me-phenyl | H | 6.745 | 4.651 | 0.424 | 0.108 | 0.497 | 0.338 | 2.733 | 6.691 | 6.662 | 6.739 |
18 | Me | H | Phenylmethyl | H | 6.301 | 3.621 | 0.650 | 0.524 | 0.106 | 0.584 | 2.699 | 6.684 | 6.566 | 6.733 |
19 | H | Me | Methyl | H | 6.959 | 2.123 | 0.867 | 0.940 | 0.987 | 0.952 | 2.591 | 6.893 | 6.823 | 7.044 |
20 | H | Me | n-Propyl | H | 7.699 | 3.181 | 0.829 | 0.848 | 0.745 | 0.800 | 2.585 | 7.511 | 7.423 | 7.444 |
21 | H | Me | Iso-propyl | H | 7.523 | 2.961 | 0.946 | 0.964 | 0.901 | 0.967 | 2.550 | 7.557 | 7.453 | 7.723 |
22 | H | Me | n-Butyl | H | 7.523 | 3.710 | 0.697 | 0.692 | 0.477 | 0.564 | 2.578 | 7.663 | 7.527 | 7.517 |
23 | H | Me | sec-Butyl | H | 7.398 | 3.490 | 0.870 | 0.847 | 0.718 | 0.808 | 2.581 | 7.775 | 7.615 | 7.713 |
24 | Me | Me | Methyl | H | 7.523 | 2.572 | 0.861 | 0.892 | 0.888 | 0.865 | 2.694 | 7.282 | 7.232 | 7.277 |
25 | Me | Me | n-Propyl | H | 8.222 | 3.630 | 0.910 | 0.936 | 0.875 | 0.896 | 2.684 | 7.986 | 7.772 | 8.035 |
26 | Me | Me | Iso-propyl | H | 8.301 | 3.410 | 1.000 | 1.000 | 1.000 | 1.000 | 2.651 | 8.041 | 7.813 | 8.286 |
27 | Me | Me | n-Butyl | H | 8.000 | 4.159 | 0.782 | 0.811 | 0.646 | 0.687 | 2.676 | 8.146 | 7.847 | 8.004 |
28 | Me | Me | Methylthioethyl | H | 6.854 | 3.320 | 0.735 | 0.768 | 0.582 | 0.636 | 3.686 | 7.713 | 7.517 | 7.491 |
29 | Me | Me | Cyclohexyl | H | 7.585 | 4.603 | 0.834 | 0.685 | 0.618 | 0.856 | 2.708 | 7.576 | 7.369 | 7.557 |
30 | Me | Me | Phenyl | H | 7.620 | 4.551 | 0.807 | 0.694 | 0.503 | 0.766 | 2.404 | 7.705 | 7.496 | 7.672 |
31 | H | H | — | Me | 4.260 | 1.149 | 0.635 | 0.531 | 0.123 | 0.674 | 2.803 | 5.017 | 4.869 | 4.954 |
32 | H | Me | — | Me | 5.620 | 1.548 | 0.741 | 0.484 | 0.905 | 0.784 | 2.092 | 5.605 | 5.387 | 5.640 |
33 | H | H | Cyclopentyl | H | 7.046 | 3.196 | 0.743 | 0.870 | 0.331 | 0.805 | 3.356 | 6.984 | 6.870 | 7.033 |
34 | Me | H | Cyclopentyl | H | 7.699 | 3.645 | 0.758 | 0.923 | 0.349 | 0.775 | 3.268 | 7.525 | 7.359 | 7.572 |
35 | H | Me | Cyclopentyl | H | 7.523 | 3.595 | 0.557 | 0.773 | 0.232 | 0.506 | 3.348 | 7.386 | 7.271 | 7.502 |
36 | Me | Me | Cyclopentyl | H | 7.699 | 4.044 | 0.601 | 0.696 | 0.116 | 0.509 | 2.748 | 7.433 | 7.291 | 7.580 |
α37 | H | H | Ethyl | H | 6.097 | 2.253 | 0.692 | 0.776 | 0.081 | 0.738 | 2.770 | 6.081 | 5.830 | 6.178 |
α38 | H | H | n-Propyl | H | 6.959 | 2.782 | 0.708 | 0.630 | 0.106 | 0.718 | 2.768 | 6.181 | 5.991 | 6.216 |
α39 | H | H | Cyclopropyl | H | 5.499 | 2.078 | 0.732 | 0.909 | 0.287 | 0.778 | 3.293 | 6.520 | 6.366 | 6.549 |
α40 | H | H | n-Butyl | H | 7.000 | 3.311 | 0.640 | 0.471 | 0.039 | 0.570 | 3.328 | 6.445 | 6.341 | 6.536 |
α41 | Me | H | Methyl | H | 6.398 | 2.173 | 0.794 | 0.818 | 0.711 | 0.804 | 2.689 | 6.752 | 6.658 | 6.655 |
α42 | Me | H | Methylthioethyl | H | 7.796 | 2.921 | 0.669 | 0.752 | 0.192 | 0.665 | 3.799 | 6.799 | 6.694 | 6.882 |
α43 | H | Me | Methylthioethyl | H | 7.886 | 2.871 | 0.651 | 0.651 | 0.423 | 0.514 | 3.584 | 7.242 | 7.153 | 7.070 |
α44 | H | Me | Cyclohexyl | H | 7.658 | 4.154 | 0.765 | 0.640 | 0.501 | 0.794 | 2.616 | 7.120 | 6.998 | 7.077 |
α45 | H | H | H | H | 3.967 | 0.912 | 0.570 | 0.770 | 0.016 | 0.554 | 2.962 | 5.622 | 5.328 | 5.757 |
β46 | H | H | Methyl | H | 5.824 | 1.724 | 0.416 | 0.089 | 0.134 | 0.367 | 2.729 | 4.692 | 4.942 | 4.582 |
Descriptor | r | r2 | Impact coefficient (IC) | Equation |
---|---|---|---|---|
ClogP | 0.502 | 0.252 | 0.413 | 0.413 × ClogP + 5.664 |
SALL | 0.426 | 0.182 | 2.290 | 2.290 × SALL + 5.469 |
HDALL | 0.466 | 0.217 | 1.264 | 1.264 × HDALL + 6.250 |
HAALL | 0.388 | 0.151 | 1.031 | 1.031 × HAALL + 6.631 |
RALL | 0.293 | 0.085 | 1.178 | 1.178 × RALL + 6.296 |
μ | 0.107 | 0.011 | −0.141 | −0.141 × μ + 7.515 |
Perusal of the correlation coefficients of the descriptors suggests that ClogP exhibits the highest correlation potential while the dipole moment shows the lowest under linearity conditions. Considering the uni-variate relationship of each descriptor with the antiviral activity of the training set of the DABO analogues, the following order of correlation (r2) is observed.
ClogP > HDALL > SALL > HAALL > RALL > μ |
However, the impact coefficients (IC) of the descriptors follow a different order:
SALL > HDALL > RALL > HAALL > ClogP > μ |
The uni-variate linear correlation analyses show that although ClogP has the highest r2 (0.252), indicating relatively high linear relatedness, its impact (coefficient of the descriptor = 0.413) on the biological response is not the highest. SALL shows a weaker linear relationship with the biological response (r2 = 0.182), yet it has the highest impact (coefficient of the descriptor = 2.290) on the antiviral activity.
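The two orderings can be checked directly from the tabulated values. The short sketch below copies r2 and the impact coefficient from the uni-variate table above and sorts on each; ranking the impact by magnitude (so that the negative coefficient of μ is compared on the same footing) is an assumption made here.

```python
# (r2, impact coefficient) per descriptor, copied from the uni-variate table above.
univariate = {
    "ClogP": (0.252, 0.413),
    "SALL": (0.182, 2.290),
    "HDALL": (0.217, 1.264),
    "HAALL": (0.151, 1.031),
    "RALL": (0.085, 1.178),
    "mu": (0.011, -0.141),
}

# Order by linear relatedness (r2), highest first.
by_r2 = sorted(univariate, key=lambda d: univariate[d][0], reverse=True)

# Order by impact, taken here as the magnitude of the regression coefficient.
by_impact = sorted(univariate, key=lambda d: abs(univariate[d][1]), reverse=True)
```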
Table 4 presents detailed analyses of the non-linear and linear chemometric methods used in the present investigation. In Table 4, ‘K’ is the number of descriptors, ‘r’ is the correlation coefficient, ‘r2’ is the coefficient of determination, ‘q2’ is the cross-validated ‘r2’ from the leave-one-out (LOO) and N-fold cross-validation (N-CV) procedures, rho (ρ) is the Spearman rank correlation coefficient, MSE is the mean squared error and PRESS is the predictive residual sum of squares.
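The leave-one-out statistics can be illustrated with a small sketch. It uses a single descriptor and ordinary least squares purely for brevity (the models in this study use six descriptors), the data are synthetic, and the convention q2 = 1 − PRESS/SS_tot with the full-sample mean is an assumption, since the exact formula used is not given in this section.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_statistics(xs, ys):
    """Return (PRESS, MSE, q2) from leave-one-out cross-validation."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without sample i, then predict the held-out sample.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return press, press / len(ys), 1.0 - press / ss_tot

# Synthetic, nearly linear data used only to exercise the functions.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
press, mse, q2 = loo_statistics(xs, ys)
```

The relation MSE = PRESS/n used here agrees with the tabulated values: for the MLR leave-one-out row, PRESS = 5.791 with n = 36 gives 0.161.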
S. no. | Model | Procedure | K | r | r2 | radj2 | Spearman (rho) | PRESS | MSE | q2 |
---|---|---|---|---|---|---|---|---|---|---|
1 | MLR | MS | 6 | 0.912 | 0.832 | 0.798 | 0.813 | — | 0.096 | 0.832 |
2 | MLR | LOO | 6 | 0.850 | 0.723 | 0.666 | 0.773 | 5.791 | 0.161 | 0.719 |
3 | MLR | NCV (N = 10) | 6 | 0.859 | 0.739 | 0.684 | 0.789 | 5.465 | 0.152 | 0.735 |
1 | BPNN | MS | 6 | 0.923 | 0.852 | — | 0.827 | — | 0.104 | 0.818 |
2 | BPNN | LOO | 6 | 0.833 | 0.693 | — | 0.749 | 6.540 | 0.182 | 0.683 |
3 | BPNN | NCV (N = 10) | 6 | 0.799 | 0.639 | — | 0.727 | 8.263 | 0.230 | 0.600 |
1 | SVM (epsilon-radial) 36SV:RBFK | MS | 6 | 0.939 | 0.883 | — | 0.849 | — | 0.071 | 0.876 |
2 | SVM (epsilon-radial) 36SV:RBFK | LOO | 6 | 0.895 | 0.802 | — | 0.789 | — | 0.117 | 0.796 |
3 | SVM (epsilon-radial) 36SV:RBFK | NCV (N = 10) | 6 | 0.899 | 0.809 | — | 0.785 | — | 0.112 | 0.805 |
1 | SVM (epsilon-polynomial) 34SV:PK | MS | 6 | 0.332 | 0.111 | — | 0.676 | — | 0.587 | −0.023 |
2 | SVM (epsilon-polynomial) 34SV:PK | LOO | 6 | −0.643 | 0.414 | — | −0.674 | — | 0.589 | −0.027 |
3 | SVM (epsilon-polynomial) 34SV:PK | NCV (N = 10) | 6 | −0.390 | 0.157 | — | −0.340 | — | 0.666 | −0.161 |
1 | SVM (epsilon-sigmoid) 36SV:SK | MS | 6 | 0.908 | 0.824 | — | 0.803 | — | 0.103 | 0.820 |
2 | SVM (epsilon-sigmoid) 36SV:SK | LOO | 6 | 0.846 | 0.716 | — | 0.751 | — | 0.170 | 0.704 |
3 | SVM (epsilon-sigmoid) 36SV:SK | NCV (N = 10) | 6 | 0.818 | 0.669 | — | 0.731 | — | 0.199 | 0.653 |
1 | SVM (epsilon-linear) 36SV:LK | MS | 6 | 0.906 | 0.820 | — | 0.800 | — | 0.105 | 0.817 |
2 | SVM (epsilon-linear) 36SV:LK | LOO | 6 | 0.832 | 0.692 | — | 0.739 | — | 0.184 | 0.678 |
3 | SVM (epsilon-linear) 36SV:LK | NCV (N = 10) | 6 | 0.813 | 0.660 | — | 0.710 | — | 0.203 | 0.646 |
1 | SVM (nu-radial) 22SV:RBFK | MS | 6 | 0.930 | 0.864 | — | 0.820 | — | 0.080 | 0.861 |
2 | SVM (nu-radial) 22SV:RBFK | LOO | 6 | 0.872 | 0.760 | — | 0.768 | — | 0.140 | 0.757 |
3 | SVM (nu-radial) 22SV:RBFK | NCV (N = 10) | 6 | 0.878 | 0.771 | — | 0.772 | — | 0.132 | 0.770 |
1 | SVM (nu-polynomial) 18SV:PK | MS | 6 | 0.406 | 0.165 | — | 0.689 | — | 0.583 | −0.016 |
2 | SVM (nu-polynomial) 18SV:PK | LOO | 6 | −0.804 | 0.646 | — | −0.791 | — | 0.615 | −0.072 |
3 | SVM (nu-polynomial) 18SV:PK | NCV (N = 10) | 6 | −0.423 | 0.179 | — | −0.372 | — | 0.609 | −0.062 |
1 | SVM (nu-sigmoid) 22SV:SK | MS | 6 | 0.910 | 0.828 | — | 0.818 | — | 0.099 | 0.828 |
2 | SVM (nu-sigmoid) 22SV:SK | LOO | 6 | 0.884 | 0.782 | — | 0.793 | — | 0.125 | 0.781 |
3 | SVM (nu-sigmoid) 22SV:SK | NCV (N = 10) | 6 | 0.851 | 0.724 | — | 0.768 | — | 0.160 | 0.721 |
1 | SVM (nu-linear) 22SV:LK | MS | 6 | 0.911 | 0.829 | — | 0.814 | — | 0.098 | 0.829 |
2 | SVM (nu-linear) 22SV:LK | LOO | 6 | 0.876 | 0.767 | — | 0.788 | — | 0.134 | 0.766 |
3 | SVM (nu-linear) 22SV:LK | NCV (N = 10) | 6 | 0.850 | 0.723 | — | 0.761 | — | 0.161 | 0.719 |
a MS = manual selection, RBFK = radial basis function kernel, PK = polynomial kernel, SK = sigmoid kernel, LK = linear kernel.
Fig. 2 Relevance scores of the 3D descriptors of DABO derivatives as obtained from the BPNN regression analysis. |
(n = 36, r = 0.923, r2 = 0.852, Spearman (rho) = 0.827, MSE = 0.104).
Of these descriptors, RALL has the most prominent enhancing effect on the anti-viral activity, followed by the lipophilicity (ClogP). HDALL shows a better effect on antiviral activity than HAALL. Leave-one-out (LOO) and N-cross validated (N-CV) methods are used to validate the results. Dipole moment (μ), a 3D electrostatic descriptor, exhibits the lowest impact on the antiviral activity.
pEC50 = +0.519 (± 0.631)ClogP + 3.908 (± 0.728)SALL + 2.320 (± 0.854)HDALL + 0.789 (± 0.297)HAALL − 4.099 (± 1.018)RALL + 0.126 (± 0.096)μ + 3.018 | (5) |
(n = 36, r = 0.912, r2 = 0.832, radj2 = 0.798, Spearman (rho) = 0.813, MSE = 0.096).
The high positive coefficients of SALL and HDALL show that they impart an enhancing effect, whereas the high negative coefficient of RALL shows that it imparts an adverse effect on the anti-HIV-1 activity. ClogP, HAALL and the dipole moment (μ) have comparatively smaller enhancing effects.
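As a worked check, eqn (5) can be evaluated with the descriptor values listed in Table 2. The sketch below does this for compound nos 1 and 26; the function and variable names are illustrative only, and the coefficients are taken from eqn (5) without their error bounds.

```python
# Coefficients and intercept of eqn (5).
COEFFS = {
    "ClogP": 0.519,
    "SALL": 3.908,
    "HDALL": 2.320,
    "HAALL": 0.789,
    "RALL": -4.099,
    "mu": 0.126,
}
INTERCEPT = 3.018

def mlr_pec50(descriptors):
    """Evaluate the MLR model of eqn (5) for one compound."""
    return INTERCEPT + sum(COEFFS[name] * descriptors[name] for name in COEFFS)

# Descriptor values copied from Table 2.
compound_1 = {"ClogP": 2.562, "SALL": 0.699, "HDALL": 0.758,
              "HAALL": 0.050, "RALL": 0.718, "mu": 2.768}
compound_26 = {"ClogP": 3.410, "SALL": 1.000, "HDALL": 1.000,
               "HAALL": 1.000, "RALL": 1.000, "mu": 2.651}

pred_1 = mlr_pec50(compound_1)
pred_26 = mlr_pec50(compound_26)
```

Both values agree with the MLR column of Table 2 (6.284 and 8.041) to within the rounding of the published coefficients.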
Fig. 3 presents the graph of experimental versus calculated activity (pEC50) derived from the best MLR, BPNN and SVM models for the training set. A good fit is observed between the observed and calculated activities, reconfirming the robustness of the methods used.
Fig. 3 A graph of comparative analyses of observed and calculated pEC50 (MLR, BPNN and SVM) values for the training set of DABO derivatives. |
Compound no. 46 is observed as an outlier. It has unsubstituted R, bridged R′ and methyl moiety present at X, and has a moderate value of antiviral activity.
Fig. 4 A graph of comparative analyses of observed and calculated pEC50 (MLR, BPNN and SVM) values for the test set of DABO derivatives. |
It is observed that the steric attributes are important in elucidating the interaction mechanisms in the case of DABO derivatives and the present findings establish the role of steric interactions very well. The hydrogen donor attribute behaves better than the hydrogen acceptor attribute. It is again well evident that the hydrogen donor ability of 3-NH is crucial to the antiviral activity of the DABO derivatives.102
a Compounds with activity reported in approximate terms. b Virtually designed compounds.

Comp. no. | R | R′ | X | Y | R′′ | pEC50 (exp) | ClogP | SALL | HDALL | HAALL | RALL | μ | pEC50 (predicted) SVM | pEC50 (predicted) BPNN | pEC50 (predicted) MLR |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
47a | Me | H | 1-Naphthyl | H | — | >3.699 | 5.326 | 0.559 | 0.450 | 0.373 | 0.419 | 3.599 | 7.971 | 7.790 | 8.042 |
48a | Me | Me | H | Me | — | >4.032 | 1.997 | 0.718 | 0.383 | 0.731 | 0.658 | 2.196 | 5.783 | 6.001 | 5.906 |
49a | H | H | CN | H | — | >3.699 | 0.807 | 0.624 | 0.819 | 0.025 | 0.626 | 1.218 | 5.515 | 5.057 | 5.383 |
50a | Me | H | CN | H | — | >3.699 | 1.256 | 0.813 | 0.816 | 0.757 | 0.818 | 2.718 | 6.218 | 6.466 | 6.328 |
51a | H | H | NO2 | H | — | >3.699 | −2.228 | 0.664 | 0.777 | 0.185 | 0.638 | 3.031 | 3.942 | 4.251 | 4.174 |
52a | Me | H | NO2 | H | — | >3.699 | 1.779 | 0.837 | 0.811 | 0.745 | 0.796 | 5.221 | 6.973 | 7.194 | 7.079 |
53a | H | H | COCH2CH3 | H | — | >3.699 | 2.201 | 0.718 | 0.871 | 0.061 | 0.730 | 3.998 | 6.790 | 6.706 | 6.548 |
54b | i-Propenyl | Me | 2-Methylbutyl | H | 2,6-di-Cl | — | 6.370 | 0.543 | 0.187 | 0.010 | 0.257 | 1.528 | 8.452 | 7.628 | 8.026 |
55b | i-Propenyl | H | 2-Methylbutyl | H | 2-Br,4-F,6-Cl | — | 6.270 | 0.505 | 0.116 | 0.002 | 0.226 | 1.104 | 8.288 | 7.462 | 7.730 |
56b | i-Propenyl | i-Propyl | i-Pentyl | H | 2,6-Di-Cl | — | 6.370 | 0.473 | 0.148 | 0.005 | 0.226 | 1.612 | 8.307 | 7.460 | 7.796 |
57b | i-Propyl | i-Propyl | 2-Oxobutyl | H | 2,4,6-Tri-Cl | — | 6.300 | 0.505 | 0.251 | 0.013 | 0.245 | 1.655 | 8.508 | 7.659 | 8.058 |
58b | i-Propyl | i-Propyl | 2-Oxobutyl | H | 2,6-Di-Br | — | 5.880 | 0.553 | 0.283 | 0.013 | 0.279 | 1.912 | 8.328 | 7.620 | 7.997 |
59b | i-Propenyl | i-Propyl | 2-Oxobutyl | H | 2,6-Di-Br,4-F | — | 5.770 | 0.565 | 0.316 | 0.010 | 0.280 | 2.526 | 8.402 | 7.655 | 8.132 |
A virtual dataset (VDS) of 150 compounds structurally similar to the DABO derivatives is created, and their anti-viral activities are predicted using the best derived SVM (ε-RBFK) model; for the sake of comparison, the activities predicted using BPNN and MLR are also shown. Care is taken to obtain molecules with not only better activities but also favourable substitution(s). Finally, six compounds (comp. nos 54–59) with predicted antiviral activity (pEC50) higher than 8.25 are extracted, which could be earmarked for synthesis and subsequent development as lead or probable drug derivatives. The results suggest that substitution of the di-fluoro groups at the 2 and 6 positions in ring ‘B’ by chloro and/or bromo, tri-substitution at the 2, 4 and 6 positions by Br/Cl/F, and smaller modifications in R and X yield better inhibitory activity (pEC50). Therefore, compounds designed in silico may reduce the time frame in the search for better drug-like candidates. Table 5 records the six structures that are virtually generated using the DABO-like template. The template used for deriving the virtual data set is also given in Table 5. Table 5 also records all the descriptors and the predicted effective concentration (pEC50) for the virtual data set.
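The selection step can be reproduced on the rows that are tabulated. The sketch below applies the 8.25 threshold to the SVM-predicted pEC50 values listed in Table 5; only these 13 compounds are available here, not the full 150-compound virtual dataset, and the variable names are illustrative.

```python
# (compound no., SVM-predicted pEC50) copied from Table 5.
svm_predictions = [
    (47, 7.971), (48, 5.783), (49, 5.515), (50, 6.218), (51, 3.942),
    (52, 6.973), (53, 6.790), (54, 8.452), (55, 8.288), (56, 8.307),
    (57, 8.508), (58, 8.328), (59, 8.402),
]

THRESHOLD = 8.25  # selection cut-off quoted in the text

# Compounds whose predicted activity exceeds the cut-off.
leads = [compound for compound, pec50 in svm_predictions if pec50 > THRESHOLD]

# Compound with the highest SVM-predicted activity.
best = max(svm_predictions, key=lambda item: item[1])
```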
Fig. 5 presents the graphical comparison of predicted values of pEC50, estimated using the SVM technique for the compounds with approximate activity and that of the virtual dataset.
Fig. 5 A graph depicting the estimated pEC50 values for compounds with approximate activity and the VDS of DABO analogues. |
The structures of the virtually designed compounds, exhibiting pEC50 values > 8.25, that are identified as lead/drug molecules from the virtual data set are presented in Fig. 6.
Table 6 presents the respective distances (in Å) from the nearest amino acid residues of the binding pocket for TNK-651, the most active ligand (26), compound no. 47 and the virtually designed compounds (54–59). From Table 6, it is clear that compound no. 26 is better bound than TNK-651 in the NNIBP, owing to closer interaction with the surrounding amino acid residues. Compound no. 47 (which has the highest predicted activity amongst the compounds whose activity is reported in approximate terms) and compound no. 26 exhibit similar interactions based on their respective distances from the amino acids referred to. The comparison made above suggests that all compounds of the VDS lie closer to the surrounding amino acids, resulting in better interactions. Thus, the order of proximity, and thereby the possibility of closer interaction in the NNIBP, is as follows: VDS compounds > 26 ≅ 47 > TNK-651. Among all the compounds of the VDS, it is observed that compound no. 57 (whose predicted activity is also the highest, 8.508) lies relatively nearer to the residues than the other VDS compounds, and thus its better binding can be interpreted in terms of more favourable interactions. Compared to compound no. 26, five of the VDS compounds show better interactions with the amino acids, while compound no. 55, whose activity is a little lower, shows nearly similar binding.
Amino acid residue (distance in Å) | TNK-651 | 26 | 47a | 54 | 55 | 56 | 57 | 58 | 59 |
---|---|---|---|---|---|---|---|---|---|
LYS101 | 2.606 | 1.626 | 1.997 | 1.619 | 1.643 | 1.643 | 1.466 | 1.449 | 1.879 |
LYS103 | 4.310 | 3.868 | 3.768 | 4.195 | 4.096 | 3.817 | 3.578 | 4.169 | 3.558 |
LEU234 | 3.576 | 4.236 | 2.930 | 4.626 | 3.701 | 4.256 | 3.252 | 3.299 | 3.300 |
PRO236 | 3.610 | 3.590 | 4.718 | 3.538 | 3.581 | 4.094 | 3.630 | 3.596 | 3.753 |
TRP229 | 5.026 | 4.747 | 3.812 | 4.367 | 4.945 | 4.229 | 3.321 | 4.101 | 4.750 |
TYR188 | 4.156 | 3.700 | 4.054 | 2.594 | 3.327 | 2.489 | 3.568 | 3.032 | 3.298 |
TYR318 | 2.839 | 3.302 | 3.297 | 3.636 | 1.618 | 3.379 | 3.613 | 1.787 | 3.701 |
a The compounds whose activity is reported in approximate terms.
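One simple way to summarise the distances in Table 6 is the mean distance per compound over the seven listed residues. The sketch below computes it; the mean is a crude proxy that weights all residues equally and ignores the type of interaction, and it is an assumption made here, not the criterion used by the authors.

```python
# Distances (Å) copied from Table 6, in the residue order
# LYS101, LYS103, LEU234, PRO236, TRP229, TYR188, TYR318.
distances = {
    "TNK-651": [2.606, 4.310, 3.576, 3.610, 5.026, 4.156, 2.839],
    "26": [1.626, 3.868, 4.236, 3.590, 4.747, 3.700, 3.302],
    "47": [1.997, 3.768, 2.930, 4.718, 3.812, 4.054, 3.297],
    "54": [1.619, 4.195, 4.626, 3.538, 4.367, 2.594, 3.636],
    "55": [1.643, 4.096, 3.701, 3.581, 4.945, 3.327, 1.618],
    "56": [1.643, 3.817, 4.256, 4.094, 4.229, 2.489, 3.379],
    "57": [1.466, 3.578, 3.252, 3.630, 3.321, 3.568, 3.613],
    "58": [1.449, 4.169, 3.299, 3.596, 4.101, 3.032, 1.787],
    "59": [1.879, 3.558, 3.300, 3.753, 4.750, 3.298, 3.701],
}

# Mean residue distance per compound, then compounds ordered from closest to farthest.
mean_distance = {name: sum(values) / len(values) for name, values in distances.items()}
ranked = sorted(mean_distance, key=mean_distance.get)
```

By this measure TNK-651 has the largest mean distance and compound no. 26 lies closer than TNK-651, in line with the ordering discussed above; the ranking among the individual VDS compounds depends on how the residues are weighted.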
The interactions of the docked pose of compound nos 26, 47 and 57 with the 1JLA molecule are presented in Fig. 8A–I. Electrostatic contour maps (Fig. 8B, E and H) indicate the favourable polar interactions between the protein and the ligand.
Fig. 8 (A–I) Hydrogen bond, electrostatic and hydrophobic interactions of comp. no. 26 (A–C), comp. no. 47 (D–F) and comp. no. 57 (G–I) with surrounding amino acid residues. |
It is observed that compounds with bulkier substitution at R′′, isopropyl and isopropenyl groups at R and R′, and an oxo-butyl group as X show better activity. The blue contour enclosing most of the molecular region represents positive interactions favouring nucleophilic assortments. The hydrogen bond interaction observed for virtual compound no. 57 is similar to that of compound no. 26. The isopropyl groups (R and R′) and the chlorine atoms attached to ring B are inserted into the blue pocket, exhibiting favourable hydrophobic interactions. The oxo-butyl group inserted into the red pocket shows favourable hydrophilic interactions. Ring B also shows π–π stacking behaviour with TYR188, suggesting the stability of the ligand–protein complex.
The chemometric analyses for better understanding of the interactive capability of DABO derivatives within the NNIBP have yielded surprising results. The biological response of the antiviral compounds is enhanced mainly by the presence of steric attributes, hydrogen donor capabilities and lipophilicity, and bulkier groups at R, R′ and X are suggested for better biological response. On the basis of the results obtained from the VDS, it can be concluded that compounds with 2,6-dichloro and/or dibromo and tri-halo substitution on ring B and minor modifications of R and X are better performers. The better performance of the SVM and BPNN models over the MLR model suggests that there invariably exists some degree of non-linear relationship between the anti-viral activity and the descriptors used. In addition, among the chemometric tools, SVM certainly has better applicability as well as interpretability in terms of varied kernel settings, leading to a search spanning from linear to nonlinear (radial, sigmoid, polynomial) relationships. The statistical results of the training set are validated and complemented by the test set. A significant point that emerges from comparing all the VDS structures is that the chain length of X affects activity: the chain length for X should be five bonds or less to get enhanced activity, and the presence of an O atom is still more beneficial. Thus, the better activity of compound no. 57 can be attributed to the presence of the oxo-butyl group, which exhibits favourable polar interactions with the PRO236, TYR318 and VAL106 residues in the binding pocket. The relative orientation of the aromatic moieties of the ligand with respect to the orientation of the aromatic moieties of the receptor, and their involvement in stacking-type interactions, guide the polar interactions. The predicted antiviral activities of synthesized DABOs (whose activities are reported in approximate terms) show a wide gap. The six virtual compounds generated in silico show a high predicted biological response, and thus they can be used as precursors or lead compounds and can be synthesized and tested.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c4ra15478a |
This journal is © The Royal Society of Chemistry 2015 |