DOI: 
10.1039/C6RA04104C
(Paper)
RSC Adv., 2016, 
6, 28171-28186
Understanding the structural requirements of cyclic sulfone hydroxyethylamines as hBACE1 inhibitors against Aβ plaques in Alzheimer's disease: a predictive QSAR approach†
Received 
15th February 2016
, Accepted 9th March 2016
First published on 11th March 2016
Abstract
Beta (β)-site amyloid precursor protein cleaving enzyme 1 (BACE1) is one of the most important targets in Alzheimer's disease (AD), which is responsible for production and accumulation of beta amyloid (Aβ). The Aβ plaques are one of the major hallmarks, which are believed to be the fundamental cause of AD. Thus, a rational treatment strategy is to find efficient BACE1 inhibitors against AD. In the present study, we have developed two quantitative structure–activity relationship (QSAR) models (viz. two dimensional-QSAR and group based-QSAR), employing 91 cyclic sulfone (or sulfoxide) hydroxyethylamines as human BACE1 (hBACE1) inhibitors to identify the important structural features that are responsible for their activity. To our knowledge, this is the first QSAR report for this chemical category of BACE1 inhibitors. The developed models were stringently validated utilizing various validation metrics including the recently proposed mean absolute error (MAE)-based criteria. Based on this study, we observed that MAE-based criteria are the most appropriate way to determine the true prediction quality of the models. Unlike R2-based metrics, the MAE-based criteria were found to provide an unbiased judgment irrespective of the response range and/or distribution of response values around the training/test mean. The mechanistic interpretation of each model was performed in detail, while focusing on the important substructures that were found favorable or unfavorable for hBACE1 activity.
1. Introduction
1.1. Alzheimer's disease
Alzheimer's disease (AD) is the most common cause of dementia affecting mostly older people. It is a highly complex disease, which is associated with multiple factors that gradually diminish thinking skills, memory and eventually the capability to do basic activities of daily living (ADLs).1 The root cause of AD is unclear; however it is assumed that multiple factors like genetic, environmental, and lifestyle factors are responsible for developing AD. Similarly, the treatment of AD is still uncertain, despite several existing hypotheses such as cholinergic, tau, and amyloid hypothesis, etc. Amyloid plaques and the neurofibrillary tangles are believed to be the major hallmarks of the disease, which were first identified by Dr Alois Alzheimer.1–3 There is no cure for AD, and the existing US-FDA approved drugs provide just symptomatic treatment, without any beneficial effect for preventing or stopping the progression of the disease. The approved drugs belong to two categories, viz. acetylcholine esterase (AChE) inhibitors (AChEI) like tacrine, donepezil, galantamine and rivastigmine, and an N-methyl-D-aspartate (NMDA) receptor antagonist, namely, memantine.2,4
As stated earlier, amyloid plaque (Aβ) is one of the major hallmarks of AD, which is accountable for the initiation of a neurotoxic cascade that eventually leads to neuronal death. Beta (β)-site amyloid precursor protein cleaving enzyme 1 (BACE1, also known as β-secretase1) is responsible for amyloidogenic cleavage of an integral membrane protein, viz. amyloid precursor protein (APP), which leads to the production and accumulation of Aβ. The formation of Aβ is a two-step process, first, the cleavage of APP occurs by BACE-1 to form a β-secretase-derived C-terminal fragment of APP, followed by an action of γ-secretase to generate Aβ isoforms ranging from 37 to 42 amino acid residues. Aβ40 is the most abundant isoform, whereas the Aβ, which is mainly associated with AD pathogenesis, is aggregated Aβ42. Thus, a rational way to reduce the generation and aggregation of Aβ is inhibition of the BACE1 activity, as also observed in various reported studies.5–7
1.2. QSAR in drug design
Quantitative Structure Activity Relationship (QSAR) plays a vital role in lead optimization step in any drug discovery program.8,9 It's an important part of computer aided drug design (CADD) and is significantly utilized to save time, money, and more importantly animal sacrifice. The main role of QSAR models is to develop a quantitative relationship between descriptive properties or descriptors (computed from chemical structures under study) and the dependent variable (response) of our interest. The dependent variable can be biological activity or toxicity or any other property. Once a QSAR model is built, it is employed in understanding the structural features responsible for a property. The model also provides useful information to design new molecules with improved property of our interest. Moreover, the developed model can be utilized for predicting the activity of newly designed compounds. However in 2D-QSAR, the extracted favorable or unfavorable structural information that aids in designing new molecules to improve the property of interest is in general not site-specific. On the other hand, in the group based-QSAR (G-QSAR) technique, every molecule of the data set is considered as a set of fragments, the fragmentation scheme being either template based or user defined, and various descriptors are calculated for each fragment of the molecule. Thus, G-QSAR provides site-specific information by simply correlating the descriptors computed for substructures or fragments present at defined substitution sites with the property of interest. As a result, the selected descriptors in the final G-QSAR model directly point out the substitution sites that are needed to be optimized for improving the respective property. The detailed information related to the G-QSAR technique is well explained in the literature.10
Importantly, to rely on a QSAR model with a great confidence, one should be aware of the actual quality of the developed QSAR model. The role of validation metrics in evaluating the quality of developed QSAR model is thus a vital step during QSAR model development. At present, there are several validation metrics that are employed for proper validation of QSAR models. Note that all the essential validation metrics that are usually employed in QSAR studies are mentioned in Section 2.4.
In this study, we have developed two QSAR models to understand the relationship between the different structural features of a particular class of ligands and the human BACE1 (hBACE1) activity against AD. Here, we have given special attention to the validation of the developed QSAR models, while discussing some limitations of classical validation parameters and further suggested the use of recently proposed MAE-based criteria11 for appropriate judgment of prediction quality.
2. Materials and methods
2.1. Dataset
A congeneric series of 91 cyclic sulfone (or sulfoxide) hydroxyethylamine derivatives along with their activity values against hBACE1 was collected from three source articles published by the same group, i.e., article ‘1’ (21 compounds),12 ‘2’ (53 compounds)13 and ‘3’ (17 compounds).14 The first two sets represent cyclic sulfones while the third set cyclic sulfoxides. To our knowledge, no QSAR study has been reported so far for these sets of compounds. The combined set of 74 compounds from the first two articles12,13 were later divided into a training set and a test set, and were employed in model development process, while 17 compounds from the third article14 were utilized as a true external set for validation of the developed models. All the structures are provided as sdf files, namely, “DataSet.sdf” (data set comprising 74 compounds) and “ExternalSet.sdf” (external set comprising 17 compounds) in ESI,† and the chemical structures along with their activity values against hBACE1 are shown in Table 1. The activity/response values of all the compounds expressed as IC50 values (μM) were converted to pIC50 values for performing the QSAR studies. For G-QSAR studies,10 one compound (compound 16; see Table 1) was removed from the data set as it did not share the common scaffold. For the 2D-QSAR study, the whole molecular structures were utilized to compute descriptors, while in case of G-QSAR modeling, the fragmentation of the molecules is a prerequisite step to designate substitution sites (i.e., R1 and R2 in this study). The common scaffold and the substitution sites are shown in Fig. 1. All the structures were drawn using the Marvin Sketch software.15
Table 1 Compound number, structure and their respective activity values against human BACE-1 (hBACE-1) for all the compounds present in the dataset
		
| 
 | 
| Compound number | R1 | R2 | R3 | R4 | X | hBACE-1 pIC50 | 
| 22 | H | H | H | Me | C | 6.2147 | 
| 23 | NH2 | F | H | Me | C | 6.4559 | 
| 24 | NH2 | F | Br | Me | C | 7.2596 | 
| 25 | NH2 | H | Br | Me | C | 6.8539 | 
| 26 | NH2 | F | Br | F | C | 5.8633 | 
| 27 | NH2 | H | OCF3 | Me | C | 6.8861 | 
| 28 | NH2 | F | nPr | Me | C | 6.8861 | 
| 29 | NH2 | H | H | Me | N | 5.7905 | 
|  | 
|  | Fig. 1  The common scaffold (cyclic sulfone hydroxyethylamine) and the substitution sites as employed in the G-QSAR study. |  | 
2.2. Descriptor calculation
For developing the 2D-QSAR model, 644 descriptors were calculated employing Cerius 2 version 4.10 (ref. 16), PaDEL-Descriptor version 2.11 (ref. 17), and Dragon 6 (ref. 18) software. The total pool includes different classes of descriptors such as electrotopological state keys, electronic, spatial, structural, thermodynamic, topological, constitutional, functional group counts and extended topochemical atom (ETA) indices.19 In G-QSAR modeling, a relationship between activity and descriptors calculated for each molecular fragment at assigned substitution sites (R1, and R2) is derived. Thus, 335 descriptors (physiochemical, and atom-type based) using VLifeMDS version 3 (ref. 20) and 495 descriptors using PaDEL-Descriptor version 2.11, and Dragon 6 software were calculated for both the fragments substituted at R1, and R2 positions. After calculating the descriptors for respective QSAR models, data pretreatment was done to remove constant (variance cut-off < 0.0001) and inter-correlated (correlation coefficient cut-off ≥ 0.99) descriptors using an in-house developed software tool (DataPreTreatment 1.2 (ref. 21)). After data pretreatment, around 431 and 991 descriptors remained for 2D-QSAR and G-QSAR model development, respectively. Note that all the in-house developed tools employed in this study are freely available at http://dtclab.webs.com/software-tools and http://teqip.jdvu.ac.in/QSAR_Tools/.
2.3. Dataset division
For 2D-QSAR, the 74 compounds from first 2 articles12,13 as stated before were rationally divided into a training set (52 compounds) and a test set (22 compounds) based on the Kennard–Stone algorithm using the in-house developed tool (Dataset Division 1.2),21 while 17 compounds from 3rd article14 were employed as the true external set. In the Kennard–Stone algorithm,22 at first two compounds with the highest distance (here, we have utilized Euclidean distance as a measure of distance) between them are placed in the training set, and the remaining training set compounds are then selected by finding maximum of the minimum distances between the selected training set compounds and the remaining compounds of the dataset. The algorithm continues until the user defined percentage of compounds is placed in the training set, while the remaining compounds in the dataset are employed as the test set. Thus, Kennard–Stone algorithm led to the selection of the most diverse compounds in the training set. For G-QSAR, the same algorithm was employed for division of the data set comprising 73 compounds into the training set (51 compounds) and the test set (22 compounds), while the true external set comprises the same 17 compounds as in 2D-QSAR.
2.4. Model development and validation
Here, we have selected descriptors and built models exploiting different chemometric techniques, namely, genetic algorithm (GA) (using Genetic algorithm 4.0),21 best subset selection (using MLR BestSubsetSelection 2.1),21 multiple linear regression (MLR; using MLRplusValidation 1.3)21 and partial least squares (PLS; using Minitab software23) following a scheme as shown in Fig. 2.
|  | 
|  | Fig. 2  Flowchart depicting the steps involved in the development of 2D-QSAR and G-QSAR models. |  | 
In the Genetic algorithm, the fitness of a QSAR equation is usually evaluated using a fitness score, which is computed using a fitness function. In our in-house developed Genetic algorithm 4.0 tool,21 the fitness function is based on the prediction error-based metrics, i.e., mean absolute error (MAE) and MAE + 3 × σAE that were recently reported in the literature.11 Here, ‘σAE’ stands for the standard deviation of the absolute error (AE) values. The general formula (eqn (1)) for the fitness function is as follows:
|  | |  | (1) | 
where, 
F = fitness score, 
k = number of parameters (here, 
k = 2), 
Pi = value of the respective parameter for a equation, 
![[P with combining circumflex]](https://www.rsc.org/images/entities/i_char_0050_0302.gif) i,ideal
i,ideal = ideal value of that parameter, 
![[P with combining circumflex]](https://www.rsc.org/images/entities/i_char_0050_0302.gif) i,threshold
i,threshold = threshold value of that parameter.
The parameters involved in this fitness function along with their ideal and threshold values are shown in Table 2. Here, a higher fitness score signifies a better fitness of the QSAR model in terms of actual prediction quality. The same fitness function is utilized in the best subset selection tool (MLR BestSubsetSelection 2.1)21 employed during finalizing the best model (see Fig. 2).
Table 2 The parameters along with their ideal and threshold values as employed in the fitness function of the in-house developed genetic algorithm tool
		
| Parameter | Ideal value | Threshold value | 
| TrainYObsRange = range of training set observed response values. | 
| MAE | 0.0 | 0.15 × TrainYObsRangea | 
| MAE + 3 × σAE | 0.0 | 0.25 × TrainYObsRange | 
Further, different statistical metrics were employed for validation of the developed models, encompassing training set, test set, and true external set validation tests, along with the Y-randomization test utilizing an in-house developed software tool (MLRplusValidaion 1.3).21 The statistical quality and validation metrics computed to check the quality of models were determination coefficient (R2), adjusted R2(Ra2), leave-one-out (LOO) cross-validated determination coefficient (Q2), scaled rm2 metrics24 such as  and Δrm(training)2 for internal validation; QF12, QF22,25 concordance correlation coefficient (CCC),26
 and Δrm(training)2 for internal validation; QF12, QF22,25 concordance correlation coefficient (CCC),26  and Δrm(test)2 for test set as well as true external set validation, and cRp2 based on the Y-randomization results.27 In the Y-randomization test, the activity values of the training set compounds were randomly shuffled ‘n’ number of times keeping the descriptor matrix unchanged and ‘n’ new MLR models were built based on the shuffled activity values. In this study, we have built 50 random models. For a robust model, the average R2 and Q2 of the randomized models (i.e., average Rr2, and Qr2) must be significantly lower than R2 of the non-random (original) model, and also cRp2 must be greater than 0.5. All three models were also assessed for model acceptability criteria that are suggested by Golbraikh and Tropsha,28 as mentioned below.
 and Δrm(test)2 for test set as well as true external set validation, and cRp2 based on the Y-randomization results.27 In the Y-randomization test, the activity values of the training set compounds were randomly shuffled ‘n’ number of times keeping the descriptor matrix unchanged and ‘n’ new MLR models were built based on the shuffled activity values. In this study, we have built 50 random models. For a robust model, the average R2 and Q2 of the randomized models (i.e., average Rr2, and Qr2) must be significantly lower than R2 of the non-random (original) model, and also cRp2 must be greater than 0.5. All three models were also assessed for model acceptability criteria that are suggested by Golbraikh and Tropsha,28 as mentioned below.
(i) Q2 > 0.5
(ii) r2 > 0.6
(iii) |r02 − r′02| < 0.3
(iv) (r2 − r02)/r2 < 0.1 and 0.85 ≤ k ≤ 1.15 or,
(r2 − r′02)/r2 < 0.1 and 0.85 ≤ k′ ≤ 1.15
Further, we have judged the model prediction quality (training, test and external sets) using the mean absolute error (MAE)-based criteria, recently reported in the literature11 and computed using an in-house developed tool (XternalValidationPlus 1.1).21 The criteria categorize the model prediction quality into ‘Good’, ‘Moderate’ and ‘Bad’ considering the values of MAE, and standard deviation of the absolute error (AE) values (σAE) and are defined as follows:
(i) Good predictions:
MAE ≤ 0.1 × trainingSetRange AND MAE + 3 × σAE ≤ 0.2 × trainingSetRange
(ii) Bad predictions:
MAE > 0.15 × trainingSetRange OR MAE + 3 × σAE > 0.25 × trainingSetRange
(iii) Moderate predictions: the predictions which do not fall under either of the above two conditions are considered as of ‘moderate’ quality.
As a requisite, the above criteria are applied for judging the prediction quality when the number of compounds is more than 10 and there is no systematic error in the model predictions.11,29 Note that the XternalValidationPlus 1.1 tool21 also facilitates the prediction of any systematic error.
Importantly, the MAE based criteria are determined after removing 5% compounds with high absolute error values in order to obviate the possibility of any outlier predictions. The judgment based on the 95% data is very helpful for the training/test/external sets containing a small number of badly predicted compounds where the MAE based criteria for the 100% data can misleadingly penalize the model predictivity. In the literature,11 the above criteria were suggested for test/external set validation only, but in this study we have also checked the prediction quality of the training set using this approach by employing the predicted response values obtained from the LOO analysis.
Finally, the applicability domain of the developed models was determined employing the standardization approach30 that identifies the X-outliers present in the training set in addition to those test/external set compounds which are outside the applicability domain. The calculations were carried out using an in-house developed ADUsingStandardizationApproach 1.0 tool.21
3. Results and discussion
In this section, we have shown the final QSAR models, and then discussed in detail the statistical significance along with the mechanistic interpretation revealed from the selected descriptors in the respective models. The values of the selected descriptors and activity for all the compounds present in the training, test and external sets, for both 2D-QSAR and G-QSAR models, are provided in ESI.†
3.1. 2D-QSAR (PLS) model and validation results
| Box 1: 2D-QSAR modelNumber of components: 4|  | | pIC50 = −1.4177 − 0.21573 × “F06[C–C]” + 0.53155 × “B04[O–F]” − 0.91982 × “Atype_C_27” + 0.02857 × “Vm” − 0.19661 × “F07[N–F]” | (2) | 
 Validation metrics Training set: Ntraining = 52; R2 = 0.8318, Ra2 = 0.8135, Q2 = 0.764,  = 0.7038, Δrm(training)2 = 0.123. Test set: Ntest = 22; Rtest2 = 0.8133, Rpred2 or QF12 = 0.801, QF22 = 0.8007, CCC = 0.899,  = 0.643, Δrm(test)2 = 0.065. True external set: Nexternal = 17; Rexternal2 = 0.8236, Rpred2/QF12 = 0.7311, QF22 = 0.667, CCC = 0.745,  = 0.256, Δrm(test)2 = 0.496. Y-randomization test results: average Rr2 = 0.0787, average Qr2 = −0.1895, cRp2 = 0.7962 Golbraikh and Tropsha's model acceptability criteria: all criteria passed for the test set. For the external set, one criterion failed i.e. |r02 − r′02| = 0.36876, where |r02 − r′02| should be less than 0.3. The justification is mentioned in the text. | 
The final 2D-QSAR PLS model (see eqn (2) in Box 1) comprises 5 descriptors, and the number of components is equal to 4. Ntraining, Ntest and Nexternal indicate the number of compounds in the training, test and external sets, respectively. All the validation metrics (see the respective values mentioned in Box 1) show statistically acceptable values for the training and test sets. But for the external set, we observed that one of the Golbraikh and Tropsha's criteria (i.e., |r02 − r′02| < 0.3) in addition to a few metrics like rm2 metrics, CCC show poor values, although the prediction errors for most of the compounds in the external set were found to be significantly low, except for compound 81 which was poorly predicted. The poor prediction for compound 81 by the 2D-QSAR model was expected, as its experimental (observed) activity value ‘3.824’ is clearly outside the response domain of the 2D-QSAR model, as the training set compound with lowest activity value (i.e., compound 17) is equal to ‘4.18’. Moreover, compound 81 was also found to be outside the applicability domain (mentioned later) computed based on its descriptor values. This supports the suggested approach of determining the prediction quality based on MAE-based criteria after removing 5% erroneous data, since such a small number of highly erroneous predictions (compounds 17 and 81 here) may incorrectly penalize the validation metrics. Further, after removing 5% data with high prediction errors (thus, compound 81 was removed), one should expect that the criteria (|r02 − r′02| <0.3) and the parameters, i.e., rm2 metrics, CCC will now show better results, but in this case we observed no significant improvement in the results. The reason for this unexpected result is the substantial decrease in the response range of the external set from 4.875 to 2.423 after removing compound 81, which led to the failure of the r02 metric and thus the derived metrics like rm2 metrics. The same reason also led to the poor values of CCC parameter; similar cases are also reported in the literature.11 Furthermore, the Rpred2 or QF12 parameter (=0.7311) shows an over-predicted value, which was expected since the distribution of response values of the external set are significantly away from the response mean of the training set. This can be quickly noticed from the difference in the response mean of the external set (=7.43) and the training set (=6.89) as shown in Table 3. So, in such cases, QF22 parameter (=0.667), which considers the test/external set mean rather than training set mean, provides a true reflection of the model quality.
Table 3 Error-based metrics and prediction quality for the training, test, and external sets employed in the 2D-QSAR model
		
| Set | MAE (100%) | σAE (100%) | MAE (95%) | σAE (95%) | MAE + 3 × σAE (95%) | Prediction quality | Response range | Response mean | 
| Training | 0.3797 | 0.2674 | 0.3415 | 0.2240 | 1.0135 | Moderate | 4.5185 | 6.8917 | 
| Test | 0.3356 | 0.2152 | 0.2923 | 0.1709 | 0.8050 | Good | 3.1383 | 6.9262 | 
| True external | 0.4274 | 0.4827 | 0.3197 | 0.1950 | 0.9046 | Moderate | 4.8751 | 7.4292 | 
Finally the quality of predictions was judged based on the MAE-based criteria after removing 5% data with high prediction errors and were found to be moderate, good and moderate for training, test and external sets, respectively (see Table 3). These predictions of model quality are devoid of any bias from the range and/or distribution of response values.
The Y-randomization test further confirms the robustness of the developed 2D-QSAR models with a significant difference in the average Rr2 and Qr2 values of random models compared to the R2 and Q2 values of the original model, and also the cRp2 value (=0.7962) is significantly better than the required threshold value (0.5). The applicability domain of the 2D-QSAR model was determined using the standardization approach and the compounds which are X-outliers (training set) and those compounds which are outside the applicability domain (test set and external sets) are shown in Table 4.
Table 4 The results for applicability domain determination using standardization approach
		
| Model | Training set compounds (X-outliers) | Test set compounds (outside the applicability domain) | External set compounds (outside the applicability domain) | 
| 2D-QSAR | 18, 29, 47, 48, 73 | 46, 68 | 81, 89, 91 | 
| G-QSAR | 17, 18, 68 | — | 81, 91 | 
3.2. G-QSAR (MLR) model and validation results
| Box 2: G-QSAR modelValidation metrics|  | | pIC50 = 2.1348(±0.727) + 0.19275(±0.08212)“R1_F05[C–N]” − 0.59003(±0.18578)“R1_NssssC” − 0.6341(±0.3246)“R2_nPyridines” + 0.3551(±0.07142)“R2_nC” + 0.03238(±0.0056)“R1_SHBa” | (3) | 
 Training set: Ntraining = 51; R2 = 0.8261, Radj.2 = 0.806, Q2 = 0.764,  = 0.677, Δrm(training)2 = 0.0983. Test set: Ntest = 22; Rtest2 = 0.791, Rpred2 or QF12 = 0.7474, QF22 = 0.726, CCC = 0.846,  = 0.7145, Δrm(test)2 = 0.017. True external set: Nexternal = 17; Rexternal2 = 0.8824, Rpred2/QF12 = 0.896, QF22 = 0.8665, CCC = 0.9245,  = 0.737, Δrm(test)2 = 0.126. Y-randomization test results: average Rr2 = 0.0917, average Qr2 = −0.2104, cRp2 = 0.7828. Golbraikh and Tropsha's model acceptability criteria: all criteria passed for both the test and external sets. | 
The final G-QSAR MLR model (see eqn (3) in Box 2) comprises 5 descriptors and all the validation metrics (see the respective values mentioned in Box 2) show statistically acceptable values for the training, test sets as well as the external set. It is interesting to note that the criteria and the parameters failing for the external set during 2D-QSAR model validation were found to pass during the G-QSAR model validation. These results were expected since the prediction quality of compound 81 for the external set made by the G-QSAR model was found to be significantly better compared to the prediction made by the 2D-QSAR model. However, compound 81 is predicted to be outside the applicability domain for the G-QSAR model using the standardization approach (mentioned later; see Table 4). Further, the results are actually improved after removal of 5% data with high errors, as in such case, compound 81 was not removed, while compound 80 with the highest error was removed, and thus the response range was not reduced but remained the same. Moreover, the Rpred2 or QF12 parameter (=0.8961) again showed over-prediction, but this time the magnitude of over-prediction was comparatively lower as the prediction error itself was low. The QF22 parameter (=0.8665) again provides here a true reflection of the model quality.
The quality of predictions based on the MAE-based criteria after removing 5% data with high prediction errors were found to be moderate, moderate and good for the training, test and external sets, respectively (see Table 5). The Y-randomization test further confirms the robustness of the developed G-QSAR models with a significant difference in the average Rr2 and Qr2 values of random models compared to R2 and Q2 values of the original model, and the cRp2 value (0.7828) is appreciably better than the required threshold value (0.5). The applicability domain of the G-QSAR model was determined using the standardization approach, and the compounds which are X-outliers (training set) and outside the applicability domain (test set and external set) are shown in Table 4. Notably, compound 18 as an X-outlier and compounds 81 and 91 as outside the applicability domain were identified by both the models.
Table 5 Error-based metrics and prediction quality for the training, test, and external sets employed in G-QSAR model
		
| Set | MAE (100%) | σAE (100%) | MAE (95%) | σAE (95%) | MAE + 3 × σAE (95%) | Prediction quality | Response range | Response mean | 
| Training | 0.4061 | 0.2570 | 0.3743 | 0.2289 | 1.0611 | Moderate | 4.5185 | 6.8438 | 
| Test | 0.3811 | 0.2671 | 0.3233 | 0.1998 | 0.9226 | Moderate | 3.1461 | 7.1007 | 
| True external | 0.3407 | 0.2184 | 0.3108 | 0.1861 | 0.8693 | Good | 4.8751 | 7.4292 | 
Summarizing the observations made during validation of both the models, it appears that the response range and/or distribution of response values around the training/test mean significantly affect the R2-based metrics, but the error-based metrics remains unbiased for the response range and/or distribution of response values; such observations are already studied and reported in detail in the literature.11,31
3.3. A failed QAAR model
In this study, we have also attempted to develop a quantitative activity–activity relationship (QAAR) model for studying the selectivity of compounds against hBACE1 over human cathepsin D (hCatD), due to high active site similarity between these two enzymes. Similar type of QAAR models have been previously reported in the literature for other targets.32–34 The sufficient number of compounds (i.e., 83) with the activity information for both the enzymes and a considerably wide activity range of 3.183 were considered suitable to perform the QAAR study. To study selectivity, the values of (hBACE1 pIC50–hCatD pIC50) computed for all 83 compounds were considered as the response values. The descriptors calculated for the 2D-QSAR and G-QSAR study were combined and used to build the QAAR model employing the chemometric techniques in similar fashion as shown in the flowchart (Fig. 2). However, not a single valid model was obtained after initially considering three sets, i.e., training, test and external set, as previously employed for other QSAR models. Thus, we combined the data from all the three source articles and then the entire data was rationally divided utilizing the Kennard–Stone algorithm into two sets only, i.e., a training set and a test set and, no true external set was considered. Now, interestingly the developed QAAR models were found to be valid based on all the classical and/or R2-based validation parameters as well as Golbraikh–Tropsha's criteria; but none of these models were found to be appropriate based on MAE-based criteria, which suggested that the amount or magnitude of prediction errors found in the best possible QAAR model were not satisfactory. For reader's assessment, we have summarized the values of all the validation parameter for one of the top QAAR models (see Box S1 in the ESI†) and also discussed that how MAE-based metrics provides a true picture of the prediction quality of QSAR models over classical and/or R2-based parameters (see ESI†).
3.4. Contribution of descriptors selected in 2D-QSAR and G-QSAR models
3.4.1. 2D-QSAR descriptors. In the 2D-QSAR model, five descriptors were selected (see eqn (2)), and the contributions of each descriptor for the activity against hBACE1 and the key structural features responsible for the activity are discussed below:F06[C–C]: this 2D atom pairs descriptor refers to the frequency of two carbon atoms (C–C) at a topological distance equal to 6 in a molecule. As suggested in the 2D-QSAR (PLS) equation, this descriptor contributes negatively to the hBACE1 activity. The descriptor value is well correlated with the different groups at R1 position (positions as shown in Fig. 1). According to this descriptor, a long carbon chain and/or groups with high carbon content at 3rd position of phenyl group (R1 position) are unfavorable for the activity. For instance, compounds 4, 7, 10, and 36 (as shown in Fig. 3) have high descriptor values, and accordingly, these compounds are comparatively poorly active. The encircled substructures in Fig. 3 are found responsible for the higher values of this descriptor and are thus categorized as unfavorable substructures.
|  | 
|  | Fig. 3  Compounds 4, 7, 10 and 36 with high F06[C–C] descriptor values (descriptor values are shown in brackets) along with their respective hBACE1 activity (pIC50). The encircled substructures were found responsible for higher descriptor values. |  | 
B04[O–F]: this 2D atom pairs descriptor refers to the presence (descriptor value equals to ‘1’) or absence (descriptor value equals to ‘0’) of oxygen and fluorine (O–F) at the topological distance 4. As suggested by the 2D-QSAR (PLS) equation, this descriptor contributes positively, which means that the presence of O–F at a topological distance 4 is favorable to the hBACE1 activity. This feature is clearly observed in many highly active compounds, like 64, 67, 69, 70, and 71 etc. This feature is observed at R1 position (phenyl substituted with fluorine and -oxy group at 3rd and 5th positions, respectively) of the molecules as shown (encircled) in Fig. 4.
|  | 
|  | Fig. 4  Compounds 64 and 69 comprising oxygen and fluorine (O–F) at a topological distance of 4 (respective substructure is encircled). |  | 
Atype_C_27: this descriptor belongs to the “atom type” category,35 and here Atype_C_27 designates a carbon atom as in R⋯CH⋯X, where ⋯ represents aromatic bonds as in benzene or delocalized bonds such as the N–O bond in a nitro group; R represents any group linked through carbon; X represents any heteroatom (O, N, S, P, Se, and halogens). This descriptor contributes negatively, and hence the presence of this atom type (i.e., aromatic heterocyclic rings) at both R1 (as seen in structures 29 and 30; see Fig. 5) and R2 (see Fig. 10) positions is found to be unfavorable for the activity.
|  | 
|  | Fig. 5  Compounds 29 and 30 comprising of ‘Atype_C_27’ atom type that denotes ‘aromatic heterocyclic rings’ (encircled) along with their respective hBACE1 activity values (pIC50). |  | 
Vm: this descriptor belongs to the Weighted Holistic Invariant Molecular (WHIM) descriptor category. It captures the relevant molecular 3D information regarding molecular size, shape, symmetry, and atom distribution with respect to invariant reference frames.36 Here, ‘Vm’ descriptor refers to the total size index of the molecule weighted by the mass. It contributes positively and corroborates well with the size of R2 substituents, i.e., as the size of the R2 substituents increases, the activity values against hBACE1 are found to increase as demonstrated in Fig. 6.
|  | 
|  | Fig. 6  Comparison of compounds 17 and 19 on the basis of ‘Vm’ descriptor value and thus effect of size of R2 substituent's (encircled) on their respective hBACE1 activity value. |  | 
F07[N–F]: this 2D atom pairs descriptor refers to the frequency of nitrogen and fluorine (N–F) at a topological distance of 7. It contributes negatively and correlates well with the substituents at R1 (48, 60 etc.) as well as R2 (72) positions. For instance, at R1 position the presence of substituents comprising nitrogen and fluorine atom with the topological distance equal to 7 (compound 60; see Fig. 7) led to a decrease in the activity as compared to the same two atoms at a topological distance 6 (compound 64; see Fig. 7).
|  | 
|  | Fig. 7  Comparison of Compounds 60 and 64 based on the ‘F07[N–F]’ descriptor value and its effect on the hBACE1 activity. |  | 
 
3.4.2. G-QSAR descriptors. In the G-QSAR model, five descriptors were selected (see eqn (3)). Here, each descriptor specifically points a substitution site (R1 or R2 position) and thus confirms the contribution of a structural feature at a specific site in the molecule to the activity against hBACE1.R1_F05[C–N]: this 2D atom pairs descriptor refers to the frequency of carbon and nitrogen (C–N) atoms with a topological distance of 5 at R1 position of the molecules. It shows a positive contribution, and this means that a compound with a more number of C–N at a topological distance of 5 at R1 are supposed to have a higher hBACE1 inhibitory activity. This descriptor actually suggests that more branching structures on the phenyl ring at R1 position are favorable for the activity, and such features are also seen in most of the active compounds, for example 64, 66, 67, 69, 70, 71 etc., while deficiency of this feature was observed in a number of low active compounds like 2, 3, 4, 6, 17, 18, 21 etc. This is well demonstrated in Fig. 8 and the encircled substructures (as the branching increases) were found to be favorable for the activity.
|  | 
|  | Fig. 8  Effect of increase in branching at the R1 position (as indicated by ‘R1_F05[C–N]’ descriptor value) on the hBACE1 activity as observed in compounds 56, 57 and 64. |  | 
R1_NssssC: this descriptor belongs to the class of electrotopological state (E-state) indices that encode information about both the topological environment of that atom and the electronic interactions due to all other atoms in the molecule. Here, the ‘R1_NssssC’ descriptor refers to the number of Carbon atoms of type ssssC (i.e., a carbon atom with 4 single bonds) at the R1 position. It in fact points out the feature where a carbon atom connected with 4 non-electronegative atoms via single bonds (vide infra the discussion on the R1_SHBa descriptor). This descriptor contributes negatively, and hence presence of this feature is unfavorable for the activity as seen in compounds 36, 45, 47, 48 etc. For instance, one can compare two structures 48 and 66 and their activity values in order to observe the effect of this feature as shown in Fig. 9. This descriptor corroborates well with the 2D-QSAR descriptors (i.e., F07[N–F] and F06[C–C]).
|  | 
|  | Fig. 9  Comparison of compounds 48 and 66 based on the ‘R1_NssssC’ descriptor value and its effect on hBACE1 activity. |  | 
R2_nPyridines: this descriptor belongs to the category of functional group counts. ‘R2_nPyridines’ stands for the number of pyridine rings at the R2 position. It contributes negatively, which suggests that the presence of a pyridine ring at the R2 position is unfavorable for the activity as seen in compounds 18, 68 etc. For instance, one can observe the difference in the activity value when the benzene ring (69) is replaced with a pyridine ring (68) as shown in Fig. 10. The same can be observed in the compounds (78 and 91) present in the external set.
|  | 
|  | Fig. 10  Effect on hBACE1 activity due to presence of pyridine ring at R2 position (as indicated by ‘R2_nPyridines’ descriptor) as seen in compounds 68 and 69. |  | 
R2_nC: this descriptor simply refers to the number of carbon atoms at R2 position. It positively contributes to the activity, and hence it suggests that an increase in the number of carbon atoms at the R2 position is favorable for the hBACE1 activity, which is also observed in most of the active compounds (64, 66, 67, 70, 71 etc.) in the dataset. It can be demonstrated by comparing the two structures and their activity values as shown in Fig. 6. Therefore, the information obtained from this descriptor along with the R2_nPyridines descriptor, one can suggest that at the R2 position the benzene ring substituted with branched structure especially tertiary-butyl group is favorable for the activity compared to the heterocyclic ring like pyridine.
R1_SHBa: this descriptor refers to the sum of the E-states indices computed for strong hydrogen bond acceptors present at R1 position. This descriptor contributes positively, which suggests that this feature is favorable for the hBACE1 activity. In simpler terms, the presence of electronegative atoms, especially fluorine, at the R1 position is responsible for the hBACE1 activity, as this feature is observed in almost all the highly active compounds. For instance, one can compare two compounds without (22) and with (67) this feature and its effect on the hBACE1 activity as shown in Fig. 11. So the information gained from this descriptor along with the R1_F05[C–N] and R1_NssssC descriptors, one can conclude that more branched structure with more electronegative atoms at the R1 position should improve the activity.
|  | 
|  | Fig. 11  Effect on the hBACE1 activity of the presence of strong hydrogen acceptors at the R1 position (as indicated by ‘R1_SHBa’ descriptor) as seen in compounds 22 and 67. |  | 
 
4. Conclusions
2D-QSAR and G-QSAR models employing cyclic sulfone (or sulfoxide) hydroxyethylamines as hBACE1 inhibitors were developed and validated. To our knowledge, this is the first QSAR report on cyclic sulfone hydroxyethylamines as hBACE1 inhibitors. It was observed that the R2-based parameter and the Golbraikh–Tropsha's criteria have some shortcomings in judging the quality of predictions as they show a variance with respect to the training/test set range or distribution of response values around training/test/external set mean. However, we observed that the MAE-based criteria provide results that are truly based on the actual prediction errors.
Further, both the models provided sufficient information about important structural features that are responsible for the activity against hBACE1. Summarizing the information extracted from the QSAR analysis (both 2D-QSAR and G-QSAR), the favorable and unfavorable features for the hBACE1 activity are illustrated in Fig. 12. The reported QSAR models have captured the important structure–activity relationship (SAR) information and corroborated qualitative findings with respect to important structural features for the set of compounds and the target considered as discussed in the source articles.12–14 Additionally, the present QSAR study could unveil the SAR information in quantitative terms that helps in predicting the quantitative value of the endpoint of interest for newly designed compounds which might not be possible from other (qualitative) computational approaches for SAR analysis. The developed QSAR models can be employed as an efficient query tool for search of hBACE1 inhibitors, while taking into consideration the applicability domain of the respective models.
|  | 
|  | Fig. 12  Summary of the mechanistic interpretation of the 2D-QSAR, and G-QSAR models using compound 66 (one of the most active compounds). |  | 
Acknowledgements
No funding has been received for this research work. We thank VLife Sciences for providing us with an free evaluation license.
References
- Alzheimer's Disease Education and Referral Center, About Alzheimer's Disease: Alzheimer's Basic. Available from: http://www.nia.nih.gov/alzheimers/topics/alzheimers-basics [Last accessed 10 February 2016].
- P. Ambure and K. Roy, Expert Opin. Drug Discovery, 2014, 9, 697–723 CrossRef CAS PubMed.
- K. Maurer, S. Volk and H. Gerbaldo, The Lancet, 1997, 349, 1546–1549 CrossRef CAS.
- Alzheimer's Disease Education and Referral Center, Alzheimer's Disease Medications Fact Sheet, Available from: http://www.nia.nih.gov/alzheimers/publication/alzheimers-disease-medications-fact-sheet [Last accessed 10 February 2016].
- R. Vassar, Adv. Drug Delivery Rev., 2002, 54, 1589–1602 CrossRef CAS PubMed.
- Y. Luo, B. Bolon, S. Kahn, B. D. Bennett, S. Babu-Khan, P. Denis, W. Fan, H. Kha, J. Zhang and Y. Gong, Nat. Neurosci., 2001, 4, 231–232 CrossRef CAS PubMed.
- S. Sankaranarayanan, M. A. Holahan, D. Colussi, M.-C. Crouthamel, V. Devanarayan, J. Ellis, A. Espeseth, A. T. Gates, S. L. Graham and A. R. Gregro, J. Pharmacol. Exp. Ther., 2009, 328, 131–140 CrossRef CAS PubMed.
- J. C. Dearden, International Journal of Quantitative Structure-Property Relationships (IJQSPR), 2016, 1, 1–44,  DOI:10.4018/IJQSPR.2016010101.
- K. Roy, S. Kar and R. N. Das, Understanding the basics of QSAR for applications in pharmaceutical sciences and risk assessment, Academic Press,  2015 Search PubMed.
- S. Ajmani, K. Jadhav and S. A. Kulkarni, QSAR Comb. Sci., 2009, 28, 36–51 CAS.
- K. Roy, R. N. Das, P. Ambure and R. B. Aher, Chemom. Intell. Lab. Syst., 2016, 152, 18–33 CrossRef CAS.
- H. Rueeger, J.-M. Rondeau, C. McCarthy, H. Mobitz, M. Tintelnot-Blomley, U. Neumann and S. Desrayaud, Bioorg. Med. Chem. Lett., 2011, 21, 1942–1947 CrossRef CAS PubMed.
- H. Rueeger, R. Lueoend, O. Rogel, J.-M. Rondeau, H. Mobitz, R. Machauer, L. Jacobson, M. Staufenbiel, S. Desrayaud and U. Neumann, J. Med. Chem., 2012, 55, 3364–3386 CrossRef CAS PubMed.
- H. Rueeger, R. Lueoend, R. Machauer, S. J. Veenstra, L. H. Jacobson, M. Staufenbiel, S. Desrayaud, J.-M. Rondeau, H. Mobitz and U. Neumann, Bioorg. Med. Chem. Lett., 2013, 23, 5300–5306 CrossRef CAS PubMed.
- ChemAxon Marvin Sketch 5115, Budapest, Hungary, Available at: http://www.chemaxon.com/products.html Search PubMed.
- Cerius2 Version 410, Accelrys, Inc, San Diego, CA,  2005 Search PubMed.
- C. W. Yap, J. Comput. Chem., 2011, 32, 1466–1474 CrossRef CAS PubMed.
- A. Mauri, V. Consonni, M. Pavan and R. Todeschini, MATCH, 2006, 56, 237–248 CAS.
- K. Roy and R. N. Das, Advanced Methods and Applications in Chemoinformatics: Research Progress and New Applications,  2011, pp. 380–411 Search PubMed.
- VLife MDS 4.1. 3, Molecular Design Suite, VLife Sciences Technologies Pvt Ltd, Pune, India Search PubMed.
- The simple, user-friendly and reliable online standalone tools freely available at http://teqip.jdvu.ac.in/QSAR_Tools/ and http://dtclab.webs.com/software-tools.
- R. W. Kennard and L. A. Stone, Technometrics, 1969, 11, 137–148 CrossRef.
- MINITAB statistical software, Minitab Inc, Minitab Release,  2000, p. 13 Search PubMed.
- K. Roy, P. Chakraborty, I. Mitra, P. K. Ojha, S. Kar and R. N. Das, J. Comput. Chem., 2013, 34, 1071–1082 CrossRef CAS PubMed.
- G. Schuurmann, R.-U. Ebert, J. Chen, B. Wang and R. Kuhne, J. Chem. Inf. Model., 2008, 48, 2140–2145 CrossRef PubMed.
- N. Chirico and P. Gramatica, J. Chem. Inf. Model., 2011, 51, 2320–2335 CrossRef CAS PubMed.
- I. Mitra, A. Saha and K. Roy, Mol. Simul., 2010, 36, 1067–1079 CrossRef CAS.
- A. Golbraikh and A. Tropsha, J. Mol. Graphics Modell., 2002, 20, 269–276 CrossRef CAS PubMed.
- J. C. Dearden, M. T. D. Cronin and K. L. E. Kaiser, SAR QSAR Environ. Res., 2009, 20, 241–266 CrossRef CAS PubMed.
- K. Roy, S. Kar and P. Ambure, Chemom. Intell. Lab. Syst., 2015, 145, 22–29 CrossRef CAS.
- W. H. Davis Jr and W. A. Pryor, J. Chem. Educ., 1976, 53, 285 CrossRef.
- P. Ambure and K. Roy, RSC Adv., 2014, 4, 6702–6709 RSC.
- S. Cassani, S. Kovarich, E. Papa, P. P. Roy, L. van der Wal and P. Gramatica, J. Hazard. Mater., 2013, 258, 50–60 CrossRef PubMed.
- P. K. Ojha and K. Roy, Mol. Simul., 2010, 36, 939–952 CrossRef CAS.
- A. K. Ghose, V. N. Viswanadhan and J. J. Wendoloski, J. Phys. Chem. A, 1998, 102, 3762–3772 CrossRef CAS.
- R. Todeschini, P. Gramatica, R. Provenzani and E. Marengo, Chemom. Intell. Lab. Syst., 1995, 27, 221–229 CrossRef CAS.
| Footnote | 
| † Electronic supplementary information (ESI) available. See DOI: 10.1039/c6ra04104c | 
| 
 | 
| This journal is © The Royal Society of Chemistry 2016 | 
Click here to see how this site uses Cookies. View our privacy policy here.