Pravin Ambure and Kunal Roy‡*
Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India. E-mail: kunalroy_in@yahoo.com; kroy@pharma.jdvu.ac.in; kunal.roy@manchester.ac.uk; Web: http://sites.google.com/site/kunalroyindia/
First published on 6th January 2014
A congeneric series of 224 cyclin-dependent kinase 5/p25 (CDK5/p25) inhibitors was used to understand the structural requirements for improving activity against CDK5/p25 and selectivity over CDK2. The CDK5/p25 enzyme complex plays a significant role in the formation of neurofibrillary tangles in Alzheimer's disease (AD). In the present study, 2D-quantitative structure–activity relationship (2D-QSAR), group or fragment based QSAR (G-QSAR), and quantitative activity–activity relationship (QAAR) models were developed and validated. The statistical metrics indicate satisfactory performance, supporting the reliability and robustness of the models. The 2D-QSAR and G-QSAR models explore the structural requirements for improving activity, while the QAAR model gives a better understanding of the features required for selectivity of the inhibitors. The docking study further provides information on the key active site residues and the structural features important for proper binding in the active site of the CDK5/p25 complex.
Cyclin-dependent kinases (CDKs) are a group of enzymes that control the events of the cell cycle and also participate in apoptosis. The main biological function of CDKs is cell cycle regulation; however, certain CDKs (e.g., CDK5) control cell differentiation in neuronal cells rather than the cell cycle. CDK5 is expressed in post-mitotic cells of the central nervous system and plays a vital role during neuronal differentiation. It is a unique member of the family, since it requires association with the p35 activator, instead of cyclins, for activation.5,6 CDK5 activity is restricted to neurons, since the activator p35 is neuron-specific. Deregulation of CDK5 activity has been implicated in several neurodegenerative diseases including AD. This deregulation is induced by the proteolytic cleavage of p35 by calpain, a calcium-dependent cysteine protease, which forms the active fragment p25; p25 is observed to accumulate in the brains of AD patients.7 Although the precise mechanism is unclear, the catalytic activity of CDK5/p25 is believed to be significantly higher than that of CDK5/p35, and the hyperphosphorylation of tau protein caused by this hyperactive CDK5/p25 complex is considered responsible for neuronal cell toxicity.8
One of the major challenges in designing drugs against any kinase is selectivity. In this case, achieving selectivity for CDK5 is an important issue; the selectivity of CDK5 inhibitors over CDK2 has been reported in the literature.9–11
The quantitative structure–activity relationship (QSAR), group-based QSAR (G-QSAR) and quantitative activity–activity relationship (QAAR) techniques are very useful for exploring variations in the properties of compounds and their relationship with activity/selectivity. Once the relationship is well defined as a mathematical equation and validated by means of various validation parameters, these techniques are widely used to predict the activity/selectivity of untested compounds.
In this study, 2D-QSAR, G-QSAR and QAAR techniques are applied to a series of CDK5/p25 inhibitors to explore the structural requirements for CDK5/p25 inhibition, considering both issues, i.e. activity against CDK5/p25 as well as selectivity over CDK2. The structural features required for binding specifically to CDK5 were studied using a molecular docking approach.
and $\Delta r_m^2$(training) for internal validation, and $\Delta r_m^2$(test) for external validation, and $\Delta r_m^2$(overall) for overall validation, $R_{\mathrm{pred}}^2$ (external validation), and $^{c}R_p^2$ (ref. 22) based on Y-randomization results. In the Y-randomization test, the activity values of the training set compounds were randomly shuffled while keeping the descriptor matrix unchanged, and new models were built on the shuffled activity values. For a robust model, the average squared correlation coefficient of the randomized models (average $R_r^2$) must be significantly lower than $R^2$ of the non-random (original) model, and $^{c}R_p^2$ must be greater than 0.5. To verify the robustness and predictivity of the QAAR model, the more stringent Leave-Many-Out (LMO) cross-validation was performed on the training set. All three models were also evaluated against the model acceptability criteria suggested by Golbraikh and Tropsha.23,24 According to these criteria, a model must satisfy the following conditions:
(i) $Q^2 > 0.5$
(ii) $r^2 > 0.6$
(iii) $\lvert r_0^2 - r_0'^2 \rvert < 0.3$
(iv) $(r^2 - r_0^2)/r^2 < 0.1$ and $0.85 \le k \le 1.15$, or $(r^2 - r_0'^2)/r^2 < 0.1$ and $0.85 \le k' \le 1.15$
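The four conditions above can be expressed as a small check. The following is a minimal sketch, not the authors' software: the function name and argument names are hypothetical, and the inputs are assumed to be precomputed statistics. The two calls use the values reported in Table 1 for the 2D-QSAR and G-QSAR models.

```python
def passes_golbraikh_tropsha(q2, r2, r0_sq_diff, ratio, k, ratio_prime, k_prime):
    """Check the four acceptance conditions (i)-(iv) listed above.

    r0_sq_diff  : r0^2 - r0'^2 (its absolute value is tested)
    ratio       : (r^2 - r0^2) / r^2, paired with slope k
    ratio_prime : (r^2 - r0'^2) / r^2, paired with slope k'
    """
    cond_i = q2 > 0.5
    cond_ii = r2 > 0.6
    cond_iii = abs(r0_sq_diff) < 0.3
    # Condition (iv) is an "or" of two branches; one passing branch is enough.
    branch_a = ratio < 0.1 and 0.85 <= k <= 1.15
    branch_b = ratio_prime < 0.1 and 0.85 <= k_prime <= 1.15
    return cond_i and cond_ii and cond_iii and (branch_a or branch_b)


# Values as reported in Table 1 (2D-QSAR model, then G-QSAR model).
print(passes_golbraikh_tropsha(0.70, 0.677, 0.1024, 0.004, 1.001, 0.154, 0.997))
print(passes_golbraikh_tropsha(0.686, 0.696, 0.099, 0.0015, 0.99, 0.144, 0.99))
```

Note that for both models the second branch of condition (iv) exceeds 0.1, so the condition is met through the first branch only.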
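$$\mathrm{pIC}_{50}^{\mathrm{CDK5}}\ (\mathrm{nM}) = 7.65\,(\pm 0.0969) + 0.504\,(\pm 0.371)\,\text{S\_aasN} + 0.518\,(\pm 0.06)\,\text{Atype\_N\_69} + 0.0017\,(\pm 0.00057)\,\text{D/Dtr05} - 2.465\,(\pm 0.537)\,\text{Atype\_C\_43} + 0.026\,(\pm 0.0051)\,\text{S\_sCl} \tag{1}$$

$\Delta r_m^2$(training) = 0.207; $N_{\mathrm{test}}$ = 64; $R_{\mathrm{pred}}^2$ = 0.675; $\Delta r_m^2$(test) = 0.168; $\Delta r_m^2$(overall) = 0.197.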
Eqn (1) represents the best 2D-QSAR model obtained using the GFA linear technique, and the selected descriptors suggest different structural requirements for improving activity. The standard errors of the regression coefficients are shown in parentheses. $N_{\mathrm{training}}$ and $N_{\mathrm{test}}$ are the numbers of compounds in the training and test sets, which were used to develop and to validate the model, respectively. The predictive ability of the 2D-QSAR model was found to be significant, since $Q^2$ = 0.700 (Leave-One-Out method), $R_{\mathrm{pred}}^2$ = 0.675, and $\Delta r_m^2$(test) = 0.168. All other validation parameters also had statistically significant values, confirming the reliability of the model. The Y-randomization results (average $R_r^2$ = 0.029 and average $Q_r^2$ = −0.046 for 50 randomly generated models) showed that the developed model is robust, with $^{c}R_p^2$ = 0.712. The model also passes the Golbraikh and Tropsha model acceptance criteria (the respective parameters are shown in Table 1).
| Parameters | 2D-QSAR model | G-QSAR model | Remarks |
|---|---|---|---|
| $Q^2$ | 0.70 | 0.686 | Passed |
| $r^2$ | 0.677 | 0.696 | Passed |
| $\lvert r_0^2 - r_0'^2 \rvert$ | 0.1024 | 0.099 | Passed |
| $(r^2 - r_0^2)/r^2$ and $k$, or $(r^2 - r_0'^2)/r^2$ and $k'$ | 0.004 and 1.001, or 0.154 and 0.997 | 0.0015 and 0.99, or 0.144 and 0.99 | Passed |
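The Y-randomization test described earlier can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: it refits ordinary least squares on a fixed descriptor set (the original work used GFA, which may reselect descriptors), the data are synthetic, and the function names are hypothetical. The $^{c}R_p^2$ value is computed with the commonly cited form $R\sqrt{R^2 - R_r^2}$, which is assumed here to be the definition from ref. 22.

```python
import numpy as np


def fit_r2(X, y):
    """R^2 of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def y_randomization(X, y, n_models=50, seed=0):
    """Shuffle y, keep X fixed, refit, and summarise the random models."""
    rng = np.random.default_rng(seed)
    r2 = fit_r2(X, y)
    r2_random = float(np.mean([fit_r2(X, rng.permutation(y))
                               for _ in range(n_models)]))
    # Assumed definition: cRp^2 = R * sqrt(R^2 - average Rr^2).
    c_rp2 = float(np.sqrt(r2) * np.sqrt(max(r2 - r2_random, 0.0)))
    return float(r2), r2_random, c_rp2


# Synthetic stand-in data (NOT the CDK5/p25 data set).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(60, 3))
y_demo = X_demo @ np.array([1.0, -0.5, 0.3]) + 0.3 * rng.normal(size=60)

r2, r2_random, c_rp2 = y_randomization(X_demo, y_demo)
print(f"R2 = {r2:.3f}, average Rr2 = {r2_random:.3f}, cRp2 = {c_rp2:.3f}")
```

A model is considered robust in this test when the average $R_r^2$ of the shuffled models is far below the original $R^2$ and $^{c}R_p^2$ exceeds 0.5.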
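$$\mathrm{pIC}_{50}^{\mathrm{CDK5}}\ (\mathrm{nM}) = 6.621\,(\pm 0.1009) + 0.136\,(\pm 0.058)\,\text{R3\_SsClE\_index} - 16.22\,(\pm 0.91)\,\text{R2\_chi5chain} + 0.469\,(\pm 0.065)\,\text{R2\_SsNH2count} - 0.126\,(\pm 0.033)\,\text{R1\_k1alpha} + 1.376\,(\pm 0.276)\,\text{R1\_chi3Cluster} \tag{2}$$

$\Delta r_m^2$(training) = 0.214; $N_{\mathrm{test}}$ = 64; $R_{\mathrm{pred}}^2$ = 0.695; $\Delta r_m^2$(test) = 0.188; $\Delta r_m^2$(overall) = 0.207.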
Eqn (2) represents the best G-QSAR model obtained using the GFA linear technique. It consists of descriptors that suggest structural requirements at specific locations in the molecules (the R1, R2 and R3 positions shown in Fig. 1) for improving the activity. The predictive ability of the G-QSAR model was also found to be significant, based on the $Q^2$ = 0.686, $R_{\mathrm{pred}}^2$ = 0.695, and $\Delta r_m^2$(test) = 0.188 values. All other validation parameters, along with the Y-randomization test results (average $R_r^2$ = 0.033 and average $Q_r^2$ = −0.044 for 50 randomly generated models, $^{c}R_p^2$ = 0.70), were found to be statistically significant, confirming the reliability of the model. Like the 2D-QSAR model, this model also passes the Golbraikh and Tropsha model acceptance criteria (Table 1).
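The $r_m^2$-type metrics quoted with eqn (1) and (2) can be illustrated with a short calculation. The sketch below assumes the commonly cited unscaled definitions, $r_m^2 = r^2(1 - \sqrt{r^2 - r_0^2})$ and its counterpart with $r_0'^2$, with the average and the absolute difference of the two reported as the summary values; whether the authors used this exact variant is not stated in this section. The function name and the input numbers are hypothetical.

```python
import math


def rm2_metrics(r2, r0_sq, r0_prime_sq):
    """Return (average rm^2, delta rm^2) from r^2, r0^2 and r0'^2.

    Assumed definitions (unscaled form):
        rm^2  = r^2 * (1 - sqrt(|r^2 - r0^2|))
        r'm^2 = r^2 * (1 - sqrt(|r^2 - r0'^2|))
    The absolute value only guards against tiny negative differences.
    """
    rm2 = r2 * (1.0 - math.sqrt(abs(r2 - r0_sq)))
    rm2_prime = r2 * (1.0 - math.sqrt(abs(r2 - r0_prime_sq)))
    return (rm2 + rm2_prime) / 2.0, abs(rm2 - rm2_prime)


# Hypothetical inputs chosen for illustration only.
average_rm2, delta_rm2 = rm2_metrics(r2=0.70, r0_sq=0.69, r0_prime_sq=0.60)
print(f"average rm2 = {average_rm2:.3f}, delta rm2 = {delta_rm2:.3f}")
```

A small $\Delta r_m^2$ means that the two regressions through the origin (observed against predicted, and predicted against observed) agree closely.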
$$\mathrm{pIC}_{50}^{\mathrm{CDK5-CDK2}}\ (\mathrm{nM}) = -7.133\,(\pm 1.24) + 3.8522\,(\pm 0.419)\,\text{Jurs\_FPSA\_2} - 4.39\,(\pm 0.549)\,\text{R2\_chi3Cluster} + 5.65\,(\pm 1.43)\,\text{R3\_ElectronegativityCount} \tag{3}$$

$\Delta r_m^2$(training) = 0.102.

Eqn (3) corresponds to the best QAAR model obtained using the GFA linear technique. This model helps in understanding the structural requirements to attain or improve selectivity over CDK2. The internal validation parameters, such as $Q^2$ = 0.790 and $\Delta r_m^2$(training) = 0.102, together with the LMO results (shown in Table 2), indicate the good predictive potential of the developed QAAR model. The Y-randomization results (average $R_r^2$ = 0.173 and average $Q_r^2$ = −0.405 for 50 randomly generated models, $^{c}R_p^2$ = 0.78) further confirm the robustness of the QAAR model.
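The Leave-One-Out and Leave-Many-Out cross-validation mentioned above can be sketched as below. This is a minimal illustration assuming a plain least-squares refit on fixed descriptors and $Q^2 = 1 - \mathrm{PRESS}/\sum (y - \bar{y})^2$; the function names, the fold count and the synthetic data (18 rows, matching only the size of the QAAR set) are assumptions, not the authors' protocol.

```python
import numpy as np


def _fit_predict(X_train, y_train, X_test):
    """Fit least squares with intercept on the training rows, predict the rest."""
    A = np.column_stack([np.ones(len(y_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef


def q2_cross_validated(X, y, n_folds=None, seed=0):
    """Q^2 from cross-validation; n_folds=None gives Leave-One-Out,
    a smaller n_folds gives a Leave-Many-Out scheme."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, n if n_folds is None else n_folds)
    press = 0.0  # predicted residual sum of squares
    for test in folds:
        train = np.setdiff1d(idx, test)
        pred = _fit_predict(X[train], y[train], X[test])
        press += float(np.sum((y[test] - pred) ** 2))
    return 1.0 - press / float(np.sum((y - y.mean()) ** 2))


# Synthetic stand-in data (NOT the reported compounds).
rng = np.random.default_rng(7)
X_demo = rng.normal(size=(18, 3))
y_demo = X_demo @ np.array([1.2, -0.8, 0.5]) + 0.2 * rng.normal(size=18)

print(f"Q2 (LOO) = {q2_cross_validated(X_demo, y_demo):.3f}")
print(f"Q2 (LMO, 3 folds) = {q2_cross_validated(X_demo, y_demo, n_folds=3):.3f}")
```

LMO is the more stringent of the two because each refit sees a noticeably smaller training set, so a model that keeps a high $Q^2$ under LMO is less likely to depend on a few individual compounds.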
The contributions of descriptors found in 2D-QSAR, G-QSAR and QAAR models (eqn (1)–(3)) are discussed below and compared with docking results for better explanation.
Fig. 2 Top docking pose of a known ligand (co-crystal ligand, PDB ID: 1UNH) used for validation of the set docking protocol (with Glide gscore = −8.333).
R3-SsClE-index (G-QSAR model) is an electrotopological state index, which carries information about the topology of an atom and the electronic interactions due to all other atoms in the molecule. This descriptor shows a positive contribution to the activity. Here, R3-SsClE-index corresponds to the number of chlorine atoms attached through a single bond at position R3. It suggests that a chlorine atom at the R3 position improves the activity. This agrees with the S_sCl descriptor (2D-QSAR model), also an electrotopological state index, which is the sum of the electrotopological state values of singly bonded chlorine atoms. The effect can be observed in compounds C7a and C7b, where the decrease in R3-SsClE-index due to removal of chlorine at R3 results in a decrease in activity as well as docking score (Fig. 3).
The R2-Chi5chain (G-QSAR model) descriptor is a retention index for a five-membered ring at position R2. This descriptor shows a negative contribution to the activity, and the largest in magnitude. All the compounds having a five-membered 1,2,4-triazole ring at the R2 position show poor activity (pIC50 nM < 6). Hence, this descriptor effectively singles out all the compounds with pIC50 < 6. This agrees with the descriptor D/Dtr05 (2D-QSAR model), which is a distance/detour ring index of order 5. The docking analysis showed that the presence of a five-membered ring affects the interactions and hence the docking score. This can be observed in compounds C9e and C10e, as shown in Fig. 4.
Fig. 4 The best docked conformations of compounds C9e and C10e within the binding site of the CDK5 enzyme.
The R2-SsNH2count (G-QSAR model) is an electrotopological state index descriptor showing a positive contribution to the activity. It gives the total number of –NH2 groups attached through a single bond at the R2 position. Along with the R2-Chi5chain descriptor, it effectively separates the compounds into active and less active (pIC50 nM < 6) classes. Compounds with an –NH2 group but without a five-membered 1,2,4-triazole ring at the R2 position (e.g., B6a) show higher activity, whereas compounds having an –NH2 group along with a five-membered 1,2,4-triazole ring (e.g., B9a) or lacking the –NH2 group (e.g., B4a) show lower activity. This agrees with the descriptor Atype_N_69 (2D-QSAR model), which designates the presence of atom type ‘N’ as seen in the fragments Ar–NH2 and X–NH2 (‘Ar’ represents aromatic groups and ‘X’ represents any heteroatom). From the docking studies, it is observed that the –NH2 group plays a role in hydrogen bonding with the active site residues Asp 144 and Gln 130. In the presence of the ring structure, these interactions may be affected. For example, changes in the interactions, docking score and activity values with changes in the R2 group are seen in compounds B4a, B9a and B6a (Fig. 5).
Fig. 5 The best docked conformations of compounds B4a, B9a, and B6a within the binding site of the CDK5/p25 enzyme complex.
The R1-chi3Cluster (G-QSAR model) descriptor reflects the amount of branching, the ring structure present and the flexibility at position R1. It shows a positive contribution to the CDK5 activity, suggesting that a ring structure or branching at R1 contributes positively to the activity. For example, when the phenyl group of compound C9i is replaced with a methyl group at the R1 position (C9e), the activity value decreases from 7.494 to 6.473 (pIC50 nM).

The R1-k1alpha (G-QSAR model) descriptor is the first alpha-modified shape index. It shows a negative contribution to the activity, and its contribution to the response is marginal.

The Atype_C_43 (2D-QSAR model) descriptor indicates the number of carbon atoms of a specified type, i.e., X--CR+++X (‘X’ represents any heteroatom, ‘R’ represents any group linked through carbon, ‘--’ represents aromatic bonds as in benzene or delocalized bonds such as the N–O bond in a nitro group, and ‘+++’ represents aromatic single bonds). It shows a negative contribution to the activity, and the largest in magnitude in the 2D-QSAR model. It effectively separates the compounds into active and less active (pIC50 nM < 6) classes, with values of 1 and 2 respectively.

S_aasN (2D-QSAR model) is an electrotopological state index that shows a positive contribution to the activity. Here, S_aasN refers to nitrogen atoms with two aromatic bonds and one single bond. The contribution is marginal: the activity increases with the S_aasN value up to 2. A further increase is accompanied by a decrease in activity, as observed for compounds C10a–C10l, which bear a five-membered 1,2,4-triazole ring at the R2 position that lowers the activity (negative contribution of the R2-Chi5chain descriptor).
The R2-chi3Cluster descriptor reflects the amount of branching, the ring structure present and the flexibility at position R2. It shows a negative contribution to the selectivity. A ring structure or a higher degree of branching at position R2 increases the chi3Cluster value and decreases selectivity. The docking studies show that the presence of such structures affects the interactions with the Asp 144, Glu 12 and Gln 130 residues. It can be concluded that an increase in branching or the presence of a ring structure at the R2 position decreases selectivity (example shown in Fig. 6).
Fig. 6 The best docked conformations of compounds B6a and B8a within the binding site of the CDK5/p25 enzyme complex.
The R3-ElectronegativityCount descriptor is a measure of electronegativity. It shows a positive contribution to the selectivity. Even a small change in the descriptor value affects selectivity. Removal of chlorine from the R3 group reduces the R3-ElectronegativityCount value from 0.597 to 0.55, resulting in a decrease in selectivity (from 1.312 to 0.963) as well as in activity (as suggested by the positive contribution of the R3-SsClE-index descriptor in the G-QSAR model) and docking score (example shown in Fig. 3).
In total, there are 224 structures, and it was not easy to draw any conclusion about structure–activity relationships (SAR) from visual inspection of the 2D structures and experimental activities. The QSAR results make the SAR of this series of compounds clearer and allow specific conclusions to be drawn. Once the SAR has been interpreted from the QSAR models, it is confirmed with docking studies. Without prior information about the SAR, it is difficult to focus on specific features and their interaction patterns from docking studies alone. The reported QSAR models also explain the SAR well and pass all the requisite validation criteria, which supports their predictive ability; hence they are useful for predicting the activity/selectivity of compounds of a similar class.
Fig. 7 Summary of the mechanistic interpretation of 2D-QSAR, G-QSAR, QAAR and docking studies for CDK5/p25 inhibitors using compound 7a (one of the most active compounds).
(1) The presence of branching or a ring structure at the R2 position affects the activity as well as the selectivity, since it hinders interactions of the molecules with active site residues (Glu 12, Asp 144, and Gln 130). This conclusion is based on G-QSAR, QAAR, and docking studies.
(2) The presence of an –NH2 group at the R2 position is important for the activity, since it plays a significant role in the interaction with active site residues (Asp 144 and Gln 130). This conclusion is based on G-QSAR, 2D-QSAR and docking studies.
(3) A chlorine atom at R3 is found to be important for the activity as well as the selectivity, based on G-QSAR, 2D-QSAR and QAAR studies.
(4) The presence of a ring structure such as a 4-chlorobenzyl group at the R1 position is required for the activity, based on the positive contribution of the R1-chi3Cluster descriptor in the G-QSAR model. Hence a ring or branched structure at the R1 position is important for the activity.
(5) The presence of an –NH– fragment is found to be essential, since it interacts with an active site residue (Ile 10) in most of the compounds. The Ile 10 residue was earlier found to be an essential residue for biological activity in a molecular dynamics study.29
To design new molecules with improved activity/selectivity, one has to retain the requisite structural features listed above while trying different scaffolds (e.g., different ring structures or branched structures at the R1 position) at various positions on similar backbones; the activity or selectivity values can then be predicted using the reported QSAR models. The newly designed molecules can then be docked to the CDK5/p25 enzyme to observe the changes in interactions with active site residues and in docking scores caused by the newly added structural features. The developed models can also be used as efficient query tools for screening potent and selective CDK5/p25 inhibitors. This study offers an understanding of the essential structural features or properties of the molecules for proper binding in the active site of the CDK5/p25 enzyme.
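Predicting a response for a newly designed molecule from the reported models amounts to evaluating a linear equation. The sketch below encodes the coefficients of eqn (1) and eqn (3) as dictionaries; the helper name `predict` and the descriptor values in the first example are hypothetical placeholders, since real values would have to come from the same descriptor software used to build the models. The second example uses the R3_ElectronegativityCount values quoted above (0.597 and 0.55) and holds the other two descriptors at arbitrary constant values.

```python
# Coefficients copied from eqn (1) (2D-QSAR) and eqn (3) (QAAR).
EQN1 = {"intercept": 7.65, "S_aasN": 0.504, "Atype_N_69": 0.518,
        "D/Dtr05": 0.0017, "Atype_C_43": -2.465, "S_sCl": 0.026}
EQN3 = {"intercept": -7.133, "Jurs_FPSA_2": 3.8522,
        "R2_chi3Cluster": -4.39, "R3_ElectronegativityCount": 5.65}


def predict(model, descriptors):
    """Evaluate a linear QSAR equation for one set of descriptor values."""
    return model["intercept"] + sum(
        coef * descriptors[name]
        for name, coef in model.items() if name != "intercept"
    )


# Hypothetical descriptor values for a designed molecule (illustration only).
candidate = {"S_aasN": 1.8, "Atype_N_69": 1, "D/Dtr05": 120.0,
             "Atype_C_43": 1, "S_sCl": 5.9}
print(f"predicted pIC50 (eqn 1): {predict(EQN1, candidate):.3f}")

# Effect of the R3 chlorine on eqn (3), other descriptors held constant
# at arbitrary placeholder values.
with_cl = {"Jurs_FPSA_2": 1.5, "R2_chi3Cluster": 0.2,
           "R3_ElectronegativityCount": 0.597}
without_cl = dict(with_cl, R3_ElectronegativityCount=0.55)
change = predict(EQN3, without_cl) - predict(EQN3, with_cl)
print(f"change in predicted selectivity: {change:.3f}")
```

The change attributable to this one descriptor is 5.65 × (0.55 − 0.597) ≈ −0.27; the experimentally observed drop quoted above (1.312 to 0.963) is larger, which is expected because the other descriptors also change when the chlorine is removed.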
Footnotes
† Electronic supplementary information (ESI) available: Table S1 contains a summary (list of descriptors and values of various statistical and validation metrics) of the reported QSAR and QAAR models. Tables S2, S3 and S4 list the results of the 2D-QSAR, G-QSAR and QAAR models, respectively (compound names, descriptor values, activity values (observed, calculated and LOO predicted) and docking scores). The structures of all 224 compounds considered in the G-QSAR and 2D-QSAR models are provided in a .sdf file (224cdk5ExploitH.sdf). The structures of the 18 compounds considered in the QAAR model are provided in another .sdf file (18cdk5.sdf). See DOI: 10.1039/c3ra46861e
‡ Presently at Manchester Institute of Biotechnology, Manchester M1 7DN, UK.
This journal is © The Royal Society of Chemistry 2014