Huiwen Wang (a), Zeyu Guan (a), Jiadi Qiu (a), Ya Jia (a), Chen Zeng (a, b) and Yunjie Zhao* (a)
(a) Department of Physics, Institute of Biophysics, Central China Normal University, Wuhan 430079, China. E-mail: yjzhaowh@mail.ccnu.edu.cn
(b) Department of Physics, The George Washington University, Washington DC 20052, USA
First published on 10th January 2020
Kinase proteins have been intensively investigated as drug targets for decades because of their crucial involvement in many biological pathways. Most kinase drugs target the catalytic ATP pocket, which is highly conserved across the kinome, and as such often leads to potential side effects. It is thus highly desirable to develop non-ATP-competitive drugs that inhibit kinase activity via allosteric interactions. However, to elucidate the complex allosteric mechanism, it is essential to build a novel method to characterize a comprehensive non-catalytic pocket for the structurally well-covered human kinome. In this work, we developed a hybrid approach of sequence, structure and network analysis on 168 representative kinases to identify group-specific non-catalytic pockets. The geometric analysis was performed to cluster these pockets and to identify group-specific non-catalytic pockets based on their shape and location characteristics. Subsequent sequence evolutionary analysis reveals the crucial residues of each pocket that will likely interact with inhibitors binding to the pocket. These residues thus serve as potential biomarkers of each pocket for inhibitor design. Moreover, the residue–residue interaction network analysis was performed to elucidate the complex allosteric mechanism of these non-catalytic pockets. The final list of 14 group-specific non-catalytic pockets and their characterized structural, sequence and network features can be an enabling dataset for drug design effort at the human kinome level. The developed hybrid approach is able to identify group-specific non-catalytic pockets and will benefit the research related to human kinome drug design.
Kinase proteins have been intensively investigated as drug targets for decades.6–8 Currently, 518 kinases encoded in the human genome9,10 are classified into eight groups (CK1, STE, CAMK, AGC, CMGC, TK, TKL, and RGC). Structurally, most of the kinase proteins share a similar topology that consists of N-terminal and C-terminal lobes.11 The N-terminal lobe, containing five β strands and at least one α C-helix, is highly conserved in the human kinome, while the C-terminal lobe, which is composed of α helix, activation loop, and substrate binding groove, shows more sequence variation.
At present, kinase inhibitors are broadly classified into two classes depending on whether an inhibitor is ATP-competitive or allosteric.12 The former class is subdivided into type I and type II inhibitors, which target the active and inactive ATP pocket, respectively, while the latter is subdivided into type III and type IV inhibitors, which target non-catalytic pockets near or far away from the ATP pocket, respectively. The vast majority of kinase drugs currently in clinical use are ATP-competitive inhibitors. While the pockets targeted by some allosteric inhibitors remain unknown, a few are known to target certain allosteric pockets near the ATP pocket.12,13 Volkamer et al.9 systematically analyzed the geometric characteristics of the ATP pocket in the human kinome and found that they are highly conserved. Thus, ATP-competitive drugs may cause undesirable side effects such as hand-foot skin reaction, hypertension and acute renal failure.14–16 Highly selective allosteric inhibitors (type III and IV) with minimal side effects are therefore needed. Several case studies of allosteric inhibitors have been reported. For example, the inhibitor trametinib targets the non-catalytic pockets for MEK and BRAF kinase proteins.17 Cobimetinib is another MEK allosteric inhibitor currently in clinical trials as an anticancer agent.18 Chen et al.19 developed peptide inhibitors that bind to an allosteric pocket; their experiments show that these peptides can break the CDK2/Cyclin interface and decrease the kinase activity. Hu et al.20 identified novel allosteric inhibitors that interrupt the interaction between CDK2 and Cyclin A3. Wylie et al.21 experimentally developed the allosteric inhibitor ABL001, which binds to the allosteric pocket of the ABL1 kinase; this binding leads to C-terminal helix formation and decreases the kinase activity. However, both the number of known allosteric inhibitors and the number of kinase targets remain very limited.
It is easier to develop or screen for new drugs to target known pockets.22 Barnash et al.23 believed that target-oriented drug development not only complements a disease-focused approach but also reduces the risk of side effects. Several studies have attempted to identify specific non-catalytic pockets for drug design. For example, Chen et al.24 identified one non-catalytic pocket on CDK2 and developed corresponding peptides to inhibit CDK2 activity. Ma et al.4 proposed some potentially non-catalytic pockets in six human kinase proteins by correlation analysis between allosteric sites and catalytic sites. Given that the kinome is now well covered structurally, a comprehensive analysis of all potentially allosteric pockets may shed light on generic mechanisms of allosteric kinase inhibition.
In this article, we performed cluster analysis to identify group-specific non-catalytic (GSNC) pockets based on location distance and shape distance. GSNC pockets are highly conserved within one or several kinase groups but share little shape similarity with pockets in other kinase groups. A total of 29 GSNC pockets were identified in seven groups. We then further clustered these GSNC pockets into 14 pockets at the kinome level, some of which are shared by several groups. Inhibitors targeting these pockets are expected to have minimal side effects for diseases that involve only one or a few groups. Moreover, we performed sequence conservation and network analysis to explain the allosteric mechanism of these GSNC pockets. The developed hybrid approach is able to identify group-specific non-catalytic pockets and will benefit research related to human kinome drug design.
Fig. 1 The difference between the traditional method and our method, and the distribution of 168 human kinases in the struKin dataset on the kinome tree. (A) The difference between the traditional method and our method. The traditional method designs and screens drugs directly by analyzing the pockets of one or several proteins. We first obtained all pockets of a class of proteins and clustered these pockets by their structural features to identify specific pockets. Network analysis was then performed to explain the allosteric mechanism of the specific pockets. (B) The distribution of 168 human kinases in the struKin dataset on the kinome tree. Each red dot represents one structure. For example, one red dot represents the CDK2 structure (PDB ID: 4ACM (ref. 41)). The CDK2 structure is colored by conservation score: red (conserved residues), green (average residues), and blue (variable residues). The ATP pocket is highly conserved, while the T-loop, the C-terminal helix, and the area below the C-helix are variable.
Structurally, the N-terminal conformations including ATP pocket are highly similar in the entire human kinome. However, the C-terminal conformations are different. In addition, we also performed sequence evolutionary analysis to infer the crucial residues and projected the evolutionary scores of each residue onto the tertiary structures. The highly conserved ATP pocket (colored in red) indicates that the ATP pocket maintains the structure for biological function while some non-catalytic pockets are variable (Fig. 1B).
In terms of sequence features, we performed an evolutionary analysis of all ATP pockets using ConSurf.27,28 The continuous conservation scores are divided into a discrete scale of 9 grades, with grade 1 indicating the most variable positions and grade 9 the most conserved positions. Table S1† shows that the ATP pockets have a mean conservation score of 7.34 (±0.16). The high conservation scores and small standard deviation suggest that ATP pockets are highly conserved. For example, there are 42 residues in the ATP pocket of MAP2K2 kinase (PDB ID: 1S9I (ref. 29)). We re-numbered the column positions of these 42 residues in the alignment of the 168 kinase sequences. Fig. S2A† shows the sequence variations for ATP binding sites obtained by analyzing 15 available kinase/ATP complex structures in the struKin dataset. We divided the binding-site interactions into hydrogen bonds and hydrophobic interactions (see Datasets S1, S2, and S3 for details†). The hydrogen-bonding residues are mainly located at four positions. Two charged residues are located at positions 11 (K) and 22 (D/E). Another two residues, located at positions 34 and 37 (mostly N/D), interact with ATP via magnesium or manganese ions (Fig. S3†). The hydrophobic interaction residues are mainly located at eight positions: position 1 is mainly L or I, positions 2 and 4 are G, position 8 is V, position 9 is A, position 21 is M, F or T, position 23 is F, L or Y, and position 35 is L or M. These residues, which interact with ATP through hydrogen bonds and hydrophobic interactions, are mainly uniformly distributed around the three phosphate groups and the adenosine region of ATP, respectively (shown in Fig. S2B†).
The group-specific non-catalytic (GSNC) pockets were identified by pocket location and shape similarity. We first aligned kinase structures to a reference structure for each group. Then, we detected all pockets using DoGSiteScorer.25,26 The GSNC pockets were then clustered according to the following criteria:
(a) Location distance (LD) of 8 Å between geometric centers of two pockets was used to quantify the position similarity of two pockets.
(b) Shape distance (SD) of 2.5 was used to measure the shape similarity of two pockets. The volume, depth, and surface values in SD are able to help screen the drug size, length, and interaction groups, respectively.
(c) The coverage rate of the similar pocket for each group is greater than 80%.31
Thus, a total of 29 typical GSNC pockets and 49 non-GSNC pockets (pockets for which the coverage rate of similar pockets in a given group is not greater than 80%) were identified, as listed in Table 1. The 29 typical GSNC pockets are visualized in Fig. 2.
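The three clustering criteria can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Pocket` record, the function names, and the use of "less than or equal to" at the LD and SD cutoffs are hypothetical, and the shape distance is passed in as a caller-supplied function because its formula is not reproduced in this section.

```python
from dataclasses import dataclass
from math import dist
from typing import Callable, Dict, List, Tuple

LD_CUTOFF = 8.0          # criterion (a): location distance cutoff in angstroms
SD_CUTOFF = 2.5          # criterion (b): shape distance cutoff
COVERAGE_CUTOFF = 0.80   # criterion (c): coverage must be greater than 80%


@dataclass(frozen=True)
class Pocket:
    """Hypothetical pocket record; coordinates are in the aligned group frame."""
    center: Tuple[float, float, float]  # geometric center (angstroms)
    volume: float
    surface: float
    depth: float


ShapeDistance = Callable[[Pocket, Pocket], float]


def is_similar(p: Pocket, q: Pocket, shape_distance: ShapeDistance) -> bool:
    """Two pockets match when both the location and shape cutoffs are met."""
    location_distance = dist(p.center, q.center)  # Euclidean distance of centers
    return location_distance <= LD_CUTOFF and shape_distance(p, q) <= SD_CUTOFF


def coverage_rate(ref: Pocket, group: Dict[str, List[Pocket]],
                  shape_distance: ShapeDistance) -> float:
    """Fraction of kinases in the group with at least one pocket similar to ref."""
    hits = sum(
        any(is_similar(ref, q, shape_distance) for q in pockets)
        for pockets in group.values()
    )
    return hits / len(group)


def is_gsnc(ref: Pocket, group: Dict[str, List[Pocket]],
            shape_distance: ShapeDistance) -> bool:
    """Criterion (c): the similar pocket covers more than 80% of the group."""
    return coverage_rate(ref, group, shape_distance) > COVERAGE_CUTOFF
```

Here `group` maps each kinase in one group to the list of pockets detected on its aligned structure. Keeping the shape distance as a parameter lets the same skeleton be reused with whatever SD definition is adopted.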
Group | Kinase | PDB ID | GSNC pockets | Non-GSNC pockets |
---|---|---|---|---|
CMGC | CLK1 | 1Z57 (ref. 56) | p2, p3, p4, p6, p7 | p1, p5, p8 |
AGC | AKT1 | 4GV1 (ref. 59) | p1, p3, p4, p5, p8 | p2, p6, p7, p9, p10, p11, p12, p13, p14 |
TKL | PIPK2 | 5J7B (ref. 60) | p1, p3, p5 | p2, p4, p6, p7, p8, p9, p10, p11 |
TK | JAK1 | 3EYG (ref. 58) | p1, p2, p3, p6, p8 | p4, p5, p7, p9 |
CAMK | CaMK1α | 4FG8 (ref. 61) | p1, p2, p3 | p4, p5, p6, p7, p8, p9, p10, p11 |
STE | MST3 | 3A7I (ref. 62) | p5, p9 | p1, p2, p3, p4, p6, p7, p8, p10, p11, p12, p13 |
CK1 | CK1α | 5FQD (ref. 63) | p1, p2, p3, p6, p7, p12 | p4, p5, p8, p9, p10, p11 |
Fig. 2 A total of 29 group-specific non-catalytic (GSNC) pockets in seven kinase groups. The kinase structures and identified GSNC pockets are shown as cartoon and surface, respectively.
Unlike the large ATP-binding pockets with volume around 600 Å³, the identified GSNC pockets are smaller with volume around 200 Å³. Previous research reported that the small non-catalytic pockets may act as allosteric sites for kinase inhibition. For example, Ma et al.4 identified 13 non-catalytic pockets in 6 human kinases (CDK2, CK2, Chk1, MAP14, MAP8, and c-Abl). The average volume, surface area and depth values of these 13 non-catalytic pockets are 263.27 (±110.58) Å³, 414.64 (±148.25) Å² and 10.23 (±4.75) Å (Table S2†). Another experiment performed by Comess et al. demonstrated a small non-catalytic pocket (volume = 140.67 Å³, surface area = 182.60 Å², depth = 8.59 Å) in JNK1α1 is able to bind an inhibitor (PDB ID: 3O2M (ref. 35)). This pocket is located at the same position as p2 of CLK1 kinase in the CMGC group. In addition, some allosteric inhibitors (such as PDB ID: 4M12 (ref. 36)) were developed to target a non-catalytic pocket of ITK kinase. This pocket is located at the same position as p6 (volume = 158.85 Å³, surface area = 334.14 Å², depth = 11.79 Å) of JAK1 in TK group. Taken together, these results show that the identified GSNC pockets may act as allosteric sites for inhibitor binding.
First, we compared the sequence conservation scores of identified GSNC pockets and non-GSNC pockets on the protein surface (Fig. 3D and Table S1†). The identified GSNC pockets (average conservation score = 5.79 ± 1.24) are less conserved than the ATP pockets, but more conserved than non-GSNC pockets (average conservation score = 4.95 ± 1.13). These results suggest that there may be some more conserved residues within each GSNC pocket that can serve as the biomarkers for the GSNC pockets.
Then, we identified the crucial residues for different GSNC pockets using WebLogo.38,39 For example, the p7 (TL) pocket from CLK1 kinase in the CMGC group is able to accommodate a new class of inhibitors distinct from the traditional ATP-competitive inhibitors. We performed sequence variation analysis of this pocket using all kinase sequences (Fig. 4A) and CMGC group sequences (Fig. 4B). Residue Tyr180 (position 4) is highly conserved in the CMGC group but shows variation across the entire kinome. This result indicates that Tyr180 may be a crucial residue for the p7 (TL) pocket of CLK1 kinase in the CMGC group, in agreement with a previous experiment.24 In addition, Yang et al.40 demonstrated that the Glu–Arg pair serves as a central hub of connectivity between these two structurally conserved elements in EPKs. Mutations of either residue would disrupt communication between the two segments as well as the rest of the protein, leading to altered catalytic activity and enzyme regulation. Residue Pro271 (position 12) in CDK2 (PDB ID: 4ACM (ref. 41)) shields the Glu–Arg ion pair from solvent, which suggests that Pro271 (position 12) may also be a crucial residue that serves as a biomarker for the p7 (TL) pocket of CLK1 kinase in the CMGC group.
Fig. 4 Sequence variation analysis of the p7 (TL) pocket from CLK1 kinase (PDB ID: 1Z57 (ref. 56)) in the CMGC group. The differences between (A) the kinome level and (B) the CMGC group level show specificity. The network representation (C) and surface model (D) of the p7 pocket indicate that Tyr180 (position 4) and Pro271 (position 12) may be the crucial residues for the p7 (TL) pocket from CLK1 kinase in the CMGC group. This observation agrees with previous experiments.24,40
Group | Reference kinase | PDB ID | ATP pocket | GSNC pocket's ranking | Non-GSNC pocket's ranking |
---|---|---|---|---|---|
CMGC | CLK1 | 1Z57 | p0(0.350) | p3(0.371) > p4(0.339) = p6(0.339) > p2(0.321) > p7(0.320) | p1(0.333) > p8(0.290) > p5(0.272) |
AGC | AKT1 | 4GV1 | p0(0.360) | p1(0.371) > p5(0.367) > p4(0.350) > p8(0.331) > p3(0.313) | p12(0.363) > p13(0.360) > p7(0.347) > p10(0.310) > p6(0.308) > p14(0.307) > p11(0.299) > p9(0.296) > p2(0.285) |
TKL | PIPK2 | 5J7B | p0(0.362) | p1(0.368) > p3(0.356) > p5(0.325) | p8(0.351) > p11(0.350) > p7(0.346) > p4(0.343) > p10(0.335) > p9(0.325) > p2(0.322) > p6(0.291) |
TK | JAK1 | 3EYG | p0(0.374) | p2(0.408) > p1(0.370) > p6(0.369) > p8(0.356) > p3(0.324) | p4(0.343) > p7(0.319) > p9(0.315) > p5(0.275) |
CAMK | CaMK1α | 4FG8 | p0(0.372) | p2(0.357) > p1(0.346) > p3(0.325) | p5(0.375) > p7(0.371) > p6(0.367) > p10(0.362) > p4(0.356) > p8(0.336) > p9(0.328) > p11(0.298) |
STE | MST3 | 3A7I | p0(0.375) | p9(0.375) > p5(0.326) | p12(0.395) > p8(0.371) > p4(0.367) > p7(0.347) > p1(0.332) > p13(0.331) > p3(0.328) > p10(0.324) > p6(0.323) > p11(0.315) > p2(0.301) |
CK1 | CK1α | 5FQD | p0(0.364) | p1(0.386) > p7(0.356) > p3(0.351) > p2(0.347) > p6(0.344) > p12(0.339) | p10(0.391) > p11(0.382) > p4(0.353) > p8(0.349) > p5(0.323) > p9(0.318) |
Fig. 5 The network (closeness and shortest paths) analysis of pockets. (A) The pocket closeness is defined as the average closeness of all residues in the pocket. The results show that the average closeness values of ATP pockets, group-specific non-catalytic (GSNC) pockets, and non-GSNC pockets are 0.37 ± 0.01, 0.35 ± 0.02 and 0.33 ± 0.03, respectively. (B and C) The average shortest paths of the GSNC pockets (orange) and the non-GSNC pockets (blue) to the Asp–Phe–Gly (DFG) residues. A large conformational change of the DFG residues at the N terminus of the activation segment determines whether the kinase is active or inactive.45 The shortest path of one pocket to a residue is the average shortest path of all residues in the pocket to the residue. (B) For p38α kinase, the results indicate that the average shortest path of GSNC pockets (2.10 ± 0.52) is smaller than that of non-GSNC pockets (3.08 ± 0.43), and the shortest paths of the 5 GSNC pockets to the DFG residues are ranked as follows: p6 < p2 < p4 < p3 < p7. (C) For JAK1 kinase, the results indicate that the average shortest path of GSNC pockets to the DFG residues (2.02 ± 0.83) is smaller than that of non-GSNC pockets (2.33 ± 0.34), and the average shortest paths of the 5 GSNC pockets to the DFG residues are ranked as follows: p2 < p1 < p6 < p8 < p3.
We hypothesize that closeness analysis can qualitatively identify the non-catalytic pockets that are more suitable for drug design. To test this, we constructed the protein network from the kinase crystal structures, computed the closeness values of all surface residues (excluding the ATP pockets), and classified the pockets into three categories: (1) most likely drug binding pockets (high closeness values), (2) likely drug binding pockets (intermediate closeness values), and (3) unlikely drug binding pockets (small closeness values). Fig. 6 shows the top 10 high closeness residues on the kinase surface, which identify the most likely drug binding pockets, together with experimental validation. The residues colored in red, purple and green are located at inhibitor-binding GSNC pockets with experimental validation, GSNC pockets without inhibitor, and other surface regions, respectively. For example, for the p38α kinase, nine high closeness residues are located at four GSNC pockets (p2, p3, p4 and p6 from CLK1 kinase in the CMGC group), as shown in Fig. 6A. Five of the nine high closeness residues are located at the p2 pocket from CLK1 kinase in the CMGC group, which is able to bind a molecule (molecule name: 46A; Kd = 16000 nM; PDB ID: 3O2M (ref. 35)). Fig. 6B shows another example in the TK group. Seven high closeness residues are located at four GSNC pockets (p1, p2, p6 and p8 from JAK1 kinase in the TK group). Two of the seven high closeness residues are located at the p2 pocket from JAK1 kinase in the TK group, which is able to bind a molecule (molecule name: 0O7; IC50 = 4200 nM; PDB ID: 4EBV (ref. 44)). Two of the seven high closeness residues are located at the p6 pocket from JAK1 kinase in the TK group, which is able to bind a molecule (molecule name: 1YZ; Kd = 900 nM; PDB ID: 4M12 (ref. 36)). The results suggest that closeness analysis is able to qualitatively identify the GSNC pockets that are useful for drug design. Other GSNC pockets with high closeness residues are potential binding pockets.
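As a rough illustration of the closeness calculation, the sketch below computes the standard closeness centrality of each residue in an unweighted residue–residue network and averages it over a pocket. It is written under assumptions: the rule for drawing an edge between two residues is not given in this section, so the network is supplied as a plain list of residue pairs, and the function names are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Sequence, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def build_graph(edges: Iterable[Tuple[Hashable, Hashable]]) -> Graph:
    """Undirected, unweighted residue network from a list of residue pairs."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def hop_counts(graph: Graph, source: Hashable) -> Dict[Hashable, int]:
    """Breadth-first search: shortest path length from source to each reachable node."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


def closeness(graph: Graph, residue: Hashable) -> float:
    """Standard closeness: (reachable nodes) / (sum of shortest path lengths)."""
    hops = hop_counts(graph, residue)
    total = sum(hops.values())
    return (len(hops) - 1) / total if total else 0.0


def pocket_closeness(graph: Graph, pocket: Sequence[Hashable]) -> float:
    """Pocket closeness as the mean closeness of the residues lining the pocket."""
    return sum(closeness(graph, r) for r in pocket) / len(pocket)
```

Ranking surface residues by `closeness` and pockets by `pocket_closeness` then gives the kind of ordering discussed above; the thresholds separating high, intermediate, and small closeness are not specified here and would have to be chosen by the user.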
Fig. 6 The top 10 high closeness residues on kinase surface from most likely drug binding pockets. The residues colored in red, purple and green are located at inhibitor binding GSNC pockets with experimental validation, GSNC pockets without inhibitor, and surface residues, respectively. (A) For the p38α kinase (PDB ID: 1R3C (ref. 57)), the nine high closeness residues are located at four GSNC pockets (p2, p3, p4 and p6 from CLK1 kinase in CMGC group). Five out of the nine high closeness residues are located at p2 pocket from CLK1 kinase in CMGC group which are able to bind molecule (molecule name: 46A; Kd = 16000 nM; PDB ID: 3O2M (ref. 35)).
The closeness calculation predicts druggable pockets by averaging the closeness values of all residues in the corresponding pocket. The results above show that the closeness of GSNC pockets (0.35 ± 0.02) is smaller than that of ATP pockets (0.37 ± 0.01) but larger than that of non-GSNC pockets (0.33 ± 0.03) (Table 2 and Fig. 5A). The closeness rankings are 4th out of 14 pockets in the AGC group, 1st and 3rd out of 9 pockets in the TK group, and 5th and 6th out of 8 pockets in the CMGC group. The closeness calculation performs better than the druggability calculation. These two druggable pocket prediction strategies will provide guidance for people working in the field.
A large conformational change of the Asp–Phe–Gly (DFG) residues at the N terminus of the activation segment determines whether the kinase is active or inactive.45 We therefore calculated the correlations between all non-catalytic pockets and the DFG residues. Based on the protein network, we calculated the average shortest paths between all non-catalytic pockets and the DFG residues to quantify their correlation for p38α and JAK1 kinases. The shortest path of one pocket to a residue is the average shortest path of all residues in the pocket to the residue. The shorter the average path from the pocket to the DFG residues, the stronger the ability of the pocket to regulate the DFG residues. For p38α kinase (Fig. 5B), the average shortest path of GSNC pockets to the DFG residues (2.10 ± 0.52) is smaller than that of non-GSNC pockets (3.08 ± 0.43). The average shortest path to the DFG residues of the p2 pocket, to which a molecule binds (molecule name: 46A; Kd = 16000 nM; PDB ID: 3O2M (ref. 35)), is 1.75. The shortest paths of the 5 GSNC pockets to the DFG residues are ranked as follows: p6 < p2 < p4 < p3 < p7. The results indicate that GSNC pockets are more likely to regulate the DFG residues than non-GSNC pockets. For JAK1 kinase (Fig. 5C), the average shortest path of GSNC pockets to the DFG residues (2.02 ± 0.83) is smaller than that of non-GSNC pockets (2.33 ± 0.34). The average shortest path to the DFG residues of the p2 pocket, to which a molecule binds (molecule name: 0O7; IC50 = 4200 nM; PDB ID: 4EBV (ref. 44)), is 1.37. The average shortest paths of the 5 GSNC pockets to the DFG residues are ranked as follows: p2 < p1 < p6 < p8 < p3. Again, the results indicate that GSNC pockets are more likely to regulate the DFG residues than non-GSNC pockets.
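The pocket-to-DFG measure described above can be sketched as follows. This is an illustrative version under assumptions: the residue network is treated as unweighted and connected, it is supplied as a list of residue pairs because the edge definition is not given in this section, and all function names are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Sequence, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def build_graph(edges: Iterable[Tuple[Hashable, Hashable]]) -> Graph:
    """Undirected, unweighted residue network from a list of residue pairs."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def hops_from(graph: Graph, source: Hashable) -> Dict[Hashable, int]:
    """Breadth-first search giving the shortest path length to every reachable node."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


def pocket_to_dfg_path(graph: Graph, pocket: Sequence[Hashable],
                       dfg: Sequence[Hashable]) -> float:
    """Mean shortest path length over all (pocket residue, DFG residue) pairs.

    One search is run per DFG residue, so only three searches are needed
    regardless of pocket size. A KeyError signals an unreachable residue.
    """
    total = 0
    for target in dfg:
        hops = hops_from(graph, target)
        total += sum(hops[residue] for residue in pocket)
    return total / (len(pocket) * len(dfg))
```

Computing this value for every pocket of one kinase and sorting in ascending order yields a ranking of the kind reported for p38α and JAK1.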
We further clustered the 29 GSNC pockets at the kinome level using cutoffs of LD = 8 Å and SD = 2.5. A total of 14 GSNC pockets (GSNCp1 to GSNCp14) were identified in the entire human kinome, as shown in Table S5† and Fig. 7. The GSNCp1 pocket is shared by four groups (CMGC, CK1, TK, and TKL), which suggests that drugs targeting this pocket may regulate the kinase activities of these four groups. Similarly, three pockets (GSNCp2, GSNCp3, and GSNCp4) are each shared by three groups (CMGC, TKL, AGC; CMGC, CK1, AGC; and TK, CK1, CAMK, respectively), six pockets (GSNCp5 to GSNCp10) are each shared by two groups (CMGC, AGC; TK, STE; TK, AGC; CMGC, TK; STE, CK1; and AGC, CK1, respectively), and four pockets (GSNCp11 to GSNCp14) are each found in only one group (CK1; TKL; CAMK; and CAMK, respectively). This similarity analysis can help elucidate the drug effects and side effects for different GSNC pockets. For example, the GSNCp8 pocket is shared by the CMGC and TK groups, and as such, drugs targeting the GSNCp8 pocket for treating brain disease (Fig. S4C†) are expected to induce fewer side effects than ATP-competitive drugs. Similarly, because the GSNCp12 pocket is found in the TKL group only, drugs targeting the GSNCp12 pocket for treating endometriosis (Fig. S4D†) will likely have minimal side effects.
To visualize the interaction network, we constructed a force-directed graph from the crystal structure using D3.js (a JavaScript library for producing dynamic, interactive data visualizations in web browsers).46 Users can drag the nodes to rearrange the layout interactively. For example, we analyzed the p7 pocket from CLK1 kinase in the CMGC group using the protein network. The result shows that the two residues at positions 4 and 12 are located at the center of the network with high closeness and degree values (Fig. 4C). These two residues are also highly conserved in the CMGC group (Fig. 4B). Together, these results suggest that the two residues can be the critical binding residues for inhibitor design. The locations of the two residues are visualized in Fig. 4D. Similarly, we analyzed other GSNC pockets in the CMGC group. The critical residues for these GSNC pockets are listed in Table S6 and Fig. S5–S8.†
(1) Structures in the PDB database47 were extracted using the kinome UniProt ID (Dataset S4.xls).
(2) Structures with fewer than 250 residues were removed because 90% of kinases have more than 250 residues (Hanks and Hunter, 1995).
(3) Structures of low resolution (>4 Å)48 were removed.
(4) When several experimentally determined structures were available for a given kinase, the one with the highest resolution was selected.
(5) All structures were optimized using the template-based structure modeling tool SWISS-MODEL49 to fill in the missing atoms.
This finally resulted in 168 structures contained in the kinase structure dataset (struKin dataset, Fig. 1B).
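Steps (2) to (4) of this filtering procedure can be sketched as a small selection routine. The `Structure` record and its field names are hypothetical, the input is assumed to have been collected already by UniProt ID as in step (1), and the SWISS-MODEL completion in step (5) is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

MIN_RESIDUES = 250     # step (2): structures with fewer residues are removed
MAX_RESOLUTION = 4.0   # step (3): structures worse than 4 angstroms are removed


@dataclass(frozen=True)
class Structure:
    """Hypothetical record for one experimentally determined kinase structure."""
    uniprot_id: str
    pdb_id: str
    n_residues: int
    resolution: float  # angstroms; a lower value means a higher resolution


def select_representatives(structures: Iterable[Structure]) -> Dict[str, Structure]:
    """Keep one structure per kinase: the best-resolved one that passes both filters."""
    best: Dict[str, Structure] = {}
    for s in structures:
        if s.n_residues < MIN_RESIDUES or s.resolution > MAX_RESOLUTION:
            continue  # fails step (2) or step (3)
        current = best.get(s.uniprot_id)
        if current is None or s.resolution < current.resolution:
            best[s.uniprot_id] = s  # step (4): prefer the highest resolution
    return best
```

A kinase whose structures all fail the filters is simply absent from the returned dictionary, which is how such a procedure would reduce the full kinome to a smaller structural dataset.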
(1) For each of the seven groups (Table 1), a typical kinase structure was randomly selected as the reference structure to which all other structures in that group were aligned using PyMOL (http://www.pymol.org).
(2) All pockets of a given kinase structure were detected and calculated using DoGSiteScorer.25,26 DoGSiteScorer is an active site identification program that identifies all pockets on the surface of a given protein structure.
(3) The reference ATP pocket was extracted from the CDK2/ATP structure (PDB ID: 1FIN50).
(4) To classify the non-catalytic pockets, we defined two similarity measures as location distance (LD) and shape distance (SD) since similar pockets should share a similar location and shape for the aligned kinases. The location distance (LD) calculates the separation of the geometric centers of two pockets with a low value indicating a similar position.
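$$\mathrm{LD} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2} \qquad (1)$$

where $(x_i, y_i, z_i)$ and $(x_j, y_j, z_j)$ are the geometric centers of pockets $i$ and $j$.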
The shape distance computes the similarity of volumes, surfaces, and depths of two pockets with a low value indicating similar tertiary shapes.
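[Equation (2): definition of the shape distance SD in terms of the volumes, surfaces, and depths of the two pockets]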
(5) Bandyopadhyay et al.31 inferred structure-based function using protein family-specific fingerprints, defined as those subgraphs found in at least 80% of the family. Following this idea, we define a pocket as a group-specific non-catalytic (GSNC) pocket if more than 80% of the kinases in one group share it.
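$$\text{coverage rate} = \frac{N_{\text{shared}}}{N_{\text{group}}} \times 100\% \qquad (3)$$

where $N_{\text{shared}}$ is the number of kinases in the group that have the similar pocket and $N_{\text{group}}$ is the total number of kinases in the group.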
In addition, the correlations between non-catalytic pockets and the DFG (Asp–Phe–Gly) residues were calculated. For a given pocket, the correlation between the pocket and a residue is defined as the average shortest path of all residues in the pocket to that residue. Thus, the correlation between a given pocket and the DFG residues is defined as the average shortest path of all residues in the pocket to the three DFG residues, as follows:
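$$\bar{d}_{P,\mathrm{DFG}} = \frac{1}{3 N_P} \sum_{i \in P} \sum_{j \in \{\mathrm{D},\mathrm{F},\mathrm{G}\}} d_{ij} \qquad (4)$$

where $P$ is the set of residues in the pocket, $N_P$ is the number of residues in $P$, and $d_{ij}$ is the shortest path length between residues $i$ and $j$ in the protein network.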
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c9ra07471f
This journal is © The Royal Society of Chemistry 2020