An improved large-scale prediction model of CYP1A2 inhibitors by using combined fragment descriptors

Xianchao Pan (a,b), Li Chao (b), Sujun Qu (b), Shuheng Huang (b), Li Yang (a,b) and Hu Mei* (a,b)
(a) Key Laboratory of Biorheological Science and Technology, Ministry of Education, Chongqing University, Chongqing 400044, China. E-mail: meihu@cqu.edu.cn; Fax: +86-23-65112677; Tel: +86-23-65102507
(b) College of Bioengineering, Chongqing University, Chongqing 400044, China

Received 25th August 2015, Accepted 24th September 2015

First published on 28th September 2015


Abstract

CYP1A2, an important member of the cytochrome P450 (CYP) superfamily, is involved in the metabolism or bioactivation of many clinical drugs and precarcinogens. Thus, accurate prediction of CYP1A2 inhibitors is of great importance in early drug discovery and cancer prevention. In this study, a dataset of more than 12 000 structurally diverse compounds was used to develop prediction models with a support vector machine (SVM). By combining two types of fragment descriptors, i.e. Molecular Hologram and MACCS descriptors, an improved radial basis function (RBF)-based SVM model was obtained, whose accuracies (ACCs), sensitivities (SENs), specificities (SPEs), and Matthews correlation coefficients (MCCs) were 90.95%, 92.40%, 89.70%, and 0.8191 for 6396 training samples, and 83.14%, 85.17%, 81.41%, and 0.6638 for 6395 test samples, respectively. The prediction capability of the SVM model was further validated on an independent dataset of 2581 samples, giving a geometric mean (G-mean) based accuracy of 70.67%. The results indicate that the combination of the two types of fragment descriptors is an extremely efficient method for eliciting the key structural features of CYP inhibitors, and thus can be employed for large-scale virtual screening of inhibitors of CYP isoforms.


Introduction

Cytochromes P450 (CYPs) are a superfamily of heme enzymes with about 60 isoforms in humans.1 It has been proved that CYPs are not only involved in oxidative metabolism of a wide range of endogenous and xenobiotic compounds, but also in the occurrence of clinically adverse drug–drug interactions (DDIs).2 Among the CYP isoforms, CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4 are of particular importance for drug metabolism, which catalyze the oxidative metabolism of approximately 90% of the currently marketed drugs.3 The broad substrate diversity makes CYPs particularly prone to be inhibited by a large number of drugs, resulting in decreased clearance and increased toxicities of co-administered drugs.4 Therefore, accurate prediction of CYP inhibitory activities and potential DDIs of drug candidates is extremely important for early drug discovery.

In human liver, CYP1A2 accounts for about 13% of total CYP content5 and metabolizes a variety of clinical drugs (e.g. clozapine, ropivacaine, olanzapine, theophylline, and terbinafine).2,6 In the past decade, in silico methods, in particular quantitative structure–activity relationship (QSAR) modeling, have become increasingly attractive for predicting potential CYP1A2 inhibitors and associated DDIs.7–14 However, the extrapolation capabilities of the available prediction models are restricted by small datasets and limited structural diversity.

In 2009, Veith et al. determined the AC50 values (half-maximal activity concentration) of 17 143 compounds against 5 CYP isoforms (1A2, 2C9, 2C19, 2D6, and 3A4) by the quantitative high-throughput screening (qHTS) technique.15 Based on this large dataset, various prediction models of CYP inhibitors have been developed using support vector machine (SVM), decision tree (DT), k-nearest neighbor (k-NN), naïve Bayes (NB), and random forest (RF) methods.16–19 One of the most interesting works comes from Cheng et al.,16 who established combined classifiers for predicting CYP inhibitors by using SVM, C4.5 DT, NB, and k-NN algorithms. For the CYP1A2 isoform, the 5-fold cross-validation (CV) accuracies of the combined classifiers are approximately 81% for 12 099 training samples (5663 inhibitors/6436 noninhibitors), and the prediction accuracies for 2804 test samples (1752 inhibitors/1052 noninhibitors) range from 70% to 73%.16 This is the first time, to the best of our knowledge, that highly predictive models have been constructed based on this large dataset. However, the combined-classifier method is somewhat complex and time-consuming for large-scale virtual screening.

Herein, a strategy of combination of Molecular Hologram and MACCS has been successfully applied to construct prediction models for CYP1A2 inhibitors. The results showed that a predictive RBF (radial basis function)-SVM model was achieved with the accuracies of 90.95% for 6396 training samples and 83.14% for 6395 test samples. The prediction capability of the RBF-SVM model was further validated by an independent dataset of 2581 samples with geometric mean based accuracy of 70.67%. Taken together, the strategy of the combined fragment descriptors provides an extremely simple, accurate, and efficient approach for predicting CYP1A2 inhibitors and can be further applied for predicting inhibitors of other CYP isoforms.

Materials and methods

Datasets

The CYP1A2 dataset for model development was derived from the PubChem BioAssay database (AID: 1851).15 The inhibitory activities (AC50) were measured by a standard protocol (http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?aid=1851). Then, the AC50 of each compound was converted to an activity score between 0 and 100. Compounds are considered inhibitors if their activity scores are larger than 40 and noninhibitors if their scores are equal to 0. Compounds with intermediate activity scores (1–39) are inconclusive and were thus removed from the dataset.
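The labeling rule above can be sketched in a few lines of Python. This is only an illustration: the function name and the constant are hypothetical, and the handling of a score of exactly 40 is an assumption, since the text states "larger than 40" for inhibitors but gives 1–39 as the inconclusive range.

```python
from typing import Optional

# Assumption: a score of exactly 40 is treated as an inhibitor, because the
# text lists 1-39 as the inconclusive range. The source does not settle this.
INHIBITOR_MIN_SCORE = 40


def label_compound(activity_score: int) -> Optional[str]:
    """Map a PubChem activity score (0-100) to a class label.

    Returns "inhibitor", "noninhibitor", or None for inconclusive compounds,
    which are dropped from the dataset.
    """
    if not 0 <= activity_score <= 100:
        raise ValueError("activity score must lie between 0 and 100")
    if activity_score == 0:
        return "noninhibitor"
    if activity_score >= INHIBITOR_MIN_SCORE:
        return "inhibitor"
    return None  # intermediate score: inconclusive, removed


# Hypothetical scores, for illustration only.
scores = {"cpd_A": 0, "cpd_B": 85, "cpd_C": 20}
labeled = {cid: label_compound(s) for cid, s in scores.items()}
kept = {cid: lab for cid, lab in labeled.items() if lab is not None}
```

Only compounds with a non-None label would enter the modeling dataset.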

According to this criterion, 13 256 compounds with inhibitor/non-inhibitor labels of CYP1A2 were extracted from the database, of which 6000 compounds were labeled as inhibitors and 7256 as non-inhibitors. Prior to further analysis, the molecular structures were pretreated with the Verify 2D module of Sybyl 8.1.20 After removing inorganic compounds, counter ions, duplicates, salts, and mixtures, the resulting 12 791 compounds (designated as Dataset I) were randomly split into a training set containing 6396 compounds and a test set containing 6395 compounds.
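A random half split of this kind might look like the sketch below. The text only says the compounds were "randomly split". The class-wise (stratified) procedure here is an assumption, chosen because it happens to reproduce the class counts reported in Table 1 (2948/2947 inhibitors and 3448/3448 noninhibitors). The function name, seed, and integer identifiers are hypothetical.

```python
import random
from collections import defaultdict


def stratified_half_split(labels, seed=0):
    """Split compound IDs into two halves, class by class.

    labels: dict mapping compound ID -> class label.
    Returns (train_ids, test_ids); the training set receives the extra
    compound when a class has an odd number of members.
    """
    rng = random.Random(seed)  # fixed seed (hypothetical) for repeatability
    by_class = defaultdict(list)
    for cid, lab in labels.items():
        by_class[lab].append(cid)

    train_ids, test_ids = [], []
    for lab in sorted(by_class):
        ids = sorted(by_class[lab])
        rng.shuffle(ids)
        n_train = (len(ids) + 1) // 2
        train_ids.extend(ids[:n_train])
        test_ids.extend(ids[n_train:])
    return train_ids, test_ids


# Placeholder IDs with the class sizes implied by Table 1 for Dataset I.
labels = {i: "inhibitor" for i in range(5895)}
labels.update({i: "noninhibitor" for i in range(5895, 12791)})
train_ids, test_ids = stratified_half_split(labels)
print(len(train_ids), len(test_ids))
```

Splitting each class separately keeps the inhibitor/noninhibitor ratio nearly identical in both subsets.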

An independent validation dataset containing 8465 compounds (4446 inhibitors/4019 noninhibitors) was collected from another PubChem BioAssay record (AID: 410). After filtering the structures as described above and removing structures already present in Dataset I, 2581 qualified compounds (designated as Dataset II) were obtained. Duplicate compounds were identified from the SLN strings of the molecules and removed by using the ligand preparation tools in Sybyl 8.1.20 The statistical descriptions of the training, test and validation sets are shown in Table 1. The PubChem ID, SMILES, and inhibitor/non-inhibitor labels of all 15 372 samples are listed in Table S1.

Table 1 Statistical description of the 15 372 unique compounds

| Dataset | Type | No. of inhibitors | No. of noninhibitors | Sum | Ratio (inhibitors/noninhibitors) |
|---|---|---|---|---|---|
| Dataset I (AID: 1851) | Training set | 2948 | 3448 | 6396 | 0.8550 : 1 |
| Dataset I (AID: 1851) | Test set | 2947 | 3448 | 6395 | 0.8547 : 1 |
| Dataset II (AID: 410) | Validation set | 1774 | 807 | 2581 | 2.1983 : 1 |
| Total | | 7669 | 7703 | 15 372 | |


Method of structural description

In recent decades, fragment descriptors have shown prominent computational efficiency and ease of interpretation in large-scale virtual screening studies.21,22 In this study, two types of fragment descriptors, i.e. Molecular Hologram and MACCS, were used for developing prediction models of CYP1A2 inhibitors.

Molecular Hologram description was carried out with the Hologram QSAR (HQSAR) module of the Sybyl 8.1 package.20,23 A Molecular Hologram is an array containing counts of molecular fragments. As depicted in Fig. 1, molecules are first broken into pre-defined structural fragments (including branched, cyclic, and overlapping fragments). Then, each unique fragment is assigned a specific large integer by means of a cyclic redundancy check (CRC) algorithm. Each integer corresponds to a bin in an integer array of fixed length L. Bin occupancies are incremented according to the fragments generated. Thus, all generated fragments are hashed into array bins in the range 1 to L. This array is the so-called Molecular Hologram, and the bin occupancies are the hologram descriptors, which contain topological and compositional molecular information. The generation of a Molecular Hologram is mainly determined by 3 parameters: fragment size, fragment distinction, and hologram length.
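The hashing step can be illustrated with a minimal Python sketch. It is not the HQSAR implementation: `zlib.crc32` stands in for the CRC routine used by Sybyl, the modulo mapping from integer to bin is an assumption, and the fragment strings are hypothetical placeholders for fragments that a real fragmentation step would produce. Bins are indexed 0 to L−1 here rather than 1 to L.

```python
import zlib


def molecular_hologram(fragments, length=401):
    """Hash fragment strings into a fixed-length array of counts.

    fragments: iterable of fragment identifiers (strings), one entry per
               occurrence of a fragment in the molecule.
    length:    hologram length L (401 is the length selected later in the text).
    """
    hologram = [0] * length
    for frag in fragments:
        # CRC-32 assigns each unique fragment a large integer (stand-in for
        # the CRC algorithm used by HQSAR).
        fragment_id = zlib.crc32(frag.encode("utf-8"))
        # Assumed mapping of the integer to a bin; different fragments may
        # collide in the same bin (many-to-one).
        hologram[fragment_id % length] += 1
    return hologram


# Hypothetical fragments; "C-C-O" occurs twice in this toy molecule.
fragments = ["C-C-O", "C-C-O", "c:c:c:c", "C-N"]
holo = molecular_hologram(fragments)
```

Because several fragments can land in the same bin, a bin count is the sum over all fragments hashed to it, which is why the hologram length is one of the parameters to optimize.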


Fig. 1 Generation of Molecular Hologram.19

MACCS descriptors, also called 166-bit MDL keys, use a dictionary that consists of 166 pre-defined substructure fragments.24 For a molecule, each bit represents the presence or absence of a certain atom type, bond type, atom environment, group, or property. If a specified substructure is present in a given molecule, the corresponding bit is set to ‘1’; otherwise, it is set to ‘0’. Thus, each molecule is described as a binary string that represents its structural features. The substructure dictionary of MACCS keys is freely available in OpenBabel (http://openbabel.org/).

SVM modeling

In the past two decades, SVM, one of the most successful machine learning algorithms, has been applied to establish prediction models of CYP inhibitors with satisfactory accuracies.16–19,25,26 In SVM classification, the original data points are first projected into a high-dimensional feature space by linear or non-linear kernel functions and then classified by constructing a hyperplane in the feature space. In this study, both linear- and RBF-kernel SVM modeling were performed by using the LIBSVM v2.9 package.27 All variables were scaled linearly to the range of [0, 1] before SVM modeling. The kernel parameter γ and the error penalty parameter C were fine-tuned by using a grid search strategy and 10-fold cross-validation.
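The preprocessing and the two kernels can be sketched in plain Python as follows. The sketch only illustrates linear scaling to [0, 1] and the standard linear and RBF kernel definitions, K(x, y) = x·y and K(x, y) = exp(−γ‖x − y‖²). It does not reproduce LIBSVM's training or the grid search. The helper names, the toy data, and the choice to map a constant column to 0 are assumptions.

```python
import math


def fit_min_max(rows):
    """Return per-column minima and maxima computed on the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale_row(row, mins, maxs):
    """Scale one descriptor vector linearly to [0, 1] using training ranges.

    Assumption: a column that is constant in the training set maps to 0.0.
    """
    return [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(row, mins, maxs)
    ]


def linear_kernel(x, y):
    """Linear kernel: the dot product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Toy descriptor matrix (hypothetical values), three compounds by three variables.
train = [[0, 2, 5], [4, 2, 1], [2, 2, 3]]
mins, maxs = fit_min_max(train)
scaled = [scale_row(r, mins, maxs) for r in train]
similarity = rbf_kernel(scaled[0], scaled[1], gamma=1.0)
```

Test and validation compounds would be scaled with the minima and maxima obtained from the training set, so that no information from held-out data enters the preprocessing.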

Assessment of model quality

The prediction performance of SVM models was assessed by sensitivity (SEN), specificity (SPE), overall accuracy (ACC), and Matthews correlation coefficient (MCC), the definitions of which are shown in eqn (1)–(4).
 
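$$\mathrm{SEN} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \quad (1)$$

$$\mathrm{SPE} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}} \quad (2)$$

$$\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}} \quad (3)$$

$$\mathrm{MCC} = \frac{\mathrm{TP} \times \mathrm{TN} - \mathrm{FP} \times \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}} \quad (4)$$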

Here, TP (true positives) is the number of inhibitors predicted as inhibitors, TN (true negatives) the number of non-inhibitors predicted as non-inhibitors, FP (false positives) the number of noninhibitors predicted as inhibitors, and FN (false negatives) the number of inhibitors predicted as noninhibitors. The value of MCC ranges from −1 to 1. A value of 1 indicates perfect agreement between predicted and observed classes, whereas −1 indicates the worst possible prediction.
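For illustration, the four measures can be computed from a confusion matrix with the short Python sketch below, using the standard definitions of sensitivity, specificity, accuracy, and MCC. The function name is hypothetical, and returning 0 for an undefined MCC is a common convention adopted here as an assumption. The example counts are not given in the paper: they were back-calculated from the training-set class sizes in Table 1 and the reported RBF-SVM percentages, and serve only as a consistency check.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return SEN, SPE, ACC (in percent) and MCC from confusion-matrix counts."""
    sen = 100.0 * tp / (tp + fn)
    spe = 100.0 * tn / (tn + fp)
    acc = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: MCC is set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return {"SEN": sen, "SPE": spe, "ACC": acc, "MCC": mcc}


# Back-calculated counts (an assumption, not reported in the paper):
# 2948 inhibitors and 3448 noninhibitors in the training set.
metrics = classification_metrics(tp=2724, tn=3093, fp=355, fn=224)
for name, value in metrics.items():
    print(name, round(value, 4))
```

With these counts the sketch returns values that agree, to the reported precision, with the training-set figures quoted for the combined-descriptor RBF-SVM model.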

Results and discussion

Chemical space and structure diversity

The chemical space of the samples in Dataset I was explored by principal component analysis (PCA) based on 25 pharmacophore and physicochemical descriptors (Table S2), which characterize molecular volume, shape, electronic, hydrophobic, and H-bond acceptor/donor properties. All descriptors were auto-scaled prior to PCA. A total of 4 significant components were obtained. The first two components explained 25.2% and 18% of the variance of the dataset, respectively.

The distribution of the samples in the first two principal components is shown in Fig. 2. It can be seen that both the training and test samples cover most of the chemical space within the 95% confidence interval and that only a minority of the samples lie outside it. This indicates that the training and test samples have similar chemical distributions. In addition, no significant difference is detected between the distributions of the inhibitors and the noninhibitors.


Fig. 2 The first 2 principal component scores of the samples in Dataset I (AID: 1851). Blue cross: inhibitors in the training set; blue diamond: noninhibitors in the training set; red cross: inhibitors in the test set; red diamond: noninhibitors in the test set. The oval-shaped curve: the 95% confidence interval of Hotelling's T².

Parameters optimization for Molecular Holograms

Before generation of Molecular Holograms, the three important parameters, namely fragment size, fragment distinction, and hologram length, were optimized by partial least squares discriminant analysis (PLS-DA) built into the HQSAR module of the Sybyl 8.1 package.20 In our experience, a fragment size of 4–7 is optimal in most cases, as it covers the most important chemical groups while limiting the number of fragments. Then, the optimal combination of fragment distinction and hologram length was systematically examined according to the performance of the PLS-DA models. In PLS-DA modeling, all variables were auto-scaled and the number of principal components was determined by 10-fold cross-validation.

The best 14 PLS-DA models established by different parameter combinations are shown in Table 2. It can be seen that no significant difference in the overall performance is observed among the 14 models. Herein, model 8 with the highest ACCs and MCCs for both the training and test sets is selected as the best PLS-DA model, of which the fragment size is 4–7, the fragment distinction is A/B/Ch/DA, and the hologram length is 401 bins. Thus, the Molecular Hologram descriptors of model 8 were used for the following SVM modeling.

Table 2 Performance of the best 14 PLS-DA models

| Model | Fragment distinction (a) | Hologram length | Training SEN (%) | Training SPE (%) | Training ACC (%) | Training MCC | Test SEN (%) | Test SPE (%) | Test ACC (%) | Test MCC |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | A/B/C | 401 | 76.29 | 81.12 | 78.89 | 0.5748 | 73.97 | 78.86 | 76.61 | 0.5288 |
| 2 | A/B/C/H | 401 | 79.65 | 75.41 | 77.36 | 0.5488 | 78.52 | 72.45 | 75.25 | 0.5082 |
| 3 | A/B/C/Ch | 401 | 75.20 | 82.05 | 78.89 | 0.5745 | 72.38 | 79.32 | 76.12 | 0.5186 |
| 4 | A/B/C/H/Ch | 353 | 80.33 | 75.00 | 77.45 | 0.5516 | 79.30 | 70.74 | 74.68 | 0.4994 |
| 5 | A/C/Ch/DA | 401 | 74.49 | 82.08 | 78.58 | 0.5681 | 72.11 | 80.63 | 76.70 | 0.5300 |
| 6 | A/B/C/DA | 257 | 72.76 | 81.96 | 77.72 | 0.5506 | 69.63 | 79.15 | 74.76 | 0.4907 |
| 7 | A/B/C/Ch/DA | 401 | 74.86 | 81.24 | 78.30 | 0.5625 | 71.19 | 77.44 | 74.56 | 0.4872 |
| 8 | A/B/Ch/DA | 401 | 76.97 | 81.67 | 79.50 | 0.5871 | 75.26 | 78.60 | 77.06 | 0.5385 |
| 9 | A/B/H/Ch | 401 | 79.82 | 76.65 | 78.11 | 0.5630 | 78.15 | 73.06 | 75.40 | 0.5105 |
| 10 | A/B/H/DA | 401 | 77.78 | 77.87 | 77.83 | 0.5554 | 77.06 | 74.91 | 75.90 | 0.5182 |
| 11 | A/B/H/Ch/DA | 401 | 78.22 | 78.02 | 78.11 | 0.5612 | 77.33 | 75.09 | 76.12 | 0.5227 |
| 12 | A/B/C/H/Ch/DA | 401 | 79.27 | 75.93 | 77.47 | 0.5504 | 78.08 | 72.80 | 75.23 | 0.5072 |
| 13 | A/C/H/Ch/DA | 307 | 77.68 | 78.92 | 78.35 | 0.5651 | 74.55 | 75.90 | 75.28 | 0.5037 |
| 14 | A/C/H/DA | 401 | 77.88 | 78.07 | 77.99 | 0.5584 | 75.98 | 74.88 | 75.39 | 0.5072 |

(a) A: atom types; B: bond types; C: connectivity; Ch: chirality; H: hydrogens; DA: H-bond donor and acceptor.


SVM classification models established by each of the fragment descriptors

First, the 401-bin Molecular Hologram descriptors and the 166-bit MACCS descriptors were used for SVM modeling separately. The performance of the optimal RBF- and linear-kernel SVM models is shown in Table 3. Overall, all SVM models show high ACCs for both the training and test sets, ranging from 77% to 83%. By comparison, the MACCS models outperform the Molecular Hologram models. Meanwhile, the ACCs of the RBF-SVM models are slightly higher than those of the linear-SVM models for both the training and test sets. Thus, the best SVM model is the MACCS model with the RBF kernel, for which the ACCs are larger than 80% for both the training and test sets. It can also be seen that all 4 SVM models outperform the best PLS-DA model, which demonstrates the superiority of the SVM modeling method.

Table 3 Performance of SVM models established by each of the fragment descriptors

| Model | Description method | Training SEN (%) | Training SPE (%) | Training ACC (%) | Training MCC | Test SEN (%) | Test SPE (%) | Test ACC (%) | Test MCC |
|---|---|---|---|---|---|---|---|---|---|
| RBF-SVM (a) | Molecular Holograms | 85.31 | 79.09 | 81.96 | 0.6421 | 82.59 | 73.72 | 77.81 | 0.5620 |
| Linear-SVM (b) | Molecular Holograms | 83.21 | 78.74 | 80.80 | 0.6176 | 80.52 | 74.07 | 77.04 | 0.5444 |
| RBF-SVM (c) | MACCS | 84.70 | 79.90 | 82.11 | 0.6441 | 84.19 | 76.54 | 80.06 | 0.6056 |
| Linear-SVM (d) | MACCS | 83.31 | 80.48 | 81.79 | 0.6361 | 82.80 | 77.18 | 79.77 | 0.5979 |

(a) RBF kernel, C = 30.8022, γ = 1.0000.
(b) Linear kernel, C = 39.5508.
(c) RBF kernel, C = 16.4872, γ = 1.3591.
(d) Linear kernel, C = 37.1545.


SVM classification models established by the combined fragment descriptors

Molecular Hologram and MACCS are two types of fragment descriptors. A Molecular Hologram is an array containing counts of molecular fragments, and it reflects a many-to-one relationship between fragments and bins. The MACCS fingerprint uses a pre-defined dictionary of structural features and denotes their presence or absence by ‘1’ or ‘0’. Therefore, MACCS keys reflect a one-to-one relationship between features and bits. In a sense, the Molecular Hologram puts emphasis on fragment types, while MACCS emphasizes atom features and environments. Thus, the two types of fragment descriptors may be, to some degree, complementary to each other, and can be combined to enhance the model's predictive power.
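One simple way to combine the two descriptor blocks is to concatenate them into a single vector per molecule, as in the Python sketch below. The text does not state how the descriptors were merged, so concatenation, the block order, and the function name are assumptions. The lengths 401 and 166 are those of the hologram and MACCS descriptors used in this study, and the example values are hypothetical.

```python
def combine_descriptors(hologram_counts, maccs_bits):
    """Concatenate hologram bin counts and MACCS bits into one vector.

    hologram_counts: non-negative integer counts (length L, e.g. 401).
    maccs_bits:      0/1 flags (166 keys).
    """
    if any(c < 0 for c in hologram_counts):
        raise ValueError("hologram counts must be non-negative")
    if any(b not in (0, 1) for b in maccs_bits):
        raise ValueError("MACCS keys must be 0 or 1")
    # Assumed order: hologram block first, MACCS block second.
    return list(hologram_counts) + list(maccs_bits)


# Hypothetical descriptor values for a single molecule.
hologram = [0] * 401
hologram[17] = 3  # three fragments hashed into bin 17
maccs = [0] * 166
maccs[41] = 1  # one substructure key present

combined = combine_descriptors(hologram, maccs)
print(len(combined))
```

The count-valued hologram bins and the binary MACCS bits live on different scales, which is one reason for scaling all variables to [0, 1] before SVM modeling.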

As expected, the overall prediction performance of the SVM models increases markedly after introducing the combined descriptors (Table 4). For the RBF-SVM model in particular, the ACCs and MCCs are strikingly high for both the training (90.95%, 0.8191) and test sets (83.14%, 0.6638). The results indicate that combining the two different types of fragment descriptors can effectively enhance the prediction performance. In comparison with earlier studies on CYP1A2 inhibitors, our RBF-SVM model is clearly more predictive and simpler (Table 4).

Table 4 Performance of the SVM models established by the combined descriptors

| Modeling method | No. of training samples | Training SEN (%) | Training SPE (%) | Training ACC (%) | Training MCC | No. of test samples | Test SEN (%) | Test SPE (%) | Test ACC (%) | Test MCC |
|---|---|---|---|---|---|---|---|---|---|---|
| RBF-SVM in this study (a) | 6396 | 92.40 | 89.70 | 90.95 | 0.8191 | 6395 | 85.17 | 81.41 | 83.14 | 0.6638 |
| Linear-SVM in this study (b) | 6396 | 87.25 | 83.56 | 85.26 | 0.7060 | 6395 | 84.32 | 78.71 | 81.30 | 0.6284 |
| Cheng et al. combined model II (c) | 12 099 | 80.00 | 82.50 | 81.30 | 0.6260 | 2804 | | | 72.00 | |
| Su et al. SVM (d) | 10 238 | | | | | 2559 | 86.80 | 74.00 | 79.80 | |
| Sun et al. RBF-SVM (e) | 7208 | | | 87.50 | | 7128 | | | 0.93 (AUC) | |
| Novotarskyi et al. ASNN (f) | 3745 | | | | | 3741 | 0.827 | 0.827 | 0.827 | 0.6530 |

(a) RBF kernel, C = 42.1016, γ = 2.2408.
(b) Linear kernel, C = 50.7842.
(c) The optimal combined model designed by Cheng et al.16 was based on SVM and k-NN algorithms, and the performance was evaluated by 5-fold cross-validation.
(d) The presented performance of the SVM model18 was obtained based on the training and test sets assembled by the random sampling strategy in that study.
(e) The accuracy for the training set was evaluated by 7-fold cross-validation, and the accuracy for the test set was measured by the area under the curve (AUC) of the receiver operating characteristic (ROC).17
(f) ASNN: associative neural networks.26


Recently, Lapins et al.25 successfully developed a unified proteochemometric (PCM) model for predicting inhibitors of five major CYP isoforms, i.e. CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4. Based on the signature information of the inhibitors and the amino acid property composition and transition information in the CYP primary sequences, high cross-validated accuracies (78–85%) and prediction accuracies for an external dataset (79–88%) were obtained by using SVM, k-nearest neighbor, and random forest classifiers. For CYP1A2 inhibitors, however, it can be seen from the ROC curve that high sensitivity for the external validation dataset can be achieved only at the cost of low specificity. In comparison with our models, the PCM models, established on about 80 000 atomic signatures and amino acid property transition information, are somewhat complex and less interpretable.

In order to further validate the prediction and extrapolation capabilities of the SVM models, an independent validation set (Dataset II) containing 2581 diverse compounds (1774 inhibitors and 807 noninhibitors) was introduced. As observed in Table 5, the two SVM models with different kernel functions achieve modest ACCs (∼65%). Although the SENs of both models are somewhat limited, the SPEs and ACCs are still satisfactory. The low SENs may be explained by the different experimental protocols and labeling methods applied in the two databases. For this imbalanced dataset, the geometric mean (G-mean) based ACC was also introduced to measure the predictive power. It can be seen that the G-mean of ∼70% is relatively high for this imbalanced dataset.

Table 5 Performance of the SVM models on the independent validation set (2581 samples)

| Model | Description method | SEN (%) | SPE (%) | ACC (%) | MCC | G-mean (a) (%) |
|---|---|---|---|---|---|---|
| RBF-SVM | Combined | 57.33 | 87.11 | 66.64 | 0.4156 | 70.67 |
| RBF-SVM | MACCS | 53.27 | 83.89 | 62.84 | 0.3494 | 66.85 |
| RBF-SVM | Molecular Holograms | 54.90 | 84.76 | 64.24 | 0.3719 | 68.22 |
| Linear-SVM | Combined | 54.40 | 86.12 | 64.32 | 0.3809 | 68.45 |
| Linear-SVM | MACCS | 52.03 | 85.01 | 62.34 | 0.3498 | 66.51 |
| Linear-SVM | Molecular Holograms | 52.82 | 83.02 | 62.26 | 0.3371 | 66.22 |

(a) $\text{G-mean} = \sqrt{\mathrm{SEN} \times \mathrm{SPE}}$
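The G-mean values in Table 5 are consistent with the geometric mean of sensitivity and specificity, which can be checked with the short Python sketch below. The function name is hypothetical, and the definition is inferred from the tabulated values rather than quoted from the text.

```python
import math


def g_mean(sen_percent, spe_percent):
    """Geometric mean of sensitivity and specificity, both in percent."""
    return math.sqrt(sen_percent * spe_percent)


# SEN and SPE of the two combined-descriptor models on the validation set.
rbf_combined = g_mean(57.33, 87.11)
linear_combined = g_mean(54.40, 86.12)
print(round(rbf_combined, 2), round(linear_combined, 2))
```

Unlike the overall accuracy, the G-mean is not dominated by the majority class, which matters here because inhibitors outnumber noninhibitors by about 2.2 to 1 in Dataset II.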


Furthermore, according to the weight coefficients of variables in the RBF-SVM model, variable screening was also performed. The results showed that no significant improvement was observed with the decreased number of fragment descriptors (Fig. 3).


Fig. 3 The performance of the combined descriptors based RBF-SVM model for the (a) training set (b) test set and (c) validation set.

Conclusions

In this study, Molecular Hologram and MACCS descriptors were combined to construct SVM classification models for CYP1A2 inhibitors on a large dataset with more than 12 000 unique compounds. The results show that the prediction performance of the RBF-SVM model based on the combined fragment descriptors was remarkably improved, with overall accuracies of 90.95% and 83.14% for the training and test sets, respectively. The SVM models were further validated on an independent dataset of 2581 samples, with a G-mean accuracy of ∼70%. The results indicate that the Molecular Hologram and MACCS descriptors are, to some degree, complementary to each other, and can be combined to enhance predictive power effectively. In comparison with earlier studies, the RBF-SVM model based on the combined descriptors is extremely simple, predictive, and especially suited to large-scale virtual screening of CYP1A2 inhibitors. Based on this work and our previous study of CYP2C19 inhibitors, we suggest that the combination of Molecular Hologram and MACCS descriptors can be considered a preferable method for the virtual screening of inhibitors of other CYP isoforms.

Acknowledgements

This research was supported by the National Natural Science Foundation of China (No. 61073135) and the ‘111’ project of Introducing Talents of Discipline to Universities.

References

  1. D. R. Nelson, D. C. Zeldin, S. M. Hoffman, L. J. Maltais, H. M. Wain and D. W. Nebert, Pharmacogenetics, 2004, 14, 1–18.
  2. S. F. Zhou, J. P. Liu and B. Chowbay, Drug Metab. Rev., 2009, 41, 89–295.
  3. D. Singh, A. Kashyap, R. V. Pandey and K. S. Saini, Drug Discovery Today, 2011, 16, 793–799.
  4. O. Pelkonen, M. Turpeinen, J. Hakkola, P. Honkakoski, J. Hukkanen and H. Raunio, Arch. Toxicol., 2008, 82, 667–715.
  5. T. Shimada, H. Yamazaki, M. Mimura, Y. Inui and F. P. Guengerich, J. Pharmacol. Exp. Ther., 1994, 270, 414–423.
  6. I. S. Lee and D. Kim, Arch. Pharmacal Res., 2011, 34, 1799–1816.
  7. T. Fox and J. M. Kriegl, Curr. Top. Med. Chem., 2006, 6, 1579–1591.
  8. J. Sridhar, J. Liu, M. Foroozesh and C. L. Stevens, Molecules, 2012, 17, 9283–9305.
  9. K. K. Chohan, S. W. Paine, J. Mistry, P. Barton and A. M. Davis, J. Med. Chem., 2005, 48, 5154–5161.
  10. F. Hammann, H. Gutmann, U. Baumann, C. Helma and J. Drewe, Mol. Pharm., 2009, 6, 1920–1926.
  11. J. Burton, I. Ijjaali, O. Barberan, F. Petitet, D. P. Vercauteren and A. Michel, J. Med. Chem., 2006, 49, 6231–6240.
  12. K. Roy and P. P. Roy, Expert Opin. Drug Metab. Toxicol., 2009, 5, 1245–1266.
  13. H. Li, J. Sun, X. Fan, X. Sui, L. Zhang, Y. Wang and Z. He, J. Comput.-Aided Mol. Des., 2008, 22, 843–855.
  14. M. P. Gleeson, A. M. Davis, K. K. Chohan, S. W. Paine, S. Boyer, C. L. Gavaghan, C. H. Arnby, C. Kankkonen and N. Albertson, J. Comput.-Aided Mol. Des., 2007, 21, 559–573.
  15. H. Veith, N. Southall, R. Huang, T. James, D. Fayne, N. Artemenko, M. Shen, J. Inglese, C. P. Austin, D. G. Lloyd and D. S. Auld, Nat. Biotechnol., 2009, 27, 1050–1055.
  16. F. Cheng, Y. Yu, J. Shen, L. Yang, W. Li, G. Liu, P. W. Lee and Y. Tang, J. Chem. Inf. Model., 2011, 51, 996–1011.
  17. H. Sun, H. Veith, M. Xia, C. P. Austin and R. Huang, J. Chem. Inf. Model., 2011, 51, 2474–2481.
  18. B. H. Su, Y. S. Tu, C. Lin, C. Y. Shao, O. A. Lin and Y. J. Tseng, J. Chem. Inf. Model., 2015, 55, 1426–1434.
  19. L. Chao, H. Mei, X. C. Pan, W. Tan, T. F. Liu and L. Yang, Chemom. Intell. Lab. Syst., 2014, 130, 109–114.
  20. Tripos Inc., St. Louis, MO, USA, 2008, available online: http://www.tripos.com.
  21. A. Varnek, Methods Mol. Biol., 2011, 672, 213–243.
  22. K. Z. Myint and X. Q. Xie, Int. J. Mol. Sci., 2010, 11, 3846–3866.
  23. T. Hurst and T. Heritage, Tripos Technical Notes, 1997, 1, 1–15.
  24. J. L. Durant, B. A. Leland, D. R. Henry and J. G. Nourse, J. Chem. Inf. Comput. Sci., 2002, 42, 1273–1280.
  25. M. Lapins, A. Worachartcheewan, O. Spjuth, V. Georgiev, V. Prachayasittikul, C. Nantasenamat and J. E. Wikberg, PLoS One, 2013, 8, e66566.
  26. S. Novotarskyi, I. Sushko, R. Korner, A. K. Pandey and I. V. Tetko, J. Chem. Inf. Model., 2011, 51, 1271–1280.
  27. C. C. Chang and C.-J. Lin, ACM Transactions on Intelligent Systems and Technology, 2010, 2, 1–27.

Footnote

Electronic supplementary information (ESI) available. See DOI: 10.1039/c5ra17196b

This journal is © The Royal Society of Chemistry 2015