Investigation of dual JAK2 and HDAC6 inhibitors using machine learning methods†
Abstract
Janus kinase (JAK) inhibitors have been extensively used to treat hematologic cancers, but problems such as drug resistance and limited efficacy persist. Designing multi-target inhibitors with synergistic effects is a promising way to address these problems. Histone deacetylase (HDAC) inhibitors are also widely employed as anticancer agents, so multi-target inhibitors acting on both JAK2 and HDAC6 may offer enhanced efficacy and safety. In this study, we established classification and regression models to facilitate the identification of dual JAK2 and HDAC6 inhibitors. Features were selected using the mutual information (MI) algorithm, and 40 classification models were constructed by combining five feature methods with eight machine learning (ML) algorithms. Among them, the K-nearest neighbors (KNN) model exhibited the best performance (ACC = 0.988, MCC = 0.970, AUC = 0.978). Additionally, four regression models were built using recursive feature elimination (RFE) to predict the inhibitory activity of JAK2 and HDAC6 inhibitors. The extreme gradient boosting (GBDT) and light gradient boosting machine (LGBM) models exhibited the best performance, with R² = 0.752 on the JAK2 test set and R² = 0.743 on the HDAC6 test set, respectively. Furthermore, we used the SHapley Additive exPlanations (SHAP) method to elucidate the features that influence the classification and regression models. With this method, we identified the key fingerprint structures driving the classification model and the descriptors related to inhibitory activity. Subsequently, molecular docking was employed to investigate how dual JAK2 and HDAC6 inhibitors interact with the JAK2 and HDAC6 proteins. In conclusion, the classification and regression models established in this study can effectively facilitate the discovery of dual JAK2 and HDAC6 inhibitors, highlighting the significant promise of machine learning in the discovery of dual inhibitors.
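To make the classification workflow summarized above more concrete, the following is a minimal, illustrative sketch of mutual-information feature selection followed by a K-nearest-neighbors vote, scored with ACC and MCC. It is not the pipeline used in this study. The toy 4-bit fingerprints, the Hamming distance, the choice of k = 3 neighbors, the two retained features, and all function names are assumptions made only for illustration.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def select_top_k(X, y, k):
    """Indices of the k feature columns with the highest MI against y."""
    scores = [mutual_information([row[j] for row in X], y)
              for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training rows nearest in Hamming distance
    (the distance metric is an assumption, not taken from the study)."""
    def dist(a, b):
        return sum(ai != bi for ai, bi in zip(a, b))
    nearest = sorted(range(len(X_train)), key=lambda i: dist(X_train[i], x))[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

def acc_mcc(y_true, y_pred):
    """Accuracy and Matthews correlation coefficient for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return (tp + tn) / len(y_true), mcc

# Hypothetical toy data: bit 0 tracks the label, the other bits are noise.
X_train = [[1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 1], [1, 1, 1, 1],
           [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 1, 1, 1]]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]
X_test = [[1, 1, 0, 1], [0, 0, 1, 1], [1, 0, 1, 1], [0, 1, 0, 1]]
y_test = [1, 0, 1, 0]

selected = select_top_k(X_train, y_train, k=2)

def reduce(rows):
    """Keep only the selected feature columns."""
    return [[row[j] for j in selected] for row in rows]

X_train_sel = reduce(X_train)
y_pred = [knn_predict(X_train_sel, y_train, x) for x in reduce(X_test)]
acc, mcc = acc_mcc(y_test, y_pred)
print("selected bits:", selected, "ACC:", acc, "MCC:", mcc)
```

On real data the feature matrix would hold molecular fingerprints or descriptors, and the number of retained features and neighbors would be tuned by cross-validation rather than fixed as in this sketch.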