Discovery of neuroprotective compounds by machine learning approaches

Jiansong Fang; Xiaocong Pang; Rong Yan; Wenwen Lian; Chao Li; Qi Wang; Ai-Lin Liu; Guan-Hua Du

doi:10.1039/C5RA23035G

Jiansong Fang,‡^ab Xiaocong Pang,‡^a Rong Yan,^a Wenwen Lian,^a Chao Li,^a Qi Wang,^b Ai-Lin Liu*^acd and Guan-Hua Du*^acd

Author affiliations

* Corresponding authors

^a Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, 1 Xian Nong Tan Street, Beijing 100050, PR China
E-mail: liuailin@imm.ac.cn, dugh@imm.ac.cn
Fax: +86-10-83150885
Tel: +86-10-83150885

^b Institute of Clinical Pharmacology, Guangzhou University of Traditional Chinese Medicine, Guangzhou 510006, China

^c Beijing Key Laboratory of Drug Target and Screening Research, Beijing 100050, PR China

^d State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Beijing 100050, PR China

Abstract

Neuronal cell death caused by oxidative stress is a major contributing factor in many neurodegenerative diseases. Phenotypic drug screening assays offer an alternative strategy for addressing this problem. This study aimed to develop machine learning models that predict neuroprotection against glutamate- or H₂O₂-induced neurotoxicity, to aid the discovery of neuroprotective compounds. Four single classifiers (neural network, k-nearest neighbors, classification tree and random forest) were constructed on two large datasets containing 1260 and 900 known active or inactive compounds, and were then integrated into combined Bayesian models to improve performance. Both combined Bayesian models (combined-NB-1 and combined-NB-2) outperformed their corresponding four single classifiers. Structural fingerprint descriptors were then added to improve the predictive ability of the models, yielding the two best models, NB-1-LPFP4 and NB-2-LCFP6. These two models gave Matthews correlation coefficients of 0.972 and 0.956, respectively, in 5-fold cross validation, and 0.953 and 0.902, respectively, on the test set. To illustrate their practical application, NB-1-LPFP4 and NB-2-LCFP6 were used for virtual screening to discover neuroprotective compounds, and 70 compounds were selected for a further cell-based assay. In the assay, 28 compounds exhibited neuroprotective effects against both glutamate-induced and H₂O₂-induced neurotoxicity. These results suggest that integrating single classifiers into combined Bayesian models is a feasible way to predict neuroprotective compounds.
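The idea of feeding single-classifier outputs into a combined Bayesian model, and then scoring it with the Matthews correlation coefficient, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the classifier outputs enter the Bayesian model, so the sketch assumes each of the four classifiers contributes a binary active/inactive vote that is treated as a Bernoulli feature with Laplace smoothing. The function names (`fit_vote_nb`, `predict_vote_nb`, `mcc`) and the toy votes are hypothetical and are not data from the study.

```python
import math

def fit_vote_nb(votes, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes over binary classifier votes.

    votes  : list of tuples, one 0/1 vote per single classifier (assumed input form)
    labels : list of 0/1 class labels (1 = active, 0 = inactive)
    alpha  : Laplace smoothing constant (an assumption, not from the paper)
    """
    n_clf = len(votes[0])
    log_prior, log_like = {}, {}
    for c in (0, 1):
        rows = [v for v, y in zip(votes, labels) if y == c]
        log_prior[c] = math.log((len(rows) + alpha) / (len(labels) + 2 * alpha))
        log_like[c] = []
        for j in range(n_clf):
            ones = sum(r[j] for r in rows)
            p1 = (ones + alpha) / (len(rows) + 2 * alpha)  # P(vote_j = 1 | class c)
            log_like[c].append((math.log(1.0 - p1), math.log(p1)))
    return log_prior, log_like

def predict_vote_nb(model, vote_row):
    """Return the class (0 or 1) with the higher posterior log score."""
    log_prior, log_like = model
    scores = {
        c: log_prior[c] + sum(log_like[c][j][v] for j, v in enumerate(vote_row))
        for c in (0, 1)
    }
    return 1 if scores[1] > scores[0] else 0

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0.0 if undefined)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical votes from four single classifiers (toy values, not study data).
toy_votes = [
    (1, 1, 1, 0), (1, 1, 0, 1), (1, 0, 1, 1), (0, 1, 1, 1),  # active compounds
    (0, 0, 0, 1), (0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 0),  # inactive compounds
]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = fit_vote_nb(toy_votes, toy_labels)
toy_pred = [predict_vote_nb(model, row) for row in toy_votes]
print("MCC on toy training votes:", mcc(toy_labels, toy_pred))
```

In this toy setup no single vote column is always right, yet the combined model can weigh all four votes together, which is the intuition behind integrating single classifiers. Scoring on the training rows is only for illustration; the study reports cross-validation and test-set values.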

This article is part of the themed collection: Machine learning and artificial neural networks in chemistry

RSC Advances
