Discovery of neuroprotective compounds by machine learning approaches
Neuronal cell death from oxidative stress is a major contributing factor in many neurodegenerative diseases. Phenotypic drug screening assays offer an alternative strategy for addressing this problem. The aim of this study was to develop predictive models of neuroprotection against glutamate- or H2O2-induced neurotoxicity using machine learning approaches, in order to aid the discovery of neuroprotective compounds. Four different single classifiers (neural network, k-nearest neighbors, classification tree and random forest) were constructed from two large datasets containing 1260 and 900 known active or inactive compounds, respectively. The single classifiers were then integrated into combined Bayesian models to obtain superior performance. Both combined Bayesian models (combined-NB-1 and combined-NB-2) outperformed their corresponding four single classifiers. Structural fingerprint descriptors were then added to improve predictive ability, yielding the two best models, NB-1-LPFP4 and NB-2-LCFP6. These two models gave Matthews correlation coefficients of 0.972 and 0.956, respectively, in 5-fold cross-validation, and 0.953 and 0.902, respectively, on the test set. To illustrate their practical application, NB-1-LPFP4 and NB-2-LCFP6 were used in a virtual screen for neuroprotective compounds, and 70 compounds were selected for further cell-based assays. In these assays, 28 compounds showed neuroprotective effects against both glutamate-induced and H2O2-induced neurotoxicity. These results suggest that integrating single classifiers into combined Bayesian models is a feasible approach for predicting neuroprotective compounds.
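For readers unfamiliar with the metric reported above, the Matthews correlation coefficient (MCC) of a binary classifier can be computed from its confusion matrix. The following is a minimal sketch using the standard MCC definition; the function name `matthews_corrcoef` and the example counts are illustrative assumptions and are not taken from the study.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard MCC from confusion-matrix counts (active = positive class).

    Returns a value in [-1, 1]: 1 is perfect prediction, 0 is no better
    than chance, -1 is total disagreement.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Hypothetical counts for illustration only (not data from the study).
example_mcc = matthews_corrcoef(tp=95, tn=90, fp=10, fn=5)
```

Unlike plain accuracy, MCC uses all four confusion-matrix cells, so it remains informative when active and inactive compounds are unevenly represented.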