Predicting selective liver X receptor β agonists using multiple machine learning methods†
Abstract
Liver X receptors (LXRs) α and β are cholesterol sensors that respond to excess cholesterol and stimulate reverse cholesterol transport. Activating LXRs is therefore a promising therapeutic option for dyslipidemia. However, activating LXRα may cause unwanted lipogenicity. A better anti-dyslipidemia strategy would be to develop selective LXRβ agonists that do not activate LXRα. In this paper, a data set of 234 selective and non-selective LXRβ agonists was collected from the literature. For the first time, we derived classification models from this data set to predict selective LXRβ agonists using multiple machine learning methods (naïve Bayesian (NB), recursive partitioning (RP), support vector machine (SVM), and k-nearest neighbors (kNN)) with optimized property descriptors and structural fingerprints. The models were optimized from a pool of 324 machine learning models, most of which showed high predictive ability (overall predictive accuracies of >80%) for both the training and test sets. The top 15 models were evaluated using an external test set of 76 compounds (all containing new scaffolds), and 10 of them displayed overall predictive accuracies exceeding 90%. The top models can be used for the virtual screening of selective LXRβ agonists. In addition, the NB models can identify privileged and unprivileged fragments for selective LXRβ agonists, and these fragments can be used to guide the design of new selective LXRβ agonists.
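To illustrate how an NB classifier over structural fingerprints can both classify compounds and rank fragments, the following is a minimal sketch in plain Python. It is not the implementation used in this work: it assumes a Bernoulli naïve Bayes model with Laplace smoothing as a simple stand-in, and the 4-bit fingerprints, labels, and function names (`fit_bernoulli_nb`, `bit_weights`, `predict`) are hypothetical toy values chosen only for illustration.

```python
import math

def fit_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary fingerprints.

    labels: 1 = selective LXR-beta agonist, 0 = non-selective (toy convention).
    alpha is the Laplace smoothing constant. Both classes must be present.
    """
    n_bits = len(fingerprints[0])
    counts = {0: [0] * n_bits, 1: [0] * n_bits}
    totals = {0: 0, 1: 0}
    for fp, y in zip(fingerprints, labels):
        totals[y] += 1
        for i, bit in enumerate(fp):
            counts[y][i] += bit
    model = {}
    for c in (0, 1):
        model[c] = {
            "log_prior": math.log(totals[c] / len(labels)),
            # Smoothed probability that bit i is set in class c.
            "p_on": [(counts[c][i] + alpha) / (totals[c] + 2 * alpha)
                     for i in range(n_bits)],
        }
    return model

def bit_weights(model):
    """Log-ratio of each bit being set in class 1 versus class 0.

    Large positive values suggest 'privileged' fragments for class 1,
    large negative values suggest 'unprivileged' ones.
    """
    return [math.log(p1 / p0)
            for p1, p0 in zip(model[1]["p_on"], model[0]["p_on"])]

def log_odds(model, fp):
    """Log-odds that a fingerprint belongs to class 1 rather than class 0."""
    score = model[1]["log_prior"] - model[0]["log_prior"]
    for bit, p1, p0 in zip(fp, model[1]["p_on"], model[0]["p_on"]):
        score += math.log(p1 / p0) if bit else math.log((1 - p1) / (1 - p0))
    return score

def predict(model, fp):
    """Return 1 if the model favors class 1, else 0."""
    return 1 if log_odds(model, fp) > 0 else 0

# Hypothetical toy data: each row is a 4-bit fragment fingerprint.
toy_fps = [
    [1, 1, 0, 0], [1, 0, 0, 1], [1, 1, 0, 1],   # labeled selective
    [0, 0, 1, 1], [0, 1, 1, 0], [0, 0, 1, 0],   # labeled non-selective
]
toy_labels = [1, 1, 1, 0, 0, 0]

toy_model = fit_bernoulli_nb(toy_fps, toy_labels)
toy_weights = bit_weights(toy_model)
```

In this toy setting, sorting `toy_weights` ranks fingerprint bits from most to least associated with the selective class, which mirrors the idea of reading privileged and unprivileged fragments off an NB model. A real workflow would use chemically meaningful fingerprints and a held-out external test set, as described above.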