A systematic approach to prioritize drug targets using machine learning, a molecular descriptor-based classification model, and high-throughput screening of plant derived molecules: a case study in oral cancer†
Systems biology-inspired identification of drug targets and machine learning-based screening of small molecules that modulate their activity have the potential to revolutionize modern drug discovery by complementing conventional methods. To exploit such pipelines, we first analyzed the dysregulated gene pairs between control and tumor samples and then implemented an ensemble-based feature selection approach to prioritize targets in oral squamous cell carcinoma (OSCC) for therapeutic exploration. Using the structural information of known inhibitors of CXCR4, one of the best targets identified in this study, we performed feature selection to identify the optimal structural features (molecular descriptors), from which a classification model was built. This CXCR4-centered, descriptor-based classification model was then used to screen a repository of plant-derived small molecules for potential inhibitors. Our methodology may assist in the effective selection of the best targets, including ones that may previously have been overlooked, which in turn may lead to the development of new oral cancer medications. The small molecules identified in this study are promising candidates for trials as potential novel anti-oral cancer agents. Importantly, the distinct steps of this study may serve as a reference for the analysis of other complex human diseases.
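As a rough illustration of the ensemble-based feature selection idea mentioned above, the following minimal sketch ranks features with several independent scoring criteria and aggregates the ranks. It is not the procedure used in this study: the three scorers, the mean-rank aggregation, the function names, and the toy descriptor values are all assumptions chosen only to keep the example self-contained.

```python
from statistics import mean, pvariance

def _split(values, labels):
    """Split feature values into active (label 1) and inactive (label 0) groups."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return pos, neg

def mean_difference(values, labels):
    """Absolute difference between the two class means."""
    pos, neg = _split(values, labels)
    return abs(mean(pos) - mean(neg))

def signal_to_noise(values, labels):
    """Class-mean difference divided by the summed class standard deviations."""
    pos, neg = _split(values, labels)
    spread = pvariance(pos) ** 0.5 + pvariance(neg) ** 0.5
    return abs(mean(pos) - mean(neg)) / (spread + 1e-9)

def abs_correlation(values, labels):
    """Absolute Pearson correlation between a feature and the class label."""
    mx, my = mean(values), mean(labels)
    cov = sum((v - mx) * (y - my) for v, y in zip(values, labels))
    sx = sum((v - mx) ** 2 for v in values) ** 0.5
    sy = sum((y - my) ** 2 for y in labels) ** 0.5
    return abs(cov) / (sx * sy + 1e-9)

def rank_features(scores):
    """Convert {feature: score} into {feature: rank}, where rank 1 is the best."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}

def ensemble_select(features, labels, scorers, top_k):
    """Rank features with every scorer, average the ranks, and keep the top_k."""
    rankings = []
    for scorer in scorers:
        scores = {name: scorer(vals, labels) for name, vals in features.items()}
        rankings.append(rank_features(scores))
    mean_rank = {name: mean(r[name] for r in rankings) for name in features}
    return sorted(features, key=mean_rank.get)[:top_k]

# Hypothetical toy data: three made-up descriptors for six molecules,
# labeled 1 (inhibitor) or 0 (non-inhibitor). Not data from this study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_descriptors = {
    "desc_a": [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],
    "desc_b": [2.0, 1.0, 3.0, 2.0, 3.0, 1.0],
    "desc_c": [3.0, 3.5, 2.5, 2.0, 2.4, 1.9],
}

selected = ensemble_select(
    toy_descriptors,
    toy_labels,
    scorers=[mean_difference, signal_to_noise, abs_correlation],
    top_k=2,
)
print(selected)  # ['desc_a', 'desc_c']
```

Averaging ranks rather than raw scores keeps any single criterion from dominating because of its scale, which is one common motivation for combining selectors. In practice the retained descriptors would then be passed to a classifier trained on known inhibitors and non-inhibitors.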