Increasing trustworthiness of machine learning-based drug sensitivity prediction with a multivariate random forest approach
Abstract
Ensuring the trustworthiness of machine learning (ML) models in high-stakes applications is crucial. One such application is anti-cancer drug sensitivity prediction, where ML models are built with the ultimate goal of integrating them into treatment recommendation systems for personalized medicine. Here, we propose MORGOTH, a trustworthy multivariate random forest method available in our package ‘morgoth’. Besides standard regression and classification functions, MORGOTH allows for the simultaneous optimization of regression and classification tasks via a joint splitting criterion. Additionally, it provides a graph representation of the random forest to address model interpretability, and a cluster analysis of the leaves that measures the dissimilarity of new inputs from the training data to address model reliability and robustness. Overall, MORGOTH provides a comprehensive approach that unites simultaneous regression and classification, interpretability, reliability, and robustness in a single framework. While our package is broadly applicable, we demonstrate its capabilities for anti-cancer drug sensitivity prediction in a comprehensive large-scale study on the Genomics of Drug Sensitivity in Cancer (GDSC) database. We trained both single-drug and multi-drug models. In either case, MORGOTH clearly outperforms state-of-the-art neural network approaches. Moreover, we highlight an evaluation issue for multi-drug models and demonstrate that single-drug models consistently outperform them when evaluated fairly.
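
To illustrate what a joint splitting criterion could look like, the following minimal Python sketch scores a candidate split by a weighted sum of the relative variance reduction (regression target) and the relative Gini impurity reduction (class label). This is an assumption made for illustration only: the abstract does not specify the exact criterion used by MORGOTH, and the function names and the `weight` parameter are hypothetical and not part of the ‘morgoth’ package.

```python
from collections import Counter


def variance(values):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def relative_reduction(impurity, parent, left, right):
    """Impurity reduction of a split, normalised by the parent impurity."""
    parent_impurity = impurity(parent)
    if parent_impurity == 0.0:
        return 0.0
    n = len(parent)
    child_impurity = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return (parent_impurity - child_impurity) / parent_impurity


def joint_split_score(y_reg, y_cls, left_mask, weight=0.5):
    """Hypothetical joint criterion: weighted sum of both relative reductions.

    y_reg:     continuous responses (e.g. a drug sensitivity measure)
    y_cls:     class labels (e.g. sensitive vs. resistant)
    left_mask: booleans, True if the sample goes to the left child
    weight:    assumed trade-off between regression (1.0) and classification (0.0)
    """
    left_reg = [y for y, m in zip(y_reg, left_mask) if m]
    right_reg = [y for y, m in zip(y_reg, left_mask) if not m]
    left_cls = [c for c, m in zip(y_cls, left_mask) if m]
    right_cls = [c for c, m in zip(y_cls, left_mask) if not m]
    reg_gain = relative_reduction(variance, y_reg, left_reg, right_reg)
    cls_gain = relative_reduction(gini, y_cls, left_cls, right_cls)
    return weight * reg_gain + (1.0 - weight) * cls_gain


# Toy data: two clearly separated groups in both targets.
y_reg = [1.0, 1.0, 5.0, 5.0]
y_cls = [0, 0, 1, 1]
good_split = joint_split_score(y_reg, y_cls, [True, True, False, False])
bad_split = joint_split_score(y_reg, y_cls, [True, False, True, False])
```

In this sketch, normalising each reduction by the parent impurity puts both terms on a comparable scale, so that `weight` controls the trade-off between the two tasks; other normalisations are equally plausible.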
