Increasing trustworthiness of machine learning-based drug sensitivity prediction with a multivariate random forest approach

Lisa-Marie Rolli; Lea Eckhart; Lutz Herrmann; Andrea Volkamer; Hans-Peter Lenhof; Kerstin Lenhof

doi:10.1039/D5DD00284B

Lisa-Marie Rolli,*^ab Lea Eckhart,^acd Lutz Herrmann,^b Andrea Volkamer,^b Hans-Peter Lenhof^a and Kerstin Lenhof^acde

Author affiliations

* Corresponding authors

^a Chair for Bioinformatics, Center for Bioinformatics, Saarland University, Saarland Informatics Campus, Saarland, Germany
E-mail: lisa-marie.rolli@uni-saarland.de

^b Chair for Data Driven Drug Design, Center for Bioinformatics, Saarland University, Saarland Informatics Campus, Saarland, Germany

^c Integrative Bioinformatics Group, Department of Medical Bioinformatics, University Medical Center Göttingen, Georg-August-University Göttingen, Lower Saxony, Germany

^d CAIMed – Lower Saxony Center for Artificial Intelligence and Causal Methods in Medicine, Göttingen, Germany

^e Computational Biology Group, Department of Biosystems Science and Engineering, ETH Zürich, Klingelbergstrasse 48, Basel, Switzerland

Abstract

Ensuring the trustworthiness of machine learning (ML) models in high-stakes applications is crucial. One such application is predicting anti-cancer drug sensitivity, where ML models are built with the final goal of integrating them into treatment recommendation systems for personalized medicine. Here, we propose a trustworthy multivariate random forest method, MORGOTH, available in our package ‘morgoth’. Besides standard regression and classification functions, MORGOTH allows for the simultaneous optimization of regression and classification tasks via a joint splitting criterion. Additionally, it provides a graph representation of the random forest to address model interpretability, and a cluster analysis of the leaves that measures how dissimilar new inputs are from the training data, which addresses reliability and robustness. Overall, MORGOTH provides a comprehensive approach that unites simultaneous regression and classification, interpretability, reliability, and robustness in a single framework. While our package is broadly applicable, we demonstrate its capabilities for anti-cancer drug sensitivity prediction in a comprehensive large-scale study on the Genomics of Drug Sensitivity in Cancer (GDSC) database. We trained single-drug as well as multi-drug models. In either case, MORGOTH clearly outperforms state-of-the-art neural network approaches. Moreover, we highlight an evaluation issue for multi-drug models and demonstrate that single-drug models consistently outperform them when evaluated fairly.
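
The abstract does not spell out the joint splitting criterion. Purely as an illustration of the general idea, and not as MORGOTH's actual criterion or the `morgoth` package API, a split could be scored by a weighted sum of the normalized variance reduction (regression target) and the normalized Gini impurity reduction (class label). The function names, the `weight` parameter, and the normalization below are all assumptions made for this sketch.

```python
import numpy as np


def gini(labels):
    """Gini impurity of an array of class labels (0.0 for an empty array)."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return 1.0 - float(np.sum(p ** 2))


def variance(values):
    """Population variance of an array (0.0 for an empty array)."""
    return float(np.var(values)) if values.size else 0.0


def joint_gain(y_reg, y_cls, left_mask, weight=0.5):
    """Hypothetical joint criterion: weighted sum of the relative variance
    reduction and the relative Gini reduction achieved by a binary split."""
    n = y_reg.size
    n_left = int(left_mask.sum())
    n_right = n - n_left
    if n_left == 0 or n_right == 0:
        return 0.0  # degenerate split, nothing gained
    right_mask = ~left_mask

    parent_var = variance(y_reg)
    parent_gini = gini(y_cls)
    # Size-weighted impurity of the two children.
    child_var = (n_left * variance(y_reg[left_mask])
                 + n_right * variance(y_reg[right_mask])) / n
    child_gini = (n_left * gini(y_cls[left_mask])
                  + n_right * gini(y_cls[right_mask])) / n

    # Normalize each reduction by the parent impurity so both lie in [0, 1].
    reg_gain = (parent_var - child_var) / parent_var if parent_var > 0 else 0.0
    cls_gain = (parent_gini - child_gini) / parent_gini if parent_gini > 0 else 0.0
    return weight * reg_gain + (1.0 - weight) * cls_gain


def best_threshold(x, y_reg, y_cls, weight=0.5):
    """Scan midpoints between sorted unique feature values and return the
    (threshold, gain) pair with the highest joint gain."""
    best_t, best_g = None, 0.0
    values = np.unique(x)
    for lo, hi in zip(values[:-1], values[1:]):
        t = (lo + hi) / 2.0
        g = joint_gain(y_reg, y_cls, x <= t, weight)
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g
```

In this sketch, `weight=1.0` reduces to a pure regression split and `weight=0.0` to a pure classification split; intermediate values trade the two objectives off against each other.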

Digital Discovery
