Increasing trustworthiness of machine learning-based drug sensitivity prediction with a multivariate random forest approach

02 June 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Ensuring the trustworthiness of machine learning (ML) models in high-stakes applications is crucial. One such application is anti-cancer drug sensitivity prediction, where ML models are built with the ultimate goal of integrating them into treatment recommendation systems for personalized medicine. Here, we propose a trustworthy multivariate random forest method called MORGOTH. Besides standard regression and classification functions, MORGOTH allows for the simultaneous optimization of regression and classification tasks via a joint splitting criterion. Additionally, it provides a graph representation of the random forest to address model interpretability, and a cluster analysis of the leaves that measures the dissimilarity of new inputs from the training data to assess the reliability of predictions. While our approach is broadly applicable, we demonstrate its capabilities for anti-cancer drug sensitivity prediction in a comprehensive large-scale study on the Genomics of Drug Sensitivity in Cancer (GDSC) database. We trained single-drug as well as multi-drug models. In both cases, MORGOTH clearly outperforms state-of-the-art neural network approaches. Moreover, we highlight an evaluation issue for multi-drug models and demonstrate that single-drug models consistently outperform them when evaluated fairly.
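To make the idea of a joint splitting criterion concrete, the following is a minimal sketch of how a single tree split could score a regression target and a class label together. It is not MORGOTH's actual criterion, which the abstract does not specify. The function names (`joint_gain`, `best_threshold`), the `weight` parameter, and the choice of combining normalized variance reduction with normalized Gini reduction are all assumptions made for illustration.

```python
import numpy as np


def gini(labels):
    """Gini impurity of a label array (0.0 for an empty or pure node)."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return 1.0 - float(np.sum(p ** 2))


def variance(values):
    """Population variance of a numeric array (0.0 for an empty node)."""
    if values.size == 0:
        return 0.0
    return float(np.var(values))


def joint_gain(y_reg, y_cls, left_mask, weight=0.5):
    """Hypothetical joint criterion: a weighted sum of the relative
    variance reduction (regression) and the relative Gini reduction
    (classification). `weight` is an assumed trade-off parameter."""
    n = y_reg.size
    n_left = int(left_mask.sum())
    n_right = n - n_left
    if n_left == 0 or n_right == 0:
        return 0.0  # a split that leaves one side empty gains nothing

    right_mask = ~left_mask
    parent_var = variance(y_reg)
    parent_gini = gini(y_cls)

    # Size-weighted impurity of the two child nodes.
    child_var = (n_left * variance(y_reg[left_mask])
                 + n_right * variance(y_reg[right_mask])) / n
    child_gini = (n_left * gini(y_cls[left_mask])
                  + n_right * gini(y_cls[right_mask])) / n

    # Normalize by the parent impurity so both terms lie in [0, 1].
    reg_gain = (parent_var - child_var) / parent_var if parent_var > 0 else 0.0
    cls_gain = (parent_gini - child_gini) / parent_gini if parent_gini > 0 else 0.0
    return weight * reg_gain + (1.0 - weight) * cls_gain


def best_threshold(x, y_reg, y_cls, weight=0.5):
    """Scan the midpoints between sorted unique feature values and return
    the (threshold, gain) pair with the highest joint gain."""
    best_t, best_g = None, 0.0
    xs = np.unique(x)
    for lo, hi in zip(xs[:-1], xs[1:]):
        t = (lo + hi) / 2.0
        g = joint_gain(y_reg, y_cls, x <= t, weight)
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g
```

In a drug sensitivity setting, `y_reg` could hold a continuous response value and `y_cls` a binarized sensitive/resistant label for the same cell lines, so that a split is rewarded only when it separates both targets well. Normalizing each term by the parent impurity is one way to keep the two objectives on a comparable scale; other weighting schemes are equally plausible.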

Keywords

trustworthiness
interpretability
reliability
multivariate learning
simultaneous classification and regression
drug sensitivity prediction
cancer
drug prioritization

Supplementary materials

Supplement
Contains additional information on the data used, the feature selection, the implementation, and the results for our main manuscript "Increasing trustworthiness of machine learning-based drug sensitivity prediction with a multivariate random forest approach".
