Increasing trustworthiness of machine learning-based drug sensitivity prediction with a multivariate random forest approach

Lisa-Marie Rolli; Lea Eckhart; Andrea Volkamer; Hans-Peter Lenhof; Kerstin Lenhof

doi:10.26434/chemrxiv-2025-ml78s

Theoretical and Computational Chemistry

02 June 2025, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Ensuring the trustworthiness of machine learning (ML) models in high-stakes applications is crucial. One such application is anti-cancer drug sensitivity prediction, where ML models are built with the ultimate goal of integrating them into treatment recommendation systems for personalized medicine. Here, we propose MORGOTH, a trustworthy multivariate random forest method. Besides standard regression and classification functionality, MORGOTH supports the simultaneous optimization of regression and classification tasks via a joint splitting criterion. To address model interpretability, it provides a graph representation of the random forest; to address reliability, it performs a cluster analysis of the leaves that measures the dissimilarity of new inputs from the training data. While our approach is broadly applicable, we demonstrate its capabilities for anti-cancer drug sensitivity prediction in a comprehensive large-scale study on the Genomics of Drug Sensitivity in Cancer (GDSC) database. We trained both single-drug and multi-drug models. In both settings, MORGOTH clearly outperforms state-of-the-art neural network approaches. Moreover, we highlight an evaluation issue for multi-drug models and demonstrate that single-drug models consistently outperform them when evaluated fairly.
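
The abstract does not spell out the joint splitting criterion. Purely as an illustration of the general idea, the sketch below scores a candidate split on a continuous response and a class label at the same time by combining the relative variance reduction with the relative Gini impurity reduction. The function names, the `weight` parameter, and the weighted-sum combination are assumptions made for this example and should not be read as MORGOTH's actual criterion.

```python
from collections import Counter
from statistics import pvariance


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def variance(values):
    """Population variance (0.0 for fewer than two values)."""
    return pvariance(values) if len(values) > 1 else 0.0


def weighted_child_impurity(impurity, left, right):
    """Size-weighted average impurity of the two child nodes."""
    n = len(left) + len(right)
    return (len(left) * impurity(left) + len(right) * impurity(right)) / n


def relative_gain(impurity, parent, left, right):
    """Impurity decrease divided by parent impurity, so both tasks lie in [0, 1]."""
    parent_impurity = impurity(parent)
    if parent_impurity == 0:
        return 0.0
    child_impurity = weighted_child_impurity(impurity, left, right)
    return (parent_impurity - child_impurity) / parent_impurity


def joint_gain(y_reg, y_cls, mask, weight=0.5):
    """Hypothetical joint score: weighted sum of regression and classification gains.

    mask[i] is True if sample i is sent to the left child.
    """
    left_reg = [v for v, m in zip(y_reg, mask) if m]
    right_reg = [v for v, m in zip(y_reg, mask) if not m]
    left_cls = [c for c, m in zip(y_cls, mask) if m]
    right_cls = [c for c, m in zip(y_cls, mask) if not m]
    reg_gain = relative_gain(variance, y_reg, left_reg, right_reg)
    cls_gain = relative_gain(gini, y_cls, left_cls, right_cls)
    return weight * reg_gain + (1.0 - weight) * cls_gain


def best_threshold(x, y_reg, y_cls, weight=0.5):
    """Scan midpoints between sorted unique feature values; return (threshold, gain)."""
    values = sorted(set(x))
    best = (None, float("-inf"))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        mask = [v <= threshold for v in x]
        gain = joint_gain(y_reg, y_cls, mask, weight)
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Toy data: one feature, a continuous response, and a binary label.
x = [1, 2, 3, 8, 9, 10]
y_reg = [1.0, 1.1, 0.9, 3.0, 3.2, 2.9]
y_cls = [0, 0, 0, 1, 1, 1]
threshold, gain = best_threshold(x, y_reg, y_cls)
print(threshold, gain)  # the threshold is 5.5 for this toy data
```

Normalizing each gain by its parent impurity keeps the two tasks on a comparable scale, so `weight` expresses a relative preference rather than compensating for different units. Other ways of combining the two objectives are equally plausible.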

Keywords

trustworthiness

interpretability

reliability

multivariate learning

simultaneous classification and regression

drug sensitivity prediction

cancer

drug prioritization

Supplementary materials

Supplement

Contains additional information on the data used, the feature selection, the implementation, and the results for our main manuscript "Increasing trustworthiness of machine learning-based drug sensitivity prediction with a multivariate random forest approach".

License

The content is available under CC BY 4.0

Funding

European Union Horizon Program

101178148

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) declare that they have sought and gained approval from the relevant ethics committee/IRB for this research and its publication.
