Quantifying the hardness of bioactivity prediction tasks for transfer learning

Hosein Fooladi; Steffen Hirte; Johannes Kirchmair

doi:10.26434/chemrxiv-2024-871mt

Theoretical and Computational Chemistry

29 January 2024, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Machine learning methods are now widely employed in drug discovery. However, a chronic lack of data continues to hamper their further development, validation, and application. Several modern strategies aim to mitigate the challenges of data scarcity by learning from data on related tasks. These knowledge-sharing approaches include transfer learning, multi-task learning, and meta-learning. A key open question for these approaches is the extent to which their performance benefits from the relatedness of the available source (training) tasks; in other words, how difficult (“hard”) a test task is for a model, given the available source tasks. This study introduces a new method for quantifying and predicting the hardness of a bioactivity prediction task based on its relation to the available training tasks. The approach involves generating protein and chemical representations and calculating distances between the bioactivity prediction task and the available training tasks. Using meta-learning as an example, we demonstrate that the proposed task hardness metric is inversely correlated with performance. The metric will be useful for estimating the task-specific gain in performance that can be achieved through meta-learning.
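The idea of scoring a test task by its distance to the available training tasks can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the representations, the distance function, or how distances are aggregated. The sketch assumes each task is already summarized as a fixed-length vector, uses Euclidean distance, and averages over the k nearest training tasks; the names `euclidean`, `task_hardness`, and `k` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def task_hardness(test_task, train_tasks, k=2):
    """Toy hardness score: mean distance from the test task to its
    k nearest training tasks (an assumed aggregation, not the paper's)."""
    if not train_tasks:
        raise ValueError("at least one training task is required")
    distances = sorted(euclidean(test_task, t) for t in train_tasks)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


# Hypothetical 2-D task embeddings, for illustration only.
train_tasks = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near_task = [0.0, 0.0]   # lies among the training tasks
far_task = [3.0, 4.0]    # lies far from every training task

near_score = task_hardness(near_task, train_tasks)
far_score = task_hardness(far_task, train_tasks)
print(near_score < far_score)  # the distant task receives the higher score
```

Under these assumptions, a test task that lies far from all training tasks receives a higher score, which matches the abstract's intuition that weakly related source tasks make a test task harder.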

Keywords

bioactivity prediction

artificial intelligence

Supplementary materials

Title

Supporting information

Description

Information on protein embeddings, molecule featurizers, the distance module, and the prototypical network. Seven figures and three tables with additional information and data.

Now Published

Quantifying the Hardness of Bioactivity Prediction Tasks for Transfer Learning

Hosein Fooladi, Steffen Hirte, Johannes Kirchmair

Journal article

Journal of Chemical Information and Modeling, Volume 64, Issue 10

Online publication date: May 13, 2024

Version History

Jan 29, 2024 Version 1

License

The content is available under CC BY 4.0

Funding

Austrian Federal Ministry of Labour and Economy

National Foundation for Research, Technology and Development

Christian Doppler Research Association

Boehringer-Ingelheim RCV GmbH & Co KG

BASF SE

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
