Multi-task Proteochemometric Modelling

16 February 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Motivation: In silico prediction of protein-ligand binding is an active research area in computational chemistry and machine learning-based drug discovery, as an accurate prediction model could reduce the time and resources required to identify and prioritize potential drug candidates. Proteochemometric modelling (PCM) is a promising approach for in silico protein-ligand binding prediction that utilises both compound and target descriptors. However, in its original form, a PCM model cannot separate multiple assays associated with the same target. Therefore, a practitioner applying the PCM approach to experimental data must either select only one assay for each target, and thus exclude a potentially significant amount of data, or pool measurements from different assays together, effectively mixing possibly very different functional dependencies between (protein, ligand) pairs and experimental measurements.

Results: We describe two modifications of PCM models that increase their flexibility by allowing multiple assays associated with the same target to be separated. Evaluated on a subset of internal Bayer dose-response data and on ChEMBL, these approaches result in improved performance compared to standard PCM models. Our results demonstrate the importance of disentangling multiple assays associated with the same target when using PCM methodology in a pharmaceutical environment.

Availability: Source code is made publicly available on GitHub for non-commercial usage after publication.
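
The limitation described under Motivation can be illustrated with a minimal sketch. It assumes the common PCM setup in which the model input is the compound descriptor concatenated with the target descriptor. The function name `pcm_features`, the assay labels, and all descriptor values are hypothetical and chosen only for illustration. The sketch does not show the two modifications proposed in the paper, which the abstract does not specify.

```python
import numpy as np


def pcm_features(compound_desc, target_desc):
    """Build a standard PCM input: compound descriptor followed by target descriptor."""
    return np.concatenate([compound_desc, target_desc])


# Hypothetical toy descriptors (illustrative values, not real data).
compound = np.array([0.2, 1.0, 0.0])
target = np.array([0.7, 0.1])

# The same (compound, target) pair measured in two hypothetical assays.
records = [
    {"assay": "assay_A", "x": pcm_features(compound, target)},
    {"assay": "assay_B", "x": pcm_features(compound, target)},
]

# Both rows carry identical inputs, so a standard PCM model has no way
# to predict different values for the two assays.
same_input = bool(np.array_equal(records[0]["x"], records[1]["x"]))
print(same_input)
```

Because the assay identity never enters the feature vector, any difference between the two assays' readouts looks like noise to the model, which is the mixing of functional dependencies that the abstract refers to.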

Keywords

Proteochemometric Modelling
Multi-task learning
QSAR
protein-ligand binding
