Guided docking as a data generation approach facilitates structure-based machine learning on kinases

Michael Backenköhler; Joschka Groß; Verena Wolf; Andrea Volkamer

doi:10.26434/chemrxiv-2023-prk53

Biological and Medicinal Chemistry


22 December 2023, Version 1

This is not the most recent version. There is a newer version of this content available.

Working Paper


This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Drug discovery pipelines now rely on machine learning models to explore and evaluate large chemical spaces. While including 3D structural information is considered beneficial, structural models are hindered by the limited availability of protein-ligand complex structures. Using kinase drug discovery as an example, we address this issue by generating kinase-ligand complex data with template docking for the kinase compound subset of the available ChEMBL assay data. To evaluate the benefit of the generated complex data, we use it to train a structure-based E(3)-invariant graph neural network (GNN). Our evaluation shows that models that take synthetic binding poses into account predict binding affinities with significantly higher precision than ligand-only or drug-target interaction (DTI) models.
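The E(3)-invariance mentioned in the abstract can be illustrated with a small numerical check. The following is a minimal sketch, not the authors' model: it only shows that pairwise interatomic distances, a common kind of input feature for invariant GNNs, do not change when a set of 3D coordinates is rotated, reflected, or translated. The function names and the random coordinates are illustrative assumptions.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the (n, n) matrix of Euclidean distances between n points in 3D."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def random_orthogonal(rng):
    """Return a random 3x3 orthogonal matrix (a rotation or a reflection)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q


rng = np.random.default_rng(0)

# Hypothetical coordinates standing in for the atoms of a docked pose.
coords = rng.normal(size=(5, 3))

# Apply an arbitrary E(3) transformation: orthogonal map plus translation.
Q = random_orthogonal(rng)
t = rng.normal(size=3)
moved = coords @ Q.T + t

# The coordinates change, but the distance matrix does not.
invariant = np.allclose(pairwise_distances(coords), pairwise_distances(moved))
print("distances invariant under E(3):", invariant)
```

A model whose inputs are built only from such quantities gives the same affinity prediction regardless of how the complex is oriented in space, which is the property the term E(3)-invariant refers to.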

Keywords

data-driven drug discovery

structure-based machine learning

E(3)-invariant graph neural networks

template docking

kinases

Supplementary weblinks

Raw kinodata-3D dataset: A Zenodo record holding the raw kinase-ligand complex data we generated, including ligand structures, poses, KLIFS pocket structures, and ChEMBL bioactivity measurements.

Preprocessed kinodata-3D for PyTorch Geometric & kinase affinity prediction models: A Zenodo record holding dataset and model artifacts that can be used with our published code.

Binding affinity prediction case study: The code used to carry out our binding affinity prediction case study.

Kinodata-3D data generation pipeline: The code used to generate the kinodata-3D dataset.


Version History

Apr 12, 2024 Version 2

Dec 22, 2023 Version 1

Metrics

Views: 1,449

Downloads: 979

License

The content is available under CC BY NC 4.0


Funding

NextAID project at Saarland University

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
