Graph Neural Networks for Identifying Protein-Reactive Compounds

Victor Hugo Cano Gil; Christopher Rowley

doi:10.26434/chemrxiv-2023-d0dqp-v2

Theoretical and Computational Chemistry

07 February 2024, Version 2

This is not the most recent version. There is a newer version of this content available.

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The identification of protein-reactive electrophilic compounds is critical to the design of new covalent modifier drugs, screening for toxic compounds, and the exclusion of reactive compounds from high-throughput screening. In this work, we employ traditional and graph machine learning (ML) algorithms to classify molecules as reactive towards proteins or nonreactive. For training data, we built a new dataset, ProteinReactiveDB, composed primarily of covalent and noncovalent inhibitors from the DrugBank, BindingDB, and CovalentInDB databases. To assess the transferability of the trained models, we created a custom test set of covalent and noncovalent inhibitors drawn from the recent literature. Baseline models were developed using Morgan fingerprints as training inputs, but they performed poorly when applied to compounds outside the training set. We then trained various Graph Neural Networks (GNNs), with the best GNN model achieving an Area Under the Receiver Operating Characteristic (AUROC) curve of 0.80, precision of 0.89, and recall of 0.72. We also explore the interpretability of these GNNs using Gradient-weighted Class Activation Mapping (GradCAM), which highlights the regions of a molecule that the GNNs deem most relevant when making a prediction. These maps indicated that our trained models can identify electrophilic functional groups in a molecule and classify molecules as protein-reactive based on their presence. We demonstrate the use of these models by comparing their performance against common chemical filters, identifying covalent modifiers in the ChEMBL database, and generating a putative covalent inhibitor based on an established noncovalent inhibitor.
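For readers less familiar with the three metrics quoted in the abstract, the following is a minimal sketch of how precision, recall, and AUROC can be computed for a binary reactive/nonreactive classifier. It is not the authors' evaluation code. The function names `precision_recall` and `auroc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; none of the numbers come from ProteinReactiveDB or the paper's test set.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for binary labels (1 = reactive, 0 = nonreactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true classes and model scores for eight molecules.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
preds = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(labels, preds)
area = auroc(labels, scores)
print(f"precision={precision:.2f} recall={recall:.2f} AUROC={area:.4f}")
```

Precision and recall depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the abstract reports all three.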

Keywords

covalent modifier

irreversible inhibitor

Supplementary materials

Supporting information

Details of the hyperparameter optimization, list of GNN atomic and bond features, list of GNN features

Supplementary weblinks

GitHub Repository

GitHub repository of input files and scripts

Version History

Jun 07, 2024 Version 3

Feb 07, 2024 Version 2

Nov 08, 2023 Version 1

Version Notes

Revision with comparisons to Eli Lilly filters and demonstration of generative models.

Metrics

Views: 1,698

Downloads: 759

Citations

License

The content is available under CC BY NC ND 4.0

DOI

10.26434/chemrxiv-2023-d0dqp-v2

Funding

Natural Sciences and Engineering Research Council of Canada

RGPIN-05795-2016

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
