Activity Cliff-Informed Contrastive Learning
for Molecular Property Prediction

Wanxiang Shen; Chao Cui; Xiaorui Su; Zaixi Zhang; Alejandro Velez-Arce; Jianming Wang; Xiang Cheng Shi; Yan Bing Zhang; Jie Wu; Yu Zong Chen; Marinka Zitnik

doi:10.26434/chemrxiv-2023-5cz7s-v2

Biological and Medicinal Chemistry


07 November 2024, Version 2

Working Paper


This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Modeling molecular activity and quantitative structure-activity relationships of chemical compounds is critical in drug design. Graph neural networks, which utilize molecular structures as frames, have shown success in assessing the biological activity of chemical compounds, guiding the selection and optimization of candidates for further development. However, current models often overlook activity cliffs (ACs), cases where structurally similar molecules exhibit markedly different bioactivities, because their latent spaces are primarily optimized for structural features. Here, we introduce AC-awareness (ACA), an inductive bias designed to enhance molecular representation learning for activity modeling. ACA jointly optimizes metric learning in the latent space and task performance in the target space, making models more sensitive to ACs. We develop an AC-informed contrastive learning approach that can be integrated with any graph neural network. Experiments on 39 benchmark datasets demonstrate that AC-informed representations of chemical compounds consistently outperform standard models in bioactivity prediction across both regression and classification tasks. AC-informed models also show strong performance in predicting pharmacokinetic and safety-relevant molecular properties. ACA paves the way toward activity-informed molecular representations, providing a valuable tool for the early stages of lead compound identification, refinement, and virtual screening.
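The joint objective described in the abstract (a metric-learning term in the latent space plus a task term in the target space) can be illustrated with a minimal sketch. This is not the authors' loss: it assumes a generic triplet-margin formulation in which triplets are mined from label differences, and the names `triplet_term`, `joint_loss`, `cliff_threshold`, `margin`, and `alpha` are hypothetical choices made for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_term(embeddings, labels, cliff_threshold=1.0, margin=1.0):
    """Mean triplet-margin loss with triplets mined from activity labels.

    Assumption (not taken from the paper): for an anchor a, a positive p has
    |y_a - y_p| < cliff_threshold and a negative k has
    |y_a - y_k| >= cliff_threshold.
    """
    losses = []
    n = len(labels)
    for a in range(n):
        for p in range(n):
            if p == a or abs(labels[a] - labels[p]) >= cliff_threshold:
                continue  # p is the anchor itself or not a positive
            for k in range(n):
                if k == a or abs(labels[a] - labels[k]) < cliff_threshold:
                    continue  # k is the anchor itself or not a negative
                d_ap = euclidean(embeddings[a], embeddings[p])
                d_an = euclidean(embeddings[a], embeddings[k])
                # Penalize negatives that are not at least `margin`
                # farther from the anchor than the positive is.
                losses.append(max(0.0, d_ap - d_an + margin))
    return sum(losses) / len(losses) if losses else 0.0


def joint_loss(embeddings, predictions, labels, alpha=0.1, **kwargs):
    """Task loss (mean absolute error) plus alpha times the latent-space term."""
    mae = sum(abs(p - y) for p, y in zip(predictions, labels)) / len(labels)
    return mae + alpha * triplet_term(embeddings, labels, **kwargs)


if __name__ == "__main__":
    # Three molecules: the first two have similar activity, the third differs.
    y = [5.0, 5.2, 8.0]
    # Latent space where the third molecule sits close to the others (a cliff
    # the encoder has not separated) versus one where it is far away.
    near = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0]]
    far = [[0.0, 0.0], [0.1, 0.0], [3.0, 0.0]]
    print("unseparated cliff:", triplet_term(near, y))
    print("separated cliff:  ", triplet_term(far, y))
```

In this sketch the latent-space term is large when a molecule with very different activity is embedded next to its neighbours and drops to zero once it is pushed at least `margin` farther away, which is the kind of pressure an AC-aware objective is meant to apply. The weighting `alpha` between the two terms is an illustrative placeholder.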

Keywords

Activity cliff

molecular activity prediction

Contrastive Learning

Activity-cliff-awareness

Graph neural network

Machine learning

Deep learning

Supplementary materials

Title: Supplementary.pdf

Description: Supplementary Tables and Figures (Supplementary Figures S1 to S8; Supplementary Tables S1 to S9)


Version History

Nov 07, 2024 Version 2

May 29, 2023 Version 1

Version Notes

The new version has been updated with a changed title. The activity-cliff awareness (ACA) loss has been further tested on four different Graph Neural Network (GNN) backbones and expanded to include the property-cliff concept. Both the qualitative and quantitative advantages of the ACA loss versus no-ACA loss have been investigated.

Metrics

Views: 2,733

Downloads: 1,364

License

The content is available under CC BY NC ND 4.0


Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content