A Physics-Inspired Approach to the Understanding of Molecular Representations and Models

Luke Dicks; David Graff; Kirk Jordan; Connor Coley; Edward Pyzer-Knapp

doi:10.26434/chemrxiv-2023-0zx26

Theoretical and Computational Chemistry


11 December 2023, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The story of machine learning in general, and its application to molecular design in particular, has been a tale of evolving representations of data. Understanding the implications of using a particular representation, including the existence of so-called "activity cliffs" for cheminformatics models, is key to using such representations successfully for molecular discovery. In this work we present a physics-inspired methodology that exploits analogies between model response surfaces and energy landscapes to richly describe the relationship between the representation and the model. From these similarities, a metric emerges that is analogous to the frustration metric commonly used in the chemical physics community. This new property shows state-of-the-art prediction of model error, whilst belonging to a novel class of roughness measure that extends beyond the known data, allowing the trivial identification of activity cliffs even in the absence of related training or evaluation data.

Keywords

Molecular representations

Energy landscapes

Structure-property relationships

License

The content is available under CC BY NC ND 4.0

Funding

Science and Technology Facilities Council

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
