Actively Searching: Inverse Design of Novel Molecules with Simultaneously Optimized Properties

Nicolae C. Iovanac; Robert MacKnight; Brett Savoie

doi:10.26434/chemrxiv.14643360.v1

Theoretical and Computational Chemistry

24 May 2021, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Combining quantum chemistry characterizations with generative machine learning models has the potential to accelerate molecular searches in chemical space. In this paradigm, quantum chemistry acts as a relatively cost-effective oracle for evaluating the properties of particular molecules, while generative models provide a means of sampling chemical space based on learned structure-function relationships. For practical applications, multiple potentially orthogonal properties must be optimized in tandem during a discovery workflow. This carries additional difficulties associated with the specificity of the targets and the ability of the model to reconcile all properties simultaneously. Here we demonstrate an active learning approach to improve the performance of multi-target generative chemical models. We first demonstrate the effectiveness of a set of baseline models, trained on single property prediction tasks, in generating novel compounds with various property targets, including both interpolative and extrapolative generation scenarios. For property ranges where accurate targeting proves difficult, the novel compounds suggested by the model are characterized using quantum chemistry to obtain the true values, and the new molecules closest to expressing the desired properties are fed back into the generative model for additional training. This gradually improves the generative models’ understanding of unknown areas of chemical space and shifts the distribution of generated compounds towards the targeted values. We then demonstrate the effectiveness of this active learning approach in generating compounds with multiple chemical constraints, including vertical ionization potential, electron affinity, and dipole moment targets, and validate the results at the ωB97X-D3/def2-TZVP level. This method requires no modifications to extant generative approaches, but rather utilizes their inherent generative and predictive aspects for self-refinement, and can be applied to situations where any number of properties with varying degrees of correlation must be optimized simultaneously.
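
The generate, characterize, select, and retrain cycle described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: a "molecule" is a single number, `oracle` is a hypothetical stand-in for the quantum chemistry calculation, `ToyGenerator` is a hypothetical stand-in for the generative model, and refitting only to the selected candidates is a simplification of the additional training the abstract describes.

```python
import random
import statistics


def oracle(x):
    """Hypothetical stand-in for a quantum chemistry property calculation."""
    return 2.0 * x + 1.0


class ToyGenerator:
    """Hypothetical stand-in for a generative model: a 1D Gaussian sampler."""

    def __init__(self, mean=0.0, std=1.0):
        self.mean = mean
        self.std = std

    def sample(self, n, rng):
        # Propose n candidate "molecules" from the current distribution.
        return [rng.gauss(self.mean, self.std) for _ in range(n)]

    def retrain(self, selected):
        # Simplification: refit the sampler to the selected candidates.
        # A floor on the spread keeps the search from collapsing too early.
        self.mean = statistics.fmean(selected)
        self.std = max(statistics.pstdev(selected), 0.3)


def active_search(target, rounds=10, n_samples=200, n_keep=20, seed=0):
    """Run the toy loop and return the generator plus the per-round mean error."""
    rng = random.Random(seed)
    gen = ToyGenerator()
    history = []
    for _ in range(rounds):
        candidates = gen.sample(n_samples, rng)
        # Characterize every candidate with the oracle to get "true" values.
        errors = [abs(oracle(c) - target) for c in candidates]
        history.append(statistics.fmean(errors))
        # Keep the candidates closest to the target and feed them back.
        ranked = sorted(zip(errors, candidates))
        keep = [c for _, c in ranked[:n_keep]]
        gen.retrain(keep)
    return gen, history


if __name__ == "__main__":
    # The target lies far outside the initial distribution (extrapolative case).
    gen, history = active_search(target=9.0)
    print("mean error, first round:", round(history[0], 3))
    print("mean error, last round:", round(history[-1], 3))
```

In this sketch the mean error of the proposed candidates should shrink over the rounds as the sampler shifts toward the target. A multi-property variant would replace the scalar error with a combined distance over all targets.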

Keywords

active learning strategies

deep generative models

Now Published

Actively Searching: Inverse Design of Novel Molecules with Simultaneously Optimized Properties

Nicolae C. Iovanac, Robert MacKnight, Brett M. Savoie journal article

The Journal of Physical Chemistry A , Volume 126, Issue 2

Online publication date: Jan 05, 2022

Version History

May 24, 2021 Version 1

Version Notes

presubmission

Metrics

Views: 1,436

Downloads: 553

Citations

License

The content is available under CC BY NC ND 4.0

DOI

10.26434/chemrxiv.14643360.v1

Author’s competing interest statement

no conflict of interest
