Versatile Molecular Editing via Multimodal and Group-optimized Generative Learning

27 September 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Generating molecules with specific constituents and structures that exhibit desired properties is a crucial yet challenging task in the computer-aided design of functional molecules. This challenge arises from the discrete nature of the vast design space of molecules, which is subject to additional physical constraints such as symmetries. Exploration and optimization within this constrained discrete space pose difficulties for most machine learning methods. In this paper, we introduce a multimodal representation for molecules that accounts for both their discrete atomic constituents and their continuous atomic positions in 3D Euclidean space. Based on this representation, we develop MolEdit, a molecular generation method that simultaneously solves discrete and continuous optimization problems: MolEdit learns the distribution of molecular constituents using efficient normalizing flow models and employs a group-optimized score matching algorithm to model the symmetry-preserved distribution of atomic positions. By combining these two components, MolEdit can effectively assemble any discrete molecular graph and generate corresponding molecular conformers. Furthermore, by decomposing the generation process multimodally, MolEdit can work with flexible prompts specifying conditional information about molecular constituents and substructures, leading to a general-purpose approach to versatile molecular editing.
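
The abstract does not give data structures or algorithms, so the following is only a minimal sketch of the two ideas it names: a representation with a discrete part (atom types) and a continuous part (3D positions), and the rigid-motion symmetry that any distribution over atomic positions has to respect. The `Molecule` class, the helper functions, and the approximate water geometry are all assumptions made for illustration; this is not MolEdit's code and it implements neither the normalizing flow nor the group-optimized score matching.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Molecule:
    """Hypothetical container (not from the paper) for the two modalities."""
    atomic_numbers: List[int]  # discrete modality, e.g. 8 = O, 1 = H
    positions: List[Vec3]      # continuous modality, coordinates in angstroms


def rotate_z_and_translate(positions: List[Vec3], angle: float, shift: Vec3) -> List[Vec3]:
    """Apply a rigid motion: rotate about the z-axis by `angle`, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in positions
    ]


def pairwise_distances(positions: List[Vec3]) -> List[float]:
    """Interatomic distances for every unordered pair, in a fixed pair order."""
    return [
        math.dist(a, b)
        for i, a in enumerate(positions)
        for b in positions[i + 1:]
    ]


# Approximate water geometry, used only as illustrative input.
water = Molecule(
    atomic_numbers=[8, 1, 1],
    positions=[(0.0, 0.0, 0.0), (0.9572, 0.0, 0.0), (-0.2400, 0.9266, 0.0)],
)

# The same molecule after an arbitrary rotation and translation.
moved = Molecule(
    atomic_numbers=water.atomic_numbers,
    positions=rotate_z_and_translate(water.positions, 1.2, (3.0, -1.0, 0.5)),
)

d0 = pairwise_distances(water.positions)
d1 = pairwise_distances(moved.positions)

# Coordinates change, but the internal geometry does not.
invariant = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(d0, d1))
```

Both `water` and `moved` describe the same conformer even though their coordinates differ, which is why a model of atomic positions should treat rigidly moved copies as equivalent instead of learning each pose separately.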

