Fragment-based Sequential Translation for Molecular Optimization

Benson Chen; Xiang Fu; Regina Barzilay; Tommi Jaakkola

doi:10.26434/chemrxiv-2021-fzxmk-v2

Searching for novel molecular compounds with desired properties is an important problem in drug discovery. Many existing frameworks generate molecules one atom at a time. We instead propose a flexible editing paradigm that generates molecules using learned molecular fragments, that is, meaningful substructures of molecules. To do so, we train a variational autoencoder (VAE) to encode molecular fragments in a coherent latent space, which we then utilize as a vocabulary for editing molecules to explore the complex chemical property space. Equipped with the learned fragment vocabulary, we propose Fragment-based Sequential Translation (FaST), which learns a reinforcement learning (RL) policy to iteratively translate model-discovered molecules into increasingly novel molecules while satisfying desired properties. Empirical evaluation shows that FaST significantly improves over state-of-the-art methods on benchmark single/multi-objective molecular optimization tasks.
