A Sequence-to-Sequence Transformer Model for Disconnection Aware Retrosynthesis

08 December 2021, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Retrosynthesis is an approach commonly undertaken when considering the manufacture of novel molecules. During this process, a target molecule is broken down and analyzed by considering the bonds to be changed as well as the functional group interconversions. In modern computer-assisted synthesis planning tools, these changes are typically predicted automatically. However, there may be some benefit in letting those executing the process guide the decision: chemists typically have a clear idea of where the retrosynthetic change should happen, but not of how such a transformation is to be realized. With a data-driven model, the retrosynthesis task can be investigated further by giving chemists the option to explore specific disconnections. In this work, we design an approach that provides this option by adapting a transformer-based model for single-step retrosynthesis. The model takes as input a product SMILES string in which the atoms where the transformation should occur are tagged. It predicts precursors corresponding to a disconnection at the correct location for 88.9% of the test set reactions. Assessment with a forward prediction model shows that 76% of the predictions are chemically correct, with 14.1% perfectly matching the ground truth.
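
To make the idea of a tagged product SMILES concrete, the following is a minimal sketch of how disconnection atoms might be marked before the string is passed to a model. The abstract does not state the tagging syntax, so the use of an atom-class label (`:1`) is an assumption, and the function name `tag_atoms` and its parameters are hypothetical. The sketch also assumes that the atoms to be tagged are already written in bracket form with explicit hydrogen counts; wrapping a bare atom such as `C` in brackets would change its implied hydrogens, which is a job for a cheminformatics toolkit rather than for string manipulation.

```python
import re

# Matches one SMILES atom: a bracket atom or an organic-subset atom.
# Two-letter halogens come first so "Cl" is not read as "C" followed by "l".
ATOM_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")


def tag_atoms(smiles, atom_indices, tag=1):
    """Return `smiles` with the chosen atoms labelled by an atom-class tag.

    Atoms are numbered from 0 in the order they appear in the string.
    The ":<tag>" syntax is an assumed convention, not taken from the paper.
    """
    wanted = set(atom_indices)
    pieces = []
    last_end = 0
    atom_count = 0
    for atom_index, match in enumerate(ATOM_PATTERN.finditer(smiles)):
        atom_count += 1
        token = match.group()
        if atom_index in wanted:
            if not token.startswith("["):
                # Bracketing a bare atom would silently drop implicit hydrogens.
                raise ValueError(f"atom {atom_index} ({token!r}) is not a bracket atom")
            if ":" in token:
                raise ValueError(f"atom {atom_index} ({token!r}) already has a label")
            token = f"{token[:-1]}:{tag}]"
        # Copy bonds, branches and ring closures between atoms unchanged.
        pieces.append(smiles[last_end:match.start()])
        pieces.append(token)
        last_end = match.end()
    pieces.append(smiles[last_end:])
    out_of_range = sorted(i for i in wanted if not 0 <= i < atom_count)
    if out_of_range:
        raise IndexError(f"atom indices out of range: {out_of_range}")
    return "".join(pieces)


# N-phenylacetamide with explicit hydrogens; tag the amide C and N (atoms 1 and 3).
product = "[CH3][C](=[O])[NH][c]1[cH][cH][cH][cH][cH]1"
print(tag_atoms(product, [1, 3]))
# prints "[CH3][C:1](=[O])[NH:1][c]1[cH][cH][cH][cH][cH]1"
```

In this illustration the two tagged atoms flank the amide C-N bond, so the tagged string encodes a request to disconnect at that bond. How the tagged string is tokenized and consumed by the transformer is not covered here.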

Keywords

Chemical Reactions
Machine Learning
Molecular Transformer
Deep Learning
Retrosynthesis
SMILES
Reactions
Human-in-the-loop
