Predicting Retrosynthetic Pathways Using a Combined Linguistic Model and Hyper-Graph Exploration Strategy

21 October 2019, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

We present an extension of our Molecular Transformer architecture combined with a hyper-graph exploration strategy for automatic retrosynthesis route planning without human intervention. The single-step retrosynthetic model sets a new state of the art for predicting reactants as well as reagents, solvents and catalysts for each retrosynthetic step. We introduce new metrics (coverage, class diversity, round-trip accuracy and Jensen-Shannon divergence) to evaluate the single-step retrosynthetic models, using a forward prediction model and a reaction classification model, both based on the transformer architecture. The hypergraph is constructed on the fly, and the nodes are filtered and further expanded based on a Bayesian-like probability. We critically assessed the end-to-end framework with several retrosynthesis examples from the literature and academic exams. Overall, the framework performs very well, with few weaknesses, which are due to the bias induced during the training process. The newly introduced metrics open up the possibility of optimizing entire retrosynthetic frameworks by focusing on the performance of the single-step model only.
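As an illustration of two of the metrics named above, the following is a minimal sketch and not the authors' implementation. It assumes the standard two-distribution form of the Jensen-Shannon divergence (base 2) applied to reaction-class frequencies, and it reads round-trip accuracy as the fraction of suggested precursor sets that a forward model maps back to the target. The names `jensen_shannon_divergence`, `class_distribution`, `round_trip_accuracy` and the `forward_model` callable are hypothetical; the exact definitions used in the paper are not given in this abstract.

```python
import math
from collections import Counter


def class_distribution(labels):
    """Turn a list of predicted reaction-class labels into relative frequencies."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Standard Jensen-Shannon divergence (base 2) between two discrete
    distributions given as {outcome: probability} dicts.
    Assumption: the two-distribution form, used here only for illustration."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence KL(a || m); zero-probability terms contribute 0
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def round_trip_accuracy(suggestions, forward_model):
    """Fraction of (target, precursors) pairs for which the forward model,
    applied to the precursors, returns the target.
    `forward_model` is a hypothetical callable: precursors -> predicted product."""
    if not suggestions:
        return 0.0
    hits = sum(1 for target, precursors in suggestions
               if forward_model(precursors) == target)
    return hits / len(suggestions)
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (disjoint support), which makes it convenient for comparing the class distribution of a model's predictions against a reference distribution. The string comparison in `round_trip_accuracy` presumes both sides use the same canonical SMILES form.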


Available on IBM RXN for Chemistry: https://rxn.res.ibm.com.

Keywords

Machine Learning
Deep Learning
Hypergraph
Chemical Space
Organic Synthesis
Organic Chemistry
Reaction Prediction
Retrosynthesis
SMILES-Encoded Molecular Structures
SMILES
Synthesis Route Planning
Chemical Reactions

Supplementary materials

IBMRXN supplementary information
