These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information. For more information, please see our FAQs.

A Transformer Model for Retrosynthesis

Submitted on 30.04.2019 and posted on 02.05.2019 by Pavel Karpov, Guillaume Godin, and Igor Tetko

We describe a Transformer model for the retrosynthetic reaction prediction task. The model is trained on 45,033 experimental reaction examples extracted from USA patents and correctly predicts the reactant set for 42.7% of cases on an external test set. During training we applied different learning rate schedules and snapshot learning. These techniques prevent overfitting and thus make it possible to dispense with an internal validation dataset, which is advantageous for deep models with millions of parameters. We thoroughly investigated different approaches to training Transformer models and found that snapshot learning with weight averaging at learning rate minima works best. When decoding the model output probabilities, the temperature has a strong influence: setting T = 1.3 improves model accuracy by 1-2%.
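The two techniques highlighted in the abstract, averaging weight snapshots collected at learning rate minima and temperature-scaled decoding, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions; the function names and toy values are illustrative and not taken from the paper's implementation.

```python
import numpy as np

def temperature_softmax(logits, T=1.0):
    """Softmax over decoder output logits with temperature T.
    T > 1 flattens the distribution (more mass on lower-ranked tokens);
    T < 1 sharpens it."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def average_snapshots(snapshots):
    """Element-wise mean of model parameters saved at learning rate minima.
    Each snapshot is a list of parameter arrays with matching shapes."""
    return [np.mean(params, axis=0) for params in zip(*snapshots)]

# Toy decoder logits for three candidate tokens (illustrative values)
logits = np.array([2.0, 1.0, 0.5])
p_default = temperature_softmax(logits, T=1.0)
p_scaled = temperature_softmax(logits, T=1.3)
# At T = 1.3 the top token's probability decreases, redistributing
# probability mass toward alternative predictions.

# Averaging two hypothetical weight snapshots
snap_a = [np.array([1.0, 2.0])]
snap_b = [np.array([3.0, 4.0])]
avg = average_snapshots([snap_a, snap_b])   # avg[0] == array([2., 3.])
```

In practice the snapshots would be taken each time a cyclical learning rate schedule reaches a minimum, and the averaged weights form the final model.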


Institute of Structural Biology



Declaration of Conflict of Interest

No conflict of interest.