
A Transformer Model for Retrosynthesis

preprint
Submitted on 30.04.2019 and posted on 02.05.2019 by Pavel Karpov, Guillaume Godin, and Igor Tetko

We describe a Transformer model for the retrosynthetic reaction prediction task. The model is trained on 45 033 experimental reaction examples extracted from US patents and correctly predicts the reactant set in 42.7% of cases on an external test set. During training we applied different learning rate schedules and snapshot learning. These techniques can prevent overfitting and thus make it possible to dispense with an internal validation dataset, which is advantageous for deep models with millions of parameters. We thoroughly investigated different approaches to training Transformer models and found that snapshot learning with weight averaging at learning-rate minima works best. When decoding the model output probabilities, the temperature has a strong influence: setting T = 1.3 improves the accuracy of the models by up to 1-2%.
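To make the decoding claim concrete, here is a minimal sketch of temperature-scaled decoding in Python, assuming the decoder emits one logit per vocabulary token at each step; the function name and the example logits are illustrative, not taken from the paper. Dividing the logits by T > 1 flattens the next-token distribution before sampling or beam scoring.

    import numpy as np

    def softmax_with_temperature(logits, temperature=1.3):
        # Scale logits by the temperature: T > 1 flattens the
        # distribution, T < 1 sharpens it.
        scaled = logits / temperature
        scaled -= scaled.max()  # subtract the max for numerical stability
        probs = np.exp(scaled)
        return probs / probs.sum()

    # Hypothetical decoder logits over a tiny vocabulary
    logits = np.array([2.0, 1.0, 0.5, -1.0])
    print(softmax_with_temperature(logits, temperature=1.0))
    print(softmax_with_temperature(logits, temperature=1.3))

The snapshot-averaging idea can be sketched in the same way, assuming one parameter dictionary is saved at each learning-rate minimum; NumPy arrays stand in for the model's weight tensors, and all names are hypothetical:

    def average_snapshots(snapshots):
        # Element-wise mean of parameter dictionaries saved at
        # successive learning-rate minima: snapshot ensembling by
        # weight averaging rather than output averaging.
        averaged = {name: w.copy() for name, w in snapshots[0].items()}
        for snap in snapshots[1:]:
            for name in averaged:
                averaged[name] += snap[name]
        for name in averaged:
            averaged[name] /= len(snapshots)
        return averaged

Since the averaged weights are loaded back into a single model, inference costs the same as for one snapshot.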


Email Address of Submitting Author

carpovpv@gmail.com

Institution

Institute of Structural Biology

Country

Germany

ORCID For Submitting Author

0000-0003-4786-9806

Declaration of Conflict of Interest

No conflict of interest.
