Predicting and proposing reaction mechanisms, and inferring the reaction intermediates involved, remain major challenges in modern organic chemistry. Herein, a model from natural language processing (NLP) is employed for the first time to learn and perform intermediate prediction, which is treated as a language translation task. Radical cascade cyclization is widely used in life science and pharmaceutical projects, but the regioselectivity of radical attack is difficult to predict. To tackle this challenge, the model is trained on a self-built dataset, and transfer learning is used to overcome the limited amount of available data. The NLP Transformer model predicts intermediates with high accuracy, providing efficient guidance for understanding mechanisms. Because no manual encoding of rules is required, the approach offers a favorable tool for the challenging problem of computational inference of organic reaction mechanisms.
New Application of Natural Language Processing (NLP) for Chemists: Predicting Intermediates and Providing an Effective Direction for Mechanism Inference