Submitted on 27.09.2020 and posted on 30.09.2020 by Timur Madzhidov, Arkadii I. Lin, Ramil Nugmanov, Natalia Dyubankova, Timur Gimadiev, Jörg Kurt Wegner, Assima Rakhimbekova, Tagir Akhmetshin, Zarina Ibragimova, Alexandre Varnek, Rail Suleymanov, Hugo Ceulemans, Jonas Verhoeven
Here, we discuss a reaction standardization protocol, followed by a comparison of popular atom-to-atom mapping (AAM) tools (ChemAxon, Indigo, RDTool, NextMove and RXNMapper) and of several consensus AAM strategies. For this purpose, a dataset of 1851 manually curated and mapped reactions (the Golden dataset) was prepared and used as a reference set. RXNMapper was found to have the highest accuracy, although it has some clear disadvantages. It was therefore selected as the best tool and applied to map the USPTO dataset. The standardization protocol used to prepare the data and the data themselves are available in the GitHub repository https://github.com/Laboratoire-de-Chemoinformatique.
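The benchmarking idea described above, scoring each tool against a reference set and combining tools into a consensus, can be illustrated with a minimal sketch. It is not the authors' code. It assumes a hypothetical representation in which each mapping is a dictionary from reactant atom index to product atom index, and the names `accuracy` and `consensus` are invented for illustration. A real comparison would also have to handle equivalent mappings that arise from molecular symmetry, which this sketch ignores.

```python
from collections import Counter


def freeze(mapping):
    """Turn a {reactant_atom: product_atom} dict into a hashable set of pairs."""
    return frozenset(mapping.items())


def accuracy(predicted, reference):
    """Fraction of reactions whose predicted mapping equals the reference exactly.

    Assumption: both lists are aligned, one mapping per reaction.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(freeze(p) == freeze(r) for p, r in zip(predicted, reference))
    return hits / len(reference)


def consensus(candidates):
    """Simple majority vote over the mappings proposed by several tools.

    Returns the mapping chosen by a strict majority, or None if there is none.
    This is one possible consensus rule, chosen here only for illustration.
    """
    counts = Counter(freeze(m) for m in candidates)
    winner, votes = counts.most_common(1)[0]
    return dict(winner) if votes > len(candidates) / 2 else None


# Toy data: two reactions, each with two atoms (hypothetical, not from the paper).
reference = [{0: 0, 1: 1}, {0: 1, 1: 0}]
tool_a = [{0: 0, 1: 1}, {0: 0, 1: 1}]  # second reaction mapped incorrectly

print(accuracy(tool_a, reference))
print(consensus([{0: 0, 1: 1}, {0: 0, 1: 1}, {0: 1, 1: 0}]))
```

Requiring a strict majority means the consensus abstains when the tools disagree evenly, which keeps uncertain reactions out of the mapped set rather than guessing.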