ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

Struct2IUPAC -- Transformer-Based Artificial Neural Network for the Conversion Between Chemical Notations

preprint
submitted on 23.11.2020, 14:24 and posted on 24.11.2020, 05:12 by Lev Krasnov, Ivan Khokhlov, Maxim Fedorov, Sergey Sosnin
Providing IUPAC chemical names is necessary for chemical information exchange. We developed a Transformer-based artificial neural architecture to translate between SMILES and IUPAC chemical notations: Struct2IUPAC and IUPAC2Struct. Our models demonstrated performance comparable to that of rule-based solutions. We showed that the accuracy, computational speed, and robustness of the model allow us to use it in production. Our showcase demonstrates that a neural-based solution can enable rapid development while maintaining the same performance. We believe that our findings will inspire other developers to reduce development costs by replacing complex rule-based solutions with neural-based ones. A demonstration of the Struct2IUPAC model is available online on the Syntelly platform: https://app.syntelly.com/smiles2iupac

Email Address of Submitting Author

sergey.sosnin@skoltech.ru

Institution

Skolkovo Institute of Science and Technology

Country

Russia

ORCID For Submitting Author

0000-0002-3042-7369

Declaration of Conflict of Interest

Maxim Fedorov and Sergey Sosnin are co-founders of Syntelly LLC. Lev Krasnov and Ivan Khokhlov are employees of Syntelly LLC.

Version Notes

version 0.0.1