ReactionCode: Format for Reaction Searching, Analysis, Classification, Transform, and Encoding/Decoding

2020-07-23T09:55:17Z (GMT) by Victorien Delannée, Marc Nicklaus
Over the past two decades, many formats for molecules and reactions have been created. These formats were developed mostly to serve as identifiers and to support representation, classification, analysis, and data exchange. Considerable effort has gone into molecule formats, but far less into reaction formats, where most of the work has been done by companies and has resulted in proprietary formats. Here, we present a new open-source format that encodes a reaction into, and decodes it from, a multi-layer machine-readable code that aggregates reactants and products into a condensed graph of reaction (CGR). The format is flexible and can be used for reaction similarity searching and classification. It is also designed for database organization, machine learning applications, and use as a new reaction transform language.
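As a rough illustration of the idea behind a condensed graph of reaction, the Python sketch below superimposes the bonds of atom-mapped reactants and products into a single graph whose edges carry the bond order before and after the reaction. It is not the ReactionCode encoding itself: the names `bond`, `condensed_graph`, and `changed_bonds`, the dictionary-based bond tables, and the toy substitution reaction are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical minimal data model (an assumption for this sketch):
# atoms are identified by atom-map numbers, and a bond is an unordered pair.
Bond = FrozenSet[int]
BondTable = Dict[Bond, int]          # bond -> bond order
CGR = Dict[Bond, Tuple[int, int]]    # bond -> (order before, order after)


def bond(a: int, b: int) -> Bond:
    """Build an order-independent key for the bond between atoms a and b."""
    return frozenset((a, b))


def condensed_graph(reactants: BondTable, products: BondTable) -> CGR:
    """Superimpose reactant and product bonds; order 0 means 'no bond'."""
    cgr: CGR = {}
    for key in set(reactants) | set(products):
        cgr[key] = (reactants.get(key, 0), products.get(key, 0))
    return cgr


def changed_bonds(cgr: CGR) -> CGR:
    """Keep only the edges whose bond order differs across the reaction."""
    return {key: orders for key, orders in cgr.items() if orders[0] != orders[1]}


# Toy example: CH3-Br + OH(-) -> CH3-OH + Br(-)
# Atom maps (assumed): 1 = C, 2 = Br, 3 = O, 4 = H (on oxygen)
reactant_bonds: BondTable = {bond(1, 2): 1, bond(3, 4): 1}
product_bonds: BondTable = {bond(1, 3): 1, bond(3, 4): 1}

cgr = condensed_graph(reactant_bonds, product_bonds)
reaction_center = changed_bonds(cgr)
```

In this toy case the C-Br edge is labeled (1, 0), the C-O edge (0, 1), and the unchanged O-H edge (1, 1), so the two changed edges mark the reaction center.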