National Institutes of Health (NIH) Workshop on Reaction Informatics



The virtual workshop took place on May 18-20, 2021, as a follow-up to the December 2020 NIH Workshop on Ultra Large Chemistry Databases. A major theme emerging from the December 2020 workshop was that all databases of a billion or more structures are virtual. For each virtual molecule, the question then arises of whether, or how, it can be synthesized. The organizers therefore assembled speakers to present how reaction-related data are represented, captured, managed in databases, analyzed, used for drug design, applied in robotics, and exchanged locally as well as globally. This report summarizes talks from 27 practitioners in the reaction informatics field. The aim is to represent as accurately as possible the information delivered by the speakers; the report does not seek to be evaluative. The themes, in the order used for this report, were reaction representations, file formats, and standards; sources of reaction data; AI and machine learning applications of reaction-related data in de novo drug design, synthetic accessibility, synthesis planning, reaction prediction, etc.; and automation and progression toward autonomous synthesis.