Simple User-Friendly Reaction Format

08 November 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Leveraging the increasing volume of chemical reaction data can enhance synthesis planning and improve success rates. However, machine learning tools for retrosynthesis planning and forward reaction prediction depend on readily available, high-quality data in a structured format. While some public and licensed reaction databases are available, they frequently lack essential information about reaction conditions. To address this issue and promote the principles of findable, accessible, interoperable, and reusable (FAIR) data reporting and sharing, we introduce the Simple User-Friendly Reaction Format (SURF). SURF standardizes the documentation of reaction data through a structured tabular format, requiring only a basic understanding of spreadsheets. It enables chemists to record the synthesis of molecules in a form that is both human- and machine-readable, making the data easier to share and to integrate directly into machine-learning pipelines. SURF files are designed to be interoperable, easily imported into relational databases, and convertible into other formats. SURF thereby complements existing initiatives such as the Open Reaction Database (ORD) and the Unified Data Model (UDM). At Roche, SURF plays a crucial role in democratizing FAIR reaction data sharing and expediting the chemical synthesis process.
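
As a rough illustration of the claim that a tabular reaction file can be imported into a relational database, the following minimal sketch loads a tab-separated table into an in-memory SQLite database using only the Python standard library. The column names (`rxn_id`, `reactant_smiles`, `product_smiles`, `temperature_c`, `yield_percent`), the helper `load_reactions`, and the two example rows are hypothetical placeholders chosen for this sketch; they are not taken from the SURF specification, and the numeric values are made up.

```python
import csv
import io
import sqlite3

# Hypothetical tab-separated reaction table. Column names and values are
# illustrative placeholders, NOT the SURF specification or real results.
EXAMPLE_TSV = (
    "rxn_id\treactant_smiles\tproduct_smiles\ttemperature_c\tyield_percent\n"
    "rxn-001\tCCO\tCC=O\t25\t80\n"
    "rxn-002\tCC(=O)O.CO\tCC(=O)OC\t65\t72\n"
)


def load_reactions(tsv_text, conn):
    """Parse tab-separated text and insert one database row per reaction."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    conn.execute(
        "CREATE TABLE reactions ("
        "rxn_id TEXT PRIMARY KEY, "
        "reactant_smiles TEXT, "
        "product_smiles TEXT, "
        "temperature_c REAL, "
        "yield_percent REAL)"
    )
    # Named placeholders are filled from each row dictionary.
    conn.executemany(
        "INSERT INTO reactions VALUES "
        "(:rxn_id, :reactant_smiles, :product_smiles, "
        ":temperature_c, :yield_percent)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
n = load_reactions(EXAMPLE_TSV, conn)

# Once loaded, the table can be queried like any other relational data.
high = conn.execute(
    "SELECT rxn_id FROM reactions WHERE yield_percent >= 75"
).fetchall()
```

Because each reaction occupies one row with named columns, the same file can be opened in a spreadsheet by a chemist and queried with SQL by a pipeline, which is the kind of dual readability the abstract describes.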

Keywords

FAIR data
reaction data
reaction database
reaction prediction
