Machine Learning to Reduce Reaction Optimization Lead Time – Proof of Concept with Suzuki, Negishi and Buchwald-Hartwig Cross-Coupling Reactions

07 July 2020, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

To date, optimizing reactions, let alone predicting the outcome (yield) of known reactions, requires expert knowledge, and predictions can at best be obtained through computationally complex and expensive modelling. The present investigation tests whether machine learning is a viable approach, one that could be put into daily production, for predicting the outcome of a model reaction. A prerequisite was replacing advanced scripting techniques with a more approachable data science platform such as Knime®. The palladium-catalyzed Suzuki-Miyaura, Negishi and Buchwald-Hartwig reactions were selected for a classification model of high/low yielding outcome, combined with a selection of reaction conditions stemming from a commercial database. Here we present preliminary results: a random forest-based classification model using readily calculated standard medicinal chemistry descriptors of substrates and products yielded a high ROC AUC of up to 96%. The descriptors used in the model convey nothing about reactivity, only 1D and 2D structural information, yet performed as well as or better than fingerprints, both in terms of prediction and computational requirements. One of the major challenges was the quality of the data and its subsequent curation.
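To make the evaluation metric concrete, the following is a minimal sketch of how reaction yields might be binarized into high/low classes and how a ROC AUC can be computed from a classifier's predicted scores. The 50% yield cutoff, the function names, and the toy numbers are illustrative assumptions, not values or code from the study, and the scores stand in for the output of any classifier such as a random forest.

```python
from typing import List, Sequence


def label_high_yield(yields_percent: Sequence[float], cutoff: float = 50.0) -> List[int]:
    """Map yields to 1 (high) or 0 (low). The cutoff is a hypothetical choice."""
    return [1 if y >= cutoff else 0 for y in yields_percent]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): observed yields and model scores.
yields = [92.0, 75.0, 12.0, 48.0, 66.0, 5.0]
scores = [0.9, 0.4, 0.2, 0.6, 0.8, 0.1]

labels = label_high_yield(yields)
auc = roc_auc(labels, scores)
print(f"labels={labels} ROC AUC={auc:.3f}")
```

An AUC of 1.0 would mean every high-yielding reaction is scored above every low-yielding one, while 0.5 corresponds to random ranking.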

Keywords

machine Learning Methods Enable Predictive Modeling
cross coupling reactions
KNIME

Supplementary materials

S1 Supplementary Information
S2 Knime workflows with readme
S3 Reaxys ReactionIDs
