These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information. For more information, please see our FAQs.
Machine Learning to Reduce Reaction Optimization Lead Time – Proof of Concept with Suzuki, Negishi and Buchwald-Hartwig Cross-Coupling Reactions

submitted on 06.07.2020, 11:54 and posted on 07.07.2020, 04:47 by Fernando Huerta, Samuel Hallinder, Alexander Minidis
To date, optimizing reactions, let alone predicting the outcome (yield) of known reactions, requires expert knowledge, and predictions can at best be obtained through computationally complex and expensive modelling. The present investigation tests whether machine learning is a viable approach, one that could be put into daily production, for predicting the outcome of a model reaction. A prerequisite was replacing advanced scripting techniques with a more approachable data science platform such as Knime®. The palladium-catalyzed Suzuki-Miyaura, Negishi and Buchwald-Hartwig reactions were selected for a classification model of high/low-yielding outcome, combined with a selection of reaction conditions taken from a commercial database. Here we present preliminary results: a random forest-based classification model using readily calculated standard medicinal chemistry descriptors of substrates and products gave a high ROC AUC of up to 96%. The descriptors used in the model convey nothing about reactivity, only 1D and 2D structural information, yet they performed as well as or better than fingerprints in terms of both prediction and computational requirements. One of the major challenges was the quality of the data and its subsequent curation.
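
For readers unfamiliar with the metric quoted in the abstract, the following minimal sketch shows one way a ROC AUC can be computed for a binary high/low-yield classifier. It is not the authors' Knime workflow. The function name `roc_auc`, the labels, and the scores are hypothetical and chosen only for illustration. The sketch uses the rank interpretation of ROC AUC: the probability that a randomly chosen high-yield reaction receives a higher score than a randomly chosen low-yield one.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = high yield, 0 = low yield).

    Computed as the fraction of (high, low) pairs in which the high-yield
    example has the larger score; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical illustration data, not taken from the study:
# true classes and the classifier's predicted probability of "high yield".
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

auc = roc_auc(labels, scores)
print(f"ROC AUC = {auc:.3f}")  # prints "ROC AUC = 0.889"
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes, so the reported figure of up to 96% would correspond to 0.96 on this scale.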



RISE Chemical Process and Pharmaceutical Development




Declaration of Conflict of Interest

No conflict of interest

Version Notes

Manuscript Version 1