Development and Validation of a Chemoinformatic Workflow for Predicting Reaction Yield for Pd-Catalyzed C-N Couplings with Substrate Generalizability

13 February 2023, Version 2
This content is a preprint and has not undergone peer review at the time of posting.


A machine learning-based tool is described that provides conditions and predicted yields for Buchwald-Hartwig couplings from a ChemDraw™ structure input. The tool is built on an experimental dataset, generated in-house, that explores a diverse network of reactant pairings. To minimize the number of experiments needed to produce models and to maximize the value of the data, a workflow based on unsupervised machine learning tools was created. The workflow enables the construction of models that generalize successfully, making predictions for reactants that are not represented in the dataset.
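The abstract does not say which unsupervised tools the workflow uses, so the following is only an illustrative sketch of one common way to pick a small, diverse set of reactants before running experiments: greedy farthest-point (max-min) selection in descriptor space. The function name `farthest_point_selection`, the two-dimensional descriptor vectors, and the choice of Euclidean distance are all assumptions made for illustration and are not taken from the preprint.

```python
import math


def farthest_point_selection(points, k, start=0):
    """Greedily pick k mutually distant points (max-min diversity selection).

    Illustrative only: the preprint does not state that this algorithm is used.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # Choose the candidate that is farthest from everything selected so far.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected


# Hypothetical 2-D descriptor vectors for six candidate reactants.
descriptors = [
    (0.0, 0.0), (0.1, 0.0), (5.0, 5.0),
    (5.1, 5.0), (0.0, 5.0), (10.0, 0.0),
]
chosen = farthest_point_selection(descriptors, k=3)
print(chosen)  # indices of the reactants selected for experiments
```

In this toy example the near-duplicate points (indices 1 and 3) are skipped in favor of points that cover new regions of descriptor space, which is the general idea behind choosing experiments that maximize the value of each data point.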


machine learning
Buchwald-Hartwig coupling
yield prediction

Supplementary materials
The Supplementary Materials contain full experimental procedures (including validation runs), characterization data, details of the experimental apparatus, the qHPLC analytical methodology, and copies of 1H, 13C, 31P, and 19F NMR spectra. They also cover feature engineering, modeling details, and model validation, and provide the structure of each product made in the dataset along with predictions for each condition with every reactant pair in the dataset.

