An Unsupervised Machine Learning Workflow for Assigning and Predicting Generality in Asymmetric Catalysis


The development of chiral catalysts that can provide high enantioselectivities across a wide assortment of substrates or a broad reaction range is a priority for many catalyst design efforts. While several approaches are available to aid in the identification of general catalyst systems, there is currently no simple procedure for directly measuring how general a given catalyst could be. Herein, we present a catalyst-agnostic workflow centered on unsupervised machine learning that enables the rapid assessment and quantification of catalyst generality. The workflow uses curated literature data sets and reaction descriptors to visualize and cluster chemical space coverage. The resulting reaction network can then be used to derive a catalyst generality metric through designer equations and can be interfaced with regression techniques for general catalyst prediction. As validating case studies, we applied this method to identify, through quantification, the most general catalyst chemotype for an organocatalytic asymmetric Mannich reaction and to predict the most general chiral phosphoric acid catalyst for the addition of nucleophiles to imines. The mechanistic basis for catalyst generality can then be gleaned from the calculated values by deconstructing the contributions of chemical space and enantiomeric excess to the overall result. We conclude that broadly applicable catalysts may be more adaptive to changes in reactant structure because enantioinduction does not rely on a single set of noncovalent interactions. In contrast, some systems work by engaging in robust noncovalent contacts that do not change significantly in nature when the structure of a reaction component is altered. Finally, our generality techniques permitted the development of mechanistically informative catalyst screening sets that allow experimentalists to rationally select catalysts that have the highest probability of achieving a good result in the first round of reaction development.
Overall, our findings represent a framework for interrogating and predicting catalyst generality, and this strategy should be relevant to other catalytic systems widely applied in asymmetric synthesis.
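
To make the idea of splitting a generality value into chemical space and enantiomeric excess contributions concrete, the following is a minimal Python sketch. It is not the designer equation used in this work, which is not stated in this summary. It assumes that each literature reaction has already been assigned to a cluster by some unsupervised step (for example, k-means on reaction descriptors), and it uses a hypothetical score defined as cluster coverage multiplied by the mean per-cluster ee. The function name, the record layout, and all data values are illustrative assumptions.

```python
from collections import defaultdict


def generality_score(records, n_clusters):
    """Hypothetical generality score (an assumption, not the published metric).

    records: iterable of (catalyst, cluster_id, ee_percent) tuples, where
             cluster_id comes from a prior unsupervised clustering step.
    n_clusters: total number of clusters found in the reaction space.
    """
    # Group ee values by catalyst, then by cluster.
    per_catalyst = defaultdict(lambda: defaultdict(list))
    for catalyst, cluster_id, ee in records:
        per_catalyst[catalyst][cluster_id].append(ee)

    scores = {}
    for catalyst, clusters in per_catalyst.items():
        # Chemical space term: fraction of clusters where the catalyst was tested.
        coverage = len(clusters) / n_clusters
        # Selectivity term: average of the per-cluster mean ee values.
        mean_ee = sum(sum(v) / len(v) for v in clusters.values()) / len(clusters)
        scores[catalyst] = {
            "coverage": coverage,
            "mean_ee": mean_ee,
            "score": coverage * mean_ee / 100.0,
        }
    return scores


# Made-up illustrative data: (catalyst, cluster_id, ee in %).
records = [
    ("cat_A", 0, 90.0),
    ("cat_A", 1, 80.0),
    ("cat_A", 2, 70.0),
    ("cat_B", 0, 98.0),
    ("cat_B", 0, 96.0),
]
scores = generality_score(records, n_clusters=4)
for name, parts in sorted(scores.items()):
    print(name, parts)
```

In this toy data, cat_B gives higher ee but only within a single cluster, so its hypothetical score falls below that of cat_A, which is less selective but spread over three clusters. Keeping the coverage and ee terms separate is what allows the two contributions to be inspected individually.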

Version notes

Added experimental evaluation of general catalyst rankings using literature data and new experimental results. Clarified several of the more complex points.