Abstract
Using data science tools to reveal nontrivial chemical features for catalyst design
is an important goal in catalysis science. Additionally, there is currently no general strategy for
the computational design of homogeneous molecular catalysts. Here we report the unique combination of an
experimentally verified DFT-transition-state model with a random forest machine learning model in a
campaign to design new molecular Cr phosphine imine (Cr(P,N)) catalysts for selective ethylene
oligomerization, specifically to increase 1-octene selectivity. This involved calculating the
1-hexene:1-octene transition-state selectivity for 105 (P,N) ligands and extracting 14 descriptors, which were
then used to build a random forest regression model. This model revealed several key
design features, such as the Cr–N distance, the Cr–α distance, and the Cr distance out of pocket, which guided
the rapid design of a new generation of Cr(P,N) catalyst ligands predicted to give >95% selectivity
for 1-octene.
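
To give a sense of scale for the >95% target, the short sketch below converts a free-energy gap between two competing transition states into a product fraction, and back. It is an illustration under stated assumptions, not the workflow reported here: it assumes simple Boltzmann weighting of exactly two pathways (one leading to 1-hexene, one to 1-octene), and the function names and the default temperature of 298.15 K are hypothetical choices that the abstract does not specify.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def octene_fraction(ddg_kcal, temperature_k=298.15):
    """Fraction of 1-octene for a given transition-state free-energy gap.

    ddg_kcal is G(TS leading to 1-hexene) minus G(TS leading to 1-octene),
    in kcal/mol, so a positive value favors 1-octene.
    ASSUMPTION: two competing pathways with Boltzmann-weighted rates.
    ASSUMPTION: the default temperature is illustrative only.
    """
    ratio = math.exp(ddg_kcal / (R_KCAL * temperature_k))  # octene:hexene
    return ratio / (1.0 + ratio)


def ddg_for_fraction(fraction, temperature_k=298.15):
    """Free-energy gap (kcal/mol) needed to reach a given 1-octene fraction."""
    return R_KCAL * temperature_k * math.log(fraction / (1.0 - fraction))


if __name__ == "__main__":
    gap = ddg_for_fraction(0.95)
    print(f"Gap for 95% 1-octene at 298.15 K: {gap:.2f} kcal/mol")
    print(f"Round trip fraction: {octene_fraction(gap):.3f}")
```

Under these assumptions, 95% selectivity corresponds to a gap of a little under 2 kcal/mol, which illustrates why small structural changes in the ligand can shift the predicted product ratio substantially.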