Abstract
Improving our fundamental understanding of complex heterocatalytic processes increasingly relies on electronic structure simulations and microkinetic models based on calculated energy differences. In particular, calculation of activation barriers, usually achieved through compute-intensive saddle point search routines, remains a serious bottleneck in understanding trends in catalytic activity for highly branched reaction networks. Although the well-known Brønsted-Evans-Polanyi (BEP) scaling, a one-dimensional linear regression model, has been widely applied in such microkinetic models, it still relies on calculated reaction energies and may not generalize beyond a single facet on a single class of materials, e.g., terrace sites on transition metals. For highly branched and energetically shallow reaction networks, such as electrochemical CO2 reduction or waste remediation, calculating even reaction energies on many surfaces can become computationally intractable due to the combinatorial explosion of states that must be considered. Here, we investigate the feasibility of activation barrier prediction without knowledge of the reaction energy using linear and nonlinear machine learning (ML) models trained on a new database of over 500 dehydrogenation activation barriers. We find that inclusion of the reaction energy significantly improves both classes of ML models, but complex nonlinear models can achieve performance similar to the simplest BEP scaling when predicting activation barriers on new systems. Additionally, inclusion of the reaction energy significantly improves generalizability to new systems beyond the training set. Our results suggest that the reaction energy is a critical feature to consider when building models to predict activation barriers, indicating that efforts to reliably predict reaction energies through, e.g., the Open Catalyst Project and others, will be an important route to effective model development for more complex systems.
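For readers unfamiliar with BEP scaling, the one-dimensional linear regression mentioned above can be sketched as a least-squares fit of the activation barrier against the reaction energy, E_a = alpha * dE + beta. The following is a minimal illustration only: the function names `fit_bep` and `predict_bep` are hypothetical, and the numbers are synthetic placeholders, not values from the database described in this work.

```python
import numpy as np


def fit_bep(reaction_energies, activation_barriers):
    """Least-squares fit of E_a = alpha * dE + beta (hypothetical helper)."""
    dE = np.asarray(reaction_energies, dtype=float)
    Ea = np.asarray(activation_barriers, dtype=float)
    # np.polyfit returns coefficients from highest degree to lowest.
    alpha, beta = np.polyfit(dE, Ea, deg=1)
    return alpha, beta


def predict_bep(alpha, beta, reaction_energies):
    """Predict activation barriers from reaction energies with a fitted line."""
    return alpha * np.asarray(reaction_energies, dtype=float) + beta


# Synthetic, illustrative values in eV (an assumption for this sketch,
# NOT taken from the paper's dataset). The barriers lie exactly on a line
# so the fitted slope and intercept are easy to check.
dE_train = np.array([-0.8, -0.4, 0.0, 0.3, 0.7])
Ea_train = 0.5 * dE_train + 1.0

alpha, beta = fit_bep(dE_train, Ea_train)
Ea_new = predict_bep(alpha, beta, [0.2])
```

Note that such a fit needs the reaction energy as input for every new system, which is exactly the dependence the abstract asks whether ML models can avoid.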
Supplementary materials
Title
Dataset and DFT trajectories used in the writing of this manuscript.
Description
Dataset and DFT trajectories used in the writing of this manuscript.
Title
Written SI
Description
Contains details related to hyperparameter selection and dataset generation.