Machine Learning Models Correct Systematic Errors in Alchemical Perturbation Density Functional Theory Applications to Catalysis


Alchemical perturbation density functional theory (APDFT) has great promise for enabling rapid and accurate computational screening of hypothetical catalyst sites, but first-order approximations are insufficiently accurate when alchemical derivatives are large. In this work, we analyze errors in first-order APDFT calculation schemes for binding energies of CHx, NHx, OHx, and OOH adsorbates over a range of coverages on hypothetical alloys based on a Pt(111) reference system. We construct feature vectors by fingerprinting the dopant locations in the alloy and then use a data set of about 11,100 data points to train three separate support vector regression models that correct systematic APDFT prediction errors, one for each of the three classes of carbon-, nitrogen-, and oxygen-based adsorbates. While uncorrected first-order APDFT alone can yield reasonably accurate adsorbate binding energies on up to 36 hypothetical alloys from a single Kohn-Sham DFT calculation on a 3 × 3 unit cell for Pt(111), the machine learning-corrected APDFT in principle extends this number to more than 20,000 and provides a recipe for developing other machine learning models to aid future high-throughput screening studies.
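The error-correction idea can be illustrated with a minimal sketch in Python using scikit-learn's `SVR`. Everything numerical here is a synthetic stand-in and an assumption made for illustration, not the data or settings of this work: the fingerprint is taken to be a vector of nuclear-charge changes at nine hypothetical dopant sites, the "APDFT" energies are a toy linear model, the "DFT" reference adds an artificial systematic nonlinear term, and the hyperparameters (`C`, `epsilon`, RBF kernel) are arbitrary. Only one model is trained, whereas the work trains one per adsorbate class.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Hypothetical setup: 200 alloys, 9 candidate dopant sites each.
# Each fingerprint entry is the assumed nuclear-charge change at a site (-1, 0, or +1).
n_alloys, n_sites = 200, 9
fingerprints = rng.integers(-1, 2, size=(n_alloys, n_sites)).astype(float)

# Synthetic stand-ins for quantities a real study would compute.
site_weights = rng.normal(0.0, 0.1, size=n_sites)
be_apdft = fingerprints @ site_weights  # toy first-order (linear) estimate
# Toy "reference" energies: linear estimate plus a systematic nonlinear error.
be_dft = be_apdft + 0.05 * (fingerprints ** 2).sum(axis=1)

# Learn the error (reference minus first-order estimate), not the energy itself.
errors = be_dft - be_apdft
train, test = slice(0, 150), slice(150, None)
model = SVR(kernel="rbf", C=10.0, epsilon=0.01)  # arbitrary hyperparameters
model.fit(fingerprints[train], errors[train])

# Corrected prediction = first-order estimate + predicted error.
be_corrected = be_apdft[test] + model.predict(fingerprints[test])

mae_raw = np.mean(np.abs(be_dft[test] - be_apdft[test]))
mae_corrected = np.mean(np.abs(be_dft[test] - be_corrected))
print(f"MAE uncorrected: {mae_raw:.3f}  MAE corrected: {mae_corrected:.3f}")
```

Fitting the model to the residual rather than to the binding energy keeps the physics-based first-order estimate as the baseline, so the regressor only needs to capture the systematic part of the error.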


Supplementary material