Theoretical and Computational Chemistry

Stacked Ensemble Machine Learning for Range-Separation Parameters

Authors

  • Cheng-Wei Ju, University of Massachusetts Amherst & University of Chicago
  • Ethan French, University of Massachusetts Amherst & University of Chicago
  • Nadav Geva, Advanced Micro Devices (United States)
  • Alexander Kohn, Activision Blizzard (United States)
  • Zhou Lin, University of Massachusetts Amherst

Abstract

High-throughput virtual materials and drug discovery based on density functional theory has achieved tremendous success in recent decades, but its predictive power for organic semiconducting molecules suffered catastrophically from the self-interaction error until optimally tuned range-separated hybrid (OT-RSH) exchange-correlation functionals were developed. An accurate but expensive first-principles OT-RSH functional transitions from a short-range (semi-)local functional to long-range Hartree-Fock exchange at a distance characterized by the inverse of a molecule-specific, non-empirically determined range-separation parameter (ω). In the present study, we propose a stacked ensemble machine learning (SEML) model that provides an accelerated alternative to OT-RSH based on system-dependent structural and electronic configurations. We trained ML-ωPBE, the first functional in our series, using a database of 1,970 organic semiconducting molecules with sufficient structural diversity, and assessed its accuracy and efficiency using another 1,956 molecules. Compared with the first-principles OT-ωPBE, ML-ωPBE reached a mean absolute error of 0.00504 a_0^{-1} for the optimal value of ω, reduced the computational cost for the test set by 2.66 orders of magnitude, and achieved comparable predictive power for various optical properties.
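
The role of ω as an inverse distance, and the size of the reported speedup, can be illustrated with a minimal sketch. It assumes the standard error-function partition of the Coulomb operator, 1/r = erfc(ωr)/r + erf(ωr)/r, commonly used in ωPBE-type functionals; the value ω = 0.2 a_0^{-1} and the function names are hypothetical choices for illustration and are not taken from the paper or its repository.

```python
import math


def coulomb_split(r, omega):
    """Split 1/r into short- and long-range parts (assumed erf partition).

    r     : interelectronic distance in bohr (a_0)
    omega : range-separation parameter in a_0^{-1}
    """
    short_range = math.erfc(omega * r) / r  # handled by the (semi-)local functional
    long_range = math.erf(omega * r) / r    # handled by Hartree-Fock exchange
    return short_range, long_range


def speedup_factor(orders_of_magnitude):
    """Convert a cost reduction given in orders of magnitude to a plain factor."""
    return 10.0 ** orders_of_magnitude


if __name__ == "__main__":
    omega = 0.2  # hypothetical illustrative value, in a_0^{-1}
    for r in (1.0, 1.0 / omega, 20.0):
        sr, lr = coulomb_split(r, omega)
        # Fraction of the full Coulomb interaction carried by the long-range part
        print(f"r = {r:5.1f} a_0: long-range fraction = {lr * r:.3f}")
    print(f"2.66 orders of magnitude = factor of about {speedup_factor(2.66):.0f}")
```

At r = 1/ω the long-range part already carries erf(1), roughly 84%, of the interaction, which is why 1/ω serves as the characteristic crossover distance. A reduction of 2.66 orders of magnitude corresponds to a factor of roughly 457 in computational cost.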

Version notes

Corrected scientific and typographical mistakes.

Content

main_text_submission.pdf

Supplementary material

si_submission.pdf
Supporting Information
A brief proof of Koopmans' theorem and of the asymptotic decay of the electron density; detailed descriptions of the general OT-ωPBE and ML-ωPBE functionals, composite molecular descriptors, the SEML model, and quantum chemical calculations; and statistical summaries of the errors of ML-ωPBE and other XC functionals in optical properties.
dataset.xlsx
Dataset
SMILES strings and ω values for all molecules in the training and test sets.

Supplementary weblinks

ML-wPBE: Source Code and Database
Link to the source code and database on the GitHub repository of the Lin Group