Stacked Ensemble Machine Learning for Range-Separation Parameters

13 September 2021, Version 2
This content is a preprint and has not undergone peer review at the time of posting.


High-throughput virtual materials and drug discovery based on density functional theory has achieved tremendous success in recent decades, but its accuracy for organic semiconducting molecules suffered severely from the self-interaction error until optimally tuned range-separated hybrid (OT-RSH) exchange-correlation functionals were developed. The accurate but expensive first-principles OT-RSH approach transitions from a short-range (semi-)local functional to long-range Hartree-Fock exchange at a distance characterized by the inverse of a molecule-specific, non-empirically determined range-separation parameter (ω). In the present study, we propose a stacked ensemble machine learning (SEML) model that provides an accelerated alternative to OT-RSH based on system-dependent structural and electronic configurations. We trained ML-ωPBE, the first functional in our series, on a database of 1,970 structurally diverse organic semiconducting molecules, and assessed its accuracy and efficiency on another 1,956 molecules. Compared with the first-principles OT-ωPBE, our ML-ωPBE reached a mean absolute error of 0.00504 a₀⁻¹ for the optimal value of ω, reduced the computational cost for the test set by 2.66 orders of magnitude, and achieved comparable predictive power for various optical properties.
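To make the role of ω concrete, the sketch below assumes the usual error-function partition of the Coulomb operator, 1/r = erfc(ωr)/r + erf(ωr)/r, in which the first term is treated by the short-range (semi-)local functional and the second by long-range Hartree-Fock exchange. The abstract does not state the partition explicitly, so this form, the function names, and the sample value of ω are illustrative assumptions rather than the authors' implementation. The last lines convert the reported 2.66 orders of magnitude into a plain speedup factor.

```python
import math


def coulomb_split(r, omega):
    """Return (short_range, long_range) parts of 1/r.

    r     : interelectronic distance in bohr (must be > 0)
    omega : range-separation parameter in inverse bohr
    Assumes the standard erf/erfc partition (an assumption, see lead-in).
    """
    short_range = math.erfc(omega * r) / r  # (semi-)local functional region
    long_range = math.erf(omega * r) / r    # Hartree-Fock exchange region
    return short_range, long_range


def long_range_share(r, omega):
    """Fraction of 1/r carried by the long-range term, i.e. erf(omega * r)."""
    return math.erf(omega * r)


# Hypothetical omega, chosen only for illustration (not a value from the paper).
omega = 0.2
for r in (0.5, 1.0 / omega, 20.0):
    sr, lr = coulomb_split(r, omega)
    share = long_range_share(r, omega)
    print(f"r = {r:5.1f} bohr  SR = {sr:.4f}  LR = {lr:.4f}  LR share = {share:.3f}")

# "2.66 orders of magnitude" expressed as a multiplicative speedup.
speedup = 10 ** 2.66
print(f"speedup factor = {speedup:.0f}")
```

At r = 1/ω the long-range share equals erf(1), so most of the interaction is already handled by Hartree-Fock exchange at that distance. A smaller ω pushes the crossover to larger separations, which is why a molecule-specific value matters.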


Stacked Ensemble Machine Learning
Range-Separation Parameter
Composite Molecular Descriptors
Base and Meta Learners
Koopmans' Theorem
Optical Properties

Supplementary materials

Supporting Information
Brief proofs of Koopmans' theorem and of the asymptotic decay of the electron density; details of the general OT-ωPBE and ML-ωPBE functionals, the composite molecular descriptors, the SEML model, and the quantum chemical calculations; and summary error statistics for ML-ωPBE and other XC functionals in optical properties.
SMILES strings and ω values for all molecules in the training and test sets.
