Stacked Ensemble Machine Learning for Range-Separation Parameters

13 September 2021, Version 2
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

High-throughput virtual materials and drug discovery based on density functional theory has achieved tremendous success in recent decades, but its predictive power for organic semiconducting molecules suffered severely from the self-interaction error until optimally tuned range-separated hybrid (OT-RSH) exchange-correlation functionals were developed. The accurate but expensive first-principles OT-RSH functional transitions from a short-range (semi-)local functional to long-range Hartree-Fock exchange at a distance characterized by the inverse of a molecule-specific, non-empirically determined range-separation parameter (ω). In the present study, we propose a stacked ensemble machine learning (SEML) model that provides an accelerated alternative to OT-RSH based on system-dependent structural and electronic configurations. We trained ML-ωPBE, the first functional in our series, using a database of 1,970 organic semiconducting molecules with sufficient structural diversity, and assessed its accuracy and efficiency using another 1,956 molecules. Compared with the first-principles OT-ωPBE, our ML-ωPBE reached a mean absolute error of 0.00504 a_0^{-1} for the optimal value of ω, reduced the computational cost for the test set by 2.66 orders of magnitude (roughly a factor of 460), and achieved comparable predictive power for various optical properties.
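To make the role of ω concrete, the following minimal sketch assumes the standard error-function partition of the Coulomb operator, 1/r = erfc(ωr)/r + erf(ωr)/r, which is the usual choice for ωPBE-type functionals but is not spelled out in this abstract. The function names and the value of ω are hypothetical and chosen only for illustration; they are not taken from the paper or its dataset.

```python
import math


def coulomb_split(r, omega):
    """Split 1/r into short- and long-range parts (assumed erf partition).

    r     : interelectronic distance in bohr (a_0), must be positive
    omega : range-separation parameter in a_0^{-1}, must be positive
    """
    if r <= 0 or omega <= 0:
        raise ValueError("r and omega must be positive")
    short_range = math.erfc(omega * r) / r  # handled by the (semi-)local functional
    long_range = math.erf(omega * r) / r    # handled by Hartree-Fock exchange
    return short_range, long_range


def long_range_fraction(r, omega):
    """Fraction of 1/r assigned to the long-range part at distance r."""
    return math.erf(omega * r)


if __name__ == "__main__":
    omega = 0.2  # hypothetical value in a_0^{-1}, not from the paper
    for r in (0.5, 1.0 / omega, 20.0):
        sr, lr = coulomb_split(r, omega)
        frac = long_range_fraction(r, omega)
        print(f"r = {r:5.1f} a_0  short = {sr:.4f}  long = {lr:.4f}  "
              f"long-range fraction = {frac:.3f}")
```

Under this assumed partition, the long-range fraction at the characteristic distance r = 1/ω is erf(1), about 0.84, for any ω, so a smaller ω pushes the switch to Hartree-Fock exchange out to larger distances.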

Keywords

Stacked Ensemble Machine Learning
Range-Separation Parameter
Composite Molecular Descriptors
Base and Meta Learners
Koopmans' Theorem
Optical Properties

Supplementary materials

Title: Supporting Information
Description: Brief proof of Koopmans' theorem and asymptotic decay of electronic density, descriptions of details for general OT-ωPBE and ML-ωPBE functionals, composite molecular descriptors, the SEML model, quantum chemical calculations, and summaries of statistics of errors of ML-ωPBE and other XC functionals in optical properties.

Title: Dataset
Description: SMILES strings and ω values for all molecules in the training and test sets.
