Abstract
Knowledge of the frontier orbital energies, namely those of the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO), is vital for studying the chemical and electrochemical stability of compounds, their corrosion inhibition potential, their reactivity, etc. Density functional theory (DFT) calculations provide a direct route to estimating these energies in either the gas phase or the condensed phase. However, DFT becomes computationally intensive when hundreds of thousands of compounds are to be screened. Such is the case when all isomers of the 1-alkyl-3-alkylimidazolium cation [C$_n$C$_m$im]$^+$ (n = 1-10, m = 1-10) are considered: enumerating this isomer space yields close to 316,000 cation structures, and calculating frontier orbital energies for each with DFT would be very expensive and time-consuming. In this article, we develop a machine learning model based on the extreme gradient boosting (XGBoost) method using a small subset of the isomer space, and use it to predict the HOMO and LUMO energies. The model predicts the HOMO energies with a mean absolute error (MAE) of 0.4 eV and the LUMO energies with an MAE of 0.2 eV. Inferences are also drawn about the types of descriptors deemed important for the HOMO and LUMO energy estimates. Applying the machine learning model drastically reduces the computational time required for such calculations.
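For reference, the MAE values quoted above presumably follow the standard definition, taking the DFT-computed orbital energies as the reference. The symbols $N$, $E_i^{\mathrm{pred}}$, and $E_i^{\mathrm{DFT}}$ are notation assumed here for illustration and are not taken from the article:

$$
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| E_i^{\mathrm{pred}} - E_i^{\mathrm{DFT}} \right|
$$

Here $N$ is the number of cations in the evaluation set, $E_i^{\mathrm{pred}}$ is the model-predicted HOMO (or LUMO) energy of cation $i$, and $E_i^{\mathrm{DFT}}$ is the corresponding DFT value, all in eV.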
Supplementary materials
Supplementary Information: Hyperparameter selection