Photoemission Spectroscopy of Organic Molecules Using Plane-Wave/Pseudopotential Density Functional Theory and Machine Learning: A Comprehensive and Predictive Computational Protocol for Isolated Molecules, Molecular Aggregates and Organic Thin Films

13 March 2025, Version 3
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Photoemission measurements performed in the gas phase at low pressure have opened the way to exploring, at the scale of single molecules, the complex relationship between the electronic and structural properties of matter. Experimental results collected on molecules isolated from interactions with other species have, in turn, provided an ideal testing ground for developing ab initio simulations capable of interpreting and predicting photoemission spectra. For atom- and site-specific core ionization binding energies (BEs), accurate methods facilitate the interpretation of experimental data and help to assign the contributions of all non-equivalent atoms of the same species, even in unresolved features arising from a molecular structure. In this context, we have developed, extensively tested and made available to a broad readership a computational protocol rooted in plane-wave/pseudopotential density functional theory (hereafter referred to as PW-DFT) and based on a ∆SCF approach, to predict X-ray photoemission spectra (XPS) of molecules, molecular aggregates and molecular thin films deposited on inorganic substrates. Calculations have been performed using a representative set of semilocal and global/range-separated hybrid density functionals containing increasing fractions of Hartree-Fock exact exchange (EXX). Specifically, PBE, B3LYP (20 % EXX), HSE (range-separated, with 25 % EXX at short range) and BH&HLYP (50 % EXX) have been used to assess the computational protocol. Equation-of-motion coupled-cluster with single and double excitations (EOM-CCSD) has been employed as the reference theoretical method for comparison. The computational protocol has been tested against a wide set of molecular classes encompassing aromatic, heteroaromatic and aliphatic compounds as well as drugs and biomolecules, and has proved to be generally accurate and robust even with semilocal DFT.
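As a minimal sketch of the ∆SCF idea mentioned above, a core-level BE can be estimated as the difference between the total energy of the core-ionized final state and that of the neutral ground state. The function names, the `offset_ev` alignment parameter and the input layout below are illustrative assumptions, not the authors' published workflow. In pseudopotential calculations, absolute values generally need an alignment to a reference, which is why the offset is exposed here and why chemical shifts relative to a chosen atom are often the more meaningful quantity.

```python
from typing import Dict


def delta_scf_binding_energies(
    e_ground_ev: float,
    e_core_hole_ev: Dict[str, float],
    offset_ev: float = 0.0,
) -> Dict[str, float]:
    """Return one BE per atom: E(core-ionized) - E(ground) + offset.

    e_ground_ev    -- total energy of the neutral ground state (eV)
    e_core_hole_ev -- total energy with a core hole on each atom (eV),
                      keyed by a user-chosen atom label (hypothetical layout)
    offset_ev      -- assumed constant alignment term (default: none)
    """
    return {
        atom: e_final - e_ground_ev + offset_ev
        for atom, e_final in e_core_hole_ev.items()
    }


def relative_shifts(be_ev: Dict[str, float], reference: str) -> Dict[str, float]:
    """Return chemical shifts of every atom relative to a reference atom."""
    ref = be_ev[reference]
    return {atom: be - ref for atom, be in be_ev.items()}
```

Calling `delta_scf_binding_energies` with one core-hole total energy per non-equivalent atom, and then `relative_shifts` with a chosen reference label, gives the site-resolved shifts that can be compared with the components of an experimental core-level line.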
Moreover, valence photoemission measurements represent a complementary tool to core photoemission, particularly useful for investigating the properties of delocalized and π-conjugated molecular orbitals, and sensitive to chemical modifications that involve large molecules through noncovalent interactions. We have used the same set of density functionals to assess their capability to predict valence-shell ionization spectra for different molecular classes using Kohn-Sham eigenvalues as estimators. Finally, our PW-DFT data set of C1s, N1s and O1s BEs has been used to train machine learning (ML) models aimed at predicting the XPS spectra of isolated organic molecules from their structure. To ensure the reproducibility of our results and foster the use of our protocol, we have made available through a public repository a library of pseudopotentials and input files for ab initio calculations, together with the data sets employed to train the ML models.
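To illustrate how Kohn-Sham eigenvalues can serve as estimators of a valence-shell ionization spectrum, the sketch below takes the negative of each occupied eigenvalue as an ionization energy and sums one normalized Gaussian per level. The function name, the Gaussian line shape, the default width and the equal weighting of all levels are assumptions made for illustration. The abstract does not specify the broadening or any energy alignment used by the authors.

```python
import math
from typing import List, Sequence


def broadened_valence_spectrum(
    occupied_eigenvalues_ev: Sequence[float],
    grid_ev: Sequence[float],
    fwhm_ev: float = 0.5,
) -> List[float]:
    """Sum unit-area Gaussians centred at -eigenvalue for each occupied level.

    occupied_eigenvalues_ev -- occupied Kohn-Sham eigenvalues (eV, negative)
    grid_ev                 -- ionization-energy grid on which to evaluate (eV)
    fwhm_ev                 -- assumed full width at half maximum (eV)
    """
    # Convert FWHM to the Gaussian standard deviation.
    sigma = fwhm_ev / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))

    spectrum = []
    for energy in grid_ev:
        intensity = 0.0
        for eps in occupied_eigenvalues_ev:
            ionization_energy = -eps  # eigenvalue used as the estimator
            intensity += norm * math.exp(
                -0.5 * ((energy - ionization_energy) / sigma) ** 2
            )
        spectrum.append(intensity)
    return spectrum
```

Because every level carries the same weight, the sketch reproduces peak positions only. Relative intensities would additionally require photoionization cross sections, which are outside its scope.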

Keywords

Photoelectron spectroscopy
Density functional theory
Organic molecules
Core ionization
Predictive modeling

Supplementary materials

Supporting Information
Detailed structures and extended data tables related to all the systems investigated in this study. Application of machine-learning models to O1s and N1s core ionization.
