Theoretical and Computational Chemistry

OS100: A Benchmark Set of 100 Digitized UV-Visible Spectra and Derived Experimental Oscillator Strengths



The scientific method involves validating computational theories and methods against experimental results. However, comparing theory with experiment is not always straightforward: in UV-visible spectroscopy, experiments provide a plot of wavelength-dependent molar extinction (attenuation) coefficients (ε), while computations typically provide a single excitation energy and oscillator strength (ƒ) for each band. ε and ƒ are related, but the relation is complicated by various broadening and solvation effects. We describe a protocol to fit and integrate experimental UV-visible spectra to obtain ƒexp values for absorption bands and to estimate the uncertainty in the fitting. We apply this protocol to derive 164 ƒexp values from 100 organic molecules ranging in size from 6 to 34 atoms. The corresponding computed oscillator strengths (ƒcomp) are obtained with time-dependent density functional theory and a polarizable continuum solvent model. Expressing experimental and computed absorption strengths as a common quantity allows ƒcomp and ƒexp to be compared directly. While ƒcomp and ƒexp are well correlated (linear regression R² = 0.914), ƒcomp in most cases significantly overestimates ƒexp (regression slope = 1.31). The agreement between absolute ƒcomp and ƒexp values is substantially improved by accounting for a solvent refractive index factor, as suggested in some derivations in the literature. The 100 digitized UV-visible spectra are included as plain text files in the supporting information to aid in benchmarking computational or machine-learning approaches that aim to simulate realistic UV-visible absorption spectra.
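As an illustration of how a band in ε can be converted to an oscillator strength, the sketch below integrates a spectrum over wavenumber using the standard textbook relation ƒ ≈ 4.32 × 10⁻⁹ ∫ ε(ν) dν (ε in L mol⁻¹ cm⁻¹, ν in cm⁻¹). It is a minimal sketch and not the protocol used in this work: the function names, the synthetic single-Gaussian band, and the absence of any refractive index factor or uncertainty estimate are all assumptions made for illustration.

```python
import numpy as np

# Prefactor in f = K * integral of eps(nu) d(nu), with eps in L mol^-1 cm^-1
# and nu in cm^-1. Standard literature value; no refractive index factor applied.
K = 4.319e-9


def gaussian_band(nu, eps_max, nu0, fwhm):
    """Hypothetical helper: a single Gaussian absorption band in wavenumber space."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return eps_max * np.exp(-0.5 * ((nu - nu0) / sigma) ** 2)


def oscillator_strength(nu, eps):
    """Trapezoidal integration of eps over nu, scaled by K."""
    area = float(np.sum(0.5 * (eps[1:] + eps[:-1]) * np.diff(nu)))
    return K * area


def gaussian_oscillator_strength(eps_max, fwhm):
    """Closed-form area of a Gaussian band: eps_max * fwhm * sqrt(pi / (4 ln 2))."""
    return K * eps_max * fwhm * np.sqrt(np.pi / (4.0 * np.log(2.0)))


# Synthetic example band (illustrative values, not taken from the OS100 set).
nu = np.linspace(20000.0, 50000.0, 3001)
eps = gaussian_band(nu, eps_max=10000.0, nu0=35000.0, fwhm=4000.0)

f_numeric = oscillator_strength(nu, eps)
f_analytic = gaussian_oscillator_strength(10000.0, 4000.0)
print(f"numeric: {f_numeric:.4f}, analytic: {f_analytic:.4f}")
```

For a band that is well described by one Gaussian, the numerical and closed-form results should agree closely; for overlapping bands, each fitted Gaussian component would be integrated separately.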



Supplementary material

Supporting_OS.pdf
Main supporting document
Includes the following information:
Table S1: Information about the 100 molecules in the OS100 set and their solvents.
Table S2: ƒcomp, ƒexp,g1, ƒexp,g2, ƒexp,g, ƒexp,n1, ƒexp,n2, ƒexp,n, and total score for each of the 164 transitions.
Table S3: vmin, vmax, εmax, and v at εmax for each of the 164 transitions.
Fig. S1: Plot and linear regression for ƒcomp vs. ƒexp,g after excluding 37 "very low confidence" points.
Fig. S2: Plot and linear regression for ƒcomp vs. εmax.
Fig. S3: Plots and linear regression for ƒcomp vs. ƒexp,g / n.
Fig. S4: Plots and linear regression for ƒcomp vs. ƒexp,g × n.
Supporting_plots.pdf
Digitized and integrated UV-visible spectra
Plots of all digitized spectra for the fexp,g1 data, including Gaussian deconvolution, chemical structures, and TD-B3LYP/6-31+G* stick spectra