Abstract
Despite its widespread use in chemical discovery, approximate density functional theory (DFT) is poorly suited to many materials targets, such as those containing open-shell 3d transition metals, which can be expected to have strong multireference (MR) character. For DFT workflows to be predictive, we must incorporate automated, low-cost methods that distinguish the regions of chemical space where DFT should be applied from those where it should not. We curate over 4,800 open-shell transition metal complexes, up to hundreds of atoms in size, from prior high-throughput DFT studies and evaluate fractional occupation number (FON)-based MR diagnostics for them using affordable, finite-temperature DFT. We show that intuitive measures of strong correlation (i.e., the HOMO-LUMO gap) are not predictive of MR character as judged by FON-based diagnostics. We train independent machine learning (ML) models to predict HOMO-LUMO gaps and FON-based diagnostics. Analysis of the ML models reveals that the two quantities differ in their sensitivity to the metal and to the ligands, suggesting opportunities to minimize MR character while tailoring the gap. We use our trained ML models to rapidly evaluate MR character over a space of ca. 187,000 theoretical complexes, identifying large-scale trends in spin-state-dependent MR character and discovering complexes that combine a small HOMO-LUMO gap with low MR character.