These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or reported in news media as established information.

Semi-Supervised Machine Learning Enables the Robust Detection of Multireference Character at Low Cost

submitted on 01.07.2020 and posted on 02.07.2020 by Chenru Duan, Fang Liu, Aditya Nandy, Heather Kulik
Multireference (MR) diagnostics are common tools for identifying strongly correlated electronic structure that makes single-reference (SR) methods (e.g., density functional theory or DFT) insufficient for accurate property prediction. However, MR diagnostics typically require computationally demanding correlated wavefunction theory (WFT) calculations, and diagnostics often disagree or fail to predict MR effects on properties. To overcome these challenges, we introduce a semi-supervised machine learning (ML) approach with virtual adversarial training (VAT) of an MR classifier using 15 WFT and DFT MR diagnostics as inputs. In semi-supervised learning, only the most extreme SR or MR points are labeled, and the remaining point labels are learned. The resulting VAT model outperforms the alternatives, as quantified by the distinct property distributions of SR- and MR-classified molecules. To reduce the cost of generating inputs to the VAT model, we leverage the VAT model’s robustness to noisy inputs by replacing WFT MR diagnostics with regression predictions in an MR decision engine workflow that preserves excellent performance. We demonstrate the transferability of our approach to larger molecules and those with distinct chemical composition from the training set. This MR decision engine demonstrates promise as a low-cost, high-accuracy approach to the automatic detection of strong correlation for predictive high-throughput screening.
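
The labeling step described above, in which only the most extreme SR or MR points receive labels, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `label_extremes`, the use of a single score per molecule (the abstract describes 15 diagnostics), the convention that a higher score means stronger MR character, and the 20% cutoffs are all assumptions made for illustration only.

```python
def label_extremes(scores, low_frac=0.2, high_frac=0.2):
    """Label only the most extreme points; leave the rest unlabeled.

    Hypothetical convention: a higher score means stronger MR character.
    Returns a list with 0 (SR), 1 (MR), or None (unlabeled, to be learned
    later by a semi-supervised model).
    """
    n = len(scores)
    n_low = int(n * low_frac)    # number of points labeled SR
    n_high = int(n * high_frac)  # number of points labeled MR
    if n_low + n_high > n:
        raise ValueError("labeled fractions overlap")

    # Indices of the points, sorted from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])

    labels = [None] * n
    for i in order[:n_low]:
        labels[i] = 0            # most clearly single-reference
    for i in order[n - n_high:]:
        labels[i] = 1            # most clearly multireference
    return labels


# Illustrative, made-up scores for ten molecules.
scores = [0.01, 0.50, 0.02, 0.90, 0.30, 0.95, 0.25, 0.40, 0.10, 0.60]
labels = label_extremes(scores)
```

In a semi-supervised setting, the entries left as `None` would be passed to the classifier as unlabeled data rather than discarded.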





Simultaneous mitigation of density and energy errors in approximate DFT for transition metal chemistry

Basic Energy Sciences


AAAS Marion Milligan Mason Award

MolSSI fellowship (MolSSI NSF grant number ACI-1547580)

Burroughs Wellcome Fund Career Award at the Scientific Interface




Massachusetts Institute of Technology


United States



Declaration of Conflict of Interest

The authors declare no conflict of interest.