Abstract
We report a comprehensive computational study of unsupervised machine learning for extracting chemically relevant information from X-ray absorption near edge structure (XANES) and valence-to-core X-ray emission spectra (VtC-XES), applied to the classification of a broad ensemble of sulforganic molecules. By progressively relaxing the constraining assumptions of the unsupervised machine learning algorithm, moving from principal component analysis (PCA) to a variational autoencoder (VAE) to t-distributed stochastic neighbor embedding (t-SNE), we find improved sensitivity to steadily more refined chemical information. Surprisingly, even in merely two dimensions, t-SNE distinguishes not just oxidation state and general sulfur bonding environment but also the aromaticity of the bonding radical group, with 87% accuracy, and it identifies even finer details in electronic structure within aromatic or aliphatic sub-classes. We find that the chemical information in XANES and VtC-XES is very similar, although the two techniques show an unexpected tendency to differ in sensitivity within a given molecular class.
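As a rough illustration of the most constrained step in the progression described above, the following minimal sketch applies principal component analysis to toy spectra and projects them into two dimensions. Everything in it is an assumption made for illustration: the synthetic single-peak "spectra", the two hypothetical classes, the peak positions and widths, and the helper names `synthetic_spectra` and `pca_2d`. It is not the authors' dataset or pipeline, and it covers only the linear (PCA) stage; a variational autoencoder or t-SNE would replace `pca_2d` in the less constrained stages.

```python
import numpy as np


def synthetic_spectra(n_per_class=50, n_points=200, seed=0):
    """Toy 'spectra': one Gaussian peak whose position depends on the class.

    The peak centers and widths are hypothetical, not taken from the study.
    """
    rng = np.random.default_rng(seed)
    energy = np.linspace(0.0, 1.0, n_points)
    centers = (0.35, 0.65)  # assumed peak positions for two toy classes
    spectra, labels = [], []
    for label, center in enumerate(centers):
        # Small per-sample jitter of the peak position within a class.
        shifts = center + 0.01 * rng.standard_normal(n_per_class)
        peaks = np.exp(-((energy[None, :] - shifts[:, None]) ** 2) / (2 * 0.05 ** 2))
        noise = 0.02 * rng.standard_normal(peaks.shape)
        spectra.append(peaks + noise)
        labels.append(np.full(n_per_class, label))
    return np.vstack(spectra), np.concatenate(labels)


def pca_2d(X):
    """Project the rows of X onto the two leading principal components."""
    Xc = X - X.mean(axis=0)  # center each energy channel
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T


X, y = synthetic_spectra()
Z = pca_2d(X)  # one 2D point per spectrum
print(Z.shape)
```

In this toy setting the two classes separate along the first principal component because the peak shift dominates the variance. Real spectra with subtler differences are where the less constrained, nonlinear methods would be expected to help.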
Supplementary materials
Title
Supplemental information: Unsupervised Machine Learning for Unbiased Chemical Classification in X-ray Absorption Spectroscopy and X-ray Emission Spectroscopy
Description
Supplemental information for the main manuscript.