Abstract
The development of new nanomaterials for energy technologies
depends on understanding the intricate relation between material
properties and atomic structure. It is, therefore, crucial to be able to
routinely characterise the atomic structure in nanomaterials, and a
promising method for this task is Pair Distribution Function (PDF)
analysis. The PDF can be obtained through Fourier transformation of X-ray total scattering data and represents a histogram of
all interatomic distances in the sample. Going from the distance
information in the PDF to a chemical structure is an unassigned
distance geometry problem (uDGP), and solving it is often the bottleneck in nanostructure analysis. In this work, we propose
using a Conditional Variational Autoencoder (CVAE) to solve the uDGP
automatically and obtain valid chemical structures from PDFs. We
use a simple model system of hypothetical mono-metallic nanoparticles containing up to 100 atoms in the face-centered cubic (FCC)
structure as a proof of concept. The model is trained to predict the
assigned distance matrix (aDM) of a structure, with the simulated PDF of that structure as the conditional input. We introduce a novel representation
of structures by projecting them inside a unit sphere and adding
anchor points, or satellites, to help in the reconstruction
of the chemical structure. The performance of the CVAE model is
compared to that of a Deterministic Autoencoder (DAE), showing that both
models can solve the uDGP reasonably well. We further show
that the CVAE learns a structured and meaningful latent embedding
space which can be used to predict new chemical structures.
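
To make the distinction between assigned and unassigned distances concrete, the following minimal Python sketch may help. It is an illustration only, not the model or the PDF simulation used in this work: the five-atom toy cluster, the lattice constant a = 4.0 and the helper names are all hypothetical, and the plain distance histogram is only a crude stand-in for a real PDF.

```python
import itertools
import math


def assigned_distance_matrix(coords):
    """Return the N x N matrix D with D[i][j] = distance between atoms i and j."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def unassigned_distances(coords):
    """Return the sorted pairwise distances with the atom labels discarded."""
    return sorted(math.dist(p, q) for p, q in itertools.combinations(coords, 2))


def distance_histogram(distances, r_max, n_bins):
    """Count distances per bin on [0, r_max); a crude stand-in for a PDF."""
    counts = [0] * n_bins
    width = r_max / n_bins
    for d in distances:
        if d < r_max:
            counts[int(d / width)] += 1
    return counts


# Hypothetical toy cluster on FCC lattice sites (assumed lattice constant a).
a = 4.0
coords = [
    (0.0, 0.0, 0.0),
    (a / 2, a / 2, 0.0),
    (a / 2, 0.0, a / 2),
    (0.0, a / 2, a / 2),
    (a, 0.0, 0.0),
]

# Assigned information: every distance is tied to a specific pair of atoms.
adm = assigned_distance_matrix(coords)

# Unassigned information: only the list (or histogram) of distances survives.
dists = unassigned_distances(coords)
hist = distance_histogram(dists, r_max=6.0, n_bins=6)

# Relabelling the atoms changes the aDM but leaves the distance list unchanged.
relabelled = coords[::-1]
adm_relabelled = assigned_distance_matrix(relabelled)
dists_relabelled = unassigned_distances(relabelled)
```

Computing the distance list from the coordinates is straightforward, as above. The uDGP is the inverse task: recovering an aDM, and from it a structure, when only the unassigned distance information is available.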