These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

Predicting Single-Substance Phase Diagrams: A Kernel Approach on Graph Representations of Molecules

submitted on 02.03.2021, 03:13 and posted on 03.03.2021, 05:31 by Yan Xiang, Yu-Hang Tang, Hongyi Liu, Guang Lin, Huai Sun

This work presents a Gaussian process regression (GPR) model, built on a novel graph representation of chemical molecules, that predicts thermodynamic properties of pure substances in single, double, and triple phases. A transferable molecular graph representation is proposed as the input to a marginalized graph kernel, which is the major component of the covariance function in our GPR models. Radial basis function kernels of temperature and pressure are also incorporated into the covariance function when necessary. We predicted three types of representative properties of pure substances in single, double, and triple phases, i.e., critical temperature, vapor-liquid equilibrium (VLE) density, and pressure-temperature density. The data were collected from Knovel Data Analysis Beta: NIST ThermoDynamics Pure Compounds. The accuracy of the models is nearly identical to the precision of the experimental measurements. Moreover, the reliability of our predictions can be quantified on a per-sample basis using the posterior uncertainty of the GPR model. We compare our model against Morgan fingerprints and a graph neural network to further demonstrate the advantage of the proposed method. The marginalized graph kernel is computed using the GraphDot package. All code used in this work is available online.
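As a rough illustration of the covariance structure described in the abstract, the following minimal sketch combines a molecule-level similarity matrix with a radial basis function kernel of temperature and uses the result in a zero-mean GPR with per-sample posterior uncertainty. It is not the authors' implementation. The matrix `K_mol` is a hypothetical stand-in for a marginalized graph kernel (it is not computed by GraphDot here), combining the kernels as a product is an assumption, and the target values are synthetic toy numbers rather than measurements.

```python
import numpy as np

def rbf(x1, x2, length_scale):
    """Radial basis function kernel between two 1-D arrays of scalars."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def composite_kernel(K_mol, mol_a, T_a, mol_b, T_b, length_scale):
    """Molecule similarity times RBF of temperature (assumed product form)."""
    return K_mol[np.ix_(mol_a, mol_b)] * rbf(T_a, T_b, length_scale)

def gpr_predict(K_train, K_cross, K_test_diag, y, noise):
    """Posterior mean and standard deviation of a zero-mean GP."""
    L = np.linalg.cholesky(K_train + noise * np.eye(len(y)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, K_cross.T)
    mean = K_cross @ alpha
    var = K_test_diag - np.sum(v * v, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Hypothetical similarity between three molecules (stand-in for a graph kernel).
K_mol = np.array([[1.0, 0.8, 0.3],
                  [0.8, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])

# Synthetic training set: (molecule index, temperature in K) -> standardized target.
mol_train = np.array([0, 0, 1, 1, 2, 2])
T_train = np.array([300.0, 350.0, 300.0, 350.0, 300.0, 350.0])
y_train = np.array([0.5, 0.3, 0.6, 0.4, -0.2, -0.4])

# Two queries for molecule 1: one inside the training temperature range, one far outside.
mol_test = np.array([1, 1])
T_test = np.array([325.0, 600.0])

length_scale = 50.0  # assumed RBF length scale in K
K_train = composite_kernel(K_mol, mol_train, T_train, mol_train, T_train, length_scale)
K_cross = composite_kernel(K_mol, mol_test, T_test, mol_train, T_train, length_scale)
K_test_diag = np.diag(
    composite_kernel(K_mol, mol_test, T_test, mol_test, T_test, length_scale)
)

mean, std = gpr_predict(K_train, K_cross, K_test_diag, y_train, noise=1e-4)
print("posterior mean:", mean)
print("posterior std: ", std)
```

In this toy setup the query far from the training temperatures receives a larger posterior standard deviation than the interpolated one, which is the kind of per-sample reliability estimate the abstract refers to. A pressure kernel could be multiplied in the same way under the same assumption.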


Shanghai Jiao Tong University



Declaration of Conflict of Interest

There is no conflict of interest.