Exploring graph traversal algorithms in graph-based molecular generation

Authors

Abstract

Here we explore the impact of different graph traversal algorithms on molecular graph generation. We train a graph-based deep molecular generative model to build structures using a node order determined by either a breadth-first or a depth-first search algorithm. We observe that a breadth-first traversal leads to better coverage of training data features than a depth-first traversal. We quantify these differences on a dataset of natural products using a variety of metrics, including percent validity, molecular coverage, and molecular shape. We also observe that with either a breadth- or depth-first traversal it is possible to over-train the generative models, at which point the results obtained with the two graph traversal algorithms are identical.
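
To make the difference between the two node orders concrete, the following is a minimal sketch of breadth-first and depth-first orderings on a toy graph. It is not the GraphINVENT implementation: the function names `bfs_order` and `dfs_order`, the adjacency-list representation, and the five-atom chain are assumptions chosen purely for illustration.

```python
from collections import deque


def bfs_order(adjacency, start):
    """Return nodes in the order a breadth-first search first reaches them."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


def dfs_order(adjacency, start):
    """Return nodes in the order a depth-first search first visits them."""
    order = []
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push neighbors in reverse so the first-listed neighbor is explored first.
        stack.extend(reversed(adjacency[node]))
    return order


# Hypothetical toy graph: a five-atom chain 3-1-0-2-4, started from the central atom 0.
chain = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2]}

print(bfs_order(chain, 0))  # [0, 1, 2, 3, 4]
print(dfs_order(chain, 0))  # [0, 1, 3, 2, 4]
```

In this sketch the breadth-first order adds both neighbors of the starting atom before moving outward, whereas the depth-first order completes one arm of the chain before returning to the other. A generative model trained on each ordering would therefore see the same structure built through a different sequence of intermediate subgraphs.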

Version notes

List of changes: (1) updated author contact information, (2) revised the manuscript for clarity, adding results on molecular complexity and examples of training and sampled molecules, and (3) fixed the Zenodo link.

Supplementary weblinks

Exploring graph traversal algorithms in graph-based molecular generation
Here we have uploaded tarballs which contain all the scripts, data, and (modified) GraphINVENT code needed to reproduce the results in 'Exploring graph traversal algorithms in graph-based molecular generation' by Rocío Mercado, Esben J. Bjerrum, and Ola Engkvist.