ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

Beyond Generative Models: Superfast Traversal, Optimization, Novelty, Exploration and Discovery (STONED) Algorithm for Molecules using SELFIES

preprint
submitted on 16.12.2020, 02:28 and posted on 16.12.2020, 02:33 by AkshatKumar Nigam, Robert Pollice, Mario Krenn, Gabriel dos Passos Gomes, Alan Aspuru-Guzik
Inverse design allows the design of molecules with desirable properties using property optimization. Deep generative models have recently been applied to tackle inverse design, as they possess the ability to optimize molecular properties directly through structure modification using gradients. While the ability to carry out direct property optimizations is promising, the use of generative deep learning models to solve practical problems requires large amounts of data and is very time-consuming. In this work, we propose STONED – a simple and efficient algorithm to perform interpolation and exploration in the chemical space, comparable to deep generative models. STONED bypasses the need for large amounts of data and training times by using string modifications in the SELFIES molecular representation. We achieve comparable performance on typical benchmarks without any training. We demonstrate applications in high-throughput virtual screening for the design of drugs, photovoltaics, and the construction of chemical paths, allowing for both property and structure-based interpolation in the chemical space. We anticipate our results to be a stepping stone for developing more sophisticated inverse design models and benchmarking tools, ultimately helping generative models achieve wide adoption.
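The string-modification idea in the abstract can be illustrated with a minimal sketch. It assumes the modifications are single-token point mutations (replace, insert, delete) applied to a SELFIES string, which is a sequence of bracketed tokens. The names `TOY_ALPHABET`, `split_tokens`, `mutate`, and `neighbours` are hypothetical and not taken from the authors' code, and the sketch uses only the Python standard library, so it does not decode the mutated strings back to molecules or check them chemically.

```python
import random
import re

# Hypothetical toy alphabet for illustration only; a real run would draw
# from the full SELFIES alphabet rather than this hand-picked subset.
TOY_ALPHABET = ["[C]", "[N]", "[O]", "[F]", "[=C]", "[=N]", "[=O]",
                "[Branch1]", "[Ring1]"]


def split_tokens(selfies_string):
    """Split a SELFIES string into its bracketed tokens."""
    return re.findall(r"\[[^\]]*\]", selfies_string)


def mutate(selfies_string, rng):
    """Apply one random point mutation: replace, insert, or delete a token."""
    tokens = split_tokens(selfies_string)
    operation = rng.choice(["replace", "insert", "delete"])
    if operation == "delete" and len(tokens) > 1:
        del tokens[rng.randrange(len(tokens))]
    elif operation == "insert" or not tokens:
        tokens.insert(rng.randrange(len(tokens) + 1), rng.choice(TOY_ALPHABET))
    else:
        # "replace", or "delete" on a single-token string (falls back to
        # replace). The new token may equal the old one, so the string
        # can come back unchanged.
        tokens[rng.randrange(len(tokens))] = rng.choice(TOY_ALPHABET)
    return "".join(tokens)


def neighbours(selfies_string, count, seed=0):
    """Collect the distinct strings produced by `count` single mutations."""
    rng = random.Random(seed)
    return {mutate(selfies_string, rng) for _ in range(count)}


if __name__ == "__main__":
    # "[C][C][O]" is the SELFIES string for ethanol.
    for candidate in sorted(neighbours("[C][C][O]", count=10)):
        print(candidate)
```

In this sketch no model is trained: the neighbourhood of a seed molecule is generated purely by editing its token sequence. Turning the candidates into structures and scoring them would need a SELFIES decoder and a property function, both of which are left out here.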

Funding

Postdoc.Mobility fellowship by the Swiss National Science Foundation (SNSF, Project No. 191127)

Erwin Schrödinger fellowship No. J4309 from the Austrian Science Fund (FWF)

Banting Postdoctoral Fellowship from the Natural Sciences and Engineering Research Council of Canada (NSERC)

Anders G. Frøseth

Natural Resources Canada and the Canada 150 Research Chairs program

Compute Canada

Email Address of Submitting Author

akshat.nigam@mail.utoronto.ca

Institution

University of Toronto

Country

Canada

ORCID For Submitting Author

0000-0002-5152-2082

Declaration of Conflict of Interest

No conflict of interest
