These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.
Molecular Generation Targeting Desired Electronic Properties via Deep Generative Models

submitted on 27.09.2019, 13:54 and posted on 30.09.2019, 18:32 by Qi Yuan, Alejandro Santana-Bonilla, Martijn Zwijnenburg, Kim Jelfs

The chemical space of novel electronic donor-acceptor oligomers with targeted properties was explored using deep generative models and transfer learning. A General Recurrent Neural Network model was trained on the ChEMBL database to generate chemically valid SMILES strings. The parameters of the General Recurrent Neural Network were then fine-tuned via transfer learning, using the electronic donor-acceptor database from the Computational Material Repository, to generate novel donor-acceptor oligomers. Six transfer learning models were developed, each with a different subset of the donor-acceptor database as its training set. We concluded that electronic properties of the training sets, such as HOMO-LUMO gaps and dipole moments, can be learned from the SMILES representation with deep generative models, and that the chemical space of the training sets can be explored efficiently. This approach identified approximately 1700 new molecules with promising electronic properties (HOMO-LUMO gap < 2 eV and dipole moment < 2 Debye), six times more than in the original database. In terms of molecular transformations, the deep generative model learned to produce novel molecules by trading off selected atomic substitutions (such as halogenation or methylation) against molecular features such as the spatial extension of the oligomer. The method can be extended as a plausible source of new chemical combinations for exploring the chemical space of targeted properties.
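The screening criterion quoted above (HOMO-LUMO gap < 2 eV and dipole moment < 2 Debye) can be illustrated with a minimal Python sketch. This is not the authors' code: the `Candidate` record, the function names, and the example SMILES strings and property values are hypothetical, and novelty is checked by plain string comparison, which assumes the SMILES strings have already been canonicalized by some external tool.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Thresholds taken from the abstract; everything else here is illustrative.
MAX_GAP_EV = 2.0
MAX_DIPOLE_DEBYE = 2.0


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one generated molecule."""
    smiles: str          # assumed to be canonicalized beforehand
    gap_ev: float        # HOMO-LUMO gap in eV
    dipole_debye: float  # dipole moment in Debye


def is_promising(c: Candidate) -> bool:
    """True if both properties fall strictly below the thresholds."""
    return c.gap_ev < MAX_GAP_EV and c.dipole_debye < MAX_DIPOLE_DEBYE


def select_novel_promising(generated: Iterable[Candidate],
                           known_smiles: Iterable[str]) -> List[Candidate]:
    """Keep generated molecules that are absent from the reference set,
    not duplicated among themselves, and pass the property screen."""
    seen = set(known_smiles)
    selected = []
    for c in generated:
        if c.smiles in seen:
            continue  # already in the database or generated earlier
        seen.add(c.smiles)
        if is_promising(c):
            selected.append(c)
    return selected


# Made-up example data, not values from the paper.
known = ["c1ccsc1"]
generated = [
    Candidate("c1ccsc1", 1.5, 0.5),      # already known, skipped
    Candidate("Cc1ccsc1", 1.8, 1.2),     # novel and promising
    Candidate("Cc1ccsc1", 1.8, 1.2),     # duplicate, skipped
    Candidate("Fc1ccsc1", 2.4, 0.3),     # gap too large
    Candidate("Clc1ccsc1", 1.9, 2.0),    # dipole not strictly below 2
]
hits = select_novel_promising(generated, known)
```

In practice the property values would come from an electronic-structure calculation or a surrogate model, and the strict inequalities mirror the "<" signs in the abstract.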


European Research Council under FP7 (CoMMaD, ERC Grant No. 758370)

EPSRC; EP/M017257/1, EP/P005543/1 and EP/L000202/1

Royal Society, University Research Fellowship


Imperial College London


United Kingdom

Declaration of Conflict of Interest

No conflict of interest