Theoretical and Computational Chemistry

L-MolGAN: An improved implicit generative model for large molecular graphs

Satoru Hiwa, Doshisha University


Deep generative models are used to generate arbitrary molecular structures with desired chemical properties. MolGAN is a well-known molecular generation model that uses generative adversarial networks (GANs) and reinforcement learning to generate molecular graphs in one shot. MolGAN can effectively generate small molecular graphs with nine or fewer heavy atoms. However, the graphs tend to become disconnected as the molecular size increases. This poses a challenge for drug discovery and material design, where the candidates may include large molecules. This study develops an improved MolGAN for large molecule generation (L-MolGAN). In this model, the connectivity of each molecular graph is evaluated by a depth-first search during training. When a disconnected molecular graph is generated, L-MolGAN assigns the graph a reward of zero. This procedure decreases the number of disconnected graphs and consequently increases the number of connected molecular graphs. The effectiveness of L-MolGAN is evaluated experimentally. The size and connectivity of the molecular graphs generated from models trained on the ZINC-250k molecular dataset are assessed using MolGAN as the baseline model. The model is then optimized for the quantitative estimate of drug-likeness (QED) to generate drug-like molecules. The experimental results indicate that the connectivity measure of the generated molecular graphs improved by 1.96 compared with the baseline model at a larger maximum molecular size of 20 atoms. The molecules generated by L-MolGAN are evaluated in terms of multiple chemical properties that are important in drug design: QED, synthetic accessibility, and the log octanol–water partition coefficient. The results confirm that L-MolGAN can generate a variety of drug-like molecules despite being optimized for a single property, i.e., QED. This method will contribute to the efficient discovery of new molecules larger than those generated with the existing method.
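
The connectivity check described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function names `is_connected` and `connectivity_gated_reward`, the plain nested-list adjacency format, and the assumption that padding ("no atom") nodes have already been removed are all hypothetical choices made for illustration. The sketch runs a depth-first search from the first atom and returns a reward of zero when some atom is unreachable.

```python
def is_connected(adjacency):
    """Return True if the undirected graph is connected.

    `adjacency` is assumed to be a square nested list in which a
    non-zero entry at [i][j] means atoms i and j share a bond.
    Padding nodes are assumed to have been removed beforehand.
    """
    n = len(adjacency)
    if n == 0:
        return False  # assumption: an empty graph is treated as invalid

    visited = [False] * n
    visited[0] = True
    stack = [0]  # iterative depth-first search starting from atom 0
    while stack:
        node = stack.pop()
        for neighbor, bond in enumerate(adjacency[node]):
            if bond and not visited[neighbor]:
                visited[neighbor] = True
                stack.append(neighbor)
    return all(visited)


def connectivity_gated_reward(adjacency, property_score):
    """Pass the property score through only for connected graphs.

    `property_score` stands in for any reward term (for example QED);
    disconnected graphs receive a reward of zero.
    """
    return property_score if is_connected(adjacency) else 0.0


# A three-atom chain (0-1-2) is connected.
chain = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Two fragments (0-1 and 2-3) form a disconnected graph.
fragments = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

chain_reward = connectivity_gated_reward(chain, 0.8)
fragment_reward = connectivity_gated_reward(fragments, 0.8)
```

In this sketch the chain keeps its score of 0.8, while the fragmented graph is scored 0.0, so a reinforcement learning objective built on this reward would receive no credit for disconnected outputs.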


