ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks

preprint
revised on 22.02.2019 and posted on 22.02.2019 by Mahendra Awale, Finton Sirockin, Nikolaus Stiefl, Jean-Louis Reymond

Several recent reports have shown that long short-term memory (LSTM) generative neural networks of the type used for grammar learning efficiently learn to write SMILES of drug-like compounds when trained on SMILES from a database of bioactive compounds such as ChEMBL, and can subsequently produce focused sets upon transfer learning with compounds of specific bioactivity profiles. Here we trained an LSTM using molecules taken from ChEMBL, DrugBank, commercially available fragments, or FDB-17 (a database of fragments of up to 17 atoms), and performed transfer learning to a single known drug to obtain new analogs of that drug. We found that this approach readily generates hundreds of relevant and diverse new drug analogs and works best with training sets of around 40,000 compounds, which can be as simple as commercial fragments. These data suggest that fragment-based LSTMs offer a promising method for new molecule generation.
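
As an illustration of the data preparation that such a SMILES language model relies on, the following minimal sketch tokenizes SMILES strings and builds next-token training pairs, first for a general training set and then for a single drug used in transfer learning. The tokenization scheme, the start/end markers, the function names, and the toy molecules are all assumptions made for this example; the abstract does not specify how the authors encoded their SMILES, and no network training is shown here.

```python
import re

# Assumed tokenization: bracket atoms, two-letter halogens, two-digit ring
# closures (%NN), then any other single character.
_TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

START, END = "^", "$"  # hypothetical sequence markers, not from the paper


def tokenize(smiles):
    """Split a SMILES string into tokens."""
    return _TOKEN_RE.findall(smiles)


def build_vocab(smiles_list):
    """Map every token seen in the data (plus markers) to an integer id."""
    tokens = {START, END}
    for smiles in smiles_list:
        tokens.update(tokenize(smiles))
    return {tok: i for i, tok in enumerate(sorted(tokens))}


def next_token_pairs(smiles, vocab):
    """Return (inputs, targets) where targets are inputs shifted by one token."""
    seq = [START] + tokenize(smiles) + [END]
    ids = [vocab[tok] for tok in seq]
    return ids[:-1], ids[1:]


# Toy stand-ins: a general training set and a single drug for transfer learning.
pretrain_set = ["CCO", "c1ccccc1", "CC(=O)Cl"]
fine_tune_set = ["CC(=O)Oc1ccccc1C(=O)O"]  # aspirin

vocab = build_vocab(pretrain_set + fine_tune_set)
x, y = next_token_pairs(fine_tune_set[0], vocab)
```

In a setup like this, the same vocabulary would be shared between the initial training phase and the transfer-learning phase, so that a model trained on the large set can continue training on the single-drug pairs without re-encoding.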

Email Address of Submitting Author

awale@dcb.unibe.ch

Institution

University of Bern, Novartis Institutes for Biomedical Research

Country

Switzerland

ORCID For Submitting Author

0000-0002-0611-6552

Declaration of Conflict of Interest

None