These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

Deep Reinforcement Learning for Multiparameter Optimization in de novo Drug Design

revised on 29.04.2019, 18:44 and posted on 30.04.2019, 15:07 by Niclas Ståhl, Göran Falkman, Alexander Karlsson, Gunnar Mathiason, Jonas Boström

In medicinal chemistry programs it is key to design and make compounds that are efficacious and safe. This is a long, complex and difficult multi-parameter optimization process, often involving several properties with orthogonal trends. New methods for the automated design of compounds against profiles of multiple properties are thus of great value. Here we present a fragment-based reinforcement learning approach, based on an actor-critic model, for the generation of novel molecules with optimal properties. The actor and the critic are both modelled with bidirectional long short-term memory (LSTM) networks. The AI method learns how to generate new compounds with desired properties by starting from an initial set of lead molecules and then improving these by replacing some of their fragments. A balanced binary tree based on the similarity of fragments is used in the generative process to bias the output towards structurally similar molecules. The method is demonstrated in a case study showing that 93% of the generated molecules are chemically valid and that a third satisfy the targeted objectives, whereas none of the molecules in the initial set did.
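The idea of a balanced binary tree built on fragment similarity can be illustrated with a small sketch. The abstract does not specify how the tree is built or which similarity measure is used, so everything below is an assumption made for illustration: the toy feature sets standing in for fingerprints, the Tanimoto similarity on those sets, the greedy nearest-neighbour ordering, and the helper names `tanimoto`, `order_by_similarity` and `assign_codes`. The point it shows is only that similar fragments can end up with binary codes sharing a long prefix, so changing the trailing bits of a code swaps a fragment for a similar one.

```python
from typing import Dict, FrozenSet, List

# Hypothetical stand-in for a fragment fingerprint: a set of feature labels.
Fingerprint = FrozenSet[str]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def order_by_similarity(fps: Dict[str, Fingerprint]) -> List[str]:
    """Greedy chain: repeatedly append the fragment most similar to the last one.

    This ordering heuristic is an assumption of this sketch, not the
    construction used in the paper.
    """
    if not fps:
        raise ValueError("need at least one fragment")
    remaining = sorted(fps)  # deterministic starting point
    ordered = [remaining.pop(0)]
    while remaining:
        last = ordered[-1]
        nearest = max(remaining, key=lambda name: tanimoto(fps[last], fps[name]))
        remaining.remove(nearest)
        ordered.append(nearest)
    return ordered


def assign_codes(ordered: List[str], prefix: str = "") -> Dict[str, str]:
    """Split the ordered list in half recursively; each leaf's path is its code."""
    if len(ordered) == 1:
        return {ordered[0]: prefix}
    mid = (len(ordered) + 1) // 2
    codes = assign_codes(ordered[:mid], prefix + "0")
    codes.update(assign_codes(ordered[mid:], prefix + "1"))
    return codes


# Toy fragments with made-up feature sets (not real chemical fingerprints).
toy_fragments: Dict[str, Fingerprint] = {
    "benzene": frozenset({"aromatic", "ring6"}),
    "pyridine": frozenset({"aromatic", "ring6", "N"}),
    "cyclohexane": frozenset({"aliphatic", "ring6"}),
    "piperidine": frozenset({"aliphatic", "ring6", "N"}),
}

codes = assign_codes(order_by_similarity(toy_fragments))
for name, code in sorted(codes.items(), key=lambda item: item[1]):
    print(code, name)
```

In this toy example the two aromatic rings share the leading bit, as do the two aliphatic rings, so a model that only alters the last bit of a code would stay within the more similar pair. A real implementation would presumably use chemical fingerprints and a more careful clustering than this greedy chain.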


University of Skövde



Declaration of Conflict of Interest

No conflict of interest