DrugEx v2: De Novo Design of Drug Molecule by Pareto-based Multi-Objective Reinforcement Learning in Polypharmacology

Xuhan Liu; Kai Ye; Herman Van Vlijmen; Michael T. M. Emmerich; Adriaan P. IJzerman; Gerard van Westen

doi:10.26434/chemrxiv.14474127.v1

Theoretical and Computational Chemistry

26 April 2021, Version 1

This is not the most recent version. There is a newer version of this content available.

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

In polypharmacology, ideal drugs are required to bind to multiple specific targets to enhance efficacy or to reduce resistance formation. Although deep learning has achieved breakthroughs in drug discovery, most of its applications focus on a single drug target when generating drug-like active molecules, even though drug molecules often interact with more than one target, which can have desired (polypharmacology) or undesired (toxicity) effects. In a previous study we proposed a method named DrugEx that integrates an exploration strategy into RNN-based reinforcement learning to improve the diversity of the generated molecules. Here, we extend the DrugEx algorithm with multi-objective optimization to generate drug molecules towards more than one specific target (in this study, two adenosine receptors, A₁AR and A₂AAR, and the potassium ion channel hERG). In our model, an RNN serves as the agent and machine learning predictors serve as the environment; both are pre-trained and then interact under the reinforcement learning framework. The concept of evolutionary algorithms is merged into our method such that crossover and mutation operations are implemented by the same deep learning model as the agent. During the training loop, the agent generates a batch of SMILES-based molecules. Subsequently, the scores for all objectives provided by the environment are used to construct Pareto ranks of the generated molecules with non-dominated sorting and a Tanimoto-based crowding distance algorithm. We use GPU acceleration to speed up the Pareto optimization. The final reward of each molecule is calculated from its Pareto rank with the ranking selection algorithm. The agent is trained under the guidance of this reward so that it generates more of the desired molecules once training has converged. All in all, we demonstrate the generation of compounds with diverse predicted selectivity profiles toward multiple targets, offering the potential of high efficacy and lower toxicity.
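To make the ranking step in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes every objective score is oriented so that higher is better, represents fingerprints as plain Python sets of feature indices, uses the mean Tanimoto distance within a front as a simplified stand-in for the crowding distance, and maps the final ordering to a linear reward in [0, 1]. The function names (`dominates`, `non_dominated_sort`, `tanimoto_distance`, `pareto_rewards`) and the linear reward are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Set


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_sort(scores: List[Sequence[float]]) -> List[List[int]]:
    """Split molecule indices into Pareto fronts; front 0 is non-dominated."""
    remaining = set(range(len(scores)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def tanimoto_distance(a: Set[int], b: Set[int]) -> float:
    """1 - |A & B| / |A | B| for fingerprints given as sets of feature indices."""
    union = len(a | b)
    return 1.0 - (len(a & b) / union if union else 1.0)


def pareto_rewards(scores: List[Sequence[float]],
                   fingerprints: List[Set[int]]) -> List[float]:
    """Order molecules worst-to-best and assign a linear rank reward (assumption)."""
    ordered: List[int] = []
    for front in reversed(non_dominated_sort(scores)):  # worst front first
        def isolation(i: int) -> float:
            # Simplified crowding: mean Tanimoto distance to the rest of the front.
            others = [j for j in front if j != i]
            if not others:
                return 0.0
            return sum(tanimoto_distance(fingerprints[i], fingerprints[j])
                       for j in others) / len(others)
        # Within a front, more isolated (more diverse) molecules rank higher.
        ordered.extend(sorted(front, key=isolation))
    n = len(ordered)
    rewards = [0.0] * n
    for position, idx in enumerate(ordered):
        rewards[idx] = position / (n - 1) if n > 1 else 1.0
    return rewards


# Hypothetical scores for four molecules on two objectives.
scores = [(0.9, 0.9), (0.8, 0.2), (0.2, 0.8), (0.1, 0.1)]
fingerprints = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2}]
fronts = non_dominated_sort(scores)
rewards = pareto_rewards(scores, fingerprints)
```

In this toy batch the first molecule dominates the others and receives the highest reward, while the last molecule is dominated by all and receives the lowest. An objective to be minimized, such as predicted hERG binding, would have to be negated or otherwise inverted before being passed in.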

Keywords

deep learning

Adenosine Receptors Novel

cheminformatics

reinforcement learning

multi-objective optimization

Exploration strategy

Supplementary materials

fig 1

fig 2

fig 3

fig 4AB

fig 4CE

fig 5

fig 6

fig 7A

fig 7B

Version History

Apr 27, 2021 Version 2

Apr 26, 2021 Version 1

Version Notes

pre-print version to Journal of Cheminformatics v1.0

Metrics

Views: 2,534

Downloads: 1,021

License

The content is available under CC BY-NC-ND 4.0

Funding

Dutch Scientific Council (NWO) Applied and Engineering Sciences (AES) VENI #14410

CSC scholarship

Author’s competing interest statement

no competing interests
