Evaluation of Reinforcement Learning in Transformer-based Molecular Design

Jiazhen He; Alessandro Tibo; Jon Paul Janet; Eva Nittinger; Christian Tyrchan; Werngard Czechtizky; Ola Engkvist

doi:10.26434/chemrxiv-2024-r9ljm-v2

Theoretical and Computational Chemistry

10 July 2024, Version 2

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Designing compounds with a range of desirable properties is a fundamental challenge in drug discovery. In pre-clinical early drug discovery, novel compounds are often designed by structurally modifying a promising existing starting compound to further optimize its properties. Recently, transformer-based deep learning models have been explored for molecular optimization by training on pairs of similar molecules. This provides a starting point for generating molecules similar to a given input molecule, but offers limited flexibility regarding user-defined property profiles. Here, we evaluate the effect of reinforcement learning on transformer-based molecular generative models. The generative model can be considered a pre-trained model with knowledge of the chemical space close to an input compound, while reinforcement learning can be viewed as a tuning phase that steers the model towards chemical space with user-specific desirable properties. The evaluation on two distinct tasks, molecular optimization and scaffold discovery, suggests that reinforcement learning could guide the transformer-based generative model towards generating more compounds of interest. Additionally, the impact of pre-trained models, learning steps and learning rates is investigated.

Scientific Contribution: Our study investigates the effect of reinforcement learning on a transformer-based generative model initially trained to generate molecules similar to starting molecules. The reinforcement learning framework is applied to facilitate multiparameter optimization of starting molecules. This approach allows more flexibility in optimizing user-specific property profiles and helps find more ideas of interest.
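The idea of reinforcement learning as a tuning phase on top of a pre-trained generative model can be illustrated with a small sketch. It assumes a REINVENT-style objective, in which the fine-tuned model (the "agent") is pulled towards an augmented log-likelihood equal to the pre-trained model's (the "prior's") log-likelihood plus a scaled property score. The abstract does not state which objective the authors use, so the function names, the `sigma` scaling parameter, and the squared-difference loss below are illustrative assumptions, not the paper's confirmed method.

```python
from typing import Sequence


def augmented_log_likelihood(prior_logp: float, score: float, sigma: float) -> float:
    """Target log-likelihood: the prior's log-likelihood raised by sigma * score.

    `score` is assumed to be a user-defined property score in [0, 1];
    `sigma` (hypothetical name) controls how strongly the score steers the agent.
    """
    return prior_logp + sigma * score


def rl_loss(
    prior_logps: Sequence[float],
    agent_logps: Sequence[float],
    scores: Sequence[float],
    sigma: float,
) -> float:
    """Mean squared gap between the augmented target and the agent log-likelihood.

    Each position in the three sequences refers to the same sampled molecule.
    """
    if not (len(prior_logps) == len(agent_logps) == len(scores)):
        raise ValueError("all inputs must have the same length")
    if len(scores) == 0:
        raise ValueError("batch must contain at least one molecule")

    total = 0.0
    for prior_logp, agent_logp, score in zip(prior_logps, agent_logps, scores):
        target = augmented_log_likelihood(prior_logp, score, sigma)
        total += (target - agent_logp) ** 2
    return total / len(scores)
```

Under this formulation, a molecule with a score of zero contributes no loss as long as the agent still agrees with the prior, whereas a high-scoring molecule pushes the agent to assign it a higher likelihood than the prior does. In practice the log-likelihoods would come from the transformer and the loss would be minimized by gradient descent; plain floats are used here only to keep the sketch self-contained.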

Keywords

transformer

reinforcement learning

generative model

molecular optimization

Supplementary materials

Title: Supplementary figures

Description: This file contains supplementary figures to the main manuscript.

Version History

Jul 10, 2024 Version 2

Mar 12, 2024 Version 1

Version Notes

Added more experiments

Metrics

Views: 1,273

Downloads: 732

License

The content is available under CC BY 4.0

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) declare that they have sought and gained approval from the relevant ethics committee/IRB for this research and its publication.