QSARtuna: an automated QSAR modelling platform for molecular property prediction in drug design

Lewis Mervin; Alexey Voronov; Mikhail Kabeshov; Ola Engkvist

doi:10.26434/chemrxiv-2024-2rlk7-v2

Theoretical and Computational Chemistry

27 March 2024, Version 2

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Machine-learning (ML) and deep-learning (DL) approaches for predicting the molecular properties of small molecules are increasingly deployed within the design-make-test-analyse (DMTA) drug design cycle. Despite this uptake, there are only a few automated packages to aid their development and deployment that also support uncertainty estimation, model explainability and other key aspects of model usage. This represents a key unmet need within the field, and the large number of molecular representations and algorithms (and associated parameters) makes it non-trivial to robustly optimise, evaluate, reproduce, and deploy models. Here we present QSARtuna, a molecular property prediction modelling pipeline, written in Python and utilising the Optuna, Scikit-learn, RDKit and ChemProp packages, which enables efficient and automated comparison between molecular representations and machine learning models. The platform was developed with the increasingly important aspects of model uncertainty quantification and explainability considered by design. We describe our framework and provide illustrative examples demonstrating the capability of the software when applied to simple molecular property prediction, reaction/reactivity prediction and DNA encoded library enrichment analyses. We hope that the release of QSARtuna will further spur innovation in automatic ML modelling and provide a platform for education in best practices in molecular property modelling. The code for the QSARtuna framework is made freely available via GitHub.
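The automated comparison of representations and models described in the abstract can be pictured with a small sketch. This is not QSARtuna's API. It uses only the Python standard library and swaps in plain random search where the platform uses Optuna. The descriptor functions (`char_counts`, `length_only`), the k-nearest-neighbour regressor, and the toy SMILES list are all hypothetical, and the target values are synthetic (a carbon count) rather than measured data. The point is only that the representation choice and the model hyperparameter are searched jointly and scored by cross-validation.

```python
import math
import random

# Toy SMILES strings. The target is synthetic (carbon count), not measured data.
SMILES = ["C", "CC", "CCC", "CCO", "CCN", "CCCC", "CCCO", "CCOC",
          "CCCN", "CC(C)C", "CC(=O)O", "c1ccccc1"]
TARGET = [float(s.count("C") + s.count("c")) for s in SMILES]

ALPHABET = sorted(set("".join(SMILES)))


def char_counts(smiles):
    """Hypothetical descriptor: count of each character in the SMILES string."""
    return [float(smiles.count(ch)) for ch in ALPHABET]


def length_only(smiles):
    """Hypothetical descriptor: the string length as a single feature."""
    return [float(len(smiles))]


REPRESENTATIONS = {"char_counts": char_counts, "length_only": length_only}


def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the targets of the k nearest training points."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def loo_rmse(config):
    """Leave-one-out RMSE for one (representation, k) configuration."""
    featurise = REPRESENTATIONS[config["representation"]]
    xs = [featurise(s) for s in SMILES]
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = TARGET[:i] + TARGET[i + 1:]
        pred = knn_predict(train_x, train_y, xs[i], config["k"])
        squared_errors.append((pred - TARGET[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def random_search(n_trials=20, seed=0):
    """Sample configurations at random and keep the one with the lowest RMSE."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = {
            "representation": rng.choice(sorted(REPRESENTATIONS)),
            "k": rng.randint(1, 4),
        }
        score = loo_rmse(config)
        if best is None or score < best["rmse"]:
            best = {"config": config, "rmse": score}
    return best


if __name__ == "__main__":
    print(random_search())
```

In a real workflow the descriptors would come from a cheminformatics toolkit, the models from an ML library, and the sampler from a dedicated optimiser; the structure of the loop (propose a configuration, cross-validate, keep the best) is the part this sketch is meant to convey.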

Keywords

Machine Learning

Artificial Intelligence

Hyperparameter Optimisation

Supplementary weblinks

Title

QSARtuna: QSAR using Optimization for Hyper-parameter Tuning

Description

Build predictive models for CompChem with hyper-parameters optimized by Optuna.

Now Published

QSARtuna: An Automated QSAR Modeling Platform for Molecular Property Prediction in Drug Design

Lewis Mervin, Alexey Voronov, Mikhail Kabeshov, Ola Engkvist (journal article)

Journal of Chemical Information and Modeling

Online publication date: Jul 01, 2024

Version History

Mar 27, 2024 Version 2

Feb 16, 2024 Version 1

Version Notes

Qptuna has been renamed to QSARtuna. The manuscript has been updated to reflect this change, with improvements to brevity and an added reference to data availability.

Metrics

Views

2,701

Downloads

1,824

License

The content is available under CC BY NC ND 4.0

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
