FAME.AL: Site-of-metabolism prediction with active learning

Ya Chen; Thomas Seidel; Roxane Axel Jacob; Steffen Hirte; Angelica Mazzolari; Alessandro Pedretti; Giulio Vistoli; Thierry Langer; Filip Miljković; Johannes Kirchmair

doi:10.26434/chemrxiv-2023-4dnf1

Biological and Medicinal Chemistry

03 October 2023, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The ability to determine and predict metabolically labile atom positions in a molecule (also called “sites of metabolism” or “SoMs”) is of high interest for the design and optimization of bioactive compounds such as drugs, agrochemicals, and cosmetics. In recent years, several in silico models for SoM prediction have become available, many of which include a machine-learning component. The bottleneck in the further development of these approaches is the coverage of distinct atom environments, and of rare and complex biotransformation events, with high-quality experimental data. In this context, active learning strategies could yield higher data efficiency and, in addition, guide experimentalists on which atom environments to investigate next for maximum information gain. Here we report the development and validation of FAME.AL, an active learning approach for SoM prediction that builds on the previously published FAst MEtabolizer (FAME 3). The active learning approach yielded competitive performance for phase 1 and phase 2 metabolism (Matthews correlation coefficients of approximately 0.50 on holdout data) while using only 20% of the training data used by classical modeling setups. Besides high performance and high data efficiency, the approach is also robust and fast: it is largely invariant to starting conditions and parameters, and substantial speed-ups can be achieved by using small atom batches rather than individual atoms during the iterative model-building process. The source code of FAME.AL is publicly available.
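The batch-wise loop described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the FAME.AL source code: the acquisition rule (uncertainty sampling, i.e. picking the atoms whose predicted SoM probability is closest to 0.5), the toy one-dimensional classifier, and the function names `mcc`, `select_batch`, `fit_midpoint`, `predict_proba`, and `active_learning`. Only the general idea (iteratively adding small batches of atoms to the training set and scoring with the Matthews correlation coefficient) comes from the text above.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = SoM, 0 = non-SoM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def select_batch(probabilities, unlabeled_indices, batch_size):
    """Assumed acquisition rule: take the atoms the model is least sure about."""
    ranked = sorted(unlabeled_indices, key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:batch_size]


def fit_midpoint(xs, ys):
    """Toy stand-in for a real classifier: threshold halfway between class means.

    Assumes one numeric feature per atom and that SoMs have the larger values.
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_proba(threshold, xs):
    """Logistic score of the signed distance to the threshold."""
    return [1 / (1 + math.exp(-(x - threshold))) for x in xs]


def active_learning(xs, ys, seed_indices, batch_size, n_rounds):
    """Grow the labeled set by `batch_size` atoms per round, refitting each time.

    `seed_indices` must contain at least one atom of each class.
    """
    labeled = list(seed_indices)
    for _ in range(n_rounds):
        threshold = fit_midpoint([xs[i] for i in labeled], [ys[i] for i in labeled])
        unlabeled = [i for i in range(len(xs)) if i not in labeled]
        if not unlabeled:
            break
        probs = predict_proba(threshold, xs)
        # "Annotating" an atom here just means revealing its known label.
        labeled.extend(select_batch(probs, unlabeled, batch_size))
    threshold = fit_midpoint([xs[i] for i in labeled], [ys[i] for i in labeled])
    return threshold, labeled
```

With `batch_size=1` the model is refit after every single atom; larger batches reduce the number of refits, which is the kind of speed-up the abstract attributes to small atom batches. In a realistic setting the toy model would be replaced by a proper classifier trained on atom descriptors, and the MCC would be computed on held-out data.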

Keywords

site-of-metabolism prediction

xenobiotic metabolism

drug metabolism

machine learning

active learning

Supplementary materials

Title: Supporting information for FAME.AL: Site-of-metabolism prediction with active learning

Description: Includes tables on (i) the 24 Sybyl atom types used by the CDPKit FAME descriptors and (ii) the CDPKit 2D descriptors and their CDK counterparts

Supplementary weblinks

Title: FAME.AL source code

Description: FAME.AL source code

Now Published

Active Learning Approach for Guiding Site-of-Metabolism Measurement and Annotation

Ya Chen, Thomas Seidel, Roxane Axel Jacob, Steffen Hirte, Angelica Mazzolari, Alessandro Pedretti, Giulio Vistoli, Thierry Langer, Filip Miljković, Johannes Kirchmair

Journal article: Journal of Chemical Information and Modeling, Volume 64, Issue 2

Online publication date: Jan 03, 2024

Metrics

Views: 858

Downloads: 426

License

The content is available under CC BY 4.0

Funding

Austrian Federal Ministry of Labour and Economy for Digital and Economic Affairs

National Foundation for Research, Technology and Development

Christian Doppler Research Association

Boehringer-Ingelheim RCV GmbH & Co KG

BASF SE

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
