ConfRank+: Extending Conformer Ranking to Charged Molecules

Rick Oerder; Christian Hölzer; Jan Hamaekers

doi:10.26434/chemrxiv-2025-xkwk6

Theoretical and Computational Chemistry

05 June 2025, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

We present a machine learning model for high-throughput energetic ranking of charged molecular conformers. Building on the ConfRank approach, the model is trained pairwise to predict the energy difference between two conformers. By conditioning the model on dataset embedding vectors, we can train it on two different reference levels simultaneously, which enlarges the training dataset and lets a single model emulate multiple reference methods. Specifically, we train on a large subset of the SPICE 2.0.1 dataset, with references at the ωB97M-D3(BJ)/def2-TZVPPD level (a range-separated hybrid meta-GGA density functional), and on a self-developed conformer dataset based on the GEOM dataset with r²SCAN-3c references. The result is a single multi-fidelity model that reproduces both reference levels to within ML-typical model errors for small and medium-sized molecules containing the following elements: H, Li, B, C, N, O, F, Na, Mg, Si, P, S, Cl, K, Ca, Br, I. By including partial atomic charges obtained from the electronegativity equilibration charge model, the model incorporates information about the charge distribution in a molecule, which allows it to treat charged closed-shell species and to account explicitly for electrostatic interactions. We test the ranking capability of the model on various datasets, paying special attention to molecular charges of -1, 0, and +1. Across all tests, we find our model to be as accurate as the current AIMNet2 and MACE-OFF23(L) models while requiring an order of magnitude fewer parameters, and to match the robustness of the state-of-the-art semi-empirical quantum method GFN2-xTB.
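The pairwise formulation described in the abstract can be illustrated with a minimal sketch. It is not the ConfRank+ loss function (that is documented in the Supporting Information); it only shows, under simple assumptions, why energy differences between conformers are a convenient training target when two reference levels are mixed: a constant offset in absolute energies cancels in every pair. The function names and the four energy values are hypothetical and made up for illustration.

```python
from itertools import combinations


def pairwise_differences(energies):
    """Map each conformer pair (i, j), i < j, to the difference E_j - E_i."""
    n = len(energies)
    return {(i, j): energies[j] - energies[i] for i, j in combinations(range(n), 2)}


def pairwise_mse(pred, ref):
    """Mean squared error over all pairwise energy differences (illustrative loss)."""
    d_pred = pairwise_differences(pred)
    d_ref = pairwise_differences(ref)
    return sum((d_pred[k] - d_ref[k]) ** 2 for k in d_ref) / len(d_ref)


def rank(energies):
    """Conformer indices ordered from lowest to highest energy."""
    return sorted(range(len(energies)), key=lambda i: energies[i])


def pair_order_accuracy(pred, ref):
    """Fraction of conformer pairs for which the predicted sign of the
    energy difference matches the reference sign."""
    d_pred = pairwise_differences(pred)
    d_ref = pairwise_differences(ref)
    agree = sum((d_pred[k] > 0) == (d_ref[k] > 0) for k in d_ref)
    return agree / len(d_ref)


# Made-up relative energies for four conformers of one molecule (arbitrary units).
ref = [0.0, 1.2, 0.4, 2.5]
# Hypothetical predictions on a different absolute scale (constant offset plus noise).
pred = [10.1, 11.0, 10.8, 12.4]

print(rank(ref), rank(pred))
print(pair_order_accuracy(pred, ref))
print(pairwise_mse(pred, ref))
```

Because only differences enter `pairwise_mse`, adding the same constant to every entry of `pred` leaves the value unchanged up to floating-point rounding, so the two reference levels need not share an absolute energy zero.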

Supplementary materials

Supporting Information: The Supporting Information contains additional material such as explanations of the loss function, statistical metrics, and tabular overviews of the test datasets.

Supplementary weblinks

ConfRank+ GitHub: The GitHub repository includes code for using the ConfRank+ model and loading the datasets provided on Zenodo.

Datasets: The data used for training and testing the ConfRank+ model can be found on Zenodo.

License

The content is available under CC BY NC 4.0

Funding

Deutsche Forschungsgemeinschaft

SPP 2363 on “Utilization and Development of Machine Learning for Molecular Applications – Molecular Machine Learning”

Deutsche Forschungsgemeinschaft

CRC 1639 “NuMeriQS” – project no. 511713970

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
