An open-source framework for fast-yet-accurate calculation of quantum mechanical features

Eike Caldeweyher; Christoph Bauer; Ali Soltani Tehrani

doi:10.26434/chemrxiv-2021-8gthw

Theoretical and Computational Chemistry

06 December 2021, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

We present the open-source framework kallisto, which enables the efficient and robust calculation of quantum mechanical features for atoms and molecules. For a benchmark set of 49 experimental molecular polarizabilities, the predictive power of the presented method competes with second-order perturbation theory in a converged atomic-orbital basis set at a fraction of its computational cost. Robustness tests on a diverse validation set of more than 80,000 molecules show that the calculation of isotropic molecular polarizabilities has a low failure rate of only 0.3 %. Furthermore, we present a generally applicable van der Waals radius model that is rooted in atomic static polarizabilities. Efficiency tests show that such radii can be calculated even for small- to medium-sized proteins, where the largest system (SARS-CoV-2 spike protein) has 42,539 atoms. Following the work of Domingo-Almenara et al. [Domingo-Almenara et al., Nat. Commun., 2019, 10, 5811], we present computational predictions of retention times for different chromatographic methods and describe how physicochemical features improve the predictive power of machine-learning models that otherwise rely only on two-dimensional features such as molecular fingerprints. Additionally, we developed an internal benchmark set of experimental supercritical fluid chromatography retention times. For those methods, improvements of up to 17 % are obtained when combining molecular fingerprints with physicochemical descriptors. Shapley additive explanation values furthermore show that the physical nature of the applied features can be retained within the final machine-learning models. We generally recommend the kallisto framework as a robust, low-cost, and physically motivated featurizer for upcoming state-of-the-art machine-learning studies.
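The idea of combining molecular fingerprints with physicochemical descriptors can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not use the kallisto API: the helper names `combine_features` and `relative_improvement`, the toy fingerprint, and the descriptor values are all hypothetical. In particular, the abstract does not state how the reported improvement is measured, so the percentage-reduction definition below is an assumption.

```python
from typing import List, Sequence


def combine_features(fingerprint: Sequence[int],
                     descriptors: Sequence[float]) -> List[float]:
    """Concatenate 2D fingerprint bits with physicochemical descriptors
    into one feature vector (hypothetical helper, not part of kallisto)."""
    return [float(bit) for bit in fingerprint] + [float(d) for d in descriptors]


def relative_improvement(baseline_error: float, combined_error: float) -> float:
    """Percentage reduction of an error metric relative to a baseline.

    Assumed definition: the abstract does not say how the reported
    improvement is computed.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - combined_error) / baseline_error


# Invented toy inputs: a 4-bit fingerprint and two made-up descriptor values
# (for example a molecular polarizability and a van der Waals radius).
fingerprint = [1, 0, 0, 1]
descriptors = [10.5, 1.7]

# Feature vector a model would see when both feature types are combined.
features = combine_features(fingerprint, descriptors)
```

A model trained on `features` sees both the two-dimensional and the physicochemical information, and `relative_improvement` would then compare its error against a fingerprint-only baseline.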

Keywords

machine learning

quantum chemistry

pharmaceutical industry

artificial intelligence

Supplementary weblinks

Title: kallisto: A command-line interface to simplify computational modelling and the generation of atomic features

Description: Efficiently calculate 3D-atomic/molecular features for quantitative structure-activity relationship approaches.

Title: Benchmark set for static molecular polarizabilities

Description: The data of this repository has been extracted from the supporting information of Ref. (Thakkar, 2015). All structures have been optimized using density functional theory at the CAM-B3LYP-D3(B)/def2-TZVP level of theory.

Now Published

An open-source framework for fast-yet-accurate calculation of quantum mechanical features

Eike Caldeweyher, Christoph Bauer, Ali Soltani Tehrani (journal article)

Physical Chemistry Chemical Physics, Volume 24, Issue 17

Online publication date: 2022

License

The content is available under CC BY-NC-ND 4.0.

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) declare that they have sought and gained approval from the relevant ethics committee/IRB for this research and its publication.
