A Fragment Based Approach Towards Curating, Comparing and Developing Machine Learning Models Applied in Photo-chemistry

30 April 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The development of graph neural networks for predicting molecular properties has gained a great deal of attention, as it typically allows quick-to-compute atomic and bond descriptors to be correlated with overall molecular properties. With the rising interest in photochemistry and photocatalysis as sustainable alternatives to thermal reactions, the curation of virtual databases of computed photophysical properties for training machine learning models has become popular. Unfortunately, current efforts fail to consider exciton localization onto different chromophores of the same molecule, leading to potentially large prediction errors. Here we describe a molecular fragmentation strategy that can be used to overcome this limitation, while also providing a way to compare structural diversity between different libraries. Using a newly generated database of 46,432 adiabatic S0-T1 energy gaps (ALFAST-DB), we compare its diversity against two datasets from the literature and demonstrate that a fragment-based delta learning approach improves model generalizability while achieving accuracies matching those of traditional message passing graph neural network (MPGNN) architectures.
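
To make the idea of fragment-based delta learning more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the triplet exciton localizes on the fragment with the smallest S0-T1 gap, uses that fragment's gap as a baseline, and learns only the residual between baseline and whole-molecule value. The function names, the constant correction (standing in for a trained model such as a graph neural network), and all numerical values are hypothetical.

```python
from statistics import mean


def baseline_gap(fragment_gaps):
    """Baseline estimate of the molecular S0-T1 gap.

    Assumption (not from the source): the exciton localizes on the
    fragment with the smallest gap, so the minimum is used.
    """
    return min(fragment_gaps)


def fit_delta(training_set):
    """Fit the delta correction as the mean residual.

    training_set: list of (fragment_gaps, molecular_gap) pairs.
    A constant correction is a stand-in for a learned model.
    """
    return mean(target - baseline_gap(frags) for frags, target in training_set)


def predict(fragment_gaps, delta):
    """Delta-learning prediction: baseline plus learned correction."""
    return baseline_gap(fragment_gaps) + delta


# Hypothetical gaps in eV, for illustration only.
train = [([3.0, 2.5], 2.4), ([3.5, 3.1], 3.0)]
delta = fit_delta(train)
print(round(predict([2.8, 3.2], delta), 3))
```

In this sketch the model only has to learn a small correction to a physically motivated baseline, which is the general motivation behind delta learning; how fragments are generated and how the correction is modeled are left open here.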

Keywords

Singlet-Triplet Gap
Neural Network
Message Passing
Exciton Localization
Database

Supplementary materials

Supplemental Information: Supplemental data and figures
