Leveraging Multitask Learning to Improve the Transferability of Machine Learned Force Fields

27 September 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Transferable neural network potentials have shown great promise as an avenue to increase the accuracy and applicability of existing atomistic force fields for organic molecules and inorganic materials. The training sets used to develop transferable potentials are very large, typically millions of examples, and are therefore restricted to relatively inexpensive levels of ab initio theory, such as density functional theory in a double- or triple-zeta quality basis set, that are subject to significant errors. It has previously been demonstrated using transfer learning that a model trained on a large dataset of such inexpensive calculations can be re-trained to reproduce energies at a higher level of theory using a much smaller dataset. Here, we show that, more generally, one can use hard parameter sharing to train to multiple levels of theory simultaneously. We demonstrate that simultaneously training to two levels of theory is an alternative to freezing layers in a neural network and re-training. Further, we show that training to multiple levels of theory can improve the overall performance of all predictions, and that knowledge about a chemical domain present in only one of the datasets can be transferred to all predicted levels of theory. This methodology is one way in which multiple, incompatible datasets can be combined to train a transferable model, increasing the accuracy and domain of applicability of machine learning force fields.
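
As an illustration of the hard parameter sharing described in the abstract, the following is a minimal sketch and not the authors' model: it uses a toy NumPy network with one shared hidden layer and one linear output head per level of theory, trained with a masked loss so that samples labeled at only one level still update the shared weights. All names (init_params, forward, masked_mse, train_step), the synthetic data, and the hyperparameters are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_in, n_hidden, n_levels):
    """Shared hidden layer plus one linear output head per level of theory."""
    return {
        "W_shared": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b_shared": np.zeros(n_hidden),
        "W_heads": rng.normal(0.0, 0.1, (n_levels, n_hidden)),
        "b_heads": np.zeros(n_levels),
    }

def forward(params, x):
    # Shared representation used by every head (the "hard-shared" parameters).
    h = np.tanh(x @ params["W_shared"] + params["b_shared"])   # (N, H)
    # One scalar prediction per level of theory.
    y = h @ params["W_heads"].T + params["b_heads"]            # (N, L)
    return h, y

def masked_mse(y_pred, y_true, mask):
    # Only entries with mask == 1 (labels that exist) contribute to the loss.
    return float((((y_pred - y_true) * mask) ** 2).sum() / mask.sum())

def train_step(params, x, y_true, mask, lr):
    # One full-batch gradient-descent step with hand-written backpropagation.
    h, y = forward(params, x)
    dy = 2.0 * (y - y_true) * mask / mask.sum()                # dLoss/dy
    dh = dy @ params["W_heads"]                                # back through heads
    dz = dh * (1.0 - h ** 2)                                   # back through tanh
    grads = {
        "W_shared": x.T @ dz,
        "b_shared": dz.sum(axis=0),
        "W_heads": dy.T @ h,
        "b_heads": dy.sum(axis=0),
    }
    for name in params:
        params[name] -= lr * grads[name]

# Synthetic stand-in data (assumption): 200 samples with 4 features each.
x = rng.normal(size=(200, 4))
low_level = np.sin(x).sum(axis=1)          # plentiful "inexpensive" labels
high_level = low_level + 0.1 * x[:, 0]     # scarce "expensive" labels
y_true = np.stack([low_level, high_level], axis=1)

# Every sample has a low-level label; only the first 50 have a high-level one.
mask = np.ones_like(y_true)
mask[50:, 1] = 0.0
y_true = y_true * mask                     # zero out the missing labels

params = init_params(n_in=4, n_hidden=16, n_levels=2)
initial_loss = masked_mse(forward(params, x)[1], y_true, mask)
for _ in range(500):
    train_step(params, x, y_true, mask, lr=0.02)
pred = forward(params, x)[1]
final_loss = masked_mse(pred, y_true, mask)
```

In this sketch both heads are trained at the same time on the same shared layer, in contrast to the freeze-and-retrain route, where the network is first fit to one level of theory and only some layers are then re-trained on the other.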

Keywords

machine learning
force fields
transfer learning
multitask learning
interatomic potentials
HDNNP

Supplementary materials

Title: supplementary data
Description: All geometries used to analyze error statistics in the main text, with energy labels for the trained models and reference energies, where appropriate, are given in supplementary data.tar.gz. Each test set is given as a separate JSON file, and a text file titled README.txt describes the contents. Original references for each test set are given.
