Assessment of Fine-Tuned Large Language Models for Real-World Chemistry and Material Science Applications

31 July 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The current generation of large language models (LLMs), such as ChatGPT, has limited chemical knowledge. Recently, it has been shown that these LLMs can learn and predict chemical properties through fine-tuning. In this work, we explore the potential and limitations of this approach. We studied the performance of fine-tuned GPT-J-6B, a public-domain version of the GPT family, on a range of different chemical questions. We find that in most, if not all, cases, this approach outperforms the benchmark (random guessing) for a simple classification problem. Depending on the size of the dataset and the type of question, we can also address more sophisticated problems. The most important conclusions of this work are that, for all datasets considered, their conversion into an LLM fine-tuning training set is straightforward, and that fine-tuning with even relatively small datasets leads to predictive models. These results suggest that the systematic use of LLMs to guide experiments and simulations will be a powerful technique in any research study, significantly reducing unnecessary experiments or computations.
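As an illustration of what converting a dataset into a fine-tuning training set might look like, the following is a minimal sketch and not the authors' actual pipeline. The records, the question template, the field names `prompt` and `completion`, and all function names are hypothetical and chosen only for this example. The sketch turns labelled records into question/answer pairs, serializes them as JSON Lines, and computes the expected accuracy of uniform random guessing as a reference point.

```python
import json

# Hypothetical toy records; names and labels are illustrative, not data from the paper.
RECORDS = [
    {"name": "compound A", "label": "high"},
    {"name": "compound B", "label": "low"},
    {"name": "compound C", "label": "high"},
    {"name": "compound D", "label": "low"},
]

# Hypothetical question template for a simple binary classification task.
QUESTION = "Is the solubility of {name} high or low?"


def to_pair(record, template):
    """Turn one labelled record into a prompt/completion pair."""
    return {
        "prompt": template.format(name=record["name"]),
        "completion": record["label"],
    }


def build_training_set(records, template):
    """Convert every record into a prompt/completion pair."""
    return [to_pair(r, template) for r in records]


def to_jsonl(pairs):
    """Serialize the pairs as JSON Lines, one example per line."""
    return "\n".join(json.dumps(p) for p in pairs)


def random_guess_accuracy(labels):
    """Expected accuracy when guessing uniformly among the distinct classes."""
    return 1.0 / len(set(labels))


pairs = build_training_set(RECORDS, QUESTION)
jsonl_text = to_jsonl(pairs)
baseline = random_guess_accuracy([r["label"] for r in RECORDS])
```

A fine-tuned model would then be judged against `baseline`; for a two-class problem, uniform guessing is correct half of the time in expectation.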

Keywords

large language model

Supplementary materials

Title: Supporting Information
Description: Detailed report of all case studies reported in this work.
