Can reasoning power significantly improve the knowledge of large language models for chemistry? --Based on conversations with Deepseek and ChatGPT

Dong-xu Cui; Shi-yu Long; Qiao Li

doi:10.26434/chemrxiv-2025-85wpr-v3

Theoretical and Computational Chemistry

02 June 2025, Version 3

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

This study presents a comprehensive evaluation of reasoning-enhanced large language models (LLMs), DeepSeek R1 and OpenAI o4-mini, across six critical chemical tasks. By benchmarking these models against conventional LLMs and established computational tools, we systematically investigate the impact of reasoning capabilities on chemical cognition. Experimental results demonstrate that reasoning-enabled LLMs achieve significant performance improvements in foundational tasks, with DeepSeek R1 attaining 88.88% accuracy in SMILES-to-name conversions and 58% accuracy in point group identification, outperforming both OpenAI o4-mini (81.48% and 26%, respectively) and legacy models. However, domain-specific limitations persist: both models exhibit structural inaccuracies in CIF file generation (e.g., erroneous atomic connectivity) and struggle with ordered pattern synthesis in SEM simulations. Notably, while reasoning frameworks enhance logical coherence, they do not inherently resolve challenges in stereochemical assignments or rare symmetry group recognition. These findings underscore the necessity for domain-optimized training paradigms to bridge the gap between generic reasoning capabilities and specialized chemical applications.

Supplementary materials

Title: Supplementary materials

Description: This document provides a directory of the additional materials, together with the prompts used to generate the SEM images and the original images obtained.

Title: The crystal structure file generated by the large language model

Description: The crystal structure file generated by the large language model. Important disclaimer: all Crystallographic Information Files (.cif) within this directory are algorithmically generated by large language models (LLMs) and do not represent experimental data. Complete details are provided in the WARNING.txt file within the compressed archive.

Title: Additional data (S1-S4)

Description: This document provides detailed data for each evaluation task that is not given in the main text, corresponding to S1-S4 in the additional materials.

Supplementary weblinks

Title: The repository of the DeepSeek-R1 model on GitHub

Description: The source code of the DeepSeek-R1 model can be obtained from this repository.

Title: Introducing OpenAI o3 and o4-mini

Description: The article by OpenAI introducing the OpenAI o3 and o4-mini models.

Version History

Jun 02, 2025 Version 3

May 30, 2025 Version 2

May 22, 2025 Version 1

Version Notes

In this update, we have revised some of the main text and supplementary materials.

Metrics

Views: 834

Downloads: 187

License

The content is available under CC BY NC ND 4.0

Funding

Doctor Special Program Fund of Lanzhou University of Arts and Science (2020BSZX06)

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
