Can reasoning power significantly improve the knowledge of large language models for chemistry? --Based on conversations with Deepseek and ChatGPT

Dong-xu Cui; Shi-yu Long

doi:10.26434/chemrxiv-2025-85wpr

Theoretical and Computational Chemistry

22 May 2025, Version 1

This is not the most recent version. A newer version of this content is available.

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

This study presents a comprehensive evaluation of two reasoning-enhanced large language models (LLMs), DeepSeek R1 and OpenAI o4-mini, across six critical chemical tasks. By benchmarking these models against conventional LLMs and established computational tools, we systematically investigate the impact of reasoning capabilities on chemical cognition. Experimental results demonstrate that reasoning-enabled LLMs achieve significant performance improvements in foundational tasks, with DeepSeek R1 attaining 88.88% accuracy in SMILES-to-name conversions and 58% accuracy in point group identification, outperforming both OpenAI o4-mini (81.48% and 26%, respectively) and legacy models. However, domain-specific limitations persist: both models exhibit structural inaccuracies in CIF file generation (e.g., erroneous atomic connectivity) and struggle with ordered pattern synthesis in SEM simulations. Notably, while reasoning frameworks enhance logical coherence, they do not inherently resolve challenges in stereochemical assignments or rare symmetry group recognition. These findings underscore the necessity for domain-optimized training paradigms to bridge the gap between generic reasoning capabilities and specialized chemical applications.
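As an illustration of how an accuracy figure for a task such as SMILES-to-name conversion can be computed, the following is a minimal sketch. It assumes exact-match scoring after light normalization; the abstract does not state the scoring rule the authors used, and the function names (`normalize`, `accuracy`) and the toy data are hypothetical, not taken from the paper's benchmark.

```python
def normalize(name: str) -> str:
    """Lowercase and drop spaces and hyphens so trivial formatting
    differences are not counted as errors (an assumed convention)."""
    return "".join(ch for ch in name.lower() if ch not in " -")


def accuracy(predictions: dict[str, str], references: dict[str, str]) -> float:
    """Fraction of reference SMILES whose predicted name matches the
    reference name after normalization. Missing predictions count as wrong."""
    if not references:
        raise ValueError("references must not be empty")
    correct = sum(
        1
        for smiles, ref_name in references.items()
        if normalize(predictions.get(smiles, "")) == normalize(ref_name)
    )
    return correct / len(references)


# Hypothetical toy data for illustration only.
references = {"CCO": "ethanol", "c1ccccc1": "benzene", "CC(=O)O": "acetic acid"}
predictions = {"CCO": "Ethanol", "c1ccccc1": "benzene", "CC(=O)O": "ethanoic acid"}

score = accuracy(predictions, references)
print(f"{score:.2%}")  # 2 of 3 match under exact-match scoring
```

Note that exact matching marks "ethanoic acid" as wrong even though it is a valid synonym for acetic acid, so a real evaluation would likely need synonym handling or manual review.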

Supplementary materials

Title: Supplementary materials

Description: S1. Detailed output results of the SMILES code and chemical name conversion process. S2. Detailed output results of logP. S3. Prompts and corresponding raw images used during the image generation procedure.

Title: The crystal structure file generated by the large language model

Description: IMPORTANT DISCLAIMER: All Crystallographic Information Files (.cif) within this directory are algorithmically generated by large language models (LLMs) and do not represent experimental data. Complete details are provided in the WARNING.txt file within the compressed archive.
