Abstract
Scientific modeling often requires navigating a trade-off between physical interpretability and empirical accuracy, a task that can take weeks of iteration, especially in systems with partial observability, structural complexity, and experimental errors. Here, we show how a state-of-the-art agentic reasoning-and-coding large language model, OpenAI o3, autonomously solved a modeling challenge in surface chemistry that had puzzled us for months: quantifying the competitive adsorption of carboxylic acids on metal–organic layers. Given experimental data and a concise problem formulation, o3 rapidly formulated a physically grounded adsorption model, derived the mathematical equations, implemented the corresponding code to fit the experimental data, revised its assumptions, and ultimately derived a three-parameter competitive adsorption model that matched experimental data across more than a dozen tested molecules. The resulting model is simple, mechanistically transparent, and quantitatively robust, and it incorporates both classical Langmuir competition and structural constraints such as site accessibility. Beyond addressing this particular challenge, our findings highlight a shift in scientific methodology: from manual trial-and-error approaches to AI-driven hypothesis generation and model refinement. This represents a new paradigm in research, in which language models go beyond the traditional roles of machine learning in data analysis and computational support and actively participate in scientific reasoning and hypothesis development.
Supplementary materials
Title
Supporting Information for "Data to Physics in Minutes: An Agentic Large Language Model Solves a Competitive Adsorption Puzzle"
Description
Experimental Details and Failed Fitting Trials