A Data-Science Approach to Experimental Catalyst Discovery: Integrating Exploration, Exploitation, and Serendipity

Sunao Nakanowatari; Keisuke Takahashi; Dam Hieu-Chi; Toshiaki Taniike

doi:10.26434/chemrxiv-2024-5hzqf

Catalysis

16 December 2024, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Predicting the performance of heterogeneous catalysts is difficult because it involves complex interactions and unknown elementary reactions; hence, traditional catalyst development relies on trial and error. Machine learning offers a structured approach to address these issues. However, this approach is limited by challenges such as descriptor design, sparse data, and context-dependent interactions. In this study, two machine learning systems were developed to address these challenges in catalyst discovery: a recommender system that balances exploration and exploitation, and a "serendipiter" that detects unexpected discoveries. These systems were tested on the oxidative coupling of methane, and the results demonstrated a promising improvement in the efficiency of catalyst discovery. The recommender, based on evidence theory, uses binary combinations of catalyst components as descriptors to predict performance. It handles incomplete data by quantifying contradictions and uncertainty, facilitating a balance between exploration (testing unevidenced catalysts) and exploitation (refining known high-performing ones). The recommender efficiently identified a diverse range of high-performing catalysts through adaptive sampling with 160 catalysts. The serendipiter, a meta-learner, identifies unexpected high-performing catalysts by leveraging different machine learning models. It increased the occurrence of serendipitous discoveries to 50%, compared to 3% with the recommender alone. In summary, these systems improve the efficiency and reproducibility of catalyst discovery by balancing exploitation, exploration, and serendipity.
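The abstract only summarizes how the recommender works, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes a two-outcome frame (high or low performance), treats each binary combination of components as one piece of evidence, and merges the pieces with Dempster's rule of combination, a standard operation in evidence theory. The element names, the mass values, and the helper names `combine` and `score_catalyst` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Focal elements of the assumed frame {high, low}; UNKNOWN is the mass
# assigned to the whole frame, i.e. "no evidence either way".
HIGH, LOW, UNKNOWN = "high", "low", "unknown"
VACUOUS = {HIGH: 0.0, LOW: 0.0, UNKNOWN: 1.0}


def combine(m1, m2):
    """Dempster's rule for two mass functions; returns (combined, conflict)."""
    # Conflict is the mass that the two sources place on contradictory outcomes.
    conflict = m1[HIGH] * m2[LOW] + m1[LOW] * m2[HIGH]
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence cannot be combined")
    norm = 1.0 - conflict
    combined = {
        HIGH: (m1[HIGH] * m2[HIGH] + m1[HIGH] * m2[UNKNOWN]
               + m1[UNKNOWN] * m2[HIGH]) / norm,
        LOW: (m1[LOW] * m2[LOW] + m1[LOW] * m2[UNKNOWN]
              + m1[UNKNOWN] * m2[LOW]) / norm,
        UNKNOWN: m1[UNKNOWN] * m2[UNKNOWN] / norm,
    }
    return combined, conflict


def score_catalyst(components, pair_evidence):
    """Combine the evidence attached to every binary combination of components."""
    belief = dict(VACUOUS)
    conflicts = []
    for pair in combinations(sorted(components), 2):
        # Pairs that were never tested contribute vacuous (uninformative) mass.
        mass = pair_evidence.get(pair, VACUOUS)
        belief, conflict = combine(belief, mass)
        conflicts.append(conflict)
    return belief, conflicts


# Hypothetical evidence for two component pairs (values are made up).
pair_evidence = {
    ("La", "Mn"): {HIGH: 0.6, LOW: 0.1, UNKNOWN: 0.3},
    ("Mn", "Na"): {HIGH: 0.2, LOW: 0.5, UNKNOWN: 0.3},
}

belief, conflicts = score_catalyst(["Mn", "Na", "La"], pair_evidence)
unseen_belief, _ = score_catalyst(["K", "W", "Ti"], pair_evidence)
```

In this sketch, a large `UNKNOWN` mass marks a catalyst with little supporting evidence (a candidate for exploration), a large `HIGH` mass marks one worth refining (exploitation), and a large conflict value flags contradictory evidence among its component pairs.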

Keywords

Adaptive sampling

Serendipity

Evidence theory

High-throughput experimentation

Oxidative coupling of methane

Supplementary materials

Supporting information

This file contains supporting information to help readers understand the details of the manuscript.

License

The content is available under CC BY NC ND 4.0

Funding

JST-Mirai Program

JPMJMI22G4

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
