Data-driven Discovery of Polar Organic Cocrystals: Integration of Machine Learning and Automated Screening

20 February 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Polar organic cocrystals hold great promise for advanced technological applications. However, their relatively low prevalence reflects the difficulty of achieving the required polar packing arrangements, which makes their discovery complex and demanding. Here, we present a data-driven approach that integrates machine learning (ML) with high-throughput (HT) automation to accelerate the discovery of polar organic cocrystals. Using ML methodologies, we identified key parameters governing polar cocrystal formation, enabling the targeted selection of molecular candidates. A total of 20 cocrystal combinations with chloranilic acid (CA) were explored, with 20 solvent systems screened for each combination, enabling an efficient selection process across a large chemical space. HT automation further streamlined synthesis and characterization through rapid screening and accurate structural validation. Experimental validation yielded 16 new hydrogen-bonded cocrystals, 8 of them in polar space groups, giving a polar cocrystal discovery rate (50%) more than three times the Cambridge Structural Database (CSD) average (~14%). This integrated strategy represents a new approach to polar organic cocrystal research. The findings highlight its potential for advancing functional molecular materials, paving the way for next-generation applications using polar organic cocrystals.
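
The figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the authors' code: the function and variable names are hypothetical, and the total number of screening conditions assumes that every one of the 20 combinations was paired with every one of the 20 solvent systems exactly once, which the abstract implies but does not state outright.

```python
# Illustrative arithmetic only; names are hypothetical, values are from the abstract.

N_COMBINATIONS = 20      # cocrystal combinations with chloranilic acid (CA)
N_SOLVENT_SYSTEMS = 20   # solvent systems screened per combination
N_NEW_COCRYSTALS = 16    # new hydrogen-bonded cocrystals obtained
N_POLAR = 8              # of those, crystallizing in polar space groups
CSD_POLAR_RATE = 0.14    # approximate CSD average quoted in the abstract (~14%)


def screening_conditions(combinations: int, solvents: int) -> int:
    """Size of a full combination-by-solvent grid (assumes a complete grid)."""
    return combinations * solvents


def discovery_rate(polar: int, total: int) -> float:
    """Fraction of obtained cocrystals that are polar."""
    if total <= 0:
        raise ValueError("total must be positive")
    return polar / total


total_conditions = screening_conditions(N_COMBINATIONS, N_SOLVENT_SYSTEMS)
rate = discovery_rate(N_POLAR, N_NEW_COCRYSTALS)
enrichment = rate / CSD_POLAR_RATE  # ratio relative to the approximate CSD baseline

print(f"screening conditions: {total_conditions}")
print(f"polar discovery rate: {rate:.0%}")
print(f"enrichment over CSD average: {enrichment:.1f}x")
```

Because the CSD baseline is itself approximate (~14%), the enrichment ratio should be read as a rough figure consistent with "more than three times" rather than as a precise value.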

Keywords

Data-driven Approach
Crystal Engineering
Cocrystallization
Machine Learning Guided Cocrystal Prediction
Lab Automation

Supplementary materials

Supplementary information
The supplementary information provides details on datasets, code, tables, general methods, and crystallographic data, along with other relevant material that supplements the main content.
