Abstract
Understanding the crystallization of molecules, and specifically the appearance of polymorphs, is a great challenge to modern chemistry, with both fundamental and practical aspects. Here, motivated by the proven ability of machine-learning (ML) algorithms to perform classification tasks, we harness ML-based tools and existing chemical datasets to ask the following question: can the existence of polymorphs of a molecular crystal be predicted based solely on properties of the single molecule? We find that our algorithm can predict the existence of polymorphism with an average accuracy of 65% at best, which is not enough to build a reliable “polymorph predicting” engine. Moreover, our results imply that the number of polymorphs is much larger than that reported in the literature. We suggest two possible reasons for the poor performance, both providing important insights. The first is that the data are inherently biased towards mono-morphs: many polymorphic compounds are reported as mono-morphs, not because they indeed have only one stable crystal structure, but because only one crystal form has been observed in experiments. The second reason is more profound: the many-body nature of crystallization may limit the possibility of predicting crystal properties based solely on single-molecule characteristics.
Supplementary materials
Title: Supporting information
Description: Methods and supplementary plots