Abstract
Metal-organic frameworks (MOFs) are porous materials with applications in gas separations and catalysis, but poor water stability often limits their practical use, given the ubiquity of water in air and the environment. It is therefore useful to predict whether a MOF is water-stable before investing time and resources in its synthesis. Existing heuristics for designing water-stable MOFs lack generality, and their narrowly defined criteria artificially limit the diversity of chemistry explored. Machine learning (ML) models promise more general predictions but require diverse experimental MOF stability data for training. Improving on previous efforts, we enlarge the available training data for MOF water stability prediction by over 400%, adding 911 MOFs with water stability labels assigned through semi-automated manuscript analysis to curate the new data set WS24. The additional data is shown to improve ML model performance (test ROC-AUC > 0.8) over diverse chemistry for predicting both water stability and stability under harsher acidic conditions. We illustrate how the expanded data set and models can be combined with previously developed activation stability models in genetic algorithms that quickly screen ~10,000 MOFs from a space of hundreds of thousands for candidates with multivariate stability (i.e., upon activation, in water, and in acid). Model analysis and genetic algorithm results uncover metal- and geometry-specific design rules for robust MOFs. The data set and ML models developed in this work, which we disseminate through an easy-to-use web interface, are expected to accelerate the discovery of novel, water-stable MOFs for applications such as direct air gas capture and water treatment.
Supplementary materials
Supplementary document: supplementary figures and tables