Theoretical and Computational Chemistry

Diversifying databases of metal organic frameworks for high-throughput computational screening



By combining metal nodes and organic linkers, an infinite number of metal organic frameworks (MOFs) can be designed in silico. When building new databases of such hypothetical MOFs, we need to ensure that they not only increase the number of structures but also add new chemistry to existing databases. In this study, we designed a database of ~20,000 hypothetical MOFs that are diverse across their chemical design space: metal nodes, organic linkers, functional groups, and pore geometries. Using machine learning techniques, we visualized and quantified the diversity of these structures. We find that adding the structures from our database improves the overall diversity metrics of hypothetical databases, especially in terms of metal node chemistry. We then assessed the usefulness of the diverse structures by evaluating their performance, using grand-canonical Monte Carlo simulations, in two important environmental applications: post-combustion carbon capture and hydrogen storage. We find that many of these structures perform better than widely used benchmark materials such as Zeolite-13X (for post-combustion carbon capture) and MOF-5 (for hydrogen storage).



Supplementary material

Supporting Information
Additional details for structure generation (lists of the metal nodes, organic linkers, functional groups, and topologies used to generate the hypothetical MOFs in this study); additional details for structure optimization and charge generation calculations; force field parameters used for grand-canonical Monte Carlo simulations; additional figures for diversity analysis; additional figures for hydrogen storage.