Materials Science

DiSCoVeR: a Materials Discovery Screening Tool for High Performance, Unique Chemical Compositions



We present Descending from Stochastic Clustering Variance Regression (DiSCoVeR), a Python tool that identifies high-performing compositions that are chemically unique relative to existing compounds by combining a chemical distance metric, density-aware dimensionality reduction, and clustering. We introduce several new metrics for materials discovery and validate DiSCoVeR on Materials Project bulk moduli using compound-wise and cluster-wise validation methods. We visualize the results via multiobjective Pareto front plots and assign each composition a weighted score that captures the trade-off between performance and density-based chemical uniqueness. We also explore an additional uniqueness proxy related to property gradients in chemical space. We demonstrate that DiSCoVeR can successfully screen materials for both performance and uniqueness in order to extrapolate to new chemical spaces.
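The weighted score and Pareto front described above can be illustrated with a minimal sketch. This is not the DiSCoVeR implementation: the function names `weighted_score` and `pareto_front`, the weight `w`, and the toy numbers are all hypothetical. Both objectives are assumed to be already scaled to comparable ranges, with larger values being better.

```python
from typing import List, Sequence, Tuple


def weighted_score(performance: float, uniqueness: float, w: float = 0.5) -> float:
    """Hypothetical scalarization: w weights performance, (1 - w) weights uniqueness."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * performance + (1.0 - w) * uniqueness


def pareto_front(points: Sequence[Tuple[float, float]]) -> List[int]:
    """Return indices of non-dominated points when both objectives are maximized."""
    front = []
    for i, (p_i, u_i) in enumerate(points):
        # Point j dominates point i if it is at least as good in both
        # objectives and strictly better in at least one.
        dominated = any(
            (p_j >= p_i and u_j >= u_i) and (p_j > p_i or u_j > u_i)
            for j, (p_j, u_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Toy (performance, uniqueness) pairs, assumed pre-scaled; not real data.
candidates = [(0.9, 0.1), (0.6, 0.6), (0.2, 0.95), (0.5, 0.5), (0.1, 0.1)]

front = pareto_front(candidates)
scores = [weighted_score(p, u, w=0.5) for p, u in candidates]
best = max(range(len(candidates)), key=scores.__getitem__)
```

In this toy set, the first three candidates are non-dominated, and an equal weighting selects the balanced candidate at index 1. Changing `w` moves the selection along the front toward higher performance or higher uniqueness.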

Version notes

The results have changed slightly and now align with the new dataset version on figshare. RobustScaler is used instead of MinMaxScaler to reduce the effect of outliers. A few minor formatting tweaks were made (e.g., to the acknowledgement section).
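To show why the choice of scaler matters when outliers are present, here is a small standard-library sketch. It assumes the names in the note refer to scikit-learn's `MinMaxScaler` and `RobustScaler` with their default settings (range [0, 1], and median with interquartile range, respectively). The helper functions `min_max_scale` and `robust_scale` and the toy data are hypothetical and are not taken from the DiSCoVeR codebase.

```python
import statistics
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Map values onto [0, 1] using the minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def robust_scale(values: Sequence[float]) -> List[float]:
    """Center on the median and divide by the interquartile range."""
    # method="inclusive" uses linear interpolation between data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Toy data with one extreme outlier; not taken from the paper.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

mm = min_max_scale(data)  # inliers are squeezed into a narrow band near 0
rb = robust_scale(data)   # inliers keep a usable spread; the outlier stays large
```

With min-max scaling, the single outlier fixes the upper end of the range, so the four typical values end up within about 0.03 of each other. With robust scaling, the median and interquartile range are barely affected by the outlier, so the typical values remain well separated.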



Supplementary weblinks

DiSCoVeR Codebase
A materials discovery algorithm geared towards exploring high performance candidates in new chemical spaces.
Trained Materials Discovery Python Class
Trained materials discovery Python Discover() class for Materials Project elasticity data. For documentation, see the linked GitHub repository.
Interactive DiSCoVeR Pareto Front Figures
Various figures, both interactive and non-interactive, related to the DiSCoVeR algorithm as applied to compounds and clusters. For more details, see the paper.