Finding the Next Superhard Material through Ensemble Learning

We report an ensemble machine-learning method capable of finding new superhard materials by directly predicting the load-dependent Vickers hardness from the chemical composition alone. A total of 1062 experimentally measured, load-dependent Vickers hardness values were extracted from the literature and used to train a supervised machine-learning algorithm utilizing boosting, achieving excellent accuracy (R² = 0.97). The model was then tested by synthesizing several unreported disilicides and measuring their load-dependent hardness, and by analyzing the predicted hardness of several classic superhard materials. The trained ensemble method was then employed to screen for superhard materials by examining more than 66,000 compounds in crystal structure databases, which showed that only 68 known materials surpass the superhard threshold. The hardness model was then combined with our data-driven phase diagram generation tool to expand the limited number of reported compounds. Eleven ternary borocarbide phase spaces were studied, and more than ten thermodynamically favorable compositions with superhard potential were identified, demonstrating this ensemble model's ability to find previously unknown superhard materials.
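The workflow summarized above (fit a boosted ensemble on composition and applied load, then screen candidates against a hardness threshold) can be illustrated with a minimal sketch. It is not the authors' model. The two features, the `toy_hardness` trend, the candidate names, and the hand-written gradient boosting over one-split regression stumps are all hypothetical stand-ins. The 40 GPa cutoff is the conventional definition of "superhard" and is assumed here, since the text above does not state a value.

```python
import random

SUPERHARD_GPA = 40.0  # conventional superhard cutoff; an assumption of this sketch


def toy_hardness(x_light, load_n):
    """Made-up hardness trend in GPa; NOT fitted to any real measurement."""
    return 5.0 + 45.0 * x_light + 8.0 / load_n


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]


def fit_boosting(X, y, n_rounds=200, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        preds = [p + learning_rate * (lm if row[j] <= t else rm)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row):
    """Sum the shrunken stump outputs on top of the mean baseline."""
    base, lr, stumps = model
    return base + lr * sum(lm if row[j] <= t else rm for j, t, lm, rm in stumps)


def r_squared(y, preds):
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, preds))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Synthetic training set: [light-element fraction, applied load in N].
random.seed(0)
X_train = [[random.random(), random.uniform(0.5, 5.0)] for _ in range(60)]
y_train = [toy_hardness(x, load) for x, load in X_train]

model = fit_boosting(X_train, y_train)
train_r2 = r_squared(y_train, [predict(model, row) for row in X_train])

# Screen hypothetical candidates at a fixed load of 5 N.
candidates = {"toy_A": [1.0, 5.0], "toy_B": [0.2, 5.0]}
superhard = [name for name, row in candidates.items()
             if predict(model, row) >= SUPERHARD_GPA]
print(train_r2, superhard)
```

Including the applied load as an input feature is what lets a single model return a hardness value at any chosen load, so screening reduces to predicting at one fixed load and applying the threshold. A real study would replace the toy features with composition-derived descriptors and report accuracy on held-out data rather than on the training set.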