Abstract
Recent decades have seen a growing demand for integrating machine learning into all areas of chemistry and materials science. In this study, we consider one aspect of applying these technologies: extracting new knowledge from the experimental data obtained in an ever-growing number of studies. Novelty detection approaches aim to identify artefacts in these data that may be important in many research directions. As a demonstration of one practical application of this methodology, we analyze "outliers" in the synthesis details reported in studies of garnet-structured solid electrolytes. Particular attention is paid to the choice of precursors. Thermodynamic data, namely the heats of formation from the pure oxides and the results of drop solution calorimetry for simple oxides, are used as descriptors of the studied systems. The overall performance of novelty/outlier detection across all types of outliers was evaluated for data descriptions of varying complexity and was found to be 0.71–0.72 in terms of the area under the ROC curve (ROC-AUC). All "outlier" compounds that arise from the use of rare precursors in synthesis were successfully identified. A complementary regression analysis was performed to elucidate the relationship between data diversity and the complexity of the data description.
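For readers unfamiliar with the ROC-AUC statistic quoted above, the following minimal sketch shows how it can be computed from anomaly scores. It is an illustration only, not the procedure used in this study: the function name `roc_auc`, the labels, and the scores are all hypothetical, and the scores are assumed to be oriented so that higher means "more anomalous".

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC computed as the normalized Mann-Whitney U statistic.

    labels: 1 for a known outlier, 0 for an inlier (hypothetical ground truth).
    scores: anomaly scores, assumed here to be higher for more anomalous samples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one outlier and one inlier")
    # Count outlier/inlier pairs ranked in the right order; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up toy data: four inliers followed by two outliers.
labels = [0, 0, 0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
auc = roc_auc(labels, scores)
print(f"ROC-AUC = {auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of outliers from inliers, which gives a scale for reading the reported 0.71–0.72.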