Abstract
Binary intermetallic compounds with 1:3 stoichiometry were classified by crystal structure type using simple machine learning algorithms. We attribute the successful segregation of structure types to a novel set of descriptors that combines compositional and structural features. The dataset comprises 97 features and 2366 reported compounds adopting six different structure types. An unsupervised approach, principal component analysis (PCA) followed by K-means clustering, was applied to group compounds by structure type. Using a recommendation engine, we predicted how the clusters would expand and then identified overlaps between clusters and structure types. The PuNi3-type was among the structure types clearly segregated by the unsupervised model, and a novel representative predicted to adopt it, TbIr3, was selected for experimental validation. Supervised predictions were then made with PLS-DA, SVM, and XGBoost, which assigned the novel TbIr3 to the PuNi3-type with accuracies of 96.6%, 99.8%, and 99.9%, respectively. Feature analysis reveals that the main contributors to AB3 crystal structure segregation are the average shortest distance count of the A element, the total number of sites of the B atom, and the total second shortest distance count of the B atom. Because the predicted PuNi3-type structure of TbIr3 could be controversial, given the extensive prior study of the Tb–Ir phase diagram and earlier reports of TbIr3 in two different structure types, we carried out two independent experimental structural validations to confirm that TbIr3 exists in the PuNi3-type structure. Subsequent theoretical analysis indicates that Ir–Ir contacts are the primary factor stabilizing TbIr3 in the PuNi3-type structure over the other structure types.
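The PCA-then-K-means step mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration on synthetic data, not the authors' workflow: the function names, the farthest-first initialization, the choice of two components, and the toy data are all assumptions made for the example.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Standardize the features and project onto the leading principal components."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def farthest_first_centers(points, k):
    """Deterministic initialization (an assumption here): start from the first
    point, then repeatedly add the point farthest from the centers chosen so far."""
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    return np.array(centers)

def kmeans(points, k, n_iter=100):
    """Plain Lloyd iterations; returns one cluster label per row of `points`."""
    centers = farthest_first_centers(points, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Synthetic stand-in data (NOT the paper's dataset): two well-separated groups
# of 30 "compounds" described by 5 made-up features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (30, 5)), rng.normal(5.0, 0.3, (30, 5))])
scores = pca_project(X, n_components=2)
labels = kmeans(scores, k=2)
```

In practice one would compare the resulting cluster labels against the known structure types to see which types separate cleanly and which overlap.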
Supplementary materials
Title
Supporting Information
Description
The supplementary material includes crystallographic and atomic parameters from Rietveld refinement of PXRD data for various annealed samples, phase compositions, and impurity analyses (Tables S1–S9). Tables S3 and S10 present predicted probabilities for test samples, while Tables S4 and S5 list top candidates for specific clusters containing Ir, Rh, Os, and Ru. Figures S1–S3 depict periodic tables highlighting Cu-type structures. Figures S4–S12 provide Rietveld refinement plots and SEM/EDS analysis of selected samples, revealing phase compositions. Figures S6–S10 visualize cluster compositions on traditional and PCA periodic tables. Finally, Figures S13–S16 present density of states (DOS) and crystal orbital Hamilton population (COHP) analyses for various model structures.
Title
TbIr3 Features
Description
Dataset of features and structures.
Supplementary weblinks
Title
Automated Machine Learning Workflow for Excel Data
Description
This repository offers a comprehensive solution for performing data analysis using machine learning techniques on data stored in Excel files. The goal is to provide users with an easy-to-use command-line interface (CLI) to preprocess, analyze, and visualize data using a variety of machine learning models, including both unsupervised (PCA, clustering) and supervised (classification) learning methods.
Title
Supervised ML Tool
Description
This project utilizes multiple machine learning models for classification tasks related to material structures. The models include PLS-DA (Partial Least Squares Discriminant Analysis), SVM (Support Vector Machine), and XGBoost. Each model is trained on labeled datasets and validated using a separate validation database.
Title
Structure-type Explorer
Description
STEx is a powerful tool for visualizing compounds on periodic tables and recommending elements for novel compound discovery.
Title
Composition Analyzer/Featurizer (CAF)
Description
An interactive Python script that generates chemical compositional features and provides tools for filtering, sorting, and merging data.
Title
Structure Analysis/Featurizer (SAF)
Description
A Python script designed to process CIF files and extract geometric features. These features include interatomic distances, information on atomic environments, and coordination numbers.
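To illustrate the kind of geometric feature described here, the sketch below counts the neighbors at the shortest interatomic distance around one atom in a periodic cubic cell. It does not use SAF or parse CIF files; the function names, the idealized AuCu3-type AB3 cell, and the lattice parameter of 4.0 Å are assumptions chosen for the example.

```python
import itertools
import math

def neighbor_distances(frac_coords, a, center_index):
    """Sorted distances from one atom to every atom in the 3x3x3 block of
    neighboring cells, for a cubic cell with edge length `a`."""
    cx, cy, cz = frac_coords[center_index]
    dists = []
    for j, (x, y, z) in enumerate(frac_coords):
        for tx, ty, tz in itertools.product((-1, 0, 1), repeat=3):
            if j == center_index and (tx, ty, tz) == (0, 0, 0):
                continue  # skip the central atom itself
            dx, dy, dz = x + tx - cx, y + ty - cy, z + tz - cz
            dists.append(a * math.sqrt(dx * dx + dy * dy + dz * dz))
    return sorted(dists)

def shortest_distance_count(dists, tol=1e-3):
    """Return the shortest distance and how many neighbors sit at it (within tol)."""
    shortest = dists[0]
    return shortest, sum(1 for d in dists if d - shortest <= tol)

# Idealized AuCu3-type AB3 cell: A at the corner, B at the three face centers.
# The lattice parameter is illustrative only.
a = 4.0
frac_coords = [
    (0.0, 0.0, 0.0),  # A
    (0.0, 0.5, 0.5),  # B
    (0.5, 0.0, 0.5),  # B
    (0.5, 0.5, 0.0),  # B
]
shortest, count = shortest_distance_count(neighbor_distances(frac_coords, a, 0))
```

For this cell the A atom has its nearest neighbors at a/√2, and the count is the number of B atoms at that distance. A general implementation would need the full lattice vectors rather than a single cubic edge length.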