Abstract
Transition metal phosphates (TMPs) are extensively explored for electrochemical and catalytic applications due to their structural versatility and chemical stability. Within this material class, novel high-entropy metal phosphates (HEMPs), which contain multiple transition metals combined into a single-phase structure, are particularly promising, as their compositional complexity can significantly enhance functional properties. However, the discovery of suitable HEMP compositions is hindered by the vast compositional design space and by complex or very specific synthesis conditions. Here, we present a data-driven strategy combining automated wet-chemical synthesis with the Sequential Learning App for Materials Discovery (SLAMD) framework (Random Forest regression model) to efficiently explore and optimize HEMP compositions. Using a limited set of initial experiments, we identified near-equimolar multi-metal compositions that form a single-phase crystalline solid. The model successfully predicted a novel Co0.3Ni0.3Fe0.2Cd0.1Mn0.1 phosphate octahydrate phase, which was validated experimentally, demonstrating the effectiveness of the machine learning approach. This work highlights the potential of integrating automated synthesis platforms with data-driven algorithms to accelerate the discovery of high-entropy materials, offering an efficient design pathway to advanced functional materials.
Supplementary materials
Title
Crystal structures
Description
Collection of crystal structure files used for this work, as listed in Table 1.
Title
Inputs and Outputs for sequential learning
Description
The input_output.xlsx spreadsheet contains all input data for sequential learning, including 38 evaluated data sets (learning data), and outputs from SLAMD. The spreadsheet file comprises 5 tabs:
Input Data: Defines the experimental space of 1360 possible samples, of which 38 samples were synthesised and analysed. This list was directly analysed with SLAMD.
Predictions_1 and Predictions_2: Output data from SLAMD, in which all samples, i.e. those experimentally tested as well as those “predicted” with RF regression, are ranked. We include two different sets of predictions to highlight how the results change depending on the applied algorithm settings and target properties (e.g. the number of phases).
Metadata_1 and Metadata_2: Records of the relevant algorithm settings used to generate the prediction spreadsheets in SLAMD.
Title
XRD data
Description
Our experimental data set includes 38 diffractograms as raw data files in three formats: Bruker RAW, XRDML, and XY ASCII (xrd_data_multiformat.zip). File names include the sample numbers from the input_output.xlsx spreadsheet in (4) and additionally encode information about compositions and relevant reaction conditions. For instance:
Sample48_Co0.6_Ni0.1_Fe0.1_Cd0.1_Mn0.1_MP1.0_cPO40.05_Temp90C_Time40min.raw
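The naming scheme above can be decoded programmatically. The following is a minimal sketch, assuming every file follows the underscore-separated pattern of the example. The function name parse_xrd_filename and the dictionary keys are hypothetical, and the MP and cPO4 fields are passed through as plain numbers because their exact meaning and units are not stated here.

```python
import re
from pathlib import Path

NUMBER = r"(\d+(?:\.\d+)?)"

def parse_xrd_filename(name: str) -> dict:
    """Split a diffractogram file name into sample number, metal fractions and conditions.

    Hypothetical helper: it assumes the token layout of the single example
    file name and raises ValueError on any token it does not recognise.
    """
    stem = Path(name).stem  # drops only the final extension, e.g. ".raw"
    record = {"metals": {}}
    for token in stem.split("_"):
        if m := re.fullmatch(r"Sample(\d+)", token):
            record["sample"] = int(m.group(1))
        elif m := re.fullmatch("MP" + NUMBER, token):
            record["MP"] = float(m.group(1))        # meaning assumed, not stated
        elif m := re.fullmatch("cPO4" + NUMBER, token):
            record["cPO4"] = float(m.group(1))      # units assumed, not stated
        elif m := re.fullmatch("Temp" + NUMBER + "C", token):
            record["temp_C"] = float(m.group(1))
        elif m := re.fullmatch("Time" + NUMBER + "min", token):
            record["time_min"] = float(m.group(1))
        elif m := re.fullmatch(r"([A-Z][a-z]?)" + NUMBER, token):
            # Element symbol followed by its fraction, e.g. "Co0.6"
            record["metals"][m.group(1)] = float(m.group(2))
        else:
            raise ValueError(f"Unrecognised token: {token!r}")
    return record

example = "Sample48_Co0.6_Ni0.1_Fe0.1_Cd0.1_Mn0.1_MP1.0_cPO40.05_Temp90C_Time40min.raw"
record = parse_xrd_filename(example)
print(record)
```

Raising an error on unknown tokens, rather than skipping them, makes deviations from the assumed pattern visible when the full archive is processed.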
Title
Automated synthesis metadata
Description
The automated synthesis metadata file, synthesis_GraphML.zip, contains all the metadata necessary to reproduce the synthesis. The GraphML schematics represent the topological information about the experimental setup, including the connections between pumps, valves, waste containers, reservoirs of the chemicals used, reaction vessels, and heating plates.
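As an illustration of how such a topology can be read, the sketch below parses a small inline GraphML document with the Python standard library and builds a map of connections. The node identifiers (reservoir_1, pump_1 and so on) and the function load_connections are hypothetical and are not taken from the deposited file, which may use different identifiers and carry additional attributes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard GraphML XML namespace
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical miniature setup, NOT the content of synthesis_GraphML.zip
EXAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="setup" edgedefault="directed">
    <node id="reservoir_1"/>
    <node id="pump_1"/>
    <node id="valve_1"/>
    <node id="reactor_1"/>
    <node id="waste_1"/>
    <edge source="reservoir_1" target="pump_1"/>
    <edge source="pump_1" target="valve_1"/>
    <edge source="valve_1" target="reactor_1"/>
    <edge source="valve_1" target="waste_1"/>
  </graph>
</graphml>"""

def load_connections(graphml_text: str) -> dict:
    """Return {node id: [ids of downstream nodes]} for the first graph in the document."""
    root = ET.fromstring(graphml_text)
    graph = root.find("g:graph", NS)
    connections = defaultdict(list)
    for node in graph.findall("g:node", NS):
        connections[node.get("id")]  # register nodes that have no outgoing edge
    for edge in graph.findall("g:edge", NS):
        connections[edge.get("source")].append(edge.get("target"))
    return dict(connections)

connections = load_connections(EXAMPLE)
for source, targets in connections.items():
    print(source, "->", targets)
```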