Abstract
High-entropy alloys (HEAs), with their unique distribution of active sites, have attracted considerable attention as a promising class of catalysts for hydrogen electro-conversion. However, the vast number of possible element combinations and the explosive growth of the composition space hinder rational catalyst design and large-scale industrialization. This paper describes a procedural research workflow for accelerating the discovery of efficient HEA electrocatalysts. The workflow combines construction of a first-principles database assisted by a large language model (LLM), fine-tuning of a pre-trained model to generate machine learning (ML) potentials, and high-throughput screening. We also introduce a two-dimensional kernel density analysis as a novel approach for evaluating HEA catalytic activity, and we use symbolic regression to obtain data-driven formulas that predict the maximum-density center (the adsorption energies of *H and *OH) from intrinsic atomic physical properties. Using the identified formulas, approximately 16,000 possible HEA compositions were screened efficiently without time-consuming traditional theoretical calculations. The screening results are supported by numerous reported studies, and additional promising candidates that have not yet been reported await experimental validation. Our work opens an avenue for intelligent catalyst design in high-dimensional multi-element systems.