Open Access Article
Jinxin Sunab,
Xiaokang Xuab,
Yuqing Maob,
Anjie Chenb,
Shu Wangb,
Li Shi
*c,
Chongyi Ling
*a,
Jinlan Wang
*a and
Xiuyun Zhang
*b
aKey Laboratory of Quantum Materials and Devices of Ministry of Education, School of Physics, Southeast University, Nanjing 21189, China. E-mail: lingchy@seu.edu.cn; jlwang@seu.edu.cn
bCollege of Physics Science and Technology, Yangzhou University, Yangzhou 225002, China. E-mail: xyzhang@yzu.edu.cn
cState Key Laboratory of Organic Electronics and Information Displays, Institute of Advanced Materials (IAM), Nanjing University of Posts and Telecommunications, Nanjing 210023, China. E-mail: iamlshi@njupt.edu.cn
First published on 29th September 2025
High-entropy alloys (HEAs) have emerged as a promising class of multisite catalysts that exhibit high levels of performance due to a diversity of active sites; however, establishing their structure–performance relationships remains a grand challenge. Herein, we systematically explored the structure–activity–selectivity relationship of HEAs for the CO2 reduction reaction (CO2RR) with the assistance of a machine learning framework and density functional theory computations. Statistical analysis of hundreds of thousands of binding energies of *CO, *CHO, and *H on (FeCoNiCuMo)55 clusters revealed that HEAs can break the well-established scaling relationship of pure metal catalysts, but they also face an activity–selectivity tradeoff. This originates from the positive role of the unpaired d electron number in enhancing the binding strength of *CHO and *H and limits the overall performance. Moreover, an activity and a selectivity descriptor were constructed, giving accurate predictions for the performance variations of the reported experiments. On this basis, rapid screening among 26
334 types of HEAs was performed, and 10 promising candidates that balanced activity and selectivity were selected. Our workflow not only provides quantitative criteria to accelerate the rational design of HEA catalysts for the CO2RR, but it also offers a systematic approach to unraveling the intricate structure–performance relationship in complex systems.
The structure–performance relationship unveils how material structure influences its properties and ultimately performance.21–24 and has been widely adopted to guide catalyst design and optimization. However, the diverse active sites of HEAs also give rise to an extremely complicated structure–performance relationship,25,26 in addition to the great promise for realizing unattainable performance. Because of the large number of possible active sites in HEAs, it is computationally costly to comprehensively assess the binding strength of reaction intermediates.27,28 However, diversity in structure undoubtedly leads to the wide difference in binding strength across different active sites,29,30 and therefore, it is also very challenging to determine how to computationally evaluate the overall performance. As a result, the structure–performance of HEAs remains underexplored, although there have been great efforts in the study of HEA catalysts.
Aiming at this challenge, a machine learning (ML) workflow, coupled with density functional theory (DFT) calculations and statistical approaches, was adopted to explore the structure–activity–selectivity relationship of HEAs for the CO2RR. Using (FeCoNiCuMo)55 HEA clusters as prototypes, the binding energies of three key intermediates, *CO, *CHO, and *H (ΔE*CO, ΔE*CHO and ΔE*H), on an appropriate portion of possible sites were calculated using DFT computations. On this basis, three ML regression models were trained, realizing the rapid and accurate prediction of ΔE*CO, ΔE*CHO, and ΔE*H on all the possible sites of (FeCoNiCuMo)55 HEA clusters (81
900, 488
250, and 488
250 for *CO, *CHO, and *H, respectively).
Statistical analysis of the obtained hundreds of thousands of binding energies revealed an activity–selectivity tradeoff that leads to a low overall performance of HEAs for the CO2RR. This rationalizes the aforementioned phenomenon that experimental reports on HEAs for the CO2RR are very rare, despite the abundance of research in electrocatalysis. Additionally, classification models for activity and selectivity evaluations were established to determine the key factors regarding catalytic performance, and the structure–activity–selectivity relations were explored based on the SHapley Additive exPlanations (SHAP) analyses of these features. Furthermore, we constructed activity and selectivity descriptors using SHAP values, and related experimental reports were referenced to validate the reliability. Finally, these descriptors were employed for the rapid screening of 26
334 types of unknown HEAs, where 10 promising candidates with significantly enhanced performance as compared to reported HEAs were determined.
In the calculation of binding energies, to compare the scaling relations of HEA clusters with those of transition metals, we used the same calculation equations reported by Nørskov et al.37,38 The binding energies of *CO, *CHO, and *H are calculated through:
| ΔE*CO = E*CO − E* − ECO |
562
000). Considering the fact that the coordination environment of the active site usually dominates its catalytic performance,27,29,30 further evaluations were carried out based on the consideration of the local microstructures (MS-HEAs) of the active site, where two distinct local microstructures (penta-coordinate sites (MSpenta-HEAs) and hexa-coordinate sites (MShex-HEAs), as shown in Fig. 1a) were involved.
For each structure, the active center (region 1, R1), along with its nearest neighboring coordination atoms on the surface (region 2, R2) and the subsurface (region 3, R3), were considered. As a result, over 80
000 (3150 for MSpenta-HEAs and 78
750 for MShex-HEAs, Fig. S2) distinct active centers in M55 HEA clusters were identified, which significantly decreased as compared to the total number of the clusters, but is a sufficiently large number for examination of the activity–performance relationship for the CO2RR. Therefore, the diverse local structure of active sites in M55 clusters enables comprehensive exploration of the performance trend, although it is relatively small.
The binding energies of three key intermediates, i.e., *CO, *CHO, and *H (ΔE*CO, ΔE*CHO, and ΔE*H), were adopted for the assessment of electrocatalytic performance in the CO2RR, according to numerous experimental and computational studies.30,45–47 These intermediates (*CO, *CHO, and *H) prefer to adsorb on the top, bridge, and hollow site of the clusters (Fig. S3), respectively, leading to 81
900 (3150 for MSpenta-HEAs and 78
750 for MShex-HEAs), 488
250 (3150 × 5 + 78
750 × 6), and 488
250 (3150 × 5 + 78
750 × 6) distinct adsorption sites (Fig. S4) for *CO, *CHO, and *H, respectively. The difference in the adsorption configuration was characterized using the coordination features X_1, X_2, and X_3, which denote the surface coordination number of the atoms at the adsorption sites bonded to the intermediates (Fig. 1b). Furthermore, apart from the R1, R2, and R3 regions, the *O-connected atoms of bridge sites (bridge region 1, B1), the nearest neighboring coordination atoms of bridge sites on the surface (bridge region 2, B2), and the *H-connected atoms of hollow sites (hollow region 1, H1) were extracted as active regions (Fig. 1c).
On this basis, composition features and atomic property features were introduced to describe the variation of the active sites. Specifically, element vectors corresponding to element names in different regions (R1, R2, R3, B1, B2, and H1) of MSpenta-HEAs and MShex-HEAs were extracted and converted into composition features through counting the number of distinct elements within different regions (Fig. 1d). In this way, 15, 25, and 20 features for top, bridge, and hollow sites, respectively, are generated. In addition, the number of unpaired d electrons (Nd-up), the first ionization energy (I1), and electronegativity (χ) were used to describe the physical and chemical properties of different metal atoms (Fig. 1e). By incorporating the composition feature and atomic property feature into the following function based on the grouping average, 9, 15, and 12 features for top, bridge, and hollow sites, respectively, were generated:
represents the mean feature value of i-th elemental properties in the k-th region. After feature engineering, the total number of features utilized for predicting ΔE*CO, ΔE*CHO, and ΔE*H is 25, 42, and 35, respectively.
Based on the designed features and the moderate-scale DFT calculations (Fig. S5–S7), three datasets for ΔE*CO, ΔE*CHO, and ΔE*H were compiled, and three adsorption prediction models designated as XGBR*CO, XGBR*CHO, and XGBR*H were trained by employing the Extreme Gradient Boost Regression (XGBR) algorithm.39,40 The training and testing datasets were randomly selected and divided into an 80% and 20% ratio. To evaluate the predictive accuracy of these models, the mean square errors (MSE) between DFT-calculated and ML-predicted ΔE*CO, ΔE*CHO, ΔE*H were calculated for training and testing datasets.
Our results showed that the XGBR*CO, XGBR*CHO, and XGBR*H models exhibited excellent prediction accuracy when the data scale reached 322, 368, and 831, respectively, where the corresponding MSE values are 0.009/0.011/0.005 eV and 0.032/0.057/0.028 eV for the training and testing sets (Fig. 1f–h). Using the trained models, the ΔE*CO, ΔE*CHO, and ΔE*H on all the possible sites of the (FeCoNiCuMo)55 HEA cluster (81
900, 488
250, and 488
250 for *CO, *CHO, and *H, respectively) were predicted (Fig. S8–S10). Some general tendencies can be obtained from predicted energies: (i) the binding strengths of *CO on the active sites with Cu atoms in the R1 region are relatively weak (Fig. S8); (ii) the binding strengths of *CHO on the active sites with Mo atoms in the B1 region are usually strong (Fig. S9); (iii) additional Cu atoms in the H1 region usually lead to weaker adsorption of *H (Fig. S10). The adsorption energies on most sites deviate from the well-established scaling relations between ΔE*CO and ΔE*CHO on the (211) and (111) surfaces of mono-metals (Fig. 2a).37,38,48 This deviation leads to a low reaction free energy for *CO reduction into *CHO, which is usually the potential-determining step for the CO2RR into deep-reduced products. Consequently, HEAs demonstrate promising catalytic activity for the CO2RR by facilitating this crucial reaction pathway.
To uncover the driving force behind this phenomenon, an activity classification model (XGBCactivity) based on the XGBoost algorithm was developed, where a dataset combining ΔE*CO and ΔE*CHO over 488
250 sites with activity labels (where the sites breaking the scaling relationship were recognized to be highly active in the CO2RR and were labeled as 1; otherwise were labeled as 0) and 7 key structural features were used. Herein, the strategy for selecting 7 key structural features considered the correlation with intermediate adsorption and the transferability of the model. Specifically, the top 10 features from XGBR*CO and XGBR*CHO were selected by taking their union to ensure a high correlation with *CO and *CHO adsorption (Fig. S11). Then, features with atomic properties (such as Nd-up-B1, Nd-up-R2, and I1-R1) were retained, while features with specific elemental composition information (such as Cu-R1, Co-R1, and Mo-B1) were removed.
The accuracy of the trained classification model was very high, where the area under the curve (AUC) value of the operating characteristic (ROC) curve was 0.987, and the main diagonal values of the normalized confusion matrix reached 0.95 and 0.98 (Fig. 2b). On this basis, SHAP analysis was further carried out,41,42 and it showed that the highest importance to the activity was attributed to the number of unpaired d electrons at the B1 region (Nd-up-B1) (Fig. 2c). This feature also dominated the binding strength of *CHO (Fig. S12). Therefore, the relationship between the feature values of Nd-up-B1, SHAP values of Nd-up-B1 in XGBR*CHO, and SHAP values of Nd-up-B1 in XGBCactivity were plotted to understand the structure–property–activity relationship from Nd-up-B1.
As shown in Fig. 2d, highly active sites (the points in the blue dashed box with positive SHAP values of XGBCactivity) generally possess positive SHAP values in XGBR*CHO (corresponding to the enhancement of *CHO binding strength) and relatively large values (3–5) of Nd-up-B1. This can be ascribed to the adsorption configuration of *CHO. Specifically, a higher Nd-up of B1 atoms usually corresponds to a higher oxygen affinity of the metal atom, leading to enhancement of the binding strength of the O–M bond (Fig. 2e, f and S13). Interestingly, a large Nd-up of the coordination atoms also resulted in the relatively weak binding strength of *CO (negative SHAP values of Nd-up-R2 in XGBR*CO, Fig. 2g), due to the enhanced bonding strength between the active center and coordination atoms (Fig. 2h, i and S15). Specifically, the mean ICOHP values between the active center and coordination atoms significantly decrease when Nd-up-R2 values are 3–5, while they remain almost unchanged when Nd-up-R2 is in the range of 0–3. Therefore, the large number of unpaired d electrons of coordination atoms stabilizes the *CHO and simultaneously weakens the binding strength of *CO, resulting in the breaking of the scaling relationship between ΔE*CO and ΔE*CHO on pure metal surfaces and potentially high activity of MS-HEAs for the CO2RR.
Nevertheless, these active sites exhibit low selectivity towards the CO2RR, due to the preference of *H adsorption. Our results show that most of the studied active sites possess a more negative *H adsorption energy than that on a pure Cu55 cluster (Fig. 3a). The stronger *H binding strength indicates that these sites will be more easily covered by *H intermediates as compared to Cu.27,30,49 This facilitates the competing HER or blocks the related active sites, both of which lowers the reaction rate of the CO2RR. Using a dataset of over 488
250 sites with selectivity labels (where the sites with *H binding strength weaker than Cu were labeled as 1, and otherwise were labeled as 0) and 5 key structural features, the trained classification model exhibited high accuracy for AUC values reaching 0.988 (Fig. 3b). Similar to the binding strength of *CHO, the mean Nd-up of the hollow site (Nd-up-H1) is also the determining factor for the ΔE*H (Fig. S16), as well as the selectivity (Fig. 3c), where a larger value of Nd-up-H1 usually leads to stronger *H-binding strength and lower CO2RR selectivity (Fig. 3d). Therefore, a high number of unpaired d electrons facilitates the activity but lowers the selectivity of the CO2RR, suggesting an activity–selectivity tradeoff in the HEAs that results in the low overall performance for the CO2RR.
To provide a more intuitive picture of the above phenomenon, we calculated the ratios of active sites with high-activity (Pactivity), high-selectivity (Pselectivity), and high-performance (with high-activity and high-selectivity, Pperformance) with the variation of the key features (Nd-up-B1 and Nd-up-H1 for the activity and selectivity, respectively). The ratio is defined as
where Nhigh denotes the number of active sites with high activity, selectivity, or performance and Ntotal denotes the total number of active sites under specific situations. A general tendency can be observed from Fig. 3e and f, where Pactivity increases, while Pselectivity decreases with the increase in Nd-up-B1 and Nd-up-H1. As a result, Pperformance constantly remains at a low level. These findings highlight the activity–selectivity tradeoff in HEAs, which likely explains the current lack of experimental reports on HEAs for the CO2RR, despite extensive research on their potential for electrocatalysis.
The structure–activity–selectivity relationship was further explored, to determine which sites of (FeCoNiCuMo)55 HEAs can achieve high activity and selectivity for the CO2RR. To this end, the sum of the SHAP values of the key features was adopted as the descriptor. Accordingly, two descriptors, labeled as SHAPactivity and SHAPselectivity, were constructed for the activity and selectivity evaluation, respectively, where the positive value corresponds to high activity or selectivity. Our results show that these two descriptors can provide accurate classification of the activity and selectivity of the active sites, where the values on the main diagonal of the confusion matrix are as high as 0.93/0.97 and 0.99/0.94 for the activity and selectivity, respectively (Fig. 4a and b). By combining SHAPselectivity and SHAPactivity, the probability density distribution of different types of sites is obtained, where the sites with high-activity and low-selectivity were found to be dominant (Fig. 4c).
On the contrary, active sites with high activity and selectivity are very rare, supporting the activity–selectivity tradeoff and poor performance for the CO2RR of (FeCoNiCuMo)55 HEAs once again. Furthermore, the local environments with excellent performance were extracted, and were defined as (M1M2M3)penta or (M1M2M3)hex. Specifically, M1, M2, and M3 are the three metal atoms of a hollow site in MS-HEA, while other atoms are random (Fig. 4d). The 8 local environments with the highest Pperformance were selected due to their ability to maintain high-performance while the surrounding atoms change. Our results show that (CuMoCu)penta, (MoMoNi)penta, (CuMoCu)hex, (MoMoNi)hex, (NiMoMo)hex, (CuFeCu)hex, (CuFeCu)penta, and (MoMoCu)penta are promising local structures for the CO2RR (Fig. 4e). In particular, 75% of the active sites that contain a (CuMoCu)penta center are expected to exhibit excellent performance. This can be well understood from the aforementioned feature analysis. Specifically, the large Nd-up of Mo enhances the *CHO binding strength, and the small Nd-up of Cu atoms leads to the low mean Nd-up of the active center to weaken the *H binding strength, leading to the excellent performance.
The feasibility of the constructed structure–activity–selectivity relationship in other HEAs was further evaluated. Specifically, the performance of two representative HEAs, i.e., AgCuAuPdPt18 and PtMoPdRhNi10 that have been experimentally proven to be promising catalysts for the CO2RR and HER, respectively, were systematically explored. Using (AgCuAuPdPt)55 and (PtMoPdRhNi)55 clusters as models, and SHAPactivity and SHAPselectivity as the descriptors, the probability density distribution of activity and selectivity was predicted (Fig. 5a and b). The aforementioned activity–selectivity tradeoff still exists in these two systems, where a large portion of active sites exhibits high activity but low selectivity for the CO2RR. Nevertheless, (AgCuAuPdPt)55 possesses a considerable number of sites with high activity and selectivity (Fig. 5a), in agreement with the experimentally reported satisfactory performance in the CO2RR.18 On the contrary, almost no active site lies in the region that meets the requirement of high catalytic performance for (PtMoPdRhNi)55 (Fig. 5b), in accordance with its high HER activity.10
The structure–activity–selectivity relationship was further adopted to other reported HEAs, including FeCoNiCuAl,19 CoMnNiCuZn,20 CoCuNiZnSn,27 CoCuGaNiZn,30 FeCoNiRuMn,12 RuRhPdPtIr,50 NiFeCoMoW,51 PtCoNiMoRh,52 PtRuFeCoNi,53 PtPdRuMoNi,54 IrRuRhMoW,14 PtCoNiRuIr,55 NiFeCrVTi,56 and NiCoFePtRh.11 Surprisingly, our model provides a satisfactory description of the performance variation of these systems, supporting the reliability of the proposed models (Fig. S17 and S18). Specifically, for the reported HEAs with a high CO2RR performance, including FeCoNiCuAl,19 CoMnNiCuZn,20 CoCuNiZnSn,27 and CoCuGaNiZn,30 there is a relatively high ratio of active sites with high-performance (Pperformance = 0.12–0.18) as compared to others with a high HER activity (Pperformance = 0–0.05) (Fig. 5c). Therefore, the constructed structure–activity–selectivity relationship deciphers the universal rule of the CO2RR over HEAs.
With the assistance of the constructed structure–activity–selectivity relation, a high-throughput screening of promising HEAs for the CO2RR was performed. The candidates were 26
334 HEAs via the possible combination of 22 elements (Fig. 5d). Although the combinations are complex, the descriptors are transferable because the features are based on fundamental atom properties rather than specific elemental composition information. With the assistance of the SHAPactivity and SHAPselectivity, all the high-performance sites of the 26
334 HEAs were efficiently evaluated, and the top 10 HEAs with the highest Pperformance were selected as excellent catalysts for the CO2RR (Fig. 5e), including AgCuZnCdTa, AgCuZnMnCd, AgCuZnCrCd, AgCuZnVCd, AgCuZnNbCd, AgCuZnCdW, AgCuNiZnCd, CuPdZnCdTa, AgPdZnCdTa, and AgCuPdZnCd (Fig. S19). Clearly, the high-activity ratios (Pactivity = 0.72–0.96) and the high-selectivity ratios (Pselectivity = 0.52–0.77) of these promising HEAs reached a balance, leading to a significantly higher Pperformance (0.51–0.72) than the reported HEAs (0.12–0.18) (Fig. 5f). Therefore, these HEAs are expected to exhibit excellent overall performance for the CO2RR.
250 different sites of (FeCoNiCuMo)55 HEA clusters uncovered the activity–selectivity tradeoff of HEAs for the CO2RR. This originates from the positive effect of unpaired d electrons of coordination atoms on enhancing the binding strength of *CHO and *H intermediates. Therefore, most sites are expected to possess high activity by breaking the scaling relation, but low selectivity exists due to the preference for *H adsorption. This degrades the overall performance of HEAs for the CO2RR and rationalizes the current situation in this field, i.e., there are rare experimental reports on HEAs for the CO2RR despite the abundance of research into electrocatalysis.
Using the sum of the SHAP values of key features as the descriptors for activity and selectivity, the activity–selectivity tradeoff in (FeCoNiCuMo)55 HEAs is intuitively presented. Moreover, the diversity in the structure and components of the active sites enables the proposed descriptors to be generally applicable to other HEAs, where the reliability is supported by the accurate description of the performance variation of reported HEAs. On this basis, rapid screening among 26
334 types of HEAs was performed, where AgCuZnCdTa, AgCuZnMnCd, AgCuZnCrCd, AgCuZnVCd, AgCuZnNbCd, AgCuZnCdW, AgCuNiZnCd, CuPdZnCdTa, AgPdZnCdTa, and AgCuPdZnCd were selected as the top 10 promising catalysts to achieve a balance between activity and selectivity for the CO2RR. These predictions are expected to guide experimental research to realize superior performance of HEAs in the CO2RR. Overall, our research, based on quantitative descriptors, deciphers the structure–activity–selectivity of HEAs for the CO2RR, provides insights into related phenomenon and guidelines for the design of promising HEAs, and also provides a useful method for the exploration of the structure–performance relationship of complicated systems.
Supplementary information: details of the ML workflow to investigate the structure–performance relation; the number of MS-HEAs and all adsorption sites; ML-predicted ΔE*CO/ΔE*CHO/ΔE*H on all adsorption sites; COHPs and ICOHPs of O–M/M1–M2−n bonds; and the density distribution of SHAPactivity and SHAPselectivity for reported/predicted HEAs. See DOI: https://doi.org/10.1039/d5sc05762k.
| This journal is © The Royal Society of Chemistry 2025 |