Open Access Article
This Open Access Article is licensed under a Creative Commons Attribution-Non Commercial 3.0 Unported Licence

Predicting the HER activity of SACs on MXenes with simple features and interpretable machine learning models

Chandra Chowdhury a, Matteo Lovato ab, Giovanni Di Liberto a, Francesc Viñes b, Francesc Illas b, Gianfranco Pacchioni a and Livia Giordano *a
aDepartment of Materials Science, University of Milano-Bicocca, Via Cozzi 55, 20125 Milano, Italy. E-mail: livia.giordano@unimib.it
bDepartament de Ciència de Materials i Química Física & Institut de Química Teòrica i Computacional (IQTCUB), Universitat de Barcelona, c/ Martí i Franquès 1-11, 08028 Barcelona, Spain

Received 2nd September 2025 , Accepted 18th December 2025

First published on 24th December 2025


Abstract

The efficient design of single-atom catalysts (SACs) for hydrogen evolution reaction (HER) requires rapid and accurate screening methods capable of handling a large number of candidate materials. Traditional density functional theory (DFT) calculations, while reliable, are computationally expensive, necessitating alternative approaches for high-throughput screening. In this work, we create our own database based on DFT calculations using MXene-based SACs for HER and develop an interpretable machine learning (ML) model that predicts hydrogen adsorption free energy (ΔGH) with an accuracy of 0.17 eV, relying solely on simple, non-DFT-calculated features. Our approach systematically transitions from a black-box model to a grey-box model, incorporating feature importance analysis to identify key descriptors influencing ΔGH. This ultimately leads to the development of a glass-box model, where we derive a transparent analytical equation that allows for direct prediction of ΔGH using raw feature values. Importantly, the final model does not require data standardization or complex computational simulations, making it highly accessible for experimental researchers and enabling rapid, first-order screening of MXene-based SACs for HER.


Introduction

Electrochemical water (H2O) splitting is a promising method of leveraging intermittent renewable energy to produce fuels through the formation of hydrogen (H2) and oxygen (O2) from water. The hydrogen evolution reaction (HER), 2H+ + 2e → H2, represents a cathodic semi-reaction of the process, which is catalyzed by the expensive Pt in commercial electrolyzers,1,2 motivating the search for earth-abundant and cost-effective alternatives with equal (or better) electrocatalytic activities.3–7 The equilibrium potential of this reaction is formally null; however, in practice, an overpotential is observed, primarily dependent on the ability of the electrode material to bind with the reaction intermediate, which is typically an adsorbed hydrogen atom (denoted as H*).8,9 From a computational perspective, the seminal works of Nørskov and co-workers10,11 provide an extremely successful and handy proxy for predicting the catalytic activity for HER based on a descriptor, namely the Gibbs free energy of the reaction intermediate H* (ΔGH). The closer the ΔGH is to zero, the lower the predicted overpotential. For example, Pt(111) and other Pt-group metals show ΔGH close to zero when dispersion corrections are included, and they reveal optimal performance for HER.12 The success of this simple framework has stimulated a surge of theoretical studies based on the screening of several catalysts toward achieving zero overpotential.13 This is particularly the case in the field of single-atom catalysts (SACs).14–16

SACs have received great attention in recent years, due to their potential to minimize the critical raw metal loading and improve the catalytic activity and selectivity.17,18 SACs are made by transition metal atoms atomically dispersed in a solid matrix, maximizing the utilization of the active phase.19 Additionally, the catalytic activity of SACs can be tailored and eventually engineered by modifying the nature of the support and the local coordination of the metal atom.20 The typical supporting matrices are oxides and 2D materials, most often involving carbon-based supports, such as graphene, N-doped graphene, and carbon nitride.21 In recent years, MXenes, a new family of 2D materials,22 have attracted attention as supporting templates of SACs to catalyze a variety of reactions, including the HER.23–28 MXenes are two-dimensional (2D) transition-metal carbides, nitrides, and carbonitrides, and they are promising in electrochemistry owing to their electronic conductivity and ability to anchor metal atoms and eventually to host alkaline metals in accumulators.22,29,30

MXenes are synthesized from MAX phases, a material family with the general formula of Mn+1AXn (n = 1–3), where M, A, and X correspond to early d-transition metals, main group p-elements, and C and/or N, respectively. Starting from these materials, MXenes can be obtained by selective chemical etching of the A element.29,31 Using selective etchants, it is possible to break the M–A bonds that are weaker than the M−X bonds, resulting in highly active MXene sheets. Depending on the etchants and the subsequent chemical treatment, the thus obtained MXene sheets might be terminated with surface groups, such as fluorine (–F), hydroxyl (–OH), hydrogen (–H), or oxygen (–O).29,32 By selecting the M and X elements, the number of atomic layers in the sheet, and the type of surface terminations, a huge number of possible MXene-based materials can be synthetized, affording the possibility to customize their properties.31,33 Additionally, recent work has shown that the surface termination can be removed or substituted in a rational, predetermined way.34–36 Recently, Oschinski et al.37 used density functional theory (DFT) to study the interaction of first-row transition metals with M2C (M = Ti, Zr, Hf, V, Nb, Ta, Cr, Mo, and W), aiming to provide a rational guide to the design of new SAC catalysts based on MXenes and facilitate the experimental work to synthesize stable systems.

The improvement of computational power is promoting computational (electro)catalysis studies towards an even more ambitious goal, the development of descriptors and data-driven statistical models for the prediction of catalytic activity.38,39 In the case of the HER, this translates to seeking models capable of predicting ΔGH based on a set of defined physical properties (features) of the catalyst. Nevertheless, there are multiple ways to pursue this goal, and one can adopt different data-driven approaches to find relevant correlations between the target variable (ΔGH) and a number of features of varied complexity levels. The drawback of this approach is that it typically results in a complex mathematical expression for ΔGH as a function of the input variables, where the functional form is difficult to interpret in physical or chemical terms. The input parameters often require the full DFT calculations of adsorbed H to be determined, raising some questions on the real predictive power of the schemes and their usefulness. There are several recent studies showing the usefulness of machine learning models in predicting performances across several electrocatalytic applications based on MXene systems.40,41 For instance, Zheng et al.40 utilized a high-throughput screening approach combined with a machine learning (ML) model to predict the hydrogen binding energy and then the HER activity of a total of 299 MXene materials. Their methodology incorporates features such as the d-band centre and Bader charges, which necessitate DFT calculations of the H adsorption on the active site. They found that the random forest algorithm predicts hydrogen binding energy with a root mean square error of 0.27 eV. Likewise, Umer et al. incorporated a comprehensive set of 29 features in their study of rationally designed 364 catalysts by embedding 3d/4d/5d TM single atoms on diverse substrates, like graphitic carbon nitride (g-C3N4), π-conjugated polymer, pyridinic graphene and hexagonal boron nitride, including complicated descriptors such as the Coulomb matrix. Additionally, they integrated DFT-derived features to further refine their predictive model.40 At this stage, we recognize a gap in the availability of a straightforward and interpretable model that relies solely on simple features. Such a model should provide a clear, user-friendly equation for predicting the target using raw feature values, without the need for data standardization. This would enable broader accessibility, allowing anyone to utilize the derived equation for initial screening with ease.

In this work, we performed a computational screening of MXene-based SACs for HER with the aim of providing a straightforward and interpretable model for the prediction of ΔGH from simple variables that do not require any DFT calculations of the SAC or hydrogen adsorption on the SAC. We expanded the dataset of transition metals reported in ref. 37 to the second and third rows, thereby constructing a comprehensive dataset covering the complete transition-metal series. We then considered hydrogen adsorption on the resulting SACs, obtaining a dataset of 208 ΔGH values. We trained statistical models with different numbers of features and levels of complexity to predict ΔGH, by paying close attention to the simplicity of the descriptors and the interpretability of the model in physically meaningful terms. We began with a black-box model and gradually transitioned to a grey-box approach41 by conducting feature importance analysis across models ranging from linear to complex, which essentially provided us with the reason behind the noted differences in model performance. Ultimately, we developed a glass-box model41 by formulating a simple equation for the target using the interpretable and meaningful attributes, relying solely on non-DFT calculated features. Our results show that it is possible to predict a target variable with a linear multivariate model based on tabulated or easy-to-get features with a reasonable accuracy (about 0.17 eV), without the need to invoke complex models or not easily interpretable features. Please note that we do not claim this as the first or only interpretable model; instead, we emphasize its significance as a physically grounded, computationally efficient surrogate that connects chemical intuition with data-driven learning. The provision of a comprehensive analytical expression (eqn (5)), necessitating no preprocessing or scaling, aims to improve accessibility for researchers, especially in experimental contexts where implementing complete ML pipelines may be impractical.

The results of this study could provide a first-order screening of MXene-based SACs for the HER, aimed at identifying promising candidates for more elaborate calculations, including electronic structure and geometrical relaxation analysis, as well as the consideration of other effects, such as multiple hydrogen loading, solvation, pH and applied voltage, to ultimately guide the experimental synthesis and characterization. We remark that the present calculations were conducted using bare M2C (0001) MXene surfaces, which serve as an idealized structural model. Under the experimental conditions, MXenes are generally terminated with surface anions, such as –O, –OH, or –F, or a mixture of them,42 depending upon the synthesis and etching methods employed. These terminations can partially passivate the surface, hence reducing the metal binding strength compared to the uncoated surface. We chose to work with bare surfaces due to the need to identify inherent trends in metal–support interactions and to maintain uniformity across various MXene substrates. While absolute binding energies may be reduced, the comparative trends and feature–property correlations established herein are anticipated to remain valid. Future research will incorporate terminated MXenes to specifically assess the impact of surface functionalization on single-atom stability and hydrogen adsorption energetics.

Computational details

We performed DFT calculations as implemented in the Vienna ab initio simulation package (VASP).43–45 We employed the Perdew–Burke–Ernzerhof (PBE) exchange–correlation functional within the generalized gradient approximation (GGA),46 known for its capability to duly describe metallic systems,47,48 including most pristine MXenes.49 Dispersion contributions were introduced using the D3 Grimme's parametrization.50 The Kohn–Sham equations were solved using an expansion of the valence electron density in a plane-wave basis set with a kinetic energy cutoff of 400 eV, whereas the effect of the core electrons on the valence density was accounted for through the projector augmented wave approach (PAW).51,52 The energy threshold for electronic self-consistency was set to 10−6 eV, while the geometry optimizations were considered converged when the forces acting on the nuclei were all below 0.01 eV Å−1. Structure optimizations were performed on a p(3 × 3) slab supercell, which guarantees a separation of ∼10 Å between the absorbed adatoms, sufficient to avoid interactions between the adatoms on periodically repeating adjacent cells. The Monkhorst–Pack k-point sampling grid was set to 5 × 5 × 1 to perform the necessary numerical integrations in the reciprocal space.31 A vacuum region of 16 Å was included along the c direction in order to avoid interactions between repeated MXenes in the direction perpendicular to the basal plane. We modelled the magnetic ordering based on the most stable forms reported in previous works.53,54 In most of the cases, the most stable form is the ferromagnetic one, except for Ti2C, Hf2C, and Zr2C, which are characterized by antiferromagnetic ordering. The net charges on the atoms were calculated by means of the quantum theory of atoms in molecules (QTAIM) developed by Bader.55–58

MXenes with the M2C stoichiometry, comprising three atomic layers and exhibiting the (0001) surface, were modelled by fully optimizing the atomic structure of the supercell representing the nanosheets using previously optimized lattice parameters.37 Thus, the atomic coordinates of the nanosheet atoms and the adsorbed TM atom were all fully relaxed. For the bare MXene, the atomic structure of the supercell is the same as that of the primitive unit cell. We calculated the binding energy of the anchored transition metal (TM) adatoms, as follows:

 
Eb = ESACEsETM(1)
where ESAC is the total energy of the MXene sheet with the TM adatom attached, Es is the total energy of the bare MXene support, and ETM is the total energy of an isolated TM atom, all considered in the gas phase calculated from the DFT ground-state energies. The binding energy of a metal atom to the support is a relevant quantity as it can be used as a proxy for the stability of the SAC.59 We calculated the hydrogen adsorption energy, as follows:
 
image file: d5ta07143g-t1.tif(2)
where EH* is the total energy of the system with the absorbed hydrogen atom, and EH2 is the energy of an isolated H2 molecule. ΔGH was determined from ΔEH by adding the zero-point energy contribution and entropy terms. The entropy term involves the vibrational contributions of the adsorbed hydrogen atom, and the gas phase hydrogen molecule, with the latter taken from thermodynamic tables.10 Note that this is an approximate method to estimate the hydrogen adsorption free energy since, for instance, solvent effects are neglected. Nevertheless, the aim of this study is to precisely predict this approximate value, affording a reasonably accurate estimate of ΔGH without requiring DFT calculations.

We employed permutation feature-importance analysis for feature selection and conducted a Pearson correlation analysis for the considered features. We also employed various supervised machine learning techniques to predict the features, including traditional Multiple Linear Regression (MLR), Ridge Regression (RD), LASSO, Random Forest models (RF), Extra Trees (ET), Decision Trees (DT), Gradient Boosting (GB), Extra Gradient Boosting (XGB) and AdaBoost (ADB). To evaluate the generalization capability of our ML models, we implemented a careful data splitting approach. Initially, we randomly partitioned 15% of the full dataset as a strict external test set, entirely removed from any aspect of model training, scaling, or feature analysis. The external test set (15% of the full dataset) was extracted once at the beginning of the workflow using a fixed random seed for reproducibility. This held-out subset was kept constant and used consistently across all machine learning models. This ensured that all performance comparisons were made on the same test data, providing a fair and controlled evaluation of the generalization performance. The remaining 85% of the data was subsequently used for training and validation. All model development, excluding the leave-one-out procedure, was performed exclusively on the 85% dataset using 10-fold cross-validation. All data preprocessing, including feature scaling using StandardScaler within Pipeline, and model fitting were conducted solely on the training set within the cross-validation loop. Identical scaling values were utilized for the validation fold and external test sets to prevent leaking. Furthermore, for the tree-based models (e.g., RF, GB), we avoided any early stopping or information leakage from the test set by depending exclusively on internal splits or designated validation subsets. For models that facilitate internal validation (e.g., GB), the validation_fraction was confined to the training set exclusively. In the leave-one-out procedures, the training set for each excluded support or metal was fixed and included all remaining data. Although the held-out subset remained fixed in each case, the models were retrained with internal randomization (e.g., bootstrapping, feature selection), leading to slightly different predictions across runs. The reported error values reflect the average across these 1000 randomized training iterations. We used a ten-fold cross-validation method with a fixed random seed (seed = 42) to provide a fair and robust evaluation while preventing the overfitting of our model. Mean absolute error (MAE) was used as a metric for the ML models. MAE defines the average magnitude of errors between the values predicted by our ML model and those obtained from DFT calculations. For all the ML model training and testing calculations, we used the Scikit-learn library.60 The importance analysis was based directly on the intrinsic parameters of each model type, rather than on permutation scores. For linear models (LR, RD, LASSO), we calculated importance using the absolute standardized regression coefficients, which measure each descriptor's contribution to the target following feature scaling. For tree-based and ensemble models (Decision Tree, Random Forest, Extra Trees, Gradient Boosting, AdaBoost, and XGBoost), we utilized the inherent mean decrease in impurity (MDI) values, which indicate the average variance reduction related to each feature across all decision nodes. To guarantee robustness, all importances were averaged across 100 randomized 80/20 train-test splits and normalized for each model prior to aggregation.

Results and discussion

Dataset of the MXene-based SAC and their computed HER activities

We created a database of 208 SACs made by 26 TMs anchored to eight bare M2C (0001) MXene surfaces. Fig. 1 shows the binding energies and Bader charges of the TM atoms and supports considered in this study. We selected about 200 systems, representing an acceptable size of the database to provide statistically meaningful results. Starting from the data in ref. 37, which focused on the interaction between first-row transition metals and MXene substrates, we expanded the calculations to the TMs of the second and third rows. In this study, the adsorption stability of transition-metal atoms on MXene substrates was evaluated based on the calculated binding energy, Eb, where negative values indicate exothermic adsorption and, thus, stable anchoring of the metal atom. Accordingly, only systems exhibiting negative binding energies (Eb < 0) were considered thermodynamically stable and retained in the dataset. In contrast, combinations yielding positive or near-zero binding energies (Eb ≥ 0) were classified as unstable, as they represent endothermic adsorption processes and weak metal–surface interactions. For Zn, Cd, and Hg, several MXene combinations produced positive or marginal binding energies, confirming negligible binding. These systems were therefore excluded from further analysis to ensure that the database and subsequent machine learning models are constructed exclusively from energetically and structurally stable SAC configurations.
image file: d5ta07143g-f1.tif
Fig. 1 Properties of MXene-based SACs. Binding energies Eb (a) and Bader charges ΔQ (b) of transition metal atoms (TM = Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Y, Zr, Nb, Tc, Mo, Ru, Rh, Pd, Ag, Hf, Ta, W, Re, Os, Ir, Pt and Au) on M2C (0001) MXenes (M = Ti, V, Zr, Nb, Mo, Hf, Ta, and W). Vertical lines separate the three transition metal series.

The interaction of the different first-row adatoms on each of the MXene surfaces considered has been investigated by considering the preferred adsorption sites found by Oschinski et al.37 Regarding the second and third TM row atoms, we investigated four different high-symmetry sites: the atop site (T) where the adatom is placed directly above an M surface atom, the bridge site (B) corresponding to the midpoint of the M–M bond, and two hollow sites that correspond to carbon (hcp) and empty (fcc) sites (see Fig. S1 of SI). After geometry optimization, we found that the hcp and fcc sites are preferred in most cases, regardless of the adatom or substrate. We found that metal adsorption is favourable for all systems considered, with binding energies ranging from −2.6 to −8.2 eV, as shown in Fig. 1a. As already discussed in ref. 37, the energetic trends depend not only on the adatom but also on the MXene substrate, Fig. 1a. Three similar patterns are observed for the three TM rows (3d, 4d, 5d), with a W-shaped behaviour consisting of two local minima and three local maxima for the binding energy, known to be related to the special stability of d5 and d10 transition metals in vacuum.61,62 Moving along each of the series (Sc–Cu, Y–Ag, Hf–Au), the value of Eb decreases until the first minimum (Ti, Zr/Nb, Ta), after which it starts to increase, reaching the first maximum (Cr/Mn, Tc, and W/Re, which have a half-occupied nd shell). Eb then decreases, reaching the second local minimum (Ni, Ru, Os/Ir), and finally increases to reach the second maximum (Cu, Ag, Au, which have a fully occupied nd shell). In other words, elements with nearly half-filled or filled nd shells are characterized by a weak interaction with the surface. The effect of the substrate can be observed as a change in the values of Eb, which follow the same pattern along the series. In every series, this effect is more marked in the early elements of the row, where the Eb differences between the same TM in different substrates are more pronounced and can reach several eV, while for the late elements of the row, the effect of the substrate is less noticeable, typically within 1 eV. Interestingly, the order of stability for the adsorption of the same TM atom on different substrates is maintained for the most part, with a few exceptions. In general, Mo2C and W2C bind the adatoms with the most negative energy values, while Hf2C and Zr2C form the weakest bonds. The absolute maximum values, corresponding to weak bindings, for the different TMs are achieved by the following systems: Cr@Ta2C, Mn@Ti2C, Ag@Hf2C and Ag@Ta2C, while the absolute minimum values, corresponding to strong bindings, are achieved by Ru@V2C, Os@V2C and Ir@Ti2C. A similar trend (especially along the 3d series) is observed for the average distance between the TM and the three nearest neighbours (cf. Fig. S2a), indicating that very negative binding energies correspond to short distances.

The degree of charge transfer between the transition metal and the substrate has been analysed to gain more insight into the interactions between the TM and the eight substrates. Fig. 1b reports the excess TM charge (ΔQ) calculated for the supported TM atom and obtained from the analysis of the electron density through the QTAIM developed by Bader.55–58 Interestingly, the same pattern is reproduced on every substrate, and a similar parabolic trend repeats for the three TM series (see Fig. 1b). Another clear tendency is observed depending on the type of metal constituting the substrate. In fact, it can be observed that the charges on the adatoms adsorbed on d2 M2C (Ti2C, Zr2C, Hf2C), d3 M2C (V2C, Nb2C, Ta2C), and d4 M2C (Mo2C, W2C) follow similar well-separated trends along the d series. Furthermore, the sign of the excess charge ΔQ depends on the type of adatom–substrate combination. For example, the values of ΔQ for the d2 M2C substrates are negative in most cases, except for Sc and Y, Fig. 1b. For d4 M2C, the values of ΔQ are mainly positive with some exceptions, while for d3 M2C, the ΔQ values are more TM-dependent. Clearly, the oxidizing character of the d4 M2C MXenes is larger than that of their d3 M2C and d2 M2C counterparts. As expected from the electronegativity trend, the early TMs in each d series tend to donate charge to the substrate or remain neutral species, while electrons are transferred from the substrate to the late TMs, which become partially reduced.

The magnetization of the TMs anchored on MXene substrates also follows a periodic trend, as shown in Fig. S3 of SI. Early and late TMs in the d-series expose a small number of unpaired electrons, while the atoms in the middle of the series exhibit large spin values, regardless of the MXene substrate.

Once the trends in the properties of the TM@MXene systems have been obtained, the adsorption free energy of H on the SAC, considered as H*, becomes a key intermediate in the HER. For each TM@MXene, a hydrogen atom was adsorbed on top of the TM. Interestingly, the optimized distances between the hydrogen adatom and the TM of the SAC follow a periodic trend (Fig. 2a), with a parabolic shape similar to the one obtained for the TM charge reported in Fig. 1b. For each d-series, the TM–H distance is the largest for the first atom in the series and then decreases, reaching a minimum for Group X metals (Ni, Pd and Pt), except for Ti2C, Zr2C and Hf2C substrates, where the minimum occurs earlier in the series. We also noticed a strong correlation between the TM–substrate distance before and after hydrogen adsorption, indicating that the adsorption of H affects the interaction of TM with the substrate similarly for all considered systems (see Fig. S2 of SI).

The obtained values for the H adsorption free energy are reported as a function of the TM (Fig. 2b). Unlike the case for the other quantities discussed above, here, it is not possible to extrapolate a specific trend or relationship with the substrates or the d-series. The energy results are distributed in a range of values between −0.3 and 1.2 eV, and the adsorption Gibbs free energy is negative only for a few systems. The lack of trend for ΔGH with the TM or the substrate raises the question of whether any physical property of the SAC can be used as a descriptor for this quantity. To address this, we are trying to design an interpretable machine learning model that utilizes simple, easily accessible features, aiming to provide a transparent and intuitive predictive framework.


image file: d5ta07143g-f2.tif
Fig. 2 Hydrogen adsorption on TM@MXene SACs. Hydrogen–metal distances (a) and adsorption Gibbs free energies ΔGH (b) of TM@MXene SACs (TM = Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Y, Zr, Nb, Tc, Mo, Ru, Rh, Pd, Ag, Hf, Ta, W, Re, Os, Ir, Pt, and Au; M = Ti, V, Zr, Nb, Mo, Hf, Ta, and W). Vertical lines separate the three transition metal series.

Dataset distribution

An essential initial stage in constructing a reliable machine learning model for predicting ΔGH, and hence, the catalyst performance in the hydrogen evolution reaction involves verifying that the dataset consists of a broad spectrum of adsorption energy values. The dataset distribution of the ΔGH values is depicted in Fig. 3a, which exhibits a wide enough distribution of values, with the bulk of data points centred within the range of 0.4 to 0.7 eV. The general distribution of the dataset closely approximates a normal distribution. A normal distribution in machine learning model prediction is important because it ensures the training data is balanced and representative of a wide range of outcomes. This helps minimize bias and prevents the model from overfitting to specific patterns in the data, promoting more generalizable learning.
image file: d5ta07143g-f3.tif
Fig. 3 (a) Data distribution plot of the target property, ΔGH. (b) Features of the SAC systems used to build the predictive model for ΔGH on MXene-based SACs: electronegativity of the TM (χTM), number of TM valence electrons (NVAL), TM covalent radius (RTM), TM first ionization energy (IE), binding energy of the TM to the support (Eb), mean distance of the TM to the support (d), metal charge (ΔQ), support electronegativity (χs), binding energy of the isolated H–TM complex (ΔEisoH), and bond length of the isolated H–TM complex (disoH).

Feature selection

We have collected a number of physical properties as possible descriptors (features) of ΔGH on MXene-based SACs. The features have been chosen with the following criteria: (i) the descriptor should have a simple expression and be easily obtained, ideally tabulated; (ii) the descriptor does not involve the DFT calculation of the properties of H* on the SAC, which would question its usefulness; and (iii) the property of interest should depend on a limited number of features to avoid overfitting and to enable physical interpretation of the predictive model. Specifically, three different groups of variables have been considered in this study, as summarized in Fig. 3b.

The first group contains information describing the properties of the TM constituting the SAC. The second group includes information about the supports and their interaction with the TM. The last group accounts for the interaction of SAC with hydrogen. The features of the first group are the electronegativity of the TM (χTM), the number of TM valence electrons (NVAL), the TM covalent radius (RCOV), the first ionization energy of the TM (IE), and the electronegativity of M in the MXene substrate (χS). The data of this group of features are tabulated63 and do not need any simulation. The second group contains the following features: binding energy of TM to the support (Eb); mean distance between the adsorbed TM and the nearest neighbouring atoms of the substrate (d); and charge accumulation or depletion of the metal from the substrate (ΔQ), calculated by taking the difference between the number of valence electrons in the isolated TM atom and the number of electrons in the anchored state, as provided by the Bader method. This quantity represents the charge acquired by the TM upon adsorption. The data of these features require the DFT geometry optimization of the bare MXene support and of the TM adsorbed on the support. The third group of features includes the binding energy of isolated diatomic H–TM molecules (ΔEisoH) and the corresponding bond length (disoH). These two variables involve calculations of isolated TM atoms and their interaction with a hydrogen atom, and therefore, they do not involve the simulation of specific supported SACs. Considering all three groups, we have a total of 10 features for designing the model. An increase in the number of features occasionally improves the accuracy of machine learning models by supplying additional information for predictions, but at the same time, it adds complexity to the model and possibly generates overfitting issues.

To check the reliability of our predictive model, we incorporated supplementary properties that, although not directly associated with hydrogen evolution activity, may nonetheless provide significant insights. These properties encompass electron affinity, atomic mass, melting point, boiling point, the principal quantum number of the valence shell, and TM density, as it is known that in SACs, the regulation of atomic density can affect the catalytic performances.64 The resulting expanded dataset currently has a total of 16 features. We aim to compare the performance of our model utilizing both the original collection of 10 features and the combined set of 16 characteristics, enabling us to assess the impact of these new properties on the model's accuracy and predictive efficacy.

Correlation among features

We first verified the correlations among the selected features. A correlation heatmap is an effective visualization technique that illustrates the linear correlations among several variables, with correlation coefficients spanning from −1 (strong negative correlation) to +1 (high positive correlation). The heatmap offers a clear and rapid visualization of feature interactions by color-coding values, usually employing red for positive and blue for negative correlations, see Fig. 4a. This facilitates the identification of robust correlations, assists in feature selection for predictive models, reveals potential redundancies, and provides insight into the fundamental properties influencing system behaviours.
image file: d5ta07143g-f4.tif
Fig. 4 (a) Correlation matrix for the features considered in this work, defined in Fig. 3. (b) Performance of individual features for predicting the target property, namely ΔGH by a linear regression model, which is quantified as MAE. Features are ranked according to their MAE values, with lower values signifying better predictive performance.

We notice a relatively strong relation between the TM properties in the upper-left corner (from χTM to IE), indicating that they are not completely independent. It can also be seen that ΔQ has a significant correlation with the TM properties. Furthermore, χS has no significant correlation with any other variable. Similarly, the properties of SAC@MXene, i.e., Eb and d, do not correlate strongly with each other or with the substrate electronegativity. Finally, the properties of the TM–H diatomic molecule, ΔEisoH and disoH, do not show any strong correlation with other properties, other than the weak correlation of ΔEisoH with Eb. Metrics such as ΔQ, IE, and distance-related measures exhibit various degrees of correlation, illustrating the complex interplay of structural, electronic, and magnetic features that affect the hydrogen adsorption energetics. The association among these properties offers insight into their combined effect on hydrogen evolution, potentially informing the selection of catalysts for enhanced performance.

Linear regression with a single feature

To create a straightforward, efficient model with a limited number of features capable of reliably predicting the target, we initially assessed the predictive power of each feature. Before designing a linear regression model, we analysed the correlation between each feature and the target variable, ΔGH, as illustrated in the scatter plot (see Fig. S4). Each column represents a distinct feature plotted against the target, utilizing raw feature values and DFT-calculated free energies to preserve analytical simplicity. The results demonstrate that individual features do not have sufficient predictive power to precisely predict the target variable. Consistent with the analysis from Fig. S4, Fig. 4b shows that almost all features display higher MAE values (>0.180 eV), implying limited predictive accuracy when utilized individually. To further investigate this issue, we analysed parity plots (Fig. S5 of the SI) for each feature, which demonstrated nearly linear trends for almost all of the features. This linearity signifies an absence of correlation between individual features and the target, hence confirming the observed high MAE values. Interestingly, a few features, namely IE, d, and NVAL, have a lower MAE than all other features, and they also exhibits slightly better alignment (see Fig. S5) with the target. Due to the lack of strong correlations between individual features and the target, as confirmed by Fig. 4b, S4 and S5 of SI, a multivariate regression model may be more effective in predicting ΔGH, as it considers the combined effect of several descriptors.

Multivariate linear regression

The lack of correlation of ΔGH with the individual physical properties suggests that one descriptor is not enough for the prediction of this quantity. Consequently, a multivariate linear regression was performed, using the ten variables described above to predict the hydrogen adsorption free energy. The data were standardized to take into account the different scales of the features using the mean and standard deviation, and a set of 11 coefficients was optimized (one for every variable and one for the constant) by minimizing the distance between a hyperplane in the variable space and our target, the H adsorption Gibbs free energy ΔGH.

The predicted values with the multivariate linear method against the original DFT data are reported in Fig. 5a. Overall, the model provides a MAE of 0.159 eV, which is comparable to the intrinsic error of the DFT methodology. This result is quite remarkable since it indicates that calculating the adsorption free energy from the multivariate linear model would not induce a loss of accuracy. Fig. 5a demonstrates the superior prediction power of the multivariate linear regression model relative to the single-feature-based linear regression model. Although the individual features had a nearly linear, low-correlation relationship with the target variable (cf. Fig. S5 of the SI), the integration of several features in a linear regression framework shows a robust correlation with the target, as indicated by the close clustering of points around the parity line (see Fig. 5a). The enhanced correlation indicates that the multivariable model effectively captures complex interactions between features and the target variable, which individual features could not accomplish independently.


image file: d5ta07143g-f5.tif
Fig. 5 Multivariate linear regression. (a) Parity plot of the multivariate linear regression for ΔGH as a function of the features considered in this work, defined in Fig. 3. The shaded grey area corresponds to ± 0.2 eV, which corresponds to the typical errors of DFT calculations. (b) Analysis of MAE for several machine learning models, namely, Linear Regression (LR), Ridge (RD), LASSO, Extra Trees (ET), Random Forest (RF), Decision Tree (DT), Gradient Boosting (GB), AdaBoost (ADB), and XGBoost (XGB).

Comparison with complex ML models

To improve predictability, we gradually increased model complexity by exploring new methods. We assessed regularized linear models, including Ridge (RD) and LASSO, subsequently analysing tree-based models, such as Random Forest (RF), Decision Tree (DT), and Extra Trees (ET). Furthermore, we evaluated boosting models, comprising Gradient Boosting (GB), AdaBoost (ADB), and XGBoost (XGB). This thorough assessment enabled us to compare the performances of these advanced models with that of our simple multivariate linear regression model, offering insights into the balance between model complexity and prediction accuracy. To compare the predictive power of all models, we calculated the corresponding MAE values, and the result is shown in Fig. 5b. The comparison of the MAE values among various models illustrates the impact of model complexity on prediction accuracy. Simpler linear models, such as LR, RD, and LASSO, produce comparable MAE values ranging from 0.157 to 0.187 eV, thereby defining a baseline performance standard. The introduction of more sophisticated tree-based models yields an enhancement, especially with the ET model, which attains the lowest MAE of 0.135 eV. The RF model outperforms the linear models with a MAE of 0.143 eV, while the DT underperforms (0.193 eV) compared to simpler methods. Boosting techniques, such as GB, ADB, and XGB, improve accuracy, with GB attaining a notable MAE of 0.133 eV. The results indicate that models such as ET and GB offer relatively higher accuracy compared to more simplistic linear and tree-based approaches. However, it is also clear that increasing the model complexity does not improve the accuracy of multivariate linear models significantly, which indicates that our multivariate linear model provides an accuracy similar to those of the complex models.

As shown in Fig. 4a, some of the tabulated transition-metal properties, particularly electronegativity (χTM), ionization energy (IE), valence electron count (NVAL), and covalent radius (RTM), exhibit moderate to strong pairwise correlations (∼0.8), reflecting their shared periodic origin. However, the single-feature regression results presented in Fig. 4b show that none of these descriptors alone can accurately predict ΔGH (all single-feature MAEs > 0.18 eV), indicating that, despite their interdependence, they capture complementary physicochemical information rather than redundant relationships. To ensure that these correlations do not compromise the model's robustness, we applied regularization techniques (Ridge and LASSO regression), which constrain large coefficient magnitudes and stabilize correlated inputs. The regularized models produced nearly identical MAE values (0.16–0.18 eV) compared to the unregularized linear regression, confirming that multicollinearity does not significantly affect predictive reliability. Furthermore, to verify this conclusion, we performed an additional analysis by removing the most strongly correlated features (IE, χTM, NVAL, and RTM) and re-evaluated the linear regression model. The resulting MAE increased slightly from 0.159 eV (all features) to 0.179 eV (after feature removal), demonstrating that the correlated descriptors contribute synergistically to the model's predictive accuracy without introducing instability. These results confirm that the inclusion of these physically meaningful descriptors enhances the model's performance while maintaining interpretability.

Diving towards the grey-box model: feature-importance analysis

Thus far, our methodology has predominantly depended on a black-box model, yielding predictions without an adequate understanding of the underlying factors influencing the outcomes. In order to overcome this constraint, we aim to clarify the underlying causes for the results reported across various machine learning models, from multivariate linear regression to more complex models. We gradually transitioned from a black-box approach to a grey-box and, finally, to the simpler, interpretable glass-box model, as shown in the schematic diagram in Fig. 6. For the grey-box approach, we performed a feature-importance analysis for each model to ascertain which features most significantly influence the predictions. This improves interpretability, which will essentially bridge the gap between the model's prediction performance and physical understanding.
image file: d5ta07143g-f6.tif
Fig. 6 Schematic for shifting from a black-box to a glass-box model for predicting ΔGH. The black-box model produces predictions lacking interpretability, whereas the grey-box model integrates feature-importance analysis to identify key contributing factors. The glass-box model simplifies the prediction process by establishing a clear equation derived from a limited number of interpretable features, hence improving transparency and usability. The downward arrow signifies the increasing interpretability of the models.

First, we performed a feature-importance analysis to further investigate the findings of the multivariate linear regression model. This is essential for identifying the features that most significantly affect the model predictions, elucidating the key factors that influence the target variable. Understanding feature importance allows us to explain the model with greater clarity and transparency. The feature importance has been evaluated by the absolute values of the coefficients, as reported in Fig. 7a. Interestingly, from the feature-importance analysis, we found a comprehensive overview that connects the multivariate linear regression outcomes with the insights gained from individual feature performance. In the single-feature analysis, NVAL and IE were identified as the features exhibiting relatively low MAE values, indicating that they possessed the most robust individual correlation with the target variable. The significance of NVAL and IE is further validated in the multivariate model, where they constitute approximately 24% and 18% of the model's predictive capacity, respectively, highlighting their essential functions in both univariate and multivariate cases. Likewise, ΔEisoH and χTM, which had moderate correlations in the single-feature analysis, became significant in the multivariate model. This indicates that although these features may individually possess inadequate predictive power, their synergistic effect with other factors improves the model's accuracy. The minimal contributions of the other features in both univariate and multivariate models indicate that they offer additional information but are not robust independent variables. This comprehensive analysis highlights the need to integrate essential features, such as IE, NVAL, and χTM, to enhance the predictive model, utilizing their distinct and complementary insights into the target variable.


image file: d5ta07143g-f7.tif
Fig. 7 (a) Feature-importance analysis for multivariate linear regression (b) and gradient boost models.

Similarly, to closely examine the predictive accuracy of complex models, we have performed feature-importance analysis for each model and found an interesting correlation between feature distribution and model performance. For this analysis, we considered feature-importance analysis for models like GB and compared the results with the MLR results (Fig. 7b). For all other models, the analysis is shown in Fig. S6 of SI. The feature-importance distributions for the MLR (cf.Fig. 7a) and GB (cf.Fig. 7b) models illustrate the effect of increased model complexity on feature utilization and prediction accuracy. In the simplified MLR model, importance is predominantly focused on a limited number of essential features, which are mainly from Group 1, with IE, NVAL, and χTM collectively prevailing, signifying a dependence on these primary descriptors. In the more complex GB model, feature importance is more evenly distributed across a broad array of all feature groups, indicating that the GB model identifies further interactions among descriptors and utilizes a wider spectrum of information. This distributional shift corresponds to the rather superior accuracy of GB relative to that of MLR (MAE values: 0.133 vs. 0.159 eV), demonstrating that complex models can be tailored to more effectively capture nuanced patterns in the data by assigning appropriate weights to a greater number of relevant features, hence improving predictive performance.

To gain more insight from the feature-importance analysis and to delve more into the grey-box approach, we created a heatmap (see Fig. 8a), which provides insight into the feature importance of each model. Linear models tend to prioritize a few easily accessible features (mainly from Group 2 & Group 3), whereas complex models distribute importance across a wider range of features. To better understand how each model assigns importance to feature groups, we analysed the performance of each model by evaluating each group individually, as shown in Fig. 8b. Fig. 8b presents a comparison analysis of the feature-group importance across various models, demonstrating the distribution of importance among Group 1, Group 2, and Group 3 features for each model. Simple models, such as LR, RD, and LASSO, mostly focus on Group 1 and Group 2 features, indicating their dependence on basic properties for predictive power. With the enhancement of model complexity, tree-based models (e.g., RF, ET, DT) and boosting techniques (e.g., XGB, GB, ADB) tend to allocate importance more uniformly among all groups, with substantial emphasis placed on features from Group 2 and Group 3.


image file: d5ta07143g-f8.tif
Fig. 8 (a) Heatmap illustrating feature importance for diverse models, depicting the percentage contribution of each feature to the predictions of the models. The rows denote various models, while the columns specify particular features. Darker colors indicate more feature importance. (b) Feature-importance distribution across different machine learning models, categorized by feature groups (Group 1, Group 2, and Group 3). The chart illustrates how simpler models, such as Linear Regression (LR), Ridge (RD), and LASSO, heavily rely on Group 1 and Group 2 features, while more complex models, including Random Forest (RF), Extra Trees (ET), and boosting methods (XGBoost, Gradient Boosting (GB), and AdaBoost (ADB)), allocate importance more evenly across all feature groups.

This transition signifies that complex models gain from a wider range of feature interactions, compiling further information beyond the basic features in Group 1. This distributional pattern reinforces the notion that simpler models emphasize essential aspects, but more advanced models utilize a broader array of properties across groups, hence improving their capacity to identify complex interactions within the data. For all complex models, we computed the correlations between each group of features and the target (see Fig. S6–S13 of SI). In complex models, such as tree-based and boosting models, features from Group 2 and Group 3 demonstrate non-linear correlations with the target variable. In these models, data points are concentrated around the parity line, signifying robust prediction capability. Conversely, in linear models, the features of Group 2 and Group 3 exhibit almost linear plots (see Fig. S14 of SI), with negligible connection to the target.

Model predictive power

Since the previous sections indicate that the simpler MLR model performs quite well compared to more complex models, we focus on MLR alone to assess its prediction accuracy. Once the accuracy of the model is established, one needs to verify its stability and predictive power. In order to assess the predictive power of our model, we divided the dataset into training and testing subsets. Three different approaches were carried out. In the first, we generated the training set by randomly extracting 75% of the original points. The coefficients of the linear model were then determined, and the model was used to predict the H adsorption energy of the remaining 25% of the dataset (testing set). That procedure was iteratively repeated 103 times, extracting, in a random way, the training and testing sets. For each case, the MAE was computed and then averaged for each iteration. Remarkably, we did not observe a sizable loss in accuracy compared with the case where the entire dataset was included in the model. We found an averaged MAE of 0.16 eV, nearly unchanged from that of the original fitting, indicating the good predictive power of the model. In addition, during each iteration, the regression coefficients were stored, and the mean was computed. The mean absolute deviation of coefficients was performed, and relative errors were considered to quantify the variations of the models trained with different datasets. Once again, negligible changes occurred.

Next, we trained the model on a subset of the dataset, which included all data except those associated with a single support. The testing set was created using SAC supported by the newly excluded support, employing the leave-one-out method. The schematic diagram for this leave-one-out method is shown in Fig. 9a for clarity. This method enables the assessment of the model's predicted performance for each substrate without prior exposure to that particular substrate, thus evaluating the model's predictive capacity across various substrates. We iterated the method for all eight substrates 1000 times, obtaining an averaged MAE of 0.176 eV and a maximum MAE of 0.250 eV (W2C).


image file: d5ta07143g-f9.tif
Fig. 9 (a) Scheme for the leave-one-out method employed to evaluate the predictive power of our model, wherein the data of each substrate is systematically excluded as an unseen test set, while the model is trained on the data from the other seven substrates. (b) The distribution of the Mean Absolute Error (MAE) for each substrate in the leave-one-out substrate experiment. The violin plots illustrate the spread and central tendency of MAE values throughout the 1000 iterations, with narrower distributions signifying more consistent predictions.

The violin plot (Fig. 9b) illustrates the distribution of MAE values for each substrate across 1000 experiments, offering insight into the variability and consistency of the model's predictions. Substrates such as Nb2C and V2C have low and closely grouped MAE distributions, with mean values of about 0.141 and 0.144 eV, respectively. Thus, the model predicts these substrates with precision and consistency over iterations. Conversely, W2C and Mo2C have relatively elevated average MAE values (0.250 and 0.223 eV, respectively) and wider dispersion, indicating reduced predictive reliability. The shape and spread of each violin plot show the model's stability across many substrates, with narrower distributions and lower MAEs indicating more accurate performance. Upon analyzing the reason behind the different prediction accuracies for different substrates, we found that substrates like Nb2C and V2C have a smaller distribution of target variables (approximately 0.13–1.07 eV), whereas for substrates like W2C, Mo2C, the distributions of target variables are higher (approximately −0.26 to 1.17 eV). Finally, the same leave-one-out procedure was carried out by including all substrates and iteratively excluding one adsorbed metal at a time, so that the model is tested on a single metal atom never seen before. The averaged MAE is 0.232 eV, slightly higher than the previous values. The corresponding violin-plot for the leave-one-out experiment for metals is shown in Fig. S15 of SI. We identified Ru and Rh as being critical, as excluding them from the average calculation decreased the MAE to 0.174 eV. It is important to note that in this last assessment of the predictive power of the model, the testing dataset is quite small (only eight data points), which could affect the accuracy. These assessments show the robustness and portability of the resulting model. Overall, the model can predict the hydrogen adsorption energy of unknown systems, and the method could be a starting point to screen new catalysts based on similar substrates.

It is to be noted that to evaluate the generalization and robustness of our model, we conducted a series of validation procedures, including ten-fold cross-validation and leave-one-out tests. In the tests, each substrate and metal was systematically excluded from the training set and subsequently predicted as novel systems (see Fig. 9 and S15, SI). These supplementary methods guarantee that the model's predictive power is not limited to a particular subset of the dataset but encompasses various substrates and transition metals throughout the examined domain. While validation against an independent external dataset would yield the strictest assessment of transferability, such data are presently inaccessible for SAC@MXene systems with equivalent DFT accuracy and descriptor definitions. Nevertheless, the uniform results achieved in all internal evaluations, along with the model's physically interpretable parameters, indicate its reliable applicability for screening novel and compositionally analogous systems.

Further simplifying the linear model: glass-box model

We attempted to achieve our final goal of minimizing the computational cost and maximizing interpretability by proposing a model containing only features that require neither the DFT simulation of H* on the SACs, as in the model discussed above, nor the SACs themselves. For this reason, we performed a similar analysis by considering only Group 1 features first, which essentially do not require any DFT computations.

Further, we performed analysis with Group 2, Group 3 and simultaneously Group 2 & 3 features, separately (see Fig. S16 of SI). Interestingly, we found that only Group 1 features correlate well with the prediction of the target, whereas other features exhibit poor correlation. This finding corresponds well with the outcomes of the feature-importance analysis (cf.Fig. 6b), which showed the considerable significance of features like IE and NVAL, all of which are classified as Group 1 features.

To construct a glass-box model, we first formulated a reduced equation by employing feature-importance analysis only for Group 1 features. This method enabled us to establish a clear, interpretable connection between the chosen features and the ΔGH target variable. The coefficients in the equation were established according to their impact on the model's predictive efficacy, ensuring that each term accurately represents the influence of its corresponding feature on ΔGH. At first, we formulated a clear equation using all the features used in the study, i.e. using all 10 features, where the MAE is 0.16 eV, and the corresponding equation is as follows:

 
ΔGH = 1.39 + 0.78·RTM − 0.74·disoH + 0.35·χTM − 0.29·χs + 0.24·ΔQ − 0.23·IE + 0.16·ΔEisoH + 0.08·NVAL + 0.06·d − 0.04·Eb(3)

After that, we employed standardized features (see Fig. S17a of SI for feature-importance analysis), whereby each feature was normalized to possess a mean of zero and a standard deviation of one. The simple equation is given as follows:

 
ΔGH = 0.91 + 0.41·RTM − 0.20·IE + 0.10·χTM − 0.09·χS + 0.07·NVAL(4)

This equation offers a straightforward, interpretable, and computationally efficient method to estimate ΔGH on SACs at the considered MXenes, utilizing a small set of readily available features with a MAE of 0.17 eV (see Fig. S17b of SI for parity plot). To further simplify our model, we explored the possibility of using non-standardized features to assess the predictive efficiency while ensuring ease of application. By utilizing raw feature values, we aimed to create a model that eliminates the need for data preprocessing, implying that no prior standardization is required. This is important because no standardization coefficients are required, allowing anyone to perform an initial screening without requiring specialized programming or statistical transformations. This approach makes the model even more accessible, particularly for experimental researchers and practitioners looking for a quick and intuitive way to estimate ΔGH. Interestingly, when we switched to non-standardized features, we observed a slight shift in the MAE values (0.18 eV, see Fig. S18 of SI for parity plot) as in the standardized case, and the corresponding simplified equation is given as follows:

 
ΔGH = 0.34 + 0.76·RTM − 0.19·IE − 0.15·χS + 0.11·NVAL + 0.01·χTM(5)

This finding is particularly valuable, as it confirms that the simplified, non-standardized model can be used directly with raw data, making it a practical tool for initial screening and decision-making in material selection. It is essential to understand that, although the glass-box model (eqn (4) and (5)) offers a highly interpretable analytical connection grounded purely upon tabulated atomic properties, this simplification inherently imposes constraints on the prediction accuracy. The parity plots (Fig. S17b and S18 of SI) indicate that specific systems, especially those at the extremities of the ΔGH range, display significant dispersion in relation to the parity line. This dispersion may result from localized geometric and electronic properties that are not explicitly captured by the basic atomic descriptors employed in this glass-box model. Nevertheless, the glass-box equation effectively captures the trends within the dataset, attaining a MAE of roughly 0.17 eV, which is analogous to the intrinsic error of DFT-calculated adsorption free energies. This accuracy is sufficient for fast, preliminary evaluation and for differentiating between distinctly promising and less-effective catalyst candidates. In contrast to the single-feature models (e.g., solely ionization energy or valence count) and models utilizing Group 2 or Group 3 features, which demonstrate minimal predictive capability and produce parity plots nearly parallel to the x-axis (Fig. S5, S16 of SI), the Group 1-based model displays a distinct correlation with the DFT reference values (Fig. S18). This indicates that, despite its simplicity, the model effectively captures significant multivariate structure–property correlations. Consequently, we emphasize that the glass-box model should be considered a physically significant, predictive framework that prioritizes transparency and accessibility over high-precision quantitative estimations. This result validates the efficiency and simplicity of our model, demonstrating that precise predictions may be achieved without supplementary DFT calculations, thereby reinforcing our aim of developing a computationally efficient and interpretable model. Note also that to assess if interpretability undermines prediction performance, we compared our linear models with various sophisticated, high-capacity algorithms, including RF, ET, GB, ADB, and XGB (refer to Fig. 5b). The top-performing black-box models, including GB and ET, attained MAE values of approximately 0.133–0.135 eV, somewhat lower than the 0.159 eV produced with the multivariate linear regression model. The minor discrepancy of 0.02–0.03 eV falls within the intrinsic error of the DFT-calculated adsorption free energies (about 0.10–0.20 eV), suggesting that our interpretable methodology maintains almost the same predictive capability. Thus, the linear model presents an advantageous compromise—attaining DFT-level precision while establishing transparent, physically significant correlations between descriptors and ΔGH. This illustrates that interpretability may be achieved without much compromise in prediction performance, rendering the suggested model both scientifically accessible and computationally efficient for expedited catalyst screening.

Discussion and conclusions

Our thorough investigation has uncovered substantial insights regarding predictive performance and feature importance across various modelling methodologies. The greater accuracy of complicated models, including tree-based methods (e.g., RF and ET) and boosting models (e.g., GB and XGB), is due to their superior ability to utilize features from all groups of variables (Group 1, Group 2, and Group 3). These models allocate feature importance more uniformly, enabling them to identify non-linear correlations and complex interactions among many types of features. Conversely, LR focuses predominantly on Group 1 and Group 2 features, attaining good prediction accuracy through a limited selection of features. The emphasis on simple features (mainly Group 1 features) in simpler models, especially LR, is significant as these properties are easily obtainable from conventional chemistry handbooks, incurring no further computational costs. To ascertain if enhanced model accuracy could be attained through increased complexity, we combined the feature set by incorporating six additional periodic-table-derived descriptors, resulting in a total of 16 features. These were selected to preserve simplicity and universal applicability while somewhat enhancing the descriptor diversity. Nevertheless, as illustrated in Fig. S19, models trained on this combined feature set did not demonstrate any significant improvement in performance. The MAEs for all regressors were essentially constant, with variations of less than 0.01–0.02 eV overall. This indicates that the primary 10 features encompass the most chemically important information for predicting ΔGH in this dataset. It further shows the reliability of our model, as well as the features.

Conversely, the properties of Group 2 and Group 3 necessitate additional DFT calculations, which may be computationally expensive, especially for large datasets. Thus, although complex models provide enhanced accuracy, they require greater processing resources owing to their dependence on features from Group 2 and Group 3. The LR model attains satisfactory accuracy while reducing computing expenses, offering an attractive trade-off between accuracy and efficiency. This trade-off indicates that, based on the application requirements, the LR model may be the ideal selection for scenarios when computational efficiency is paramount.

We finally underline that the final model predicts the hydrogen binding energy with an accuracy of 0.170 eV using only quantities that do not require any atomistic simulation of the SACs. In this respect, the model could be of help to provide a first-order approximation for the screening of SACs for the HER based on M2C materials like MXenes, followed by more accurate calculations on promising candidates. The MAE values for the best-performing models (0.13–0.18 eV) are comparable to the intrinsic accuracy limits of the DFT calculations, often ranging from 0.10 to 0.20 eV for the hydrogen adsorption free energies, depending upon the functional and dispersion corrections employed. From a practical viewpoint, this level of accuracy suffices for the initial screening of the MXene-based SACs, facilitating the dependable identification of interesting candidates with near-optimal ΔGH values (ΔGH < 0.2 eV). The model is designed not to substitute comprehensive DFT computations but to assist in their prioritization, delivering DFT-level performance at a significantly reduced computational expense. Note also that even if the conclusions are extracted from the SACs involving M2C materials like MXenes, it is likely that the presented model works reasonably well for M2C3 and M3C4 MXenes, as the number of atomic layers has a small influence on the adsorption properties of molecules such as CO2.65 Further work will be dedicated to the extension of the approach to other relevant supports.

Author contributions

The manuscript has been written with the contributions from all the authors.

Conflicts of interest

There are no conflicts to declare.

Data availability

Relevant data are available from the corresponding author upon reasonable request.

The data supporting this article have been included as part of supplementary information (SI). Supplementary information: adatom adsorption sites, TM–substrate distance before and after H-binding, magnetization value of the TM adsorbed on the substrate, correlation with DFT-calculated features, parity plot for single-variable linear regression models, feature-importance analysis for different models, parity plots for different models, leave-one-out experiment with individual metals, and comparison of the MAE values among several models utilizing 10 and 16 features. See DOI: https://doi.org/10.1039/d5ta07143g.

Acknowledgements

Access to the CINECA supercomputing resources was granted via ISCRAB. M. L. was supported in part by an Erasmus + traineeship grant. F. V. and F. I. thank the Spanish Ministerio de Ciencia e Innovación (MCIN) and the Agencia Estatal de Investigación (AEI) MCIN/AEI/10.13039/501100011033 and, as appropriate, from the European Union Next Generation EU/PRTR through grants PID2021-126076NB-I00 and PID2024-159906NB-I00, partially funded by FEDER Una manera de hacer Europa, as well as from the Unidad de Excelencia María de Maeztu CEX2021-001202-M granted to the IQTCUB. The Generalitat de Catalunya funding through 2021SGR00079 grant is also acknowledged. F. V. acknowledges support from the ICREA Academia Award 2023 Ref. Ac2216561.

References

  1. J. Bockris, I. Ammar and A. Huq, The Mechanism of the Hydrogen Evolution Reaction on Platinum, Silver and Tungsten Surfaces in Acid Solutions, J. Phys. Chem., 1957, 61(7), 879–886 CrossRef CAS.
  2. Z. Y. Yu, Y. Duan, X. Y. Feng, X. Yu, M. R. Gao and S. H. Yu, Clean and Affordable Hydrogen Fuel from Alkaline Water Splitting: Past, Recent Progress, and Future Prospects, Adv. Mater., 2021, 33(31), 2007100 CrossRef CAS PubMed.
  3. T. F. Jaramillo, K. P. Jørgensen, J. Bonde, J. H. Nielsen, S. Horch and I. Chorkendorff, Identification of Active Edge Sites for Electrochemical H2 Evolution from MoS2 Nanocatalysts, Science, 2007, 317(5834), 100–102 CrossRef CAS.
  4. D. Voiry, M. Salehi, R. Silva, T. Fujita, M. Chen, T. Asefa, V. B. Shenoy, G. Eda and M. Chhowalla, Conducting MoS2 Nanosheets as Catalysts for Hydrogen Evolution Reaction, Nano Lett., 2013, 13(12), 6222–6227 CrossRef CAS.
  5. S. Yao, X. Zhang, A. Chen, Z. Zhang, M. Jiao and Z. Zhou, Algorithm Screening to Accelerate Discovery of 2D Metal-Free Electrocatalysts for Hydrogen Evolution Reaction, J. Mater. Chem. A, 2019, 7(33), 19290–19296 RSC.
  6. Y. Jiao, Y. Zheng, M. Jaroniec and S. Z. Qiao, Design of Electrocatalysts for Oxygen-and Hydrogen-Involving Energy Conversion Reactions, Chem. Soc. Rev., 2015, 44(8), 2060–2086 RSC.
  7. J. Mahmood, F. Li, S.-M. Jung, M. S. Okyay, I. Ahmad, S.-J. Kim, N. Park, H. Y. Jeong and J.-B. Baek, An Efficient and pH-Universal Ruthenium-Based Catalyst for the Hydrogen Evolution Reaction, Nat. Nanotechnol., 2017, 12(5), 441–446 Search PubMed.
  8. P. Rüetschi and P. Delahay, Hydrogen Overvoltage and Electrode Material. A Theoretical Analysis, J. Chem. Phys., 1955, 23(1), 195–199 CrossRef.
  9. P. C. Vesborg, B. Seger and I. Chorkendorff, Recent Development in Hydrogen Evolution Reaction Catalysts and Their Practical Implementation, J. Phys. Chem. Lett., 2015, 6(6), 951–957 Search PubMed.
  10. J. K. Nørskov, T. Bligaard, A. Logadottir, J. Kitchin, J. G. Chen, S. Pandelov and U. Stimming, Trends in the Exchange Current for Hydrogen Evolution, J. Electrochem. Soc., 2005, 152(3), J23 Search PubMed.
  11. E. Skúlason, V. Tripkovic, M. E. Björketun, S. Gudmundsdóttir, G. Karlberg, J. Rossmeisl, T. Bligaard, H. Jónsson and J. K. Nørskov, Modeling the Electrochemical Hydrogen Oxidation and Evolution Reactions on the Basis of Density Functional Theory Calculations, J. Phys. Chem. C, 2010, 114(42), 18182–18197 CrossRef.
  12. M. Allés, L. Meng, I. Beltrán, F. Fernández and F. Viñes, Atomic Hydrogen Interaction with Transition Metal Surfaces: A High-Throughput Computational Study, J. Phys. Chem. C, 2024, 128(47), 20129–20139 CrossRef.
  13. J. Greeley and J. K. Nørskov, Large-Scale, Density Functional Theory-Based Screening of Alloys for Hydrogen Evolution, Surf. Sci., 2007, 601(6), 1590–1598 CrossRef CAS.
  14. I. Barlocco, L. A. Cipriano, G. Di Liberto and G. Pacchioni, Modeling Hydrogen and Oxygen Evolution Reactions on Single Atom Catalysts with Density Functional Theory: Role of the Functional, Adv. Theory Simul., 2023, 6(10), 2200513 CrossRef CAS.
  15. G. Gao, S. Bottle and A. Du, Understanding the Activity and Selectivity of Single Atom Catalysts for Hydrogen and Oxygen Evolution via Ab Initial Study, Catal. Sci. Technol., 2018, 8(4), 996–1001 RSC.
  16. G. Di Liberto, L. A. Cipriano and G. Pacchioni, Role of Dihydride and Dihydrogen Complexes in Hydrogen Evolution Reaction on Single-Atom Catalysts, J. Am. Chem. Soc., 2021, 143(48), 20431–20441 CrossRef CAS.
  17. A. Wang, J. Li and T. Zhang, Heterogeneous Single-Atom Catalysis, Nat. Rev. Chem., 2018, 2(6), 65–81 CrossRef CAS.
  18. X.-F. Yang, A. Wang, B. Qiao, J. Li, J. Liu and T. Zhang, Single-Atom Catalysts: A New Frontier in Heterogeneous Catalysis, Acc. Chem. Res., 2013, 46(8), 1740–1748 CrossRef CAS.
  19. Y. Chen, S. Ji, C. Chen, Q. Peng, D. Wang and Y. Li, Single-Atom Catalysts: Synthetic Strategies and Electrochemical Applications, Joule, 2018, 2(7), 1242–1264 Search PubMed.
  20. Y. Yang, J. Liu, F. Liu, Z. Wang and D. Wu, FeS2-Anchored Transition Metal Single Atoms for Highly Efficient Overall Water Splitting: A DFT Computational Screening Study, J. Mater. Chem. A, 2021, 9(4), 2438–2447 RSC.
  21. M. D. Hossain, Z. Liu, M. Zhuang, X. Yan, G. L. Xu, C. A. Gadre, A. Tyagi, I. H. Abidi, C. J. Sun and H. Wong, Rational Design of Graphene-Supported Single Atom Catalysts for Hydrogen Evolution Reaction, Adv. Energy Mater., 2019, 9(10), 1803689 CrossRef.
  22. B. Anasori and Y. Gogotsi, The Rise of MXenes, ACS Nano, 2019, 13, 8491–8494 Search PubMed.
  23. J. Zhang, Y. Zhao, X. Guo, C. Chen, C.-L. Dong, R.-S. Liu, C.-P. Han, Y. Li, Y. Gogotsi and G. Wang, Single Platinum Atoms Immobilized On An MXene As An Efficient Catalyst For The Hydrogen Evolution Reaction, Nat. Catal., 2018, 1(12), 985–992 Search PubMed.
  24. Y. Cheng, J. Dai, Y. Song and Y. Zhang, Single Molybdenum Atom Anchored on 2D Ti2NO2 MXene as a Promising Electrocatalyst for N2 Fixation, Nanoscale, 2019, 11(39), 18132–18141 Search PubMed.
  25. L. Li, X. Wang, H. Guo, G. Yao, H. Yu, Z. Tian, B. Li and L. Chen, Theoretical Screening of Single Transition Metal Atoms Embedded in MXene Defects As Superior Electrocatalyst of Nitrogen Reduction Reaction, Small Methods, 2019, 3(11), 1900337 CrossRef CAS.
  26. X. Yang, Z. Lu, C. Cheng, Y. Wang, X. Zhang, Z. Yang and W. Lu, Identification of Efficient Single-Atom Catalysts Based on V2CO2 MXene by Ab Initio Simulations, J. Phys. Chem. C, 2020, 124(7), 4090–4100 CrossRef CAS.
  27. D. Kan, D. Wang, X. Zhang, R. Lian, J. Xu, G. Chen and Y. Wei, Rational Design of Bifunctional ORR/OER Catalysts Based on Pt/Pd-Doped Nb2CT2 MXene by First-Principles Calculations, J. Mater. Chem. A, 2020, 8(6), 3097–3108 Search PubMed.
  28. W. Fang, C. Du, M. Kuang, M. Chen, W. Huang, H. Ren, J. Xu, A. Feldhoff and Q. Yan, Boosting Efficient Ambient Nitrogen Oxidation By A Well-Dispersed Pd on MXene Electrocatalyst, Chem. Commun., 2020, 56(43), 5779–5782 Search PubMed.
  29. M. Naguib; M. Kurtoglu; V. Presser; J. Lu; J. Niu; M. Heon; L. Hultman; Y. Gogotsi and M. W. Barsoum, Two-Dimensional Nanocrystals Produced by Exfoliation of Ti3AlC2, in MXenes, Jenny Stanford Publishing, 2023, pp. 15–29 Search PubMed.
  30. M. Naguib, O. Mashtalir, J. Carle, V. Presser, J. Lu, L. Hultman, Y. Gogotsi and M. W. Barsoum, Two-Dimensional Transition Metal Carbides, ACS Nano, 2012, 6(2), 1322–1331 CrossRef CAS PubMed.
  31. M. Alhabeb, K. Maleski, B. Anasori, P. Lelyukh, L. Clark, S. Sin and Y. Gogotsi, Guidelines for Synthesis and Processing of Two-Dimensional Titanium Carbide (Ti3C2Tx-MXene), Chem. Mater., 2017, 29(18), 7633–7644 CrossRef CAS.
  32. M. A. Hope, A. C. Forse, K. J. Griffith, M. R. Lukatskaya, M. Ghidiu, Y. Gogotsi and C. P. Grey, NMR Reveals the Surface Functionalisation of Ti3C2 MXene, Phys. Chem. Chem. Phys., 2016, 18(7), 5099–5102 RSC.
  33. P. O. Persson and J. Rosen, Current State of the Art on Tailoring the MXene Composition, Structure, and Surface Chemistry, Curr. Opin. Solid State Mater. Sci., 2019, 23(6), 100774 CrossRef CAS.
  34. V. Natu and M. W. Barsoum, MXene Surface Terminations: A Perspective, J. Phys. Chem. C, 2023, 127(41), 20197–20206 CrossRef CAS.
  35. I. Persson, J. Halim, H. Lind, T. W. Hansen, J. B. Wagner, L. Å. Näslund, V. Darakchieva, J. Palisaitis, J. Rosen and P. O. Persson, 2D Transition Metal Carbides (MXenes) For Carbon Capture, Adv. Mater., 2019, 31(2), 1805472 CrossRef.
  36. V. Kamysbayev, A. S. Filatov, H. Hu, X. Rui, F. Lagunas, D. Wang, R. F. Klie and D. V. Talapin, Covalent Surface Modifications and Superconductivity of Two-Dimensional Metal Carbide MXenes, Science, 2020, 369(6506), 979–983 CrossRef CAS.
  37. H. Oschinski, A. Morales-Garcia and F. Illas, Interaction of First Row Transition Metals with M2C (M= Ti, Zr, Hf, V, Nb, Ta, Cr, Mo, and W) MXenes: A Quest for Single-Atom Catalysts, J. Phys. Chem. C, 2021, 125(4), 2477–2484 CrossRef CAS.
  38. S. Zhu, K. Jiang, B. Chen and S. Zheng, Data-Driven Design of Electrocatalysts: Principle, Progress, and Perspective, J. Mater. Chem. A, 2023, 11(8), 3849–3870 Search PubMed.
  39. K. Ding, T. Yang, M. T. Leung, K. Yang, H. Cheng, M. Zeng, B. Li and M. Yang, Recent Advances in the Data-Driven Development of Emerging Electrocatalysts, Curr. Opin. Electrochem., 2023, 101404 CrossRef CAS.
  40. J. Zheng, X. Sun, C. Qiu, Y. Yan, Z. Yao, S. Deng, X. Zhong, G. Zhuang, Z. Wei and J. Wang, High-Throughput Screening of Hydrogen Evolution Reaction Catalysts in MXene Materials, J. Phys. Chem. C., 2020, 124(25), 13695–13705 CrossRef CAS.
  41. J. A. Esterhuizen, B. R. Goldsmith and S. Linic, Interpretable Machine Learning for Knowledge Generation in Heterogeneous Catalysis, Nat. Catal., 2022, 5(3), 175–184 Search PubMed.
  42. L. Meng, L.-K. Yan, F. Viñes and F. Illas, Effect of Terminations on the Hydrogen Evolution Reaction Mechanism on Ti3C2 MXene, J. Mater. Chem. A, 2023, 11(13), 6886–6900 RSC.
  43. G. Kresse and J. Furthmüller, Efficient Iterative Schemes for Ab Initio Total-Energy Calculations Using a Plane-Wave Basis Set, Phys. Rev. B:Condens. Matter Mater. Phys., 1996, 54(16), 11169 Search PubMed.
  44. G. Kresse and J. Furthmüller, Efficiency of Ab-Initio Total Energy Calculations for Metals and Semiconductors Using a Plane-Wave Basis Set, Comput. Mater. Sci., 1996, 6(1), 15–50 CrossRef CAS.
  45. G. Kresse and J. Hafner, Ab Initio Molecular Dynamics for Liquid Metals, Phys. Rev. B:Condens. Matter Mater. Phys., 1993, 47(1), 558 Search PubMed.
  46. J. P. Perdew, K. Burke and M. Ernzerhof, Generalized Gradient Approximation Made Simple, Phys. Rev. Lett., 1996, 77(18), 3865 Search PubMed.
  47. L. Vega and F. Vines, Generalized Gradient Approximation Adjusted to Transition Metals Properties: Key Roles of Exchange and Local Spin Density, J. Comput. Chem., 2020, 41(30), 2598–2603 CrossRef CAS.
  48. D. Vázquez-Parga, A. Fernández-Martínez and F. Viñes, Quantifying the Accuracy of Density Functionals on Transition Metal Bulk and Surface Properties, J. Chem. Theory Comput., 2023, 19(22), 8285–8292 CrossRef.
  49. D. Ontiveros, S. Vela, F. Viñes and C. Sousa, Tuning MXenes Towards Their Use in Photocatalytic Water Splitting, Energy Environ. Mater., 2024, 7(6), e12774 CrossRef CAS.
  50. S. Grimme, J. Antony, S. Ehrlich and H. Krieg, A Consistent and Accurate Ab Initio Parametrization of Density Functional Dispersion Correction (DFT-D) for the 94 Elements H-Pu, J. Chem. Phys., 2010, 132(15), 154104 CrossRef PubMed.
  51. P. E. Blöchl, Projector Augmented-Wave Method, Phys. Rev. B:Condens. Matter Mater. Phys., 1994, 50(24), 17953 Search PubMed.
  52. G. Kresse and D. Joubert, From Ultrasoft Pseudopotentials to the Projector Augmented-Wave Method, Phys. Rev. B:Condens. Matter Mater. Phys., 1999, 59(3), 1758 CrossRef CAS.
  53. N. García-Romeral, Á. Morales-García, F. Vines, I. d. P. Moreira and F. Illas, Theoretical analysis of magnetic coupling in the Ti2C bare MXene, J. Phys. Chem. C, 2023, 127(7), 3706–3714 CrossRef.
  54. N. García-Romeral, Á. Morales-García, F. Viñes, I. de PR Moreira and F. Illas, How Does Thickness Affect Magnetic Coupling in Ti-Based MXenes, Phys. Chem. Chem. Phys., 2023, 25(26), 17116–17127 RSC.
  55. R. F. Bader, Atoms in Molecules, Acc. Chem. Res., 1985, 18(1), 9–15 CrossRef CAS.
  56. G. Henkelman, A. Arnaldsson and H. Jónsson, A Fast and Robust Algorithm for Bader Decomposition of Charge Density, Comput. Mater. Sci., 2006, 36(3), 354–360 CrossRef.
  57. E. Sanville, S. D. Kenny, R. Smith and G. Henkelman, Improved Grid-Based Algorithm for Bader Charge Allocation, J. Comput. Chem., 2007, 28(5), 899–908 CrossRef CAS.
  58. W. Tang, E. Sanville and G. Henkelman, A Grid-Based Bader Analysis Algorithm without Lattice Bias, J. Phys.:Condens. Matter, 2009, 21(8), 084204 CrossRef CAS PubMed.
  59. G. Di Liberto, L. Giordano and G. Pacchioni, Predicting the Stability of Single-Atom Catalysts in Electrochemical Reactions, ACS Catal., 2023, 14(1), 45–55 Search PubMed.
  60. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss and V. Dubourg, Scikit-learn: Machine Learning in Python, J. Mach. Learn. Res., 2011, 12, 2825–2830 Search PubMed.
  61. M. Manade, F. Vines and F. Illas, Transition Metal Adatoms on Graphene: A Systematic Density Functional Study, Carbon, 2015, 95, 525–534 Search PubMed.
  62. S. Kim, P. Gamallo, F. Vines and J. Y. Lee, The Nano Gold Rush: Graphynes as Atomic Sieves for Coinage and Pt-Group Transition Metals, Appl. Surf. Sci., 2020, 499, 143927 CrossRef CAS.
  63. W. M. Haynes, CRC Handbook of Chemistry and Physics, CRC Press, 2016 Search PubMed.
  64. H. Jin, W. Song and C. Cao, An Overview of Metal Density Effects in Single-Atom Catalysts for Thermal Catalysis, ACS Catal., 2023, 13(22), 15126–15142 CrossRef CAS.
  65. Á. Morales-García, M. Mayans-Llorach, F. Viñes and F. Illas, Thickness Biased Capture of CO2 on Carbide MXenes, Phys. Chem. Chem. Phys., 2019, 21(41), 23136–23142 RSC.

Footnote

These authors contributed equally.

This journal is © The Royal Society of Chemistry 2026
Click here to see how this site uses Cookies. View our privacy policy here.