Toward a new definition of surface energy for late transition metals †

Surface energy is a top-importance stability descriptor of transition metal-based catalysts. Here, we combined density functional theory (DFT) calculations and a tiling scheme measuring surface areas of metal structures to develop a simple computational model predicting the average surface energy of metal structures independently of their shape. The metals considered are W, Ru, Co, Ir, Ni, Pd, Pt, Cu, Ag and Au. Lorentzian trends derived from the DFT data proved eﬀective at predicting the surface energy of metallic surfaces but not for metal clusters. We used machine-learning protocols to build an algorithm that improves the Lorentzian trend’s accuracy and is able to predict the surface energies of metal surfaces of any crystal structure, i


Introduction
Catalysis is of crucial importance to the economy and for a sustainable future.Since its discovery in the 1800s, 1 heterogeneous catalysis has played a significant role in the chemical industry and, more recently, in applications such as fuel cells [2][3][4] and environmental control. 53][14] Campbell's equation was successfully applied to predict the growth of multiple supported transition metals such as Ni, 15,16 Ag 17 and Cu at different temperatures. 18][21][22] The surface energy measures the energy difference per unit area between a given surface and its bulk material. 23Experimental measurements of this property remain challenging and have mostly been limited to extrapolations from the liquid-state surface tensions, which provide only the average energy over the metal's surfaces. 24,25Nowadays, computational methods such as the density functional theory (DFT) predict the energy excess associated with specific surfaces and symmetric structures. 23,26However, metalbased NPs, especially sub-nanometre clusters, are irregular and present a collection of low and high-index facets, edges and corners, all of them having different properties. 27,28Therefore, predicting the local surface energy at the atomic scale would promote an intelligent design of nanocatalysts with high robustness. 29,30n the present work, we introduce an innovative model predicting accurate surface energies and surface areas using electron-independent descriptors such as the coordination number (CN).The CN is an accessible parameter easy to calculate by finding the number of closest neighbours around a given surface atom.We used machine learning neural network techniques to build a model able to predict accurate surface energies of any metallic structure independently of their size and shape.[33][34]

Methodology
This journal is © the Owner Societies 2023 The revised PBE functional from Perdew, Burke and Ernzerhof 38,39 was used to calculate the exchange-correlation energy and the projector-augmented wave (PAW) pseudo-potentials to describe the core electrons. 37,40The effect of considering all-electron potentials in these calculations has been observed as negligible and certainly does not affect the derived trends. 41,42Dispersion corrections were accounted for through Grimme's empirical dispersion correction, DFT-D3. 43The plane-wave kinetic cut-off and the electronic convergence threshold were set to 600 and 10 À5 eV, respectively, thus ensuring accurate and consistent results.The ionic relaxations had a convergence criterion of 0.01 eV Å À1 .All the bulk and slabs were calculated with a k-grid of 0.2 Å À1 .
Metallic structures of W, Ru, Co, Ir, Ni, Pd, Pt, Cu, Ag and Au were built from the fully optimised bulk.Transition metal slabs were simulated by a 5 Â 5 Â 5 supercell, with 10 Å of vacuum along the z-axis, guaranteeing no electronic interactions between periodic images.All slabs were formed with 5 atomic layers in thickness, being the uppermost 2 layers completely unconstrained during the relaxation process.Dipole corrections were applied perpendicular to the slabs.
Transition metal nanoparticles and clusters were modelled at the centre of a simulation cell with at least 10 Å of vacuum between periodic images and using a single k-point, the G-point.All atoms were relaxed upon geometry optimisation with the same criteria as the surfaces, i.e. a convergence threshold of 0.01 eV Å À1 .

Surface energy prediction
The first step toward a model predicting the surface energy of a given surface was to design a set of slabs exposing atoms with different environments, including extended and punctual defects such as steps, vacancies, and adatoms.The surface energy definition is expressed in eqn (1): where g bot and g top are the surface energy of the bottom and top sides of the slab and A bot and A top are the area associated with the bottom and top sides of the slab.E slab is the energy associated with the slab, n is the size of the slab compared to the bulk, and E bulk is the bulk energy.In order to compute g bot , a calculation on a symmetric slab, i.e., A top = A bot , was performed, where the slab was maintained fully frozen at the bulk lattice (E Frozen slab ).Then, the bottom surface energy was computed using eqn (2): where 2ÁA is the sum of A top and A bot .Then, the g top of a relaxed slab can be computed from eqn (1) and (2) as shown in eqn (3): The area of a slab is commonly derived from the cell axis, i.e.A = aÁb, as described in Fig. 1.However, this approach is only valid for low-index surfaces such as (111) and (001) of fcc metals, which are completely flat.In order to accurately compute the area of flat, stepped, and defective surfaces, and by extension, of NPs of any morphology, we developed a tiling scheme in which each atom is linked to its two closest neighbours also in contact between them, forming a triangular tile.This tiling scheme allows us to sample a surface more efficiently than using the aÁb approximation.In practice, the smaller the tiles, the more accurate the approximation.Tiles enable one to describe corrugated and defective surfaces (in addition to flat surfaces) and associate local contributions of atomically resolved geometries with surface energy changes.The tiling scheme, thus, leads to a more flexible definition of the surface energy than the standard approach of projecting the excess energy into a 2D plane.We identified each triangular tile corner with the atomic position, as illustrated in Fig. 2.
From a set of slabs exposing atoms with different coordination numbers (CN), the total top area, A top is the sum of the areas of each atom, A CN i , with coordination i in the range [3, 11].CNs 1 and 2 are not accessible without compromising the physical meaning of the slab model, as no stable surface presents these CNs.Note that we considered atomic coordination as the number of first neighbours at close distance, although, strictly, coordination implies the interaction coordinating atoms' electronic structure.We adopted this approach based on the metallic behaviour of the large systems and the electron-rich orbitals of the selected transition metals.Therefore, for fcc and hcp crystal structures, the bulk presents a coordination of 12, whereas, for the bcc metals, the bulk presents a maximum CN of 8, and realistic slabs allow defining from CN = 3 to CN = 7.A CN i can be calculated by solving a system of equations in the form of eqn (4): where n CN k is the number of atoms in a given surface with CN = k being k a i, A CN k is their area contribution, and n CN i is the number of atoms with CN = i.This result can be inserted in a second similar set of equations in the form given by eqn ( 5) to determine the contribution of each surface atom to the total top surface energy, g top : Similar to slabs, the surface energies of metal clusters were computed using the following eqn (6): where E cluster is the metal cluster's DFT energy, n is the number of metal atoms in the cluster with respect to the number of atoms in the bulk simulation cell, E bulk is the atomic bulk energy of the considered metal, A pred is the area of the nanoparticle as predicted by the tiling scheme.The magic clusters considered were formed with an atomicity range of N = 6 up to N = 225 with the morphologies illustrated in Fig. 3.The multi-layer perceptron (MLP) developed during this work was built using Python's SciKit-Learn and Keras libraries. 44,45The dataset employed to train the MLP contains targets (surface energies calculated from eqn ( 6)) and features in the form of a vector with multiple geometrical descriptors.Several descriptors were tested, including the individual ratio of atoms from CN = 2 to CN = 11, but also the groups CN(2, 3), CN(4, 5), CN(6, 7), CN (8, 9), CN (10, 11), the kinetic radius, the surface atomic density, the average CN on the surface, and the average Generalised Coordination Number (GCN).We calculated the different CN ratios from eqn (7): where the top term is the total number of atoms with CN = i plus CN = j and the bottom term, n tot , is the total number of surface atoms.The average GCN, a parameter taking into account not only the number of the closest neighbour of a surface atom but also the number of second closest neighbours, was calculated using eqn (8): 46,47 GCN ¼ 1 n where the first sum runs over all i surface atoms and the inner sum runs over the j closest neighbours of atom i with CN = CN j , being j a i, the CN max is the maximum CN of the metal bulk, i.e., 12 for fcc and hcp, 8 for bcc, and the n is the number of surface atoms.
The atomic density was computed as the ratio between the number of surface atoms, n, independently of their CN, and the total surface area, A pred , predicted with the tiling scheme and described in eqn (9): The kinetic radius was taken as the furthest distance between two atoms in a cluster and arbitrarily set to 0.000 Å  for metal slabs.Other class variables included in the features vector were the element and the nature of the system: Slab, particle or sub-nanometer cluster.These class variables were turned into numerical values using a one-hot-encoder preprocessor.
Numerical descriptors were selected depending on their variance and correlation to the target, i.e., the surface energy.The optimum set of parameters forming an input vector for the MLP was the following: Atom type, nature of the system, kinetic radius, average CN, average GCN, surface atomic density, and the ratio of atoms with CN(4, 5), CN(6, 7), CN (8, 9) and CN (10, 11).The dataset was split in an 80 : 20 ratio between the training and test sets.It means that the MLP was trained with 80% of the data, and 20% was kept aside to evaluate its performance.During the training phase, cross-validation was employed using a shuffle split scheme, splitting the training set into 5 distinct subsets (Table 1).

Tiling scheme validation
Predicted areas of fcc metals with low Miller indexes, i.e., (100), (111) and (110), were used to validate the described tiling scheme and compare the results with areas calculated as a flat parallelogram, A flat , formed by the unit cell vectors a and b in eqn (10): The two methods perfectly match the (111) and (100) as these surfaces are flat.However, the (110) areas measured from the tiling scheme are between 15 and 17% higher than the ones calculated with the flat approximation.Note that the difference between metals is due to the surface relaxation; all (110) surface areas are slightly less than the 18.35% predicted by an idealised bulk terminated surface.Nevertheless, The extent of the (110) surface area measured with the tiling scheme is consistent with experimental scanning-tunnelling-microscopy (STM) and atomic-force-microscopy (AFM) observations, [48][49][50] and with computational methods reporting the presence of steps on this particular surface of fcc metals. 51,52The agreement between atomic-resolution experimental techniques observations and the computational tiling scheme highlights the relevance of the method.Fig. 4 illustrates multiple Pd surfaces, including a step, an adatom, and a vacancy, and the areas calculated using the tiling scheme and the flat area approximation.

Prediction of the surface energy
Solving the system of equations with the form given in eqn ( 4) and ( 5) for the different atomic coordination allows one to build mathematical functions describing the atomic areas and the surface energies.For all transition metals tested, these functions took the form of a Lorentzian trend described by eqn (11): The parameters a, b, c and d were determined using function-fitting tools available in the SciPy library 53 and are given in Table S1 of the ESI.† The trends for atomic areas, A, and their surface energies, g, are presented in Fig. 5.The trends in the area prediction are very close to being linear for Ag, Pd and Au, while for Ru, Pt, Ir and Cu, these are curved along the CN axis.The calculated Pearson coefficient (R 2 ) describing the correlation between the DFT-derived and computed areas is in the range [0.82, 0.98], corresponding to fcc-Ru and Pd, respectively, with most of them being well above 0.90, i.e., strongly correlated.The values for bulk coordination were not considered for the area prediction.The trends on the g-prediction as a function of CN are inversely proportional to the areas, showing that the orbitals in undercoordinated atoms tend to rearrange and contract to stabilise the surface, as observed on the g-plateau at low atomic coordination, especially for hcp-Ru, Au, W and Pd.
A validation set of multiple slabs (including low-indexes, high-indexes, and defective surfaces) was built for each metal studied.For Pt, Ir, Au, Ag and Cu, a standard validation set containing 7 slabs was designed.Pd was selected for an extended set with 14 slabs (more details in Table S2 in the ESI †).For Ru, whose typical crystal structure is hcp, we built 6 surfaces with the hcp character and 6 with the fcc character.Then, we investigated two approaches to validate the predicting model: The discrete method and the average method.The first one expresses the area, A d , as well as the surface energy, g d , as a sum of all atomic contributions as described in eqn (12): Where n tot is the total number of surface atoms, n i is the number of atoms with CN = i, and the sets of parameters {a A , b A , c A , d A } and {a g , b g , c g , d g } are the Lorentzian parameters obtained for each trend and metal.The second approach, the average method, uses the average surface coordination number, CN, rather than a discrete sum to predict the surface area, A a , and the surface energy, g a , as described in eqn (13): Comparing the mean absolute error (MAE) of the average and the discrete methods against the calculated tile scheme and DFT surface energies reveals that the average method is slightly more accurate than the discrete one, as shown in Fig. 6.The largest error (0.300 J m À2 ) with the average method is obtained for the unique bcc metal included in this work, W, whereas for fcc metals, g a shows excellent results (MAE around 0.100 J m À2 or lower, except for Ir).The discrete method outperforms the average one only for Ru-hcp, which discrete MAE is 0.090 J m À2 lower than the average method.S3 in the ESI †).It shows that for both DFT-g and g a , the fcc (111) and (110) surfaces are very similar between them.Indeed, The fcc (110) surface atoms have an average CN = 9 distributed over a sizeable non-flat area.

Multi-layer perceptron (MLP)
Although the Lorentzian trends accurately predict the surface energy of metal slabs, they are inconsistent with the surface energy of metal clusters.Fig. 7 shows the MAEs obtained for magic clusters, which are in the range of 0.2-0.7 J m À2 , higher than the MAEs obtained for metal slabs.This result can be explained by the fact that NPs, contrarily to slabs, are isolated,  non-periodic objects with a high ratio between surface and bulk atoms.
The initial dataset containing 245 metal slabs and magic clusters (Pd, Pt, Ir, Cu, Ag, Au, Ru(hcp) and W(bcc)) was extended with 14 Co(fcc), 16 Co(hcp), 14 Ru(fcc) slabs, 10 Ru closed-shell NPs, 12 Ir, 34 and 26 Pd sub-nanometer clusters, 33 leading to a dataset of 337 structures.On a training set containing 68 features randomly selected from the dataset, the MLP scored R 2 = 0.97 with a mean absolute error (MAE) of 0.091 J m À2 , which is better than the Lorentzian equations   developed for slabs.Also, the MLP scored a very low median absolute error of 0.052 J m À2 , meaning that half of the test set population's MAE is below this value.These results are summarised in Fig. 8.The learning curve shows that the score on the test set from 180 structures in the training set remains consistent.Namely, adding more data to the dataset is unlikely to improve the MLP performance.

Surface energies and Wulff construction
We predicted the morphology of large gas-phase NPs from the surface energies obtained from Vitos et al., 26 DFT and MLP derived using the Wulffpack Python package. 54There are substantial experimental evidence that fcc-NPs tend to exhibit (111) and (100) facets rather than (110), 55,56 as shown by the Wulff representation from Vitos' data illustrated in Table 2.The morphologies derived from DFT-g and MLP-g, however, present large deviations.The (110) surface energy is very close to (111) due to the extra area described with the tiles scheme.
The deviation from the DFT-g and MLP-g models arises because the (110) facet is not considered a flat surface and presents two very distinctive atomic coordination, CN = 7 and CN = 11.However, atoms with CN = 11 are not easily accessible, and the most external atoms with CN = 7 determine the surface growth. 57,58Furthermore, Wulff's theory is also limited to predictions on large NPs that exhibit defined facets, [59][60][61] primarily low-index due to their superior stability, for which one can assume that these facets are 2-dimensional. 62Namely, in order to use the energies predicted by the tiling scheme, they would also need to be projected into the 2D plane. 63We computed the Pd(110) surface energy on a flat surface, DFT-flat-g = 2.192 J m À2 , resulting in a much larger value than the DFT-g (1.863 J m À2 ), and used it to construct the Wulff morphology of a 5000 atom gas-phase NP, Table 2.The ratio between the different facets is much closer to those predicted using Vitos et al. surface energies.
Small particles, e.g., sub-nanometre clusters less than 40 atoms in size, do not exhibit any particular facets, [64][65][66] and therefore, the Wulff construction fails to predict their shape.Herein, we have presented a way to compute the surface energy of such particles by computing their excess energy from DFT combined with the tiling scheme's prediction to calculate their surface energies and train a MLP model.Predicting the surface energy of sub-nanometer clusters is a step forward toward better predicting their stability through a more accurate forecast of the chemical potential. 20

Conclusion
In this work, we successfully developed a tiling scheme to accurately measure the surface area of slabs and clusters  independently of their size and shape, providing a robust method to evaluate surface energies.We used the areas and the DFT calculated surface energies to build systems of equations resulting in a series of Lorentzian functions predicting the surface energy per metal atom as a function of its CN.We showed that for fcc metals, the low index surface energy increases as follows, (111) o (110) o (100), in contradiction with some results in the bibliography.We utilise the Lorentzian trends to predict surface energies following two protocols, the average and the discrete methods.From the results obtained on the slabs validation set, we figured out that better predictions are obtained from the average method, although none of them extrapolates successfully to small NPs.We The model presented here to predict surface energies fails to generate correct Wulff construction because it does not consider low-index surfaces as flat but corrugated surfaces.Projecting surface energies into 2D planes remains accurate when considering large nanoparticles.The approach developed in this work applies to sub-nanometer particles and allows the prediction their surface energies.The robust model presented aims at improving the understanding of metal atoms' site stability, as the surface energy has been identified as a key descriptor of the catalytic reactivity and, therefore, for the design of durable catalysts.

Fig. 2
Fig. 2 Illustration of the tiling scheme.On the left is a side view of an fcc surface containing one vacancy, and on the right is its top view.The coloured lines (and tiles) mark the different average z-positions of each tile.

Fig. 3
Fig. 3 Illustration of fcc-magic size clusters containing from 6 up to 225 atoms.

Fig. 6
Fig. 6 also compares the predicted and DFT-derived surface energies of low-index fcc surfaces against the values provided by L. Vitos et al., 26 (details given in TableS3in the ESI †).It shows that for both DFT-g and g a , the fcc (111) and (110) surfaces are very similar between them.Indeed, The fcc (110) surface atoms have an average CN = 9 distributed over a sizeable non-flat area.

Fig. 4
Fig. 4 Top and side view of 3 Pd-surface representations with a vacancy (left), an adatom (middle), and a step (right).

Fig. 5
Fig. 5 (a) Lorentzian trends expressing the areas, A i , as a function of CN for various metals.(b) Lorentzian trends expressing the surface energy, g i , as a function of CN.

Fig. 6
Fig.6(a) MAE obtained from the average, g a , and the discrete, g d , approach for each metal.(b) DFT-g against g-predicted (g a and g d in the legend) for each metal.(c) Comparison between the DFT-g calculated in this work using the tiling scheme (Orange), with g a (Pink), and values calculated by L. Vitos et al. (Green).26

26
Fig.6(a) MAE obtained from the average, g a , and the discrete, g d , approach for each metal.(b) DFT-g against g-predicted (g a and g d in the legend) for each metal.(c) Comparison between the DFT-g calculated in this work using the tiling scheme (Orange), with g a (Pink), and values calculated by L. Vitos et al. (Green).26

Fig. 7
Fig. 7 (a) Surface energy comparison between the average method, g a , and g-DFT for magic clusters containing from 6 to 225 atoms.(b) Illustration of the accuracy reached using the MLP on slabs, magic clusters, and other metal clusters.

Fig. 8
Fig. 8 (a) Histogram of the mean absolute error (MAE) as a function of the population in the test set.(b) Learning curve as a function of increasing training set size.
successfully overcame the extra complexity by incorporating metal clusters into the dataset and training a MLP model capable of predicting the surface energy of closed-shell NPs, and small gas-phase clusters.The MLP only requires accessible geometric features of the structure, such as average CN, average GCN, kinetic radius, surface atomic density and ratio of atoms with CN = 4-5, 6-7, 8-9 and 10-11.The trained MLP performs exceptionally well on slabs and metal clusters, independently of their size and shape, reaching the score of R 2 = 0.97 with a mean absolute error of 0.091 J m À2 .

Table 1
Comparison of the area predicted with the tiling scheme and the flat approximation for Pd, Pt and Ir low Miller-index surfaces This journal is © the Societies 2023 Phys.Chem.Chem.Phys., 2023,25, 1977-1986 | 1981

Table 2
Illustration of the Wulff morphologies predicted for Pd 5000 gas-phase clusters using different surface energies.The percentages of each surface area are also provided for comparison