Xiangcheng Shi,abdf Dongfang Cheng,abd Ran Zhao,abd Gong Zhang,abd Shican Wu,abd Shiyu Zhen,abd Zhi-Jian Zhao*abcd and Jinlong Gong*abcde
aSchool of Chemical Engineering and Technology, Key Laboratory for Green Chemical Technology of Ministry of Education, Tianjin University, Tianjin 300072, China. E-mail: zjzhao@tju.edu.cn; jlgong@tju.edu.cn
bCollaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin 300072, China
cHaihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
dNational Industry-Education Platform of Energy Storage, Tianjin University, 135 Yaguan Road, Tianjin 300350, China
eJoint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Binhai New City, Fuzhou 350207, Fujian, China
fDepartment of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore, 117543, Republic of Singapore
First published on 20th July 2023
The complex reconstructed structure of materials can be revealed by global optimization. This paper describes a hybrid evolutionary algorithm (HEA) that combines differential evolution and genetic algorithms within a multi-tribe framework. An on-the-fly machine learning calculator is adopted to expedite the identification of low-lying structures. The method outperforms other well-established approaches, and we further demonstrate its efficacy by optimizing the complex oxidized surfaces of Pt, Pd, and Cu with different facets under (4 × 4) periodicity. The obtained structures are consistent with experimental results and are energetically lower than previously presented models.
Reconstructed surfaces tend to adopt the most thermodynamically stable structure,22 i.e., the lowest point on the potential energy surface (PES), the so-called global minimum (GM).18 Finding the GM of a working material constitutes a global optimization (GO) problem. There has been significant progress in developing GO algorithms for chemical structure optimization. However, most of them were developed for isolated particles such as crystals, clusters, or supported clusters, for which simplified models are deemed sufficient in most investigations.22–24 The optimization of a surface system is more geometrically restricted than that of isolated particles, owing to the periodicity and the strong covalent bonds between surface atoms and the underlying support,25 and generally more atoms are required for reliable modeling.
Two major difficulties must be addressed when globally optimizing surface structures: how to efficiently explore the highly complex PES, and how to reduce the high computational cost of the massive number of local relaxations used to describe the PES.26 Indeed, some GO algorithms originally developed for isolated particles have been applied to surface structures.27,28 However, their efficiency has not yet been examined systematically. Previous reports have raised a concern about the efficiency of the genetic algorithm (GA): two good parent structures may produce poor, high-energy candidates.29,30 This insufficient efficiency also limits the model size used for optimizing surface systems, as most studies are conducted with no more than (2 × 2) periodicity.31–37 The result can even be unreliable, since no known criterion guarantees that the “best structure” encountered by a GO algorithm is the GM that reflects the reconstructed surface.38
The absence of such a criterion is primarily attributed to the inherent limitations in the spatiotemporal resolution of characterization techniques, resulting in a dearth of prior knowledge about the precise atomic-level structure of a reconstructed surface.39 This poses a significant challenge to integrating machine learning (ML) aimed at mitigating the computational burden of GO. It is dangerous to rely solely on a pre-trained ML calculator that suffers from extrapolation problems, as the GM can correspond to a very narrow basin of the PES and is unlikely to be represented in the training set.40,41 Previous research on ML-assisted structural searches using techniques without extrapolation safeguards, such as neural networks, has generally relied on randomly generated structures or an already built database,42 or has kept the adjustment of atomic positions highly constrained during the search.43–45 Even for on-the-fly ML frameworks with a sampling design such as Bayesian optimization,46 if they are applied alone without incorporating near-GM structural features into the training set, their effectiveness can be compromised.47
This paper describes a new strategy for the global optimization of complex catalytic surfaces using a hybrid evolutionary algorithm (HEA) that combines differential evolution (DE) and genetic algorithms within a co-evolution framework. An on-the-fly machine learning calculator based on Gaussian processes is adopted to complement local evaluations and expedite the identification of low-lying structures. We demonstrate the HEA method by obtaining the complex surface oxide structures of different facets of transition metals (Pt, Pd, and Cu) using a (4 × 4) supercell. The globally optimized structures are lower in energy than previously reported theoretical models and are consistent with experimental observations, providing important clues for the rational design of catalysts.
For the offspring population in each tribe, a surplus of offspring candidates is generated by a hybrid evolutionary operator. One route produces offspring candidates from two parent structures via the GA operator introduced by Deaven and Ho,29 with mutations (permutation, rattle, and mirror) applied to each newly generated structure with pre-set probabilities.51 The other route applies the DE operator, for which three strategies are introduced:52
Xde = Xr3 + F × (Xr1 − Xr2)  (1)
Xde = Xbest + F × (Xr1 − Xr2)  (2)
Xde = Xr3 + F × (Xbest − Xr3) + F × (Xr1 − Xr2)  (3)
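The three DE strategies above are plain vector operations on structure representations. A minimal sketch, assuming each structure is a flat NumPy array of atomic coordinates (the function names are illustrative, not from the HEA code):

```python
# Illustrative implementations of the three DE mutation strategies,
# eqns (1)-(3), acting on flattened coordinate vectors.
import numpy as np

def de_rand_1(x_r1, x_r2, x_r3, F=0.5):
    """Eq. (1): Xde = Xr3 + F * (Xr1 - Xr2)."""
    return x_r3 + F * (x_r1 - x_r2)

def de_best_1(x_best, x_r1, x_r2, F=0.5):
    """Eq. (2): Xde = Xbest + F * (Xr1 - Xr2)."""
    return x_best + F * (x_r1 - x_r2)

def de_rand_to_best(x_best, x_r1, x_r2, x_r3, F=0.5):
    """Eq. (3): Xde = Xr3 + F * (Xbest - Xr3) + F * (Xr1 - Xr2)."""
    return x_r3 + F * (x_best - x_r3) + F * (x_r1 - x_r2)
```

Here Xr1, Xr2, and Xr3 are randomly chosen population members, Xbest is the current lowest-energy structure, and F is the DE scaling factor.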
The generated candidates are first evaluated by the GP regressor. In practice, the GP regressor must handle unrelaxed structures. Considering the time and memory cost of force prediction, the GP regressor is trained on, and applied to, unrelaxed structures labeled with their relaxed energies.47,53,54
The GP regressor was implemented using the GPyTorch Python library.55 A Gaussian process uses Bayesian inference and assumes that the prior distribution over the data is a multivariate normal distribution; its task is to infer the Gaussian posterior distribution p(E*|X,y,X*) for an unseen datapoint X* (awaiting exploration) from the observed training set (X, Eθ). X is taken to be a feature vector rather than the Cartesian coordinates; the many-body tensor representation (MBTR) descriptor is computed with the DSCRIBE software package.50,56 MBTR descriptors provide whole-system representations of periodic systems while exploiting the locality of chemical interactions. To reduce the complexity of representing large surface systems, only the top three layers (which are the most prone to reconstruction) are used for training, while the bottom layer (representing the bulk structure) is excluded. For more realistic modeling, the target value y differs from Eθ by an independent, identically distributed Gaussian noise term, y = Eθ + ε with ε ∼ N(0, σn2). The key predictive equations for the GP regressor are57–59
E*|X, y, X* ∼ N(Ē*, cov(E*))  (4)
Ē* = K(X*,X)[K(X,X) + σn2I]−1y  (5)
cov(E*) = K(X*,X*) − K(X*,X)[K(X,X) + σn2I]−1K(X,X*)  (6)
The predictive mean Ē* of the distribution estimates the potential energy, while the variance cov(E*) serves as an uncertainty quantification. K(·,·) denotes the covariance-function (kernel) matrix, which characterizes the similarity between samples and lies at the very heart of the GP regressor. In our study, the covariance function was chosen to be a sum of a Matern kernel and a linear kernel as
k(x, x′) = kMatern(x, x′) + klinear(x, x′)  (7)
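The predictive equations (4)–(6) and the composite kernel of eq. (7) can be condensed into a few lines of NumPy; this is a sketch rather than the paper's GPyTorch implementation, and the Matern smoothness (ν = 5/2) and hyperparameter values are illustrative assumptions:

```python
# Minimal NumPy sketch of GP regression per eqns (4)-(7): a Matern-5/2
# plus linear kernel, with the predictive mean and covariance computed
# via a Cholesky factorization of the noisy covariance matrix.
import numpy as np

def kernel(A, B, length=1.0, out_m=1.0, out_l=0.1):
    """Eq. (7): k(x, x') = k_Matern52(x, x') + k_linear(x, x')."""
    d = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))
    r = np.sqrt(5.0) * d / length
    matern = out_m * (1.0 + r + r ** 2 / 3.0) * np.exp(-r)
    linear = out_l * (A @ B.T)
    return matern + linear

def gp_predict(X, y, X_star, noise=1e-4):
    """Predictive mean (eq. 5) and covariance (eq. 6)."""
    K = kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)                       # stable inverse via Cholesky
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    K_s = kernel(X_star, X)
    mean = K_s @ alpha                              # eq. (5)
    v = np.linalg.solve(L, K_s.T)
    cov = kernel(X_star, X_star) - v.T @ v          # eq. (6)
    return mean, cov
```

With near-zero noise, predictions at the training points reproduce the training energies, as expected from eqns (5) and (6).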
The optimal hyperparameters Θ* are determined by maximizing the log marginal likelihood as
Θ* = arg maxΘ log p(y|X, Θ)  (8)
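Eq. (8) can be illustrated with a toy grid search; the squared-exponential kernel and the grid values below are illustrative stand-ins for the paper's Matern + linear kernel and a gradient-based optimizer:

```python
# Hedged sketch of eq. (8): choose a kernel lengthscale by maximizing
# the log marginal likelihood  log p(y|X, Theta)  over a small grid.
import numpy as np

def log_marginal_likelihood(X, y, length, noise=1e-4):
    """log p(y|X, Theta) for a squared-exponential kernel (illustrative)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-0.5 * d2 / length ** 2) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * len(X) * np.log(2.0 * np.pi))

def select_lengthscale(X, y, grid=(0.1, 0.3, 1.0, 3.0)):
    """Theta* = argmax_Theta log p(y|X, Theta), here by grid search."""
    return max(grid, key=lambda l: log_marginal_likelihood(X, y, l))
```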
The Thompson sampling (TS) method is used as the acquisition function. TS leverages the uncertainty in the posterior to guide exploration: it is a randomized strategy that samples a reward function (here related to the potential energy) from the posterior and queries the structure xn+1 with the highest reward.62,63 Fig. S1† shows an illustrative example of a TS-guided on-the-fly structure search in a non-convex search space.
xn+1 = arg maxx fw(x), where w ∼ p(y|X, Θ)  (9)
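A single TS query per eq. (9) amounts to one draw from the posterior over a candidate pool; in this sketch the reward is taken as the negative predicted energy (an assumption consistent with energy minimization), and the posterior mean/covariance are assumed precomputed, e.g. by eqns (5)–(6):

```python
# Illustrative Thompson-sampling step: sample one realization of the
# posterior over candidate energies and query the candidate whose
# sampled reward (negative energy) is highest.
import numpy as np

def thompson_sample(mean_energy, cov, rng):
    """Return the index of the candidate chosen by one posterior draw."""
    # Jitter keeps the covariance numerically positive semi-definite.
    jitter = 1e-10 * np.eye(len(mean_energy))
    sample = rng.multivariate_normal(mean_energy, cov + jitter)
    reward = -sample          # lower energy -> higher reward
    return int(np.argmax(reward))
```

The batch variant (B-TS, below) simply repeats this with independent posterior draws to select several candidates per generation.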
Equipped with this trained GP regressor, we choose a batch of candidates at once, namely the batch Thompson sampling (B-TS) method. Unlike the upper confidence bound (UCB) acquisition function used in similar studies,47,59,64 B-TS naturally trades off exploration and exploitation of the PES with no free parameters, thus avoiding the loss of efficiency caused by an inappropriate parameter setting of the UCB function.41 Its effectiveness in searching chemical space has been demonstrated before.63
Only these “most promising” structures are evaluated at the DFT level. The population is then updated by “survival of the fittest”: a fixed number of the most stable structures from the combined (parent + offspring) population are kept, while the others are eliminated.22 Nonetheless, all DFT-evaluated structures are added to the training dataset, and the GP regressor is re-trained on the fly.
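The population update described above reduces to a simple truncation selection; in this sketch, structures are represented as (energy, id) tuples for illustration:

```python
# 'Survival of the fittest' update: keep the pop_size most stable
# structures from the combined parent + offspring pool.
def update_population(parents, offspring, pop_size):
    pool = parents + offspring
    pool.sort(key=lambda s: s[0])   # ascending energy = most stable first
    return pool[:pop_size]
```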
The high efficiency of the HEA program stems from three features. First, as shown in Fig. 3a and b, the newly introduced DE operators generate lower-energy configurations much more effectively and stably than the GA operator. Mirror and permutation mutations, which are widely used in the optimization of isolated particles, perform poorly on surface systems. As a result, in Fig. 2(a), GA alone cannot efficiently optimize the (4 × 4) surface and is surpassed by the HEA program combining GA and DE operators. Second, the on-the-fly GP regressor predicts the energies of (unrelaxed) candidates reliably. In Fig. 3(b), despite some loss of accuracy, the MAE for models trained on both relaxed and unrelaxed structures drops below 10 meV per atom after around 10 generations (where the training set contains 750 data points).
Fig. 3 (a) Boxplots of energy changes relative to the average energy of the parent structures for different offspring operators in the HEA program. Details of the equation are provided in the ESI.† The lines inside the boxplots show the median energy change, the boxes extend from the lower to the upper quartile of the data, whiskers extend from the 5th to the 95th percentile, and black dots show data points outside the whisker range. (b) The on-the-fly prediction accuracy in optimizing O–Pt(111). “Tribe-unrelaxed”: the GP regressor is trained separately inside each tribe on unrelaxed structures labeled with their relaxed energies; “-relaxed”: trained on relaxed structures; “w/o-tribe”: a single GP regressor is trained on structures from all tribes. The number of training data points at 0, 20, 40, 60, and 80 generations is 360, 1260, 2160, 3060, and 3960, respectively, of which one-third are used per “tribe” as three tribes are adopted. (c) Time spent training the GP regressor inside or outside the tribe. (d) Kendall's τ coefficient between the ranking of all predicted energies and their relaxed energies, collected from a HEA search. Details of Kendall's τ coefficient are provided in the ESI.†
A relatively correct ranking of the offspring candidates is achieved, as shown by Kendall's τ coefficient in Fig. 3(d), demonstrating reliable sampling during the search. Fig. 2(b) also shows that the MBTR descriptor contains the structural information relevant for approximating the energy, a prerequisite for an accurate GP regressor. Third, the “tribe” framework not only maintains high structural diversity within each tribe,48 but also enhances the efficiency of the GP regressor: Fig. 3(b and d) show that the accuracy is improved and the sampling is more targeted. Because a sparse GP regressor does not provide sufficient accuracy (Fig. S5†), a GP regressor with a full covariance matrix is used here, which requires O(N³) computational time and O(N²) memory for the Cholesky decomposition, and thus becomes expensive and numerically unstable for large datasets.66,67 Dividing the training dataset via the “tribe” framework naturally avoids these problems, as shown in Fig. 3(c). All of these features contribute to the enhanced structural searchability of the HEA program.
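Kendall's τ used in Fig. 3(d) can be computed with a short plain-Python routine; this sketch uses the τ-a variant without tie correction as an assumption, since the paper's exact definition is given in the ESI:

```python
# Kendall's tau (tau-a): fraction of concordant minus discordant pairs.
# tau = 1.0 for identical rankings, -1.0 for fully reversed rankings.
from itertools import combinations

def kendall_tau(pred, ref):
    conc = disc = 0
    for i, j in combinations(range(len(pred)), 2):
        s = (pred[i] - pred[j]) * (ref[i] - ref[j])
        if s > 0:
            conc += 1        # pair ordered the same way in both rankings
        elif s < 0:
            disc += 1        # pair ordered oppositely
    n = len(pred)
    return (conc - disc) / (n * (n - 1) / 2)
```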
Pt can undergo irreversible restructuring under reaction conditions due to surface oxidation and subsequent Pt dissolution, which is thought to decrease catalytic efficiency and durability.19,69–72 However, atomic-level modeling of Pt oxidation remains uncertain.73
The optimized structures of 0.75 ML O–Pt(111), thought to exist at around 1.0–1.2 VRHE during electro-oxidation,73,76 are shown in Fig. 4(d and e). The structures consist of two interconnected, protruding square-planar PtO4 units 1.7 Å in height. This is consistent with the scanning tunneling microscopy (STM) images in Fig. 4(b and c) showing that the oxidized Pt(111) surface consists of a network of mono-atom-high (1.7 Å), worm-shaped islands.75,77 The surface oxidation state of the PtO4 units lies between that of PtO (Pt2+) and that of Pt3O4 (Pt2.7+), which fits the in situ XANES result that the oxidized Pt surface formed at >1.0 V presents square-planar PtO4 units with Pt in a slightly higher oxidation state than in PtO.78 With a similar oxidation state, the PtO4 units also resemble the PtO and Pt3O4 bulk oxides shown in Fig. S6.†
Fig. 4 (a) Scatterplot of the energies of structures encountered during a HEA search against the average coordination number of surface Pt atoms. Red cross: the globally optimized structure. Purple dot: the structure proposed by Hawkins et al.74 Red dot: the structure found by LASP. Green dot: the (2 × 2) optimized structure (after supercell expansion). (b and c) STM images of oxidized Pt(111) (reproduced with permission from ref. 75, Copyright 2008 Elsevier). (d and e) The globally optimized structures of (4 × 4) 0.75 ML O–Pt(111) from the HEA program.
As shown in Fig. 4(a), our optimized structure is 0.14 eV per O site lower in energy than the O–Pt(111) model previously proposed by Hawkins et al.74 Although the Hawkins et al. model (Fig. S3†) also contains PtO4 units, its chain structures of linked PtO4 units are not as stable as the separated PtO4–PtO4 motifs we obtained. Note that no known criterion guarantees that the “best structure” encountered by a GO algorithm is the “true” global optimum.38 Nevertheless, its lower energy gives our structure a greater likelihood of being the most abundant and representative phase under reaction conditions.22
Pd and Cu are widely used catalysts, but both suffer from severe reconstruction under reaction conditions.83–85 A typical characteristic of oxidized Pd(111) is the “Persian-carpet” pattern observed in STM, with which the simulated STM image based on our optimized structure agrees well, as shown in Fig. 5b. Fig. 5c–f show that the optimized structure consists of several parallel chains with features ranging from PdO2 to PdO4 units. Such coexisting oxygen species have been observed in in situ XPS studies.86 Our optimized structure is 0.35 eV per O site lower in energy than the structure found by USPEX (Fig. S4†), and 0.61 eV per O site lower than the previously reported CALYPSO structure.87 Both our model and the CALYPSO model present subsurface O atoms, which have been reported experimentally,88 but the CALYPSO model fails to contain the parallel chains that form the “Persian-carpet” pattern.
Fig. 5 (a and b) STM image of oxidized Pd(111). (Reproduced with permission from ref. 79, Copyright 2000 IOP Publishing.) Inset: simulated STM images based on the optimized structure. (c–f) Optimized structures of 1 ML O–Pd(111). (g) STM images of O–Cu(100). (Reproduced with permission from ref. 80, Copyright 2013 American Chemical Society.) Inset: simulated STM images based on the optimized structure. (h and i) Optimized structures of 1 ML O–Cu(100). (j) Experimental STM images of Cu(110). (Reproduced with permission from ref. 81, Copyright 2014 Elsevier.) (k) Simulated STM images based on the globally optimized 1 ML O–Cu(110). (l and m) The globally optimized surface structures of 1 ML O–Cu(110) from the HEA approach. (n) HRTEM image of reconstructed Cu(110) layers under O2 pressure (reproduced with permission from ref. 82, Copyright 2022 American Chemical Society).
Fig. 5(h and i) show that oxidized Cu(100) forms via the creation of Cu–O chains that resemble bulk Cu2O.89 Experimentally, a Cu2O signal has been observed during the initial oxide growth on Cu(100),90 along with the missing-row reconstruction (MRR) in Fig. 5(g),82,91,92 both consistent with the simulated STM image based on our optimized structure. Such MRR is widely reported for Cu oxidation; it is caused by increasing surface stress and has previously been shown to be energetically favorable.82,91 Subsurface oxygen is also contained in our structure, linked to the Cu–O chains through O–Cu–O units; experimentally, it is believed to form above 0.5 ML O coverage92,93 and to contribute to the increased stability of the MRR structure.94 Our optimized structure is 0.31 eV per atom more stable than the well-known (2√2 × √2)R45° reconstructed Cu(100) model,80,95 which also presents an MRR structure but fails to reflect the experimentally observed formation of Cu2O.96,97 Similarly, oxidized Cu(110) also consists of parallel, added-row Cu–O chains, whose simulated images are consistent with both experimental STM and TEM observations, as shown in Fig. 5(j–n).81,82,98
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3sc02974c |
This journal is © The Royal Society of Chemistry 2023 |