Si-Da
Huang
,
Cheng
Shang
,
Xiao-Jie
Zhang
and
Zhi-Pan
Liu
*
Collaborative Innovation Center of Chemistry for Energy Material, Key Laboratory of Computational Physical Science (Ministry of Education), Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Department of Chemistry, Fudan University, Shanghai 200433, China. E-mail: zpliu@fudan.edu.cn
First published on 30th June 2017
While the underlying potential energy surface (PES) determines the structure and other properties of a material, it has been frustrating to predict new materials from theory even with the advent of supercomputing facilities. The accuracy of the PES and the efficiency of PES sampling are two major bottlenecks, not least because of the great complexity of the material PES. This work introduces a “Global-to-Global” approach for material discovery by combining for the first time a global optimization method with neural network (NN) techniques. The novel global optimization method, named the stochastic surface walking (SSW) method, is carried out massively in parallel for generating a global training data set, the fitting of which by the atom-centered NN produces a multi-dimensional global PES; the subsequent SSW exploration of large systems with the analytical NN PES can provide key information on the thermodynamics and kinetics stability of unknown phases identified from global PESs. We describe in detail the current implementation of the SSW-NN method with particular focuses on the size of the global data set and the simultaneous energy/force/stress NN training procedure. An important functional material, TiO_{2}, is utilized as an example to demonstrate the automated global data set generation, the improved NN training procedure and the application in material discovery. Two new TiO_{2} porous crystal structures are identified, which have similar thermodynamics stability to the common TiO_{2} rutile phase and the kinetics stability for one of them is further proved from SSW pathway sampling. As a general tool for material simulation, the SSW-NN method provides an efficient and predictive platform for large-scale computational material screening.
In the past 20 years, neural network (NN) function fitting techniques have demonstrated great potential for constructing high-dimensional PESs with a high accuracy.^{13–16} For example, these techniques have been successfully utilized for predicting the reaction dynamics of small molecules^{17} (e.g. less than 15 degrees of freedom) by fitting the PES data from high level quantum mechanics calculations. Recent advances in NN PES construction may be divided into two major streams: developing better NN architecture and improving NN training efficiency.^{13,14,18–23} Belonging to the former are, for example, the atom-centered high dimensional NN scheme^{13,24} introduced by Behler and Parinello to decompose total energy into individual atomic energies, and more recently the development of deep NNs^{22} by increasing the number of nonlinear layers to add more flexibility to the network. The state-of-the-art training method now simultaneously fits energy and forces,^{21} and utilizes quasi-Newton optimization methods (e.g. global extended Kalman filter^{18,19} and the Levenberg–Marquardt method^{20}) to learn network parameters. While the NN-based PES is analytic and its accuracy is in principle comparable to quantum mechanics calculations (by which the training data set is obtained), the application of the method for material discovery has been rare due to some inherent drawbacks of NNs. In particular, the poor predictive ability is noticed for structures that are vastly different from the training data set.
The predictive power of NN PESs can be systematically improved by expanding the training set to incorporate as many structures as possible with different chemical environments, e.g. solids, surfaces and clusters. In practice, these structures may be randomly generated or obtained from PES sampling techniques, such as molecular dynamics, hybrid Monte Carlo,^{25,26} and metadynamics.^{5,6} Even so, a self-consistent iterative procedure by repeating data-expanding and NN-retraining is inevitable to ensure the robustness for the results from NN. In practical applications, this tedious and time-consuming procedure is a main obstacle in constructing global NN PESs and thus largely prevents wide applications in material discovery. Therefore, an automated and simple solution to generate a big training set representative for global PESs becomes an increasingly important challenge in the field.
In this work, we proposed a “Global-to-Global” approach for constructing the general-purpose NN PES for material discovery. The first “Global” means that we rely solely on our recently-developed stochastic surface walking (SSW) global optimization method^{11,12} to generate the training set based on quantum mechanics calculations, which explores exhaustively the PESs of relatively small systems (with periodic boundary conditions); the second “Global” indicates that the obtained NN PES is consistent with the quantum mechanics PES in a wide energy window, and thus functions properly in practice for the global optimizations of large systems, where the total energy per formula unit (f.u.) often varies by more than 10 eV. We will show that the utilization of the SSW global optimization training set largely eliminates the lengthy iterative NN improving procedure, and more importantly, renders the high transferability of the NN PES that is ideal for material discovery.
This article is organized as follows. In Section 2, we present the SSW-NN method to achieve the “Global-to-Global” goal, including the NN construction, the training method and the automated global training set generation based on the SSW method. In Section 3, we apply the SSW-NN method to an important material, namely TiO_{2}, and benchmark the overall performance of the NN PES against quantum mechanics calculations. In Section 4, we make the first attempt to use the NN PES in the global optimization of large TiO_{2} systems. We successfully identify two unprecedented hollow-cage TiO_{2} crystal structures that are as stable as common TiO_{2} rutile phase. The implications of this work for material discovery are then discussed.
In the HDNN scheme, the total energy E^{tot} of a structure can be decomposed and written as a linear combination of atomic energy E_{i} (eqn (1)), an assumption widely used in empirical force field calculations.
(1) |
(2) |
(3) |
(4) |
(5) |
(6) |
(7) |
The atomic force can be analytically derived according to eqn (8), where the force component F_{k,α}, α = x, y or z, acting on the atom k is the derivative of the total energy with respect to its coordinate R_{k,α}. By combining with eqn (1), the force component can be further related to the derivatives of the atomic energy with respect to j^{th} symmetry functions of atom i, G_{j,i}:
(8) |
For solid systems, an accurate static stress tensor is critical in structure optimization and thus for converging total energy. Similarly, the static stress tensor matrix element σ_{αβ} can be analytically derived as:
(9) |
(10) |
One major feature in our NN training is the incorporation of the stress tensor into the performance function (eqn (10)), which allows us to fit all three properties simultaneously. Pukrittayakamee et al.^{21} first introduced the mixed scheme to simultaneously fit energy and force, and found that the mixed scheme can avoid overfitting, produce more accurate forces and thus improve the predictive ability. In our practice, we found that for the global optimization of solids, both forces and stresses need to be accurate and the most convenient way is to allow the NN training to fit all three terms, either simultaneously or independently (depending on the adjustable parameters, ρ and τ, in eqn (10)).
To train the NN, the derivatives of each term in eqn (10) with respect to the weights and biases need to be analytically derived. Among them, the gradient of J_{E} can be computed by the standard backpropagation algorithm,^{27} and the gradient for J_{F} has been reported in ref. 21. We have derived equations to compute the gradient of J_{σ}, which are detailed in the ESI.†
In this work, we utilize the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS)^{28,29} method to minimize J_{tot}, which is convenient for massive parallel programming. The performance of the L-BFGS method has been demonstrated in fitting complex NNs.^{30} For the training set of ∼50000 structures from our SSW global optimization, we found that L-BFGS takes typically 20000 training epochs to generate a qualified PES with the root mean square errors (RMSEs) for energy, force, and stress being around 10 meV per atom, 150 meV Å^{−1}, and 1.5 GPa, respectively.
We note that many other gradient-based optimization algorithms have been used to optimize the network weights and biases, such as stochastic gradient descent (SGD),^{31} conjugate gradient (CG),^{32,33} Levenberg–Marquardt (LM)^{20}etc. It is in general accepted that quasi-Newton second-order methods, such as L-BFGS and LM, converge more rapidly to the true minimum for large data sets. It should be mentioned that compared to the dominant computational effort to generate the data set using quantum mechanics calculations, the training of the NN is in fact not the rate determining step in the whole NN PES construction procedure.
To date, there are a variety of global optimization methods for structure prediction, including, for example, simulated annealing,^{35} basin-hopping,^{2–4} minimum-hopping,^{36} genetic algorithm^{7–10} and the SSW method,^{11,12} which should be able to search a wide area on the PES, and identify unbiasedly the global minimum (GM) even starting from random structures. Among them, basin-hopping and genetic algorithms transform the PES by overlooking the transition region between minima and thus they are well likely to miss important structural patterns at the transition region. On the other hand, simulated annealing and minimum-hopping are based on extensive MD calculations at elevated temperatures, and both methods are rarely utilized for global search in combination with quantum mechanics (even DFT) calculations. Furthermore, the structure patterns from the MD trajectories being closely related could be overwhelmingly redundant for NN training and an iterative structure selection scheme may be required to produce a compact data set.^{13,17}
In this work we decided to utilize the SSW method developed in the group to generate the global PES structure data set. The SSW method in combination with DFT calculations has been utilized in a number of complex systems, ranging from clusters^{37–39} to surfaces^{40} and to solids.^{41} This allows us to sample different PESs within the same theoretical framework. The methodology of SSW has been detailed in our previous work,^{11,12,37} and here we provide a brief summary for the working mechanism of SSW (the algorithm of SSW is also described in the ESI†).
The SSW method targets for a rapid PES exploration via smooth structure perturbation. The idea of the SSW method comes from the bias-potential driven dynamics^{42} and Metropolis Monte Carlo sampling. Specifically, SSW smoothly manipulates the structural configuration on the PES from one minimum to another by adding bias potentials along a softened random direction and relies on the Metropolis Monte Carlo to accept or refuse each move. Thus, the SSW trajectories (see ESI Fig. S1†) include a variety of structural configurations on the PES ranging from minimum to saddle points and even to fragmented/open structures with high energy. Due to the continually-added bias potentials, even high barriers separating the minima on the PES can be surmounted efficiently. This exploration scheme is fully automated and does not need a priori knowledge of the system, such as the structure motif (e.g., bonding patterns, symmetry) of materials.
The SSW method for PES exploration is massively parallel in nature: in practice we performed a series of SSW simulations starting from different initial structures using DFT calculations. To generate a representative training set for a global PES, it is ideal to initialize SSW simulations from different phases, including the solid and layer phases and clusters. Nevertheless, this is not obliged, especially for unknown systems, since SSW being a global optimization method can visit stochastically a broad area on the PES as the SSW steps increase. For the purpose of material discovery concerned here, we will focus on solid and layer types of structures, which are treated in the same framework of periodic plane-wave DFT. In Scheme 1, we summarize the procedure for generating the data set using SSW and it is explained as follows.
(i) Set the initial structure for SSW simulations; a group of initial structures is randomly selected to represent solids, layers and clusters.
(ii) Sample the global PES using SSW in parallel; typically, DFT calculations with low accuracy setups are utilized to speed up the PES sampling. All structures along the SSW trajectories are recorded, which yields the raw data set.
(iii) Filter and generate the reference data set; the raw data set needs to be screened according to energy and geometry to remove redundant and too-high energy structures. For the large number of structures in the raw data set (typically more than 200000 structures), a smaller data set is randomly selected from the filtered raw data set, which yields the reference data set.
(iv) Refine energy, forces and stresses for the reference data set. The structures in the reference data set will be recalculated (single point energy) using DFT (or high level quantum mechanics) calculations with high accuracy setups.
In order to generate the DFT global PES, we run ten SSW simulations in total, 8 of them starting from bulk solids and 2 of them starting from layer structures where one dimension of the lattice is fixed to maintain a large vacuum (>10 Å) between adjacent unit cells. The system size is selected as 12 atoms (Ti_{4}O_{8}) and the total SSW steps for each run are limited to within 100 steps, which is suitable for DFT calculations and at the same time for enabling the SSW sampling to produce enough different structural patterns (including amorphous structural patterns). The starting TiO_{2} structures of SSW are either known crystal structures (e.g. rutile), or randomly configured. All DFT calculations performed in this work are based on the periodic plane-wave framework, as implemented in VASP,^{50,51} and the DFT setups are detailed in the ESI.†
Fig. 2a shows the global PES constructed using the 50000 structures, which are filtered and randomly selected from a 130000-structure raw data set. To project the global PES into two dimensions for visualization, i.e. energy versus geometry, here we distinguish the geometry of structure using the distance-weighted Steinhardt-type order parameter (OP),^{52} presented on the x-axis of Fig. 2. The y-axis is the energy per formula unit (f.u.), where the energy zero is set as that of GM (TiO_{2}-B phase) hereafter. The OP is written as:
(11) |
Fig. 2 Illustration of global DFT PES for TiO_{2} solids using the OP_{2} ∼ E contour plot. OP_{2}: the structural order parameter l = 2 (eqn (11)). The energy of the GM (TiO_{2}-B phase) is set as energy zero. (a) Global data set constructed according to the procedure described in Scheme 1. (b) and (c) are the data sets taken from SSW simulations starting from bulk initial structures and layer initial structures, respectively. Both are part of the global data set in (a). The black dots label the minima listed in Table 1 that are identified from the SSW global structural search using global NN PES and reoptimized using DFT. Their coordinates (E, OP_{2}) in the figures are as follow: TiO_{2}-B: (0, 0.52); anatase: (0.02, 0.37); TiO_{2}-II: (0.09, 0.23); rutile: (0.11, 0.28); Str-1: (0.12, 1.16); Str-2: (0.14, 0.87); TiO_{2}-R: (0.18, 0.32); TiO_{2}-H: (0.18, 0.64); lepidocrocite: (0.18, 0.57); baddeleyite: (0.20, 0.18); Str-3: (0.23, 1.07); Str-4: (0.35, 0.47); Str-5: (0.35, 1.03); Str-6: (0.43, 0.38); Str-7: (0.47, 0.49); Str-8: (0.51, 0.52); TiO_{2}-OII: (0.87, 0.02). |
As seen from Fig. 2a, the stable crystal phases are located at the bottom of the global PES. The OP_{2} values for the four common crystal phases, TiO_{2}-B, anatase, rutile, and TiO_{2}-II as highlighted by the black dots, are 0.52, 0.37, 0.28, and 0.23, respectively. It is obvious that most of the structures from the SSW trajectories are located nearby the anatase and rutile phases with the OP_{2} value being 0.2–0.4. This may not be surprising since these two phases contain only TiO_{6} octahedra that are the most common structural patterns in TiO_{2} crystals.
Since this DFT global PES is glued together from all SSW trajectories, it is of interest to examine the contributions from different initial structure sources. Fig. 2b and c show the bulk-starting PES and layer-starting PES, respectively. Interestingly, the former covers most parts of the PES and is quite similar to the global PES, while the latter concentrates mainly at the middle region of the OP_{2}-energy plot. It should be noted that due to the limited SSW sampling steps (100 SSW steps), the bulk-starting SSW trajectories have a low probability of visiting some of the layer structures, as evident by the lack of population in the region of large OP_{2} (>0.9). On the other hand, the layer-starting SSW simulation is constrained to keep the vacuum in the Z-direction and thus the common bulk crystal phases are not found in the trajectories (e.g. see Fig. 2c for the black dots).
It should be mentioned that ACSFs depict only the short/medium-range atomic environment (defined by the cutoff radius) and thus the long-range electrostatic interaction, if present in bulk materials, e.g. ionic solids, may dominate the error of the NN. Considering the strong ionic nature of TiO_{2}, we have decided to utilize both the fixed charge model by setting the O and Ti atom as −0.5|e| and 1.0|e|, respectively, and the floating charge model as introduced by Behler.^{14} These atomic charges will be utilized for solving the Ewald electrostatic energy^{53} of a crystal to account for the long-range electrostatic interactions and the total energy for NN training (eqn (1)) is modified as
E^{tot} = E^{DFT} − E^{Ewald} | (12) |
Fig. 3a shows the energy-resolved NN performance, including the RMSEs of energy, force, and stress for the test set, where the RMSEs of the structures in the same energy interval, E to E + dE (0.20 eV f.u.^{−1}), are calculated. It is clear that the RMSEs increase as the energy increases, which suggests simply that the low energy structures can be predicted much more accurately compared to the high energy structures. This might be understood as follows. First, the predictive ability of the NN is much dependent on the density of the data set: the low energy structures are more frequently sampled due to the Metropolis Monte Carlo algorithm. Second, compared to the low energy structures, the high energy structures, often related to defected crystals, saddle states and liquid-like structures, have a rich structural variety and a high density of states. The sampling of these high energy structures is technically difficult and insufficient, even with the SSW global optimization method.
Nevertheless, for the most important energy region of materials, ∼0.6 eV f.u.^{−1} (e.g. the rutile-anatase solid phase transition barrier of TiO_{2} is calculated to be ∼0.3 eV f.u.^{−1} that occurs above 500 K in experiment^{40,54}), the NN performance is rather accurate with low deviations, which are ∼12 meV f.u.^{−1} (4 meV per atom), 60 meV Å^{−1} and 0.4 GPa for the RMSE of energy, force and stress, respectively. This provides us with great confidence for both new structure discovery and reaction pathway sampling (shown in the next section).
To illustrate the fast convergence for the L-BFGS simultaneous fitting of energy, force and stress (EFS training), we have compared the convergence curve with that from fitting the energy only (E training) as is done traditionally. This is shown in Fig. 3b, where the blue and red lines represent the EFS and E training, respectively (solid and dotted lines for the minimization of energy and force, respectively, as measured by the RMSE values). While both training sets show similar convergence for the energy RMSE, the performance for the force and stress (shown in the inset) is obviously better in the EFS training. Especially, the force RMSE in the EFS training after 2000 epochs is 173 meV Å^{−1}, which is about half of that in the E training (334 meV Å^{−1}). We emphasize that the forces and stresses are critical for global optimization: (i) they are required in the structural optimization to reach to the true local minima and also the saddle points; (ii) they determine the structural perturbation direction in the SSW global optimization and thus control the efficiency for PES exploration. The SSW method utilizes a numerical method, the constrained Broyden dimer method,^{55} to soften the randomly generated mode, along which the structure configuration is pushed out of the local minimum.
Finally, it might be mentioned that the obtained NN PES calculations are computationally much less demanding than DFT calculations. The NN PES calculations scale linearly with the increase of the system size due to the parallel nature of the HDNN scheme (partitioning the total energy into the contribution from individual atoms). We have compared the performance of the NN PES calculations for different sized TiO_{2} systems with that using DFT calculations. For TiO_{2} solids of medium size (48 atoms), the NN PES calculation is ∼20000 times faster than the DFT calculations (ESI Fig. S2 and Table S7†). With the high efficiency of the NN PES, it is now feasible to explore the global PES of large systems using the SSW method, which is practically forbidden in the framework of quantum mechanics calculations (see the example below, and the comparison of the computational cost is detailed in ESI†).
First, based on the global set we construct another two smaller data sets, as shown in Fig. 4: (a) a partial set, which contains only 4000 structures that are randomly selected from the global set; (b) a segmented set, which also contains only 4000 structures taken randomly from selected OP_{2}-energy regions (see Fig. 4b). The segmented set is thus highly discontinuous. We then follow the same NN training procedure to train these two data sets and obtain the corresponding NN PES, denoted as the partial NN PES and segmented NN PES.
Fig. 4 PES contour plots for two small DFT data sets truncated from the global data set in Fig. 2a. (a) Partial data set: 4000 structures randomly selected from the global set. (b) Segmented data set: 4000 structures taken randomly from selected OP_{2}-energy regions in the global set. Also see the Fig. 2 caption for the meaning of the black dots. |
Second, we run the SSW global optimization with a 12-atom unit cell using the global NN PES. All experimentally known structures (9 structures, their structural parameters are listed in the ESI†) in the 12-atom unit cell have been found from SSW. In Table 1, we compare the performance of the global, partial and segmented NN PES for these known structures and other 8 randomly selected low energy structures within the 0–0.6 eV f.u.^{−1} window (Str-1 to Str-8), which are benchmarked against the DFT calculations results. All of these structures have been locally optimized using different PESs until the maximal force component is below 0.01 eV Å^{−1} and the stress is below 0.01 GPa (the same criteria for minimum convergence are utilized throughout the work). It is noted that Str-1, 2, 3, and 5 are newly found structures from the 12-atom SSW-NN that are not present in the global DFT PES in Fig. 2. This is obviously due to the fact that the SSW sampling in DFT only runs limited steps.
No | Species | SG^{a} | OP_{2}^{a} | E _{DFT}/eV f.u.^{−1} | dE/(eV f.u.^{−1})^{b} | V _{DFT}/Å^{3} f.u.^{−1} | dV^{b}/% | ||||
---|---|---|---|---|---|---|---|---|---|---|---|
Global^{d} | Part. | Seg. | Global | Part. | Seg. | ||||||
a SG: symmetry group; OP_{2}: Steinhardt-type order parameter (see eqn (11)). b dE and dV: deviation of energy and volume in percentage from the NN PES with respect to those (E_{DFT} and V_{DFT}) from the DFT PES, respectively. c The data in brackets are the RMSE excluding the data of the TiO_{2}-OII phase. d Global, part. and seg. refer to the NN PES trained from the global DFT PES (Fig. 2a), the partial DFT PES (Fig. 4a) and the segmented DFT PES (Fig. 4b), respectively. | |||||||||||
1 | TiO_{2}-B | 12 | 0.52 | 0.00 | −0.01 | 0.01 | −0.01 | 36.66 | 0.43 | 0.38 | 0.34 |
2 | Anatase | 141 | 0.37 | 0.02 | −0.01 | 0.00 | 0.03 | 35.19 | −0.03 | 0.57 | 0.93 |
3 | TiO_{2}-II | 60 | 0.23 | 0.09 | 0.01 | 0.02 | 0.04 | 31.54 | 0.59 | 0.01 | 0.21 |
4 | Rutile | 136 | 0.28 | 0.11 | 0.00 | −0.00 | 0.03 | 32.15 | 0.14 | −0.32 | −0.26 |
5 | Str-1 | 74 | 1.16 | 0.12 | 0.01 | 0.08 | 0.17 | 83.16 | −0.39 | −0.82 | 1.25 |
6 | Str-2 | 2 | 0.87 | 0.14 | 0.01 | 0.03 | −0.06 | 52.88 | 2.26 | 1.75 | −8.13 |
7 | TiO_{2}-R | 62 | 0.32 | 0.18 | 0.01 | 0.02 | 0.00 | 35.31 | 0.94 | −0.32 | 3.05 |
8 | TiO_{2}-H | 87 | 0.64 | 0.18 | 0.04 | 0.02 | −0.02 | 39.56 | 0.80 | 0.10 | 0.29 |
9 | Lepidocrocite^{57} | 59 | 0.57 | 0.18 | −0.00 | 0.01 | 0.01 | 44.75 | 0.12 | −0.68 | −0.12 |
10 | Baddeleyite | 14 | 0.18 | 0.20 | 0.02 | 0.03 | 0.02 | 29.89 | 0.03 | −0.01 | 0.07 |
11 | Str-3 | 5 | 1.07 | 0.23 | −0.02 | 0.09 | 0.12 | 76.19 | 0.49 | 4.22 | 3.43 |
12 | Str-4 | 12 | 0.47 | 0.35 | 0.05 | −0.05 | −0.09 | 37.62 | 0.99 | 17.11 | 16.93 |
13 | Str-5 | 5 | 1.03 | 0.35 | 0.00 | 0.11 | 0.17 | 78.33 | 1.07 | 2.25 | 2.81 |
14 | Str-6 | 12 | 0.38 | 0.43 | 0.03 | −0.13 | −0.14 | 33.35 | 0.82 | 18.37 | 15.67 |
15 | Str-7 | 6 | 0.49 | 0.47 | 0.01 | −0.09 | −0.13 | 39.91 | −1.11 | 6.97 | 6.21 |
16 | Str-8 | 1 | 0.52 | 0.51 | −0.00 | −0.02 | −0.05 | 41.04 | 3.18 | 4.07 | 8.74 |
17 | TiO_{2}-OII | 62 | 0.02 | 0.87 | −0.03 | −0.80 | −0.82 | 26.25 | 0.49 | 96.88 | 35.36 |
RMSE | 0.02 | 0.20 | 0.21 | 0.94 | 22.72 | 9.55 | |||||
(0.02)^{c} | (0.06)^{c} | (0.09)^{c} | (0.97)^{c} | (5.81)^{c} | (6.02)^{c} |
In Table 1, we show the energy and volume for these energy minima computed from different PESs. As shown, the overall performance of the global NN PES is the best with an energy RMSE of 0.02 eV f.u.^{−1} and a volume RMSE of 0.94%, while the segmented NN PES provides the worst results with the energy and volume RMSE being 0.21 eV f.u.^{−1} and 9.55% (including the TiO_{2}-OII phase). The energy sequence for the lowest six minima from DFT is TiO_{2}-B < anatase < TiO_{2}-II < rutile < Str-1 < Str-2, which is correctly predicted only by the global NN PES. The segmented NN PES, for example, wrongly predicts TiO_{2}-B < anatase < Str-2 < TiO_{2}-II < rutile < Str-1.
It is noted that three structures, Str-1, 3, and 5, are poorly predicted by partial and segmented NN PESs. In particular, the high pressure phase TiO_{2}-OII is not stable on these two PESs. All Ti atoms in Str-1, 3, and 5 are in a 4-coordinated polyhedron [TiO_{4}], while that in TiO_{2}-OII is [TiO_{8}], which is a rare coordination environment in TiO_{2} solids (the percentage of structures containing 4/8-coordinated Ti in all data sets is about 2% and 5%, as compared to 45% for the common 6-coordinated Ti). Because the size of the global set is approximately 10 times larger than that of the other two data sets, only the global set contains a sufficiently large number of structures containing [TiO_{4}] and [TiO_{8}]. We expect that this helps to greatly improve the predictive ability in the global NN PES. From the comparison in Table 1, we conclude that both the size and the structural varieties of the DFT data set will affect the predictive ability of the NN PES. This reaffirms the necessity of applying SSW global optimization for global data set generation.
Fig. 5 (a) PES contour plot for TiO_{2} distinct minima sampled from the SSW-NN global search. (b) Density of states (DOS) plot for TiO_{2} phases in (a), showing that amorphous structures appear in the 0.2–0.7 eV f.u.^{−1} window. (c) The percentage of differently coordinated Ti for TiO_{2} structures. [TiO_{5}] and [TiO_{6}] are the main structural features for amorphous structures. Also see the Fig. 2 caption for the meaning of the black dots. |
The PES in Fig. 5a shows that the crystal minima are rather scattered below ∼0.2 eV f.u.^{−1}, which in total contains 224 minima in the 48-atom system (among these minima, 180 minima are newly found from the SSW-NN simulation). These belong generally to single- and mixed-phase crystalline structures. We note that the number of crystalline minima is substantially large considering that there are only ∼10 TiO_{2} phases reported from experiment.^{49} From 0.2 to 0.7 eV f.u.^{−1}, one can see a salient deep-blue zone due to the energy degeneracy of the structures, featuring a high, continuous and broad peak in the DOS plot (Fig. 5b). This zone is dominated by amorphous structures.
It is also of interest to analyze the geometrical structure for all of these TiO_{2} minima, which could help to clarify the difference between amorphous and crystalline structures. For this purpose, we have computed the Ti coordination number x ([TiO_{x}]) for all structures. In Fig. 5c, we plot the evolution of the Ti coordination environment with the increase of energy. This is done by counting and averaging the Ti coordination for structures in the same energy interval, E to E + dE (5 meV f.u.^{−1}). As shown, [TiO_{4}], [TiO_{5}], [TiO_{6}] and [TiO_{7}] are the major coordination environments for Ti. For the low energy crystal phases (e.g. rutile, anatase, and TiO_{2}-B), they generally have only TiO_{6} octahedra (100% [TiO_{6}], energy lower than 25 meV f.u.^{−1}). The [TiO_{5}] starts to appear above 30 meV f.u.^{−1} and it together with [TiO_{6}] is the dominant coordination for amorphous structures. By contrast, [TiO_{4}] and [TiO_{7}] are rare coordinations in TiO_{2} solids: they appear at high energies and are found either in high pressure phases ([TiO_{7}]) or in open (e.g. cage, 2-D layer) structures ([TiO_{4}]) along with low coordination O atoms.
We then focus on the low energy crystalline structures to search for stable porous structures, which could be potentially useful as molecular sieves and catalysts. Indeed, we find two such phases, labeled as phase-87 (I4/m, #87) and phase-139 (I4/mmm, #139), which rank 10 and 23 in the minimum spectrum, respectively, as shown in Fig. 6. These two phases are alike and have open cages (their structures are detailed in the ESI†), are not known previously and are also not identified in Fig. 2 (12-atom PES using SSW-DFT). As shown in Table 2, both phases are unexpectedly stable with the calculated total energy comparable to the common TiO_{2} rutile phase (0.11 eV f.u.^{−1}), which ranks 30 in the minimum spectrum. It might be mentioned that the NN PES has correctly predicted the energy and geometry of these phases: the energy, pore size (labeled in Fig. 6) and volume deviations are smaller than 0.02 eV f.u.^{−1}, 0.08 Å and 0.7 Å^{3} f.u.^{−1}, respectively, compared to the DFT results, as detailed in Table 2. Because of their high thermodynamic (as reflected from the energetics) and kinetics stability (as seen from the pathway in Fig. 7), further studies are carried out in the group to explore the synthetic possibility of these porous phases.
Phase | Z | Energy | Pore size | Volume | |||
---|---|---|---|---|---|---|---|
DFT | NN | DFT | NN | DFT | NN | ||
87 | 8 | 0.05 | 0.07 | 6.32 | 6.40 | 53.86 | 54.21 |
139 | 16 | 0.10 | 0.10 | 7.31 | 7.35 | 57.68 | 57.90 |
Fig. 7 Minimum energy reaction pathway for the solid-to-solid phase transition from phase-87 to anatase using DFT and NN PES. The reaction coordinate is the relative distance in the generalized coordinate containing both the lattice and atom (see ref. 56 for detail) with respect to the initial state and the final state. |
Taking the newly found porous phase, phase-87 (Fig. 6a), as the example, we have examined its kinetics stability by searching for the lowest energy pathways linking it with other phases. SSW pathway sampling was performed similarly as described in the previous work.^{52} In short, the SSW starts from phase-87 and explores the likely product phases nearby. All of the product phases that are different from phase-87 will be recorded, which yields a database of reactant/product pairs. In total, 170 reactant/product pairs were obtained after SSW visits to 1000 minima. By using the variable-cell double-ended surface walking (VC-DESW) method,^{56} we then searched the TS of all pathways and found that the lowest energy pathway escaping from phase-87 leads to the anatase phase.
Fig. 7 shows the reaction profiles for the lowest energy pathway from phase-87 to anatase using both NN PES (blue line) and DFT (yellow). As shown, the reaction profile from NN PES agrees generally well with that from DFT, where the maximum deviation in energy is less than 34 meV f.u.^{−1} and the energies of the TSs in the two profiles are nearly the same (∼8 meV f.u.^{−1}). The accuracy at the phase-87 side (30 meV f.u.^{−1}) is poorer than that at the anatase side (12 meV f.u.^{−1}), apparently because phase-87 as a new phase is not present in the DFT training set.
Importantly, the calculated barrier for the phase-87 to anatase transition is 0.22 eV f.u.^{−1}, which is quite high considering that the barrier from TiO_{2}-B to anatase is only 0.12 eV f.u.^{−1} and the experimental temperature for TiO_{2}-B to anatase conversion is ∼673 K.^{54} Our results indicate that the porous phase-87 is kinetically stable. It is well likely to stabilize the new phase at ambient conditions once it can be synthesized.
Finally, it might be mentioned that the NN PES can be improved once new structure data with quantum mechanics calculations are added into the training set. In this work, after SSW pathway sampling on phase-87, we add 2000 new structures from the sampling trajectories (calculated using DFT with high accuracy setups) into the global data set, and follow the same training procedure to train the NN PES. Using the re-trained NN PES, we recalculate the reaction pathway, as shown by the red line in Fig. 7, and find that the pathway using the expanded NN PES matches with the DFT pathway nicely, with the maximal energy deviation smaller than 12 meV f.u.^{−1} These results indicate that NN PES generated from DFT global PES, although it has a good predictive ability for both structure search and pathway sampling, can be further improved once new structures or pathways are identified. For the purpose of material discovery, such an improvement, however, might not be essential since the quantum mechanics calculations can always be performed afterwards for validation.
(i) The data set is generated using a SSW global structural search that runs in parallel from varied initial structures. These SSW simulations are carried out typically in relatively small systems under a periodic DFT framework.
(ii) The NN training of the data set utilizes a performance function constituting energy, forces and stresses, which is minimized using the L-BFGS method to achieve a fast convergence.
(iii) The global NN PES matches the DFT PES in a wide energy window (∼7 eV f.u.^{−1}) with regard to total energy, forces and stresses. The accuracy of the NN PES drops slowly with the increase of energy. For a medium size system (e.g. 48 atoms), the global NN PES is ∼four orders of magnitude faster than DFT.
Using the SSW-NN method, the global NN PES for TiO_{2} solids is constructed. In the most concerned energy region (e.g. ∼0.6 eV f.u.^{−1} above GM), the typical accuracy in RMSE is within 12 meV f.u.^{−1} (4 meV per atom) for energy, 60 meV Å^{−1} for force and 0.4 GPa for stress. We demonstrate the predictive ability of the current SSW-NN method by exploring the global PES of a 48-atom TiO_{2} system with variable cell. This leads to the finding of two new porous TiO_{2} crystal phases, whose stabilities are comparable with that of the common TiO_{2} rutile phase. The solid-to-solid transition pathway sampling based on the global NN PES of TiO_{2} proves the kinetics stability of the porous structure.
For a single material, the computational cost for generating the global NN PES is up to a few hundreds of CPU cores for one week to compute the raw and the fine data set using quantum mechanics (DFT) calculations. In this work, in the TiO_{2} case, the actual computation time of DFT calculations for raw and fine data set generation is ∼0.94 × 10^{6} and ∼4.80 × 10^{6} seconds per CPU (24 cores), respectively (or ∼6.7 days for 10 CPUs in total). In applications of material discovery, the NN PES helps to greatly expedite the SSW global PES search. We estimate that the overall SSW-NN computational cost (NN PES construction plus SSW search = 1.3 × 10^{7} seconds per CPU) is roughly 10^{3} to 10^{4} times lower than that using SSW-DFT directly (10^{10} to 10^{11} seconds per CPU, see ESI Table S8†). Considering the automated process, perfect parallelization and generality of the SSW-NN method, we expect that it is time to build a comprehensive database of global NN PESs for different materials. With the collective efforts from the community, the global NN PESs can be further improved/tailored for specific applications that require a high accuracy in particularly local regions of PES. The SSW-NN method thus provides a promising solution to bridge the time-scale in material simulation and may open a new era for material discovery from theory.
Footnote |
† Electronic supplementary information (ESI) available: Derivation for the gradient of J_{σ} with respect to NN parameters. DFT calculation setups. Parameters of atom-centered symmetry functions for generating TiO_{2} NN potential. Comparison of the performance of symmetry functions with different numbers of cutoff radii. Comparison of the computational cost of NN and DFT calculations. Comparison of structural properties of experimentally known TiO_{2} phase using NN and DFT PES. XYZ coordination of structures. See DOI: 10.1039/c7sc01459g |
This journal is © The Royal Society of Chemistry 2017 |