Metashooting: A Novel Tool for Free Energy Reconstruction from Polymorphic Phase Transition Mechanisms

We introduce a novel scheme for the mechanistic investigation of solid-solid phase transitions, which we dub \textit{metashooting}. Combining transition path sampling molecular dynamics and metadynamics, this scheme allows for both a complete mechanistic analysis and a detailed mapping of the free energy surface. This is illustrated by performing \textit{metashooting} calculations on the pressure-induced B4/B3 $\rightarrow$ B1 phase transition in ZnO. The resulting free energy map helps to clarify the role of intermediate configurations along this activated process and the competition between different mechanistic regimes with superior accuracy. We argue that \textit{metashooting} can be efficiently applied to a broader class of activated processes


INTRODUCTION
Molecular dynamics (MD) has become the method of choice to investigate complex chemical phenomena at the atomistic level. Thanks to both efficiently written algorithms and highperformance computational infrastructures, the scope of numerical simulations has markedly widened. Nonetheless, the presence of inherent kinetic bottle-necks means that entire regions of configuration space may remain unexplored. If such states are separated by large energy barriers, even the most efficient MD will end up spending considerable time within long-lived, metastable states. Furthermore, depending on the height of those barriers, the event of interest may fully elude observation over the time scale of the simulation [1][2][3].
Protein folding, nucleation, crystallisation, phase transitions and chemical reactions are all important phenomena which are characterised by activated steps. Their detailed mechanistic investigation would benefit considerably from a precise mapping of the underlying free energy landscape.
However, despite improving numerical algorithms and increasing computational power, their systematic investigation remains extremely demanding and often completely impractical. Methods to explore multi-minima energy landscapes exist [4,5] and recently there has been a pronounced effort to develop novel approaches to overcome this time-scale problem [3,6]. Historically, the use of an external bias aimed at enhancing the frequency of rare events in MD simulations is realised in methods such as Umbrella Sampling (US) [7]. This is achieved by systematically adding a bias along well-chosen coordinates, in order to sample only small but overlapping regions from which a free energy can be reconstructed. Another bias-oriented method, metadynamics (metaD) [3] exploits the concept of Collective Variables (CV), which span a lower dimensional coordinate space.
Within the space of these CVs, a history-dependent bias is deposited, which rapidly drives the system away from its initial, probable state. The definition of a good set of CVs is therefore key to the success of the method [8]. Within a CV space, standard MD would linger around a minimum close to the initial state corresponding to some value of the CV. Overcoming free-energy barriers by metadynamics inevitably leads to broader exploration of CV values by crossing over to different states within a finite computational time.
Metadynamics takes place within the space of one or a set of CVs, which have to be known at the beginning of the calculation. A valid set of collective variables optimally covers the entire space of conformations of interest. This clearly requires that CVs be engineered with the global free energy map in mind. However, this is evidently problematic, as often the underlying free energy landscape is the desired final result of the simulation and can often not be deduced by intuition. Additionally, the requirement for such a 'global' CV directly clashes with the effort to keep its dimensionality as low as possible.
Recently, a novel approach has been introduced, which allows for a "bespoke bias" based on a variational approach [9,10]. Within this approach, the CV problem is partially overcome by reinterpreting its meaning as a local descriptor of metastable states. Enhancing static fluctuations will amplify the probability of a crossover into another metastable state. Importantly, the bias can be deposited in such a way that partial filling can be achieved in all basins of interest. While the success of the method still depends on an appropriate choice of CVs, this represents a novel general scheme for enhanced-sampling MD simulations.
To tackle the intrinsic disparity between the time scales of activated processes and molecular dynamics simulations, the Transition Path Sampling method (TPS) [6,11] was introduced. TPS is based upon the idea that, in a complex system, crossing a barrier between (meta-)stable states may happen in a multitude of different ways. In such a scenario, the potential energy surface of a system is unlikely to be represented by saddle points alone. Instead, numerous points on the potential energy surface become relevant, some of which may still be stationary points, but many others not. The implementation of the method consists of collecting such an ensemble of reactive trajectories, without previous knowledge of the details of the process (that is, no a priori information or guess about the reaction coordinates is required -such information should naturally result from the simulation approach). Within a transition ensemble, the relevance of a path is weighted by its probability. TPS collects true dynamical trajectories and can be plugged on top of different molecular dynamics schemes, within different simulation ensembles [2]. TPS belongs to the class of pathway-orientated methods and allows for an exquisite level of detail in mechanistic investigations [12][13][14][15][16][17][18]. Different from metadynamics, TPS does not deposit any bias potential; similarly to metadynamics, it deploys a low-dimensional order parameter, which distinguishes between the basins of interest. TPS is designed to enhance statistics in intermediate regions between long-lived basins, however it is generally not a method of choice for free energy reconstruction as the main statistical weights are still contributed by (meta-)stable basins.
Here, we present a novel combination of TPS and metadynamics techniques, which we dub metashooting. This novel approach allows for reconstruction of the underlying free energy of a process from TPS trajectories. This combination is based on the propagation of CVs from TPS into metaD. While TPS controls the rapidity of CV variation (inherent to true dynamical trajectories), metadynamics deposits a bias around regions of the CV that vary comparatively slowly. In doing so, TPS keeps the path crossing probability ratios (A → B to B → A) unchanged, thus preventing unilateral over-biasing. By rapidly varying the controlling CVs in TPS, partial basin filling can be achieved, while maintaining the intermediate regions bias-free for most of the calculations. Only in its final phases do the details of the intermediate regions appear.
In the following we illustrate the details of our approach by performing a complete metashooting analysis of the solid-solid reconstructive phase transition between the B3 zincblende phase and the B1 high-pressure rocksalt phase of zinc oxide (ZnO). ZnO has proven to be extremely challenging over the years [19][20][21][22][23], with several attempts to conclusively shed light onto competitive mechanisms and intermediates. Saitta and Decremps first discussed the competitive nature of the 'tetragonal' and 'hexagonal' routes in the B4-B1 phase transition, and their work concluded that semiconductors involving d-electrons, such as ZnO, preferred the 'tetragonal' path [21]. However, other work seemed to contradict this -notably the experimental work of Liu et al., which utilised high resolution angular dispersive X-ray diffraction, which seemed to indicate that the 'hexagonal' pathway was the more favoured for ZnO at lower pressures [24]. Second harmonic generation experiments pinpointed the role of the pressure medium (hydrostatic vs. non-hydrostatic) to promote the formation of centrosymmetric iH or acentric tetragonal iT , respectively [22]. In a previous work [15], TPS was used to produce a detailed mechanistic pathway for the B4-B1 transition, and it was shown that the mechanism proceeds via a rich series of transition states and intermediates [12] Additionally, the two 5-coordinate intermediate motifs were visited and were shown to be competing with one another in the mechanism. The analysis, based on dynamical trajectories, concluded that a hexagonal intermediate could indeed be visited, however it was not essential to the transformation. In fact, the reconstruction was found to propagate over interfaces of local tetragonal motifs (Fig.1).
It is therefore apparent that the phase transitions between different polymorphs of ZnO are very non-trivial. Indeed, much debate still rages over their precise nature and work is still being published to attempt to unravel the minutia of these elusive mechanisms, with a particular emphasis on the B4-B1 transformation. In the following, after a summary of implementation details, the results of metashooting on a challenging solid-solid phase transition in this material between the B3 and B1 phases are discussed. were shown to compete against one another in the B4-B1 transformation. Whilst the iT was only found at the interface between B4 and B1, the iH structure sometimes dominated the entire system under study.
Despite this, iH phase was shown not to be crucial to the transformation [12].

METHODS
Our complete analysis of the B3-B1 phase transition in ZnO proceeds over 2 steps: i) TPS calculations, including the definition of a suitable order parameter and ii) free-energy mapping with metashooting, where TPS and metaD are combined.

Transition Path Sampling
The aim of TPS is to obtain the transition path ensemble [6,11] and an efficient approach to this task involves the use of Monte Carlo techniques. A random walk is carried out in the space of the trajectories, whereby a new path x (n) (τ) is generated from an old one, x (o) (τ). The implementation of the random walk moves are based on the shooting algorithm [2], whose general premise is to generate pathways by "shooting off" a new pathway from a randomly selected time slice x  Using the 1 st coordination shell only means that there is no discrimination between the B3 and B4 configurations. As the system is starting from a metastable B3 structure (see below), changes in the final regime can be expected. Furthermore, using solely the 1 st coordination number as the order parameter exerts absolutely no bias on the calculation. The variation in the order parameter arises simply as a result of the application of the transition pressure to the system. Hence, monitoring the 1 st coordination sphere does not force the system to evolve in any particular way over the course of the path sampling iterations. In TPS, both the overall mechanism and the trajectory endpoints are rectified [26]. Therefore, the order parameter is chosen with minimal constraints in mind, in order to allow the process to express the most probable outcome. A higher-order CS would drive the system into a specific transformation regime within a biased TPS approach, in order to study specific, less probable transformation pathways.

Metashooting
Metashooting combines TPS and metadynamics to efficiently reconstruct the underlying free energy landscape from TPS trajectories. This way, computing time will not be expended by filling other regions of configuration space that had no relevance to the transition.
The general procedure of the metashooting approach for a transition A → B is as follows: 1. TPS step 1: Starting from a randomly chosen snapshot along the TPS trajectories, shoot off a new trajectory and time propagate.
2. MetaD step 1: Once basin A is reached, deposit a bias using a metadynamics scheme (in this case, using the well-tempered approach), in order to locally partially fill the as-yet uncharacterised energy well. At this stage, a small number of metasteps are chosen, which allows the CV to vary about the characteristic value of the basin.
3. TPS step 2: Generate another trajectory (as in step 1) and propagate into the now biased basin A, to generate a true MD velocity distribution on the biased potential.

TPS step 3:
Similarly to TPS step 1, shoot off a new trajectory from the same snapshot as above and propagate, only this time backwards in time, to basin B.

MetaD step 2:
In basin B, apply the same metadynamics scheme as above for a number of steps, in order to partially fill this second basin of attraction.
6. TPS step 4: Re-shoot a novel trajectory from a random snapshot of the previous trajectory.
Propagate into the now biased basin B.
Repeat the above iterations of metadynamics and TPS moves, until the underlying free energy profile is fully converged.
The number of metadynamics steps for each metashooting move can be chosen such that only moderate variations around the current CV value are allowed, as a ∆CV Max . Alternatively, as a better means to control bias deposition without imbalance, TPS path acceptance probability in TPS can be chosen to stay the same, i.e. at no point can crossing over the activation energy barrier become easier for either direction A → B or B → A. This is realised within an adaptive scheme of rescaling momenta modifications (see below).

Molecular Dynamics
Born-Oppenheimer molecular dynamics simulations were performed using the cp2k package within the N pT ensemble at T = 300 K and p= 9.8 GPa. At this pressure, the enthalpy of B1 and B3 are equal. Constant pressure and temperature were ensured by the Martyna-Tobias-Klein [27] algorithm, allowing for anisotropic shape changes of the simulation box [28]. Inter-atomic forces were calculated from a Buckingham pair potential [29], which has been previously validated [15].
Long range electrostatic effects were accounted for using an Ewald summation. Time propagation was performed with a time-reversible predictor-corrector algorithm. An integration timestep of 0.2 fs was used. The simulation box contained 1200 Zn-O atom pairs.

Transition Path Sampling
The sampling starts from an initial trajectory connecting the limiting phases. A new trajectory is generated by selecting a configuration from the existing one and slightly modifying the atomic momenta. The modifications δ p are applied to randomly chosen pairs of atoms (i, j) according to: p new = p old + δ p · ( r j − r i )/| r j − r i | and p new = p old − δ p · ( r j − r i )/| r j − r i |, keeping both momentum and angular momentum conserved. The resulting atomic momenta are rescaled by a factor of E old kin /(∑ i=1 | p i | 2 /2m i ), in order to keep the total kinetic energy conserved. The propagation of the modified configuration in both directions of time (−t, +t) generates a new trajectory. Repeating this step provides a set of trajectories, with successful examples representing the transition ensemble. TPS requires an initial guess trajectory, which we obtained from a topological/geometric approach of interpolating between the two limiting crystal structure (B1 and B3), described elsewhere [13,14,26]. Momenta modifications were introduced using an adaptive scheme, which entailed rescaling δ p to larger or smaller values, based on whether this δ p led to a successful or failed trajectory, respectively. The acceptance ratio for the production runs was generally between 40-60%.

Metadynamics
Metadynamics calculations were carried out using the plumed plug-in [30], in which a historydependent Gaussian bias is added to the underlying potential. The well-tempered [31]

Mechanistic analysis
TPS was initiated from a B3-B1 trajectory regime, obtained from the modelling approach described and used elsewhere [13,32]. One of the advantages of initiating the process from a plausibly designed trajectory is that it takes fewer TPS iterations for the system to move away from the initial, unfavourable regime to a more probable one. The process of moving towards a probable regime is known as trajectory decorrelation. In fewer than 50 TPS iterations, the procedure steered the trajectory away from the collective motion encoded by the geometric model towards a regime characterised by nucleation events. The trajectories continued to evolve until no further major mechanistic changes were observed, which occurred after several hundred iterations The complexity of the system leads to several types of trajectories, with some degrees of (dis)similarity in the details of the reconstruction.
A representative B1 → B4/B3 trajectory is shown in Fig. 2. A nucleus of 4-connected ZnO (Fig. 2, 0.6 ps) forms within B1, followed by the formation of regions of 5-fold coordination at the interface with B1 (Fig. 2, 0.7 ps), which are subsequently outgrown by 4-fold coordinated regions (Fig. 2, 0.9-4.7 ps). Regions of 5-fold coordination (mostly iH) are found between the expanding 4-connected region and B1. The transformation into a mixed B4/B3 entails complete resorption of areas of 5-fold coordination, Fig. 2 The B4/B3 → B1 mechanism slightly deviates from the reverse process (Fig. 3). In this trajectory type, the starting 4-coordinate material is first collectively distorted (Fig. 3, 0.0-3.2 ps, purple atoms). From 5-coordinated areas of iH motifs (Fig 3, 3.9-5.2 ps), the B1 structure develops (Fig.3, 5.2-6.4 ps). Thus, over half of the transformation time is spent slowly squeezing the system until it can transform into 5-coordinated iH regions, indicating the presence of a deep but fairly gently-sloping energy barrier separating the B4/B3 basin from the intermediate config- urations. In all configurations, iH motifs are relatively long-lived. This is not the same as the reverse B1 → B4/B3 transition, in which iH regions were less extensive and promptly eliminated at interfaces (Fig. 2), with the majority of the simulation time being dominated by coexisting 4and 6-coordinated motifs.
Consistent with previous calculations [12], different intermediates are associated with different types of trajectories. The presence of B3 adds further structures to an already complex scenario.
In previous work, low transition pressure (less than 10 GPa) were noticed to favour the appearance of iH, which benefits from a less severe [0001] B4 compression. The results presented here appear to corroborate that view.

Metashooting
In order to better understand the competitive nature of different mechanisms, the free energy (FE) was evaluated by coarse-graining the potential energy surface onto a suitable set of collective variables. Free energy surfaces have found extensive use for the description of biological systems (in particular peptide calculations [33]), but are less frequently used in the solid state.
In metashooting, mechanistic details acquired from the transition path sampling procedure re- Moreover, intermediates and transition states that are vitally important to the transformation will appear obvious along the reaction scheme. Such insight into trajectories using path-based analyses of transformations only is not trivial.
The free energy profile was constructed from converged TPS trajectories. After 250 iterations of the metashooting procedure, the free energy had fully converged into a detailed landscape.

Progression of the metashooting
The evolution of the free energy landscape according to the metashooting algorithm is first discussed, followed by an analysis of the final profile generated by the procedure.
The first few iterations of the procedure (Fig. 4)  B3, B4 and other forms.
After 150 iterations, that had resolved into a bona fide minimum i-1. By iteration 200, a second minimum centred half-way between the two limiting basins started to appear (Fig. 4c), which had fully resolved by iteration 225 (Fig. 4d), denoted i-2. By 250 iterations (Fig. 5)  In i-1 a portion of the pristine B1 fragments contains 4-and 5-coordinated motifs. With respect to the way the square pattern is opened, the locally 5-coordinated motifs correspond to iH.
This small region is extremely short lived but present in every successful trajectory sampled, thus explaining its place on the free energy profile. Interestingly, this mechanistic pattern can not propagate as it is. The next free-energy barrier TS-B is overcome by including an additional (111) B1 shearing mechanism, as seen in both the forward and reverse TPS trajectories. This configuration, which includes an enlarged seed region and the onset of the shearing, corresponds to the free-energy maximum TS-B. The configuration TS-B could easily be seen in the TPS trajectories, however the fact that this part of the transformation corresponds to two distinct events would have been completely missed without the metashooting analysis.   Fig. 3, 3.9-5.6 ps).  (order parameter) change, and corresponds to mixed motif configurations as at 4 ps in Fig. 2 for the B1 → B4/B3 direction, or at 4.7 ps for the B4/B3 → B1 (Fig. 3).

Successful
The appearance of mixed B1-iH agrees with previous studies on ionic compounds undergoing the same B4/B3→B1 transition [34][35][36], which may indicate transferability of this mechanistic analysis.

Energy barriers
From the converged free energy surface, it is apparent that there are six distinguishable energy barriers associated with the pathway -three for the B4/B3 → B1 (Fig. 8, E 1−3 ), and three for the reverse process (Fig. 8, E 5−7 ).
Trajectory variety has shown that the system may proceed along different pathways, therefore crossing different features of the free energy plot. This will lead to different values for the total amount of energy overcome by individual trajectories. The following quantitative analysis of the energy differences assumes measurement to and from the centres of each of the turning points, as described in Table II, thereby indicating the energy associated with the minimum energy transition pathway that visits all maxima and minima. Every trajectory must surmount an energy barrier at least equivalent to the activation energies presented, as successful trajectories must overcome the highest energy barrier to cross to the other side, irrespective of the path taken.
It is also apparent from inspection that the energy barriers for each stage of the transformation are very different for the forward and reverse trajectories. This may go some way to explaining why there are different pathways associated with the forward and reverse trajectories, and perhaps why there are such notable hysteresis present in such transformations.
The change in free energy is calculated as the difference between the energy values of B1 and B4/B3 basin centres. The intial pressure was chosen based on zero enthalpy difference between B3 and B1, which was dictated by the model used to commence the TPS cycles (see above).
The difference in free energy between the forward (B4/B3 → B1) and backward transformation is not zero at the chosen simulation pressure (p = 9.8 GPa). The inclusion of lattice dynamics [37] FIG. 8. Summary of energy barriers associated with the free energy profile. Activation energies are realised as the largest barrier from the starting basin. Note that these barriers are evaluated as differences from the centres of each of the turning points.
into fist principles calculations was shown to shift the (∆H = 0) "equilibrium" pressure to a lower value. Consistently with those findings, Fig. 8 pinpoints the role of entropy, which is positive going from B4/B3 to B1. The disordered nature of the B4/B3 reinforces this effect.

DISCUSSION AND CONCLUSIONS
Using transition path sampling methods, the phase transition characterised by a sequence of nucleation and growth events between a mixed wurtzite-zincblende (B4/B3) and rocksalt (B1) has been described. The forward and reverse trajectories were found to be different, each with different competing intermediates and preference for one mechanism over the other for both transformations.
A typical B1 → B4/B3 trajectory was characterised by the formation of an initial seed, followed by gradual transformation to a mixed 4-coordinate product via coexisting motifs. Trajectories could also end up trapped in a long-lived intermediate 5-coordinate basin, corresponding to a mixed rocksalt-5 coordinate hexagonal structure denoted B1-iH. The B4/B3 → B1 transition entailed a more stepwise process, which involved transition to a 4-coordinate mixture via local motifs of hexagonal iH before transforming to the final B1 product.
A novel method denoted metashooting was developed as a combination of TPS and welltempered metadynamics, using the 1 st and 3 rd average coordination spheres as the collective variables. TPS ensured rapid variations of the CVs, and a balanced filling of the basins of interest.
The metashooting procedure is able to decipher the underlying free energy landscape of the transformation, and structural information about the nature of appropriate maxima and minima can be extracted from the resultant plot. In addition, energy barriers and changes in free energy can be ascertained and successful and failed trajectories mapped onto the final free energy plot, in order to gain a true understanding of the energetics and structural changes inherent within each individual trajectory.
Metashooting can accommodate additional collective variables, for example higher coordination sequences, in order to further distinguish between the phases. Using the 1 st and 3 rd coordination spheres was convenient in this case as it was an identical measurement to the order parameter used in the path sampling calculations. In general, the CV set used on the metashooting layer can be more detailed then the order parameter utilised in the TPS steps.
It is believed that the metashooting procedure, which combines transition path sampling and metadynamics, could set a novel paradigm for future investigations into condensed matter phase transitions. Such a scheme could easily be transferred to another system, under any level of theory, to ascertain the underlying thermodynamics and kinetics of a temperature or pressure-induced phase transformation.