Structure Prediction of Nanoclusters; a Direct or a Pre-screened Search on the Dft Energy Landscape?

The atomic structure of inorganic nanoclusters obtained via a search for low lying minima on energy landscapes, or hypersurfaces, is reported for inorganic binary compounds: zinc oxide (ZnO) n , magnesium oxide (MgO) n , cadmium selenide (CdSe) n , and potassium fluoride (KF) n , where n = 1–12 formula units. The computational cost of each search is dominated by the effort to evaluate each sample point on the energy landscape and the number of required sample points. The effect of changing the balance between these two factors on the success of the search is investigated. The choice of sample points will also affect the number of required data points and therefore the efficiency of the search. Monte Carlo based global optimisation routines (evolutionary and stochastic quenching algorithms) within a new software package, viz. Knowledge Led Master Code (KLMC), are employed to search both directly and after pre-screening on the DFT energy landscape. Pre-screening includes structural relaxation to minimise a cheaper energy function – based on interatomic potentials – and is found to improve significantly the search efficiency, and typically reduces the number of DFT calculations required to locate the local minima by more than an order of magnitude. Although the choice of functional form is important, the approach is robust to small changes to the interatomic potential parameters. The computational cost of initial DFT calculations of each structure is reduced by employing Gaussian smearing to the electronic energy levels. Larger (KF) n nanoclusters are predicted to form cuboid cuts from the rock-salt phase, but also share many structural motifs with (MgO) n for smaller clusters. The transition from 2D rings to 3D (bubble, or fullerene-like) structures occur at a larger cluster size for (ZnO) n and (CdSe) n. Differences between the HOMO and LUMO energies, for all the compounds apart from KF, are in the visible region of the optical spectrum (2–3 eV); KF lies deep in the UV region at 5 eV and shows little variation. Extrapolating the electron affinities found for the clusters with respect to size results in the qualitatively correct work functions for the respective bulk materials.


Introduction
The prediction of the atomic structure of materials is of fundamental importance.This is particularly true for nanoclusters as they can exhibit substantially different chemical and physical properties compared to bulk phases.An understanding of the possible structural changes that arise as a function of the nanocluster size can lead to the discovery of new materials with desirable properties. 1 For example, it is known that the optical properties of nanoclusters vary with size, providing an alternative ''tuning'' parameter over conventional methods such as using dopants in the bulk structures. 2 Obtaining the structural information for small nanosized particles from experiment is, however, extremely difficult.Therefore, computer simulations are often used as a complementary predictive tool or an aid in the analysis of experimental observations.The goals of this study are to (1) optimise the use of computationally demanding ab initio calculations that are employed to assess the quality of trial structures during the search for atomic structures of nanoclusters, and (2) report and compare the structure and properties of these nanoclusters.
Global optimisation techniques have been adapted to explore energy landscapes and tuned to locate efficiently the lowest energy structures.][18] Evolutionary algorithms (EA), which are employed in the work reported here, have been used for structure prediction with much success using both interatomic potentials [19][20][21][22] and ab initio 13,[23][24][25][26][27][28] levels of theory to describe the energy landscape.Informative reviews of structure prediction techniques have been reported by Woodley, 1 Johnston 29,30 and more recently by Catlow et al. 31 Previous studies have focussed on the method of exploring a fixed energy landscape, i.e. one or more methods applied to a particular system, whereas few studies have focused on the effect of changing the quality of the energy function.Mainstream global optimisation in the field of structure prediction is typically applied to searching the energy landscapes defined by interatomic potentials, and subsequently the better structures are refined at a higher level of theory, such as DFT. 32With greater availability of structural data for bulk rather than nanoclusters, interatomic potential parameters are generally fitted to reproduce bulk properties.][35] In this paper, switching the definition of the cost function (e.g.energy of formation) used to assess the quality of a particular atomic configuration during the global optimisation stage is investigated.Two different approaches to searching energy landscapes are compared for a range of binary compounds.In each approach both stochastic quenching and a Lamarckian evolutionary algorithm, as implemented within the Knowledge Led Master Code (KLMC), 36 are tested.Both approaches sample many candidate structures and both utilise standard local optimisation routines to relax the atomic configuration of each sample point to an energy minimum.
In the first approach, an initial structural relaxation is performed at an interatomic potential level, and if the resulting energy is below a certain threshold then that candidate structure is further refined at the DFT level, i.e. the sample points are pre-screened.This approach of searching a hybrid landscape is applied here to configurations of ZnO and MgO.A similar approach is employed by others (see e.g.ref. 37).As these two compounds have been studied extensively, they provide a suitable set of test structures for the structure prediction module of KLMC.This type of pre-screening is one of a broad family of techniques to assess the quality of candidate structures, which include geometrical, simple-potential, tight-binding, and semi-empirical methods.
The second approach is to perform a direct search on the DFT energy landscape, although simple geometrical constraintsatoms must be at least a typical bond distance apart -are applied.This approach is employed for four binary compounds: ZnO, MgO, KF, and CdSe.These compounds are used in a wide range of industrial processes: zinc and magnesium oxides are basic components in catalysts, 38,39 potassium fluoride is used in the manufacturing industry for brazing aluminium, 40 and cadmium selenide is used as a principle component in solar cells. 41These materials differ in their polarisability -from strongly polarisable CdSe to weakly polarisable KF -and thus offer contrasting energy landscapes, and, at the nanoscale, may have a broad range of structural motifs.For the computational cost of assessing each sample point on a DFT landscape many more can be assessed on the semi-empirical landscape.In contrast to CdSe, suitable empirical potentials are already available for MgO and ZnO and their landscapes have already been explored. 20,30hus, MgO and ZnO were chosen as our initial test systems.

KLMC program -combining global and local optimisations
All of the nanoclusters presented in this work were obtained using the Knowledge Led Master Code (KLMC). 36KLMC has been written in FORTRAN90 as one program, which uses MPI and SYSTEM calls.In common with other similar software for global optimisation, 21 in order to (a) compute energies and (b) compute and minimise (using standard local optimisation routines) forces on the atoms, KLMC can call a range of stateof-the-art third party software (TPS) packages.
As well as containing routines for global optimisation, KLMC, in the first instance, is a computational tool for the automation of many tasks that traditionally have had to be done by hand.Basic examples include: creation of input files for TPS; submission of calculations using TPS on either local or external computer platforms; monitoring progress of calculations performed by TPS; extraction of data from TPS output files for use in other KLMC routines or TPS; and, if required, the resubmission of uncompleted jobs (in an appropriate way so as to balance the workload over all available nodes).The evaluation of a set of structures stored as xyz file(s) -creation of input files, feeding through the chosen TPS, and collection of energies -is, perhaps, the simplest application of KLMC.
Routines within KLMC can also be employed to automate multistage optimisations (as used in this study), wherein the energy and atomic forces are computed at different levels of theory during the relaxation of each candidate structure, i.e. after a set number of line minimisations (iterations) or once a particular tolerance is achieved, the level, and therefore the computational load, is increased, and the relaxation of the structure is continued at this higher level.KLMC can therefore be seen as a tool to chain optimisation iterations together at differing levels of theory.Moreover, with the range of available global optimisation routines, KLMC has been developed to enable the flexibility of searching either on the interatomic potential energy landscape, on DFT energy landscape, or on DFT energy landscape after initial refinement at the interatomic potential energy level, i.e. pre-screening candidate structures prior to refining at the DFT level.The first application, of chaining two energy functions within the Monte Carlo basin hopping routines of KLMC, enabled the successful prediction of low energy atomic structures of LiF 3 nanoclusters. 36uring the evaluation of each candidate structure, standard local optimisation techniques were used to minimise the energy as defined by a rigid ion model in the initial iterations and a shell model in the final iterations.
In this work, KLMC is used to search the DFT landscape or the hybrid landscape resulting from chaining a rigid ion mode with DFT.Calculations at the interatomic potential level of theory are performed by the General Utility Lattice Program (GULP), 42,43 whereas for the DFT level of theory, a bespoke library version of the numeric basis set computer program FHI-aims (version 071711_6) is employed. 44Parallel versions of both these TPS are available; for GULP, however, the less computationally demanding of the two, the serial version was used in this work.
As the chosen search algorithms are amenable to simultaneous evaluations of multiple sample points (configurations), a Message Passing Interface (MPI) parallelism strategy has been employed within KLMC in order to exploit many processors on high performance computing (HPC) platforms (large-scale supercomputers).Thus, KLMC makes simultaneous calls to GULP.The bespoke FHI-AIMS library enables a more efficient approach than having KLMC employ a ''system call'' to start each run of FHI-AIMS.Although there is an added risk that the TPS can crash KLMC (the TPS must pass back control to KLMC and not make a call to ''STOP''), the library approach avoids a potential problem caused by HPC platforms, namely, that MPI system calls are not allowed unless made on the head node(s), i.e. from the originally submitted script and not from KLMC.We note that KLMC also has a client-server capability such that KLMC can run on a local desktop machine whilst the computationally expensive DFT calculations can be sent to and retrieved automatically from a remote supercomputer for processing (which also avoids nonhead node MPI calls).
In this work, a Lamarckian evolutionary algorithm (EA), as implemented within KLMC, is employed to search the energy landscapes.Within an EA, the natural selection processes of ''survival and procreation of the fittest'' is simulated, where the metric of fitness for our application is the energy of the nanocluster.Ideas of Lamarck rather than Darwin are adopted; genetic information -in the form of structural coordinatesthat has matured (aged) is used to create new offspring.Here, the aging process of a child becoming an adult is simulated by the application of standard local optimisation routines to relax the atomic structure.While this method of evaluating each candidate structure is computationally more expensive, cf.single energy calculations in Darwinian global optimisation, fewer candidate structures need to be evaluated, and all matured candidates (adults) are at least stationary points on the energy landscape.][47] The starting point for an EA -see Fig. 1 -is the generation of an initial set of representative structures, or initial population, which should ideally span the full potential energy surface.There is no unique way to generate the initial coordinates; one could use previously known structures out of a database or extract candidate structures from a high temperature molecular dynamics simulation at sufficiently spaced points of time.For the work presented in this manuscript, initial candidate structures were created, at minimal computation cost, by randomly placing the appropriate number and type of atoms in a cube with length between 4 Å, for the smallest clusters, and 10 Å, for the largest.During this process, a constraint was imposed that a distance of at least 80% of a typical bond length must separate all atoms to prevent unfeasibly dense structures being created, which could cause problems with the DFT optimisations.Expanded structures benefit from the initial increased mobility of atoms as they have more room to move during geometry optimisation.This restriction also prevents randomly placing two metal cations next to each other, which would result in the DFT predicting an unwanted formation of a metal-metal bond.
During each cycle of the EA, the current population of m members (labelled as the ''Nth population'' in the flow chart) is replaced by the unique lowest energy structures recruited both from the current and the ''New adult population''.Note that duplicates (based on comparing energies and moments of inertia) are actively removed to help to maintain structural diversity in the population.In each EA cycle, before Action 5 (''Tournaments for survival'') can be performed, KLMC generates m candidate structures that form the ''Child population'', which subsequently matures to become the ''New adult population''.Details of how each candidate matures or how each candidate is evaluated, are given below and shown as a flow chart in Fig. 2. From the current population, tournaments are simulated within sets of three candidates chosen at random.The best structure in each set, one with the lowest minimised energy, is recruited in the ''Population of winners''.Action 1 of each EA cycle is repeated m times; the resulting population of winners may include the same structure more than once.Each member of the ''Population of winners'' is paired with one randomly selected member of the current population and passed to Action 2 (the application of a crossover moveclass, which combines structural information from both parent members to create a new child candidate structure) and Action 3 (a mutation moveclass, which perturbs the resulting child structure to help to introduce new structural features). 14In particular, the crossover moveclass, used in this work, fuses a fragment from each parent structure to form a new structure -subject to forming a child structure with the same composition and total number of atoms as each parent.The mutation moveclass implemented in this study is a random displacement of each atom by up to a maximum of 1.0 Å from its original location.
The maximum number of EA cycles is set by the user, but can be reduced by either the time limit on runs on the chosen computer platform or the convergence of the Nth population (i.e.no improvements in the quality of the structures within consecutive populations).The population size, m, was typically set to either 10 or 16.

Assessment of candidate structures
ZnO nanoclusters have been investigated in previous work using interatomic potentials, where the atomic structures were predicted using EA routines within the GULP code. 20,48,49These configurations were also found using DFT methods, 50 although for larger sized clusters the ranking (with respect to energy) of the nanoclusters differed slightly.In the current work, interatomic potentials are employed for ZnO and MgO during a prescreening stage before evaluating the DFT energy.The chosen interatomic potentials include a number of superimposed terms: a Coulomb potential: where q i is the point charge representing ion i, k is a dimensional constant, and r ij is the interatomic distance between point charges i and j; a Lennard-Jones potential: where A and B are species dependent parameters; and a Buckingham potential: where C, r and D are also species dependent parameters.2][53] Unusually, three sets of Buckingham parameters for the Zn-O interactions were defined; the Zn-O Buckingham potential parameters are dependent upon r ij .A polynomial function, is employed between the three regions where the Zn-O Buckingham potential acts.The potential parameters of eqn (4) were fitted to ensure the resulting Zn-O potential is smooth (an important feature for many standard local optimisation algorithms).All parameters, along with the range of interatomic distances in which they apply, are given in Table 1.The formal charge of +2.0 was used for magnesium and zinc cations and À2.0 for oxygen anions.Structural relaxation during the pre-screening stage was achieved using the method of conjugate gradients until the resultant atomic forces were less than 10 À2 eV Å À1 , which typically results in an energy point near a stationary point, thereafter the more advanced rational function optimisation algorithm was used to ensure that the stationary point is either stable or metastable and forces are less than 10 À7 eV Å À1 .
After the population of candidates have been pre-screened, the structures are further refined at the DFT level; KLMC calls FHI-aims, rather than GULP.All of the calculations were performed with the FHI-aims default ''light'' settings for species specific basis sets (analogous with split valence and double zeta basis sets used in conventional Gaussian codes) and grids combined with a scalar ZORA relativistic treatment. 54During relaxations, the restricted wavefunction description was employed to discourage fragmentation of clusters, and energies were converged to within 1 meV per atom.Typically, the solids-corrected Perdew-Burke-Ernzerhof (PBEsol) Generalised Gradient Approximation (GGA) exchange-correlation functional was employed. 55PBEsol was chosen as it is unbiased and is not too computationally expensive.The aim is to develop a method that is generically applicable rather than tuned to a particular system or property.A quasi-Newtonian relaxation algorithm was used with a convergence criterion of 10 À3 eV Å À1 .

Atomic structures and properties of (AB) n nanoclusters
Four binary compounds are investigated: KF, MgO, ZnO and CdSe.The atomic structures of their nanoclusters may depend on the relative size of the cation and anion, the degree of ionic character of the bonding, as well as the oxidation state and polarisability of the ions.Table 2 contains the differences in the electronegativity 56 of the atoms in each binary, DX, which gives a measure of the ionic character of a bond between them.Data in Table 2 is arranged with respect to increasing DX, and not surprisingly the greatest ionic character is expected for K-F bonds.The dissociation energy of a dimer and the refractive index of the binary compound also increase, from CdSe, ZnO, MgO to KF.The refractive index gives a measure of how polarisable the atoms are in the bulk phase, and therefore how polarisable we expect the atoms will also be when part of a nanocluster.Comparing cationic and anionic radii, K + is larger than F À , whereas it is the reverse for CdSe; Cd 2+ is smaller than Se 2À .If the ratio of ionic radii is the dominant factor determining the atomic structure, then we may expect similar nanoclusters for MgO and ZnO, and that the ordering of cations and anions in a particular structural motif for KF and CdSe to be reversed.
The lowest DFT energy nanoclusters of four compounds -ZnO, MgO, KF and CdSe -obtained using KLMC, with either the direct or pre-screening approach, are shown in Fig. 3 to 8. 59 As well as ordering by size and rank, the images of the clusters are also arranged such that the least polarisable compound (KF) are shown at the bottom, and the most polarisable (CdSe) at the top, of each figure.1][62] Likewise for CdSe nanoclusters. 63KF nanoclusters have received much less attention in the literature, and therefore comparison is made with those reported for (LiF) n , (NaCl) n , (NaCl) n Cl À and (MgO) n when constituents are modelled as Mg + and O À . 19,28,64,65 more detail comparison of the LM with those reported elsewhere is given below.For each size and compound, the  56 of the atoms (X B -X A ); ratio of ionic radii 57 (R B /R A ); bond dissociation energy 58 at 0 K of the A-B dimer (D); and index of refraction 58 of the compound AB (n r )   This journal is © the Owner Societies 2014 lowest energy, local minimum (LM), nanocluster is referred to as the GM (putative global minimum), the second lowest energy structure is referred to as LM2, and the third lowest energy structure is referred to as LM3.
3.1.1Two to six atom clusters.The bond lengths of the n = 1 structures were used to determine the minimum interatomic distance between the atoms in the initial, randomly generated nanocluster configurations.The structural parameters of both n = 1 and 2 nanoclusters along with the bond strengths are summarised in Table 3.
The first three smallest GM nanoclusters are shown in Fig. 3.For each size, the same structural motif is found for all four compounds.The shape of the n = 3 ring for KF is comparable to the rigid ion model of (ZnO) 3 , i.e. a model that does not include polarization effects.When a shell model is used anions are displaced further out from the centre than the cations, resulting in the structure generated for (ZnO) 3 using FHI-aims. 20he DFT binding energy of an n = 1 cluster is presented in Table 2, which gives a measure of the strength of the dimer bond, calculated here with a restricted Kohn-Sham orbital ansatz.The same rank of the dimers with respect to this calculated bond strength and the dissociation energies reported in      bond strength in oxide dimers from the fluoride, as well as the increased polarisation (more acute cation-anion-cation bond angles) can be readily understood as an effect of the negative second electron affinity of oxygen, which counteracts the Coulomb energy gain due to a transfer of an additional electron from the cation to the anion and bond shortening between the smaller sized ions.Continuing the trend, the bond strength of CdSe is weakened significantly while it is more easily polarised due to the much larger ionic sizes of both the cation and anion even though they retain the formal oxidation states of the respective metal oxides.
3.1.2Eight to ten atom clusters.Fig. 4 shows the lowest energy structures for n = 4 and n = 5, and at this size the ranking of the structural motifs can be compound dependent.For n = 4, ZnO and CdSe have a ring GM structure and a cuboid LM2 structure.The cuboid motif, however, is the GM for MgO and KF.The coordination of the atoms in the ring is lower than that found in the bulk phase.Thus, the stability of a cluster is expected to improve if there is an increase in the average coordination.However, the improvement in stability caused by an increased number of nearest neighbour (cation-anion) interactions is offset by strain and second nearest neighbour effects.For each compound, the anion-anion and cationcation distances in the ring are shorter than those in the cuboid, and thus this second term favours rings.Moreover, there is typically less strain in the rings as the lower coordinated anions have more space for their valence electrons to relax into.The magnitude of this third stabilisation energy (or polarisation) will also depend on how easily the anions are polarised.Thus the observed trend in our predicted n = 4 GM nanoclusters is: from a KF cuboid to a CdSe ring.
For n = 5, MgO and KF have the same structural motifs and corresponding ranks for the three lowest energy nanoclusters.The two lowest energy n = 5 clusters for ZnO and CdSe also have the same structural motif, but differ in their ranking.
The degree of polarisation is reflected in the appearance of the rings resembling more like a square (for n = 4) or a regular pentagon (for n = 5) than a regular octagon or regular decagon, respectively.Curiously, however, even though ZnO adopts the ring motif as its GM, CdSe has the same GM as MgO and KF.The balance between the destabilisation of strain and nonbonded repulsive interactions and the stabilisation of the attractive Cd-Se interactions for (CdSe) 4 and (CdSe) 5 reverses resulting in an increase in the average coordination with an increase in n.This structural transition of the GM from a 2D to a 3D motif occurs earlier for MgO and KF, between n = 3 and 4. ZnO GM nanoclusters remain 2D, as the lowest energy 3D structures are 2.5 eV and 2.4 eV higher in energy for n = 4 and n = 5, respectively.The LM2 for the other compounds are close in energy to the GM (within 1 eV).Note that the n = 4 ring structure for KF is actually ranked fourth lowest in energy, being 1.019 eV higher in energy than the GM.
The change in relative stability between tetragonal and the larger sized rings manifests in the appearance of ladders; the LM3 structure of MgO and LM2 structure for KF.The average coordination of the atoms within the ladders is also higher than that in the 2D-rings, but less than that in the cuboids.Thus, preferred coordination is also related to the stability and ranking of the clusters; Mg is less likely than Zn to be stable in a low coordinated site. 60For example, the different ranking in the lowest two n = 4 clusters and the structural motif of ZnO LM3, which has three two-coordinated cation sites, is unstable for MgO.
3.1.3Twelve to eighteen atom clusters.For n = 6 and n = 7, shown in Fig. 5, the ZnO energy landscape retains the 2D ring motif as the GM, although the energy difference between the GM and LM2 is now reduced to less than 1 eV.The transition of ZnO GM to a 3D structure occurs between n = 7 and 8 (see Fig. 5  and 6), where a drum is adopted as the ZnO n = 8 GM.Unlike the other compounds, ZnO also has other 2D clusters within the top three configurations for n = 6 and 7; patchworks of three rings.As seen for n = 5, only (ZnO) 7 has a different motif for the GM.
The structural motifs of all the low energy MgO clusters, shown in Fig. 5 and 6, can be found in clusters shown for one or more of the other compounds.The n = 7 LM3 motif is LM3 for ZnO and LM2 for CdSe; the n = 8 LM3 drum motif is the GM for ZnO; and n = 9 LM3 is the GM for both ZnO and CdSe.The other MgO motifs, including tubes and rock-salt cuts, are all found for KF.The n = 6 drum (double ring) is well known and appears in all the compounds (either GM or LM2), and has recently been shown to be important for hydrogen storage. 66or each size, at least one motif of the metastable low energy configurations shown for CdSe is not found elsewhere.In particular, even though the rock salt motif is not seen in the figures for CdSe, clusters with an average coordination number greater than three are competitive for (CdSe) 9 ; there is a highly coordinated central Se in both LM2 and LM3.
3.1.4Twenty to twenty four atom clusters.Low energy nanoclusters for n = 10 and n = 11 are presented in Fig. 7 and n = 12 in Fig. 8.For n = 10, MgO and KF again have the same motif for their GM, however, the LM2 of MgO, which also has hexagonal rings, is the LM3 of KF as the LM2 structure for the latter compound is a non-cuboid rock-salt cut.The GM of (MgO) 11 is the LM2 of (KF) 11 , whereas the motif of the other (KF) 11 clusters are again based on rock-salt cuts.The similarities between MgO and KF continue for n = 12, where, apart from the change in ranking, the clusters have the same motifs.Although rock-salt cuts are adopted by KF and MgO, the relative stability of tubes is greater for MgO, and as such the GM of (MgO) 12 is a tube, and the (MgO) 12 sodalite cage (see GM for (ZnO) 12 and (CdSe) 12 ) is only 0.15 eV higher in energy than LM3. 67he LM of (CdSe) 10 are very similar (and probably contained within the same energy super-basin).LM2 has the same structural motif as LM2 for (ZnO), the right hand side of which (as viewed in Fig. 7) is the same as that in the n = 9 GM for ZnO and CdSe.The left hand side is composed of a tetragonal and a squashed hexagonal ring.If the central Se atom (highlighted by a yellow circle) is displaced to the centre of the cluster (changing the Cd-Se-Cd bond angle from convex to concave) and its coordination is increased by one then the motif of LM3 for CdSe is generated; increase the coordination of the Cd to the right of this Se to obtain the GM motif.Note that relaxation of initial (ZnO) 10 geometries that are data-mined, without rescaling, from any of the three (CdSe) 10 LM will generate same LM.
The ashtray motif for n = 10 ZnO GM is also the motif for MgO LM3.For CdSe and ZnO, the same n = 11 bubble motif is predicted as the GM, which is also the LM3 for MgO, and (CdSe) 11 LM2 matches (ZnO) 11 LM3.In general, low energy ZnO and CdSe nanoclusters adopt a bubble motif.
3.1.5Comparison of nanoclusters with earlier predictions.Differences in predicted structures can be caused by missing local energy minima (a more exhaustive search of the energy landscape is required) and/or a different definition of energy is employed.In the latter, even after successfully matching clusters based on structural motifs, there can still be a change in energy ranking of the clusters.Below, the size (n) and the energy ranking of LM clusters may also be used in the labeling of atomic configurations such that na = GM, nb = LM2, nc = LM3. . .
With the lowest dimensional and simplest landscapes (cf.][61][62][63] For our predicted atomic structures for (CdSe) n , there is a reasonable agreement with structures suggested by Sanville et al. 63 Their CdSe configurations were data-mined from ZnS GM 50,68 that correspond to our 1a, 2a, 3a, 4a, 5b, 6a, 7a, 8a, 9a, 11a and 12a for CdSe.Our 5a, which is also the GM for (ZnO) 5 , (MgO) 5 and (KF) 5 , is only 0.024 eV lower in energy than our 5b for (CdSe) 5 .The largest discrepancy occurs for n = 10; no match is found between their (CdSe) 10 bubble cluster (which has C 3 symmetry) and any of our top three PBEsol (CdSe) 10 LM.As discussed above, the three PBEsol LM are very similar and therefore have similar energies.The non-perfect -has two twocoordinated atoms -bubble motif is, however, LM2 for (ZnO) 10 on the energy landscape of the rigid ion (parameters in Table 1) and shell model (SM), 20 PBEsol LM3 for (ZnO) 10 , and found to be 0.07 eV less stable than our PBEsol (CdSe) 10 GM.
The (ZnO) n atomic configurations are now compared to energy minima structures from a shell model (SM) 20 that are obtained using an evolutionary algorithm implemented within GULP 48,69 and PBE energy minima structures 37 (optimised using the DMol code 70,71 ), the initial configurations of which were either constructed (handmade) or local minima from a rigid ion model (potential parameters fitted to bulk ZnO).We expect optimised PBE and PBEsol structures to be very similar, and changes in energy ranking only possible when there is a small energy difference between LM.In fact there is a better agreement between our PBEsol results and the PBE than between PBEsol and SM.Starting from the smallest cluster size, the discrepancies between the latter are found for 5c, 6, 7, 8, 9c, 10b, 10c, 11b, 11c, 12b and 12c; whereas for the former we were not able to match 4c, 5b, 5c, 7c, 8a, 8b, 9c, 10b, 10c, 11b, 11c, 12b and 12c.The lowest energy SM metastable LM have higher symmetry than that predicted for PBEsol LM.Many of the mismatches for the smaller clusters are caused by the SM employed erroneously containing additional LM. 60Removing these LM, the match improves; for example, the motifs for the three lowest energy n = 6 and 8 clusters are now the same although the rank of the n = 6 ring and drum are reversed and the n = 8 SM GM is our PBEsol LM3.Allowing for polarization on the oxygen anions, the rank of 2D configurations deteriorate, which curiously leads to a better match of our n = 7 PBEsol LM with that of a rigid ion model.Typically, Wang et al. reports more than three PBE LM per cluster size. 37Many of the mismatches between our top three PBEsol LM and that of PBE is probably due to missing PBE LM, which would result in a larger PBE (smaller PBEsol) energy difference between LM when there is a mismatch.For example, the PBE (PBEsol) energy difference, in eV, between 4b and 4c, 5a and 5b, 9a and 9c, 10a and 10b, 11a and 11b, 12a and 12c is 1.1 (0.5), 2.4 (0.4), 0.5 (0.1), 0.2 (0.1), 0.9 (0.5), 1.8 (1.4), respectively.A change in rank is also caused by the change in the cost function, for example, 7c and 7d, which have very similar energies, are reversed.A larger change in ranking is found for metastable n = 10 LM; PBE and SM have a C 3 and a C 2k bubble for 10b and 10c, whereas these are 0.245 eV and 0.406 eV higher in energy than the PBEsol GM.Finally, the reported PBE 8a structure, 2D patchwork of two n = 4 rings connected via the formation of a n = 2 ring, is 0.466 eV higher in PBEsol energy than our GM.Wang et al. do comment 37 that previous B3LYP calculations 50,72 suggest that the tube-like motif is more stable.Using PBE, we find that the 2D patchwork is still less stable than the drum, although only by 0.017 eV, so suspect that the default cutoffs used for basis functions (double numerical including d-polarization functions, DND 70,71 ) in their PBE calculations may have been too small.
The (MgO) n atomic configurations are now compared to: rigid ion (RI) LM (that were proposed alongside mass spectra of (MgO) n + clusters); 73 Hartree-Fock (HF) and, with correlation corrections, Coulomb Hartree-Fock (CHF) 74 LM (optimised from initial geometries created by cutting from bulk phases or data-mining from alkali-halide clusters after initial pair potential calculations); 75 PBE energy minima structures (optimised using DMol from initial geometries constructed using a topological method based on predefined range of coordination numbers); and B3LYP LM 61 (optimised using TURBOMOL, where initial geometries were generated using a genetic algorithm 76 ).There is a 42% match between our 12 PBEsol GM and the reported RI LM; the low match is caused by both the change in the model employed and perhaps key motifs missing from their dataset.There were fewer mismatches between the HF and CHF GM: HF 5a is a 2D ring; CHF (MgO) 7 GM matches our 7b; a D 4d drum and S 4 bubble motifs are the (MgO) 8 GM for HF and CHF, whereas the rock salt PBEsol GM is their LM3; and the T h sodalite cage is the HF and CHF (MgO) 12 GM.As expected, an also perfect match is found between our PBEsol LM (shown earlier in Fig. 1-5) and the PBE reported LM; our 5b is the only missing cluster from their set, and 7b and 7c are reversed.We have also used PBE for MgO clusters and found a small number of changes in their ranking; 7b with 7c, 8a and 8b, 11a and 11c, and PBE 12b is the sodalite cage.Clusters with the same structural motifs are reported for both PBEsol and B3LYP LM, only the ranking differs such that the GM do not match for n = 8, 10, 11, and 12.What is more important here is the match between experimental and simulated infra-red (IR) spectra that Haertelt et al. have achieved in order to validate their predicted (MgO) n atomic structures.As we obtain clusters with the same structural motif, we are able to gain confidence in our predicted PBEsol structures resembling those synthesized.As expected, contributions to each spectrum may be from more than one of the lowest LM, and the dominate contribution is typically from the B3LYP GM.The B3LYP 8a matches our 8b (in fact there is a very small PBEsol energy difference between 8a and 8b); B3LYP 10a matches our 10c; B3LYP 11a matches our 11c (however the structural match with our 11a is also used in their simulated IR); and B3LYP 12a is again the sodalite cage.Interestingly, it appears that the dominate contributions to the n = 12 IR spectrum comes from the cluster that matches our PBEsol GM.
A rock-salt (ring) structural motif is found for the larger (smallest) PBEsol (KF) n GM, which is in line with that reported for clusters of other alkali halides. 65,77This trend is consistent with the findings of Roberts and Johnston, 19 where the relative stability of bubble and rock-salt motifs are reversed when the magnitude of the charges on the atoms is reduced from 2 to 1.Our GM also match that reported by Fernandez-Lima et al. 28 who employed a genetic algorithm to search for B3LYP LM for n = 1 to 4. They also reported a number of other LM for larger sizes; their lowest energy LM match our 5a, 6b, and 9a.Their n = 8 drum cluster was slightly more stable than their 3 Â 1 Â 1 cuboid (LiF) 8 cluster, and degenerate after correcting for zero point energy.
3.1.6Properties of the GM clusters.A similar trend in the GM energies is found for all four compounds; decreasing energy and its rate of change with cluster size -see Fig. 9. Eventually the curves will converge to their respective bulk values, e.g.À7.64 eV per MgO.There also appears to be larger oscillations in the energy curve for (MgO) n .The relative change with respect to n is easier to see from the second energy differences (SED).There are three minima, and a greater local stability, for (MgO) n at n = 4 (cuboid), 6 (drum) and 9 (barrel, or tube), which correlates to the reproduction of a larger quantity of these sizes when (MgO) n clusters are synthesized (see mass spectra in ref. 73).A smaller local minimum is found for other compounds if their GM has the same structural motif or, in the case of KF, the 2 Â 1 Â 1 cuboid.SED increases with increasing ring size.Local maxima in SED occur for the n = 3 2D hexagonal ring for KF (which prefers a rock-salt motif of tetragons); the n = 5 cluster, which is composed of a cuboid and two twocoordinated atoms, for KF, MgO, and CdSe; and the n = 7 clusters, where the second energy differences for all compounds are also much higher than those at n = 9.
From n = 1 to n = 12, the average coordination number of the constituent atoms in the (AB) n GM form a set of plateaux and gradually increases for larger KF clusters (which have a rocksalt rather than a bubble or tube motif); see Fig. 10.The plateaux correspond to the dimensionality of the cluster (1D, 2D then 3D); hence the longer plateau for 2D (ZnO) n .The average bond distance, hDi, also increases when the dimensionality of the cluster increases, but gradually decreases with increasing n otherwise.When clusters have the same dimensionality, hDi for MgO is slightly longer than for ZnO (which has the smallest value), KF is slightly smaller than those for CdSe (which has the longest value).
Any subtle differences between compounds can be masked by a change in structural motif.In Fig. 11 and 12, the change in bond lengths and bond angles for rings and double rings (drums) are shown as a function of size for each compound.The bond lengths within rings or the larger rings of the drums   A broken line connects data points corresponding to bonds that are formed for each drum when constructed from two rings.Black, red, green and blue lines and symbols correspond to (CdSe) n , (ZnO) n , (MgO) n and (KF) n nanoclusters, respectively.
decrease with increasing ring size.However, bonds connecting two such rings in the formation of a drum lengthen with increasing n.
The rate of change of the latter is greater for ZnO drums and then for CdSe, which suggests a greater destabilisation of drums compared to rings (note that there are one more ring GM for CdSe and four for ZnO).The differences in relaxed and ideal bond angles of the rings, which increase along the series KF, MgO, ZnO and CdSe, reflect the strength of the polarisation.Note that the bond angles centred on anions are more acute than those centred on cations.The same trend is found for angles within the large rings of the drums, but the opposite trend is found for the tetragonal faces; these become more regular (square) with cluster size.
In the previous subsection, our GM were compared to atomic structures already reported in the literature.Given that the nanoclusters of KF and CdSe are less extensively reported, we have also performed frequency calculations for all GM of these -see Fig. 13, where the corresponding simulated infra-red spectra are also shown.As no imaginary frequencies were found, the GM are indeed LM and not stationary points.
The highest occupied molecular orbital (HOMO), the lowest unoccupied molecular orbital (LUMO), and the difference between them, DE, are given in Table 4 for ZnO and MgO, and Table 5 for KF and CdSe.Having employed the GGA approximation in the DFT calculations, it is expected that DE, which is also plotted in Fig. 14, is smaller than observed optical absorption transitions by a factor of 1.5-3 for the heteropolar compounds studied. 78,79Therefore only qualitative trends will be of interest here.In common to all four compounds, there is an increase in the DE with n increasing from two to three, with the smallest change observed for KF.This behaviour can be attributed to the difference in the n = 3 structures, where the KF ring is hexagonal in shape, and the other compounds adopt a more triangular shape.Moreover, as the size of the nanocluster is increased, the valence electrons localised on the anions are further apart and thus are more stable.The next transition should be expected and, indeed, occurs as the structures transform from 2D to 3D morphologies.
For ZnO, DE shows a transition between n = 7 and 8, which directly correlates with the 2D to 3D structural transition discussed earlier and which also agrees with early reports of Matxain et al. 72 For 2 o n o 7, DE of ZnO was found to be around 3 eV, then dropped to 2 eV from n = 8.This is also similar to what has been reported in PBE exchange-correlation results by Wang et al., 37 although they found that DE, for this same range, monotonically increased from 3 to 4 eV.Similarly, the 2D-3D transitions occurs at n = 3-4 for MgO and can be  seen as a drop in DE.The smaller 3D structures of MgO are observed to have a smaller DE, a trend previously reported from experiment on larger nanoparticles. 81he calculated DE for KF is nearly constant around 5 eV for the range of n considered, which is about half the reported bulk value. 86Considering all 3D GM structures of KF, only n = 5 includes lower coordinated ions in a ''handle'' feature, which corresponds to an observed drop in DE.
The calculated DE for CdSe for the larger nanoclusters is comparable with the absorption edge experimentally observed for the smallest nanoparticles of about 3 eV. 84Extensive measurements on larger CdSe nanoparticles of different morphologies show an inverse trend compared to MgO, as the particle size increases the gap decreases, 84,85 with data extrapolated bulk values of 1.74 eV and 1.86 eV for nanoparticles and nanorods, respectively.
Based on the calculated DE values and their known underestimation using the exchange-correlation functional adopted, the general absorption in the UV region by the nanoclusters is predicted.
The first ionisation potential (I) and electron affinities (A) for the GM clusters are shown in Fig. 15.Only KF clusters do not accept an additional electron.The electron affinity for MgO decreases with size, whereas it increases and plateaux for ZnO and CdSe.Continuing this trend would result in the qualitatively correct work function of the bulk material for these compounds.All trends also conform to our reported structural phase transition, in particular, from 2D to 3D.Comparing ZnO, CdSe and MgO, the ionisation potential of the bulk materials is highest for ZnO, a trend in the ionisation potential that is also seen for the nanoclusters.
If the same structural motif is used for each cluster size, then, comparing data from different compounds, the trends seen in DE, I and A with respect to cluster size should be more similar.This is certainly true for DE; LM in the curves now align and there are fewer crossovers and DE is only smaller for MgO    compared to ZnO when n = 3 (see earlier discussions).For the electron affinity curves, ZnO and CdSe now follow the same pattern.However, there is a peak, not seen previously for any GM, at n = 9 for KF, the LM of which is 0.72 eV less stable than its GM.
3.2 Global optimisation of atomic structures of (AB) n nanoclusters 3.2.1 Searching DFT energy landscapes.The most computationally expensive part of the energy landscape search is the DFT refinement of the candidate structures.Pure DFT calculations are normally performed on structures that are close to their ground state energy minimum and therefore only require a small refinement.A search on the DFT energy landscape will, however, require investigating structures that are far from a local minimum and, therefore, need many more iterations to converge, and indeed may be even problematic to converge the electronic structure for the initial atomic configuration.Hence, any avenue that may lead to a saving in computational effort is worth investigating.One popular way to stabilise and accelerate the electronic self-consistent field (SCF) calculations, particularly in a solid-state context, is by using one-electron energy smearing, for example, with a Gaussian function.The default value of the smearing width used in FHI-aims is 0.01 eV, tuned on metallic systems.For this work, the Gaussian smearing width was increased to 0.1 eV, and its effect on the SCF calculation on a zinc oxide dimer at varying interatomic distances was investigated.Fig. 16 shows the dependence of the number of SCF cycles required to converge single-point energy calculations with respect to the interatomic distance.It was found that using a value of 0.1 eV significantly reduced the number of SCF cycles needed to reach convergence.Even greater values of the Gaussian dispersion could be used, however, the resultant energy landscape may differ significantly from that approached by DFT with zero smearing.In particular, it becomes possible to trap the optimisation process in artificially stabilised high-energy minima, or excited states with unusual atomic configurations, which are not of direct interest here.Alternatively, important low-energy minima can be missed if too many of the structures in the initial population relaxes to the GM or a particular LM.
Typically, for larger sized clusters, geometry optimisations using this 0.1 eV Gaussian smearing width did not affect the resulting atomic configurations of the GM and low-lying LM.This value of the Gaussian smearing width was employed in the initial iterations of the DFT refinements, before a final structural relaxation with zero smearing.
3.2.2Interatomic potential based pre-screening.Interatomic potential based pre-screening and a direct search of the DFT landscape was used to predict the structures of (ZnO) n and (MgO) n nanoclusters, with n = 4-12.With an increased number of possible local minima as the size of the cluster is increased, the initial EA population was increased from ten structures to sixteen at n = 8, and therefore more likely to span the energy landscape.
Tables 6 and 7 give the averages for the number of EA iterations and number of DFT calculations required to find the lowest energy nanoclusters for ZnO with and without the use of pre-screening, respectively.The corresponding tables for MgO are Tables 8 and 9.For ZnO and MgO the lowest energy minima were found within the first three EA iterations for all n considered, except for LM2 of (ZnO) 12 , which required five EA iterations.Without pre-screening, many more EA cycles were typically required; hundreds rather than tens of DFT calculations were performed.Comparing the tables of ZnO and MgO, it is clear that fewer EA cycles are needed to locate the MgO low-energy minima.Without prescreening, a smaller average number of EA iterations was required to locate the GM for (MgO) 12 than expected; the same number of iterations as required with prescreening, just two.As the prescreening typically coalesce fragmented clusters, a greater number of DFT calculations are performed per iteration with prescreening and therefore suggests a direct search is more efficient for this particular example.This in fact is misleading as the average time for each DFT run is much longer when prescreening is not applied.The time required for each DFT calculation depends on the number of SCF steps and the number of times the analytical derivatives are calculated (Geometry Optimisation steps).For the (MgO) 12 example, the GM was found after an average of 20 593 SCF and 1782 GO steps when prescreening employed, and a staggering 158 463 SCF and 7478 GO steps without.Clearly the comparison of the average required DFT calls underestimates the performance difference between the two approaches.
The tabulated data provide direct evidence that using a prescreener is an effective way of reducing the computational load and increasing the efficiency of the EA.This efficiency is attributed passing atomic configurations that are typically near stable (or metastable) local minima on the DFT energy landscape for DFT refinement.Moreover, this procedure removes the unphysical structures that take much computational effort to optimize using DFT.This conclusion of course relies on the suitability of the chosen interatomic potentials, which is considered below.
Another advantage the pre-screener has over a direct search on the DFT energy landscape is that pre-screening will coalesce fragmented nanoclusters more readily.Fragmented nanoclusters are high on the energy landscape and therefore computing them is a waste of computational resources from the point of view of the result.The charge on atoms in the model using interatomic potentials is fixed (set by the user) and so oppositely charged fragments will be attracted to each other.However, using DFT the electrons can redistribute to ensure charge neutrality of each fragment, and therefore a fragmented cluster will often remain fragmented.Furthermore, dispersion interactions between fragments are poorly described by standard DFT.
Searches on the PBE landscape were also conducted for (MgO) n , which produced matching LM to those reported above for PBEsol as well as similar statistics in the success and efficiency of the searches.
3.2.3Effect of the interatomic potential on the search.It is important that the DFT landscape is searched and that the prescreening does not lead to missing important local minima.One possible way to limit the bias of the pre-screener is to restrict the number of line searches (relaxation steps) or reduce the required accuracy of the local optimiser.This typically will allow for the removal of any structure with very short interatomic distances that may cause DFT problems, but any computational cost saving in the pre-screening is negligible.Here, the effect of the accuracy of the chosen interatomic potentials (IP) on effectiveness of the pre-screening is investigated.In particular, the IP used for pre-screening MgO nanoclusters was used during the search for ZnO nanoclusters.This IP is less computationally complex (it does not have the multi-region Buckingham potential), but is still appropriate for modelling oxides with an additional benefit of exploiting ionic size similarity between Mg and Zn.No ZnO local minima were missed and, moreover, no discernible difference was found in the computational cost or the efficiency to locate all the local minima.It is important to note that the number of local minima on the IP and DFT landscapes will differ, but this can be either an advantage or a disadvantage of this approach.For example, unwanted higher DFT energy minima may not exist on the IP landscape and therefore easily avoided when implementing pre-screening -see below.
3.2.4Energy landscape complexity.In order to understand why the MgO LM were located quicker than the corresponding ZnO LM, differences between the energy landscapes of ZnO and MgO are investigated.The ease of finding a particular LM using a Lamarckian approach will depend on the catchment area of the LM.The catchment area is defined as the area of landscape from which the local optimisation routine will converge to the LM, and typically will be larger than the energy basin that the LM is contained within.An estimate of the relative size of these catchment areas is obtained by using the random quenching routines of KLMC (the accuracy of which will improve with the number of sample points).One hundred random configurations of ZnO and MgO were generated and subsequently refined at the DFT level.The frequency of finding the GM, LM2 and LM3 structures are summarised in Tables 10 and 11. the performance of the random quench and a Lamarckian evolutionary algorithm showed, as expected, that the latter required fewer DFT calculations to locate the local minima, and thus highlights the efficiency of global optimisation techniques such as an evolutionary algorithm.Further analysis showed that the majority of structures located with a random quench on the DFT energy landscape were peroxide structures.The formation of these structures is impossible with the pre-screening potentials used here, and therefore these structures are automatically avoided in any DFT refinements.Traditionally, DFT energy minima were obtained by refining IP energy minima structures found from global optimisation.With the increase in the available computer power, there is a drive towards direct searches on DFT energy landscapes.This study has shown that a careful application of interatomic potential based pre-screening will lead to a successful and, in fact, a more efficient algorithm for searching the DFT energy landscape.Pre-screening is therefore recommended for future applications of structure prediction techniques.

Fig. 1
Fig. 1 Flow chart of the EA implemented within the module of KLMC.Blue represents actions solely undertaken by KLMC; orange -the main parallelised action; and green -main result of an action.The green arrow marks the last step of each EA cycle.

Fig. 2
Fig. 2 Flow chart of the evaluation process as implemented by KLMC.Green represents actions solely undertaken by KLMC; orange -actions undertaken by third party software; and blue -user-defined choice or main result of an action.

Fig. 3 (
Fig.3(CdSe) n , (ZnO) n , (MgO) n , and (KF) n GM nanoclusters, as determined on the PBEsol energy landscapes, for n = 1-3, arranged by size and degree of polarisation (with point symmetry group labels).Colours: turquoise is Cd, pink is Se, sea green is Zn, red is O, blue is Mg, light blue is K, and dark blue is F.

Fig. 9
Fig. 9 PBEsol energy difference, E n * = E n À nE 1 , and the second energy difference of (AB) n GM nanoclusters.

Fig. 11
Fig. 11 Bond lengths within rings (filled symbols) and drums (open symbols).A broken line connects data points corresponding to bonds that are formed for each drum when constructed from two rings.Black, red, green and blue lines and symbols correspond to (CdSe) n , (ZnO) n , (MgO) n and (KF) n nanoclusters, respectively.

Fig. 12
Fig. 12 Differences between PBEsol relaxed and ideal bond angles.Upper graph: within a ring (unconnected symbols) and within one of the two larger rings of a drum centred on anions (broken line) and on cations (solid line).Lower graph: angles between one bond within a larger ring and one bond connecting the two larger rings.Note ideal implies 901 in the lower graph, and internal angles for regular 2n (n) sided polygons for rings (drums) in the upper graph.For colours see Fig.11.

Fig. 14
Fig.14The energy difference between HOMO and LUMO, DE, for ZnO, MgO, KF, and CdSe GM (upper) and LM with a matching motif (lower) as a function of nanocluster size n (formula units).

Fig. 15
Fig.15Ionisation potential and electron affinity for ZnO, MgO, KF, and CdSe as a function of nanocluster size n (formula units) for GM and LM with the chosen motif (see Fig.14).

Fig. 16
Fig. 16 Number of SCF cycles to convergence for a ZnO dimer at a given inter-atomic separation using the Gaussian smearing of 0.01 eV.Inset: the effect of increasing the Gaussian smearing to 0.1 eV.

Table 2
Differences of electronegativity

Table 2
is found, i.e.CdSe has the weakest bond, ZnO and MgO the second and third weakest, whereas KF has the strongest bond.The lowering of the

Table 4
HOMO and LUMO energies and their differences for ZnO and MgO, in eV As given in ref.39.bExperimental value reported in ref.80.

Table 5
HOMO and LUMO energies and their differences for KF and CdSe, in eV a Experimental value on CdSe nanoparticles reported in ref. 82.b Experimental value reported in ref. 83.c Nanorod bulk-limit value reported in ref. 84.d Nanoparticle bulk limit value reported in ref. 85.

Table 6
Average over three independent runs of EA iterations, N EAi , and number of DFT calculations, N DFT , to locate the lowest energy nanoclusters of ZnO when using pre-screening

Table 7
Average over three independent runs of EA iterations, N EAi , and number of DFT calculations, N DFT , to locate the lowest energy nanoclusters of ZnO without using pre-screening This journal is © the Owner Societies 2014 Phys.Chem.Chem.Phys., 2014, 16, 21119--21134 | 21131

Table 8
Average over three independent runs of EA iterations, N EAi , and number of DFT calculations, N DFT , to locate the lowest energy nanoclusters of MgO when using pre-screening

Table 9
Average over three independent runs of EA iterations, N EAi , and number of DFT calculations, N DFT , to locate the lowest energy nanoclusters of MgO without using pre-screening This journal is © the Owner Societies 2014 Phys.Chem.Chem.Phys., 2014, 16, 21119--21134 | 21133