Screening heteroatom distributions in zeotype materials using an effective Hamiltonian approach: the case of aluminogermanate PKU-9.

We introduce a method to allow the screening of large configurational spaces of heteroatom distributions in zeotype materials. Based on interatomic potential calculations of configurations containing up to two heteroatoms per cell, we parameterize an atomistic effective Hamiltonian to describe the energy of multiple substitutions, with consideration of both short- and long-range interactions. Then, the effective Hamiltonian is used to explore the full configurational space at other compositions, allowing the identification of the most stable structures for further analysis. We illustrate our approach with the aluminogermanate PKU-9, where we show that increasing the aluminium concentration changes the likely siting of Al, in agreement with experiment.


Introduction
The application of microporous solids in catalytic, ionexchange, molecular adsorption and separation processes is controlled by the structure of the pores and their composition. Hence much work has been directed at obtaining architectures, through various strategies, which provide the optimal material for specific applications. 1

, 2 Brunner and
Meier identified, almost thirty years ago, that the assembly of small rings promotes the formation of low-density zeolites, with larger pores. 3 The synthesis of new frameworks with such larger pores has mainly been achieved through the introduction of heteroatoms, other than silicon and aluminium, into the zeolitic framework. 4,5 In particular, germanium has been identified as a promoter of large pores, due to longer Ge-O bond length (~1.74 Å) [6][7][8][9] and smaller Ge-O-Ge angle (~130°) [6][7][8][9] compared to the geometries obtained in aluminosilicate units, related to the static flexibility imparted by Ge atoms stabilizing small units such as double four rings. [10][11][12] Recently, it has been proposed that Ge also confers dynamic flexibility to the framework, in the sense that it leads to enhanced molecular diffusion within the zeolite. 13 The incorporation of Al in germanate frameworks requires the presence of charge-compensating extra-framework cations which will impart ion-exchange and catalytic properties. Moreover, the presence of Al enhances the stability of the framework upon template removal. 14 Both the amount of Al incorporated, and its location in the framework, impact the physical and chemical properties of the resulting material, similarly to what happens in aluminosilicate zeolites. [15][16][17][18][19] A large body of experimental and computational work exists aimed at identifying and attempting to explain the distribution of Al in aluminosilicate materials. [20][21][22][23][24][25][26][27][28][29][30][31] In aluminogermantes the scenario is different, and computer modelling offers a valuable tool for identifying preferred siting of heteroatoms, as evidenced by previous successful applications to the investigation of Si-Al [24][25][26][27][28][29][30][31] and Si-Ge distribution in zeolites. 12,[32][33][34][35] Heteroatoms can be distributed over the framework of zeolites with varying degrees of order, from full ordering in some cases to completely random distribution in other cases. Computational studies of heteroatom distributions might suffer limitations in zeolites with small concentration of heteroatoms, like those exhibiting high Si/Al ratio, or with small energy differences between the configurations of foreign atoms over distinct tetrahedral sites. In such cases, the location of heteroatoms is often random or it is directed by the synthesis conditions. 30,[36][37][38][39][40] However, the larger the energy differences between configurations, the more important thermodynamic factors will be in controlling the distribution. This explains the appearance of some -particularly naturally brewsterite, 45 HEU-topology clinoptilolite 46 and heulandite, 47 epistilbite, 48 and levyne. 49 Computational modelling have successfully shown the preferential ordering in goosecreekite 28,29 and partial ordering in HEU-type zeolites 26,49,50 51 and recently enlarged to 209 zeolite frameworks. 52 When modelling heteroatoms distribution in zeotype frameworks, interatomic potential methods have proven particularly successful as they are low-cost and have been shown to be able to reproduce subtle structural and energetic differences: quantum-based methods remain prohibitively expensive except for considering single substitutions in a unit cell or when considering a small subset of ordered structures. But when the configurational space is as large as it is in the present study (or in similar problems in related materials) the computational expense of interatomic potential methods still remains a restriction.
The local geometry of germanate tetrahedra, besides allowing stabilization of small rings, facilitates the substitution of Al atoms in structural units rarely observed in silica and aluminosilicate zeolites, such as 3-membered rings (3MR) and spiro-5 units. 53 This structural diversity might lead to a wide distribution of Al atoms in germanate frameworks, as compared to aluminosilicates. However, when the Al content is relatively high (Ge/Al ratio below ca. 4), identifying the location of Al atoms is both experimentally (for the reasons given above) and computationally challenging, the latter due to the very large size of the configurational space. As an example, we consider the aluminogermanate PKU-9 (PUN IZA topology), which exhibits a zeolite framework composed of zeolite CGS layers and spiro-5 units 53 ( Figure 1). This structure has 5 distinct T sites, one located in the centre of the spiro unit (T5) and two around this centre (T1 and T2), with each of these three sites being part of a 3MR. Structure refinement of X-ray diffraction data 53 suggests that the Al and Ge are randomly distributed over the five T sites. Moreover, PKU-9 has only been prepared with a Ge/Al = 3.5 and no report is found of higher Ge/Al. However, the distribution of heteroatoms in related zeolites with CGS structures varies: in phosphates ordering is found for the phosphorous and other tetrahedrallycoordinated atoms (e.g. Ga and Zn or Co) at a P/heteroatoms ratio of 1, 54, 55 but in CGS gallosilicates, Ga and Si atoms remain disordered even when Ga/Si is only about 2.

56, 57
These contrasting results suggest that we may expect some ordering in PKU-9, with significant differences in the local environment of the different T sites. Moreover, the absence of a range of PKU-9 compositions might also allude to particular topological constraints on the number and location of aluminium atoms in this particular structure. For example, if at low aluminium content the stability of the structure is strongly dependent on particular T-site occupation, hydrothermal self-assembly of such a structure is unlikely. However, in order to explore computationally such a wide range of compositions, to probe any changes in preferred occupation, we have to screen as much of the phase space as possible, which will be limited by computational cost. This is a common issue when probing distribution configurations in materials chemistry. Therefore, we propose here an approach to model the Ge/Al distribution in complex aluminogermanates by introducing an effective Hamiltonian for a fast evaluation of configurational energies. In the 1980's, effective Hamiltonians were developed to study the siting of extra-framework cations in zeolites 58,59 and Al distribution in high-symmetry zeolites were also studied shortly after. 60,61 These early investigations did not consider relaxation effects and also were limited to screening relatively small configurational spaces. More recently, Monte Carlo simulations have been used to study Si-Al distribution in zeolites via sampling of non-relaxed configurations. 62 However, for some zeolites relaxation effects are known to significantly affect the distribution of heteroatoms, and indeed there may be no correlation between the energy of the configurations before and after structural relaxation, which limits the applicability of such fast-sampling approaches. 26 Hence, new developments are still needed to deal with large configurational spaces in complex zeolites. In the approach that we introduce here, an effective Hamiltonian suitable for fast sampling of a very large configurational space, is parameterised to reproduce atomistic simulations which included full geometric relaxation. In this way, information about site-specific geometric relaxation behaviour is (indirectly) included in the model.  26-29, 31, 70 However, in zeolites with high concentrations of heteroatoms even with the use of symmetry and of some ad-hoc structure-related constraints, the number of independent configurations is larger than what is tractable by fully atomistic simulations. Our selection of the aluminogermanate PKU-9 zeolite, will provide here an example on how the structural information regarding the location of the Al atoms will vary according to the approach used as the composition changes from that found experimentally to higher Ge/Al.

Methodology
The composition of PKU-9, Ge/Al = 3.5 (8 Al atoms by unit cell), leads to over 30 million Si-Al configurations in a unit cell. Such a number of configurations is far too high to allow a costeffective lattice energy minimization strategy, even with a simple interatomic potential. Note that the average time for the energy minimisation of a GeAl-configuration here is of 30 s, which would imply 28 years on a single processor for calculating all the configurations. Using symmetry relations, the configurational space can be reduced to ~8 million symmetrically non-redundant configurations. But this is still a very large number of calculations to perform. In addition, the high flexibility introduced by the Ge atoms also prevents the use of ad hoc restrictions (based on reduction of the local stress) which have been used previously in the simulation of aluminosilicates. 26, 50 Therefore we perform our analysis of Al siting by exploiting the screening capabilities offered by effective Hamiltonians. Our approach is based on a parameterized energy function extracted from pairwise interactions energies obtained from interatomic potential calculations, when two aluminium atoms are introduced into the unit cell, as discussed below. To show why it is necessary to correctly model the Al-distribution at the experimental composition, we first study lower Al contents to see whether the gained information can be extrapolated -in other words if the siting of Al is independent of the concentration incorporated. Since in this case there is a manageable number of configurations for computing their lattice energies, we have chosen a forcefield approach, as it is common in the investigation of zeolites when relatively large sets of configurations are explicitly considered. 26, 31, 33-35, 50, 71-73 Our calculations give access to the energies of a very large set of configurations of heteroatom distributions, which in the first place provides a measure of the likelihood of thermodynamic control of the heteroatom distribution. As mentioned above, large energy differences suggest partial or even full ordering, whilst small energy differences may favour a random action distribution or could lead to the distribution to be strongly dependent on synthesis conditions. It is also worth recalling that during synthesis, when energetic conditions are favourable, the resulting zeolites often undergo Ostwald ripening, transforming to more stable structure. 74 Indeed such transformation can also occur without changes in the zeolite topology, modifying the heteroatom distribution to gain stability. 75

Interatomic potential calculations
All the calculations based on interatomic potentials were performed using the GULP code. 76 The germanate framework is modelled using the Ge-O potential of Sastre and Gale, 33 while the Al-O interactions are described by the Jackson and Catlow potentials. 87 In general, mixing potentials from different sources can reduce the accuracy of the calculations. However, in the present case this is not problematic as both sets of potentials were parameterized for zeolite-like materials, starting from the same O-O interaction potential as the energy reference, and therefore they are expected to be compatible. Any remaining slight inaccuracies will be masked in the context of the effective Hamiltonian approach. A potential that also considers the same energy reference and includes Ge-O and Al-O interactions has been reported by Sastre and Gale. 88 To provide further confidence in the method and the potentials used we selected 60 configurations and re-minimised using the Sastre and Gale potential. These configurations were selected from the entire energy spectrum, with the same number of configurations for each decile of ordering of the lattice energies computed with the other potential. The correlation between both sets of energies is 0.991, which provide us confidence on our selected parameters.
The introduction of heteroatoms into a tetravalent framework is typically accompanied by extra-framework charge compensating cations. Under synthesis conditions these are usually the cationic template molecules and/or inorganic cations. 74 However, explicit consideration of these species add even more complexity to an already large configurational space, as modelling such extra-framework species would imply an additional large set of configurations for each framework heteroatom distribution. 29,50 The lower charge of Al can also

Configuration generation
Introduction of Al atoms into the PKU-9 framework (Figure 1) leads to a very large number of configurations. At the experimental Ge/Al ratio of 3.5, we have 30 million configurations and even at Ge/Al = 5 the number is over 1.9 million. The application of symmetry constraints allows us to simplify matters somewhat. The SOD (Site Occupancy Disorder) code, 91 which has been successfully employed for studying Si-Al distribution in other zeolites, 31 generates the complete configurational space for each composition of the computational cell and then extracts the symmetrically inequivalent configurations by considering the crystallographic symmetry operators: thus significantly reducing the number of configurations that have to be considered. The occurrence probability for each configuration can then be calculated assuming a Boltzmann distribution and hence the T site occupancy determined by considering the occupancy of each site weighed by the occurrence probability and the site multiplicity. Using SOD we fully explore the configurational space at relatively high Ge/Al ratios (with 1 to 4 Al per unit cell) and we can afford (at reasonable computational cost) to evaluate the energy of each configuration by full energy minimization using interatomic potentials.

The effective Hamiltonian approach
Effective Hamiltonians (EH), which give the energy as a function of site occupancy by introducing interaction parameters between nearest and (sometimes) next-nearest neighbour sites, have previously been applied to non-porous aluminosilicates with high Al content. 92 However, in the case of microporous solids the consideration of short-range interactions is not sufficient, because their open topology results in longer range interactions also affecting the aluminium siting, even at large Al -Al distances. 31 We therefore develop an alternative energy model as follows, starting from E 0 , which is the lattice energy of pure-germania PKU-9 (with no Al) computed with GULP.  (1) is negligible in comparison with full energy minimisations using a forcefield. Our method can be seen as a site-and pair-based extrapolation of energies from low to high concentration of substitutions.
To validate this approach, we considered two sets of structures extracted from the full phase space of configurations with 8 Al per unit cell. In the first (Set 1), we analyse the 10,000 lowest-energy configurations given by Equation 1. Then, in order to achieve a wide screening of the configurational space, we also consider a second set of 150,000 configurations randomly selected from the entire phase space (Set 2), but with effective-Hamiltonian energies above those of Set 1. The configurations in both sets were subject to full lattice energy minimization using GULP. The multiplicity (number of symmetry redundant configurations) of each one was determined considering the complete set of configurations (~30 million). Then, the lattice energies in conjunction with the multiplicity of the configurations were used to determine the occurrence probability and therefore the T site occupancy. The multiplicity of sites T1 to T4 is 8, while it is 4 for T5 (Figure 1). Figure 2 shows the comparison of lattice energies in Set 1 obtained from full atomistic minimization with those obtained from the effective Hamiltonian (using Equation (1)). The energy trends are well captured by the effective Hamiltonian, achieving a correlation factor of 0.969, and showing a slope equal to 1.012 that shows the strong correlation between the two methods As expected from previous work describing heteroatom distributions, 25-31, 49, 50, 83, 89, 93-95 some scattering is observed, as subtle structural effects finely control the precise distribution, and it is also apparent from Figure 2 that they are not fully reproduced by the effective Hamiltonian in a number of configurations. Nevertheless, it is clear that the general trend is well captured, with the majority of the lowest energy configurations being identified by the effective Hamiltonian. As an additional observation, one can see in Figure 2 that the line representing the linear fit (red solid) slightly departs from that showing the y = x function (green dashed). This is caused by higher order terms not considered in the formulation of the effective Hamiltonian, i.e. interactions between 3, 4, or more Al atoms. The computational cost to introduce such terms would be quite high, with only a minor improvement on the results. We also find that the EH approach captures the essence of the stability of the entire phase space: Figure 3 combines the results for all the configurations in Set 1 and Set 2. Indeed, the correlation factor is improved to 0.993, with the overall dispersion decreasing for the higher energy configurationsrecall Set 2 is a random sample from the entire phase space. This analysis would suggest that the EH successfully identifies both lower and higher energy configurations well, emphasizing its usefulness as a screening tool. The ability of the EH to determine relative energies is also accentuated by the fact that only a small number of configurations (80 of 150000) from Set 2 subsequently fall into the energy range of Set 1 when subjected to full energy minimization using interatomic potentials. Moreover, the most stable of these is still ranked only 5221 of all those structures now fully optimized.

Correlation between interatomic potentials and effective Hamiltonian
We therefore conclude that the constructed EH correctly identifies configurations within this vast phase space, giving us confidence that it is an appropriate and effective tool in screening such materials.

Isolated Al -site preference and Al-Al interactions
We explore now the differences found in the Al occupation of the different T sites when isolated substitutions are introduced, and when a second Al is also introduced. When only one Al is introduced we see significant variation in the relative energies between the five T-sites, namely 10.8, 5.6, 10.9, 6.8, and 0.0 kJ mol -1 for T1 to T5, respectively, which we can ascribe to the different local geometries of the sites. This result shows a qualitatively different behaviour to that shown by zeolite ZSM-5, where the energy differences from single Al substitution over the distinct T sites are typically much smaller (within 4 kJ mol -1 for the seven sites with lower Al-substitution energy) 31 . Given the larger energy differences in PKU-9 we may expect a degree of non-random Al distribution in a material with this topology if formed at high Ge/Al, with T5 Providing a quantitative description of the Al -Al interactions is complex but can be illustrated by considering first the simplest case of incorporating two Al per unit cell. In Figure 4 are displayed the energies of the compositions with 2-Al incorporated as a function of the sum of the energies of the configurations having 1-Al substituted in each site. Hence, deviation from the line with slope 1 reflects the Al -Al interaction energy. While in ZSM-5 the values of the interaction energies are about 75 kJ mol -1 , excluding non-Lowenstenian configurations) 31 , here we have some values up to 25 kJ/mol higher. This is associated with the smaller size of the unit cell, which makes the Al-Al interaction larger Note that those configurations where the interaction energy is most significant (>125 kJ mol -1 ) are those that have adjacent T sites occupied by Al, suggesting (as is common) that Loewenstein's rule (formulated obviously originally for aluminosilicates) is generally obeyed for aluminium (or even other formally 3+ metals) distribution in framework materials. These results, as we show below, further suggest that even at relatively low aluminium content (Ge/Al = 17) we may expect some degree of excess Al population in T5 and conversely lower than expected content in T1 and T3.

Increasing Al content
We now consider lower Ge/Al compositions, tending towards the experimentally found compositions of 3.5 (i.e. 8 Al per unit cell). If the Al was randomly sited in PKU-9, one would expect that the occurrence probability for all configurations of a given Ge/Al to be similar. However, in Figure 5 it is observed that discrete occupation is present for a number of lowest energy configurations. Indeed, we observe that for 1-4 Al per unit cell a few configurations are considerably more favourable than others, and in each case one configuration will have an occurrence probability over 4 times higher than the next most stable configuration. Even when 8 Al are present (Figure 5d) there remains one strongly favoured configuration with a further clear group of other more stable configurations above the continuum. Recall that for the structures with 1 to 4 Al atoms per unit cell, full screening of the Al distribution was possible using fully atomistic simulations for all possible configurations. However, for the 8-Al structure the energies are those obtained solely from application of our effective Hamiltonian approach, which is the only tangible route to scan the full configurational space.
The above results, particularly for Al atoms content up to 4 atoms per unit cell, therefore suggest that the Al average occupancy in the tetrahedral sites is not likely to be random.
Recall that site occupancy also depends also on the multiplicity of the sites: 8 for T1 to T4 and 4 for T5. However, we also find that the preferred location of Al varies with composition, as shown in Figure 6. At low Al content, from 1 to 4 Al per unit cell, preference for T5 is dictated by the topology. However, when 8 Al are present in the cell, which corresponds to the experimental composition, the Al-Al repulsion ( Figure 4) becomes more relevant than the preference associated to topology. In this case, sites T1 and T2 show the higher populations, with T3 now being the least favoured. Noticeably, the differences between the various sites are now less pronounced.  Bars from left to right (red to magenta) corresponds to T sites from T1 to T5. For each composition, the occupation is normalized so that the most favored site has an occupation of unity.

ARTICLE
In order to further quantify the heterogeneity of the site occupancy for each Al content, we use the standard deviation of the Al site occupancy probabilities as a measure, i.e. the lower this value the lower the heterogeneity, and thus the larger the likelihood of random Al siting. Please do not adjust margins Please do not adjust margins for 8 Al atoms/cell is at least twice lower than for the other cases, which indicates that the bias for single site occupation due to the framework topology is largely smeared by the Al -Al interactions and the local distortions accompanying the Al incorporation. However, we also note that the computed standard deviation of the average probability by Al atom for the highest Al content, 8 Al per unit cell, is about 2.7 times higher than that determined in the experimental work (0.03). 53 In other words, our theoretical analysis suggests that Al siting in PKU-9, in thermodynamic equilibrium, should be less random than what was determined by XRD. This discrepancy could be due to the approximations involved in our model, or to non-thermodynamic effects (which we ignore) contributing to the actual distribution. Alternatively, it could also be a consequence of the limitations of XRD analysis in determining site occupancies for this type of system. This point therefore deserves further theoretical and experimental investigation. It may be useful to repeat a refinement starting from different occupations, such as the most stable and least stable determined here, to establish if the experimental data can indeed be used to distinguish any preferential siting. It is worth to note that the effective Hamiltonian introduced in this work, has been used along with experiments to provide a deep characterization of the Si-Ge substitutional series in the chiral STW family of zeolites. 98

Conclusions
An approach has been introduced for screening the multimillion configurational-space associated with heteroatom distributions in zeolites at high concentration values. This method fills a gap in the theoretical treatment of heteroatom distribution in zeolites, as existing approaches are not adequate when the configurational space is composed by more than a few tens of thousands of configurations. The method uses an atomistic effective Hamiltonian parametrized by Al-related energies obtained from fully atomistic calculations, including distant Al -Al interactions. The robustness of the method was verified against lattice energies calculated by interatomic potentials over 150,000 Al-Ge configurations in PKU-9 aluminogermanate zeolite at the experimental chemical composition.
As a case study, a detailed computational study of the Al distribution in the framework of PKU-9 aluminogermanate was presented. Al contents of 1 to 4 and 8 (experimental content) atoms per unit cell were considered. We have been able to reduce the size of the configurational space of Al distribution using the SOD code 91 for Al contents up to 4 Al atoms per unit cell. For 8 Al atoms in the unit cell, a selection approach based on the stability order indicated by the effective Hamiltonian approximate energy was implemented. At Al content up to 4 atoms per cell, clear preferential Al location is predicted for T5 site. However, sites T2 and T3 are expected to show higher occupancies at the experimental composition (8 Al per cell). This reveals that the Al -Al interactions are crucial in controlling the Al distribution in aluminogermanate, and therefore information gained for lower Al incorporation cannot be simply extrapolated to higher Al content.
Conversely, the lack of significant preferential siting at the experimental compositions may provide an insight into why it is this particular composition that can be formed: for higher Ge/Al ratios, specific Ge siting may be required to form stable structures. Su et al.'s analysis of the XRD data suggests a purely random distribution, which our data generally supports. Nevertheless, we do predict some non-homogeneous distribution of the Al atoms over the T sites at the experimental composition. It can be argued that, without a suitable starting model, Rietveld analysis methods may not converge to anything other than a random distribution. Therefore, it would be valuable to perform further experimental studies, perhaps using Al NMR data to combine with the XRD data or using our calculated distribution as a starting point in the Rietveld analysis.

Conflicts of interest
There are no conflicts to declare