Evolutionary exploration of polytypism in lead halide perovskites

The regular ABX3 cubic perovskite structure is composed of close-packed AX3 layers stacked along the 〈111〉 axis. An equivalent hexagonal close-packed network can also be formed, in addition to a series of intermediate polytype sequences. Internally, these correspond to combinations of face- and corner-sharing octahedral chains that can dramatically alter the physical properties of the material. Here, we assess the thermodynamics of polytypism in CsPbI3 and CsPbBr3. The total energies obtained from density functional theory are used to paramaterize an axial Ising-type model Hamiltonian that includes linear and cubic correlation terms of the pseudo-spin. A genetic algorithm is built to explore the polytype phase space that grows exponentially with the number of layers. The ground-state structures of CsPbX3 polytypes are analysed to identify features of polytypism such as the distinct arrangements of layers and symmetry forbidden sequences. A number of polytypes with low ordering energies (around thermal energy at room temperature) are predicted, which could form distinct phases or appear as stacking faults within perovskite grains.


Introduction
Perovskite is an important and exible structure type in solidstate chemistry and physics. [1][2][3][4] Metal halide perovskites have attracted signicant attention as photovoltaic materials since the rst report of a photoactive absorber in 2009, 5 with the power conversion efficiency (PCE) climbing up from the 3.8% to the current record of 25.5%. 6 One key for improving solar cell technologies based on the hybrid organic-inorganic perovskites such as MAPbI 3 (MA ¼ CH 3 NH 3 + ) and FAPbI 3 (FA ¼ CH(NH 2 ) 2 + ) or the inorganic lead halide perovskites such as CsPbBr 3 and CsPbI 3 , is to synthesize large-area single-crystal perovskite lms, which is challenging due to intense phase competition. MAPbI 3 undergoes degradation to release CH 3 I and NH 3 gas at temperatures as low as 80 C, 7 posing a challenge to longterm stability. The compositions used in the state-of-the-art perovskite solar cells are shiing to incorporate FA + and Cs + as A-site cations, with reported PCEs up to 25.2% 8 and 20.4%, 9 respectively. They are comparatively more stable than MAPbI 3 at elevated temperatures. However, the photoactive black phases (cubic a-phase) of both FAPbI 3 and CsPbI 3 can spontaneously transform into photoinactive, non-perovskite yellow phases (hexagonal d-phase FAPbI 3 ; orthorhombic CsPbI 3 ) at room temperature. [10][11][12] These transformations involve the formation of multiple intermediate phases. 13,14 Phases can coexist in as-synthesized perovskite crystals, for instance, the presence of cubic and hexagonal domains in a FAPbI 3 nanowire was conrmed with in situ optical microspectroscopy. 15 The authors proposed a disordered amorphous interfacial layer at the phase boundary where the octahedra are distorted and tilted. Recently the mixed perovskite MA 1Àx FA x -PbI 3 (x ¼ 0.5, 1) was also reported with characterisation evidencing the appearance of an intergrain 2H phase. 16 On the other hand, the coexistence of cubic and hexagonal phases is oen observed for oxide perovskites. For instance, various mixed (ordered or disordered interleaved) packing of hexagonal and cubic layers in BaNiO 3 , BaCrO 3 , BaMnO 3 , and BaRuO 3 can form due to the small energetic difference between arrangements, giving rise to strong polytypism. 17,18 Polytype engineering can enable novel electronic and ion transport properties as demonstrated for hexagonal oxide perovskites in a range of energy-related technologies. 19 Similarly, polytypism of lead halide perovskites has been shown to be energetically accessible. First-principles calculations revealed that the formation enthalpy of d-FAPbI 3 was lower by about 70 meV per formula unit than a-FAPbI 3 at 0 K. 20 Our density functional theory (DFT) calculations, reported below, also conrmed that the formation enthalpy of a hexagonal CsPbBr 3 /CsPbI 3 phase is 40 meV/90 meV per formula unit higher/lower than the corresponding cubic phases at 0 K. 21 It has been reported that the black to yellow phase transitions in FAPbI 3 and CsPbI 3 are entropy-driven and allow for kinetic trapping of intermediate phases. 22 Another aspect is the small lattice mismatch along the h111i stacking direction between the hexagonal and cubic polytypes, which is just 4.7% for CsPbI 3 and 5.4% for CsPbBr 3 , according to our calculations.
Due to the range of accessible ordered stacking sequences with distinct crystal symmetry and the appearance of stacking faults that break symmetry in halide perovskites, polytypism can bring forth novel physical and chemical phenomena. This includes carrier-blocking effects of hexagonal regions in cubic crystals and a likely contribution to photogenerated carrier recombination in other cases. 21 In this work, we aim to shed light on the phase space for polytype formation in lead halide perovskites. To overcome the limitations of standard modelling approaches, we build a workow based on an evolutionary exploration of a model crystal Hamiltonian. We provide insight into the ordering trends and ground-state structures of CsPbX 3 as a prototype system. Universal features such as the accessible layer arrangements and symmetry forbidden sequences are reported.

Computational methods
We used a combination of computer simulation approaches in this study. DFT can provide reliable rst-principles total energies, but is limited to relatively small polytype stacking sequences. For this reason, a model Hamiltonian approach was employed to rapidly predict polytype energies and assess the extent of polytype disorder. Aerwards, a genetic algorithm with optimisation functions derived from the model Hamiltonian was built to rapidly converge the searching of low energy congurations.

Training data
For each polytype in the training set, DFT calculations were performed using the Vienna Ab initio Simulation Package (VASP) 23,24 with the standard frozen-core projector augmentedwave (PAW) 25,26 method. The cutoff energy for basis functions was 420 eV. The generalized gradient approximation (GGA) 27 of the Perdew-Burke-Ernzerhof functional for solids (PBEsol) 28 was used for the exchange-correlation. We used a dense 7 Â 7 Â 5 k-point mesh for structural relaxation until all atoms were relaxed with Hellmann-Feynman forces below 0.01 eVÅ À1 .

Model construction
Ising-type model Hamiltonian. To get the interaction coef-cients for the model Hamiltonian, the summation of each correlation function, , was numerically calculated. Aerwards, the corresponding interaction coefficients, J 0 , J 1 , ., were generated by solving the linear matrix to get the least-squares solutions.

Genetic algorithm
To quickly identify the ground-state stacking sequences for a given layer number, a genetic algorithm was designed by setting the model Hamiltonian as the objective function. The parameters for running this algorithm include the maximum number of iteration, population size, mutation probability, elite ratio, crossover probability, and parental portion. These parameters can be tuned to monitor and optimise the performance of the algorithm. The converged values are given in the results section.

Thermodynamic stability of CsPbX 3 polytypes
The thermodynamic stability of a CsPbX 3 polytype can be evaluated by its ordering energy. Along the stacking chain of a polytype, two adjacent PbX 6 -octahedra are connected either with a face-sharing (h) or a corner-sharing (c) conguration, forming a unique stacking sequence for each polytype, which in turn determines the interaction between layers and its formability. Given a random CsPbX 3 polytype with N layers, its ordering energy (E O ) is dened as The ordering energy is determined by a combination of electrostatics and crystal strain. Low ordering energy usually indicates the ease of formation. Though similar, the ground state structures of CsPbBr 3 and CsPbI 3 polytypes are starkly different, allowing us to probe the chemistry of ordering ions.
For an ionic crystal, the Madelung energy based on point charges should fully describe phase energetics. However, we found that CsPbX 3 is beyond this regime, due to the contribution from the polarisation of ions. As shown in Fig. 2, under the perfect ionic lattice assumption, the Madelung energies of all the polytypes for the two halide perovskites CsPbBr 3 and CsPbI 3 were calculated, we found that the most stable phase should be cubic 3C. Moreover, the Madelung ordering energies E O,Madelung for the two sets of halide perovskite polytypes are lowered dramatically, by 2.51 eV/layer (CsPbBr 3 ) and 2.55 eV/layer (CsPbI 3 ) respectively, from the hexagonal 2H phase to the cubic 3C phase, indicating that the electrostatic Coulomb energies do inuence the ordering preferences. 29 However, as also shown in Fig. 2, rst-principles calculations revealed that the most energetically stable phase for CsPbI 3 is 2H, while it is 3C for CsPbBr 3 . The ordering energies E O,DFT are less than 100 meV/layer for both halides, with E O,DFT (CsPbI 3 ) ¼ 90 meV and E O,DFT (CsPbBr 3 ) ¼ 40 meV between the 2H and 3C phase. The greatly reduced ordering energies mean that the bare coulombic interactions are regulated and screened due to the polarisation of ions. Moreover, the reversed order of stability in CsPbI 3 and CsPbBr 3 polytypes indicates the non-negligible role of anion polarisation; although we note that Pb 2+ is also highly polarisable due to its lone pair (6s 2 ) electrons, which contributes to the ionic dielectric response. 30

Ising-type model Hamiltonian
While rst-principles calculations can provide reliable total energies, they are limited to relatively small stacking sequences and can only provide a fragmentary picture of the physics of ordering. The other hurdle further stops us from using DFT calculations to explore the phase space directly is the explosion of combinations. Structurally, the ordering of h and c in the stacking chain of a polytype shows a binary nature for each layer. If a polytype has N layers, the possible stacking sequences reach 2 N .
To overcome this problem, an Ising-type Hamiltonian was built to calculate polytype energies. This model Hamiltonian can not only be used to compute the random-state energies for the 2 N stacking sequences of h and c, but also implicitly includes a full set of chemical interactions between the neighbouring layers, manifested by a series of interaction coefficients and their corresponding correlation functions. In Fig. 1, we showed the mapping of a stacking sequence (12H) onto an Ising-type model with s i ¼ 1 for a face-sharing octahedra pair and s i ¼ À1 for a corner-sharing octahedra pair. The rst, second and third nearest neighbour interactions are included during the model construction. Thereby, for CsPbX 3 polytypes, the random-state total energy of a polytype with N layers can be written as where N denotes the number of layers, H 0 is the total energy per layer in the absence of layer interactions, i is an index of the layers from 1 to N, and J, K, L are interaction coefficients, which can be further obtained by tting the Hamiltonian to the DFT calculated total energies. In 1988, Plumer generated an approximate phase diagram for possible ground-state structures of ABX 3 perovskite Fig. 1 Illustration of nine CsPbI 3 polytypes with low periodicity (#12 layers). The cubic perovskite (3C) consists of pure corner-sharing PbX 6 octahedra; and the hexagonal (2H) structure consists of pure face-sharing octahedral network. Other polytypes in between 2H and 3C are shown and distinguished according to the halide in each layer being corner-sharing (c) or face-sharing (h). Note that the stacking axis corresponds to the h111i direction of the cubic unit cell. The labels on the right side correspond to the pseudo-spin (+1 for h, À1 for c) in the Ising-type model description of the 12H structure. Different interlayer correlations coefficients are labeled with J, K and L, according to their specific interaction types. polytypes. 31 In this model, limited by available data at the time, the ground-state structures for both CsPbI 3 and CsPbBr 3 should be 3C, concluded by the sign of their coefficients. However, rstprinciples calculations have conrmed that, even though both CsPbI 3 and CsPbBr 3 belong to the ABX 3 type perovskite and have similar interaction coefficients, the most stable phase for CsPbI 3 is 2H, while for CsPbBr 3 it is 3C. This conclusion reects the underlying chemical complexity and how the choice of elements can play an important role to alter the stability of polytypes with similar stacking sequences.
Following this procedure, the model Hamiltonian calculated stability trends, denoted as the ordering energies E O,Ising for both CsPbBr 3 and CsPbI 3 in Fig. 2, in agreement with the DFT results E O,DFT . The Ising-type model Hamiltonian is found to work well, with the largest tting error being negligible at just 3 meV/layer for the 11H(II, 9c2h)-CsPbBr 3 polytype and 2 meV/ layer for the 11H(II, 9c2h)-CsPbI 3 polytype. Furthermore, the model Hamiltonian predictions have been checked with new (unseen) polytype data, which also conrms the high prediction accuracy of the model Hamiltonian (Table S1 †).
The impact of ion choice and ordering for tuning the stability of different stacking sequences is contained within the eight interaction coefficients and their corresponding correlation functions. For instance, the H 0 value for the CsPbBr 3 stacking sequences is about 1.62 eV/layer lower in energy than that for the CsPbI 3 polytypes. Physically, the H 0 can be viewed as the energy of the suspended monolayer of octahedra. The more negative H 0 value in CsPbBr 3 polytypes shows that Br À stabilises the perovskite structure more than I À , in consensus with the conclusion drawn from the perspective of tolerance factors that cubic CsPbBr 3 has a more ideal Goldschmidt tolerance factor (t(CsPbBr 3 ) ¼ 0.92; t(CsPbI 3 ) ¼ 0.89). This highlights the key anion role in determining polytype stability. The additional chemical contributions are quantied by the seven interlayer correlation-related coefficients in the model Hamiltonian, which act as gears to tune the stability of a given polytype.
The most striking parameters are the small values for J 2 and K, being only several meV. As shown in Fig. 1, J 2 denotes the interaction between a layer and its second nearest neighbour, and K denotes the interaction between a group of three neighbouring layers. However, it is not the absolute values of the interaction coefficients that dominates the nal ordering effect. The interaction coefficients matter only when the stacking sequence allows it. This can be understood from two aspects. One is that the stacking sequence determines the weight of each type of interaction in the model Hamiltonian through the Table 1 Interaction coefficients in the Ising-type Hamiltonian for CsPbI 3 and CsPbBr 3 polytypes (unit: eV)  correlation functions: ; and the other reason is, the impact of a sole interaction coefficient can only be pronounced when such interaction is not quantitatively counterbalanced by other interactions in a given stacking sequence. For instance, for the CsPbX 3 polytypes, the total energies of 2H and 3C phases per unit layer are Due to the special stacking sequences for 2H and 3C, the ordering effect of their stacking sequences give an equal weight for different interlayer interactions, which only differ in sign. With the absolute values provided in Table 1, the impact of J 2 and K become pronounced because the absolute total contribution of interlayer interactions is at the 10 meV level, making the inuence from the J 2 and K related terms non-negligible. Moreover, from the model Hamiltonian, why the ground state structures for CsPbBr 3 and CsPbI 3 polytypes prefer different stacking sequences can also be explained. By subtracting eqn (3) and (4), the energy difference between the 2H and 3C phases per layer is Consequently, the three interlayer interactions corresponding to J 0 , K and K 0 are responsible for the ordering energy between the 2H and 3C phases. For magnetic systems, J 0 is viewed as the Zeeman-like term linear in s, representing an externally applied magnetic eld pointing along the stacking direction, however in this pseudo-spin model Hamiltonian for polytypism, J 0 is a phenomenological parameter with no implicit physical meaning and if the numbers of h and c are equal in a stacking sequence, the weight for J 0 will become zero and will not contribute to the total energy. Different from K explained previously, K 0 corresponds to the group interaction of  three layers consisting of a pair of nearest neighbouring layers and one layer that is the second nearest neighbour to any of them. As a consequence, the ordering energy between the 2H and 3C phases could be quantitatively decided. Taking the 3C phase as the reference, for CsPbI 3 DE 2H-3C is À0.09 eV/layer, meaning that the 2H phase is 90 meV/layer more stable than the 3C phase. For CsPbBr 3 , the DE 2H-3C is 0.04 eV/layer, meaning that the 3C phase is 40 meV/layer more stable than its 2H phase. This result arises from the more negative value of K 0 for CsPbI 3 polytypes, which alters the ground-state structures. Such more negative value of K 0 may be also related with the higher polarizability of I À compared to Br À .

Distinct polytypes for an N layer polytype h n c m
The Ising-type model Hamiltonian opens up a fast means to compute the energy for an arbitrary h n c m polytype. The only computing cost is from the summation over pseudo-spin correlation functions determined by the stacking sequence. An expression to calculate the number of distinct polytypes for an N layer polytype h n c m , including periodic boundary conditions, takes the form where A represents the number of distinct arrangements, N is the layer number, n and m are the number of face-sharing (h) and corner-sharing (c) layers along the stacking sequence; QS denotes the ceiling of the whole fraction. Accordingly, we listed the number of distinct arrangement of h and c for CsPbX 3 polytypes in Table 2. For instance, a polytype with 2 layers can have 3 distinct arrangements: hh, cc and hc or ch; a polytype with 3 layers can have 4 distinct arrangements: hhh, hhc, hcc and ccc. With the constraints of periodicity, the space of distinct arrangements of h and c for CsPbX 3 polytypes increases slower than the exponential explosion to the 2 N for a linear chain of pseudo-spin. There are only 60 distinct arrangements for a 9 layer polytype, which should be 2 9 ¼ 512 without periodic boundary conditions. Similarly, the space of distinct arrangements for a 12 layer polytype also decreases dramatically by 1-311/(2 12 ) ¼ 92.4%.
The determination of the distinct polytype set is instructive. Although the energy of a specic polytype h n c m can be computed using the model Hamiltonian swily, only with the full set of the distinct polytypes, can we grasp the complete thermodynamic landscape of the polytypism. More generally, the number of distinct arrangements determines the number and degeneracy of energy levels that a polytype h n c m can possess.

Low energy polytypes
Here, low energy polytypes refer to polytypes whose ordering energies are within the range of ordering energies between the 2H and 3C phases of CsPbI 3 and CsPbBr 3 . These are 90 meV/ layer and 40 meV/layer, respectively. In experiments, during the kinetically controlled growth of CsPbI 3 and CsPbBr 3 crystals, usually the low energy structures would dominate. Along the stacking sequence of the cubic CsPbX 3 lattice, the thickness of an octahedral monolayer is around 3.55Å for the CsPbBr 3 and 3.75Å the for CsPbI 3 . Thus the domain of a 12 layer stacking faults will be about 4.3-4.5 nm in size, both being much smaller than the typical grain sizes of the cubic phase CsPbI 3 and CsPbBr 3 , 32,33 which can extend to microns.
A genetic algorithm was adopted to provide an efficient way to locate the ground state structures of polytypes without calculating all energies. At most, it would take 2 12 ¼ 4096 trials for 12-layer polytypes, but there is an interest to extend to even larger structures in the future. According to evolutionary theory, the gene of a species is a unique DNA sequence. Given the binary nature of each stacking layer, the random stacking sequence of a polytype can also be generated under the biologically inspired operators such as mutation, crossover and selection. Not only that, but also the central idea of natural evolution, 'survival of the ttest', is also implemented in genetic algorithms tailored for optimisation problems. The core of a genetic algorithm is the objective function, which here is the Ising-type Hamiltonian for CsPbI 3 and CsPbBr 3 , During the optimisation process, because of the binary nature of each layer in the stacking chain, the initial population is generated by a random choice between À1 and 1 for each component layer (see Fig. 3). Then the tness scores, i.e. values of the objective function corresponding to different stacking sequences, are sorted and normalized. Aerwards, the cumulative probabilities for the initial population are calculated and used for the parent selection. As a result, the parents are formed of elitists and individuals randomly selected from the current population until the set population size is reached. Then, two basic operators are used to generate the new generation: crossover and mutation. The crossover sites are chosen randomly and the number of them is determined by the crossover probability. Here the crossover happens in a uniform way by 'ipping a coin' for each site of the parents to decide whether or not it will be included in the next generation; and the mutation takes place randomly within the given mutation probability. Finally amongst the new generation, the new best pseudo-spin sequence to minimize the objective function is generated and compared with the previous generations, which guarantees the algorithm to keep descending to the global minimum for the objective function. The optimisation process will stop when it reaches the set maximum number of iterations.
As the parameters used for the genetic algorithm affect the speed and output, a series of tests was conducted to detect the model sensitivity. Taking the 12 layer CsPbI 3 polytype as an example, the genetic algorithm converges to the most stable phase of the 12 layer CsPbI 3 in only two seconds with the output stacking sequence {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}, i.e. the 2H phase, consist with the DFT results. Interestingly, as shown in Fig. 4a, there are plateaus during the convergence, corresponding to energy levels for distinct polytypes. This also means that other competitive structures can be extracted from the plateaus by tuning the parameters in running the algorithm. Fig. 4c shows the selected ground state structures for the 12, 10, 8 layer CsPbI 3 polytypes and their ordering energies compared with the hexagonal 2H phase. Detailed stacking sequences for these ground state polytypes are listed in Table 3. For instance, the 12 layer ground state with an ordering energy of 33 meV per layer in Fig. 4c corresponds to the 12R phase. Such low energy polytypes should be accessible thermodynamically. By considering the thermal energy at room temperature, polytypes with ordering energies lower than 25 meV/layer are more likely to be detected in experimental samples. For example, the 18-layer 2c7h2c7h-CsPbI 3 polytype with a 23.5 meV/layer ordering energy is a candidate phase.
The genetic algorithm can also be used to nd the highest energy phases by ipping the sign of the objective function. In principle, the pure 3C phase should be the highest energy. However, as shown in Fig. 4b for the 12 layer polytypes, the output total energy is higher with the global maximum being the stacking sequence of hhhchhhchhhc. The origin of such high energy phases is discussed in the next section.
To validate the predictions, 220 stacking sequences from the 12 layer CsPbI 3 polytypes were randomly sampled and their total energies were calculated. As shown in Fig. 4d, a polar coordinate was used to distribute the 220 total energies, with the radius denoting the total energy per layer and the angle representing each polytype. Similar to the plateaus that emerged during the genetic search, the total energies for these 220 polytypes are also discretely distributed, mainly falling upon ve energy levels. Taking the total energy per layer of the pure 3C CsPbI 3 as the reference, which is À12.23 eV/layer, most of the total energies fall in the inner energy levels, indicating that these stacking sequences will be hard to form. Moreover, typical sequences exhibiting abnormally high total energies per layer are extracted from the learning paleatus, such as hhhc, hhhccc, hchhhc, hhhhcc and hhhhhc. These sequences can be viewed as the high energy gene for a polytype, which are not favored during the evolution, for instance, hhhc and hhhhchhhhchhhc were found to be unstable. Such high energy gene sequences led us to identify symmetry forbidden sequences, which further decrease the number of available arrangements for h and c.

Symmetry forbidden sequences and the electronic impact of stacking sequences
The stacking sequences predicted to be highly unfavourable are in fact forbidden by crystal symmetry. The root cause is the mismatch between different counts of h and c layers in a stacking sequence. Originally, the representation that we use for the 3C unit cell is rhombohedral (R 3m), while the 2H unit cell is hexagonal (P6 3 /mmc). However, as shown in Fig. 5a, for the sequence hhhc, there is no translational symmetry as the start and end of the cell are located at different atomic sites. While the energy of these sequences can be calculated from the model Hamiltonian, which has no atomic resolution, they cannot be mapped onto a valid atomistic supercell. Such sequences should be at least doubled to form a valid centrosymmetric repeat unit. Therefore, for the sequence hhhc, its lowest periodicity is hhhchhhc.
To preserve translational symmetry for an arbitrary stacking sequence h n c m , its minimum periodicity of layer number P can be determined by the following three rules: (1) If n is odd, then P ¼ 2(n + m), the lattice is hexagonal; (2) If n is even, then P ¼ 3(n + m), the lattice is rhombohedral; (3) If n is even and m is 3X, X is zero or an positive integer, then P ¼ (n + m), the lattice is rhombohedral. Thus, the genetically unfavorable stacking sequences, i.e. hhhc, hhhccc, hchhhc, hhhhcc, hhhhhc, are symmetry forbidden. These can only be component sequences for larger lattices and cannot exist alone, so the hhhc should be doubled into the centrosymmetric hexagonal hhhchhhc sequence; the hhhccc should be doubled into the centrosymmetric hexagonal hhhccchhhccc; and the hchhhc should be tripled into the rhombohedral hchhhchchhhchchhhc sequence and so on. The requirement of translational symmetry further decreases the number of distinct polytypes for an N layer polytype h n c m .
At last, an example of the impact of stacking sequences on the underlying electronic structures of polytypes is presented in Fig. 5b-e. Taking the 11-layer polytypes as an example, the DFT electronic structures of 2h9c and 3c8h polytypes are shown. The effect of stacking sequence manifests itself by altering the band structures for two CsPbI 3 polytypes, that is, from a direct band gap semiconductor (2h9c-CsPbI 3 ) to an indirect band gap semiconductor (3c8h-CsPbI 3 ). Even though a direct band gap is retained for CsPbBr 3 polytypes, the 3c8h-CsPbBr 3 polytype exhibits a noticeably larger band gap than 2h9c-CsPbBr 3 . The inuence of stacking sequences shows the potential for electronic structures engineering in perovskite polytypes with identical compositions.

Conclusions
A family of perovskite polytypes can be represented by sequences of the face-and corner-sharing PbX 6 octahedra. The combinatorial explosion arising from the increasing periodicity of the stacking sequence makes it resource and time demanding to perform exhaustive rst-principles calculations. To address this problem, an Ising-type model Hamiltonian was constructed and the total energy for any given stacking sequence can be rapidly computed. A genetic algorithm was then built to nd the ground state structures for an N layer polytype h n c m . The algorithm can converge to the lowest energy sequence in seconds without enumeration over the sequences. Plateaus in the convergence behaviour led us observe symmetry forbidden sequences, which shrinks the possible valid arrangements of h and c to form polytype structures.
The energetic and congurational features of the polytypism in CsPbX 3 discussed in this work are also instructive for organic-inorganic systems, an extension which would require additional parameterisation for molecular disorder, for example through an on-site electrostatic model. A strong molecular disorder contribution is expected when the A-site cation is methylammonium (MA + ) cation, due to its polar nature and relatively fast rotations. The formamidinium (FA + ) is less polar and with slower rotations so FAPbI polytypes may be expected to behave more similar to CsPbBr 3 , considering both its close to ideal tolerance factor (0.99) and ground state 3C phase.
Finally, we note the polytypes discussed here may appear as stacking faults or inclusions rather than as distinct phases for many halide perovskite compositions. We hope that this work will help to bridge the divide between the real-world complexity and atomistic modelling of halide perovskite crystals, and can support future polytype engineering.

Data availability
Structure models and the codes used to generate all structures to parameterize the Ising model Hamiltonian are available in an on-line repository at http://doi.org/10.5281/zenodo.5186354.

Author contributions
Z. L. performed all calculations for model building and testing. J. P. wrote the code for polytype generation. A. W. supervised the project. All authors contributed to the analysis and writing.

Conflicts of interest
There are no conicts of interest to declare.