Which conformations make stable crystal structures ? Mapping crystalline molecular geometries to the conformational energy landscape †

The ability to anticipate the shape adopted by flexible molecules in the solid state is crucial for engineering and predicting crystal packing and, hence, properties. In this study, the conformations adopted by flexible molecules in their crystal structures are assessed in terms of their relationship to the calculated global conformational landscape. The study quantifies the limits on molecular strain that can be induced by intermolecular interactions in single-component crystal structures of molecules with no intramolecular hydrogen bonding, demonstrating that some molecules are distorted by up to 20 kJ mol 1 by crystal packing forces. Furthermore, we find that crystallisation often selects high energy conformers, but only when the high energy conformer is more extended than the lower energy options, allowing for greater intermolecular stabilisation. Based on these observations, we propose that the crystallisability of conformers is assessed in terms of their energies and surface areas. We formulate this as a parameterised pseudo-energy related to molecular surface area, which leads to a dramatic improvement in our ability to predict the conformations adopted by molecules in their crystal structures.


Introduction
Crystal engineering involves the design of crystal structures and the deliberate targeting of solid-state properties through our understanding of structure-property relationships. Successful crystal engineering relies, in part, on a knowledge of the overall shape of molecular building-blocks, as well as the relative arrangement of functional groups within a molecule that can participate in structure-directing interactions. Molecular shape can be easily predicted for rigid molecules, but becomes more challenging to anticipate as molecular exibility is increased and the molecules of interest have a choice of conformer when self-assembling into a crystal. Conformation determines the overall molecular shape and position of functional groups, so different conformers may lead to very different crystal packing arrangements, ultimately inuencing the properties of the crystal. Therefore, conformational exibility can be seen as a potential obstacle to crystal engineering. This is particularly relevant for the engineering of pharmaceutical materials, where solid form properties of active pharmaceutical ingredients may be manipulated, either by selection between polymorphs, or the design of multi-component crystals, such as salts or cocrystals.
Here, we address the issue of predicting which conformation will be adopted in a exible molecule's crystal structure. We approach this problem from the point of view of computationally-guided crystal engineering and the development of methods for crystal structure prediction (CSP). CSP, as applied to organic molecules, has made substantive progress over the past decade. [1][2][3] The most commonly applied CSP method is based upon a global search of the lattice energy surface. 4 The main assumption is that the most likely observable crystal structures correspond to the lowest energy minima. This approach has been successfully applied to the prediction of both single- [5][6][7][8] and multi-component 6,7,9 molecular crystal structures, sometimes guiding experimental efforts towards the discovery of new solid forms. 10,11 In terms of crystal engineering and crystal structure modelling, a rigid molecule should be dened as a molecule that has a single conformation and where that conformation is unaffected by crystal packing forces. For a rigid molecule, CSP requires an exploration of possible structures as a function of unit cell dimensions, molecular positions and orientations. Once a set of crystal structures is generated, their ranking by relative energies requires the calculation of intermolecular interactions only. Putative crystal structures are oen separated by small energy differences, and the goal of CSP has encouraged the development of accurate models for intermolecular interactions. 12 The challenges involved with extending the capabilities of CSP methods to dealing with exible molecules have hindered applications to pharmaceutical molecules, and are apparent in the results of the CSP blind tests, 2,3 where success rates for exible molecules are low. The two principle difficulties associated with molecular exibility are that: 1. The exibility of intramolecular degrees of freedom means that the molecular geometry may be distorted under the inuence of intermolecular interactions in a crystal structure, to reach the optimum balance between inter-and intramolecular energies. Therefore, the impact of the crystalline environment on the molecular geometry must be modelled.
2. Conformational degrees of freedom increase the dimensionality of the energy landscape that must be explored when generating crystal structures.
The distortion of molecules away from their gas phase structures has recently been discussed in the context of conformational polymorphism. 13 Off-the-shelf force eld methods for modelling intramolecular interactions do not provide a sufficiently accurate description of the associated energetic cost. 14 Therefore, the high demands of CSP for exible molecules have required either solid-state electronic structure based approaches 15,16 or hybrid energy models, where intramolecular degrees of freedom are treated quantum mechanically and atom-atom models are used for intermolecular interactions. [17][18][19][20][21] The problem of sampling conformational phase space and predicting the relevant molecular geometries is less well developed. Conformational degrees of freedom can be sampled concurrently with packing parameters, but for highly exible molecules it can be advantageous to split the high dimensional global optimisation problem into two smaller problems: a search for relevant conformers followed by a crystal packing search for each. Apart from reducing the dimensionality of the search space, this approach leaves room for additional selection criteria when assessing which molecular conformations might lead to the most stable crystal structures. 22,23 A purpose of this work is to study the conformational energy landscapes of a large set of exible, pharmaceutical-like molecules, to investigate the relationship between the ensemble of possible conformers of the isolated molecules and the molecular geometries seen in their observed crystal structures. These comparisons should provide insight into conformational preferences in crystal packing and inform the future development of structure prediction methods for exible molecules. Therefore, we have carried out a series of calculations to address the following questions: Which conformation will a molecule adopt, and where on the energy landscape of the isolated molecule is this conformation found?
Apart from energy, can other molecular properties be identied that predict which conformation will be observed in crystal structures?
A starting assumption in this work is that the geometry adopted by a molecule in its crystal structure will be close to that of one of the conformers of the isolated molecule. However, in comparing the energy landscape of the isolated molecule to its crystalline geometry, we also investigate to what extent molecular geometries are distorted away from the geometry of the isolated conformer by intermolecular forces. The results will help quantify the importance of considering distortions away from idealised molecular geometries when attempting to predict the crystalline arrangement of organic molecules, in a general crystal engineering context as well as for the specic goal of ab initio CSP.

Methods
The aim of this investigation is to compare the geometries that organic molecules adopt in their crystal structures to the landscape of possible conformers of the isolated molecule. For molecules whose crystal structures have been determined, the crystalline molecular geometries are obtained from the atomic coordinates in the crystal structure. However, an exhaustive set of conformers is only available from computational methods. The methods used for conformational searches and energy calculations on the observed and calculated molecular geometries are described below.

Crystal structure optimisations
The rst aim is to obtain an accurate geometry and energy for all conformations of a molecule in all of its observed crystal structures. The energies of the molecular geometries observed in the known crystal structures were evaluated aer lattice energy minimisation of the experimentally determined crystal structures, removing the inuence of experimental errors in atomic coordinates.
For consistency, we apply the same level of theory in an atomic basis set method for crystal and molecular energy calculations: we use dispersion-corrected density functional theory (DFT-D), which has been shown to provide an accurate description of the structures and energies of molecular organic crystals. 15,24,25 All DFT-D calculations were performed using the CRYSTAL09 (ref. 26 and 27) soware, with the B3LYP hybrid functional, 28,29 6-31G* basis set, and the rescaled empirical dispersion correction suggested by Civalleri et al. 30 All atomic positions were relaxed during lattice energy minimisation, with unit cell dimensions constrained at experimentally determined values. All single molecule calculations were also performed in CRYSTAL09, using the same functional, basis set, DFT integration grid and optimisation thresholds as the periodic lattice energy minimisations. Full details of DFT-D calculations are provided in the ESI. † All symmetry-independent molecules were extracted from the optimised crystal structures and single-point molecular energy calculations were performed to give the energies of the molecules in their crystalline geometry (see point A in Fig. 1). These molecular geometries were also taken as starting points for unconstrained molecular geometry optimisations, yielding the local energy minimum on the conformational energy surface associated with the crystalline molecular geometry (point B, Fig. 1). We refer to the energy difference between the single-point and optimised molecular geometries as the intramolecular strain energy, DE strain .

Conformational searches
We applied a low-mode conformational search (LMCS 31,32 ) method, as implemented in MacroModel, 33 to generate as complete and unbiased sets of conformers as possible for each molecule studied. LMCS is a mode-following algorithma starting molecular geometry is perturbed along one or a combination its calculated normal modes before re-minimising. Full details of the search parameters used here are available in the ESI. † All unique conformers resulting from an initial search using the OPLS2005 (ref. 34) force eld were re-optimised using DFT-D, applying the same computational parameters described above for the crystalline molecular geometries. The result should be a complete set of low energy conformers for each molecule (points 1-4 in Fig. 1), assuming that the conformer search was complete and that all local minima on the conformational energy surface described by DFT-D have a corresponding local minimum on the force eld described energy surface. The predicted conformations were re-clustered to remove duplicates aer DFT-D re-optimisation and, in most cases, additional structures were removed at this stage, suggesting that the DFT-D conformational energy surface is smoother than the OPLS surface.
Geometric comparisons were performed to locate the optimised conformations taken from the observed crystal structures in the sets of predicted conformers (e.g. point B ¼ point 2 in Fig. 1). We refer to the energy difference between the conformer corresponding to the crystalline geometry and the lowest energy computer-generated conformer as the relative conformational energy, DE conf (Fig. 1).
As further characterisation of the conformational energy landscape, the Connolly surface area of each DFT-D-optimised conformer was calculated using Material Studio 35 with a probe radius of 1.8Å.

Choice of molecules
The molecules included in the study were chosen to explore a range of molecular size and exibility. To focus on pharmaceutical-like molecules, we restricted ourselves to molecules that satisfy Lipinski's rule of ve, 36 placing limits on molecular mass, number of hydrogen bond donors and acceptors and predicted octanol-water partition coefficient. Molecules with the potential for intramolecular hydrogen bonding are excluded to focus on molecules without strong, directional non-bonded intramolecular interactions. The inuence of intramolecular hydrogen bonding on the conformations adopted in crystal structures is the subject of an on-going study.
Molecular exibility was measured by the number of rotatable (exocyclic) single bonds in the molecule, excluding terminal methyl groups. Three molecules were chosen with each of 4, 5, 6, 7 and 8 such rotatable bonds, yielding a test set of 15 molecules. While their presence was not used as a criterion for the selection of molecules, the set also includes molecules with exible ring systems, whose exibility is sampled during the low-mode conformational search.
All chosen molecules have experimentally determined crystal structures and the molecules were selected to include diversity in known polymorphic behaviour. For each level of exibility, one molecule was chosen with known conformational polymorphism, one with known packing polymorphism (alternative crystal packings of the same molecular conformation) and one molecule with no reported polymorphism. To maximise the data obtained from each molecule, molecules with the most reported polymorphs at each level of exibility were selected. For molecules with the same number of known polymorphs, and for the non-polymorphic molecules, the molecule with the highest quality reported crystal structure (as measured by the R factor) was chosen. To our knowledge, no solvate structures have been reported for the molecules in our set, so the study includes only single-component crystal structures. We refer to molecules by their 6-letter Cambridge Structural Database (CSD) 37 REFCODE family names which are provided, along with chemical diagrams, in Table 1.

Intramolecular strain energy
Any distortion in the crystal structure away from the molecular geometry of the isolated molecule must involve an increase in intramolecular energy, DE strain , which we calculate with reference to the nearest local minimum on the isolated molecule's energy surface (see Fig. 1).
The distribution of DE strain is summarised in Fig. 2 for the 36 distinct molecular geometries in the 29 lattice energy minimised crystal structures of the 15 molecules (see Table S1 in ESI †), including molecular geometries from different polymorphs and independent molecules within the same crystal structure (for structures with multiple molecules in the asymmetric unit).
There is a correlation, albeit weak, between DE strain and the geometrical distortion of the crystalline molecular geometry away from the geometry of the isolated molecule (see Fig. S3 and S4 in ESI †). However, there are cases where a large molecular strain energy results from very small changes to stiff intramolecular degrees of freedom or where large geometry changes come at a relatively low energetic cost. The latter typically correspond to rotation about so dihedrals (for example, see Fig. 3a).
The majority (75%) of molecules are strained by less than 10 kJ mol À1 and it is unsurprising in these cases that the increase in intramolecular energy can be compensated by improved Table 1 Chemical diagrams and CSD REFCODEs for the studied molecules. The three numbers in parentheses following each REFCODE refer to the number of polymorphs, the number of independent molecular geometries (summed over all polymorphs) and the number of unique conformers found in all polymorphs, respectively    The results that we nd here are in broad agreement with those recently reported for a large set of conformational polymorphs. 13 At the low energy end of the DE strain distribution, we note that none of the molecules are completely unaffected by crystal packing. The smallest calculated value is DE strain ¼ 2.4 kJ mol À1 for the triclinic polymorph of 1,4-dibenzoylbutane (CSD REFCODE GALCAX01), corresponding to an all-atom RMS difference in atomic positions of 0.03Å between gas and solid phase molecular geometries (Fig. 3c). From a structure prediction perspective, these results demonstrate the limitations of any rigid-molecule treatment of even moderately exible molecules, where the molecule in the crystal structure is assumed to be undistorted from its gas phase geometry. Energy contributions of 2-3 kJ mol À1 to the relative stability of crystal structures can be crucial in the context of CSP, where energy separations between computer-generated crystal structures are oen 1 kJ mol À1 or less. 4 3.2 Which conformation is adopted in a crystal structure? Energetics Molecules are not restricted to their lowest energy conformer during crystallisation and the best balance of inter-and intramolecular interactions sometimes involves a higher energy conformer. We compare the conformers observed in known crystal structures with the sets of predicted conformers to investigate the frequency with which higher energy conformers are adopted in the crystal structures of exible organic molecules.
We dene the conformer associated with the crystalline molecular geometry as the structure reached aer DFT-D optimisation of the molecule extracted from the crystal structure. From our initial set of 36 independent molecular geometries, we nd only 21 distinct conformers because some molecular geometries (e.g. of the packing polymorphs) converge to the same conformer on the gas phase molecular energy surface.
For all molecules apart from one, we nd a corresponding conformer in the sets generated by the conformation search. For the remaining molecule, SEVJAF, the original LMCS search failed to return the conformation obtained from the known crystal structure. However, the observed conformer was obtained aer initiating a second, longer, search from the geometrically closest conformer in the initial set. For the purposes of this study, we assume that the initial set of SEVJAF conformers is representative of the conformational surface, despite being incompletely explored, and we proceed with the original set of SEVJAF conformers supplemented by the conformer corresponding to the observed crystalline molecular geometry.
The results are summarised in Table 2 using the quantity DE conf (Fig. 1), the energy difference between the conformer corresponding to the crystalline molecular geometry and the global minimum conformer found by the conformational search. Surprisingly, only 6 of the 15 molecules adopt the global minimum conformer in one of their known crystal structures. Two of these 6 molecules display conformational polymorphism, adopting a non-global minimum conformer in one polymorph. Therefore, a non-global energy minimum conformer is observed in the crystal structures of 11 of the 15 molecules.
It is clear from these results that a range of molecular conformers must be considered when assessing which molecular conformation will be adopted in the solid state. In the majority of cases, the crystalline geometry is energetically close to the most stable conformer of the isolated molecule: of the 21 conformers in this study, the crystalline conformation is within 5.9 kJ mol À1 of the global minimum in 15 cases. The remaining 6 are more interesting, ranging from DE conf ¼ 13.9 to 25.6 kJ mol À1 ; the highest energy observed conformer in this set of molecules is DE conf ¼ 25.6 kJ mol À1 in crystal structure DAD-NUR. These values suggest that, for a exible organic molecule, conformations up to about DE conf ¼ 26 kJ mol À1 must be considered as possibilities for what will be observed in the solid state. These results are discouraging for structure prediction and crystal engineering: such an energy window can encompass large numbers of conformers (Table 2), whose shapes and arrangement of functional groups can vary greatly. Using the molecular energy as a sole selection criterion, any conformer within this range should be considered a possibility for what will be observed in a molecule's crystal structure. This raises the question of whether non-energetic descriptors of molecular conformers can be found that highlight those conformers that are most likely to form low energy crystal structures, or lter out those that will not lead to stable crystal structures.
Molecular symmetry may be benecial for close packing and it has been shown that molecules that adopt a centrosymmetric geometry in their crystal structure almost exclusively crystallise with molecules on crystallographic special positions. 38,39 However, it is not clear what role symmetry plays in determining the conformer that a molecule adopts. Here, we observe that, of the six molecules with potential centrosymmetry (FAHNOR, ODNPDS, FIBKUW, GALCAX, CELHIL SIKRIN), ve adopt centrosymmetric or nearly centrosymmetric conformations in their crystal structures. For SIKRIN and one polymorph of FAHNOR, the observed centrosymmetric conformer is the global minimum conformer. In all other cases, the global minimum energy conformer lacks centrosymmetry. FIBKUW adopts its lowest energy centrosymmetric conformer, while CELHIL and GALCAX adopt centrosymmetric conformers that are high in energy, in favour of alternative, lower energy centrosymmetric and non-symmetric conformers. The sample is too small to judge the overall importance of centrosymmetry in conformer selection, and further studies focussed on symmetry could be interesting.

Molecular surface area as a non-energetic criterion for conformer selection
One consequence of molecular exibility for many of the molecules studied here is the possibility of both compact and extended conformers. The lowest energy conformers tend to be relatively compact, stabilising the conformer by bringing structurally remote atoms into non-bonded contact (e.g. Fig. 4a). Extended, open conformations lack this intramolecular stabilisation, while making more molecular surface area available for intermolecular interactions in the solid state. The observed conformer in crystal structures must represent a balance between these intra-and intermolecular non-bonded interactions.
Indeed, in analysing the sets of observed and computergenerated conformers, we nd that the conformer that is found in a molecule's crystal structure tends to have a relatively extended geometry (e.g. Fig. 4b). Similar observations have been made regarding a preference for extended conformations of ligands bound to proteins. 40 The potential for a molecule to form close intermolecular interactions is limited, at a basic level, by its accessible surface areathe surface area that can be brought into contact with an adjacent molecule. Therefore, to quantitatively compare conformers, we calculated the Connolly surface area 41 of all predicted conformers of each molecule, providing the contact surface mapped out by a spherical probe rolling across the van der Waals surface of the molecule. We used a probe radius of 1.8Å, the approximate radius of a methyl group, to provide a measure of the molecular surface area that is accessible for close intermolecular contact with other copies of itself in a crystal structure.
The results for three of the most exible molecules are summarised in Fig. 5, as plots of relative conformational energy against Connolly surface area. It is clear from these results that conformers with large surface areas are favoured in the crystal structures of exible molecules. The molecules summarised in Fig. 5 adopt conformers with low (DANQEP, Fig. 5a), middling (HAJYUN, Fig. 5b) and large (DADNUR, Fig. 5c) DE conf , while in each case the crystalline conformers are amongst those with the largest possible accessible surface areas.
This observation is not limited to these three molecules; for all 15 molecules, a high energy conformer is adopted in the crystal structure only if it has a larger surface area than that of the global minimum energy conformer (surface area vs. energy plots for all remaining molecules are available in the ESI, Fig. S5-S9 †). On the basis of these observations, it appears that compact high-energy conformations are unlikely to occur in crystal structures.

Relating intermolecular interaction energies to molecular surface area
To further explore the notion that accessible surface area is an indicator of the stabilisation that can be achieved by intermolecular interactions in the crystal, we examined the variation in measured sublimation enthalpies of a series of small, rigid organic molecules as a function of their calculated surface areas. We believe that increased stabilisation due to higher surface area is dominated by dispersion interactions. To focus on dispersion contributions, rather than more specic electrostatic contributions to lattice energies, we limited this study to hydrocarbon crystal structures: details of the molecules and their measured sublimation enthalpies can be found in Table  S2 in the ESI. † The relationship between sublimation enthalpy and molecular surface area is remarkably linear (Fig. 6) with very little scatter about a linear regression with a gradient of m EvsSA ¼ 0.75 kJ mol À1ÅÀ2 . This suggests that the increase in intermolecular contributions to lattice energy with increasing molecular surface area should be predictable, at least in the absence of strong, specic intermolecular interactions.   7 summarises the energies and surface areas for all conformers of the 15 molecules in our test set; each conformer is represented by its relative energy and relative Connolly surface area, calculated with respect to the energy and surface area of the lowest energy conformer of the same molecule. The gradient, m EvsSA derived from the sublimation enthalpies of rigid hydrocarbons is superimposed on the energy/surface area distribution. We note that all conformers found in the observed crystal structures (highlighted in red, Fig. 7) fall below and to the right of the m EvsSA line, where the expected intermolecular stabilisation due to greater surface area outweighs the higher intramolecular energy than the global minimum conformer.
The calculated molecular surface area seems to be predictive of what conformers will be found in crystal structures and we propose that conformers falling in the shaded area (Fig. 7) compact, high energy conformers whose higher intramolecular energy is not compensated by their expected intermolecular stabilisationare unlikely to be observed.   We propose that this observation can be formulated as a biasing term added to the calculated conformational energy: where the DFT-D calculated intramolecular energies of the isolated conformers are adjusted by a pseudo-energetic contribution, DE pseudo,inter , which predicts the relative intermolecular interaction energy of conformers in their potential crystal structures from their relative accessible surface areas.

Re-ranking of predicted conformers
Comparing the distributions of all predicted conformers distributed according to the bare DFT-D energy (Fig. 8a) and E conf,biased (Fig. 8b), we nd that the surface area correction term does not dramatically perturb the total distribution of energies. E pseudo,inter has little effect on the total range in energies, while skewing the distribution of structures slightly towards lower relative energies. Most importantly, the location within the overall ensemble of the conformers that are adopted in the crystal structures (red bars in Fig. 8) is changed dramatically. The application of the E pseudo,inter correction leads to a signicant enrichment of the observed conformers in the low energy region of the distribution. This is evident from a comparison of the cumulative distribution functions in Fig. 8: all crystalline conformers are found within the rst 7.1% of the overall distribution of E conf,biased , while they range over a much greater proportion (39.5%) of the overall distribution of pure E conf .
The improvement is also reected in the ranking of the observed conformers for each individual molecule ( Table 2). The ranking of all of the high energy conformers is dramatically improved aer applying the pseudo-intermolecular energy correction. Furthermore, the energy difference, DE conf,biased from the global minimum is decreased in almost every case.
Practically, this demonstrates that the combined consideration of calculated intramolecular energy and molecular surface area can narrow the ensemble of conformers that must be considered as likely candidates for what will be found in crystal structures. Specically, the application of the surface area pseudo-energy correction that we propose here means that the conformers that are seen in observed crystal structures are found much nearer the top of ranked lists of possible conformers than when ranking is based on the calculated energy of the isolated molecule.
In terms of applications to CSP, this extends the size of molecule that can be practically studied, by focussing computing effort on the smaller set of conformers that are able to achieve the best balance of inter-and intramolecular energies in the solid state. In the more general context of crystal engineering, the combined criteria based on calculated energies and molecular surface area can help anticipate the molecular shape and spatial arrangement of functional groups on a molecule, which determines the mutual arrangement of molecules in the solid state, and therefore many materials properties of interest.

Conclusions
One important conclusion from this study is that exible molecules frequently do not adopt their lowest energy (global minimum) conformer in their crystal structures, but oen assume a higher energy conformer to optimise the balance of inter-and intramolecular interactions. The observed crystalline conformer is usually energetically close to the global minimum, but in several cases molecules are found to adopt high energy conformers, up to 25 kJ mol À1 above the most stable gas phase conformer. As a general rule, high energy conformers are only adopted when there is an associated increase in accessible surface area, increasing the molecule's potential to form stabilising intermolecular interactions.
We propose that the relative conformational stabilisation due to intermolecular interactions in a crystal can be estimated from an empirical linear relationship with accessible molecular surface area. This relationship has been parameterised from the variation of measured sublimation enthalpies with molecular surface area for a series of rigid hydrocarbons. By adding this pseudo-energy to the calculated intramolecular energies of conformers, we achieve a dramatic enrichment in the ranking of observed conformers in the low energy range of the global conformational landscape.
A further observation is that, for any molecule with moderate exibility, there is some molecular distortion away from the gas phase geometry due to intermolecular interactions in a crystal structure: DE strain is never zero and can be as large as about 20 kJ mol À1 for some molecules. This is important for anticipating overall molecular shape when considering how a molecule might pack in its crystal structures. Molecular geometries can be signicantly distorted by the close-packed environment in a crystal structure.
The ndings are clearly relevant to CSP, where conformational diversity can quickly become the bottleneck in fully exploring structural space for large, exible molecules. Any narrowing down of the conformational space that must be considered improves the efficiency of global structure searching, reduces the resources that are required to perform predictions for a given molecule and effectively increases the range of these methods. The observations made here are also relevant in more traditional crystal engineering approaches, where the geometrical attributes of a molecule, particularly the arrangement and relative orientation of functional groups, must be anticipated in order to use their interactions to direct the arrangement of molecules in the solid state.