Sarah L. Price
Department of Chemistry, UCL, 20 Gordon Street, London WC1H 0AJ, UK. E-mail: s.l.price@ucl.ac.uk; Fax: +44 (0)20 7679 7463; Tel: +44 (0)20 7679 4622
First published on 22nd November 2013
Currently, organic crystal structure prediction (CSP) methods are based on searching for the most thermodynamically stable crystal structure, making various approximations in evaluating the crystal energy. The most stable (global minimum) structure provides a prediction of an experimental crystal structure. However, depending on the specific molecule, there may be other structures which are very close in energy. In this case, the other structures on the crystal energy landscape may be polymorphs, components of static or dynamic disorder in observed structures, or there may be no route to nucleating and growing these structures. A major reason for performing CSP studies is as a complement to solid form screening to see which alternative packings to the known polymorphs are thermodynamically feasible.
Key learning points
(1) Crystal Structure Prediction (CSP) methods only generate ordered crystal structures and an approximation to their relative energies.
(2) The extent of the search needs to be appropriate for the molecular flexibility and aim of the study, but typically 10³–10⁶ plausible crystal structures are generated.
(3) Lattice energy landscapes are very demanding of the computational method used, which usually involves high-level quantum mechanical calculations on the molecule or crystals. These calculations are for perfect static lattices, neglecting the effects of temperature.
(4) The crystal energy landscape is the set of crystal structures that are sufficiently low in energy to be thermodynamically plausible as polymorphs. The complexity of the crystal energy landscape is determined by the molecule, and can differ markedly between closely related molecules, such as isomers.
(5) The crystal energy landscape usually includes many more structures than experimentally observed polymorphs. Understanding why, in terms of kinetics of crystallisation, is the main challenge to polymorph prediction.
Since polymorphs differ in their physical properties, the consistent production of the same polymorph is essential for all molecular crystalline products. This is particularly important for the pharmaceutical industry, as a change in polymorph can change the solubility and dissolution rate. A knowledge of the solid-state structural landscape of a molecule5 and the interdependence of the structure, properties, processing and performance of a drug (the pharmaceutical materials science tetrahedron)6 is essential for choosing the solid state form with the optimal compromises of physical properties, and to exclude the crystallisation of unwanted forms in industrial manufacture. Considerable expertise has been built up within the pharmaceutical industry in the multi-disciplinary searching and characterising of organic solids,7 with CSP studies emerging as a complementary tool.8
Research into polymorphism is changing the view of the complexity of the organic solid state. Although there are crystal structures of polymorphs for only about 5% of the molecules in the Cambridge Structural Database (CSD),9,10 this reflects one of the primary uses of crystallography, to confirm molecular structure, and the difficulty of growing single crystals suitable for X-ray analysis, rather than the incidence of polymorphism. A survey11 by one of the polymorph screening companies, covering 245 molecules that they had screened, reported that 50% showed polymorphism and 90% had multiple crystalline and non-crystalline solid forms. (The term form is wider than polymorph, as it also includes solvates etc.) Correspondingly, the use of CSP methods has developed into calculating and interpreting the crystal energy landscape of a molecule,12 the set of crystal structures which are sufficiently low in energy to be thermodynamically feasible polymorphs.
The crystal energy landscape rarely contains only one crystal structure, i.e. it is relatively rare for a molecule to have one way of packing with itself that is significantly more favourable than any other. The fields of crystal engineering and self-assembly are dominated by multicomponent systems because of the scarcity of molecules that can close pack with strong intermolecular interactions defining a unique packing in all three dimensions. Hence, the main use of CSP studies is to find the range of different packings that are thermodynamically plausible crystal structures. The most stable should be an observed crystal structure. In the cases when there is one structure that is significantly more stable than any others, for example isocaffeine, which has an unusually large energy gap of about 6 kJ mol−1 (see Fig. 1), polymorphs are very unlikely. The more common case, such as the isomer caffeine (Fig. 1), where there are structures that are thermodynamically competitive, requires qualitative interpretation of the crystal energy landscape. How are these structures related? How does the relationship between the structures limit the possibilities of the structures nucleating and growing under different conditions? Thus, CSP studies aid thinking about what alternative outcomes there may be for crystallisation processes. As such they are a complement to experimental screening aimed at finding all solid forms, where it is clear from the huge developments in this area that it is impossible to cover the full range of experimental conditions that have led to the discovery of new polymorphs.14
Fig. 1 The contrasting distributions of CSP generated crystal structures of (top) isocaffeine and (bottom) caffeine.13 Each symbol gives the lattice energy and packing coefficient (proportional to density for isomers) of a mechanically stable crystal structure. The crystal energy landscape of isocaffeine contains only one low energy structure (which corresponds to the known structure), whereas that for caffeine contains a group of layer structures with different stackings of the molecule. The difference in the crystal energy landscapes can be rationalised by the intermolecular interactions as represented by the electrostatic potential on the van der Waals surface plus 1.2 Å; +1.5 V corresponds to an interaction energy of +1.4 kJ mol−1 with a positive point charge of 0.01e. The low temperature experimental structure of caffeine has disorder components corresponding to rotation by 180° about the two marked axes.13
Fig. 2 Summary of the results of the most recent 2010 blind test of crystal structure prediction.15 x/y denotes that x of the y groups entering had the correct structure within their three submissions.
The methodology of CSP studies is evolving fast, and so for details of the range of methods used by the groups that accepted the challenge see references within,15 and check citations of this paper for new methods. This article will concentrate on the basic considerations for a CSP study, done in conjunction with experimental work, that are in common with the methods that have been most successful in the blind tests. These methods can, in principle, be developed so that they do not rely on the availability of experimental data, which is important for the molecular design of new organic materials such as energetics. However, the interpretation of the output of a CSP study in terms of its implications for crystallisation control often needs to build on the increasing experience of using CSP studies as a complement to interdisciplinary experimental work.
With the molecular bonding chosen, we can define our crystal energy as relative to infinitely separated molecules in their lowest energy conformation. By ignoring molecular vibrations (even the zero-point motions) and so considering a perfect infinite static lattice compared with the infinitely separated molecules (all nominally at a temperature of 0 K), we can approximate our crystal energy as the lattice energy Elatt. The effect of pressure can be added, but the differences in density between polymorphs are usually so small that this term is generally neglected unless the experimental work is done with applied pressure.
The lattice energy, Elatt, can be conceptually broken up into two components, the intermolecular interactions between the molecules, Uinter, and the change in the molecular energy between the crystal and gas, ΔEintra, so that
Elatt = Uinter + ΔEintra
Many CSP methods are based on making this division, and indeed the development of CSP methods started with molecules whose rigidity meant that the second term could be ignored.
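As a minimal illustration of this division, the sketch below adds a conformational penalty to an intermolecular energy for two hypothetical structures. The class name, field names and all energies (in kJ mol−1) are invented for illustration; they do not come from any CSP program or from the studies cited here.

```python
from dataclasses import dataclass


@dataclass
class LatticeEnergy:
    """Elatt = Uinter + dEintra, all in kJ/mol (hypothetical values)."""
    u_inter: float         # intermolecular energy of the static lattice
    e_conf_crystal: float  # energy of the molecule in its crystal conformation
    e_conf_gas: float      # energy of the lowest-energy isolated conformer

    @property
    def delta_e_intra(self) -> float:
        # Penalty paid for distorting the molecule away from its gas-phase minimum
        return self.e_conf_crystal - self.e_conf_gas

    @property
    def e_latt(self) -> float:
        return self.u_inter + self.delta_e_intra


# Two invented structures: the second packs better but strains the molecule more.
relaxed = LatticeEnergy(u_inter=-118.0, e_conf_crystal=1.0, e_conf_gas=0.0)
strained = LatticeEnergy(u_inter=-124.0, e_conf_crystal=9.0, e_conf_gas=0.0)

for name, s in (("relaxed", relaxed), ("strained", strained)):
    print(f"{name}: Elatt = {s.e_latt:.1f} kJ/mol (dEintra = {s.delta_e_intra:.1f})")
```

In this made-up case the better intermolecular packing of the strained conformer is outweighed by its conformational penalty, which is why the second term can only be dropped for rigid molecules.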
Small adjustments in conformation, such as changes in torsion angles by a few degrees, rotation of a methyl group or amine pyramidalisation, can change the lattice energy significantly. (Indeed, an investigation of how much lattice energies change if the molecular structure is held rigid at the conformation determined by crystallography at different temperatures, or worse still, if the proton positions are not corrected for the systematic error in X-ray structures that shortens bond lengths to hydrogen, can be very instructive. The use of calculations to confirm or correct the proton positions, which cannot be accurately located from the diffraction data, is increasing.) These minor types of conformational change can be taken care of at the final stage of structure optimisation in a CSP study. However, even for a rigid molecule, the use of the molecular structure taken from the crystal can bias the search towards the observed packing, so the input conformation has to have the isolated molecule geometry and symmetry, and is usually derived by an ab initio optimisation of the isolated molecule.
Larger differences in conformation, such that the close packings of the molecular van der Waals surfaces would be qualitatively different, have to be covered in the search. Although the molecular conformations observed in crystal structures are generally low in energy, how close they are to a local or global minimum energy conformer20 for the isolated molecule depends on the flatness of the torsional potentials. If each conformation is in a deep energy well, then separate searches may be performed for each conformer. However, if there is a low energy barrier between conformers that give rise to a very different overall shape, then this flexibility needs to be included in the search from the earliest stages.
It is challenging to evaluate a reliable conformational energy surface for larger molecules, even in isolation, as the intramolecular dispersion plays an important role and this is not well captured by ab initio methods. For example, routine self-consistent-field (SCF) calculations give a broad minimum in the conformational profile for phenyl rotation in fenamic acids (2-(phenylamino)-benzoic acids) in the region which corresponds to a low population of experimental structures21 and to a local maximum (∼5 kJ mol−1) for higher quality ab initio methods that include electron correlation. Searches where the conformational energy term favours the wrong conformations will generate structures that are too unrealistic to be useful as starting points for refinement with more expensive methods. This can be a problem with many force-fields that are used in biological modelling. Hence the GRACE program uses many periodic dispersion corrected density functional calculations to parameterise a molecule-specific force-field for use in its search,22 and CrystalPredictor uses an interpolated grid of isolated molecule ab initio intramolecular energies.23
Surveys of the conformations of related molecular fragments within their crystal structures are generally good guides to conformational preferences in other phases, and hence a useful cross-check that the appropriate range of conformations could be generated in the CSP study. Once the crystal energy landscape has been calculated, a comparison of the conformations within the crystal structures will reveal which subset of the range of conformations can pack densely with favourable intermolecular interactions. For example, for olanzapine, there are two low energy conformational wells, but one conformation does not produce plausible crystal structures (Fig. 4).24 In contrast, GSK269984B has very different gross as well as hydrogen bonding conformations on its crystal energy landscape, raising the question as to why the conformation is the same (apart from the hydrogen bonding proton), within all the experimental crystal structures.19
Fig. 5 The number of molecules in the asymmetric unit, Z′, can be important, as in the case of 7-fluoroisatin, where the expected doubly hydrogen bonded dimer appears in form I and the closely related Z′ = 2 form III, but in the most stable form II the two independent molecules use different hydrogen bond acceptors, a motif that is intrinsically Z′ = 2 and could not have been generated in the Z′ = 1 search. Catemeric structures are found higher in energy on the crystal energy landscape.26
The cost versus benefits of the completeness of the search, which depends on the aim of the study, also applies to the choice of the 230 space groups to be covered. Most organic molecules crystallise in a fairly small range of monoclinic, triclinic and orthorhombic space groups, and it would not be worth the computational expense of including tetragonal, hexagonal or cubic space groups unless there was some experimental evidence, high molecular symmetry or a tendency of your type of molecule to suggest it would be worthwhile. Only a small number of space groups can be adopted by a chiral molecule, and a search in just 5 of these is likely to be adequate. For non-chiral molecules and racemic compounds, you would also need to consider the space groups that contain inversion operators or mirror planes, but with an additional 15 space groups you would cover the most populated.27
The vastness of the search space and the computational cost of accurate methods of evaluating the lattice energy require that all searches are hierarchical, in that some approximate estimate is made of the relative energies in the search, duplicates are removed and then only the more promising structures are reassessed with more accurate energy evaluations. Since lattice energy minimisation techniques generally only go to the nearest local minimum, there are trade-offs between the completeness of the search method, the accuracy of the lattice energy being minimised, and the rate at which structures are discarded. These, like the human and computing resources required, can depend strongly on the molecule as well as on the CSP method and computer system; anyone embarking on CSP studies should consult the published papers and documentation of the program suite chosen after reviewing the current alternatives.1,15 Physical insight into the basis of the calculation can prevent disappointment. For example, the “Prom” search approach27 is based on sequentially building up clusters by adding crystallographic symmetry elements and continuing or discarding structures according to a simple force-field evaluation. In contrast, the MOLPAK28 search is based on seeking densely packed structures in common coordination types using a pseudo-hard sphere model. (MOLPAK was designed for energetic materials where density is a key property.) Both programs will very quickly generate a few thousand plausible crystal structures for a rigid molecule, with the Prom procedure more liable to miss structures which do not contain a strongly cohesive centrosymmetric (hydrogen bonded) dimer, and MOLPAK more likely to miss structures which do.
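The hierarchical logic described above (cheap ranking, duplicate removal, accurate re-ranking of a shortlist) can be sketched generically as follows. This is a toy under stated assumptions, not the algorithm of Prom, MOLPAK or any other program: the `Structure` record, the label-based duplicate test and all energies are hypothetical.

```python
from typing import Callable, List, NamedTuple


class Structure(NamedTuple):
    label: str       # packing identifier; equal labels stand in for "same packing"
    cheap: float     # approximate lattice energy from the search model, kJ/mol
    accurate: float  # lattice energy from the expensive model, kJ/mol


def hierarchical_rank(
    candidates: List[Structure],
    same_structure: Callable[[Structure, Structure], bool],
    n_keep: int,
) -> List[Structure]:
    # Stage 1: rank everything with the cheap energy estimate.
    ranked = sorted(candidates, key=lambda s: s.cheap)
    # Stage 2: remove duplicates, keeping the lowest-energy representative.
    unique: List[Structure] = []
    for s in ranked:
        if not any(same_structure(s, u) for u in unique):
            unique.append(s)
    # Stage 3: re-rank only the most promising structures with the accurate energy.
    return sorted(unique[:n_keep], key=lambda s: s.accurate)


candidates = [
    Structure("a", -100.0, -108.0),
    Structure("a", -99.5, -108.0),   # duplicate of "a" found twice by the search
    Structure("b", -98.0, -112.0),
    Structure("c", -90.0, -120.0),   # poorly ranked by the cheap model
    Structure("d", -97.0, -105.0),
]
final = hierarchical_rank(candidates, lambda s, t: s.label == t.label, n_keep=3)
print([s.label for s in final])
```

Note that structure "c", which the accurate model would have ranked first, is discarded at the cheap stage. This is the trade-off between the accuracy of the search model and the rate at which structures are discarded.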
Orders of magnitude more plausible crystal structures would be generated by more extensive search methods, such as GRACE,22 or CrystalPredictor,23 where there are parameters for monitoring the rate of appearance of new structures to converge the search, as evaluated by fairly accurate, molecule specific force-field models. These programs can also handle very flexible molecules, and multi-component systems, where the search may be terminated for practical reasons at over a million plausible structures.
The issue of removing duplicate structures is not straightforward, mirroring the difficulty in defining the distinction between polymorphs and experimental sample and temperature dependent structural variations.9 Simulated powder X-ray diffraction patterns are often used, but there are cases where crystal structures that differ when you look at the elements can have very similar powder patterns. Overlaying the molecular coordination sphere, usually a 15 molecule cluster, could exclude polymorphs with longer range packing differences.
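A minimal sketch of pattern-based duplicate detection is given below. It assumes that each simulated powder pattern has already been sampled as intensities on a common 2θ grid, and it uses a plain normalised overlap (cosine similarity) with a hypothetical threshold of 0.98; the similarity measures and thresholds used in actual CSP codes differ and are generally more elaborate.

```python
import math
from typing import Dict, List, Sequence


def pattern_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Normalised overlap of two patterns on the same 2-theta grid (1 = identical shape)."""
    if len(p) != len(q):
        raise ValueError("patterns must be sampled on the same 2-theta grid")
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0.0 else 0.0


def cluster_by_pattern(patterns: Dict[str, Sequence[float]],
                       threshold: float = 0.98) -> List[List[str]]:
    """Greedy clustering: a structure joins the first cluster whose
    representative (first member) it matches above the threshold."""
    clusters: List[List[str]] = []
    for name, pattern in patterns.items():
        for cluster in clusters:
            if pattern_similarity(pattern, patterns[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Invented five-point "patterns" purely for illustration.
patterns = {
    "s1": [0.0, 10.0, 0.0, 5.0, 0.0],
    "s2": [0.0, 9.8, 0.0, 5.1, 0.0],   # nearly the same peaks as s1
    "s3": [5.0, 0.0, 10.0, 0.0, 0.0],  # peaks at different angles
}
print(cluster_by_pattern(patterns))
```

A pair flagged as similar in this way is only a candidate duplicate: as noted above, distinct structures can have very similar powder patterns, so a second criterion such as a cluster overlay is still needed.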
Finally, testing that a computer generated crystal structure has no imaginary frequency vibrational modes of the crystal and that it is mechanically stable, can reveal that optimising the crystal structure within a given space group has produced a transition state between lower symmetry structures with more independent molecules in the asymmetric unit. If the energy lowering from removing the symmetry constraint is small, then the molecular motion in the crystal may mean that it is observed in the higher symmetry structure.
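The idea can be illustrated with a one-dimensional toy model, which is an assumption made purely for illustration. Let x be a symmetry-breaking coordinate that is fixed at zero by the space group constraint, and let the energy be a double well. The symmetric structure is then a stationary point with negative curvature (an imaginary frequency), lying between two equivalent lower symmetry minima. The functional form, parameters and units are arbitrary.

```python
import math
from typing import Callable


def energy(x: float, a: float = 1.0, b: float = 1.0) -> float:
    """Toy double-well energy along a symmetry-breaking coordinate x."""
    return b * x**4 - a * x**2


def curvature(f: Callable[[float], float], x: float, h: float = 1e-3) -> float:
    """Central finite-difference estimate of the second derivative of f at x."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


x_sym = 0.0                     # structure optimised with the symmetry constraint
x_min = math.sqrt(1.0 / 2.0)    # analytic minimum sqrt(a / 2b) for a = b = 1

for label, x in (("symmetric", x_sym), ("lower symmetry", x_min)):
    k = curvature(energy, x)
    kind = "minimum" if k > 0 else "transition state (imaginary mode)"
    print(f"{label}: E = {energy(x):+.3f}, curvature = {k:+.2f} -> {kind}")

# Energy lowering on removing the symmetry constraint
print("lowering:", energy(x_sym) - energy(x_min))
```

In a real crystal the same test is applied to all vibrational modes and to the elastic constants; when the energy lowering is small compared with the thermal energy, the observed structure may still be the higher symmetry average.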
Any CSP program that will only automatically and completely search through the vast swathes of possible ordered crystal structures would waste considerable computer and human time. A little knowledge of crystallography of the appropriate family of molecules, and thoughtful choices appropriate to the aim of the study can usually define a practically useful search that can be complemented by other calculations. These should be on the known structures, including those derived by computational desolvation (removing solvate molecules and optimising), or computational substitution (swapping the molecule in crystal structures of related molecules) and could include mini-searches (e.g. just in P2₁/c) using plausible small clusters of molecules or unusual conformations as the search input.
Hence the computational method should be chosen after being tested for being able to reproduce the crystal structures and give plausible relative lattice energies for the known polymorphs of the molecule, or related molecules. The lattice energy minimum closest to these experimental structures is the closest approximation to these structures that could be found in the CSP study. If the lattice parameters have changed by more than the few % which could be attributed to thermal expansion, or the molecules rotated or translated significantly from their experimental positions, or the molecule changed its conformation (these changes are usually very strongly correlated) then a CSP study with this model for the lattice energy is a waste of time.
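This reproduction test can be expressed as a simple check on the percentage change of each cell length on lattice energy minimisation. The sketch below is illustrative only: the 3% tolerance is an assumed stand-in for "the few %" mentioned above, the cell lengths are invented, and a fuller test would also compare cell angles, molecular positions and conformation.

```python
from typing import Dict


def lattice_changes(expt: Dict[str, float], calc: Dict[str, float]) -> Dict[str, float]:
    """Percentage change of each cell length on lattice energy minimisation."""
    return {k: 100.0 * (calc[k] - expt[k]) / expt[k] for k in expt}


def reproduces_structure(expt: Dict[str, float], calc: Dict[str, float],
                         tol_percent: float = 3.0) -> bool:
    """True if every cell length changes by no more than tol_percent (assumed tolerance)."""
    return all(abs(v) <= tol_percent for v in lattice_changes(expt, calc).values())


# Hypothetical cell lengths in angstrom.
experimental = {"a": 10.0, "b": 5.0, "c": 20.0}
model_1 = {"a": 10.2, "b": 4.95, "c": 20.1}   # small, thermal-expansion-sized changes
model_2 = {"a": 10.6, "b": 4.70, "c": 20.1}   # structure has distorted on minimisation

for name, calc in (("model 1", model_1), ("model 2", model_2)):
    changes = {k: round(v, 1) for k, v in lattice_changes(experimental, calc).items()}
    print(name, changes, "acceptable" if reproduces_structure(experimental, calc) else "reject")
```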
When the CSP search has been concluded, if there is a large energy gap between structures as for isocaffeine (Fig. 1), then the ranking will not be sensitive to your approximations in calculating the energy, and a confident prediction can be made. We can estimate whether the theoretical underpinnings of a given type of crystal energy evaluation are suitable for a given molecule, in terms of its size, functional groups, likely conformational flexibility and intermolecular interactions, so that a CSP study is worthwhile. However, it is the molecule itself that determines whether it has one good way of packing with itself, or a range of equally bad compromises giving such small energy differences that the ranking of structures may not be possible with the best affordable methods. If the known structures are not at or near the global minimum, then this could be an artefact of the chosen method of energy evaluation. This needs to be tested by recalculating the relative lattice energies of the low energy structures with a range of alternative programs, based on different assumptions (e.g. ref. 34, but more methods such as Pixel27 are now being used) before too much time is invested in the experimental search for the elusive more stable “predicted” polymorphs.
The blind tests of crystal structure prediction have shown the limitations of the easy to use, PC based modelling methods that rely on traditional force-fields.15,35 This can be attributed to the limitations of the functional form, particularly in using atomic charges, and in using the same charges to model both inter- and intramolecular interactions. This does not mean that CSP studies using such force-fields will not give useful results for some molecules (where the parameterisation of the force-field is particularly good) and for some purposes.5 This could include the generation of ideas about possible structures for helping interpret experimental data, or for testing methods that incorporate non-thermodynamic aspects of crystallisation, such as informatics approaches15 using the CSD.
A successful intermediate between conventional force-fields and periodic electronic structure calculations is based on the separation of inter- and intramolecular energies and using distributed multipoles rather than point charges to evaluate the intermolecular lattice energy. The specific representation of the electrostatic effects of the non-spherical features such as lone pairs and π electron density makes a considerable difference to the ability to represent the directionality of hydrogen bonding and π⋯π stacking. Using distributed multipoles rather than an atomic charge representation of the same molecular charge density considerably improves the proportion of rigid molecule crystal structures found close to the global minimum.36 The resulting anisotropic atom–atom model intermolecular potential is implemented in the organic solid state modelling code DMACRYS.37 Most CSP studies using a distributed multipole electrostatic model have combined it with an empirical isotropic atom–atom model potential, of the form
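The expression itself is not reproduced in this text. A widely used empirical isotropic atom–atom repulsion–dispersion form is the exp-6 (Buckingham) potential, in which each intermolecular atom pair at separation R contributes A exp(−BR) − C/R⁶, with A, B and C depending on the two atom types. The sketch below assumes that form. The function names are illustrative and the parameter values are placeholders, not a fitted potential and not the DMACRYS implementation.

```python
import math
from typing import Dict, List, Tuple

Atom = Tuple[str, Tuple[float, float, float]]                # (atom type, position in angstrom)
Params = Dict[Tuple[str, str], Tuple[float, float, float]]   # sorted type pair -> (A, B, C)


def exp6(r: float, a: float, b: float, c: float) -> float:
    """Exponential repulsion plus R^-6 dispersion for one atom pair."""
    return a * math.exp(-b * r) - c / r**6


def repulsion_dispersion(mol1: List[Atom], mol2: List[Atom], params: Params) -> float:
    """Sum the exp-6 term over all atom pairs between two molecules."""
    total = 0.0
    for type1, pos1 in mol1:
        for type2, pos2 in mol2:
            a, b, c = params[tuple(sorted((type1, type2)))]
            total += exp6(math.dist(pos1, pos2), a, b, c)
    return total


# Placeholder parameters and geometry, chosen only to make the arithmetic easy to follow.
params: Params = {("C", "H"): (0.0, 1.0, 64.0)}
mol1: List[Atom] = [("H", (0.0, 0.0, 0.0))]
mol2: List[Atom] = [("C", (2.0, 0.0, 0.0))]
print(repulsion_dispersion(mol1, mol2, params))
```

In a full lattice energy evaluation this sum would run over all molecules within a cut-off and be added to the distributed multipole electrostatic energy.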
For flexible molecules, the conformational energy penalty ΔEintra contributes directly to Elatt, and the redistribution of charge within the molecule as it changes conformation, represented by the distributed multipoles, affects Uinter. Both can be very dependent on the quality of the electronic structure method used (see Section 2.2), which can usually be of a higher quality for a molecule than is affordable for electronic structure calculations on crystals. Hence, the refinement of the molecular conformation within the crystal structure relies on multiple evaluations of the molecular wavefunction, which is made feasible by the database structure in CrystalOptimizer.39 However this approach, unlike periodic electronic structure methods, does require explicit selection of which molecular torsion and bond angles are likely to differ significantly between crystal structures and in the isolated molecule (as approximated by quantum mechanics).
Once all these choices have been made, the CSP study can be performed. There are many compromises in terms of human and computing time, availability of software etc. that have to be matched to the aim of the study. The size of the group of structures that will need further examination will always depend on the molecule (the information you are seeking) but also the extent of your search and the uncertainty in your relative energies (which qualifies your discussion of the results).
Thus, performing CSP studies requires a complex interplay of different programs, the choice of which is likely to depend on the type of molecule being studied, as well as many other factors. They can be packaged to automate the workflow for different computer architectures though, as Section 4 shows, the human interpretation of the results in terms of structural differences and similarities still has to evolve.
Fig. 6 The crystal energy landscape of naproxen,42 with the input conformational analysis and the classification of structures by packing motif. The three torsion angles indicated by red arrows differed significantly in the 8 conformational energy minima that were used as input into the search, and they, plus the additional torsions indicated by black arrows, were refined in each crystal structure in the final stage of structural optimisation. The structures on the energy landscape are classified firstly by chirality; a circle if they only contain the S-enantiomer (green) and so are enantiopure, otherwise the structure is racemic and also contains the R-enantiomer (red); and secondly by the most extensive supramolecular constructs contained in the structure. The blue squares indicate hydrogen bonding of the carboxylic acid to the methoxy groups, which is less thermodynamically plausible. △ is a Z′ = 2 variant of the racemic experimental structure. Reprinted with permission from Cryst. Growth Des., 2011, 11, 5659. Copyright 2011 American Chemical Society.
Whilst the CSP study was successful in assisting the determination of the racemic structure, and both racemic and enantiopure structures were the most stable within the appropriate set of space groups, the lattice energy difference between the two structures varied from 6 to 9 kJ mol−1, depending on the (respectable) molecular wavefunction and plausible dielectric constants used in a polarisable continuum model for ΔEintra and the distributed multipoles.42 The enthalpy difference between the structures derived from the heat of melting at 155.8 ± 0.3 °C and 156.2 ± 0.1 °C respectively was 1.5 ± 0.3 kJ mol−1, and from solubility difference measurements in an ethanol–water mixture between 10 and 40 °C of 2.4 ± 1.0 kJ mol−1. These results formed the basis of discussion of the many approximations used in the comparison of the energies of the chiral and racemic crystals.42 There is still some way to go before even the lattice energy contribution can be calculated with the accuracy required for the design of chiral resolution processes.
The definition of a crystal energy landscape as the set of crystal structures whose energies are sufficiently close to the most stable to be thermodynamically plausible as polymorphs requires a decision as to what is “sufficiently close”. Polymorphic energy differences are typically less than a few kcal mol−1,3 though it is the barrier to rearranging to the more stable form, rather than the absolute energy difference, which is important. For example, desolvating a solvate crystal structure can lead to a high-energy form, which may be kinetically stabilised. Hence, allowing for some error in the relative lattice energy, structures within a cut-off of 7–10 kJ mol−1 of the global minimum for a neutral molecule, or the point at which there is a marked increase in the number of structures which are approximately equi-energetic, are typically taken as the upper boundary of the crystal energy landscape. Whilst there is generally an increase in the number of crystal structures with decreasing stability (and also decreasing density), this can vary significantly, for example in the lattice energy landscape of caffeine (Fig. 1), where there is a group of structures at the global minimum. Would it be worth refining the relative lattice energies of these structures, for example by periodic electronic structure (DFT-D) calculations on each, or does the thermodynamic argument require other approaches?
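Applying such a cut-off is straightforward once lattice energies are available. The sketch below selects the structures within a chosen window of the global minimum and also converts an energy difference from kcal mol−1 to kJ mol−1 (1 kcal = 4.184 kJ), since both units appear above. The structure names and energies are hypothetical.

```python
from typing import Dict

KJ_PER_KCAL = 4.184  # thermochemical calorie


def energy_landscape(lattice_energies: Dict[str, float],
                     cutoff: float = 10.0) -> Dict[str, float]:
    """Relative energies (kJ/mol) of structures within `cutoff` of the global
    minimum, ordered from most to least stable."""
    e_min = min(lattice_energies.values())
    ordered = sorted(lattice_energies.items(), key=lambda item: item[1])
    return {name: e - e_min for name, e in ordered if e - e_min <= cutoff}


# Invented lattice energies in kJ/mol.
lattice_energies = {"s3": -112.0, "s1": -120.0, "s4": -105.0, "s2": -118.5}

print(energy_landscape(lattice_energies, cutoff=10.0))
print(energy_landscape(lattice_energies, cutoff=7.0))
print(f"2 kcal/mol = {2 * KJ_PER_KCAL:.3f} kJ/mol")
```

With these numbers the structure 8 kJ mol−1 above the minimum is included at a 10 kJ mol−1 cut-off but excluded at 7 kJ mol−1. The size of the landscape therefore depends on the assumed uncertainty in the relative energies.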
Many organic solid-state polymorphic phase transformations are difficult, as once the molecule is close packed within a crystal, there is a significant barrier to rearrangement. It has been argued that all organic molecular transitions are first order, with the phase change requiring a nucleation and growth mechanism, rather than a mechanism that goes from single crystal to single crystal, maintaining translational symmetry throughout. The type of polymorphism that can be most readily observed by crystal to crystal transformation on changing the temperature can come from the high temperature, higher symmetry phase being a dynamic average over lower symmetry lattice energy minima. Examples are plastic or dynamically disordered phases. Since the high temperature phase of caffeine (form I) is a dynamically disordered layer structure, it seems likely that a Molecular Dynamics simulation starting from any of the group of low energy caffeine structures would result in form I caffeine.
Fig. 7 Similarities in CSP generated crystal structures can suggest that crystallising in one ordered structure is unlikely. For eniluracil a range of structures based on this layer, which are almost identical if a CO and C–H group are not distinguished, are very close in energy and show nearly identical powder X-ray diffraction patterns. Left; non-polar hydrogen-bonded ribbons with the ethynyl groups interdigitated so ribbons are anti-parallel, right; polar ribbons interdigitated in parallel (giving a polar crystal). The possible interdigitation of these two motifs emphasises how the number of possible structures would increase with the extent of the CSP search. The configurational entropy associated with these structures45 shows that the most stable structure would be disordered, as found experimentally.
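To give a feel for the size of such a configurational entropy term, the sketch below evaluates the ideal limit in which each molecule independently adopts one of n equally likely orientations, so that S = R ln n per mole. This is an upper-bound estimate under that stated assumption, not the calculation reported in ref. 45, which accounts for the energies of the different arrangements.

```python
import math

R = 8.314462618  # molar gas constant, J K^-1 mol^-1


def ideal_disorder_free_energy(n_orientations: int, temperature: float) -> float:
    """-T*S in kJ/mol for n equally likely, uncorrelated orientations per molecule."""
    return -R * temperature * math.log(n_orientations) / 1000.0


stabilisation = ideal_disorder_free_energy(2, 298.15)
print(f"-TS for two-fold disorder at 298.15 K: {stabilisation:.2f} kJ/mol")
```

A stabilisation of this order (under 2 kJ mol−1 at room temperature for two-fold disorder) is comparable to the lattice energy differences between closely related ordered structures, which is why disorder can be thermodynamically favoured when such structures are nearly equi-energetic.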
A more subtle issue for thermodynamic prediction is the role of solvent, which can be undetected by routine crystallography if it is mobile within the crystal structure. Investigations prompted by the relatively high lattice energy of carbamazepine form II48 found solvent molecules moving quite freely in the channels. The issue of guest molecules changing the thermodynamics as well as the kinetics of crystallisation is beginning to be explored: polymorphs found by desolvating solvates may be highly metastable; the framework structures of organic inclusion compounds can be found as high energy, low density structures on the crystal energy landscape; and indeed CSP has been able to predict the structures of porous organic molecules without explicitly considering the solvent.43,49 Indeed, analysis of the structures on densely populated crystal energy landscapes is beginning to be used to discuss gel formation, amorphous states and other issues relating to the prevention or apparent inability of molecules to crystallise.43
Thus, calculating accurate relative free energies of all different possible solid phases of a molecule is a challenge that extends far beyond the capabilities of CSP approaches. However, because thermodynamics is rarely the only factor that determines the observed solid forms of organic molecules, a CSP study provides a guide to the potential complexity of the solid state by generating the favoured modes of self-assembly of a molecule. As such CSP studies can form a complement to solid form screening, where considerable effort needs to be expended to try to crystallise without inadvertent seeds of the known forms and to vary the crystallisation conditions to try to either ensure or avoid thermodynamic control of the crystallisation process.7,8,14,19
Qualitative estimates of whether two structures are so closely related that only the more stable would crystallise, or sufficiently different that there would be a barrier to interconversion during nucleation,43 can be based on dominant strong interactions such as hydrogen bonding. However, after the 2001 blind test, the consensus was that two molecules were likely to have polymorphs with different hydrogen bonding motifs. Two new polymorphs of 6-amino-2-phenylsulfonylimino-1,2-dihydropyridine were found, but screening of 3-azabicyclo[3.3.1]nonane-2,4-dione found a high-temperature plastic phase. Further calculations then showed that this spherical molecule could reorientate very readily as the imide hydrogen bonding was very weak. Hence, hydrogen bonds, as defined by software that uses distance criteria, are not a reliable guide to the barriers to changing hydrogen bonding motif during nucleation.
The ability to trap molecules in metastable polymorphs because they cannot rearrange readily can also be associated with larger functional groups and conformational flexibility. The concept of a polymorphophore (Fig. 8) as a molecular fragment that appears to promote polymorphism is exemplified by the fenamates21 and ROY (this nickname for 5-methyl-2-[(2-nitrophenyl)amino]-3-thiophenecarbonitrile comes from the distinctive red-orange-yellow colours of its polymorphs which differ in the torsion angle).4,21 Both systems have an aromatic ring which has a very low barrier to rotation through a wide range of angles in the isolated molecule. Once the aromatic rings interdigitate in forming the nucleus or crystal, the conformational flexibility is drastically reduced, helping different packings with different torsion angles to be trapped as polymorphs. The crystal energy landscapes of both tolfenamic acid21 and ROY23 show that their known polymorphs are competitive in energy with other structures which may yet prove to correspond to other polymorphs, such as the structurally uncharacterised forms of ROY.4 In contrast, both a bromo-derivative of ROY and fenamic acid appear to be monomorphic. For fenamic acid the low energy structure generated in a Z′ = 1 CSP study21 was so closely related to the observed Z′ = 2 structure, and the next most stable structure, 2 kJ mol−1 higher in energy, that it seems most unlikely that the computer generated structure could be trapped as a distinct polymorph. Thus, the relationship between the low energy structures on the crystal energy landscape shows whether there are low energy structures that are more readily kinetically trapped for polymorphophores.
This has been demonstrated in the case of carbamazepine, a generic anti-epileptic that has been extensively used in polymorphism studies. The crystal energy landscape showed that some packings of catemeric hydrogen bonded amide chains were thermodynamically competitive with structures containing the doubly hydrogen bonded dimer, including the four known polymorphs. A closely related molecule, dihydrocarbamazepine, had a polymorph which was isostructural with a CSP generated catemeric form. Subliming carbamazepine in the presence of a seed crystal of dihydrocarbamazepine form II allowed the nucleation and growth of catemeric carbamazepine form V on the heteromolecular template (Fig. 9).40
Less specific properties of the low energy crystals may be useful in designing crystallisation strategies. Examination of the hydrogen bonding can suggest that experiments in specific solvents may be worthwhile,51 whereas denser structures could be targeted by crystallisation under pressure. However, the range of polymorphs is clearly limited by the ability to vary the crystallisation conditions sufficiently to change the mode of self-assembly. The difficulty with assessing this is illustrated by the role of impurities in crystallisation: whilst impurities can have a major effect on polymorph screening and characterisation, including inhibiting crystallisation,7 there are also cases such as form I sulphathiazole and form II progesterone, where specific, chemically related impurities appear to be needed to produce a long-lived metastable form.43 Recently a designed hetero-nuclear seed has been found to be so effective at producing the predicted but previously unobtainable caffeine:benzoic acid cocrystal that four different laboratories collaborated to establish its role.52
We can be confident that crystallising a naturally enantiopure molecule can only lead to polymorphs in chiral space groups, unless there is the possibility of racemisation during the crystallisation. Some conformational changes are unlikely; in the last blind test15 some groups included both cis and trans amide conformations of the model pharmaceutical XX (Fig. 2) in their search, but this stereochemistry is likely to be defined during synthesis and not change during crystallisation. However, when the barriers to gross conformational changes are smaller, the range of conformations and the types of solute association in solution may vary with the solvent molecule. For example, the observation that olanzapine is in a dimer (Fig. 4) in all its crystalline forms, despite the crystal energy landscape having thermodynamically competitive structures which do not contain the dimer but otherwise resemble known polymorphs,24 suggests that the dimer forms early in the crystallisation process. Does the unusually rapid crystallisation of the only anhydrate polymorph found in extensive screening of GSK269984B19 mean that seeding by nuclei of this form cannot be avoided? In this case, although it seems unlikely that none of the solutions contained some of the conformers seen in the low energy calculated metastable forms (Fig. 4), the anhydrate and solvates all had essentially the same conformation as the isolated molecule apart from the hydrogen bonding proton.
Although crystal energy landscapes have been contrasted with careful experimental work for a few hundred molecular systems, the range of different crystalline behaviours observed, even within families of closely related molecules (as exemplified by caffeine and isocaffeine in Fig. 1), means that the interpretation of the landscapes with small energy gaps between different structures is necessarily tentative. Considerably more experience is needed before we can eliminate thermodynamically plausible structures as unlikely to be observed because they cannot crystallise distinctly from slightly more stable related structures, or design an experiment to find the novel polymorph. Observing the complex interplay between nucleation and growth of different polymorphs is challenging, even in the most advantageous cases,4 let alone when impurities or heterogeneous surfaces play a role in the first nucleation of a new polymorph. A careful CSP study can either reduce or expand the amount of experimental work needed to determine the full complexity of the solid form landscape of a molecule by either confirming that all practically important polymorphs are known, or suggesting additional structures that should be targeted. The complexity of crystallisation behaviour is very specific to the molecule.
This journal is © The Royal Society of Chemistry 2014