Aurora J. Cruz-Cabeza *ab and Colin R. Groom a
aThe Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, UK CB2 1EZ. E-mail: cruz@ccdc.cam.ac.uk; Tel: +44 1223 336408
bThe Pfizer Institute for Pharmaceutical Materials Science, Department of Materials Science and Metallurgy, University of Cambridge, Pembroke Street, Cambridge, UK CB2 3QZ
First published on 27th September 2010
Families of tautomers in the Cambridge Structural Database (CSD) have been identified, analysed and classified. We identified 108 molecules that crystallise in two different tautomeric forms. Most commonly, pairs of tautomers crystallise together in the same crystal structure. Tautomeric polymorphs—pairs of tautomers crystallising in different crystal structures with no other or identical components—are very rare. The calculation of the relative stabilities of the different pairs of tautomers mirrors the relative frequencies with which tautomeric forms are observed in the CSD. Our improved understanding of the factors influencing tautomeric preferences in crystal structures may allow the prediction and design of crystal structures containing tautomeric forms that are, as yet, unobserved in the solid state.
In the context of pharmaceutical materials, the identification and characterisation of tautomers of drug molecules in the solid state could have important intellectual property and commercial implications. This has been illustrated by the late identification of new tautomers of well-known pharmaceutical molecules such as barbituric acid,13,14 omeprazole,15 ranitidine,16 sulfasalazine17 and irbesartan.18 Indeed the potential tautomeric complexity of drug molecules is well illustrated by the case of the anticoagulant drug warfarin, which can exist in approximately 40 distinct tautomeric forms.19 Although usually formulated as crystalline materials, ultimately drugs are dissolved and their tautomeric state might also change in solution.
In solution, tautomers are of course in equilibrium, with the nature of the solvent, the pH and the temperature able to influence their relative populations. In the solid state, the observed tautomeric form may differ from that which predominates in solution;20–22 temperature, proton transfer reactions23 and mechanical processes can all influence the tautomeric forms observed in crystal structures. In an interesting example of mechanical influence, grinding commercially available barbituric acid yielded a new phase that contained the less stable trihydroxyl tautomer.13 Another study reported that, when KBr discs of barbituric acid were prepared for recording IR spectra, variable amounts of a new tautomer were observed depending on the grinding treatment of the mixture.14
Tautomers in the solid state have been referred to as desmotropes. The term desmotropy was defined as “tautomerism in which both tautomeric forms have been isolated”.24 This term should not, therefore, be used in instances where two different tautomers are observed within the same crystal structure. The term desmotropy is not, however, widely known or used within the crystallographic community, so it will not be adopted in this article, but there is a clear definition and nomenclature problem at the confluence of tautomerism and polymorphism.25 We refer to crystal structures of different tautomers as “tautomeric polymorphs”, “multicomponent crystals of tautomers” or “crystal structures containing both tautomeric forms” as appropriate; all are discussed further.
The aim of this study has been to identify, analyse and classify families of tautomers in crystal structures. We have, additionally, calculated the relative molecular energies of the different families of tautomers using density functional theory (DFT) calculations with a polarisable continuum model (PCM) to account for the crystal structure environment. We hope that this analysis might be used for future predictions of the phenomenon of tautomerism and tautomeric polymorphism26 in the solid state.
Fig. 1 Frequency of the number of tautomers generated for a total of 35232 unique tautomerisable molecules in the CSD.
Only 99 families of tautomers (a total of 198 different tautomers) were observed in the CSD, in a total of 324 crystal structures. In addition, 37 crystal structures containing molecules with considerable disorder contained multiple tautomers, belonging to 33 tautomeric families. In all cases, only two different tautomers were observed per family. It appears from our analysis that three or more tautomeric forms of any one molecule have not yet been reported in the solid state.
For different tautomers of the same compound to be observed in the solid state, one might assume that their relative energy difference must be plausible. Tautomeric pairs with very large energy differences might therefore represent errors. After calculating the relative energies of all tautomers with a molecular weight of less than 300, we found that some pairs showed rather large tautomeric energy differences (>50 kJ/mol) and unrealistically short contacts in the crystal structure. It seems clear that the least stable tautomer of each of these pairs represents an incorrect determination of hydrogen atom positions, since such a large energy penalty is unlikely to be compensated for in order to generate a crystal. These large energy differences highlighted the presence of 12 incorrect tautomers in the CSD.‡
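The energy screen described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' workflow: the `TautomerPair` class, the family labels and the energies are hypothetical, and only the 50 kJ/mol threshold is taken from the text. The check for short contacts, which the text uses alongside the energy criterion, is not reproduced here.

```python
from dataclasses import dataclass

# Threshold quoted in the text; pairs above it were treated as suspect.
SUSPECT_THRESHOLD_KJ_MOL = 50.0


@dataclass
class TautomerPair:
    family: str             # hypothetical label, not a CSD refcode
    energy_a_kj_mol: float  # relative energy of tautomer A
    energy_b_kj_mol: float  # relative energy of tautomer B

    @property
    def delta_e(self) -> float:
        """Absolute energy difference between the two tautomers."""
        return abs(self.energy_a_kj_mol - self.energy_b_kj_mol)


def flag_suspect_pairs(pairs, threshold=SUSPECT_THRESHOLD_KJ_MOL):
    """Return the pairs whose energy difference exceeds the threshold."""
    return [p for p in pairs if p.delta_e > threshold]


# Illustrative values only; these are not results from the study.
pairs = [
    TautomerPair("family_1", 0.0, 3.2),
    TautomerPair("family_2", 0.0, 61.5),
    TautomerPair("family_3", 0.0, 18.0),
]
suspect_families = [p.family for p in flag_suspect_pairs(pairs)]
print(suspect_families)
```

A pair flagged in this way would only be a candidate for closer inspection; the decision that a tautomer is incorrect rests on the structural evidence.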
The COMPACK algorithm compares atomic positions in a molecule and its 15 closest neighbours in the crystal with other crystal structures.34 By ignoring hydrogen atom positions and bond types, we were able to use this algorithm to identify identical crystal structures containing different tautomers. Where we noted such cases, the molecular geometries of the pairs of molecules were studied using the CCDC software MOGUL.35 MOGUL can compare the bond lengths, bond angles, torsion angles and ring geometry of any molecule with all similar structural units in the CSD. We reasoned that incorrectly assigned tautomers may show molecular geometry that was unusual with respect to the distributions calculated by MOGUL. Fig. 2 illustrates one such example, where unusual bond lengths are seen for one tautomeric form of 3-chloro-1,2,4-triazole. This example happens to have been studied in depth, resulting in the re-determination of the crystal structure (CSD refcode: CLTRZL01).36
Fig. 2 Example of the use of the MOGUL geometry check to determine whether a tautomeric form has been wrongly assigned. In (a), the N–N bond length is clearly outside the normal MOGUL distribution, whereas for the correct tautomer assignment in (b), the N–N bond distance is within the normal range.
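The idea behind such a geometry check can be illustrated with a simple z-score comparison. The sketch below is a stand-in under stated assumptions and does not call MOGUL or reproduce its criteria: the reference bond lengths, the observed values and the cutoff of 3 standard deviations are all hypothetical.

```python
from statistics import mean, stdev


def z_score(value, reference):
    """Number of sample standard deviations between value and the reference mean."""
    return (value - mean(reference)) / stdev(reference)


def is_unusual(value, reference, cutoff=3.0):
    """Flag a bond length lying more than `cutoff` standard deviations from the mean."""
    return abs(z_score(value, reference)) > cutoff


# Hypothetical N-N bond lengths (in angstrom) for comparable fragments.
reference_nn = [1.35, 1.36, 1.36, 1.37, 1.37, 1.38, 1.38, 1.39]

observed_a = 1.30  # hypothetical value for a doubtful tautomer assignment
observed_b = 1.37  # hypothetical value for the alternative assignment

print(is_unusual(observed_a, reference_nn))
print(is_unusual(observed_b, reference_nn))
```

With these invented numbers the first value is flagged and the second is not, which mirrors the contrast between panels (a) and (b) of Fig. 2 in a purely schematic way.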
Cases in which both tautomers show unusual values for bond lengths or angles were subject to further analysis, including assessment of the appropriate literature. After removing the incorrect tautomeric assignments identified by the tests described above, the remaining 108 tautomeric pairs, present in 277 distinct crystal structures, were further analysed.
Fig. 3 Histogram of the tautomers' relative energies in kJ/mol.
Pairs of tautomers were grouped in approximately equal energy ranges representing low, medium and high energy differences between tautomers: (i) 0 to 12 kJ/mol, (ii) 12 to 24 kJ/mol and (iii) 24 to 35 kJ/mol. 65% of the observed tautomeric pairs show a relative energy difference of less than 12 kJ/mol. In this energy range, small improvements in the intermolecular interactions may be sufficient to compensate for the small energetic penalty of a molecule existing in a less favourable tautomeric form. 35% of all tautomeric pairs show relative energies of less than 4 kJ/mol.
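The grouping into low, medium and high energy ranges can be sketched as follows. The range limits are those given in the text; the input values are illustrative rather than data from the study, and the assignment of boundary values (for example, exactly 12 kJ/mol counted as "low") is an assumption, since the text does not specify it.

```python
from collections import Counter

# Upper bounds (kJ/mol) of the three ranges described in the text.
RANGES = [(12.0, "low"), (24.0, "medium"), (35.0, "high")]


def classify(delta_e):
    """Return the range label for an energy difference in kJ/mol."""
    if delta_e < 0:
        raise ValueError("energy difference must be non-negative")
    for upper, label in RANGES:
        if delta_e <= upper:  # assumption: upper bound is inclusive
            return label
    return "outside ranges"


# Illustrative energy differences only; not data from the study.
delta_es = [0.8, 3.5, 11.9, 15.0, 23.0, 30.1, 2.2, 7.4]
counts = Counter(classify(d) for d in delta_es)
fractions = {label: counts[label] / len(delta_es) for label in counts}
print(fractions)
```

Applied to the calculated energy differences of a real set of tautomeric pairs, the same tally would give the kind of percentages quoted above.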
We observe a great variety of tautomer types: from annular tautomers to functional tautomers and those enabled by 6-member ring resonance intramolecular hydrogen bonds.22
The molecular nature of the tautomers tends to correlate well with their relative stabilities. For example, most tautomers constituted by 6-member ring resonance intramolecular hydrogen bonds show small relative energy differences and proton disorder (Fig. 4). Of course, the resonance intramolecular hydrogen bonded pairs could be in dynamic equilibrium. Nevertheless, the two states were optimised independently, since they are both stationary points and local minima of the molecular energy surface, and only the energy difference between the two ordered molecular states was calculated.
Fig. 4 Examples of 6-ring intramolecular hydrogen bonded tautomers.
The most abundant type of tautomers corresponds to derivatives of azoles (e.g. pyrazoles, indazoles, imidazoles, triazoles…). The majority of these tautomeric pairs (∼ 70%) show energy differences of <12 kJ/mol. However, we do observe higher energy differences for azolic tautomers in cases where one of the tautomers has a greater electronic resonance than the other, when the molecule is protonated or deprotonated (in salts) or when one of the tautomers shows stronger intramolecular interactions than the other.
Most of the molecules in the 20–35 kJ/mol energy range are derivatives of 4-pyrimidones (Fig. 5). As pair–pair hydrogen-bonding can be so complementary in the 2-amino derivatives, both tautomers are often observed in the same crystal structure. Large energy differences might also be observed when a stable tautomer shows an intramolecular hydrogen bond but the higher energy tautomer does not. Fig. 6 illustrates this effect: the higher energy tautomer does not show an intramolecular hydrogen bond, but this is compensated for by the formation of extensive intermolecular hydrogen bonding in the crystal structure, as seen previously.1,37
Fig. 5 Energy differences in tautomers of 4-pyrimidones and hydrogen bonding complementarity in derivatives of 2-amino-4-pyrimidones.
Fig. 6 Intra- versus inter-molecular hydrogen bonding (HB) in two tautomeric pairs observed in the CSD.
Fig. 7 Classification of solid state occurring tautomers exemplified with pyrazole type molecules (where R1 ≠ R3). The CSD refcode of the crystal structures is indicated in the figure.
Fig. 8 Frequency of observation of tautomeric polymorphs, multicomponent crystals of tautomers and crystal structures containing both tautomers in the CSD. The black bars indicate number of families of tautomers whereas the green bars indicate number of crystal structures.
A second group of tautomers can be found in multicomponent crystals (solvates, hydrates, cocrystals or salts). In this group, we often observe that when a molecule crystallises with itself, it may do so in a less energetically favourable tautomer in order to form a stable lattice. The presence of different components in the crystal means that this compromise may not be required, allowing the molecule to adopt the most favourable tautomeric form. As one might expect, a change of cocrystallisation component (or salt counter ion) might change the tautomeric form.
The final and largest group (Fig. 7 and 8) corresponds to crystal structures containing both tautomers. Here we find ordered crystal structures with varying stoichiometric ratios of tautomers and also structures showing disorder between the two tautomeric forms, be it static, dynamic or even dependent on temperature. We also find crystal structures in which the two tautomeric forms cocrystallise with one or two additional components.
The use of complementary experimental techniques (such as solid state NMR) or theoretical models can help to clarify the nature of the tautomer in the crystal structure when needed. Whilst the use of DFT packages for this purpose is more time-consuming and might require the help of an expert user, straightforward geometrical analysis of the molecules, for example using the tool MOGUL, is also effective.
The observation of different tautomers in the solid state is, however, a rare phenomenon. Only around 0.5% of molecules able to tautomerise in the CSD are observed in different tautomeric forms in the solid state. This represents just 0.05% of the molecules in the CSD. We identified no cases where more than two tautomeric forms per family were observed. Perhaps the generation of such systems represents an interesting challenge to the crystal engineer.
The majority of observed pairs of tautomers have small molecular energy differences of less than 12 kJ/mol. Annular tautomers and 6-member ring resonance intramolecular hydrogen bonds constitute a major proportion of the molecules crystallising in different tautomeric forms. Only derivatives of 4-pyrimidones and tautomeric pairs with intra- vs. inter-molecular hydrogen bond changes show energy differences greater than 20 kJ/mol. Our observations allow us to make a general rule of thumb: “For different tautomers to be observed in the solid state, their relative energy must not exceed that of a strong hydrogen bond in an organic crystal”. The energy ranges presented for the different families of tautomers and their exceptions should be very useful for predicting the observation of one or more tautomers in the solid state based on ΔEt differences. They may also be useful in crystal structure prediction studies of molecules with the ability to form different tautomers.
Crystal structures containing two tautomers are the most abundant in terms of tautomeric diversity. Tautomeric polymorphs, on the other hand, are very rare. The use of different cocrystallisation components (or counter-ions) to isolate new tautomers could perhaps be the only way of manipulating tautomeric preferences. It would be fascinating to see crystal engineering strategies developed to generate tautomers not yet seen in the solid state.
Footnotes
† Electronic Supplementary Information (ESI) available: CSD Refcodes and information on tautomers. See DOI: 10.1039/c0ce00123f/
‡ CSD refcodes: IMMTAZ, SOJNAG, MALHYZ10, ATTDAZ, VAHHAO, WEYROH, ILOQAB, ILOQEF, AHPAZP10, FOFBOS, LUNMPO10, SIHBOA
§ Available by download from http://www.ccdc.cam.ac.uk
This journal is © The Royal Society of Chemistry 2011