Aurora J. Cruz-Cabeza *ab, Matteo Lusi c, Helen P. Wheatcroft b and Andrew D. Bond d
a Department of Chemical Engineering, School of Engineering, University of Manchester, UK. E-mail: aurora.cruzcabeza@manchester.ac.uk
b Chemical Development, Pharmaceutical Technology & Development, AstraZeneca, Macclesfield, UK
c Department of Chemical Sciences, Bernal Institute, University of Limerick, Limerick, Ireland
d Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
First published on 21st April 2022
The ΔpKa rule is commonly applied by chemists and crystal engineers as a guideline for the rational design of molecular salts and co-crystals. For multi-component crystals containing acid and base constituents, empirical evidence has shown that ΔpKa > 4 almost always leads to salts, ΔpKa < −1 almost always leads to co-crystals and ΔpKa between −1 and 4 can be either. This paper reviews the theoretical background of the ΔpKa rule and highlights the crucial role of solvation in determining the outcome of the potential proton transfer from acid to base. New data on the frequency of the occurrence of co-crystals and salts in multi-component crystal structures containing acid and base constituents show that the relationship between ΔpKa and the frequency of salt/co-crystal formation is influenced by the composition of the crystal. For unsolvated co-crystals/salts, containing only the principal acid and base components, the point of 50% probability for salt/co-crystal formation occurs at ΔpKa ≈ 1.4, while for hydrates of co-crystals and salts, this point is shifted to ΔpKa ≈ −0.5. For acid–base crystals with the possibility for two proton transfers, the overall frequency of occurrence of any salt (monovalent or divalent) versus a co-crystal is comparable to that of the whole data set, but the point of 50% probability for observing a monovalent salt vs. a divalent salt lies at ΔpKa,II ≈ −4.5. Hence, where two proton transfers are possible, the balance is between co-crystals and divalent salts, with monovalent salts being far less common. Finally, the overall role played by the “crystal” solvation is illustrated by the fact that acid–base complexes in the intermediate region of ΔpKa tip towards salt formation if ancillary hydrogen bonds can exist. Thus, the solvation strength of the lattice plays a key role in the stabilisation of the ions.
Proton transfer is also common in the solid state, as demonstrated by the large number of molecular salts4 and zwitterionic crystal structures reported in the Cambridge Structural Database (CSD).5 In molecular crystals, proton transfer reactions generally lie fully to one side of the chemical equation, yielding either a co-crystal where no proton transfer occurs or a salt where a proton is transferred from an acid to a base. Exceptions include dynamic systems, such as solid-state proton conductors,6 or cases in which protons appear to be located part way between the acid and the base.7 The characterisation of such systems can be complicated by the inherent difficulties associated with locating H atoms using X-ray diffraction data, and in some cases the extent of proton transfer may be temperature dependent.8,9 Because of this, complementary characterisation techniques, such as XPS,10–12 solid-state NMR13 or neutron diffraction14,15 (amongst others16), often need to be used to locate precisely the position of the acid H atoms. There are also cases where both ionised and non-ionised species of an acid or base are present in the same crystal.17
The ionisation of active pharmaceutical ingredients (APIs) into salts plays a crucial role in the formulation of medicines.4,18 Salts are often more soluble than their corresponding non-ionised forms, thereby affording drugs with improved pharmacokinetics.19 In recent years, co-crystals have been shown to be a viable alternative to salts since they can also improve the solubility of APIs20 whilst offering other advantages, such as being less prone to hydration.21 Co-crystals are often designed using crystal engineering concepts,22–25 whereby suitable co-crystal formers are selected based on their potential to form strong molecule-to-molecule interactions (synthons), typically based on hydrogen bonds.26–28 Very often, the targeted interactions involve an acid and a base. Indeed, hydrogen bonds are particularly strong, and therefore most attractive for crystal engineering strategies, for acid–base pairs with ΔpKa ≈ 0, where ΔpKa = pKa(protonated base) − pKa(acid).29,30 However, such strategies may be compromised by proton transfer in the solid state, thereby yielding a salt rather than the intended co-crystal product.
As a rule of thumb, salt formation in a molecular crystal is typically considered to be likely if the ΔpKa for the constituent acid and base is larger than 3. In 2005, Bhogala et al.31 noted that negative values of ΔpKa always lead to the formation of co-crystals. This so-called “ΔpKa rule” has been commonly used for the rational design of salts and co-crystals, supported by the free availability of experimental and calculated pKa data. In 2012, Cruz-Cabeza revised the rule on the basis of a survey of ca. 6500 crystal structures and pKa values calculated for the molecular components.32 It was found that ΔpKa > 4 almost always leads to salts, ΔpKa < −1 almost always leads to co-crystals and ΔpKa between −1 and 4 can be either. The intermediate region of ΔpKa accommodates what Childs has referred to as the salt/co-crystal continuum.33
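The three regions of the revised rule can be written as a small lookup. The following is a minimal sketch only: the function names are hypothetical, and the thresholds are simply the −1 and 4 boundaries quoted above.

```python
def delta_pka(pka_protonated_base: float, pka_acid: float) -> float:
    """ΔpKa = pKa(protonated base) − pKa(acid), as defined in the text."""
    return pka_protonated_base - pka_acid


def classify_delta_pka(value: float, lower: float = -1.0, upper: float = 4.0) -> str:
    """Map a ΔpKa value onto the three regions of the revised ΔpKa rule."""
    if value > upper:
        return "salt expected"
    if value < lower:
        return "co-crystal expected"
    # Between the two thresholds the rule alone does not decide the outcome.
    return "intermediate: salt or co-crystal"


if __name__ == "__main__":
    for dpka in (-3.0, 0.5, 6.0):
        print(dpka, classify_delta_pka(dpka))
```

The "intermediate" branch is deliberately not resolved here, since the rule itself gives no verdict in that region.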
Given the enduring popularity of the ΔpKa rule amongst chemists and crystal engineers,34–36 this paper sets out to review the theoretical basis of the rule, highlighting in particular the role of solvation in determining the outcome of proton transfer reactions in the solid state. When applying the ΔpKa rule to predict salt/co-crystal formation, a key question is how the crystal lattice influences proton transfer, particularly in the intermediate region close to ΔpKa ≈ 0. New data extracted from an extensive set of crystal structures demonstrate how the ΔpKa rule might be fine-tuned for specific classes of crystals, such as crystalline hydrates. Analysis of those structures also reveals the important role played by ancillary hydrogen bonds in stabilising ionisation in the solid state.
The CSD Python API was then used to generate SMILES strings for all components in the crystal structures, and the stoichiometry and nature of the components were established. Components with a positive charge were designated as cations, those with a negative charge as anions, and those with separation of charges within the same compound as zwitterions (which could also be anions or cations depending on the overall charge). Cations and zwitterions were further classified as quaternary (Q4) if the positive charge was not due to proton transfer (e.g. N+ bonded to four other C atoms). The SMILES of neutral components were checked and flagged as water, solvent (if within the 18 most common solvent types in the CSD) or main molecular component. Most of the 276k retrieved crystal structures passed successfully through the pKa calculator, but some failed due to errors, disorder or uncommon SMILES strings. The final dataset-1 contained nearly 259k crystal structures with calculated pKa values.
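The charge-based labelling described above can be illustrated with a standard-library sketch that reads formal charges directly from the bracket atoms of a SMILES string. This is not the CSD Python API and not the authors' script. The function names are hypothetical, and the sketch does not attempt the Q4 test or the solvent lookup. Groups with built-in charge separation (for example nitro groups or N-oxides) would be labelled as zwitterions and would need separate handling.

```python
import re

BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")   # contents of each [...] atom
CHARGE = re.compile(r"([+-]+)(\d*)")         # "+", "--", "+2", ...


def formal_charges(smiles: str) -> list[int]:
    """Return the non-zero formal charges written in a SMILES string."""
    charges = []
    for atom in BRACKET_ATOM.findall(smiles):
        match = CHARGE.search(atom)
        if match:
            sign = 1 if match.group(1)[0] == "+" else -1
            # "+2" gives magnitude 2; "++" gives magnitude 2; "+" gives 1.
            size = int(match.group(2)) if match.group(2) else len(match.group(1))
            charges.append(sign * size)
    return charges


def classify_component(smiles: str) -> str:
    """Label one component as neutral, cation, anion or (charged) zwitterion."""
    charges = formal_charges(smiles)
    net = sum(charges)
    if any(c > 0 for c in charges) and any(c < 0 for c in charges):
        if net > 0:
            return "cationic zwitterion"
        if net < 0:
            return "anionic zwitterion"
        return "zwitterion"
    if net > 0:
        return "cation"
    if net < 0:
        return "anion"
    return "neutral"


if __name__ == "__main__":
    for s in ("[NH3+]CC(=O)[O-]", "CC(=O)[O-]", "c1cc[nH+]cc1", "O"):
        print(s, classify_component(s))
```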
Dataset-1 was then filtered for structures containing at least one acid–base pair, and thus one ΔpKa value, resulting in dataset-2, containing over 148k crystal structures. A Python script was written to analyse all data. The number of protons transferred in each crystal structure was recorded, as well as the location of the strongest acid and base. If the strongest acid and base were contained within the same molecule, ΔpKa was classified as ΔpKa,self and these data were used for the analysis of trends in zwitterions and neutral compounds. If the strongest acid and base were contained within different molecules in the crystal, this was classified as a standard ΔpKa. Second strongest acids and bases were also considered and ΔpKa,II was recorded for relevant structures.
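The distinction between a standard ΔpKa and ΔpKa,self can be sketched as follows. The `Site` class, the component labels and the pKa numbers are hypothetical and serve only to show the bookkeeping. The treatment of second-strongest sites (ΔpKa,II) is not reproduced because its exact definition is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    molecule: str  # label of the component carrying the site (hypothetical)
    pka: float     # acid site: pKa of the acid; base site: pKa of the protonated base


def strongest_pair(acids: list[Site], bases: list[Site]) -> tuple[str, float]:
    """Return the ΔpKa type and value for the strongest acid/base pair."""
    acid = min(acids, key=lambda s: s.pka)  # strongest acid has the lowest pKa
    base = max(bases, key=lambda s: s.pka)  # strongest base has the highest conjugate-acid pKa
    value = base.pka - acid.pka
    kind = "ΔpKa,self" if acid.molecule == base.molecule else "ΔpKa"
    return kind, value


if __name__ == "__main__":
    # Made-up values for a two-component crystal and a single amphoteric molecule.
    print(strongest_pair([Site("acid", 4.0)], [Site("base", 9.5)]))
    print(strongest_pair([Site("mol", 2.0)], [Site("mol", 9.0)]))
```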
• The HB donor is either an N or an O atom (no C atoms).
• The HB acceptor is either O or the O− in the carboxylic acid or carboxylate, respectively.
• The distance between the non-H atoms involved in the HB is less than the sum of their van der Waals radii.
• The angle of the HB (N/O–H⋯O/O−) is between 120 and 180°.
• Both intramolecular and intermolecular HBs were recorded.
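The geometric criteria listed above translate directly into a filter. The sketch below is illustrative only: the function name is hypothetical, and the Bondi van der Waals radii are an assumed choice, since the radius set used is not stated here.

```python
# Bondi van der Waals radii in Å (assumed; the radius set is not specified in the text).
VDW_RADII = {"N": 1.55, "O": 1.52}


def meets_hb_criteria(donor: str, acceptor: str, distance: float, angle_deg: float) -> bool:
    """Apply the donor, acceptor, distance and angle criteria to one contact.

    donor, acceptor: element symbols of the non-H atoms
    distance: donor...acceptor separation in Å
    angle_deg: D–H...A angle in degrees
    """
    if donor not in ("N", "O"):   # no C–H donors
        return False
    if acceptor != "O":           # carboxylic acid O or carboxylate O−
        return False
    if distance >= VDW_RADII[donor] + VDW_RADII[acceptor]:
        return False
    return 120.0 <= angle_deg <= 180.0


if __name__ == "__main__":
    print(meets_hb_criteria("N", "O", 2.90, 165.0))
    print(meets_hb_criteria("C", "O", 3.00, 170.0))
```

Intramolecular and intermolecular contacts would pass through the same test, in line with the last criterion.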
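For a monoprotic base (B) accepting a proton in water, it is convenient also to quantify the equilibrium using the acid dissociation constant of the protonated base, $K_\mathrm{a}(\mathrm{BH^+})$:

$$\mathrm{BH^+(aq)} \rightleftharpoons \mathrm{B(aq)} + \mathrm{H^+(aq)}, \qquad K_\mathrm{a}(\mathrm{BH^+}) = \frac{[\mathrm{B}][\mathrm{H^+}]}{[\mathrm{BH^+}]}$$

Since an equilibrium between HA and B in aqueous solution is a combination of the acid and base equilibria, the equilibrium constant can be expressed as a quotient of the individual acid dissociation constants:

$$\mathrm{HA(aq)} + \mathrm{B(aq)} \rightleftharpoons \mathrm{A^-(aq)} + \mathrm{BH^+(aq)}, \qquad K_\mathrm{PT} = \frac{K_\mathrm{a}(\mathrm{HA})}{K_\mathrm{a}(\mathrm{BH^+})}$$

Taking logarithms equates the log of the equilibrium constant for the proton transfer reaction in water to ΔpKa:

$$\log_{10} K_\mathrm{PT} = \mathrm{p}K_\mathrm{a}(\mathrm{BH^+}) - \mathrm{p}K_\mathrm{a}(\mathrm{HA}) = \Delta\mathrm{p}K_\mathrm{a}$$

Using the thermodynamic relation between the standard Gibbs free energy of a reaction and its equilibrium constant, the standard free energy change for proton transfer between the acid and base in water, written here as $\Delta G^{\circ}_{\mathrm{PT(aq)}}$, is related to ΔpKa as follows:

$$\Delta G^{\circ}_{\mathrm{PT(aq)}} = -RT \ln K_\mathrm{PT} = -2.303\,RT\,\Delta\mathrm{p}K_\mathrm{a}$$

At 298 K, $\Delta G^{\circ}_{\mathrm{PT(aq)}}$ = −5.71 ΔpKa kJ mol−1. Thus, positive values of ΔpKa correspond to negative $\Delta G^{\circ}_{\mathrm{PT(aq)}}$ and therefore favour proton transfer from acid to base. Negative values of ΔpKa correspond to positive $\Delta G^{\circ}_{\mathrm{PT(aq)}}$ and favour non-ionised acid and base. Each unit of ΔpKa changes $\Delta G^{\circ}_{\mathrm{PT(aq)}}$ by 5.71 kJ mol−1 in magnitude. For ΔpKa = 0, $\Delta G^{\circ}_{\mathrm{PT(aq)}}$ = 0, giving equal quantities of ionised and non-ionised species in water. Most pKa data refer to an aqueous solution at 298 K. However, the values may change drastically in other solvents and media.47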
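The 5.71 kJ mol−1 per ΔpKa unit follows from RT ln(10) at 298 K, and this can be checked numerically. The sketch below uses hypothetical function names. The ionised fraction assumes an ideal solution containing equal amounts of acid and base and ignores every other equilibrium.

```python
import math

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def dg_proton_transfer_aq(delta_pka: float, temperature: float = 298.0) -> float:
    """Standard free energy of aqueous proton transfer, −RT ln(10) ΔpKa, in kJ mol^-1."""
    return -math.log(10) * R * temperature * delta_pka


def ionised_fraction(delta_pka: float) -> float:
    """Equilibrium fraction of an equimolar HA/B pair present as A−/BH+.

    With K = 10**ΔpKa = x**2 / (1 − x)**2, the fraction is x = sqrt(K) / (1 + sqrt(K)).
    """
    sqrt_k = 10.0 ** (delta_pka / 2.0)
    return sqrt_k / (1.0 + sqrt_k)


if __name__ == "__main__":
    for dpka in (-1.0, 0.0, 1.0, 4.0):
        print(dpka, round(dg_proton_transfer_aq(dpka), 2), round(ionised_fraction(dpka), 3))
```

At ΔpKa = 0 the fraction is exactly one half, matching the statement that ionised and non-ionised species are present in equal quantities.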
(1) |
Fig. 2 Thermochemical cycle summarising the free energy changes associated with the formation of a molecular salt or co-crystal. |
If the overall free energy change for proton transfer in the crystal is negative, a salt will be favoured over a co-crystal. Conversely, if it is positive, a co-crystal will be preferred.
(2) |
Although less commonly reported than pKa values, proton affinities (PA) in the gas phase can be measured or computed, which enables the gas-phase proton transfer energy to be determined as follows:
PA(acid) = ΔE°{A−(g) + H+(g) → HA(g)}
PA(base) = ΔE°{B(g) + H+(g) → BH+(g)}
The second term in eqn (2) is usually highly negative because lattice energies of salts are significantly more negative than those of co-crystals due to the coulombic contributions in the salt, and the first term in eqn (2) is usually highly positive (energy is required for proton transfer). It is the balance between the gas-phase proton transfer energy and the lattice energy difference, $\Delta E^{\mathrm{salt}}_{\mathrm{latt}} - \Delta E^{\mathrm{co\text{-}crystal}}_{\mathrm{latt}}$, that determines whether a co-crystal would form.
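The two-term balance described in this paragraph can be written out as a sketch. The symbols $\Delta E_{\mathrm{PT}}(\mathrm{g})$ and $\Delta E_{\mathrm{salt-cc}}$ are labels assumed here for illustration and are not taken from the original equations. The first line follows from the PA definitions given in the text, in which each PA is the energy change of a protonation step, so that deprotonating the acid costs −PA(acid).

```latex
\begin{align}
\Delta E_{\mathrm{PT}}(\mathrm{g})
  &= \Delta E^{\circ}\{\mathrm{HA(g)} + \mathrm{B(g)} \rightarrow \mathrm{A^-(g)} + \mathrm{BH^+(g)}\}
   = \mathrm{PA(base)} - \mathrm{PA(acid)} \\
\Delta E_{\mathrm{salt-cc}}
  &\approx \underbrace{\Delta E_{\mathrm{PT}}(\mathrm{g})}_{\text{usually} \;>\; 0}
   + \underbrace{\left(\Delta E^{\mathrm{salt}}_{\mathrm{latt}}
   - \Delta E^{\mathrm{co\text{-}crystal}}_{\mathrm{latt}}\right)}_{\text{usually} \;<\; 0}
\end{align}
```

Under these assumptions, a salt is favoured when the lattice energy gain outweighs the cost of gas-phase proton transfer, and a co-crystal is favoured otherwise.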
In this context, ΔpKa can be viewed as a proxy estimate for the gas-phase proton transfer energy. In general, the strengths of acids and bases in aqueous solution do not necessarily correspond to the proton affinities in the gas phase, but a good linear dependence between the gas-phase and aqueous acidity has been shown for organic acids of the type typically found in molecular crystals.49 If HA is a strong acid and B is a strong base (leading to a large positive ΔpKa), the proton transfer energy is likely to be less positive and amply compensated by the lattice energy gain of the salt. If HA and B are a weak acid/base (leading to a negative ΔpKa), the proton transfer energy will have a larger positive value, which could outweigh the energy gain of the salt lattice and therefore favour the co-crystal. For intermediate cases, the balance is delicate.
If a crystal structure (or structures) is known, the difference in lattice energy between the salt and co-crystal can be explicitly calculated for a given system.‡ For example, Mohamed et al. have shown the energetic balance between a salt and co-crystal for a series of pyridine/carboxylic acid crystals.50 In this series, the acid and base interact directly in the crystal structure so the salt/co-crystal balance can be explored by shifting the acid proton across an O–H⋯N hydrogen bond. In some cases, a single-well potential was found with a clearly favoured position (either co-crystal or salt), but other cases showed a very shallow potential with very little energetic difference between the two situations.50 Similar studies on a series of hydrazone/dicarboxylic acid crystals produced comparable results.51 Further to this, commonly applied DFT-D methods have been shown to often result in incorrect proton transfer in acid–base co-crystals.53 Hence, the second term in eqn (2) may need careful consideration on a crystal-by-crystal basis. The biggest limitation here, of course, is the requirement to determine the crystal structure(s) in order to make the calculations. To predict salt/co-crystal formation from the molecular structure alone, the entire process would require crystal structure prediction (CSP) of both the co-crystal and salt systems followed by accurate computation of the lattice energies.52 Hence, the full evaluation of eqn (2) is generally impractical as a predictive tool.
(3) |
(4) |
Using eqn (4), the sign of the free energy of proton transfer and, hence, the relative thermodynamic stability of the salt and co-crystal, can be visualised through a simple diagram (Fig. 3). For each ΔpKa value, the value of the solubility product ratio at which the free energy of proton transfer is 0 represents the boundary between the co-crystal and salt regions. This diagram illustrates neatly the ΔpKa rule. Co-crystals dominate when ΔpKa < 0. When ΔpKa > 4, salts dominate in most solubility ratio ranges. In the intermediate region of ΔpKa, the outcome is strongly sensitive to the relative solubility product of the salt versus the co-crystal.
While ΔpKa can easily be measured or calculated, the second term in eqn (4) cannot be quantified unless both a co-crystal and a salt exist for the system, and the solubility products can be measured. This is almost never the case. Hence, eqn (4) is no more practical than eqn (2) as a general predictive tool.
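To make the interplay between ΔpKa and the solubility products concrete, the sketch below assumes one plausible form for the free energy: dissolve the co-crystal, transfer the proton in solution, then precipitate the salt, which gives ΔG = −RT ln(10) ΔpKa + RT ln(Ksp,salt/Ksp,co-crystal) for a 1:1 system. This form is an assumption made for illustration and is not a transcription of eqn (4). The function names are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def dg_salt_vs_cocrystal(delta_pka: float, ksp_salt: float, ksp_cocrystal: float,
                         temperature: float = 298.0) -> float:
    """Assumed free energy (kJ mol^-1) of converting a 1:1 co-crystal into the salt.

    Negative values favour the salt, positive values favour the co-crystal.
    """
    rt = R * temperature
    solution_term = -math.log(10) * rt * delta_pka           # proton transfer in solution
    solubility_term = rt * math.log(ksp_salt / ksp_cocrystal)  # relative solubility products
    return solution_term + solubility_term


def boundary_ratio(delta_pka: float) -> float:
    """Ksp(salt)/Ksp(co-crystal) at which the assumed free energy is zero."""
    return 10.0 ** delta_pka


if __name__ == "__main__":
    # A salt that is 100 times more soluble than the co-crystal (as a Ksp ratio)
    # exactly cancels a ΔpKa of 2 in this model.
    print(dg_salt_vs_cocrystal(2.0, 100.0, 1.0))
    print(boundary_ratio(4.0))
```

In this model the boundary ratio grows tenfold per ΔpKa unit, which is one way to rationalise why salts dominate over most of the solubility ratio range once ΔpKa exceeds 4.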
Finally, it should be stressed that the balance between ΔpKa and the solubility ratio may be complex. For example, where an acid becomes significantly weaker than expected in aqueous solution because of a change to a poorly solvating solvent, this may be compensated by a change in the relative solubilities of the co-crystal and the salt. Specifically, upon changing from water to an organic solvent, the effective ΔpKa may decrease, but this is likely to be compensated by a decrease in the solubility product of the salt relative to the co-crystal. An example of such behaviour is seen for cediranib maleate.54 Hence, the applicability of expectations based on an aqueous environment must be carefully considered when transferring to other solvents.
Pobs(salt, %) = 17 ΔpKa + 28 |
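The linear relationship above can be evaluated directly. In the sketch below the function name is hypothetical, and the clamping to the 0–100% range is an added assumption, since a linear fit is only meaningful within the intermediate ΔpKa region.

```python
def p_salt_percent(delta_pka: float) -> float:
    """Observed salt probability (%) from Pobs = 17 ΔpKa + 28, clamped to [0, 100]."""
    return min(100.0, max(0.0, 17.0 * delta_pka + 28.0))


if __name__ == "__main__":
    for dpka in (-1.0, 0.0, 1.0, 2.0, 4.0):
        print(dpka, p_salt_percent(dpka))
```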
In this context, with over 1 million crystal structures now available in the CSD, we set out in this work to probe further the experimental observations of salt and co-crystal formation, in order to shed light on the impact of crystal composition (particularly hydration), the location of the acid and base groups, the local hydrogen-bonding environment and the possibility of further proton transfers on the ΔpKa rule.
Fig. 4 Calculated ΔpKa values and gas-phase proton transfer energies for acid–base pairs formed between AA with three bases (THF, PYR, TMA) of increasing strength. |
To expand on these values, and to illustrate the impact of solvation, the potential energy curve for proton transfer for each acid–base pair was calculated in different solvation environments: (a) gas-phase; (b) toluene; (c) 2-heptanone; (d) water. The solution environments were accounted for implicitly using SMD solvation models. Toluene and 2-heptanone were chosen because their dielectric constants (toluene = 2.3, 2-heptanone = 12) are similar to those anticipated for neutral molecular solids and molecular salts, respectively.55 Hence, they are intended to give a coarse approximation of the situation that may be found in the crystalline state. They are not necessarily implied to be common laboratory solvents (indeed, 2-heptanone is rare in practice). For the calculations, the AA molecule was located next to the base so that an A–H⋯B hydrogen bond was formed, and the geometry was optimised. The potential energy was then calculated as a function of the position of H+ within the A–H⋯B hydrogen bond, as outlined in the Methods section. The results are shown in Fig. 5.
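Reading such a scan amounts to locating the energy minima along the proton coordinate and noting on which side of the hydrogen bond they fall. The sketch below shows that bookkeeping only. The function name, the midpoint criterion and the numbers are all hypothetical: the energies are synthetic values invented for illustration, not results from this work.

```python
def classify_scan(h_positions: list[float], energies: list[float], midpoint: float):
    """Find interior minima of a sampled A–H...B proton transfer curve.

    h_positions: A–H distances in Å, in increasing order
    energies: relative energies at those positions (kJ mol^-1)
    midpoint: A–H distance beyond which the proton is counted as transferred
    Returns (wells, favoured), where wells is a list of (side, position, energy).
    """
    wells = []
    for i in range(1, len(energies) - 1):
        if energies[i] < energies[i - 1] and energies[i] < energies[i + 1]:
            side = "neutral" if h_positions[i] < midpoint else "ionised"
            wells.append((side, h_positions[i], energies[i]))
    favoured = min(wells, key=lambda w: w[2])[0] if wells else None
    return wells, favoured


if __name__ == "__main__":
    # Synthetic double-well curve (illustrative numbers only).
    positions = [1.0 + 0.05 * k for k in range(13)]
    energies = [12.0, 3.0, 0.0, 4.0, 11.0, 17.0, 20.0,
                18.0, 14.0, 11.0, 10.0, 13.0, 22.0]
    wells, favoured = classify_scan(positions, energies, midpoint=1.3)
    print(wells)
    print("favoured:", favoured)
```

A curve with no minimum on the base side would return a single "neutral" well, which corresponds to an environment in which proton transfer is not favoured.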
For AA:THF (ΔpKa = −7.4), none of the environments favour proton transfer from acid to base because of the high gas-phase proton transfer energy for this pair. For AA:TMA (ΔpKa = 5.0), a similar situation is seen in the gas phase, but there is a progressive emergence of an energy minimum for proton transfer as the dielectric constant of the solvent increases. In aqueous solution, the minimum is pronounced and the ionised species are favoured over the neutral species by over 20 kJ mol−1. For AA:PYR (ΔpKa = 0.6), an intermediate situation is seen, whereby the curve for aqueous solution resembles that for AA:TMA in 2-heptanone. Hence, the balance between the gas-phase proton transfer energy and the extent to which the ionised species are stabilised by solvation is clearly seen. To extend this picture to the solid state, the key question is how the crystal lattice impacts the outcome. The empirical evidence established for the ΔpKa rule suggests that the effect of the lattice is never sufficient to influence the outcome for acid–base pairs with ΔpKa < −1 or ΔpKa > 4, but that it can have a significant effect in the intermediate region. The remainder of this paper considers whether existing crystallographic information can be used to enhance the predictive power of the ΔpKa rule in the intermediate region.
Crystal type | #MC a | Composition b | N | Unsolvated (%) | Hydrate (%) | Solvate (%) |
---|---|---|---|---|---|---|
Neutral | 1 | N | 215491 | 92 | 4 | 5 |
Zwitterion | 1 | Z | 2112 | 60 | 34 | 6 |
Zwitterion-Q4 | 1 | Q4–Z | 3677 | 86 | 10 | 4 |
Co-crystal | 2 | N N | 9464 | 89 | 6 | 5 |
Co-crystal with Z | 2 | N Z | 349 | 73 | 21 | 6 |
Co-crystal with ZQ4 | 2 | N Q4–Z | 197 | 85 | 7 | 8 |
Salt | 2 | C+ A− | 16610 | 72 | 23 | 5 |
Salt Q4 | 2 | Q4–C+ A− | 6479 | 65 | 27 | 8 |
Salt with Z | 2 | Z+ A− | 181 | 66 | 33 | 1 |
Salt with Z | 2 | C+ Z− | 138 | 69 | 28 | 3 |
Salt with Z | 2 | Q4–Z+ A− | 123 | 64 | 33 | 3 |
Co-crystal | 3 | N N N | 248 | 91 | 6 | 3 |
Ionic co-crystal | 3 | C+ A− N | 1364 | 76 | 20 | 4 |
Ionic co-crystal | 3 | C+ A− Z | 85 | 82 | 18 | 0 |
Ionic co-crystal | 4 | C+ A− N N | 46 | 100 | 0 | 0 |
a #MC = number of main components. b N = neutral non-solvent or zwitterion; Z = zwitterion; Z+ = cationic zwitterion; Z− = anionic zwitterion; C+ = cation; A− = anion; Q4 = quaternary. The last three columns give the percentage of the dataset for each crystal type.
The most common types of crystals in the dataset are those containing only one neutral component, followed by salts, co-crystals and quaternary salts. Of these crystal structures, 10% contain ionised species (salts or (zwitter)ionic co-crystals), of which almost 73% involve proton transfer and 27% are quaternary salts. The transfer of a single proton is most common amongst all of the salts, being observed in 74% of the structures, followed by the transfer of two protons in 23% of the structures. Zwitterions are observed in only 3% of the dataset, so structures with zwitterions remain relatively rare in the CSD. Zwitterionic compounds that are not a result of proton transfer (zwitterion-Q4) are more common than zwitterionic forms that arise from self-proton transfer. This is likely to be a simple consequence of the chemical space represented in the CSD, which reflects to a large extent the common preferences of synthetic chemists. Whilst quaternary systems are analysed here for the purpose of general statistics, these are not relevant to the ΔpKa rule and thus are removed for the subsequent sections and analyses.
Most interesting in the current context is the relationship between salt/co-crystal formation and the hydration/solvation of the crystals. Whilst solvates make up only a small percentage of all crystal types, hydrates are much more common. In part, this is to be expected due to the ubiquitous nature of water and its frequent use as a solvent, and several authors have discussed reasons why hydration is particularly common in molecular crystals.21,56–59 However, while hydrates constitute up to 6% of crystals containing only neutral (non-zwitterion) species, the percentage increases significantly to around 30% for crystals containing ions and/or zwitterions.
The result for neat co-crystals and salts is very similar to that presented by Cruz-Cabeza in 2012, with just a small shift to higher values of the ΔpKa at which the dataset is split 50%:50% between co-crystals and salts. The percentage value in the plot can be considered as the probability for an acid/base system of a given ΔpKa to be ionised when crystallised in an unsolvated form. In Fig. 6a, the co-crystal:salt equivalence point is observed at ΔpKa ≈ 1.4, compared to ΔpKa ≈ 1 in the 2012 study. Interestingly, the data for hydrates of co-crystals and salts are significantly shifted to the left, with the equivalence point in that distribution found at ΔpKa ≈ −0.5. The intermediate region of the plot (−3 < ΔpKa < 2) deviates noticeably from a sigmoidal shape. This may be influenced by the relatively smaller sample size, or it may indicate a greater degree of uncertainty in the apparent trend when applied to crystalline hydrates.
Fig. 8 Relative occurrence (%) of salts with 0, 1 or 2 proton transfers as a function of ΔpKa and ΔpKa,II (bins of one ΔpKa value). The dataset is the same as that used to produce Fig. 7. |
In the regions where the non-ionised versus the ionised probabilities cross, a linear correlation can be found. The fitted equations in Table 2 may prove valuable for prediction of proton transfer in molecular crystals of various nature and compositions.
System: neutral (a) vs. ionised (b) | Range | ΔpKa cross | Pobs (ion, %) a | R² |
---|---|---|---|---|
Monovalent vs. multivalent salt | [−7, 0] | −4.2 | 11 ΔpKa,II + 96 | 0.89 |
Hydrated co-crystal vs. hydrated monovalent salt | [−5, 4] | −0.6 | 10 ΔpKa + 56 | 0.94 |
Hydrated neutral vs. hydrated zwitterion | [−3, 5] | 1.1 | 11 ΔpKa,self + 38 | 0.98 |
Co-crystal vs. monovalent salt | [−2, 4] | 1.3 | 14 ΔpKa + 32 | 0.98 |
Neutral vs. zwitterion | [1, 7] | 4.1 | 15 ΔpKa,self − 12 | 0.97 |
a Ionisation, second ionisation or self-ionisation as appropriate.
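The fitted lines in Table 2 can be collected into a small lookup for use within their stated ranges. The sketch below uses hypothetical names and key strings. It copies the slopes, intercepts and ranges from the table, and the crossing point is simply the ΔpKa at which each line reaches 50%.

```python
# (slope, intercept, (lower, upper)) for Pobs(ion, %) = slope * x + intercept,
# where x is ΔpKa, ΔpKa,II or ΔpKa,self as appropriate for the system.
FITS = {
    "monovalent vs. multivalent salt": (11.0, 96.0, (-7.0, 0.0)),
    "hydrated co-crystal vs. hydrated monovalent salt": (10.0, 56.0, (-5.0, 4.0)),
    "hydrated neutral vs. hydrated zwitterion": (11.0, 38.0, (-3.0, 5.0)),
    "co-crystal vs. monovalent salt": (14.0, 32.0, (-2.0, 4.0)),
    "neutral vs. zwitterion": (15.0, -12.0, (1.0, 7.0)),
}


def p_ionised(system: str, x: float) -> float:
    """Probability (%) of the ionised outcome, valid only inside the fitted range."""
    slope, intercept, (lower, upper) = FITS[system]
    if not lower <= x <= upper:
        raise ValueError(f"{x} is outside the fitted range [{lower}, {upper}]")
    return slope * x + intercept


def crossing_point(system: str) -> float:
    """Value of x at which the fitted probability equals 50%."""
    slope, intercept, _ = FITS[system]
    return (50.0 - intercept) / slope


if __name__ == "__main__":
    for name in FITS:
        print(name, round(crossing_point(name), 1))
```

Raising an error outside the fitted range is a design choice made here to discourage extrapolation of a linear fit into regions where the observed probabilities level off.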
Concerning unsolvated co-crystals versus salts, the data suggest that unsolvated forms of acid–base pairs are usually worse at solvating the ionic species than water. To compensate for this, ΔpKa must be positive for ionisation to occur. To illustrate this observation, we have selected two examples of benzoic acid forming a salt and a co-crystal with two different bases with ΔpKa values between 1 and 2. Using the Mercury full interaction maps (FIMs) tool with a water oxygen as probe, we can visualise the ancillary hydrogen bonds (aHBs) that water forms with the acid–base pair in aqueous solution in its non-ionised form versus its ionised form (Fig. 10). In aqueous solution, water forms one aHB with the carbonyl of the carboxylic acid in its neutral form whilst forming two strong aHBs with the carboxylate. For the crystal outcome to be predicted solely by ΔpKa, the level of solvation of the acid–base pair in solution must be maintained in the solid state. In such a scenario, a positive ΔpKa will imply salt formation, whilst a negative ΔpKa will imply co-crystal formation. If solvation in the crystal is worse than in water (fewer aHBs), then the ΔpKa value for the switch between salt and co-crystal must be shifted to higher values. In PUTMAU (Fig. 10), for example, there are no other HB donors in the acid or base, so no aHBs are possible and, thus, the outcome is a co-crystal. In HOMWEN (Fig. 10), however, there is a hydroxyl group available on the base able to form an aHB with the carboxylate, and thus a salt is observed in the solid state. For an acid and base with a ΔpKa between 1 and 2, the experimentally derived probability for a salt to form would be between 50 and 60% (Table 2). In that scenario, a higher number of aHBs should result in a salt, whilst a lower number should result in a co-crystal.
Fig. 10 Visualisation of water solvation of non-ionised versus ionised acid–base pairs with full interaction maps versus the resulting ionisation and interactions observed in their crystals. |
To illustrate further the importance of crystalline aHBs to the ionisation outcome, we analyse the number of aHBs in acid–base co-crystals and salts with a ΔpKa ≈ 0 (Table 3). In total, 147 crystal structures were found where the carboxylic acid/carboxylate was involved in two aHBs, 247 in one aHB and 137 in no aHBs.
Number of aHBs to A/A− | N total | Salt (%) | Co-crystal (%) |
---|---|---|---|
2 aHBs | 147 | 54 | 46 |
1 aHB | 247 | 12 | 88 |
0 aHBs | 137 | 3 | 96 |
Acid–base crystalline complexes able to afford two aHBs generate a solvation environment similar to water. Interestingly, nearly 50% of the 147 crystals with two aHBs are salts and the other 50% are co-crystals, mirroring the behaviour expected in aqueous solution where ΔpKa ≈ 0. When only one or no aHBs are possible, co-crystals clearly dominate over salts. Here, when a reduction of aHBs occurs, the nature and strength of those aHBs would matter, with stronger HBs better able than weaker aHBs to stabilise the ionic species. This effect is clearly seen in the case of double ionisation where the ΔpKa,II values for the switch between salt and co-crystal are shifted to negative values.
To illustrate the important role of the solvation environment on the ionisation outcome in the crystalline state, we have examined the applicability of the ΔpKa rule in crystalline systems with different compositions. This reveals that the point of equal probability of salt/co-crystal formation lies at ΔpKa ≈ 1.1 for neat (unsolvated) forms, but at around 0 for hydrates. In zwitterions, an even more positive ΔpKa value of 4.1 is seen, whilst in multi-protic salts the point is negative, around ΔpKa ≈ −4.2. New probability equations for ionisation (based on linear fittings to the CSD observations) are provided to enhance the predictive power of the ΔpKa rule in the intermediate region for systems of various compositions.
We have also highlighted the importance of solvation in this intermediate region of ΔpKa values for a few illustrative examples. Acid–base pairs that are stabilised by strong solvation in the crystalline state are more likely to be ionised than acid–base pairs that are not involved in any ancillary hydrogen bonds in the crystal. Finally, stoichiometry and its impact on the hydrogen-bonding environment of the acid–base complex can also play a role.60 We anticipate that this contribution will be useful for future applications of the ΔpKa rule. The probability equations derived here for crystals of various compositions, combined with an analysis of possible aHBs, can enhance the ΔpKa rule’s predictive power in the intermediate region.
Footnotes |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d1fd00081k |
‡ If a classical modelling approach is used, the ionisation term and the lattice energy terms will be computed separately. However, if a quantum modelling approach is used, the lattice energy will be optimised together with the ionisation state. A single optimisation of a crystal structure with DFT-D will result in either a salt or a co-crystal for a single-well potential scenario, since it will optimise the total energy of the lattice simultaneously.
This journal is © The Royal Society of Chemistry 2022 |