Substituent effects on the basicity (pKa) of aryl guanidines and 2-(arylimino)imidazolidines: correlations of pH-metric and UV-metric values with predictions from gas-phase ab initio bond lengths

The dissociation constants of two related series of 2-(arylimino)imidazolidine and aryl guanidine α2-adrenoceptor antagonists (35 compounds in total) were measured by potentiometric titration and by UV-spectrophotometry using the 96-well microtitre plate method. The experimental values obtained with the two methods were consistent with each other and agreed very well with the pKa values calculated using the AIBLHiCoS methodology, which uses only a single bond length obtained from ab initio calculations at a low level of theory. The predictive power for the imidazolidine and guanidine sets of compounds was very good, with deviations typically <0.30 and <0.24 pKa units and mean absolute errors (MAE) of 0.23 and 0.29, respectively. The study of the quantitative effect of diverse substituents on the basicity of aryl guanidine and 2-(arylimino)imidazolidine derivatives is useful for medicinal chemists working with biologically relevant guanidine-containing molecules.


Introduction
The guanidine group of the amino acid arginine is ubiquitous in Nature. In its protonated state, several resonance forms that delocalize the cationic charge of the guanidinium cation over the entire functional group contribute to the high basicity of guanidine (pKaH = 13.6 in water). 1 The H-bonding ability of guanidine makes this group a valuable tool for the design of compounds for different pharmacological applications such as DNA binding drugs (e.g. […]). 18,19 Centrally acting α2-adrenoceptor antagonists are potentially useful pharmacological tools for the treatment of depression and schizophrenia. 16-20 The prototypical example of a (phenylimino)imidazolidine drug is the α2-AR and imidazoline receptor agonist clonidine (i.e. N-(2,6-dichlorophenyl)-4,5-dihydro-1H-imidazol-2-amine), used, among other applications, as an antihypertensive drug and for controlling neuropathic pain (Fig. 1). 21 As both the guanidine and 2-aminoimidazoline groups are strong bases, they are protonated at physiological pH. This physicochemical property may therefore impair the absorption and/or limit the uptake of guanidine-containing molecules through biological membranes, including the blood-brain barrier, 22,23 a drawback that is highly relevant for centrally acting drugs.
The protonation state of a drug molecule determines its lipophilicity, solubility in biological fluids, ability to cross biological membranes, degree of protein binding, and capacity to bind to biological targets. Hence, knowing the dissociation constants (pKa) of drugs is crucial because this parameter affects key pharmacokinetic properties such as administration, distribution, metabolism and excretion (ADME). 24 One of the objectives of this work is to understand the quantitative effect of diverse substituents on the aqueous basicity of two pharmacologically important families of compounds: aryl guanidines and 2-(arylimino)imidazolidines. This knowledge is useful to assist in the design of new biologically relevant guanidine-containing molecules with improved pharmacokinetic properties. 22 Another objective is to compare the experimental pKa values of these series with those obtained by ab initio calculations using the AIBLHiCoS method. 25,26 The Ab Initio Bond Lengths High Correlation Subsets (AIBLHiCoS) protocol works on the basis that, for a series of electronically congeneric compounds, chemical space may be partitioned according to a linear free energy relationship (LFER) between the calculated gas-phase equilibrium bond lengths and aqueous pKa values. The equations resulting from highly correlated, local linear relationships for structurally similar molecules are then used as predictive models for calculating the pKa values of other compounds belonging to the appropriate High Correlation Subset (HiCoS). In this work, using a specific bond length and tautomeric form in the neutral state, we have constructed one predictive model with a training set of 13 guanidines and a second model with a training set of 23 2-(phenylimino)imidazolidines. 28,29 However, this method has not been tested, until now, with the medically relevant 2-(phenylimino)imidazolidine scaffold. The predictive models built with these series show that the AIBLHiCoS method is an accurate pKa prediction tool for this pharmacologically important family of compounds.
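The bond length vs. pKa relationship described above amounts to an ordinary least-squares line fitted on a training set and then evaluated for a new compound. The following is a minimal sketch of that idea only; the function names `fit_lfer` and `predict_pka` are hypothetical, and the bond lengths and pKa values are synthetic placeholders, not data from this work.

```python
def fit_lfer(bond_lengths, pka_values):
    """Fit pKa = slope * r + intercept by least squares; return (slope, intercept, r2)."""
    n = len(bond_lengths)
    mean_r = sum(bond_lengths) / n
    mean_pka = sum(pka_values) / n
    s_rr = sum((r - mean_r) ** 2 for r in bond_lengths)
    s_rp = sum((r - mean_r) * (p - mean_pka)
               for r, p in zip(bond_lengths, pka_values))
    slope = s_rp / s_rr
    intercept = mean_pka - slope * mean_r
    ss_res = sum((p - (slope * r + intercept)) ** 2
                 for r, p in zip(bond_lengths, pka_values))
    ss_tot = sum((p - mean_pka) ** 2 for p in pka_values)
    return slope, intercept, 1.0 - ss_res / ss_tot


def predict_pka(bond_length, slope, intercept):
    """Insert a calculated bond length (angstrom) into the fitted linear equation."""
    return slope * bond_length + intercept


# Synthetic placeholder training set (NOT values from the paper):
# longer bond -> higher pKa, as in a positive-slope LFER.
train_r = [1.380, 1.385, 1.390, 1.395, 1.400]
train_pka = [7.0, 8.0, 9.0, 10.0, 11.0]

slope, intercept, r2 = fit_lfer(train_r, train_pka)
print(predict_pka(1.3925, slope, intercept), r2)
```

In practice one such line would be fitted per candidate bond and tautomer, and only compounds belonging to the same High Correlation Subset would be predicted with it.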
It is known that low aqueous solubility and chromophore strength can affect the accuracy of experimental pKa determination for certain bases. 34 As all the compounds with greater discrepancies between UV-metric and pH-metric pKa values have weak chromophores, this could explain the lack of accuracy with these compounds. In contrast, if we take the nitro derivatives 11a-15a as an example of compounds with a strong chromophore and a smooth transition from the protonated (λmax ≈ 298 nm at pH 3) to the neutral form (λmax ≈ 364 nm at pH 12), we observe an excellent correlation between the UV-metric and pH-metric values (Table 1).
The pKa values calculated using the AIBLHiCoS methodology correlated well (r² = 0.965) with the experimental results obtained using pH-metric titrations (see Fig. 2B), with deviations <0.30 pKa units for the phenyl-imidazolidine derivatives. However, some deterioration was observed for pyridinyl compounds (up to 0.58), and two compounds containing nitro groups, 13a (−0.79) and 15a (−0.62), were also poorly predicted; these are discussed in the next section. The mean absolute error (MAE) was 0.23, with a standard deviation of 0.22 for the error values, using the model that returned the highest validation statistics during calibration (C-N(im), Fig. S1, ESI†). This may be compared with the MAE for ChemAxon predictions using the imino tautomer, which was 0.77 with a standard deviation of 0.47 (Table S1 in the ESI†). The predictive power for the guanidine set of compounds was slightly lower, with an MAE of 0.29 and most deviations ≤0.24 pKa units, except for compounds 17b (−0.55) and 24b (+0.70) and one very high error for 22b (+1.05), which will also be discussed in the next section (Table S2, ESI†).

Validation of the AIBLHiCoS prediction method applied to arylguanidines and 2-(arylimino)imidazolidines
The bond length vs. pKa model used for most of the imidazolidine predictions in this work was chosen on the basis of its superior r², leave-one-seventh-out q² and Root-Mean-Squared Error of Estimation (RMSEE) values compared to a number of other candidate bond length models. For the 22 compounds of the training set (listed 1i-22i in Table S3, ESI†), the highest validation statistics correspond to a model constructed using a C-N bond length of the imidazolidine ring, with compounds represented as the imino tautomer ''T3''. This model is labelled C-N(im) in Fig. S1 (ESI†) and will be referred to as such throughout the following discussion. The second best validation statistics were observed for the N-C(Ph) bond, which is also labelled as such in Fig. S1 (ESI†). Predictions using both models are listed in Table 1.
For guanidines, superior validation statistics for the training set (listed g1-g13 in Table S4, ESI†) were obtained using the C=N bond length with compounds represented as amino tautomer ''A'', as illustrated in Fig. S2 (ESI†). All predictions listed in Table 1 and Table S2 (ESI†) were obtained using this model.
A high degree of prediction accuracy in modern terms is considered to be within 1 pKa unit per prediction, with a mean absolute error (MAE) for a test set of less than 0.5 units. 35 As the training sets consist of pKa values taken from various sources, and therefore contain measurements obtained using different techniques, noise within the experimental datasets is to be expected. Small variations in conditions, such as solvent purity, sample purity and laboratory temperature, can also affect the measured values. By the above criteria, the error statistics for both test sets show that both the guanidine and imidazolidine AIBLHiCoS models perform well. Only one error exceeds 1 unit, and the MAE values for both test sets are well below 0.5 (0.23 for imidazolidines and 0.29 for guanidines). For predictions made using the pKa predictor plug-in within the Marvin Suite by ChemAxon, 36 with the imino tautomer as the input structure (Tables S1 and S2, ESI†), nine prediction errors exceed 1.0 pKa units and the MAE values are in excess of 0.5 units for both imidazolidines and guanidines (0.77 and 0.54, respectively). AIBLHiCoS therefore finds its place here as a useful tool in instances where empirical tools with a wider applicability radius struggle, as is the case for these guanidine-type ionisable groups.
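The two accuracy criteria stated above (every error within 1 pKa unit, test-set MAE below 0.5) are easy to check programmatically. The sketch below is illustrative only: the function name `error_summary` and its thresholds as keyword arguments are our own choices, and the example input is the set of five signed errors quoted in the text for 11a-15a under the N-C(Ph) model.

```python
def error_summary(errors, per_compound_limit=1.0, mae_limit=0.5):
    """Summarise signed prediction errors (predicted - experimental pKa)."""
    abs_errors = [abs(e) for e in errors]
    mae = sum(abs_errors) / len(abs_errors)
    n_over = sum(1 for a in abs_errors if a > per_compound_limit)
    return {
        "mae": mae,                      # mean absolute error
        "max_abs_error": max(abs_errors),
        "n_over_limit": n_over,          # predictions off by more than the limit
        "meets_criteria": mae < mae_limit and n_over == 0,
    }


# Signed errors quoted in the text for 11a-15a with the N-C(Ph) model.
errors_11a_15a = [-0.26, 0.19, 0.06, -0.09, 0.09]
summary = error_summary(errors_11a_15a)
print(summary)
```

The same function can be applied to a full test set once the per-compound errors from Table 1 or the ESI tables are typed in.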
However, two of the largest prediction errors of the imidazolidine test set correspond to 13a and 15a, 2-(3-chloro-4-nitrophenylamino)- and 2-(3-fluoro-4-nitrophenylamino)imidazolidine. The errors with respect to the pH-metric measurements are −0.79 and −0.62, and with respect to the UV-metric values they are −0.94 and −0.58. The resonance-induced strong electron-withdrawing effect of nitro groups is known to perturb active bond lengths to such an extent that compounds containing these groups often form their own separate HiCoS. 28,29 Contrary to the previous results, the partitioning of the imidazolidine set into more structurally specific subsets may not be defined in this case by the presence of a nitro group at the para-position. This is explained by the fact that compound 7i, 2-(2,6-Cl2,4-NO2-phenylimino)imidazolidine, is present within the training set used to form the predictive HiCoS, and 11a, 12a and 14a of the test set fall in the expected region of the trend line for the LFER, as illustrated by their low prediction errors. According to the LFER exhibited between the C-N(im) bond length and pKa, a shorter equilibrium bond length simply corresponds to a lower pKa. In the case of 13a and 15a, we therefore observe an anomalously short C-N(im) bond, resulting in lower pKa values than we would expect according to the LFER exhibited by most other congeners.
Looking to the optimised structures of 13a and 15a to explain these short bond lengths reveals the presence of an interaction between the O⁻ atom of the p-nitro group and the m-Cl (13a) and m-F (15a) atoms. As a result of this interaction, the nitro group adopts a geometry in which it is no longer co-planar with the benzene ring. It may be asserted that this substituent interaction and the associated deviation in nitro geometry, which are absent in all remaining p-nitro compounds, may be the cause of the anomalously short C-N(im) bond lengths.
The internal and external validation statistics for the chosen model (C-N(im)) are superior to those of any other bond length vs. pKa plot (r² = 0.97, q² = 0.96 and RMSEE = 0.23). The next best model according to these validation statistics is observed for the same imidazolidine training set with compounds in the imino tautomeric form, but for the C-N bond length connecting benzene to 2-iminoimidazolidine (marked as N-C(Ph) within the structure labelled ''T3'' in Fig. S1, ESI†). This model has an r² value of 0.96, a q² value of 0.96 and an RMSEE value of 0.25. When the linear equation for this alternative model is used to predict the whole test set 1a-23a, the resultant MAE is significantly worse than that of the original model we chose to implement based on its superior statistics (0.57 vs. 0.23).
For the 3-pyridyl compounds synthesized in this work, 16a-18a, we observe errors of +0.51, +0.58 and +0.30 with respect to the pH-metric measurements and −0.51, +0.38 and +0.28 with respect to the UV-metric values. Despite the increase in MAE when the whole set is considered, use of the alternative N-C(Ph) model reduces the errors for 16a and 17a to −0.31 and +0.17. This improvement in the predicted values can also be seen for the problematic nitro compounds, 13a and 15a, for which the errors decrease dramatically when predicted using the N-C(Ph) model (−0.26, +0.19, +0.06, −0.09 and +0.09 for 11a-15a).
Table 1 footnotes (continued): g The pKa value of the pyridin-2-yl nitrogen could not be determined. h Literature: pKa = 10.77. 33
We now briefly introduce the Interacting Quantum Atoms (IQA) energy partitioning scheme, which is formulated within the Quantum Theory of Atoms in Molecules (QTAIM). 37,38 By partitioning the total energy of a system into intra- and interatomic terms, it has been shown that we may derive a quantitative measure of covalent-like interactions between atoms in molecules. This comes in the form of the exchange-correlation potential energy $V_{\mathrm{xc}}^{AB}$, which is the sum of the exchange energy $V_{\mathrm{x}}^{AB}$ and the correlation energy $V_{\mathrm{c}}^{AB}$. The former term dominates $V_{\mathrm{xc}}^{AB}$ and expresses the Fock-Dirac exchange, a consequence of the Pauli Principle, which describes the ever reducing probability of finding two electrons of the same spin as they approach each other (i.e. the Fermi hole). The latter term, $V_{\mathrm{c}}^{AB}$, is associated with the Coulomb hole, which corresponds to the electrostatic repulsion between electrons. The absolute value of $V_{\mathrm{xc}}^{AB}$ between two atoms can be taken as the extent of delocalization of electrons between them, a factor that also determines the bond distance.
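For orientation, the partitioning described above can be written compactly as follows. This is the usual textbook form of the IQA decomposition rather than an equation taken from this work; the symbols $E_{\mathrm{intra}}^{A}$, $E_{\mathrm{inter}}^{AB}$ and $V_{\mathrm{cl}}^{AB}$ (the classical electrostatic term) are introduced here for illustration only.

```latex
E \;=\; \sum_{A} E_{\mathrm{intra}}^{A}
   \;+\; \frac{1}{2}\sum_{A}\sum_{B \neq A} E_{\mathrm{inter}}^{AB},
\qquad
E_{\mathrm{inter}}^{AB} \;=\; V_{\mathrm{cl}}^{AB} + V_{\mathrm{xc}}^{AB},
\qquad
V_{\mathrm{xc}}^{AB} \;=\; V_{\mathrm{x}}^{AB} + V_{\mathrm{c}}^{AB}
```

Only the last relation is used in the discussion: $|V_{\mathrm{xc}}^{AB}|$ for a bonded pair such as C and N serves as the measure of electron delocalization that is compared with the bond length.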
For the training set compounds, the $V_{\mathrm{xc}}^{\mathrm{CN}}$ values (Table S5, ESI†) for the bonded atoms of C-N(im) and N-C(Ph) have r² values of 0.990 and 0.992 with the corresponding bond lengths (Fig. S3(a) and (d), ESI†). However, the analogous plots for other bonds (marked ''b'' and ''c'' in Fig. S1 and (b) and (c) in Fig. S3, ESI†) have weaker correlations (r² = 0.939 and 0.897). Therefore, we can conclude that the bond lengths that correlate most highly with aqueous pKa are those that also most accurately reflect the extent of delocalization between the two bonded atoms. Including the $V_{\mathrm{xc}}^{\mathrm{CN}}$ vs. bond length points for our outliers, 13a, 15a, 16a and 17a, shows that they do not diverge substantially from the highly correlated lines of best fit for either C-N(im) or N-C(Ph). Correspondingly, these four compounds remain outliers in the plot of $V_{\mathrm{xc}}^{\mathrm{CN}}$ values vs. pKa for C-N(im) and, as observed for the bond lengths, fall within the expected region of the plot of $V_{\mathrm{xc}}^{\mathrm{CN}}$ values vs. pKa for N-C(Ph) (Fig. S4, ESI†).
On the basis of previous work, we suspect that for compounds with m-halogen and p-nitro substituents (13a and 15a), the halo-nitro interaction perturbs the charge distribution of the guanidine fragment to such a degree that the relationship between C-N(im) and pKa no longer follows the general trend observed for most other congeners. If more pKa data were available, we could form a new HiCoS consisting only of compounds of this type. Currently, in the absence of an adequate quantity of pKa data, a separate predictive HiCoS using the C-N(im) bond lengths cannot be constructed. Given that the N-C(Ph) bond lengths and corresponding $V_{\mathrm{xc}}^{\mathrm{CN}}$ values do not diverge from the general trend as a result of the halo-nitro interaction, the N-C(Ph) model should be used to predict compounds of this type. For two of the three 3-pyridyl compounds (16a-18a), an explanation for their >0.5 prediction errors is not immediately accessible in terms of their structure, because one 3-pyridyl-containing compound, 18a, has a relatively low error of +0.30 when predicted using the C-N(im) model. We can therefore assert that the presence of the heteroatom at the 3-position is not the sole cause of the higher pKa values we predict, based on their longer C-N(im) bond lengths. Once again, in the absence of an adequate amount of experimental data for like compounds at the time of writing, we cannot prove definitively that these compounds would form their own subset. We therefore suggest that future predictions for all 3-pyridyl compounds be made using the N-C(Ph) model equation.
2.3.2. Guanidines. For the guanidine test set, the only AIBLHiCoS prediction error above 1 unit corresponds to 22b, 1-(5-chloro-2-pyridinyl)guanidine. Our model gives an error of +1.05 units whereas ChemAxon gives an error of −0.89. There are no pyridinyl-type compounds within the training set, but two 3-pyridinyls, 17b and 18b, return respectable errors of +0.55 and −0.24 using the C=N model. pKa data for 1-(2-pyridinyl)guanidine and 1-(6-methyl-2-pyridinyl)guanidine are available, and when the bond lengths of their optimised structures are included in our C=N vs. pKa plot using tautomer A (shown in Fig. S2 of the ESI†, bond labelled ii), they also appear below the line of best fit, near the data point for 22b. When predicted using the model equation, errors of +1.51 and +1.22 units are observed. It appears that the presence of the heteroatom at the 2-position of the aryl group causes further partitioning of the set once again. A new HiCoS cannot be constructed using only two data points, and so a predictive equation that would better describe the LFER for these compounds cannot be formulated until more pKa data become available.
Boltzmann weighting of tautomers according to either internal or Gibbs energies at the level of theory used to obtain the bond lengths for the AIBLHiCoS protocol reveals a general prevalence of the ''imino'' tautomer for both sets of guanidines and imidazolidines. This tautomer is characterised by a C=N double bond between the central carbon atom of the Y-shaped guanidine fragment and the nitrogen adjacent to the phenyl group. For imidazolidines, the optimal model emerges from one of the C-N bond lengths of the 5-membered ring of this most stable tautomer. The emergence of the optimal model from the most stable tautomer is therefore congruent with it being the dominant species in solution, as confirmed by the energetics. In terms of the ChemAxon predictions, using the most stable tautomer as an input structure allows for better prediction accuracy for both the imidazolidines and guanidines. However, for AIBLHiCoS the best internal validation statistics for guanidines are obtained using equilibrium bond lengths from a specific conformation of a higher energy tautomeric form, which is not in agreement with the energy rankings. Regardless, these results corroborate those of our previous work, 27 which states that in the case of guanidines conformational commonality takes precedence over stability; that is to say, as long as the compound of interest containing the guanidine group is optimised from this specific form, a very high degree of prediction accuracy may be obtained.
Fig. 3 Substituent effects on the pKa of 2-(arylimino)imidazolidines and arylguanidines. The pKa values were determined using potentiometric titrations at 25 °C. ΔpKa is the difference in pKa of the substituted compound with respect to the unsubstituted compound (1a, 1b, 19a, and 16a, respectively).
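The Boltzmann weighting of tautomers mentioned in this section can be sketched as follows. This is a minimal illustration, assuming relative energies in kcal/mol and a temperature of 298.15 K; the function name `boltzmann_populations` and the three example energies are hypothetical and are not results from this work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def boltzmann_populations(rel_energies_kcal, temperature=298.15):
    """Return fractional populations for species with the given relative energies."""
    rt = R_KCAL * temperature
    e_min = min(rel_energies_kcal)  # shift so the lowest energy is zero
    weights = [math.exp(-(e - e_min) / rt) for e in rel_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical relative energies (kcal/mol) for three tautomers T1, T2, T3.
populations = boltzmann_populations([2.0, 1.0, 0.0])
for label, p in zip(["T1", "T2", "T3"], populations):
    print(f"{label}: {100 * p:.1f}%")
```

Either internal or Gibbs energies can be supplied; only energy differences matter, so the absolute reference cancels.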

Quantitative effect of aryl substituents on the pKa of aryl guanidines and 2-(arylimino)imidazolidines
According to a survey by Manallack on the pKa distribution of drugs, 24 no CNS drug has a pKa above 10.5. Thus, phenylguanidine (1b: pKa = 10.90), the reference compound in our study, is unlikely to have good CNS activity. Replacing the guanidine group of 1b by a 2-iminoimidazolidine group (1a: pKa = 10.24) leads to a drop in the basicity of the molecule (ΔpKa = −0.66). Changing the phenyl ring of 1a for a 2-pyridinyl (19a) or a 3-pyridinyl (16a) group lowers the pKa considerably (ΔpKa = −0.79 and −1.25, respectively). The effect of electron-donating and electron-withdrawing groups on the pKa of 2-(arylimino)imidazolidines and arylguanidines is summarized in Fig. 3 and Table S6 (ESI†). As expected, these effects depend mainly on the capacity of these substituents to donate electrons to or withdraw electrons from the aryl ring, resulting in higher or lower electron density on the 2-imino nitrogen, respectively.
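To put these pKa values in context, the fraction of a monoprotic base that is protonated at a given pH follows from the Henderson-Hasselbalch relationship. The sketch below assumes a physiological pH of 7.4 (our assumption, not a value stated in the text) and uses the pKa values quoted above for 1b and 1a; the function name is hypothetical.

```python
def fraction_protonated(pka, ph=7.4):
    """Fraction of a monoprotic base present as its conjugate acid BH+ at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pKa values quoted in the text for phenylguanidine (1b) and its
# 2-iminoimidazolidine analogue (1a); pH 7.4 is an assumed value.
for name, pka in [("1b", 10.90), ("1a", 10.24)]:
    print(name, f"{100 * fraction_protonated(pka):.3f}% protonated")
```

Both compounds come out more than 99.8% protonated under this assumption, which illustrates why even a drop of 0.66 pKa units leaves the neutral, membrane-permeable form as a very small minority.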
A linear relationship was found between the pKa values and the Hammett σ values of meta and para substituents, as shown in Fig. 4. 39,40 For para-amino substituents conjugated with the imidazolidine nitrogen (2a, 2b, 24b), σp⁻ values were used to obtain a good correlation. In contrast, compounds with an amide group (9a) in the para position correlated poorly with the Hammett constant (red points in Fig. 4). Amino substituents (18a, 18b, 24b) in the para-position to the guanidine/imidazoline group yield the strongest pKa increase (ΔpKa = +1.11 to 1.34), with the exception of the aminophenyl group (2a, 2b), the electron-donating capacity of which is much reduced by the presence of the phenyl ring (ΔpKa = +0.25 to 0.46). As expected, the pKa of the pyridine nitrogen is influenced to a larger extent by the presence of amino and alkyl electron-donating substituents (ΔpKa = +1.31 and +1.92 for 17a and 18a, respectively). Alkoxy substituents (8a, 26b, 27b) have a smaller effect on the pKa increase (ΔpKa = +0.06 to 0. […] These values are consistent with the pKa decrements measured for a series of chloro- and fluoro-substituted bis-2-(arylimino)imidazolidines reported previously. 22 Timmermans et al. have shown that the presence of two chlorine atoms in the positions ortho to the 2-imino nitrogen reduces the pKa value of the molecule by 1.66 units (e.g. 4-nitroclonidine: pKa = 6.86 at 20 °C). 41 According to the pKa values reported by Rozas and co-workers for a series of symmetric and unsymmetrical bisguanidines and bis-2-(phenylimino)imidazolidines, 42,43

Conclusions
The arylguanidines and 2-(arylimino)imidazolidines whose pKa values are reported here belong to a pharmacologically important class of molecules with promising applications as centrally active α2-AR modulators. As the ionisation constant is a key physicochemical parameter that governs the membrane permeability of drugs (e.g. the BBB permeability in the case of CNS active compounds), the data on substituent effects on the basicity of arylguanidines and 2-(arylimino)imidazolidines can be useful for the design of new molecules of this class.
The dissociation constants of 23 arylguanidines and 12 2-(arylimino)imidazolidines measured using the spectrophotometric method were consistent with those measured using potentiometric titrations, further validating the 96-well microtitre plate method 32 as a useful tool for the medium-throughput determination of the pKa of series of molecules.
Finally, we have shown that the AIBLHiCoS method is useful for predicting the pKa values of aryl guanidines and 2-(arylimino)imidazolidines with low RMSD values, using only a single bond length obtained at a low level of theory. Compound 12a is an excellent example of the predictive power of the AIBLHiCoS model. The initial (pH-metric) pKa value we were working with was 8.13, meaning that our prediction of 7.28 was 0.85 units out, whilst for 14a, the 2-fluoro analogue, the prediction error was only 0.19. Re-measurement of the pKa of 12a then revealed the first value to be erroneous, for reasons that are not clear; the new pKa value was 7.29, only 0.01 units from the prediction. This value was also in excellent agreement with the UV-metric pKa value (7.33).

Chemistry
Melting points were measured in open capillary tubes using a Stuart Scientific SMP3 apparatus or a Mettler Toledo MP70 melting point system and are uncorrected. LC-MS spectra were recorded on a WATERS apparatus integrated with an HPLC separation module (2695), a PDA detector (2996) and a Micromass ZQ spectrometer. Analytical HPLC was performed with a SunFire C18 3.5 μm column (4.6 mm × 50 mm). Mobile phase A: CH3CN + 0.08% formic acid and B: H2O + 0.05% formic acid. UV detection was carried out over 190 to 440 nm. Accurate mass was measured with an Agilent Technologies Q-TOF 6520 spectrometer using electrospray ionisation. ¹H NMR and ¹³C NMR spectra were recorded on Bruker Avance-300, Varian Inova-400, and Varian-system-500 spectrometers. Chemical shifts of the ¹H NMR spectra were referenced to the residual proton resonance of the deuterated solvents: D2O (δ 4.79) and CD3OD (δ 3.31). Chemical shifts of the ¹³C NMR spectra were internally referenced for D2O and CD3OD (δ 49.0). Coupling constants J are given in hertz (Hz).
N-Phenyl-4,5-dihydro-1H-imidazol-2-amine acetate salt (1a). Yellowish gum (88%). HPLC (UV) >92%.
Potentiometric pKa determination. Titrations were carried out at 25 ± 0.5 °C in 0.15 M aqueous KCl solution under a nitrogen atmosphere using a SiriusT3 apparatus (Sirius Analytical Instruments Ltd, East Sussex, UK) equipped with an Ag/AgCl double junction reference pH electrode and a turbidity sensor. Standardised 0.5 M KOH and 0.5 M HCl were used as titration reagents. The KOH solution was standardized against potassium acid phthalate. The pKa values are the mean of 3 titrations ± SD unless otherwise noted.
Spectrophotometric pKa determination. The UV spectra were recorded at 30 °C (i.e. room temperature) with a Thermo Multiskan spectrum apparatus using the 96-well microtiter plate method as described, 32 with the following modifications: (1) stock solutions of the compounds were prepared in DMSO at C = 5 mM to ensure that the maximum absorbance of the compound was below 1.5 AU (i.e. the final concentration of the compound in the well was 0.1 mM). (2) The buffer solutions of constant ionic strength (0.1 M KCl) were prepared according to http://www.biomol.net/en/tools/buffercalculator.htm: (a) 25 […] The raw UV data were processed using the Excel program (see the Excel sheet template in the ESI†) and the pKa values were determined by linear regression from the total absorbance vs. pH curve as reported. 32 The pKa values are the mean of 3 or 4 experiments (expressed as Ka values) ± SD.
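The data treatment itself is contained in the Excel template of the ESI and in ref. 32, so the following is only a hedged sketch of one common way to extract a pKa by linear regression from an absorbance vs. pH curve, and of averaging replicate results in Ka space. The linearisation log10[(A − A_B)/(A_BH − A)] = pKa − pH, the reading of ''expressed as Ka values'' as averaging Ka rather than pKa, all function names, and the simulated absorbances are our assumptions, not the procedure or data of this work.

```python
import math
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def pka_from_absorbance(ph, absorbance, a_acid, a_base):
    """Estimate pKa from the linearised curve log10((A - A_B)/(A_BH - A)) = pKa - pH.

    a_acid and a_base are the plateau absorbances of BH+ and B at one wavelength.
    """
    xs, ys = [], []
    for p, a in zip(ph, absorbance):
        num, den = a - a_base, a_acid - a
        if num * den <= 0:  # point on or beyond a plateau: log undefined, skip
            continue
        xs.append(p)
        ys.append(math.log10(num / den))
    slope, intercept = fit_line(xs, ys)
    return -intercept / slope  # pH at which the log ratio is zero


def mean_pka_via_ka(pka_values):
    """Average replicate results as Ka values, then convert back to pKa."""
    ka_mean = mean(10.0 ** (-p) for p in pka_values)
    return -math.log10(ka_mean)


def simulate_absorbance(ph, pka, a_acid, a_base):
    """Ideal two-state absorbance, used here only to generate demo data."""
    f_acid = 1.0 / (1.0 + 10.0 ** (ph - pka))
    return a_acid * f_acid + a_base * (1.0 - f_acid)


# Simulated (not measured) curve for a base with pKa 8.0.
ph_grid = [7.0 + 0.25 * i for i in range(9)]
absorbances = [simulate_absorbance(p, 8.0, 1.0, 0.2) for p in ph_grid]
print(pka_from_absorbance(ph_grid, absorbances, 1.0, 0.2))
```

With real plate data, the plateau absorbances would be taken from the most acidic and most basic buffers, and wells close to either plateau are best excluded because the logarithm amplifies their noise.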

Gas-phase ab initio calculations
The initial structures of each member of the respective training sets (guanidines and 2-(phenylamino)imidazolidines) were built in each available tautomeric form using the program GaussView. This equates to three tautomers for the imidazolidines, labelled T1-T3 in Fig. S1 (ESI†), and five ''tautomers'' for the guanidines, labelled A-E in Fig. S2 (ESI†). For compounds with higher degrees of conformational freedom, several starting structures were generated using the ''Conformers'' plug-in within Marvin Sketch by ChemAxon. 35 Geometry optimization in the gas phase was then performed for each conformer of each compound in each of the tautomeric forms using the GAUSSIAN09 program. 44 Calculations were initially carried out at the B3LYP/6-31G(d,p) level, but the basis set was changed to 6-311G(d,p) for reasons explained below. For biguanide derivatives included in the guanidine training set, various tautomeric forms of the guanidine fragment present within the R group were considered, whilst maintaining the specific tautomeric form of the guanidine fragment under consideration. Frequency calculations were carried out on the optimised structures, again using GAUSSIAN09, at the same level of theory. The output files were then inspected to confirm the absence of negative eigenvalues after diagonalization of the Hessian matrix, so that the geometries are confirmed as true energy minima. Where a number of starting conformers of a tautomer were generated, the most stable conformer was chosen by comparing the total energies of the optimised structures. At this point we noticed that the chosen basis set was causing an issue in ranking conformer stability between 2-(2-halogen-phenylimino)imidazolidine geometries with and without intramolecular hydrogen bonds (IHBs). These IHBs exist between an N-H group on imidazolidine and the o-halogen atom on benzene, and are ubiquitous throughout the most stable conformations of all compounds containing this substituent for tautomer T3, with the exception of 2-(2,4-Cl-phenylimino)- and 2-(2,5-Cl-phenylimino)imidazolidine. This observation prompted the re-optimisation of all T3 training set compounds at the B3LYP/6-311G(d,p) level of theory. Using the valence triple zeta basis set revealed a consistent picture, in which the presence of an IHB is a stabilising feature for all compounds. At this point all other calculations for T1 and T2 and for tautomers A-E of the guanidine training set were also repeated for consistency. All analysis corresponds to the results of the calculations using the 6-311G(d,p) basis set.
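The selection rule described above (discard any optimised geometry that is not a true minimum, then keep the lowest-energy conformer) can be sketched as follows. The data structure, field names and numbers are hypothetical; in practice the energies and frequencies would be read from the Gaussian output files, where imaginary frequencies are printed as negative values.

```python
def most_stable_minimum(conformers):
    """Return the lowest-energy conformer that has no imaginary frequencies.

    Each conformer is a dict with 'name', 'energy' (hartree) and
    'frequencies' (cm^-1; imaginary modes given as negative numbers).
    """
    minima = [c for c in conformers if min(c["frequencies"]) > 0.0]
    if not minima:
        raise ValueError("no optimised structure is a true energy minimum")
    return min(minima, key=lambda c: c["energy"])


# Hypothetical optimised conformers of one tautomer (values are placeholders).
conformers = [
    {"name": "conf_1", "energy": -1000.1234, "frequencies": [35.2, 80.1, 120.4]},
    {"name": "conf_2", "energy": -1000.1301, "frequencies": [-45.0, 60.3, 110.9]},
    {"name": "conf_3", "energy": -1000.1288, "frequencies": [28.7, 75.5, 131.0]},
]

best = most_stable_minimum(conformers)
print(best["name"])
```

Here conf_2 has the lowest energy but is rejected because of its imaginary mode, so the choice falls to the next lowest true minimum.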
Seven bond lengths were extracted from each imidazolidine compound, corresponding to each of the three CN bonds of the guanidine fragment, in addition to the N-C(Ph) bond connecting imidazolidine to benzene and the N-C, C-C and C-N bonds of the imidazolidine ring. These are labelled a-g in Fig. S1 of the ESI†. Five bond lengths from within the guanidine moiety were also extracted, which correspond to the two CN single bonds and one CN double bond of the guanidine group, the N-H bond attached to the imine nitrogen in A, B, D and E, and one N-H bond of a primary amine group. These are labelled i-v in Fig. S2 (ESI†). The five bond lengths i-v within tautomers A-E were then regressed against the pKa values for the set of guanidines. The seven bond lengths a-g of the three tautomers T1-T3 were also regressed against their pKa values for the set of 2-(phenylimino)imidazolidines. The squared correlation coefficient (r²), Root-Mean-Squared Error of Estimation (RMSEE) and leave-one-seventh-out q² values were obtained using the program SIMCA-P 10.0. 45 By a comparison of internal and external validation metrics obtained for each bond length model, an optimal linear equation was chosen. For guanidines, the model was constructed using the C=N bond lengths of training set compounds as tautomer A, labelled ii in Fig. S2 (ESI†). For imidazolidines, the model was constructed using an endocyclic C-N bond length of the imino tautomer T3, labelled ''a'' in Fig. S1 (ESI†). The predictions for test set compounds 1a-23a (2-(phenylimino)imidazolidines) and 1b-27b (aryl guanidines) were obtained by energy minimization (via a conformational search using Marvin followed by geometry optimisation), frequency calculations, and insertion of the appropriate bond length into the linear equation for the optimal bond length vs. pKa model. The test compounds were constructed in the same tautomeric form as those of the training set used to construct the model.
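The leave-one-seventh-out q² used above to rank the bond length models can be approximated with a short cross-validation loop. This is a sketch under stated assumptions, not a reproduction of SIMCA-P: we assume seven cross-validation groups assigned by compound index modulo 7 and q² = 1 − PRESS/SS with SS taken about the overall mean; the function names and the synthetic data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def q2_leave_one_seventh_out(x, y, n_groups=7):
    """Cross-validated q2 = 1 - PRESS / SS_tot with n_groups held-out groups."""
    n = len(x)
    press = 0.0
    for g in range(n_groups):
        train = [i for i in range(n) if i % n_groups != g]
        held_out = [i for i in range(n) if i % n_groups == g]
        if not held_out:
            continue
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        # Accumulate squared errors for compounds the model has not seen.
        press += sum((y[i] - (slope * x[i] + intercept)) ** 2 for i in held_out)
    y_mean = sum(y) / n
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Synthetic bond lengths (angstrom) and pKa values, for illustration only.
bond_lengths = [1.360 + 0.002 * i for i in range(14)]
pka_exact = [150.0 * r - 195.0 for r in bond_lengths]
pka_noisy = [p + (0.2 if i % 2 == 0 else -0.2) for i, p in enumerate(pka_exact)]

print(q2_leave_one_seventh_out(bond_lengths, pka_exact))
print(q2_leave_one_seventh_out(bond_lengths, pka_noisy))
```

Running the same function over each of the candidate bonds (a-g or i-v) for each tautomer and comparing the resulting q² values alongside r² mirrors the model-selection step described in the text.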
The program AIMAll (version 14.04.17

Fig. 1 Resonance forms of the guanidinium and 2-iminoimidazolidinium cations responsible for the high basicity of these functional groups. Structure of the archetypal 2-(arylimino)imidazolidine drug clonidine.

a Calculated using the AIBLHiCoS approach with the C-N(im) bond model of the imino tautomer.
b pH-metric determination at 25 °C using the SiriusT3 apparatus. Mean of 3 titrations ± SD.
c Determined by UV-spectrophotometry at 30 °C using the 96-well microtiter plate method. 32 Mean of ≥3 titrations ± SD.
d Bond length model built from the same imidazolidine training set with compounds in the imino tautomeric form, but with the N-C(Ph) bond length connecting 2-iminoimidazolidine to benzene.
e psKa measured with MeOH as a co-solvent (Yasuda-Shedlovsky extrapolation to H2O).
f pKa value of the pyridin-3-yl nitrogen.
the addition of a 4-imidazoline or 4-guanidine group on the phenyl substituent of 2a-b and 3a-b results in a pKa change of approximately −0.50 units with respect to 2a-b and 3a-b. The values of ΔpKa with respect to 1a-b are shown in Fig. 5.

Fig. 5 Substituent effects on the pKa of bis(2-(phenylimino)imidazolidines) and bisguanidines. Reported pKa values determined using potentiometric 42 or UV-spectrophotometric 43 titrations at 25 °C. ΔpKa is the difference in pKa of the 4-substituted compound with respect to 1a or 1b.

Table 1 pKa values of aryl/pyridinyl guanidines and iminoimidazolidines calculated using AIBLHiCoS and determined experimentally by potentiometric and spectrophotometric methods