Seyedeh Maryam
Salehi
a,
Silvan
Käser
a,
Kai
Töpfer
a,
Polydefkis
Diamantis
b,
Rolf
Pfister
c,
Peter
Hamm
c,
Ursula
Rothlisberger
b and
Markus
Meuwly
*a
aDepartment of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland. E-mail: m.meuwly@unibas.ch
bLaboratory of Computational Chemistry and Biochemistry, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
cDepartment of Chemistry, University of Zurich, Switzerland
First published on 5th October 2022
Halogenated groups are relevant in pharmaceutical applications and potentially useful spectroscopic probes for infrared spectroscopy. In this work, the structural dynamics and infrared spectroscopy of para-fluorophenol (F-PhOH) and phenol (PhOH) is investigated in the gas phase and in water using a combination of experiment and molecular dynamics (MD) simulations. The gas phase and solvent dynamics around F-PhOH and PhOH is characterized from atomistic simulations using empirical energy functions with point charges or multipoles for the electrostatics, Machine Learning (ML) based parametrizations and with full ab initio (QM) and mixed Quantum Mechanical/Molecular Mechanics (QM/MM) simulations with a particular focus on the CF- and OH-stretch region. The CF-stretch band is heavily mixed with other modes whereas the OH-stretch in solution displays a characteristic high-frequency peak around 3600 cm−1 most likely associated with the –OH group of PhOH and F-PhOH together with a characteristic progression below 3000 cm−1 due to coupling with water modes which is also reproduced by several of the simulations. Solvent and radial distribution functions indicate that the CF-site is largely hydrophobic except for simulations using point charges which renders them unsuited for correctly describing hydration and dynamics around fluorinated sites. The hydrophobic character of the CF-group is particularly relevant for applications in pharmaceutical chemistry with a focus on local hydration and interaction with the surrounding protein.
A halogen bond “[…]occurs when there is evidence of a net attractive interaction between an electrophilic region associated with a halogen atom in a molecular entity and a nucleophilic region in another, or the same, molecular entity.”16 Hence, halogen atoms act as electrophiles and can form an attractive interaction with a nucleophilic counterpart. Based on the analysis of the molecular surface electrostatic potential (ESP),17 the “halogen bond” was also associated with a “σ-hole bond”18 which is a noncovalent interaction between a covalently-bonded halogen atom X and a negative site, e.g. a lone pair of a Lewis base or an anion.17 Such a “bond” involves a region of positive electrostatic potential, i.e. the σ-hole, and as an extension of one of the covalent bonds to the atom. The σ-hole arises as a consequence of the anisotropy of the ESP around the halogen atom. The strengths of the interactions generally correlate well with the magnitudes of the positive and negative electrostatic potentials of the σ-hole and the negative site. As fluorine has the largest electronegativity and the lowest polarizability, for some time it was in fact assumed that there is no σ-hole and that therefore fluorine is not involved in halogen bonding at all.17,19,20 However it is now well established that it can have a positive σ-hole and form halogen bonds when it is linked to strongly electron-withdrawing groups including Cl, Br, and I.21,22 Moreover, there is also experimental evidence for fluorine engaging in halogen bonding.23
Introducing a fluorine atom into organic molecules can cause major changes in the physico-chemical properties such as solubility, chemical reactivity and biological activity compared to non-fluorinated analogues.24 In particular, fluorine often replaces hydrogen in organic molecules but the size and stereoelectronic influences of the two atoms (hydrogen vs. fluorine) are quite different albeit it is often regarded as isosteric substitution.4 In bio-inorganic and medicinal chemistry, the formation of intermolecular O–H/F–C and N–H/F–C hydrogen bridges was assumed to be important in binding fluorinated compounds to enzyme active sites.25 Such interactions affect enzyme ligand binding affinity, selectively coupled with the changes in pharmaco-kinetic properties by fluorine substitution.10,24 The effects of fluorine substitution on the related pharmaco-kinetic properties like lipophilicity, volatility, solubility, hydrogen bonding and steric effects affect the resulting compound binding, absorption, transport and hence the related biological activity.26 Finally, about 20% of the commercial pharmaceuticals contain fluorine27 which underline the practical relevance of fluorinated compounds and further motivate to consider a fluorinated species in the following.
A variety of functional groups, including C–H, C–OH, CO, and C
N, have utilized the C–F bond as a bioisostere.28 However, it is difficult to generalize the relative ability of fluorine to act similar to a hydrogen or hydroxy group, and different factors must be considered in each case. The van der Waals radius of fluorine (1.47 Å) lies between that of oxygen (1.57 Å) and hydrogen (1.2 Å) and as it is the element with the highest electronegativity, the C–F bond is almost identical to C–OH in terms of bond length and polarity. Despite its three electron pairs, the C–F bond interacts more weakly with the environment compared to an oxygen atom and is better described as “weakly polar” rather than “hydrogen bonding”.28,29 In pharmacological applications the replacement H → F is often considered to avoid metabolic transformation due to the high stability of the CF bond. Examples are drugs interacting with P450 for which fluorination has been widely used to block metabolic transformations.28
Given the different qualitative characterizations outlined so far, a more molecularly refined picture of the energetics and dynamics of fluorinated model compounds is warranted. For this, hydrated fluoro-phenol (F-PhOH) as a typical representative is considered. Using linear infrared (IR) spectroscopy together with computational characterizations at different levels of theory the structural dynamics and spectroscopy of F-PhOH is characterized. The computations use advanced empirical force fields including multipolar interactions, a machine-learned, neural network-based representation of the full-dimensional potential energy surface (NN-PES), mixed quantum mechanics/molecular mechanics and ab initio molecular dynamics (MD) simulation techniques. MD-based infrared spectra are well-established30,31 and provide a powerful approach to characterized the structural dynamics in solution, in particular when combined with experiments32–34 because the measured IR spectra can be directly compared with the computational results and can provide a structural interpretation of the spectroscopic features.
In the present work first the different energy functions employed are described and validated for F-PhOH in the gas phase. Next, results for F-PhOH in solution are presented. This is followed by a comparison of spectroscopy for F-PhOH and PhOH to quantitatively probe the consequences of the H → F replacement opposite the –OH group. Finally, the solvent structure around the two solutes is analyzed from radial distribution functions and from 2-dimensional solvent distributions.
Simulations for F-PhOH and PhOH were carried out in a cubic box of 303 Å3 (283 Å3 for simulations with the NN-PES, see below) using TIP3P42 water molecules. Minimization, heating, and equilibration procedures for 40 ps were employed to prepare the system at 300 K. This was followed by 5 ns production simulations in the NVE ensemble using the Velocity Verlet propagator.43 The time step was Δt = 1 fs and every fifth snapshot was recorded. Lennard-Jones interactions were computed with a 12 Å cutoff switched at 10 Å.44 The electrostatic interactions for the monopoles (point charges) are treated using Particle-Mesh Ewald45 (PME) with grid size spacing of 1 Å, characteristic reciprocal length κ = 0.32 Å−1, and interpolation order 6. All bonds involving hydrogen atoms are constrained via the SHAKE algorithm.46,47 Additional MD simulations were also carried out for PhOH in water with the same setup that was used for F-PhOH in order to directly compare their spectroscopy and solvent structure.
For the simulations with the NN-based PES (see below) the atomic simulation environment (ASE) was used.48 The van der Waals interactions were those from the CGenFF36 parametrization and the fluctuating charges are from the PhysNet representation, see below. For both terms interactions the cutoff distance is at 14 Å and switched between 13 to 14 Å. To avoid artifacts of the electrostatic Coulomb force in the cutoff range, the Coulomb force at the distance of 14 Å was shifted to zero in accordance to the shifted forces method.49 In the gas phase 1000 trajectories, each 200 ps in length, are run to obtain an ensemble average. The NVE simulations are run at 300 K initialized from random momenta corresponding to a Maxwell–Boltzmann distribution, with a time step of 0.5 fs, equilibrated for 50 ps and propagated for 200 ps. Simulations in solution NVT using Langevin50 thermostat at 300 K are performed for 20 trajectories of 100 ps each with a time step of 0.2 fs to obtain a total of 2 ns for PhOH and F-PhOH in solution, respectively. The IR spectra are then calculated from the dipole–dipole moment autocorrelation function51–53 and averaged over all 1000 trajectories.
![]() | (1) |
![]() | (2) |
From the FFCF, the decay time is determined by fitting the FFCF to a general expression56
![]() | (3) |
For both the gas phase and the condensed phase systems, the full QM simulation protocol consisted of (i) an equilibration of the system at 300 K first with Born–Oppenheimer (BO) MD and then with Car–Parrinello (CP) MD,59 and (ii) a production phase in the microcanonical (NVE) ensemble. The respective time steps for BO and CP MD were 10 and 2 atomic units (a.u.), respectively. In CP MD, the fictitious electron mass was equal to 400 a.u. In the production phase, frames were saved every 10 a.u., corresponding to a time interval of approximately 0.48 fs.
Density Functional Theory (DFT)-based ab initio MD simulations of F-PhOH in gas phase and in aqueous solution were carried out using the CPMD code60 using the BLYP functional for the exchange and correlation energies61,62 with the addition of Dispersion-Corrected Atom-Centered Potentials (DCACPs)63–65 for the description of dispersion forces. Norm-conserving Martins–Trouiller pseudopotentials66 were used in combination with a plane wave basis with a 175 Rydberg kinetic energy cutoff of for the expansion of the single-particle wavefunctions. The latter value was selected because it reproduces a converged equilibrium C–F bond distance at the BLYP-DCACP level of 1.35 Å for F-PhOH in gas phase, which is in good agreement with values obtained at the MP2/6-311++G(df,pd) (1.34 Å) and B3LYP/6-311++G(df,pd) (1.35 Å) levels, respectively.67
The systems were first equilibrated classically, using AMBER18.72 F-PhOH and PhOH were modelled with the GAFF2 force field,73,74 while the TIP3P model was used for water. Following an initial minimization, the two systems were equilibrated in the isothermal–isobaric (NPT) (300 K, 1 atm) ensemble with the Berendsen barostat75 and Langevin dynamics50 for pressure and temperature control respectively, followed by further equilibration simulations in the NVT ensemble with Langevin dynamics, for a total of 100 ns. A time step of 2 fs was employed. In view of the small periodic box size, a reduced real space cutoff of 7 Å was used for the nonbonded interactions.
Following this preparation, production simulations with CPMD in the NVE ensemble were carried for both systems, whereby the solute was treated at the QM level and the solvent at the classical (MM) level. The QM setup, and the simulation time step were the same as described above for the full QM simulations. The QM/MM MD simulation protocol was also similar to the one described for the full QM simulation, apart from the use of two separate Nosé–Hoover thermostats for the QM and MM parts respectively, during the equilibration with BO and CP MD. For the F-PhOH system, the equilibration and production runs lasted 10.1 ps and 35.9 ps respectively, while for the PhOH system they lasted 12.6 ps and 25.0 ps, respectively. During the production phase, frames were saved with the same frequency as in the full QM simulations (0.48 fs).
PhysNet was trained on ab initio energies, forces and dipole moments calculated at the MP2/6-31G(d,p) level of theory using Molpro78 according to the protocol reported in ref. 76. The reference data, containing different geometries for both molecules, is generated from MD simulations at 50, 300 and 1000 K using CHARMM force field (5000 geometries each yielding a total of 30000 geometries) and extended with geometries obtained from normal mode sampling79 at temperatures between 10 and 2000 K (6600 geometries for each molecule). The complete data set thus contains 43
200 PhOH and F-PhOH structures. The performance of the PhysNet PES is reported in Fig. S1 (ESI†) which shows the correlation between the reference MP2 and the PhysNet energies for a test set of 3700 randomly selected points with R2 = 0.9999 and an RMSE of 0.0037 eV. Simulations with PhysNet for the solute and an empirical water model are subsequently referred to as ML/MM MD.
Fig. 1 reports the infrared and CF-/OH-power spectra of F-PhOH from simulations with the experimental FT-IR spectrum67 from 1100 to 1400 cm−1 and between 2600 and 4000 cm−1, respectively. For clarity, the left hand column shows the low-frequency vibrations whereas the right hand column is for the –OH-stretch region. The experiments were carried out in CCl4 solvent and the spectral lines have a full width at half maximum of ∼10 cm−1. As an indication for the solvent-induced shift incurred, for PhOH in CCl4 the CO stretch is found at 1257 cm−1 which amounts to a red shift of ∼−5 cm−1 compared with the gas-phase frequency of 1261.7 cm−1.80,81 Hence, CCl4-induced shifts for F-PhOH are expected to be a few wavenumbers as well. To the best of our knowledge, no gas-phase spectra are available for F-PhOH. Hence, the frequencies for F-PhOH measured in CCl4 are used in lieu of gas phase data.
![]() | ||
Fig. 1 Comparison of experimental (in CCl4) and computed (gas phase) spectra for F-PhOH in the 1100–1400 cm−1 (left) and 2600–4000 cm−1 (right) frequency range. (panel A) Experimental (black) spectrum from ref. 67 extracted using g3data.82 The arrows in the right hand column refer to the experimental frequencies for the CH and OH stretches.67 (panels B and C) IR spectrum from the Fourier transform of the total dipole moment correlation function and CF/OH power spectra using PC (blue) and MTP (red) for F-PhOH. (panel D) Global (solid) and CF/OH (dotted) power spectrum in violet from QM simulations. (panel E) IR spectrum (maroon) from ML/MM MD simulations. |
The measured CF and CO stretching modes occur mainly at 1226 and 1262 cm−1, respectively, while they couple to one another and potentially to other modes. According to the analysis67,81 (see Table 1), the CF stretch is coupled to the in plane bending of the ring and also the C–H bend while the CO stretch couples to the CC and CF stretching vibrations. Therefore, the CF stretch is intimately coupled with other modes and for that reason it is not possible to assign a local CF stretching mode to one particular frequency.
PhOH | 1150.7 | 1168.9 | 1176.5 | 1261.7 | 1343 | 3656 | ||
---|---|---|---|---|---|---|---|---|
δ (CH) | δ (CH) | δ (OH) | ν (CO) | δ (CH) | ν (OH) | |||
ν (CC) | ν (CC) | δ (CH) | δ (CH) | δ (OH) | ||||
δ (OH) | ν (CC) | |||||||
F-PhOH | 1149 | 1174 | 1226 | 1262 | 1310 | 1323 | 3613 | |
δ (CH) | δ (OH) | ν (CF) | ν (CO) | δ (CH) | ν (CC) | ν (OH) | ||
ν (CC) | δ ring | ν (CC) | ν (CC) | δ (OH) | ||||
δ (CH) | δ (CH) | ν (CF) | δ (CH) |
For the force field simulations with PC and MTP the β-parameter of the CF-Morse potential (see Methods) was slightly adjusted to β = 1.665 Å−1 to correctly describe the experimental spectrum in CCl4 (Fig. 1A), hence the favourable comparison with the IR spectra in Fig. 1B. The CF power spectrum (Fig. 1C left) clarifies that this mode couples strongly to other vibrations (e.g. the CO-stretch) close in frequency for both, PC (blue) and MTP (red) force fields. The peak structure from the CF-power spectrum in Fig. 1C left matches that from the experiment whereas for the IR spectrum in panel B this is only qualitatively the case. This difference arises because power spectra are determined from temporal variations of selected internal coordinates whereas the infrared spectrum is based on the total molecular dipole moment autocorrelation function.57 For the QM simulations in the gas phase (Fig. 1D) the two main peaks of QM are at 1217 and 1256 cm−1 and appear to be shifted by 5 to 6 cm−1 with respect to experiments at 1222 and 1262 cm−1 which were recorded at T = 300 K. Finally, MD simulations using the PhysNet representation of the MP2/6-31G(d,p) reference data the IR-spectrum in Fig. 1E shows two strong peaks at 1215 cm−1 and 1276 cm−1.
In the region of the OH-stretch vibration, the experimental spectrum for F-PhOH in CCl4 reports a band at 3613 cm−1 (black arrow)67 compared with 3614 cm−1 and 3616 cm−1 from PC and MTP simulations, respectively. Using PhysNet, the main peak in the gas phase is at 3889 cm−1 (harmonic frequency at 3882 cm−1 at the MP2/6-31G(d,p) level; corrected frequency at 3655 cm−1 by multiplying with 0.94 for this level of theory84) whereas the DFT-BLYP/DCACP QM simulations report the OH stretch at 3522 cm−1, somewhat shifted to the blue and red, respectively, compared with experiment. Moreover, there are signatures in the infrared spectra due to the CH-stretch vibrations (black arrow) around 3035–3077 cm−1 which are also observed in the MD simulations. For PhysNet the corresponding peak is at 3256 cm−1 (3060 cm−1 after correction with a scaling factor of 0.94) whereas for QM at the DFT-BLYP/DCACP level the absorption is at 3006 cm−1. The power spectra from the PC and MTP simulations (Fig. 1C) confirm that the OH-stretch is a local mode.
In summary, the gas phase spectrum from finite-T MD simulations find comparable patterns for the frequencies in the 1100–1400 cm−1 region when compared with experiment. It is also found that the CF stretch is coupled to other modes in this spectral range and no local mode for this motion can be assigned.
The MTP/MD simulations for the CF power spectrum (panel C) show two prominent peaks at 1187, and 1264 cm−1 which approximately line up with the features in the infrared spectrum (panel B) but are displaced from those observed experimentally. The peak at 1171 cm−1 is red shifted compared to the double peak at 1201 and 1222 cm−1 of experimental spectrum while the peak at 1264 cm−1 is blue shifted or captured at the same position compared to two additional experimental peaks at 1244 and 1264 cm−1. Furthermore, the smaller peak at 1357 cm−1 is also red shifted compared to 1368 cm−1 from experiment. The corresponding infrared spectra in panel B align with the features at 1260 cm−1 but the signal at 1187 cm−1 from the power spectra has no oscillator strength in the infrared spectrum. Instead, a peak in the infrared appears at 1171 cm−1.
The ab initio MD simulations in solution for the CF-power spectrum (Fig. 2D solid violet) find two prominent bands at 1207 and 1262 cm−1 compared with band maxima at 1217 cm−1 with faint shoulders below 1200 cm−1 from gas-phase simulations, see Fig. 1D. The two prominent bands are also found from QM/MM simulations (solid orange) with band maxima at 1225 and 1266 cm−1. The CF-power spectra assign these features to CF-stretch involving motions and accounting for the scaling for BLYP calculations (∼0.99 largely independent of basis set)84 they shift to the red which is consistent with the experiments. Thus, the splitting between the two peaks decreases from 55 cm−1 for the full QM simulations to 41 cm−1 indicating the sensitivity of the solvent interactions and the resulting frequency shifts and splittings in this spectral range.
For ML/MM MD simulations a broad band is observed between 1220 to 1290 cm−1 in the IR spectrum which is blue shifted compared to the experiment. The CF-power spectrum with peak maximum at 1273 cm−1 (dotted maroon trace in Fig. 1E) indicates that part of this broad IR-lineshape is due to CF-stretching motion. Moreover, the CO power spectrum has a first peak at 1250 cm−1 which contributes to the broad IR peak below 1300 cm−1 and a second, prominent peak at 1323 cm−1, see Fig. 1E. The overlapping peaks of the CF- and CO-power spectra indicate coupling between the two types of motion.
The high-frequency region of the experimental spectrum for solvated F-PhOH (Fig. 2A) above and below 3000 cm−1 involves a broad absorption extending from ∼2700 to 3100 cm−1, a region between 3100 and 3500 cm−1 that is not reliable due to the dominating background from the OH stretch vibration of bulk water which can not be subtracted off completely (grey area in Fig. 2A), and a high-frequency feature at 3643 cm−1 assigned to the free OH vibration presumably originating from F-PhOH but possibly also from H2O. This latter assignment is less likely, though, as the same sharp peak appears for PhOH in solution but shifted to the red by ∼40 cm−1. If the signal was due to water it is expected to occur at closer frequencies given the similarity of the solutes. Furthermore, experiments on hydrated PhOH with up to 49 water molecules find the water–OH-stretch vibration at ∼3700 cm−1.85 From simulations with PCs and MTPs the high frequency peak at 3607 cm−1 is consistent with experiment. Comparison with the spectrum for F-PhOH in CCl4 and PhOH in the gas phase shows that this signal corresponds to the “free OH stretch” vibration, see Fig. 1. Features at 3150 cm−1 in the infrared spectrum (Fig. 1B) are due to the CH-stretch vibrations which could be brought into better agreement with experiment by slight reparametrization of the force constants. These features are not present in the OH-power spectra, Fig. 1C, as expected which confirms the assignment to the CH-stretch vibration. The structured spectrum below 3000 cm−1 is present in both, the PC and MTP simulations, albeit with lower intensity.
For the full QM and QM/MM simulations the global power spectra (Fig. 2D solid violet and orange) find a signal centered at 3000 cm−1 which is typical for the CH-stretch modes. At higher frequency (∼3250 cm−1) the OH-stretch vibration is located which is confirmed by the OH-power spectra (dotted violet and orange traces). However, no signal in the 3600 cm−1 region is present which suggests that the “free OH” signature in these simulations, expected around 3500 cm−1 from the QM gas phase simulations (Fig. 1D), is absent.
Simulations with the PhysNet energy function primarily find the high frequency –OH stretch at 3803 cm−1 with broad, largely unstructured undulations below 3000 cm−1. It is likely that the MP2/6-31G(d,p) level is not sufficient for quantitatively describing the spectroscopy of F-PhOH. Accounting for a frequency scaling of 0.94 for harmonic frequencies84 shifts all frequencies to the red which is more consistent with the experimentally determined spectra. Specifically, the 3800 cm−1 and 3280 cm−1 band maxima shift to 3572 cm−1 and 3083 cm−1, both of which are consistent with OH- and CH-stretching motions.
The broad feature below 3000 cm−1 from the experiments deserves additional attention. Regular signatures in this frequency range were previously reported for thin film liquid PhOH and solid PhOH86 and for PhOH at the air/water interface.87 Such regular structures have been observed also in other hydrogen-bonded systems, such as the acetic acid dimer, and are typically used to characterize a medium-strong hydrogen bond.88 They are attributed to a Franck–Condon-like progression of the hydrogen-bond vibration (with a frequency of ca. 50 cm−1 in the present case) that is anharmonically coupled to the high-frequency OH stretch vibration.
Simulations for F-PhOH in solution with PC/TIP3P, MTP/TIP3P, and full QM show an extended spectroscopic response in this frequency range with pronounced peaks superimposed which are washed out in the ML/TIP3P simulation and entirely absent in the QM/MM simulations. This suggests that the spectroscopic signature below 3000 cm−1 is due to coupling between the H-bonding motion of water around the –COH part of F-PhOH which is primarily sensitive to the nonbonded interactions. To assess whether or not flexibility of the water solvent also affects the spectral signatures, simulations with the reparametrized,89,90 flexible KKY (Kumagai, Kawamura, Yokokawa) model were carried out.91 One 5 ns simulation with a time step of Δt = 0.25 fs for F-PhOH was run and analyzed. The power spectrum of the F-PhOH OH-stretch vibration confirms the pronounced, regular pattern with peaks separated by some ∼50 cm−1 below 3000 cm−1, see Fig. S2 (ESI†). In addition, the main peak between 3300 cm−1 and 3600 cm−1 shifts to the red by 78 cm−1 compared with simulations using the rigid TIP3P water model. This confirms that the pattern below 3000 cm−1 is due to anharmonic coupling through nonbonded interactions between solute and solvent and not caused by the water internal modes.
For assessing solvent-induced frequency shifts, the frequency distributions from 5 ns simulation of hydrated F-PhOH, analyzed with instantaneous normal modes (INM) for the PC and MTP model are compared with the normal modes from gas phase simulations, see Fig. 3 and/or Table S4 (ESI†) for the frequency maxima. For obtaining the instantaneous normal modes the water environment was frozen and the structure of the solute was optimized, followed by a normal mode calculation. The band positions compare well with the frequencies from Table 1. However, the bimodal distribution around 1250 cm−1 for both, simulations with PC and MTP, can not be convincingly correlated with the experimental spectra.
The participation ratios (see Methods) of the local modes to the frequencies of the five modes in the 1100–1400 cm−1 range (ν1–ν5) from the MTP simulations are shown in Fig. S3 (ESI†). These were determined from the normal modes of F-PhOH over 105 snapshots in solution. The contributions of the CF, CO, and CH stretch and the COH bending modes to each of the vibrations between 1100 and 1400 cm−1 were determined by projection and the results confirm mixing of these modes.
In summary, the simulations confirm that the modes in the 1100–1400 cm−1 frequency range in F-PhOH are strongly coupled. Assignment of individual spectral features from comparing experiment with simulations is not obvious. Consistent with experiment the force field-based simulations for the high-frequency modes find a high-frequency (>3600 cm−1) phenolic –OH stretch together with broad features below 3000 cm−1. These extended absorptions are also found from QM MD simulations without, however, the high-frequency –OH stretch.
Consistent with experiment, the number of spectral features for F-PhOH is larger than for PhOH in the 1200 to 1300 cm−1 range. However, none of the computed spectra display the pronounced double-peak structure above 1200 cm−1 for F-PhOH with the broad peak for PhOH to the blue of it. The simulations using MTPs find a single absorption between the low- and high-frequency absorption in F-PhOH, see Fig. 2B (green). Considering the CO-power spectrum the feature at 1224 cm−1 involves the CO-stretch vibration together with a band at 1283 cm−1. The QM/MM simulations report a larger number of spectroscopic features for PhOH (Fig. 2D, green) than experiment does. In particular, the single absorption at 1242 cm−1 is not present but rather a broad absorption extending from 1150 up to ∼1250 cm−1 is found. Finally, simulations using the PhysNet energy function quite well capture the absorption for PhOH at 1270 cm−1 (Fig. 2E, green) with an additional peak above 1300 cm−1 not present in the experiment. Correcting the position 1270 cm−1 by 0.94 as was done for the CH- and OH-stretch vibrations shifts this band too far to the red compared with experiment, probably because for coupled vibration the standard correction factor is inappropriate. The absorption for F-PhOH is shifted to the red, in agreement with experiment but does not exhibit the double peak structure. Finally, it should be noted that the experiments are carried out for solute concentrations at which aggregation of F-PhOH and PhOH molecules can not be entirely excluded, which might affect both, the position of the absorption frequency and the line shapes of the various modes.
For the high-frequency part (Fig. 2A right panel) the spectra of solvated PhOH and F-PhOH follow each other closely except for a pronounced absorption at 3596 cm−1 in PhOH which blue-shifts to 3643 cm−1 upon fluorination. This suggests that either electronic coupling between the CF- and OH-sites leads to a slightly stronger OH-bond strength in F-PhOH compared with PhOH, or that the hydration structure around –OH is affected by fluorination, or a combination of the two. From normal mode calculations (MP2/6-31G(d,p)) the OH-stretch vibrations are at 3833 cm−1 and 3829 cm−1 for PhOH and F-PhOH, which is an insignificant difference and suggests that an electronic origin for the shift is unlikely. For the MTP (3636 and 3588 cm−1) and ML/MM (3794 and 3793 cm−1) simulations the –OH stretch for PhOH and F-PhOH reproduce the proximity of the two absorptions in solution, but in reverse order compared with experiment. The QM/MM simulations for PhOH (green trace Fig. 2D) report the OH-stretch at 3247 cm−1 which is to the red of that for F-PhOH (3257 cm−1, orange trace), in agreement with experiment. However, the absorptions are shifted by about 300 cm−1 to the red relative to the experimental line positions.
The F–HWAT pair correlation function gF–HW shows even more pronounced differences between simulations with PCs compared with all other models. The peak at 1.75 Å points towards a strong, favourable interaction between solvent hydrogen atoms and the fluorine atom which is not found in any of the other four methods. This can be explained by the negative partial charge qF = −0.29e on the Fluorine atom in the PC model. The MTP and ML/MM simulations find comparable distribution functions whereas the two QM-based simulations differ from this in that the broad first maximum is peaked at around 3 Å with the full QM simulations and shows a more pronounced local minimum. All distribution functions except that with PCs report the first maximum at an F–OW separation of ∼3 Å which points towards a largely hydrophobic behaviour of the CF site. This is also consistent with notions from pharmaceutical chemistry in that a CF group reduces ligand solvation and increases its hydrophobicity.28 The number of water molecules within distance r is reported in Fig. S4 (ESI†) and shows that for small F–water separations (r ≤ 4 Å) the occupation from simulations with MTP and PhysNet is similar but clearly below that of QM/MM and QM simulations (which are identical) whereas for larger separations (r ∼ 5 Å) that from QM simulations approaches the MTP and PhysNet simulations. For the corresponding N(r), see Fig. S6 (ESI†).
For analyzing hydration around the hydroxyl group the HOH–OWAT and HOH–HWAT pair correlation functions were considered, see Fig. 5. From simulations using the PC, MTP, and PhysNet models gHOH–OW characterizing the hydrogen bond between OH and water-oxygen atoms is similar up to and including the first minimum, see Fig. 5A. The second maximum shifts to somewhat larger separations for the PhysNet simulations. Similar observations are made for gHOH–HW in panel B. The hydrogen-bond pair distribution function with a first maximum at ∼1.6 Å suggests that solvent water molecules are quite strongly bound to the –OH group and that the solvent water at the –OH group exchanges.
For gHOH–HOW from the QM and QM/MM simulations (blue and red traces in Fig. 5C) the position of the first maximum is almost identical whereas the height of the first peak differs. Both pair correlation functions have a first minimum around 2.5 Å with an amplitude close to 0 which suggests that during the 25 ps simulation one solvent water molecule is strongly bound to the OH-group of F-PhOH and does not exchange with the surrounding solvent. As a consequence the “free phenolic –OH stretch” (spectroscopic feature around and above 3600 cm−1, Fig. 2) is absent in the QM MD simulations, in contrast with what was found from the experiments and from the PC, MTP, and ML/MM MD simulations. On the other hand, the radial distribution function between the water–oxygen atoms and the two carbon atoms flanking the COH group in F-PhOH from a 5 ns MTP simulation (Fig. S7, ESI†) demonstrates that the solvent distribution dynamics is exhaustively sampled on the 5 ns time scale.
For the phenolic oxygen as an H-bond acceptor the O–HWAT and O–OWAT pair correlation functions for F-PhOH are reported in Fig S8 (ESI†). Panels A and B provide a direct comparison of the three force field-based simulations whereas panels C and D are those from the QM and QM/MM simulations, respectively. The MTP simulations (red) find strongest localization of the water followed by PC (blue) and PhysNet (green) simulations. For the QM simulations the first minimum for gO–OW is deeper than for the QM–MM simulations whereas the position and height of the first maximum are comparable. The radial distribution functions from simulations with MTP are closest to those from the QM and QM/MM simulations, respectively.
Two-dimensional solvent distribution functions were generated from the positions of the water–oxygen atoms around F-PhOH. For that, the structures of the 5000 snapshots were oriented with C1 in the origin, the C1–C4 bond along the x-axis the [C1, C4, H] atoms in the xy-plane. A 2-dimensional histogram of the water positions was generated and then refined from kernel density estimation using Rstudio.92 The distribution of the solvent water around the CF-part of F-PhOH is comparatively flat for all simulations with PC, MTP and PhysNet when contrasted with the COH-moiety of the solute, see Fig. 6. It is also found that for PhOH (top line) the solvent distributions resemble those for F-PhOH (bottom line). The asymmetry in the solvent distribution around the –COH group is due to the reference atoms chosen for the superposition of all structures. If the phenol-oxygen atom is excluded in reorienting the structures the 2d distribution becomes manifestly symmetric (Fig. S9, ESI†) and also clarifies that the –OH group of the solute rotates on the time scale of the simulations, see Fig. S10 (ESI†). Specifically, rotation of the –COH group is an activated process and occurs on the ∼100 ps time scale from MTP and ML/MM simulations.
![]() | ||
Fig. 6 The 2-dimensional solvent distributions from reorienting snapshots with respect to all heavy atoms of the solute, including the phenolic oxygen atom. Solvent distributions around PhOH and F-PhOH for PC (panels A and B), MTP (panels C and D) and from PhysNet (panels E and F) model. The iso-contour values are shown in each panel. For the solvent distribution upon reorienting with respect to all heavy atoms excluding the phenolic oxygen atom, see Fig. S9 (ESI†) which manifestly shows symmetric water distribution according to the underlying dynamically averaged spatial symmetry of the solute due to rotation of the OH group around the –COH axis, see Fig. S10 (ESI†). |
The weak interaction between solvent and solute around the CF-site of F-PhOH can also be gleaned from the behaviour of the frequency fluctuation correlation functions (FFCFs). For this, the FFCFs were determined from the frequency trajectories ωi(t) for the five modes in the CF-stretch region, i.e. ν1 to ν5, between 1100 cm−1 and 1400 cm−1 and for the OH-stretch vibration, see Fig. S11 and S12 (ESI†). The FFCFs contain information about the coupling between a particular mode and the environmental dynamics. The fitting parameters for ν1 to ν5 in Table S5 (ESI†) show that the fast correlation is generally τ1 ∼ 0.1 ps whereas the longer time scale ranges from τ2 = 0.28 ps to τ2 = 0.83 ps. Such short correlation times point towards weak solvent/solute interactions close to the CF-site. The static components Δ0 are very small, too, which also imply rapid, unspecific dynamics around the CF-group. For the OH-stretch vibration, the fast correlation time is τ1 ∼ 0.1 ps while the longer one is τ1 ∼ 0.5 ps which are comparatively short. Compared with the FFCFs for vibrations involving the CF-stretch the t = 0 amplitude and the static offset Δ0 is larger by one to two orders of magnitude which points towards somewhat slower dynamics around the OH bond of F-PhOH.
For the phenol-OH-stretch vibration both, “gas phase” (around 3600 cm−1) and “water–phenol hydrogen bonded” signatures (below 3000 cm−1) are found from experiments and the simulations. A regular pattern with a spacing of ∼50 cm−1 is reminiscent of recent SFG spectra of PhOH at the air/water interface and earlier experiments on liquid and solid pure PhOH86 and is assigned to water–phenol hydrogen bonded motions. For the high frequency (OH-stretch) modes early experiments for PhOH vapor, in apolar solvent (CCl4) and as a liquid reported band positions at ∼3650 cm−1, ∼3600 cm−1, and at 3500 cm−1, respectively.86 In CCl4 solution and the pure liquid an additional band is at 3350 cm−1.86 The present work finds a sharp peak at 3596 cm−1 which is assigned to the “free” phenol–OH stretch in a non-hydrogen bonded environment. This is supported by the observation that the peak is only marginally shifted from the PhOH OH-stretch in vapor and the fact that the spectroscopic feature is sharp and therefore can not be due to water.
More recently, infrared spectra were recorded for PhOH complexed with variable numbers of water molecules in the gas phase,93,94 in matrices,95 and for PhOH at the air/water interface using vibrational sum frequency generation (SFG).87 The cluster studies all report the phenolic-OH stretch vibration at frequencies above 3000 cm−1 whereas the experiment at the air/water interface assigns a very broad signature in the SFG signal extending from 2550 cm−1 to 3500 cm−1 to the OH-stretch mode.87 This finding is consistent with the present experiments which report a broad absorption for both, F-PhOH and PhOH extending down to ∼2700 cm−1 which is also assigned to the phenol-OH stretch for water-coordinated -COH, see Fig. 2 and Fig. S2 (ESI†).
Hydration around the –CF group as characterized by 1d- and 2d-solvent distribution functions does not feature any pronounced hydrogen bonding unless a conventional PC model is used which, however, exaggerates the directed interaction along the CF bond. On the other hand, solvent water molecules form intermittent H-bonds with the –OH group of F-PhOH which leads to the broad spectroscopic response below 3000 cm−1 observed experimentally and reproduced by several of the computational models. Although the different computational approaches examined are able to capture part of the spectroscopy qualitatively and satisfactorily, none of them performs uniformly well. This is in part due to the coupling between the modes in the 1200 to 1300 cm−1 region and part due to deficiencies in the methods themselves (BLYP and MP2, respectively). To further improve the intramolecular energies, transfer learning from the MP2 to the CCSD(T) level of theory can be envisaged, as has been recently done for formic acid dimer and a number of other small molecules.96–98 If additional refinements to capture charge anisotropy are required with such an improved energy function for the bonded terms, replacing the charges by multipoles or fluctuating minimally distributed charges99,100 can be considered.
From the perspective of halogenic modifications of organic frameworks used for drug design the present work suggests that the local hydration of –CH and –CF groups is comparable (see Fig. 4 and Fig. S7, ESI†), supporting the notion that “F behaves like a large H-atom”.28 It should be pointed out that this conclusion requires advanced force fields (multipoles, fluctuating charges (as in ML/MD or QM(/MM)/MD) and is not supported by a conventional point charge force field. Although the C–F bond has three isolated electron pairs, it has weaker electrostatic interactions compared to O due to the small size and high electronegativity, which compromises its hydrogen bonding ability and can thus better be described as a weakly polar interaction rather than a hydrogen bond. Considering chlorinated and iodinated species is expected to yield altered hydration structures due to their more pronounced sigma hole. Such insights are important guidelines for rational drug discovery as they provide a basis for directed modification and evolution of ligands with specific interactions in protein binding sites.
Footnote |
† Electronic supplementary information (ESI) available: The supporting information provides Tables for the force field parametrizations (Tables S1–S3), frequency maxima for frequency distributions (Table S4), and parameters for the FFCFs (Table S5) together with Fig. S1–S12 the report quality of the PhysNet model, additional solvent distribution functions, dihedral time series and the FFCFs for the CF- and OH-stretch frequencies. See DOI: https://doi.org/10.1039/d2cp02857c |
This journal is © the Owner Societies 2022 |