Jakub Kaminský*,
Valery Andrushchenko* and
Petr Bouř*
Institute of Organic Chemistry and Biochemistry, Academy of Sciences, Flemingovo náměstí 2, 16610, Prague, Czech Republic. E-mail: kaminsky@uochb.cas.cz; andrushchenko@uochb.cas.cz; bour@uochb.cas.cz
First published on 23rd February 2021
Chiroptical spectroscopic methods are excellent tools to study structure and interactions of biomolecules. However, their sensitivity to different structural aspects varies. To understand the dependence of absorption, electronic and magnetic circular dichroism (ECD, MCD) intensities on the structure, dynamics and environment, we measured and simulated spectra of nucleosides and other nucleic acid model components. The conformation space was explored by molecular dynamics (MD), the electronic spectra were generated using time dependent density functional theory (TDDFT). The sum over state (SOS) method was employed for MCD. The results show that accounting for the dynamics is crucial for reproduction of the experiment. While unpolarized absorption spectroscopy is relatively indifferent, ECD reflects the conformation and geometry dispersion more. MCD spectra provide variable response dependent on the wavelength and structural change. In general, MCD samples the structure more locally than ECD. Simple computational tests suggest that the optical spectroscopies coupled with the computational tools provide useful information about nucleic acid components, including base pairing and stacking.
In the past, the quantum-chemical procedures have been tested to provide universal basis for understanding the shapes and position of observed spectral bands.4,7 In the present study, we focus on the dependence of ECD and MCD spectra on molecular flexibility and interactions. The long-time goal is to increase the amount of information obtainable from the experimental data. We chose a series of molecules allowing to estimate contributions of individual nucleic acid components (e.g., guanine → guanosine → guanosine monophosphate), the effect of base substitution (uridine vs. methyluridine), and the difference between purine and pyrimidine bases (e.g. guanosine vs. cytidine). Conformational equilibria and dynamics appeared to modulate significantly all spectral features.
Previous experience shows that an average of “snapshot” (conformer) geometries obtained from molecular dynamics (MD) well-represents the time averaging and leads to realistic band shapes for solid state, liquids and solutions.8–10 Although the averaging is computationally demanding, it has been possible by efficient averaging scripts, the time-dependent DFT methodology for ECD11,12 and time-efficient sum-over state (SOS)13,14 MCD calculations.
While ECD has been used for studies of nucleic acids for a long time,15,16 MCD is used less frequently and its full potential for nucleic acid studies is rather unexplored. In early research, MCD bands were assigned mostly empirically.17 Today, more accurate simulations are possible and MCD can provide very useful data complementary to absorption and ECD. For example, some MCD bands were found to be quite sensitive to the environment of the nucleotides.4
The understanding of nucleotide spectroscopic behavior is also required for modeling of longer nucleic acid polymers, where the nucleotide data can be used as parameters in simplified computational procedures. Other computational possibilities are provided by the embedding techniques,18,19 which in the recent years provided useful information on the chromophore behavior including ECD20 and MCD spectra.21
Our results show that accounting for the geometry dispersion during thermal molecular motion is particularly important for realistic ECD and MCD theoretical intensities. We follow a “bottom-up” approach, first characterizing spectral properties of bare nucleobases. Then spectra of the nucleosides are analyzed. The theory can also well indicate sensitivity of different spectral types to molecular interactions, such as base pairing and stacking. We believe that such mapping of factors affecting spectral shapes will help to increase reliability of the simulations, desirable for establishing a closer link between spectral shapes and molecular structure and dynamics.
In an alternate model, selected spectra for geometries obtained from MD were averaged. For MD, the investigated molecule was put in a cubic box with approximately 700 water molecules using the Desmond software.28 The OPLS-2005 force field29 was employed in 100 ns simulations, using 1 fs time step, NpT ensemble, pressure of 1 bar, and temperature was kept at 300 K by the Nose–Hoover thermostat. The TIP3P force field30 was used to treat water molecules. We recorded the snapshot geometries each 0.5 ns.
In metadynamics simulations free energy of the nucleosides as dependent on the χ and γ torsional angles (Fig. 1) was determined. The angles were scanned with 10 degree increment using the Maestro graphical interface.31 The simulations were run in Desmond with default parameters (100 ns for each, NpT ensemble, height of the Gaussian kernel was 0.03 kcal mol−1, interval of dropping the kernel in phase space was 0.09 ps).
Solute–solvent clusters containing waters closer than 3.6 Å to the solute were extracted from the MD snapshot geometries. The 3.6 Å limit has been based on previous experience, as it allows to include molecules from the first solvation shell.32,33 Apart from the raw MD geometries, to explore optimization effects, selected clusters were subjected to constrained optimization in vibrational normal coordinates; modes with energies below 300 cm−1 were fixed.34,35 As shown below, the optimization did not lead to qualitatively different results. For the spectra calculations, explicit water molecules were deleted and replaced with the CPCM23,36 continuous model. Calculations in which water molecules in the first solvation sphere were kept and the others treated by CPCM were also tried.
The MD simulations treat the intra- and intermolecular hydrogen bonds more consistently than PCM. Example of the dependence of free energy on the (χ, γ) torsion angles is plotted for cytidine in Fig. 2. Potential energy surfaces of other nucleosides and GMP are analogous (Fig. S2†). All species generate six local energy minima (I–VI) with very similar (χ, γ) values. As discussed previously,39 the sugar rotation around glycosidic bond generates syn and anti-conformers, differing in the χ-angle, and each of them can additionally adopt one of three orientations of the methoxy (–CH2OH) group (γ ∼ ±60°, 180°). The free energy surfaces suggest that for most nucleosides only conformers with γ ∼ −60° are populated at room temperature, except for U preferring γ ∼ 60°. The relative energies and corresponding geometries of the six local minima are listed in Table S1.† Interestingly, the uridine methylation results in a flip of the χ angle, from anti (−171°, conformer VI of uridine) to syn (52°, 55° conformers I and II of 5-methyluridine) values. This can be explained by a steric clash of the methyl group with the sugar.
Fig. 2 Dependence of cytidine free energy on the torsion angles (χ, γ, Fig. 1), as obtained from the metadynamic simulations. Numbers I–VI denote local minima of the energy. |
The biggest transition dipole moments lie in the plane of the bases, i.e. the transitions belong to the A′ symmetric representation of the Cs point group. The A′′ transitions with transition dipole moments perpendicular to the plane are weak and nearly invisible within 190–300 nm. Some studies also indicate transitions to delocalized Rydberg states in this region.7 These, however, are not included in our model of a limited basis set and their contribution to the spectra is supposed to be small due to the small overlap between the ground and excited states. We also neglect the vibrational splitting of the electronic bands41–44 as it is not apparent in the experiment, except for a band broadening.
The pyrimidine bases (cytosine, uracil, thymine, Fig. S3†) all provide a strong absorption band around 260 nm associated with a negative MCD; below 240 nm the spectra of cytosine differ more from the other two and MCD is mostly positive. The purine bases, adenine and guanine, provide two strong MCD bands of opposite signs around 260 nm. The phosphate group starts to contribute significantly to the absorption and MCD spectra below 240 nm.
Fig. 3 Calculated (black, solid) and experimental (red, dashed) absorption (ε), ECD (Δε) and MCD (Δε/B) spectra of six nucleosides, a B3LYP/6-311++G**/CPCM weighted average over six geometries representing minima I–VI defined in Fig. 2 and S2.† |
λABS | ε | λECD | Δε | λMCD | Δε/B | ||
---|---|---|---|---|---|---|---|
a Ref. 40.b CAM-B3LYP. | |||||||
C | Exp. | 273, 270a | 10425 | 274 | 4.82 | 271 | −0.44 |
238 | 8217 | 219 | −4.99 | 236 | −0.66 | ||
199 | 32493 | 207 | −0.98 | 201 | 0.34 | ||
200 | −2.71 | ||||||
Calc. | 259, 245b | 10655 | 258 | 9.76 | 259 | −3.36 | |
229 | 10409 | 236 | −1.82 | 242 | −1.08 | ||
191 | 28348 | 227 | −0.74 | 229 | −1.31 | ||
214 | −7.48 | 207 | 1.65 | ||||
198 | 7.52 | ||||||
U | Exp. | 261, 261a | 11022 | 271 | 2.77 | 259 | −0.65 |
205 | 12942 | 241 | −1.50 | 232 | −0.81 | ||
216 | −1.82 | 217 | −0.58 | ||||
197 | 2.57 | ||||||
Calc. | 247, 234b | 17602 | 259 | −3.61 | 246 | −3.57 | |
203 | 9360 | 242 | 9.91 | 204 | 1.63 | ||
207 | −8.58 | ||||||
5-MU | Exp. | 268 | 8772 | 275 | 2.20 | 264 | −0.74 |
215 | 6985 | 245 | −1.24 | 220 | −0.42 | ||
201 | 9503 | 218 | −1.58 | ||||
197 | 5.39 | ||||||
Calc. | 256, 243b | 17976 | 251 | −6.32 | 256 | −3.14 | |
206 | 12084 | 228 | 3.00 | 207 | 2.30 | ||
209 | −1.78 | ||||||
194 | 2.31 | ||||||
A | Exp. | 257, 257a | 12142 | 265 | −0.37 | 270 | −1.31 |
210 | 22751 | 229 | −0.12 | 250 | 1.22 | ||
194 | 30806 | 198 | −4.28 | 222 | 0.83 | ||
212 | −0.92 | ||||||
Calc. | 254, 239b | 21897 | 247 | −3.39 | 256 | −4.14 | |
194 | 25679 | 228 | 0.09 | 237 | 1.81 | ||
204 | −0.53 | 207 | −4.19 | ||||
187 | −3.8 | ||||||
G | Exp. | 276, 275a | 9426 | 250 | −0.32 | 279 | −1.99 |
252, 253a | 14508 | 215 | 3.27 | 251 | 1.74 | ||
205 | 20878 | 198 | −7.13 | 201 | 2.24 | ||
192 | 30749 | ||||||
Calc. | 250, 249b | 25679 | 263 | 4.74 | 264 | −4.17 | |
232b | 16065 | 241 | −3.34 | 245 | 3.37 | ||
202 | 26850 | 215 | 7.19 | 206 | −1.05 | ||
179 | 197 | −7.27 | |||||
dG | Exp. | 277 | 6522 | 245 | −0.75 | 278 | −1.39 |
253 | 9765 | 213 | 2.51 | 251 | 1.10 | ||
204 | 14836 | 196 | −4.51 | 202 | −1.49 | ||
Calc. | 251, 250b | 28091 | 259 | −2.64 | 262 | −3.27 | |
232b | 29262 | 229 | 2.58 | 245 | 2.50 | ||
183 | 200 | −6.30 | 202 | −1.03 | |||
polyU | Exp. | 260 | 9640 | 274 | 7.75 | 272 | −1.54 |
202 | 11381 | 242 | −3.33 | 243 | 0.20 | ||
209 | −0.56 | 214 | 0.34 | ||||
192 | 2.31 | ||||||
polyA | Exp. | 257 | 8641 | 264 | 16.48 | 272 | −1.03 |
210 | 12388 | 247 | −11.79 | 251 | 1.09 | ||
193 | 16405 | 232 | 0.77 | 226 | −0.28 | ||
221 | 5.79 | 212 | −1.94 | ||||
205 | −20.58 | 198 | 0.97 | ||||
DNA | Exp. | 258 | 7858 | 277 | 3.17 | 267 | −0.52 |
211 | 7995 | 245 | −3.52 | 249 | 0.16 | ||
201 | 9531 | 222 | 1.33 | 209 | −0.52 | ||
210 | −0.51 | 193 | 2.16 | ||||
197 | 7.56 |
ECD spectra (middle panels in Fig. 3) reflect the non-aromatic nucleoside parts more than the absorption. The experimental spectra of the pyrimidine derivatives (C, U and 5-MU) exhibit a positive band around the strong HOMO–LUMO absorption band within 263–273 nm. ECD tends to be negative around 230 nm, and the sign changes again below 200 nm. For C and U, the simulations capture this trend reasonably well, although the position of the highest-wavelength absorption band is calculated too low. A wrong sign at the longest wavelengths is predicted for 5-MU, likely because of an error in the conformer energies obtained from MD. Indeed, experimental ECD spectra of U and 5-MU are quite similar, i.e. the effect of the methylation seems to be overestimated in the simulations. Part of this inconsistency can also be explained by the main absorption band (exp. ∼ 268 nm) calculated too low (∼256 nm).
Purine-based A, G and dG compounds have much weaker mostly negative experimental ECD within 230–300 nm, with a stronger negative band around 195 nm. The overall trend seems to be reasonably predicted by the computations, particularly below 230 nm; yet the simulated intensities at higher wavelengths are overestimated. The best agreement is observed for dG. The U/5-MU and G/dG pairs show similar experimental but different computed spectra, which may indicate problems with the MD force field providing too different relative free energies.
As for the absorption, all nucleosides have calculated MCD spectra very similar to those of the parent bases (cf. Fig. 3 and S3†). This has been also observed experimentally.40 The simulations provide the basic MCD sign pattern correctly, although for the pyrimidine bases (C, U, 5-MU) the simulated intensities are overestimated. The MCD intensity overestimation seems to be occurring both within the DFT framework4 and wavefunction approaches.7 For the purine A, G and dG nucleosides the simulated curves follow the experiment more closely than for C, U and 5-MU.
The optimized geometry provided significantly different ECD spectra than raw MD structures, whereas absorption and MCD are affected less (Fig. 4b). This brings in the necessity to average many MD geometries to obtain converged simulated ECD pattern. For the average, the effect of the partial optimization was much smaller (Fig. 5). ECD is also very sensitive to the basis set size; the 6-31G**/6-311++G** basis set change causes significant changes in the ECD pattern around 260 nm (Fig. 4c). On the other hand, the results are rather indifferent if the basis set varies on water atoms only. The CAM-B3LYP functional provided similar results as B3LYP, giving transition wavelengths even shorter, further from the experiment, although for G and dG it gave more realistic splitting of the longest-wavelength band (Table 1 and Fig. S7†).
Finally, we investigate the dependence of the spectra on the number of transitions involved in the TDDFT calculations (Fig. 4d). This is of practical importance as the computational time steeply increases with it. The experimentally accessible region (∼200–300 nm) comprises about 50 electronic transitions, and few more, centered below 200 nm, may also contribute due to the inhomogeneous band broadening.45 In addition, the MCD signal of a particular transition, unlike for absorption and ECD, is formed by all the electronic transitions in the molecule within the SOS scheme.14,46–48 Fortunately, the contribution of high-energy transitions/electronic states fades rapidly and within 200–300 nm the MCD spectral shape is affected by the short-wavelength transitions only in a minor way. This has been observed also in previous studies.44,49–51 Therefore, we find it reasonable to simulate all spectra with 200 transitions as a default.
To account both for the weighting of the six energy minima (presented in Fig. 3) and the geometry dispersion within each conformer class, we generated the spectra for C, 5-MU, A and G as averages of 200 MD structures in Fig. 6. The snapshots were taken during a 100 ns free dynamic run. In general, this approach provides more realistic shapes than the more static model. The absorption intensity of the HOMO–LUMO band still remained overestimated compared to the experiment, but its position moved right, closer to the measured wavelength. Even more beneficial was the averaging for ECD spectra, intensity of which overall decreased and better followed the experimental curves. An extreme example is guanosine (G), where the ECD intensity within 240–300 nm nearly zeroed-out, in agreement with the observation.
Fig. 6 Calculated (black solid) and experimental (red dashed) spectra of four nucleosides, obtained as an average of 200 MD geometries (B3LYP/6-311++G**/CPCM calculation, no explicit waters). |
However, the more extensive averaging did not always improve the static result. For example, it predicts two negative ECD bands at 230 nm and 260 nm for A, while the static approach provided only one (250 nm), which better described the broad experimental signal in this region. The remaining inconsistencies can be attributed to an error of the force field/conformational sampling and extreme sensitivity of ECD to the conformation shown in this and previous studies.45,52 Perhaps surprisingly, also the MCD intensities, presumably not so dependent on the conformation, significantly diminished and became more realistic in the more extensive MD averaging. This can be explained by the wavelength shift to longer values for non-equilibrium MD geometries and a reduced coupling between different electronic states determining MCD rotational strengths.13,14,47
As the next step to the two presented computational approaches (Fig. 3 – geometry minima + CPCM, and Fig. 6 – solute geometry averaging + CPCM), an explicit involvement of the solvent to the quantum mechanical computation would be probably desirable.53 However, we find it currently computationally too expensive. In addition, for a guanosine–water cluster, where the solvent was explicitly included, the results were not radically different (Fig. 4, part a). The solvent does affect electronic properties of the chromophores, but the effect can be smaller after the averaging.45 Another increase of accuracy in the future may come from using polarizable54 or ab initio32 force fields, again too expensive at the present stage.
Fig. 7 Calculated spectra of B-DNA like stacked dimers (top) and planar Watson–Crick pairs (bottom), compared to plain sums of the bases' spectra, normalized to one base. |
In Fig. S8† a similar effect of the pairing is documented for dA–dT and dC–dG Watson–Crick dimers, and homonucleoside dimers d(A)2, (dG)2, (dT)2 mimicking stacking in B-DNA geometry. The sensitivity of the spectra to molecular interactions documents the potential of the optical spectroscopy, once the experimental data can be supported by reliable simulations. The planar pairing of nucleosides has even a bigger effect on absorption and MCD than for the plain bases, suggesting that the sugar part can amplify some interaction effects of the bases on the spectra, maybe through increased molecular polarizabilities.
Finally, spectra comparison of polyU and polyA with their respective nucleoside monomers (U and A) in Fig. 9 confirms the predicted sensitivity of MCD and particularly ECD to the base stacking. Thus, ECD spectra of polyU and U are generally similar, which suggests a disordered conformation of polyU, where the main source of chirality is local interaction of uridine base with the sugar. Much bigger differences are between ECD spectra of A and polyA, the latter giving strong signals, indicating partially helical arrangement with stacked bases. This is in agreement with previous study based on vibrational circular dichroism.57 Adenosine and polyA MCD is quite similar, in accord with the simulation on the adenine base (Fig. 7). The salmon testes DNA spectra exhibit several features of the simpler systems, although detailed analysis goes beyond the scope of the present study.
Fig. 9 Experimental spectra of longer DNA models normalized to one nucleotide residue (black) and single U or A nucleosides (red, dashed). |
Footnote |
† Electronic supplementary information (ESI) available: Further computational test and details. See DOI: 10.1039/d1ra00076d |
This journal is © The Royal Society of Chemistry 2021 |