Rafał
Szabla
*ab,
Holger
Kruse
b,
Petr
Stadlbauer
bc,
Jiří
Šponer
b and
Andrzej L.
Sobolewski
a
aInstitute of Physics, Polish Academy of Sciences, Al. Lotników 32/46, PL-02668 Warsaw, Poland
bInstitute of Biophysics of the Czech Academy of Sciences, Královopolská 135, 61265 Brno, Czech Republic. E-mail: szabla@ibp.cz
cRegional Centre of Advanced Technologies and Materials, Department of Physical Chemistry, Faculty of Science, Palacký University, 17. Listopadu 1192/12, 77146 Olomouc, Czech Republic
First published on 22nd February 2018
Cyclobutane pyrimidine dimers (CpDs) are among the most common DNA lesions occurring due to the interaction with ultraviolet light. While photolyases have been well known as external factors repairing CpDs, the intrinsic self-repairing capabilities of the GATT DNA sequence were discovered only recently and are still largely obscure. Here, we elucidate the mechanistic details of this self-repair process by means of MD simulations and QM/MM computations involving the algebraic diagrammatic construction to the second order [ADC(2)] method. We show that local UV-excitation of guanine may be followed by up to three subsequent electron transfers, which may eventually enable efficient CpD ring opening when the negative charge resides on the TT dimer. Consequently, the molecular mechanism of GATT self-repair can be envisaged as sequential electron transfer (SET) occurring downhill along the slope of the S1 potential energy surface. Even though the general features of the SET mechanism are retained in both of the studied stacked conformers, our optimizations of different S1/S0 state crossings revealed minor differences which could influence their self-repair efficiencies. We expect that such assessment of the availability and efficiency of the SET process in other DNA oligomers could hint towards other sequences exhibiting similar photochemical properties. Such explorations will be particularly fascinating in the context of the origins of biomolecules on Earth, owing to the lack of external repairing factors in the Archean age.
The vulnerability of DNA and RNA to the harmful effects of UV-radiation is particularly intriguing in the context of the origins of life. First of all, it is reasonable to assume that the highly sophisticated photolyases were absent during the emergence of the first oligonucleotides. Furthermore, recent theoretical estimates showed that much higher amounts of UV-light were reaching the surface of the Archean Earth, owing to the lack of oxygen in the atmosphere and higher Sun activity in the ultraviolet spectral range.18–22 Interestingly, a recent work by Bucher et al.23 revealed that specific DNA sequences may promote very efficient self-repair driven by photoinduced electron transfer to the CpD lesion containing two dimerized thymine bases (TT). This remarkable property was discovered for the GATT sequence in single and double strands, while no repair was reported for the TATT and ATT oligomers.23 The significant selectivity of this process implies that UV-light played an important role in the prebiotic selection of the most photochemically stable nucleic acid sequences.24
Many fundamental aspects of the photochemistry of nucleic acid oligomers (such as GATT) are still obscure, including the relative contribution of locally-excited (LE) and delocalized excitations in DNA strands, the extent of the charge-transfer (CT) character of the latter, and the availability of different photorelaxation pathways in such assemblies.25,26 While some valuable insights into these processes were provided by time-resolved (TR) spectroscopic (e.g. transient absorption UV) techniques,27 it is generally difficult to study the clear origin of the long-lived states due to overlapping absorption features of different nucleobases.25 In addition, selective synthesis of specific photodamaged sequences is challenging. On the other hand, considerable system sizes often restrict the applications of theoretical simulations to methods such as Time-Dependent Density Functional Theory (TDDFT). Even though TDDFT was employed with some success in investigations of photochemical and photophysical properties of oligonucleotides,28,29 the applicability of this methodology to similar problems has been criticized because of often significant overstabilization of CT states.26,30,31 While range-separated functionals can alleviate this problem to some extent for vertical excitations, nonadiabatic molecular dynamics simulations of adenine with TDDFT failed to correctly describe the excited-state lifetimes and photodeactivation pathways.32
In the light of the above discussion, characterization of photoinduced processes in oligonucleotides based on highly accurate quantum-chemical methods is urgently needed. In particular, a much more balanced description of the LE and CT states, even outside the Franck–Condon region, can be obtained with the algebraic diagrammatic construction to the second order [ADC(2)] method.33,34 The ADC(2) method was recently proved to provide reliable results in the investigations of the photochemistry of nucleic acids fragments within the QM/MM framework (referred to as ADC(2)/MM from here on).26,35–39 However, these simulations only involved the nucleobases treated at the ADC(2) level and the effects of the truncation of the QM region at the N-glycosidic bond are unclear. Moreover, previous studies did not consider optimizations of minimum-energy crossing points (MECPs) at the ADC(2)/MM level, which are crucial in understanding the reactivity of different electronic states participating in the photochemistry of oligonucleotides.
Our goal is to identify the possible mechanistic features and intermediate states that govern the sequence selective self-repair in nucleic acid fragments, first observed by Bucher et al.23 For this purpose, we employ the ADC(2)/MM protocol (nucleobases treated at the ADC(2) level) to optimize the minimum-energy geometries and compare the relative energies of different LE and delocalized states in the GATT oligomer. We further discuss MECP optimizations and validate the energies of all stationary points by including the whole tetranucleotide in the QM region. This enables us to expose some of the potential pitfalls of the previous approaches, like often incorrect prediction of the relative energies of intermediate delocalized states. Based on these simulations we propose that the photochemical self-repair of the GATT sequence can be envisioned as a sequential electron transfer (SET) process which involves multiple changes of the orbital (diabatic) character of the S1 state occurring downhill along the slope of the S1 potential energy surface, after the local excitation of one of the purine bases.
Fig. 1 The two QM/MM setups applied to the energy calculations and structural optimizations performed in this work. |
The QM/MM optimizations of the ground-state equilibrium geometries were first performed using QMDNA/MM setup and the low-cost hybrid DFT composite scheme, PBEh-3c.56 The PBEh-3c method was recently shown to be particularly well suited for calculations of RNA tetranucleotides.57 The initial optimizations were performed for the whole QM/MM system using the internal AMBER48 optimizer and the limited-memory BFGS method. These geometries were reoptimized with tighter convergence criteria using a rational function and approximate normal coordinates scheme as implemented in the open-source optimizer XOPT.58–60 During the reoptimization procedure only the atom positions in an inner spherical region of the QM/MM system were relaxed, while the outer region was kept frozen. The inner region contained all the atoms within 11.5 Å of the most central atom of the GA-anti conformer and 12.5 Å of the most central atom of the GA-syn conformer. The inner region was in each case carefully selected to keep an appropriate solvation shell around the tetranucleotide. This approach was also applied in the excited-state calculations.
The algebraic diagrammatic construction to the second order [ADC(2)]33,34,61 method was used in all the excited-state calculations. The ADC(2) method was shown to yield a correct description of the excited-state potential energy surfaces of adenine and thymine outside the Franck–Condon region.32,62 The vertical excitation energies and oscillator strengths were calculated on top of the PBEh-3c geometries, using the QMbases/MM setup, ADC(2) method and the TZVP basis set. The QMbases/MM setup was also used in the ADC(2)/MM optimizations of the different minima on the PE surface of the S1 state and all the minimum-energy crossing points (MECPs). The TheoDORE 1.5.1 package was used to establish the charge transfer numbers and perform electron–hole population analysis.63,64 The molecular orbitals were generated using the IboView program.65 MECPs were optimized employing the approach of Levine, Coe and Martínez,66 which was recently shown to provide reliable S1(ππ*)/S0 conical intersection geometries in several biomolecular systems without the evaluation of nonadiabatic couplings.67,68 This scheme was included in the XOPT code to enable the optimization of state crossings within the QM/MM framework. Even though, it was indicated that the ADC(2)/MP2 methods may fail to correctly reproduce the topography of nπ*/S0 state crossings in nucleobases,68 all of the S1/S0 MECPs considered in this work involved ππ* excitations, and consequently we anticipate the corresponding MECP geometries to be reliable. To keep consistent geometries when comparing the energies of different excited state minima with the Franck–Condon region, the ground-state geometry was additionally optimized using QMbases/MM setup and MP2/MM approach. All of the ADC(2)/MM and MP2/MM optimizations were performed with the def2-SVP basis set. In addition, the energies of all the stationary points were recalculated using the QMDNA/MM setup (whole tetranucleotide in the QM region) the ADC(2) and MP2 methods and the larger TZVP basis set.
The majority of the ADC(2)/MM calculations were performed for the GA-anti conformer, which has the anti orientation of the G and A bases and represents the spatial arrangement of the GATT sequence in longer DNA strands. Since the population of the GA-anti conformer is relatively low (3.2% based on the OPC simulations), we additionally considered the GA-syn conformer having both the G and A bases in the syn orientation with respect to the sugar moieties. Both the SPC/E and OPC simulations indicate that the GA-syn conformer is the dominant molecular arrangement with the estimated populations of 47.0% and 31.2%, respectively. The ADC(2)/MM calculations performed for the GA-syn are shown in the ESI,† since the major conclusions regarding the self-repair mechanism remained unchanged in both studied cases. We also identified two other conformers which contributed to the ensemble of fully stacked geometries, namely the G-syn-A-anti and G-anti-A-syn substates, but were not examined in the further ADC(2)/MM calculations.
The lowest-lying locally-excited state on the G base has an excitation energy of 4.91 eV and is the lowest-energy excitation (S1) in the whole tetranucleotide. We expect that this state was predominantly populated in the experiments conducted by Bucher et al.,23 which involved photoexcitation in the UVB spectral range (λexc = 290 nm, 4.28 eV). The optically bright transitions associated with the A base correspond to the S3 and S4 states in the GATT tetranucleotide and can be accessed at slightly higher excitation energies, i.e. 5.11 eV and 5.21 eV. The remaining states present in the lower energy range of the spectrum are nπ* excitations localized on the C4O groups of the TT dimer. These nπ* states were previously suggested to participate in the direct self-repair of the TT dimers. However, the MM solvent is represented by a point charge scheme and many electronic effects crucial for the dark nπ* are neglected in this approach. In fact, it was demonstrated that the inclusion of explicit QM water molecules results in considerable blue-shift of nOπ* states.73,74 Here, the consideration of all the neighbouring water molecules at the QM level is beyond our computational capabilities, albeit we expect that such blue-shift of these nOπ* states in the GATT tetranucleotide would significantly decrease their availability in the photochemical processes studied in this work. The ππ* excitations associated with the TT dimer can be also found in higher energy range of the spectrum.
Right after the photoexcitation, the GATT tetranucleotide can undergo vibrational relaxation to the minimum of the state (denoted as G* in Fig. 3) associated with moderate puckering of the guanine base. This ring puckering effect is most pronounced at the C4 and C5 atoms in the GA-anti conformer and the C2 carbon atom connected to the amino group in the case of the GA-syn conformer. From these minima the system can reach the ππ*/S0 conical intersection described before as the dominant photodeactivation channel in isolated guanine.75,76 This state crossing in the GA-syn conformer is characterized by a more pronounced puckering of the C2 carbon atom and a slightly sloped topography, since it lies 0.15 eV above the minimum. Based on analogous calculations and virtually identical results for isolated guanine, we anticipate that the photorelaxation of UV-excited guanine in the GA-syn conformer resembles the photodeactivation mechanisms reported in gas phase studies.75 In contrast, our optimizations of the MECP in the GA-anti conformer yielded a C4-puckered geometry that lies 0.5 eV above the respective minimum. This suggests that some of the monomeric photodeactivation channels in oligonucleotides may be significantly hindered in selected conformers due to the interactions with neighbouring bases.
Alternatively, the GATT tetranucleotide containing UV-excited guanine may follow a relaxation pathway on the S1 surface towards a further local minimum possibly associated with a CT or exciplex state. This scenario was previously hypothesized by Bucher et al.,23 who suggested that the G+˙A−˙ state could be the key intermediate state that precedes the electron transfer to the TT dimer.23 Our ADC(2)/MM calculations reveal that even though the G+˙A−˙ state lies above the S10 state in the Franck–Condon region of the GA-anti conformer, it is significantly red-shifted during the initial vibrational relaxation of the UV-excited guanine. In fact both the S2 and S3 excitations, computed on the LE minimum-energy geometry, are delocalized states involving charge transfer between the G and A bases. In particular, the G+˙A−˙ state considered by Bucher et al.,23 lies merely 1.59 eV above the LE minimum and corresponds to the S3 state. Surprisingly, the S2 state lying 1.26 eV above the LE minimum is characterized by an opposite electron transfer from adenine to guanine. While we successfully optimized the S1 minimum corresponding to the G+˙A−˙ state, we could not locate the G−˙A+˙ minimum at the ADC(2)/MM level and we anticipate that this latter state is not involved in the photoreactivity of the GATT tetramer.
The G+˙A−˙ minimum lies 0.9 eV below the LE minimum and may become the direct precursor of the CT state which entails an electron transfer to the TT dimer. In other words, the negatively charged adenine could readily transfer its excess electron to the TT dimer leading to the formation of the reactive state with self-repair propensity. This G+˙ATT−˙ minimum is again lower in energy than the preceding stationary point, by nearly 0.5 eV. In addition, we located another local minimum on the S1 surface containing the negatively charged CpD, namely the GA+˙TT−˙ state. The ADC(2)/MM energies computed for the whole tetranucleotide reveal that the latter CT minimum having the positive charge localized on the A base is, in fact, the lowest-energy S1 minimum available in the GA-syn conformer and possibly the last link in the SET mechanism before the actual TT dimer repair.
Our optimizations of the S1/S0 MECP initiated from both the GA+˙TT−˙ and G+˙ATT−˙ minimum-energy geometries converged to the same state crossing. This MECP is characterized by a partially opened cyclobutane ring with the C5–C5 bond being broken and the C6–C6 bond remaining the single covalent connection between the two thymine bases (cf.Fig. 3). We anticipate that this self-repair mechanism can operate from both the GA+˙TT−˙ and G+˙ATT−˙ minima, since the primary factor that enables the TT dimer repair is the electron transfer to the photolesion and not the location of the hole (positive charge) in the system.
It is generally difficult to envisage the exact reaction coordinates leading to the transitions between the different stationary points presented in Fig. 3, owing to the large number of nuclear degrees of freedom in the GATT tetramer. In other words, S1 hypersufaces of oligonucleotides are significantly more complex when compared to isolated nucleosides and nucleobases. However, we anticipate that the local minima constituting the SET mechanism are rather shallow and we expect the energy barriers separating the consecutive stages to be generally low. This is reflected by the high sensitivity of the excited-state optimizations to the initial guess geometry, i.e. sometimes small changes in the initial geometry may result in convergence to another local minimum on the S1 hypersurface. Another important factor which determines the efficiency of the transitions between the different minima is the availability of S1/S0 conical intersections from the intermediate states. As we have shown above, the high energy of the state crossing will facilitate the population of the G+˙A−˙ minimum. Limited accessibility of the state crossings is discussed below. As pointed out by Lee and co-workers,16 the evaluation of diabatic coupling matrix elements (DCMEs) between the electronic states is crucial for proving the validity of a proposed photoreaction mechanism. Our estimates based on the generalized Mulliken Hush and Boys localization approaches,77,78 yielded DCMEs exceeding 0.1 eV between the consecutive electronic states. This implies that the electron transfer events in the GATT could be indeed very efficient. Nevertheless, our DCME estimates require a separate commentary, which can be found in the ESI.† A detailed and accurate description of the transition paths between the different stationary points in the SET mechanism could be inferred from nonadiabatic molecular dynamics simulations, but this approach is currently beyond our computational capabilities for systems of this size.
Apart from the three CT states already mentioned in the SET mechanism, we also located one exciplex state shared between the G and A bases which does not have any notable CT character. The corresponding geometry of the GA-exciplex minimum is presented in the top left panel of Fig. 4. The ADC(2) optimizations performed using the QMbases/MM setup initially indicated that the GA-exciplex state could be the first intermediate reached from the LE state. At the same time, the ADC(2) energy of the G+˙A−˙ minimum obtained using the QMbases/MM setup was higher than the LE minimum. However, this picture is completely different when the ADC(2) energies are recalculated within the QMDNA/MM setup, i.e. when the sugar-phosphate backbone is included in the QM region. We presume that this unexpected behavior is the result of strong differences in the electric dipole moment direction and magnitude between the QMDNA/MM and QMbases/MM setups characteristic for these two particular states. In other words, the QMbases/MM setup is incapable of correctly reproducing the μ vectors of the G+˙A−˙ and GA-exciplex states, which substantially affects the relative energies of the states in the field of electrostatic point charges. The relative energies of the remaining intermediate states are not affected by the size of QM region in the qualitative sense. Nevertheless, the example of the GA-exciplex and G+˙A−˙ intermediate states shows that the truncation of the QM region at the N-glycosidic bond might result in deceptive results and considerable care needs to be taken during the prediction of relative energies of different electronic states in nucleic acids within the QM/MM framework.
The electron–hole population analysis shows that the G+˙A−˙ state is associated with 0.57 and 0.45 electron transferred from guanine to adenine in the corresponding G+˙A−˙ S1 minima of the GA-anti and GA-anti conformers. The consecutive CT state accessed in the GA-anti conformer involves a transfer of 0.97 e− to the TT dimer from the G and A bases, where 0.55 e− is transferred from G and 0.42 e− from A. In contrast, the corresponding CT state found in the GA-syn conformer involves a transfer of 0.94 e− to the TT dimer occurring exclusively from the G base. However, we denote this state as G+˙ATT−˙, to keep a uniform naming scheme for both studied conformers. Finally, the GA+˙TT−˙ minimum in the GA-anti conformer is associated with 0.96 e− transferred to the TT dimer, where the majority of the hole (0.87) is located on the A base. These results confirm the previous suggestion that CT states could involve delocalization of the transferred charge over the neighbouring bases.25 However, we expect this feature to be dependent on the local arrangement of nucleobases, since it is present in only one of the studied conformers.
The charge transfer process occurring from guanine to adenine results in the formation of fascinating interactions between the two bases (cf.Fig. 5). In the case of the GA-anti conformer, a strong interaction is created between the C5 atom of guanine and the amino group of adenine associated with a C5⋯NH2 distance of 2.14 Å. On the contrary, the G+˙A−˙ state in the GA-syn conformer leads to the formation of an interaction between the C6 atom of guanine and the C6 atom of adenine with the equilibrium distance of 2.24 Å. Shortening of these two distances leads to conical intersections and covalent bond formation between the two bases. The optimized MECP lies 0.25 eV below the G+˙A−˙ minimum in the GA-anti conformer and these two points on the PES are separated by a modest energy barrier of ∼0.1 eV. Interestingly, the MECP located in the GA-syn conformer lies 0.42 eV above the corresponding local minimum, which indicates that this state crossing is much less available and the corresponding state is presumably long-lived, at least in this particular arrangement of the G and A bases. Indeed, Bucher et al. observed the existence of a charge-separated intermediate state with a lifetime of ∼300 ps, which is in excellent agreement with our picture.
The TT dimer repair process can be eventually completed by C6–C6 bond cleavage in the vibrationally hot electronic ground state of the tetranucleotide, i.e. after the photorelaxation through the conical intersection. Such sequential and nearly barrierless CpD ring opening initiated at the C5–C5 site was also proposed based on DFT/MM simulations of the TT dimer in the radical-anionic form.79 The optimization of the ground state geometry starting from the MECP shows that C6–C6 bond rupture occurs in a barrierless manner when at least weak restraints are imposed on the C5⋯C5 distance to prevent its closure. If no restraints are imposed, the C5–C5 bond is spontaneously reformed and the CpD structure can be recovered. Such bifurcation after passing through a S1/S0 conical intersection is typical for many fundamental photochemical reactions, and we anticipate that the dynamic system can yield either the repaired thymine bases or the CpD depending on the momentary arrangement of the surroundings when the discussed state crossing is reached. It is worth to note, that this stepwise mechanism was recently proposed to govern the UV-induced formation of TT dimers in the triplet manifold, which is the opposite reaction to the self-repair process described in this work.17
The MECP optimizations enabled us to assign the long-lived state observed by Bucher et al., to the CT state (G+˙A−˙) in the GA-syn conformer. The experimental lifetime of 300 ps is clearly reflected by the presence of the sloped MECP which lies 0.42 eV above the G+˙A−˙ minimum.
The qualitative features of the SET mechanisms are conserved in both studied conformers, but some specific differences are evident. For instance, the stacking pattern strongly differs in the syn/anti conformers, while retaining the same (formal) intermediate state (cf. the G+˙A−˙ minima in Fig. 5). The different interaction mode from the stacking variants leads to a significantly different energies of the S1/S0 state crossings in the G+˙A−˙ intermediate, resulting in different lifetimes and presumably conformer-dependent self-repair efficiency. This suggests that a more complete understanding of the photochemistry of short oligonucleotides requires conformational sampling.
We propose the main criteria which could help in identifying these nucleic-acid sequences which could efficiently promote self-repair of CpD lesions:
(i) The CT states containing the excess electron residing on the CpD lesion should be available directly or indirectly from the LE states populated soon after the photoexcitation.
(ii) The CpD repairing state can be accessed indirectly via several local S1 minima corresponding to intermediate diabatic states, within the SET mechanism. An efficient SET mechanism operates downhill along the S1 hypersurface, the consecutive diabatic states should be strongly coupled and the local minima should be separated by rather low energy barriers.
(iii) The intermediate diabatic states should have sufficiently long excited-state lifetimes to prevent premature relaxation to the S0 state and enable efficient population of successive minima in the SET mechanism. This property can be deduced from the availability of S1/S0 conical intersections in the respective local S1 minima.
Such predictive capacity is critical for deciphering the yet unclear stages of abiogenesis, since the emergence of first oligonucleotides on our planet was presumably regulated by high UV fluxes and the lack of photolesion repair factors.
Footnote |
† Electronic supplementary information (ESI) available: Computational details of the MD simulations, results of calculations performed for the GA-syn conformer, diabatic couplings and Cartesian coordinates of the stationary points. See DOI: 10.1039/c8sc00024g |
This journal is © The Royal Society of Chemistry 2018 |