C-Linked 8-aryl guanine nucleobase adducts: biological outcomes and utility as fluorescent probes

We summarize the utility and biological outcomes resulting from direct attachment of aryl residues to the 8-site of the guanine nucleobase to afford mutagenic lesions and fluorescent probes in G-quadruplex structures.


Introduction
DNA provides a useful scaffold from which to design novel molecules with applications in diagnostics, 1 therapeutics, 2 nanoscience, 3,4 and genomics. 5 Chemical modication of the DNA nucleobases, [5][6][7][8][9][10][11] or the sugar-phosphate backbone, 12,13 can provide new desirable properties that expand the scope of DNA applications. Nucleobase modications are of particular interest because these can change DNA base-pairing to expand the genetic alphabet, 6 afford new redox 7,8 or uorescent [9][10][11] sensing bases, or affect the gene function in different tissues. 5 In contrast to the design and synthesis of modied DNA bases for specic applications, useful nucleobase modications have been identied by studying the cellular implications of DNA damage. 14 This is particularly true for DNA adducts (addition products), which stem from attack of the human genome by reactive chemical species. For example, 7,8-dihydro-8-oxo-2 0 -deoxyguanosine (8oxoG, Fig. 1) is a biomarker for oxidative DNA damage. 8 The biological impact of 8oxoG stems from the 8-keto conformation, which permits 8oxoG to form a stable Hoogsteen base pair with adenine. 15 Consequently, 8oxoG is mutagenic and unrepaired 8-oxoG lesions cause G / T transversions in cells. 16 Remarkably, 8oxoG possesses an even lower oxidation potential (E 1/2 ¼ 0.74 V versus NHE 17 ) than the parent G (E 1/2 ¼ 1.29 V versus NHE 18 ), permitting its selective oxidation in DNA substrates. Thus, the 8oxoG lesion has become an effective redox tool for studying DNA electron transfer 19,20 and detecting single-base mismatches. 21 Beyond the formation of 8oxoG, aryl groups can covalently attach to the 8-position of G following enzymatic transformations of different mutagens and carcinogens, and the resulting lesions have been classied as either nitrogen-, carbon-or oxygen-linked 8arylG adducts (Fig. 1). [22][23][24][25][26][27][28] Extensive efforts have focused on the biological impact of the N-linked derivatives because these lesions are produced by notorious chemical carcinogens that are present in, for example cigarette smoke, fossil fuels and cooked meats. 29 The associated adducts have been detected in human cells, 30,31 including lesions arising from 2-aminouorene (AFG) and N2-acetylaminouorene (AAFG, Fig. 1b), which serve as prototype adducts to elucidate the molecular mechanisms of mutagenesis induced by arylamine carcinogens. [32][33][34] In contrast to the N-linked variants, relatively little is understood about the structure and biological impact of O-and C-linked 8arylG adducts. Nevertheless, these lesions also arise from our exposure to a variety of sources. For example, O-linked adducts have been connected with pentachlorophenol (PCP, Fig. 1b) found in pesticides, disinfectants and wood preservatives, 22,23,25,27 while the food mutagen ochratoxin A (OTA) arising from several species of (Aspergillus and Penicillium) fungi has been linked to the formation of a C-linked adduct (Fig. 1b). [35][36][37] In terms of structure, the C-linked 8arylG adducts are unique because they lack the exible tether that separates the aryl component from the G nucleobase in the N-and O-linked lesions. As a result, C-linked aryl moieties extend the purine psystem, which commonly results in uorescent nucleobase analogues. [9][10][11] Emissive DNA bases enjoy a wide range of applications that include reporters of nucleic acid structure and function; [38][39][40][41][42][43] detectors of nucleobase damage, single nucleotide polymorphisms (SNPs) and mismatch dynamics; [44][45][46] probes for understanding protein-DNA interactions; [47][48][49][50] components of aptamers to provide a uorescent signal upon target binding; [51][52][53] and oligonucleotide-based therapeutics. 54 Ideal uorescent probes provide emission switching properties while retaining the native behavior of the nucleic acid system being studied. Indeed, it is considered detrimental for probe incorporation to produce a major perturbation to duplex stability, H-bonding interactions, or folding characteristics. Modications to the 8-position of G offer an attractive avenue to uorescent probes because this site is not involved in canonical base-pairing interactions. 38 Nevertheless, uorescent C-linked 8arylG adducts are detrimental to the stability of a B-DNA duplex (Fig. 2) because they prefer to adopt the syn-conformation. 55 In this scenario, the emissive properties of 8-arylG adducts can be exploited as a tool to provide insight into adduct conformation within the helix, 24,28 which can be related to biological outcome, as discussed above for 8oxoG.
Alternative oligonucleotide structures, such as Z-DNA 56,57 and antiparallel G-quadruplexes (GQs), 58,59 are also biologically    (Fig. 2). For example, Z-DNA plays a role in transcription and has been implicated in mutagenesis, 57 while GQs are active drug targets. 59 Both Z-DNA and GQs contain Gs that preferentially adopt the syn-conformation. In these DNA structures, 8arylGs can occupy syn-G positions without disturbing the overall stability and H-bonding interactions in the nucleic acid. In Z-DNA produced by alternating purine-pyrimidine sequences, the purines adopt the syn-conformation, while pyrimidines maintain the anti-conformation, leading to a lehanded "zigzag" helix. Antiparallel GQs are composed of stacked G-quartets, which are stabilized by certain cations and contain alternating syn-and anti-Gs, and intervening sequences, which are extruded as single strand loops. The ability of 8arylG lesions to promote the formation of Z-DNA and GQs may provide a rationale for their biological activity. [60][61][62][63] In antiparallel GQs, 8arylGs also behave as ideal uorescent probes, exhibiting emission that is sensitive to GQ folding. 38,42,52,53 In this perspective, we discuss the synthesis, properties, biological impact and applications of 8arylG adducts. We start by summarizing the nucleoside structures and properties, and methods for their incorporation into oligonucleotide substrates. The structural and biological impact of 8arylG adducts within the B-DNA "NarI" recognition sequence (5 0 -G 1 G 2 CG 3 CC) is then presented, which highlights a comparison between the properties of the C-linked 8arylG adducts with the established biological properties of the N-linked variants. Novel carcinogenesis mechanisms for C-linked 8arylG adducts are then discussed based on their unique ability to promote Z-DNA and GQ formation. Finally, the utility of uorescent 8arylG probes in duplex-GQ exchange systems is presented, which provides a signalling platform in biosensors. Together, the work highlighted within points toward a rich future for 8arylG probes in aptasensor development.
Given that PhG exhibits impressive emission intensity in H 2 O, it became apparent that manipulation of the phenyl ring with various substituents could generate new uorescent nucleobase analogs with potentially useful emission switching properties (Fig. 3b). For example, the ortho-substituted phenolic nucleoside (oPhOHG) can produce a planar conformation in aprotic solvents due to a intramolecular H-bond between the phenolic OH and N 7 of the nucleobase. 71 The planar species absorbs at $320 nm and displays visible emission at 476 nm due to an excited-state intramolecular proton transfer (ESIPT) process to afford the keto tautomer (Fig. 3b). 71 In H 2 O, the intramolecular H-bond is disrupted and the nucleoside adopts a twisted structure (l max ¼ 280 nm) that displays enol emission at l em ¼ 395 nm (F  ¼ 0.44). 71 This differential uorescence has proven useful for probing the local solvent environment of oPhOHG within duplex DNA. 72 The oPhOHG adduct also binds Cu(II) (log K a ¼ 4.59) and Ni(II) (log K a ¼ 3.65) effectively, 73 and the enol emission is quenched upon such metal coordination, illustrating the ability to use uorescence to monitor metal ion binding by nucleic acids. 73 Likewise, the pyridyl derivative (2PyrG) has also been employed for metal ion binding by various DNA topologies. 74 The attached pyridyl group in 2PyrG extends the Hoogsteen H-bonding face of the G nucleobase by an additional H-bonding acceptor. In contrast, the indole-linked derivative (IndG) extends the Hoogsteen H-bonding face by an H-bonding donor, and exhibits emission at 395 nm (F  ¼ 0.78 in H 2 O, l ex ¼ 321 nm) that is sensitive to WC (quenched emission) versus Hoogsteen H-bonding (enhanced emission intensity). 75 The para-substituted phenolic nucleoside (pPhOHG) exhibits pHsensitive uorescent properties. 76 The neutral adduct displays emission at 390 nm (F  ¼ 0.47 in H 2 O) that is quenched upon phenolate formation (pK a $ 8.9). This result prompted the synthesis of the 8-(2-chloro-phenol)adenosine analogue (2-Cl-pPhOHA) with a pK a of 7.29 for uorescent pH-sensing activity in the physiological pH range. 76 The electron-rich nature of the G nucleobase also permits generation of visibly emissive derivatives by attaching electronwithdrawing 8-aryl groups to afford new derivatives with pushpull characteristics. 24,42 The pCNPhG nucleoside displays blue emission at 468 nm that is quenched in H 2 O (F  ¼ 0.04), but lights-up in aprotic solvents (F  ¼ 0.43 in CH 3 CN). 42 The 8quinolyl derivative (QG) displays dual uorescence in CH 3 CN at 384 and 510 nm. However, since red-shied twisted intramolecular charge-transfer (TICT) states for QG are strongly quenched in H 2 O, the single peak for QG in H 2 O at 407 nm (F  ¼ 0.03) was ascribed to locally excited (LE) emission. 24 The emission sensitivity to solvent polarity displayed by these donor-acceptor adducts has proven useful for predicting adduct conformation in duplex structures. 24 Modication of the 8-site of G with aryl groups also accelerates the rate of acid-catalyzed hydrolysis. 70 Hydrolysis of the unmodied base (Scheme 1) proceeds via a stepwise mechanism, involving initial protonation at N 7 (pK a of the protonated species in 2.34 (ref. 77)) followed by rate-limiting unimolecular cleavage of the glycosidic bond (k 1 ). 22,70 The oxocarbenium ion subsequently undergoes hydration to produce the 1 0 -hydroxylated-2 0 -deoxyribose sugar. Rates of hydrolysis for 8arylG adducts are 5-to 45-fold greater than the unmodied base in 0.1 M HCl, which increases to 9-to 200-fold at pH 4. 70 The relative depurination efficiencies of C-linked 8arylG adducts can be rationalized by the relative relief of steric strain in the twisted nucleoside (about the base-aryl moiety bond) upon deglycosylation, which yields a less twisted or even planar nucleobase. 70 Thus, the presence and chemical composition of the bulky moiety in C-linked 8arylG lesions can have a signicant effect on the structure and properties of the isolated nucleoside.

Incorporation into oligonucleotides
Four common approaches are utilized to incorporate modied DNA bases into oligonucleotides in a site-specic fashion (Fig. 4a). 78 The phosphoramidite approach (Fig. 4a(1)) requires the chemical synthesis of the modied phosphoramidite that can be incorporated into oligonucleotides using a DNA synthesizer. The modied phosphoramidite must be compatible with the DNA synthesis conditions (acidic, alkaline and oxidative) and deprotection (strongly alkaline) of the DNA strand. 1 Alternatively, postsynthetic strategies may involve treating an oligonucleotide with a reagent that will modify the desired nucleobase to generate the desired lesion ( Fig. 4a (2)). An inherent problem with this approach is that multiple modications may occur and positional isomers can be difficult to separate. To circumvent some of these challenges, an appropriately modied base may be incorporated site-specically into an oligonucleotide and then chemically modied to generate the desired nal product ( Fig. 4a(3)). Finally, modied deoxynucleotide triphosphates may be incorporated enzymatically into a primer strand by the action of a DNA polymerase ( Fig. 4a(4)). 79 Nevertheless, this approach is limited by the activity of polymerases toward a variety of bulky damaged products.
For site-specic incorporation of 8arylG bases into oligonucleotide substrates, we were concerned that their sensitivity to acids and oxidants may pose issues for their synthesis using the solid-phase approach with 8arylG phosphoramidites. The Hocek laboratory also reported that 8-Ph-dATP is too bulky to be a polymerase substrate, 79 which renders an enzymatic approach unfeasible. Thus, we developed a postsynthetic strategy utilizing the palladium-catalyzed (Suzuki-Miyaura) crosscoupling reaction (Fig. 4b). 80 In this strategy, the commercially available phosphoramidite of 8BrG is incorporated site-specifically into various oligonucleotide substrates using solid-phase DNA synthesis and then reacted with a range of arylboronic acids. This strategy avoids exposure of the 8arylG base to acids and oxidants, and can be employed to incorporate a single adduct into relatively short DNA substrates (3-15mers). However, the strategy has a number of limitations, including poor yields, the inability to incorporate multiple adducts into strands and limitations in generating the longer adducted DNA Scheme 1 Acid-catalyzed hydrolysis of 2 0 -deoxyguanosine. templates that are required to assess the biological impact of the lesion using DNA polymerases or repair enzymes.
A more efficient protocol for synthesis of DNA substrates containing 8arylG bases utilizes the 5 0 -O-2,7-dimethylpixyl (DMPx) protecting group (Fig. 4c) in a solid-phase assisted synthesis strategy. 81 The DMPx group is more acid-labile than DMT (release of an aromatic carbocation versus a benzylic carbocation) and can be efficiently removed using 0.5% dichloroacetic acid (DCA) in dichloromethane rather than the 3% DCA required to remove the 5 0 -O-DMT protecting group. 82 Indeed, we successfully employed the DMPx group to incorporate a range of 8arylG bases (FurG, PhG, pCNPhG, QG, Fig. 4c) into DNA substrates using solid-phase synthesis. For incorporation of FurG, which exhibits a 55.2-fold increase in hydrolysis rate compared to dG in 0.1 M aqueous HCl, the 5 0 -O-DMPx group provided a 4-fold yield increase compared to use of the 5 0 -O-DMT group. 81 However, attempts to purify the modied DNA substrates with the nal 5 0 -O-DMPx group attached using commercially available solid-phase extraction cartridges resulted in degradation of the oligonucleotide through exposure of the 8arylG base to both acid and water. 81 Therefore, although 8arylG bases can be incorporated into DNA substrates using solid-phase synthesis (DMPx or DMT protection), it is critical to remove the nal 5 0 -OH protecting group on-column prior to cleavage of the oligonucleotide from the solid support using aqueous ammonium hydroxide.

Structural and biological impact
As a rst step toward understanding the biological impact of bulky DNA adducts, efforts have been made to relate adducted duplex structures to mutagenic outcomes. For N-linked 8arylG adducts, three distinct conformational themes have been characterized for the associated adducted DNA by NMR spectroscopy, which depend on the (anti or syn) glycosidic bond orientation and the location of the 8aryl group within the duplex (Fig. 5). 29,34 In the B-type conformation, the adduct adopts the anti-glycosidic orientation and maintains WC Hbonding with the opposing base C, which positions the 8aryl group in the solvent-exposed major groove. In the base-displaced stacked (S-type) conformation, the adduct adopts the syn-orientation and stacks the 8aryl ring between the anking base pairs, which displaces the opposing C out of the helix. In the wedge (W-type) conformation, the 8arylG adduct also adopts the syn-conformation, but the 8aryl moiety is positioned in the minor groove and the opposing base remains in the helix. The distribution amongst the various conformations is dependent on the nature of the attached 8aryl moiety (planar-fused multiring systems versus single ring or nonplanar groups) and the neighbouring base sequence context. 29,34 For example, the Nlinked 8arylG adduct produced by AAF induces all three conformational themes in normally paired DNA duplexes, while changes in the linker substitution results in the B and S-type conformations for the N-linked adduct produced by AF. 29,[32][33][34] Thus, each adduct yields multiple conformations of adducted DNA, with each conformation leading to different mutational outcomes. In general, the B-type conformation will favour insertion of the correct base C opposite the lesion, but further extension can be both error free and error prone. 32 The mutagenic arrangement following C insertion opposite the N-linked adduct is induced by interactions between the polymerase and the bulky uorenyl rings rather than by a base pair-stabilized misalignment. 32 The S-type conformation is strongly blocking because the uorenyl rings occupy the position of the incoming dNTP within the polymerase; the S-type conformation is also associated with the production of deletion mutations. 29 The Wtype conformation is strongly associated with misincorporation due to hydrogen bonding interactions between the incoming mismatched dNTPs with the Hoogsteen-face of the synadduct. 29 C-Linked 8arylG adducts are much more rigid than the Nlinked counterparts because they lack a exible tether separating the 8aryl ring from the G nucleobase. They also tend to be highly twisted about the nucleobase-aryl moiety bond in order to reduce steric interactions that arise due to the closer proximity of the aryl group and sugar moiety, as well as decreased inherent exibility within the bulky moiety, in the absence of the tether. As a result, C-linked 8arylG adducts likely exhibit a decreased tendency to produce the highly blocking/mutagenic S-type conformation compared to their N-linked counterparts. Nevertheless, the formation of C-linked 8arylG adducts has also been implicated in a variety of mutagenic outcomes. For example, arylhydrazines, which generate phenyl radicals, produce 8PhG adducts that are mutagenic in bacteria. [60][61][62] Benzo[a]pyrene (B[a]P) undergoes peroxidase-mediated oxidation to afford radical cations, where the potential involvement of an 8B[a]PG adduct produces transversion (G / T and G / C) mutations in yeast. 83 The phenolic mycotoxin ochratoxin A (OTA) produces a bulky C-linked 8arylG adduct, [35][36][37] and increases the mutational frequency, as well as the induction of deletion mutations and double strand breaks, in the kidneys of male rats. 84 To demonstrate relationships between 8aryl ring size, adduct conformation and in vitro mutagenicity for C-linked 8arylG adducts, ve adducts with differing ring types (FurG, PhG, QG, BThG and PyG) were incorporated into the reiterated G 3 -position (X) of the NarI type II restriction endonuclease recognition sequence (i.e. NarI (12) and NarI (22) , Fig. 5). 24,28 The G 3 -site of NarI is part of a CpG dinucleotide repeat that is a hotspot for Fig. 5 Depictions of the three major conformations produced by 8arylG adducts, various 8aryl groups used to model C-linked 8arylG adducts and oligonucleotide sequences of NarI(12) and NarI (22). two-base deletion mutations induced by polycyclic N-linked 8arylG adducts in bacteria via a two-base slippage mechanism (Fig. 6a). 85 Optical experiments (ultraviolet (UV) thermal melting, circular dichroism (CD) and uorescence) combined with molecular dynamics (MD) simulations were utilized to dene the structural features of the adducted NarI (12) duplexes. 24,28 The adducted NarI (22) templates were also employed in primer elongation experiments to assess adduct impact on DNA replication in vitro by the polymerases-Escherichia coli DNA polymerase I Klenow fragment exo À (Kf À ), and DNA polymerase IV from Sulfolobus solfataricus P2 (Dpo4). High-delity polymerases, such as Kf À , can t one templating nucleotide in their active site, and favour accurate replication when correct WC base pairing is established with the incoming deoxynucleotide triphosphate (dNTP). 86 Bulky DNA adducts oen stall or block DNA replication by high-delity polymerases, which in vivo is believed to be a signal for the recruitment of Y-family translesion polymerases. The Y-family polymerases have spacious solvent-exposed active sites that can accommodate bulky DNA lesions, while facilitating low-delity DNA replication. 78,86 Dpo4 is regarded as a prototypical Y-family polymerase that serves as an excellent model for investigating how structural features of adducts determine lesion bypass efficiency and delity.
In the NarI(12) duplex, C-linked 8arylG adducts paired opposite the correct base C strongly decrease duplex stability compared to the unmodied control due to their energetic preference for the syn-conformation. 24,28 More importantly, the ability of a lesion to stabilize the slippage product relative to the full-length duplex correlates with an ability to induce À2 frameshi mutagenesis in bacteria. As a result, at the G 3 -position of NarI, thermal melting parameters (T m values) of the full-length complement duplex (with the adduct paired opposite the correct base C) have been compared to T m values of NarI (12) hybridized to a truncated 10mer (two-base deletion) sequence (À2) (Fig. 6b). 87 The full-length complement and truncated duplexes containing the N-linked 8arylG adduct of AAF (AAFG) have the same T m (DT m ¼ 0.0 C, Fig. 6b). This highlights the ability of the AAFG lesion to stabilize the slippage product, 87 which correlates with reports that AAFG produces À2 deletions (91% mutational frequency) at the G 3 -position in NarI in bacteria. 88 In contrast, the AFG lesion stabilizes the slippage product to a lesser degree (DT m ¼ 8.0 C) 87 and leads to only base-pair substitution mutations. 88 For the C-linked 8arylG adducts in NarI (12), full-length duplexes containing the singleringed derivatives (FurG and PhG) are signicantly more stable than the truncated duplex, suggesting an inability of these lesions to induce À2 deletions by stabilizing the slippage product. In contrast, the larger fused-ring derivatives (BThG, QG and PyG) exhibit similar T m values for the full-length and truncated duplexes, with the truncated duplex being more stable for the bulkiest PyG adduct (DT m ¼ À2.1 C). These ndings suggest that these lesions may induce deletion mutations when inserted into repeat sequences that are prone to slippage. 24,28 MD simulations were carried out on the full-length and truncated duplexes, and adduct conformational preferences were ranked according to the calculated free energies. 24,28 In the full-length duplex with the lesion paired opposite C, C-linked 8arylG adducts favour the major groove (B-type, Fig. 5) conformation, although alternative syn-conformations are energetically accessible for FurG, PhG, BThG, QG (all W-type) and PyG (S-type). In the truncated duplexes, FurG, PhG and BThG favour the anti-conformation, while QG and PyG favour the synconformation. For the push-pull QG, its emissive response was consistent with the preferred conformation predicted by the MD simulations (Fig. 6c). Specically, in the full-length duplex, the anti-conformation of QG is favoured over syn-structures by at least 25 kJ mol À1 , which preferentially exposes the quinolyl moiety to the bulk aqueous solvent in the major groove (anti-QG:C, Fig. 6c), and quenches CT emission of QG (solid black emission trace, Fig. 6c). In contrast, the syn-conformation of QG in the truncated duplex sequesters the quinolyl moiety from the bulk aqueous solvent (syn-QG:À2, Fig. 6c) and results in a 12fold increase in CT emission intensity (dashed red emission trace, Fig. 6c). 24 In primer elongation assays using the NarI(22) templates, Clinked 8arylG adducts strongly block DNA replication by the high-delity DNA polymerase Kf À following insertion of a single base opposite the lesion, typically the correct base C. 24,28 This observation was consistent with their preference for the B-type structure, as predicted by MD simulations. In single-nucleotide incorporation assays, the smallest lesion FurG produced the greatest levels of misincorporation (A > G [ T), which correlations with this lesion resulting in the smallest energy difference between the B and W-type adducted DNA conformations. 24 Using Dpo4 as a model translesion polymerase, C-linked 8arylG adducts were found to cause targeted (at the lesion site, i.e. base substitution) and semi-targeted (in the vicinity of the lesion site, i.e. deletion) mutations. 24,28 The single-ringed derivatives (FurG and PhG) produced the greatest levels of misincorporation (A and G), 24 while the fused-ringed derivatives (QG, BThG and PyG) strongly blocked extension by Dpo4. 24,28 Despite the reported reduced exibility of the C-linked adducts discussed thus far, adducted DNA associated with the fungal carcinogen OTA has a much more complicated conformational prole. Although the genotoxicity of OTA has been debated in the literature, several studies have illustrated that OTA primarily results in a C-linked 8arylG adduct. 35,36 Using MD simulations, a dynamic conformational prole has been revealed for the C-linked OTAG lesion that involves at least two of the B, W and S-type conformations, with the energetically accessible orientations being highly dependent on the lesion site sequence context (Fig. 7), as well as the OTA ionization state. 37 This mixture of conformations correlates with the complicated mutagenic prole associated with OTA exposure, which includes base deletions and double strand breaks. 84 Furthermore, the adoption of the S-type conformation according to MD, coupled with observed double strand breaks, suggests that this bulkier C-linked adduct may inhibit replication. 37 Nevertheless, the genotoxic effects of OTA have been reported in only select tissues (outer stripe of the outer medulla rather than the entire kidney 84 ), which suggests that certain conformations may dominate in particular cellular environments. This example highlights the complex interplay that can exist between the conformations adopted and biological outcomes for C-linked 8arylG adducts.
Alternative mechanisms for the toxicity of 8arylG adducts may stem from their ability to induce alternative DNA conformations. For example, the Gannett laboratory has examined the impact of 8PhG lesions on Z-DNA formation. [60][61][62] Formation of Z-DNA in vivo has been implicated in mutagenesis by stimulating large-scale gene deletions, translocations, and rearrangements. 57 A major structural requirement for Z-DNA formation is sequence (alternating purine-pyrimidine bases) and the addition of cationic species to screen electrostatic repulsion between adjacent phosphate groups in Z-DNA. It is also known that methylation of cytosine at the 5-carbon position in CpG runs can favour Z-DNA formation. 89 In an initial effort to gauge the impact of an 8PhG adduct on the B-/Z-DNA equilibrium, the simplest member 8PhG along with four parasubstituted 8RPhG derivatives (R ¼ CH 3 , CH 2 OCH 3 , CH 2 OH and CO 2 À ) were incorporated into d(CGCGCXCGCG) 2 (X ¼ 8PhG or 8RPhG). 60,61 The effect of 8PhG on the B/Z equilibrium was determined using CD by monitoring the salt concentration required to generate a B/Z equilibrium equal to 1. The unmodied duplex required a salt concentration (NaCl) of $3.2 M, while introduction of the 8PhG lesion lowered the salt concentration by a factor of $3-25, depending on the nature of the para-substituent. These results suggest that 8PhG adducts change the position of the B/Z equilibrium by destabilizing the B-form rather than stabilizing the Z-form. 60,61 Unfortunately, the duplex model previously used to determine the B/Z equilibrium was not ideal because it contained two 8PhG modications that could potentially provide an additive or synergistic effect to favour the Z-form. 62 Furthermore, it would be highly unlikely that two 8PhG adducts would form on consecutive base steps in vivo. Thus, a hairpin sequence (d-(CG) 5 T 4 (CG) 5 ) containing an intramolecular (CG) 5 duplex was synthesized in order to incorporate only one 8PhG modication. 62 Under physiological conditions (2 mM MgCl 2 , 10 mM NaCl, 140 mM KCl and 1 mM spermine, 37 C, pH 7.4) the hairpin preferentially adopted the Z-form structure. This result strengthened the argument that 8PhG lesions favour Z-DNA through destabilization of the B-form, which may have biological signicance. Specically, sequences that can produce Z-DNA are most commonly found in gene promoter regions where Z-DNA can regulate transcription and nucleosome positioning. 57 Therefore, carcinogens that produce 8arylG adducts may overwhelm the normal cellular response to Z-DNA formation and lead to large-scale deletions in mammalian cells. 62 Other unique DNA topologies that are more frequent in gene promoter regions include GQs. 59 Chromosome ends (telomeres) are capped with kilobase-long runs of the repeating 5 0 -(TTAGG) n -3 0 sequence. 90 Although most of the telomere resides as a duplex, the 3 0 -terminal 50-200 nucleotides are singlestranded and can fold into various GQ topologies. As described earlier, GQs are assembled through the sequential stacking of G-tetrads around a metal cation (Fig. 8), with the intervening sequences extruded as single-strand loops. The GQs topologies are classied depending on the orientation of the DNA strands. They can be parallel (all Gs in the tetrad are in the anticonformation), antiparallel (alternating syn-and anti-Gs) or hybrids thereof. Model studies utilizing a four-repeat section of the human telomeric DNA sequence (HTelo22, (d [AG 3 (T 2 AG 3 ) 3 ])) have demonstrated extensive GQ polymorphism (Fig. 8). 63 In Na + solution, HTelo22 produces a basket-type antiparallel GQ, 58 while in K + solution, mixed parallel/antiparallel (hybrid-1 or hybrid-2) GQ structures are the major conformations, with hybrid-1 being the major fold (Fig. 8). 91 In the crystalline state 92 containing K + or in K + solution containing certain additives (i.e., CH 3 CN, ethanol and polyethylene glycol (PEG)), 93 a propeller-type parallel-stranded GQ structure is favoured. Topologies of HTelo22 formed in K + (hybrid or parallel) are expected to be more biologically relevant than the antiparallel structure produced in Na + , because GQs have a stronger binding constant with K + , which is present in a higher cellular concentration ($140 mM [K + ] versus $10 mM [Na + ]). 94 The G-rich nature of the human telomere sequence makes it highly susceptible to electrophilic attack. 94 Hence, model studies have been conducted to determine the impact of the 8oxoG lesion on GQ formation by HTelo. 94,95 The lesion strongly perturbs GQ stability due to the inability of 8oxoG to form Hoogsteen base pairs with G. Placement of 8oxoG in an exterior tetrad forces an antiparallel topology in K + solution, while an unstable triplex-like topology is produced with 8oxoG in the middle tetrad. 96 Such studies prompted our laboratory to examine the structural impact of an 8arylG adduct on the GQ polymorphism exhibited by HTelo22. 63 Unlike 8oxoG, 8arylG adducts can form Hoogsteen base pairs with G; however, their strong syn-preference may perturb or inhibit formation of certain GQ topologies that could impact telomeric function. As a representative 8arylG lesion, FurG was incorporated into various G-tetrad positions (3, 4, 8 or 10, see Fig. 8 for numbering) of HTelo22. 63 On the basis of CD signatures, T m analysis and uorescence measurements, all FurG-modied HTelo22 sequences adopt the antiparallel fold in Na + solution, with T m values higher than the native HTelo22. However, in K + solution, sequences with FurG at positions 3 and 4 exclusively form the antiparallel basket GQ that is produced in Na + solution, while FurG at positions 8 and 10 produces the expected hybrid GQ that is favoured by the native HTelo22 sequence. At all positions examined, the FurG modication strongly impedes formation of the parallel fold that is produced by the native HTelo22 sequence in K + solution in the presence of certain additives. These results demonstrate that production of 8arylG adducts within the human telomeric sequence may make it difficult to form certain GQ topologies, which could impact chromosome stability. Nevertheless, 8arylG adducts can stabilize GQ structures when placed in syn-G sites. The ability to stabilize GQ structures has important biological implications. For example, GQ formation can induce replication dependent double strand breaks. 97 Together, the above studies highlight the diverse structural and biological impacts of C-linked 8arylG adducts when incorporated into DNA structures.

Utility as fluorescent probes
Aptamers are single-stranded DNA, RNA or modied nucleic acids that are designed to fold into unique three-dimensional shapes, which are then recognized by a target molecule. 98,99 Given the stability and diversity of GQs, it is not surprising that many aptamers have been designed based on the GQ structural motif. 100 The antiparallel GQ scaffold provides an opportunity to employ uorescent 8arylG probes for GQ detection strategies. Compared to other uorescent guanine analogues, such as pteridines (3-MI and 6-MI) 9 and benzo-expanded guanine derivatives, 11 8arylG probes are relatively easy to synthesize and have a syn-preference making them useful for dening the G conformation within the tetrad; the probe will generally stabilize the GQ when placed in a syn-G position, but destabilize the GQ when placed in an anti-G position. Furthermore, the G-G base-stacking, hydrogen bonding, and restricted motions within GQ structures can enhance the photoexcited lifetimes and energy transfer properties of a G residue. 38 These factors can amplify the emission of the 8arylG probe within the GQ structure compared to its emission in the duplex. Thus, 8arylG probes can be effective diagnostic tools in duplex-GQ exchange systems that are commonly employed in DNA-based detection strategies. 42,52,53 The level of precision offered by internal uorescent nucleobase mimics can complement other approaches including end-labelled dyes and label-free dyes, because the internal probes may assist in establishing Gs in the tetrad versus loop positions, site(s) of ligand binding, and remove false negatives or positives that can occur with the use of external dyes that are not covalently attached to the aptamer. 101 To test 8arylG probe performance in an antiparallel GQ aptamer, FurG and the donor-acceptor pCNPhG were inserted into various positions within the thrombin binding aptamer (TBA, Fig. 9). 42 The two 8arylG probes were inserted into the 5, 6 or 8 position that represent a syn-G position within a G-tetrad (G 5 ), an anti-G position (G 6 ) or a diagonal TGT loop position (G 8 ). The duplex and GQ structures produced by the modied-TBA (mTBA) strands were compared to the corresponding structures adopted by native TBA using CD, T m analysis, uorescence and MD simulations. 42 In the duplex structure, the emission of the FurG probe (l em ¼ 384 nm, F  ¼ 0. 49 (ref. 67)) was strongly quenched (dashed traces, Fig. 9), while the emission exhibited a 6-to 19-fold increase in uorescence intensity in the chair-like antiparallel GQ (table, Fig. 9) compared to the duplex emission. The excitation spectra for FurG in the GQ also displays diagnostic energy-transfer bands at $255 and 290 nm that are absent in the duplex excitation spectra. The probe had a destabilizing inuence on duplex stability (DT m $ À5 C) and on GQ stability when placed at anti-G 6 and in the diagonal loop G 8 . However, at the syn-G 5 position, FurG increases GQ stability by $9 C. The bulkier pCNPhG probe had a stronger destabilizing inuence on duplex stability and inhibited GQ formation when placed in the anti-G 6 position (T m value could not be determined). Interestingly, the probe was not as destabilizing as FurG in the GQ when placed in the diagonal G8 loop position (due to increased stacking with the G-tetrad); the pCNPhG probe also stabilized the GQ at syn-G 5 (by $7 C). In contrast to the FurG probe, the push-pull pCNPhG exhibited quenched emission in the GQ structures at positions G 5 and G 8 , and at position G 5 in the duplex. However, in the duplex at position G 8 , where the probe is anked by T residues, the probe was strongly emissive ($10fold increase compared to the GQ structure). Thus, FurG serves as an effective turn-on emissive probe in duplex-GQ exchange, while pCNPhG can be employed as a turn-off probe when inserted into the diagonal G 8 loop position of TBA. 42 Indeed, duplex-GQ exchange studies with FurG at syn-G 5 and pCNPhG at G 8 within mTBA have demonstrated the utility of these probes for K + ion 42 and thrombin detection. 52 The turn-on emissive properties of FurG upon GQ folding suggested that it could serve as a donor (D) dye to be paired with an acceptor (A) G derivative for diagnostic uorescence resonance energy transfer (FRET) signalling for GQ formation (Fig. 10a). 53 This prompted the synthesis of 8-vinyl-benzo[b] thienyl-dG (vBthG, Fig. 10b), which has an absorbance maxima at $380 nm and will yield effective spectral overlap with the emission of FurG. The vBthG probe provides visible blue emission at 473 nm (F  ¼ 0.29), but is not effective by itself for monitoring duplex-GQ exchange because its emission lacks sensitivity to the change in DNA topology. 53,101 However, when paired with FurG within syn-G positions of mTBA (i.e. D; A, 10; 5), the D base can act as a switch upon GQ formation, turning on the visible uorescence from the vBthG probe. The FRET efficiency of the D/A pair in the antiparallel GQ was 88% at positions 10; 5 (Fig. 10c). The mTBA sample with the probes at syn-G 5 (A) and syn-G 10 (D) strongly decreased duplex stability (by 20 C), but increased GQ stability (by $9 C) and acted as an effective turn-on duplex-GQ exchange system (4-fold increase in emission intensity at $470 nm upon thrombin binding, Fig. 10d). 53 Overall, the ability of 8arylG bases to exhibit emission switching properties upon change in DNA topology (duplex-GQ exchange) provides a basis for their utility in DNAbased diagnostics.

Conclusions and future prospects
The literature summarized above highlight the structural and in vitro mutagenic impact of C-linked 8arylG lesions together with their potential applications as uorescent probes in duplex-GQ exchange systems. In terms of biological impact, until recently, few studies have focused on the C-linked 8arylG variety despite the fact that many chemical mutagens are known to produce such adducts. Our studies to date demonstrate that C-linked 8arylG lesions are stronger blocks of DNA synthesis than the corresponding N-linked derivatives, and can promote polymerase slippage and misincorporation. Future efforts should examine the conformational preferences of C-linked 8arylG adducts in both duplex structures and in template:primers bound to DNA polymerases using high-resolution structural studies combined with MD simulations. In order for C-linked 8arylG adducts to induce mutagenicity, they must evade repair by nucleotide excision repair (NER) enzymes. NER of C-linked 8arylG adducts has yet to be examined. Here, it may be possible to take advantage of the uorescent nature of C-linked 8arylG adducts to determine incision rates mediated by NER. Fluorescence-based assays for nucleotide selection preferences of Y- Fig. 9 TBA sequence with G 5 , G 6 and G 8 highlighted in red, blue and green, respectively. Excitation and emission spectra for mTBA with duplexes as dashed traces, GQs solid traces and different colours for the various positions of the 8arylG probe within mTBA. Thermal melting parameters for duplex (d) and GQ in K + solution for mTBA compared to the unmodified sequence. family polymerases opposite C-linked 8arylG adducts may also be highly insightful, especially in cases where the lesion induces À2 base slippage; C-linked 8arylG adducts with push-pull character can exhibit enhanced emission when sequestered from the aqueous solvent (i.e. within a 2-base bulge). Finally, Clinked 8arylG adducts with dened structures have yet to be incorporated into DNA vectors for the determination of mutational frequency in cell-based assays. Such studies would better permit direct comparisons between the mutagenicity of Clinked and N-linked 8arylG lesions so that the biological impact of linkage type can be established.
In terms of the utility of uorescent 8arylG probes in duplex-GQ exchange systems, the 8-furyl derivative FurG exhibits favourable turn-on emission switching properties upon change in DNA topology due to energy-transfer from the unmodied Gs in the tetrad that is absent in the duplex structure. Specically, the base can serve as a donor probe to be paired with an acceptor G for effective FRET signalling of GQ formation. So far, our studies have been limited to TBA, which forms an antiparallel GQ and can be used as "proof-of-principle" to test probe performance. However, many aptamers produce GQs upon target binding, including those for OTA, nucleolin, insulin, HIV-1 reverse transcriptase and ATP. 100 Many C-rich aptamers also exist that can be paired with GQ-producing complementary strands, such as aptamers for microcystins. 102 In this scenario, target binding to the aptamer releases the complementary strand containing the 8arylG probe, which is free to fold into a GQ to signal target binding. We expect 8arylG bases to have commercial applications in aptasensors over the next few years. One goal of our laboratories is to develop 8arylG probes that undergo excitation with visible light and can be readily employed at anti-G positions for applications within parallel GQ structures. This issue is particularly challenging because low energy emission is usually associated with large chromophores that may inhibit GQ folding or produce CT states that exhibit quenched emission in H 2 O or a lack of emission sensitivity to changes in DNA topology. Nevertheless, the successes registered to date regarding the range of properties and applications of 8arylG bases help ensure that achieving these ambitious goals will push the current boundaries in many nucleic acid-based technologies.