 Open Access Article
      
        
          
Feng Tang‡ a, Shan Liu‡ a, Qiao-Ying Li a, Jun Yuan b, Lin Li b, Yinsheng Wang b, Bi-Feng Yuan* a and Yu-Qi Feng a
      
a Key Laboratory of Analytical Chemistry for Biology and Medicine, Ministry of Education, Department of Chemistry, Wuhan University, Wuhan 430072, P. R. China. E-mail: bfyuan@whu.edu.cn; Fax: +86-27-68755595; Tel: +86-27-68755595
b Department of Chemistry and Environmental Toxicology Graduate Program, University of California, Riverside, CA 92521-0403, USA
    
First published on 11th March 2019
Accumulating lines of evidence indicate that reactive oxygen species (ROS) are important signalling molecules for various cellular processes. 8-Oxo-7,8-dihydroguanine (OG) is a prominent oxidative modification formed in DNA by ROS. Recently, it has been proposed that OG may have regulatory and possibly epigenetic-like properties in modulating gene expression by interfering with transcription components or affecting the formation of G-quadruplex structures. Deciphering the molecular mechanisms by which OG regulates gene expression requires locating OG in the genome. In the current study, we characterized two commercially available DNA polymerases, Bsu DNA polymerase (Bsu Pol) and Tth DNA polymerase (Tth Pol), which selectively incorporate adenine (A) and cytosine (C) opposite OG, respectively. By virtue of these differential coding properties, with Bsu Pol copying a DNA strand carrying OG in an error-prone manner and Tth Pol copying it faithfully, we achieved quantitative and single-base resolution analysis of OG in synthesized DNA that carries OG as well as in the G-rich telomeric DNA from HeLa cells. In addition, the parallel analysis of the primer extension products with Bsu Pol and Tth Pol followed by sequencing provided distinct detection of OG in synthesized DNA. Future application of this approach will greatly increase our knowledge of the chemical biology of OG with respect to its epigenetic-like regulatory roles.
The long-standing view has been that OG is mutagenic and detrimental to cellular processes.7 Recently, OG has been proposed to modulate gene expression.8 A few examples showed that increased OG formation in the genome is correlated with increased gene expression via the base excision repair (BER) pathway.9,10 Zarakowska et al.11 reported that, in porcine thymus DNA, transcriptionally active euchromatin harbored more OG than transcriptionally silenced heterochromatin. Park et al.12 found that G oxidation to OG could occur site-specifically in vivo under oxidative stress. More recently, Fleming et al.13 demonstrated that OG can activate mRNA synthesis by facilitating promoter G-quadruplex formation and that cells can harness oxidative modification of G to OG for altering phenotype. These studies indicate that OG may have regulatory and possibly epigenetic-like properties in cells that respond to oxidative stress.
Traditional methods for detecting OG normally involve the comet assay14 or extraction of genomic DNA followed by nuclease digestion and mass spectrometry analysis.15 However, these approaches cannot determine the location of OG in DNA. Thus, sequencing the genome to locate OG will provide a better understanding of the effects of OG on gene regulation. Direct sequencing of OG has been achieved by single-molecule, real-time (SMRT)16 or nanopore-based sequencing methods.17 However, these methods are generally proof-of-concept studies and further improvement is required to realize the mapping of OG in the genome. Several studies reported antibody-based OG sequencing that provided low-resolution sequence mapping of OG (∼10 kb).18,19 More recently, Ding et al.20 developed an approach (OG-Seq) that utilized amine-terminated biotin to label OG for affinity purification followed by next-generation sequencing to map OG in the mouse genome at ∼0.15 kb resolution. This study showed that gene promoters and untranslated regions (UTRs) harbor more OG-enriched sites. Although these studies offer significant advancement of our knowledge regarding genomic OG, the resolution is not high enough to determine the precise genomic elements where OG resides. Information on the location of OG at single-base resolution and on a genome-wide scale is important for understanding the molecular basis of diseases as well as the regulatory roles that are related to OG.
Structural studies demonstrate that OG can adopt two alternative conformations (anti or syn) in the active site of DNA polymerases and therefore OG has dual coding potential.21 The anti conformation of OG allows Watson–Crick base pairing with a cytosine, whereas the syn conformation of OG forms a stable Hoogsteen mispair with an adenine in its normal anti conformation (Fig. 1B).21 DNA polymerase β (Pol β) has been shown to be able to incorporate 8-oxo-dGTP opposite adenine in preference to cytosine.22 In contrast, the Y-family enzyme DNA polymerase ι (Pol ι) prefers error-free bypass of OG and preferentially inserts dCTP opposite the OG site.23 Therefore, by exploiting the differential properties of DNA polymerases that copy a DNA strand carrying OG either faithfully or in an error-prone manner, we can develop single-base resolution mapping of OG in DNA.
In the current study, we characterized commercially available DNA polymerases and found that Bsu DNA polymerase (Bsu Pol) predominantly incorporated adenine (A) opposite OG and Tth DNA polymerase (Tth Pol) predominantly incorporated cytosine (C) opposite OG. Using the distinct properties of these two DNA polymerases, we developed an approach for detection of OG in DNA at single-base resolution. This method was then successfully used to quantify OG in telomeric DNA from HeLa cells.
The relative reaction velocity (v) was calculated from the ratio of the extended product (IE) over the sum of the unextended primer (IU) and the extended product (IE) as follows: v × t = IE/(IU + IE), where t represents the reaction time. The apparent KM and Vmax values were obtained from linear regression analysis of Hanes–Woolf plots using the data points at different dATP concentrations in three independent experiments according to a previously described method.26 This assay provides a straightforward method for determining the enzymatic efficiency (Vmax/KM), which is used to describe the selectivity of a DNA polymerase for incorporating dATP opposite OG or G. These selectivity values are independent of enzyme activity and thus allow the comparison of different DNA polymerases.
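As a hedged illustration of the calculation described above, the following Python sketch computes the relative velocity from band intensities and fits a Hanes–Woolf line, [S]/v = [S]/Vmax + KM/Vmax, by ordinary least squares. The function names and the noise-free data points are hypothetical and serve only to show the arithmetic; they are not measurements from this study, and the published analysis may have differed in detail.

```python
def relative_velocity(i_extended, i_unextended, t_min):
    """Solve v * t = IE / (IU + IE) for v, given band intensities and time."""
    return i_extended / ((i_unextended + i_extended) * t_min)


def hanes_woolf_fit(s, v):
    """Fit [S]/v against [S]; slope = 1/Vmax, intercept = KM/Vmax.

    Returns (Vmax, KM).
    """
    y = [si / vi for si, vi in zip(s, v)]
    n = len(s)
    mean_x = sum(s) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in s)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(s, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 1.0 / slope, intercept / slope


# Hypothetical dATP concentrations (uM) and velocities generated from an
# assumed Michaelis-Menten curve with Vmax = 0.25 min^-1 and KM = 9.2 uM.
s = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
v = [0.25 * si / (9.2 + si) for si in s]

vmax, km = hanes_woolf_fit(s, v)
efficiency = vmax / km  # Vmax/KM, in min^-1 uM^-1
print(vmax, km, efficiency)
```

With real gel data, each velocity would come from `relative_velocity` applied to the quantified band intensities, and the fit would be repeated for each independent experiment.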
 were used to mimic the telomeric DNA (detailed DNA sequences can be found in Table S1 in ESI†). As for the measurement of the content of OG at telomeric DNA, single-nucleotide primer extension was performed with the mixture (10 μL) of HeLa DNA (5 μg) and different fluorophore-labeled primers (Cy3-primer, Cy5-primer, and FAM-primer-2, 5 pmol for each) under the aforementioned conditions. The percentage of the extension of primers was determined based on three independent experiments.
      
      
        In addition to the synthesized DNA carrying OG, the sequencing strategy was also applied for the analysis of OG in genomic DNA of HeLa cells. Three human genes, including VEGFA (vascular endothelial growth factor A), TP53 (tumour protein p53), and KRAS (KRAS proto-oncogene, GTPase), were amplified using HeLa genomic DNA as the template. HeLa genomic DNA (500 ng) was denatured and annealed with the corresponding extension primers (VEGFA-L, TP53-L, or KRAS-L, Table S2 in ESI†). The following procedures are the same as those for the analysis of synthesized DNA. In addition, H2O2-treated genomic DNA was also employed as the template for the sequencing analysis. The H2O2 treatment was performed according to a previously described method.27 The detailed sequences of PCR primers can be found in Table S2 in ESI.†
To investigate the potential dependency of the mutation rate on the sequence context, we also performed high-throughput sequencing using the synthesized DNA of NN-OG-NN, which carries two randomized bases flanking each side of the OG site. The primer extension products from the NN-OG-NN by Bsu Pol were amplified by PCR and subjected to high-throughput sequencing. The library construction, high-throughput sequencing, and processing of raw sequencing data were carried out following the aforementioned protocol. Only forward reads with a perfect match to the characteristic string ‘GACGACTGGCACTAATG’ (the −19th to −3rd nucleotides relative to the OG site) and reverse reads with a perfect match to the characteristic string ‘TGTCGATCCACGCAGCA’ (the 3rd to 19th nucleotides) were used to calculate the frequency of the nucleobases flanking the original OG site. The sequence context around the OG site was analyzed, and the sequence logo was generated with Sequence-logo.
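The read-filtering step for forward reads can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this work: it handles forward reads only, assumes that the base immediately after the characteristic string is position −2 (so that the OG site is the third base after the string), and uses hypothetical function names and toy reads.

```python
from collections import Counter

# Characteristic string covering positions -19..-3 relative to the OG site.
FWD_ANCHOR = "GACGACTGGCACTAATG"


def flanking_context(read, anchor=FWD_ANCHOR):
    """Return (bases -2..-1, base at the OG site, bases +1..+2).

    Returns None if the anchor has no perfect match or the read ends
    before position +2.
    """
    start = read.find(anchor)
    if start < 0:
        return None
    i = start + len(anchor)  # index of position -2
    window = read[i:i + 5]
    if len(window) < 5:
        return None
    return window[:2], window[2], window[3:]


def flank_frequencies(reads, called_base="T"):
    """Count flanking bases among reads showing called_base at the OG site."""
    counts = {pos: Counter() for pos in (-2, -1, 1, 2)}
    for read in reads:
        ctx = flanking_context(read)
        if ctx is None or ctx[1] != called_base:
            continue
        upstream, _, downstream = ctx
        counts[-2][upstream[0]] += 1
        counts[-1][upstream[1]] += 1
        counts[1][downstream[0]] += 1
        counts[2][downstream[1]] += 1
    return counts


# Toy reads (hypothetical): one with T at the site, one with G, one unmatched.
reads = [
    "TT" + FWD_ANCHOR + "ACTGA" + "TGTCG",
    FWD_ANCHOR + "GGGTT",
    "ACGTACGT",
]
print(flank_frequencies(reads))
```

The per-position counts returned by `flank_frequencies` are the kind of input a sequence-logo tool would take; reverse reads would need the second characteristic string and appropriate strand handling.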
We first screened commercially available DNA polymerases by monitoring their ability to extend a FAM-labeled single-stranded DNA (FAM-primer-1) by incorporating dATP opposite either G or OG in the DNA template (Fig. 2A). The preliminary result of the single-nucleotide primer extension assay demonstrated that Bsu Pol, KF Pol, and The Pol showed good performance in incorporating dATP opposite OG. However, KF Pol and The Pol could also incorporate dATP opposite normal G (Fig. 2B), which excludes the potential use of these two DNA polymerases for discriminating G and OG in DNA.
We next performed a steady-state kinetics study for quantitative comparison of the properties of these DNA polymerases in incorporating dATP opposite G and OG in DNA. As shown in Table 1, Bsu Pol displays the greatest discrimination (34.1-fold) between OG and G in the single-nucleotide primer extension reaction with dATP as the substrate. The Vmax value of the DNA-OG template with dATP was higher than that of the DNA-G template, and the DNA-OG template had a much more favorable KM value, which resulted in a remarkably better enzyme efficiency (Vmax/KM) for the DNA-OG template (27.3 × 10−3 min−1 μM−1) than for the DNA-G template (0.8 × 10−3 min−1 μM−1, Table 1). Given that the adenine base forms a Hoogsteen base pair with OG, it is reasonable to assume that the A:OG base pair is more stable than the A:G base pair. The result of KF Pol also revealed a preference (18.7-fold) for the DNA-OG template over the DNA-G template (Table 1). Even though the extension efficiency (Vmax/KM) of KF Pol for the DNA-OG template (66.3 × 10−3 min−1 μM−1) was higher than that of Bsu Pol (27.3 × 10−3 min−1 μM−1), KF Pol also had a higher extension efficiency for the DNA-G template (3.5 × 10−3 min−1 μM−1) than Bsu Pol (0.8 × 10−3 min−1 μM−1), which leads to a lower discrimination between OG and G (18.7-fold) compared with Bsu Pol (34.1-fold). The other DNA polymerases did not show good discrimination between OG and G in DNA (Table 1).
| DNA polymerase | DNA template | Vmax (% min−1) | KM (μM) | Vmax/KM (min−1 μM−1) | Discrimination |
|---|---|---|---|---|---|
| Bsu | DNA-G | 14.5 ± 2.3 | 178.6 ± 4.9 | 0.8 × 10−3 | 34.1 |
| Bsu | DNA-OG | 25.1 ± 2.4 | 9.2 ± 0.6 | 27.3 × 10−3 | |
| KF | DNA-G | 26.0 ± 2.0 | 73.7 ± 7.2 | 3.5 × 10−3 | 18.7 |
| KF | DNA-OG | 34.5 ± 0.5 | 5.2 ± 1.1 | 66.3 × 10−3 | |
| The | DNA-G | 56.3 ± 1.7 | 2.9 ± 0.7 | 194.1 × 10−3 | 1.3 |
| The | DNA-OG | 53.7 ± 1.8 | 2.1 ± 0.8 | 255.7 × 10−3 | |
| Vent | DNA-G | 9.0 ± 1.1 | 274.8 ± 17.4 | 0.3 × 10−3 | 1.3 |
| Vent | DNA-OG | 17.0 ± 5.6 | 451.0 ± 46.7 | 0.4 × 10−3 | |
| Taq | DNA-G | 12.6 ± 0.4 | 90.5 ± 21.0 | 1.4 × 10−3 | 2.1 |
| Taq | DNA-OG | 6.4 ± 0.6 | 21.6 ± 3.5 | 3.0 × 10−3 | |
| Tth | DNA-G | 5.4 ± 0.5 | 228.3 ± 15.4 | 0.2 × 10−3 | 1.0 |
| Tth | DNA-OG | 6.2 ± 0.6 | 275.8 ± 18.2 | 0.2 × 10−3 | |

a Vmax, the maximum rate of the enzyme reaction. KM, the Michaelis constant. Discrimination = (Vmax/KM) DNA-OG/(Vmax/KM) DNA-G.
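To make the discrimination formula in the table footnote concrete, the short Python sketch below recomputes Vmax/KM and the OG/G discrimination from the mean Vmax and KM values listed in Table 1. The function names are hypothetical and the uncertainties are ignored. Because the table reports rounded means, the recomputed ratios are expected to be close to, but not necessarily identical with, the tabulated discrimination values.

```python
def efficiency(vmax_percent_per_min, km_um):
    """Vmax/KM in min^-1 uM^-1, converting Vmax from % min^-1 to a fraction."""
    return (vmax_percent_per_min / 100.0) / km_um


def discrimination(og, g):
    """(Vmax/KM) for the DNA-OG template divided by (Vmax/KM) for DNA-G."""
    return efficiency(*og) / efficiency(*g)


# Mean (Vmax in % min^-1, KM in uM) values copied from Table 1.
table1 = {
    "Bsu": {"G": (14.5, 178.6), "OG": (25.1, 9.2)},
    "KF": {"G": (26.0, 73.7), "OG": (34.5, 5.2)},
    "The": {"G": (56.3, 2.9), "OG": (53.7, 2.1)},
    "Vent": {"G": (9.0, 274.8), "OG": (17.0, 451.0)},
    "Taq": {"G": (12.6, 90.5), "OG": (6.4, 21.6)},
    "Tth": {"G": (5.4, 228.3), "OG": (6.2, 275.8)},
}

results = {name: discrimination(row["OG"], row["G"]) for name, row in table1.items()}
for name, value in results.items():
    print(f"{name}: {value:.1f}-fold")
```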
It is worth noting that single-nucleotide primer extension by Tth Pol with dATP scarcely occurred for either the DNA-G or the DNA-OG template (Fig. 2B and Table 1), indicating that Tth Pol has very low tolerance for both A:G and A:OG base pairing. We then further performed the steady-state kinetics study of Tth Pol using dCTP. The result showed that Tth Pol preferentially incorporated dCTP, rather than dATP, opposite OG (Table S3 and Fig. S2 in ESI†).
A previous crystal structure of DNA polymerase ι (Pol ι) in complex with OG-containing DNA revealed that, in the narrow active site of Pol ι, the adenine nucleoside adopts a syn conformation, which destabilizes the A:OG base pair and favours the C:OG base pair.23 It is possible that Tth Pol has a similar mechanism to selectively incorporate C opposite OG. Collectively, the properties of Bsu Pol selectively incorporating A opposite OG and Tth Pol selectively incorporating C opposite OG offer the possibility to develop an approach for quantitative analysis of OG through comparing the primer extension products with Bsu Pol and Tth Pol.
First, we used four synthetic DNAs with the (TTAG1G2G3)4 repeat sequence, including DNA-1 (G1, G2, and G3 represent guanine), DNA-2 (G1 and G2 represent guanine; G3 represents OG), DNA-3 (G1 and G3 represent guanine; G2 represents OG), and DNA-4 (G2 and G3 represent guanine; G1 represents OG), as well as three primers labeled with different fluorophores (Table S1 in ESI†) to evaluate the quantification (Fig. 4A). FAM-primer-2, Cy5-primer, and Cy3-primer target G1, G2, and G3 in the repeat sequences, respectively. With the use of Cy3-primer, Bsu Pol can incorporate dATP opposite the OG at the G3 position of DNA-2 (Fig. 4B, Lane 2), but not with the other DNA templates (DNA-1, DNA-3 and DNA-4). Similarly, Bsu Pol can incorporate dATP opposite the OG at the G2 position of DNA-3 with Cy5-primer (Fig. 4B, Lane 7) and opposite the OG at the G1 position of DNA-4 with FAM-primer-2 (Fig. 4B, Lane 12). However, no extended band was observed with the DNA templates containing only G (Fig. 4B, Lane 1, Lane 5 and Lane 9). The results suggest that the single-nucleotide primer extension assay by Bsu Pol has good selectivity for quantitative analysis of OG in G-rich sequences of DNA.
We next performed quantitative analysis of OG in telomeric DNA from HeLa cells. Similarly, we used the three primers (Cy3-primer, Cy5-primer, and FAM-primer-2) that target the different G positions in the repeating sequence of 5′-(TTAG1G2G3)n-3′ in telomeric DNA (Fig. 5). The single-nucleotide primer extension assay was performed with 100 μM dATP and 0.5 U Bsu Pol in the 10 μL mixture at 37 °C for 20 min, conditions under which the fluorescent primers are not extended opposite G in DNA (Fig. 4B). The result showed that stronger signals of the primer extension products were observed with all three primers in H2O2-treated HeLa DNA compared to control DNA (Fig. 5), suggesting that H2O2 treatment can induce the generation of OG residues at all of the guanine sites of telomeric DNA. In addition, we also observed weak signals of the primer extension products in the control DNA, indicating that OG might exist endogenously in telomeric DNA.
It is possible that endogenous OG sites may be converted to thymine through replication. The products from the primer extension assay may therefore be attributed to the existence of either OG or thymine in HeLa DNA, since Bsu Pol can incorporate adenine opposite OG or thymine (Fig. S3 in ESI†). In this respect, we further used Tth Pol instead of Bsu Pol to perform the same single-nucleotide primer extension reaction, because Tth Pol cannot incorporate adenine opposite OG (as can be seen from Fig. 2) and only incorporates adenine opposite thymine (Fig. S4 in ESI†). We first mixed the DNA-G and DNA-T templates (detailed sequence information can be found in Table S1 in ESI†) at different ratios and performed the primer extension assay by Tth Pol with dATP as the substrate. The result showed that the extension product can be observed with as little as 0.2% DNA-T (DNA-T/(DNA-T + DNA-G)), suggesting that even a low amount of thymine can be detected by using Tth Pol (Fig. S5 in ESI†). The single-nucleotide primer extension reaction using Tth Pol and HeLa genomic DNA gave no extension product with any of the fluorophore-labeled primers (Fig. S6 in ESI†), indicating that no detectable OG-converted thymine was present in vivo. Collectively, these results demonstrated that the single-nucleotide primer extension assay by Bsu Pol can be employed for quantitative analysis of OG in telomeric DNA.
The steady-state kinetics study also showed that Tth Pol preferentially incorporates dCTP rather than dATP opposite OG (the Vmax/KM values are 1.7 × 10−3 and 0.2 × 10−3 min−1 μM−1 for incorporation of dCTP and dATP opposite OG, respectively; Table S3 in ESI†). Here we also performed the primer extension using Tth Pol followed by PCR amplification and sequencing (Fig. 6A). The result showed that a distinct guanine signal, but no thymine signal, was observed at the original OG site (Fig. 6B).
We also synthesized DNA carrying multiple sites of OG (L-DNA-OG2 and L-DNA-OG3, Table S1 in ESI†) and performed the same sequencing analysis. The results showed that thymine signals were observed at all the original OG sites by Bsu Pol extension (Fig. S7 in ESI†); however, only guanine signals were observed at the original OG sites and no thymine signal was observed by Tth Pol extension (Fig. S8 in ESI†). This result demonstrated that multiple OG sites in DNA can be distinctly detected by sequencing. Therefore, by comparing the primer extension product using Bsu Pol and Tth Pol, mapping of OG in DNA can be achieved.
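The comparison logic described above can be sketched in a few lines of Python. This is a simplified illustration only: it assumes that the reference sequence and the consensus sequences obtained after Bsu Pol and Tth Pol extension are already aligned, gap-free, of equal length, and reported on the same strand; the function name and example sequences are hypothetical.

```python
def call_og_sites(reference, bsu_read, tth_read):
    """Return 0-based positions called as candidate OG sites.

    A position is called when the reference base is G, the sequence obtained
    after Bsu Pol extension reads T, and the sequence obtained after Tth Pol
    extension still reads G.
    """
    if not (len(reference) == len(bsu_read) == len(tth_read)):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (ref, bsu, tth) in enumerate(zip(reference, bsu_read, tth_read))
        if ref == "G" and bsu == "T" and tth == "G"
    ]


# Hypothetical example: position 3 reads T only after Bsu Pol extension.
reference = "ATGGCATGCA"
bsu_read = "ATGTCATGCA"
tth_read = "ATGGCATGCA"
print(call_og_sites(reference, bsu_read, tth_read))
```

Under these assumptions, a position that reads T in both sequences is not called, which mirrors the use of Tth Pol in this work to distinguish OG from a pre-existing thymine.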
The guanine base at a given site of DNA may not be fully oxidized to OG; thus, it is important to obtain the frequency of conversion of guanine to OG at a specific site in genomic DNA. Hence, we performed quantitative evaluation of the level of OG at a given site in DNA using primer extension followed by sequencing. Various ratios of L-DNA-OG1/L-DNA-G were mixed as the DNA template for primer extension by Bsu Pol in the presence of the four dNTPs. The Sanger sequencing result showed that only a G signal was observed from the 100% L-DNA-G template and only a T signal was observed from the 100% L-DNA-OG1 template at the original OG site; however, both G and T signals were observed from the 35% and 70% OG-containing DNA templates at the original OG site (Fig. 7A). To evaluate quantitatively the G-to-T conversion rates at the original OG site, we further performed clone sequencing. The result showed that the ratio of T/(T + G) at the original OG site increased linearly with the increased ratio of L-DNA-OG1/L-DNA-G (Fig. 7B), indicating that the method allowed for site-specific determination of the modification stoichiometry of OG in DNA.
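A minimal Python sketch of how such a calibration could be used is given below. It computes T/(T + G) from clone counts, fits a straight line to the calibration mixtures by ordinary least squares, and inverts the line to estimate the OG fraction of an unknown sample. All clone counts are hypothetical, idealized values chosen for illustration (they are not data from Fig. 7B), and the function names are assumptions.

```python
def t_fraction(t_count, g_count):
    """T/(T + G) among clones or reads at the site of interest."""
    total = t_count + g_count
    if total == 0:
        raise ValueError("no informative clones or reads")
    return t_count / total


def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_og_fraction(t_frac, slope, intercept):
    """Invert the calibration line to estimate the OG fraction at a site."""
    return (t_frac - intercept) / slope


# Input OG fractions of the calibration mixtures (0%, 35%, 70%, 100%) and
# hypothetical, idealized (T, G) clone counts for each mixture.
og_input = [0.0, 0.35, 0.70, 1.0]
clone_counts = [(0, 40), (14, 26), (28, 12), (40, 0)]

observed = [t_fraction(t, g) for t, g in clone_counts]
slope, intercept = fit_line(og_input, observed)
print(estimate_og_fraction(t_fraction(20, 20), slope, intercept))
```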
We further employed high-throughput sequencing to assess quantitatively the level of OG in DNA using various ratios of L-DNA-OG1/L-DNA-G as the DNA template. The result showed that the percentage of the “T” signal at the OG site (total number of T/total number of reads) increased linearly with the increased level of OG (Fig. S9 in ESI†), suggesting the capability of high-throughput sequencing for the quantitative measurement of OG in DNA. In addition, we explored whether the OG-induced mutation rate could be modulated by the flanking sequence context by employing high-throughput sequencing analysis of the primer extension products of synthesized DNA containing NN-OG-NN, where N is an equimolar mixture of A, T, C and G. Shown in Fig. S10 in ESI† is a schematic illustration of the procedures for the library construction. The sequencing result showed similar frequencies of the four natural nucleosides flanking the G-to-T mutation site (Fig. 8), indicating that the sequence context of the flanking nucleosides does not play an apparent role in the Bsu-mediated G-to-T mutation at the OG site. Collectively, the above results suggested that, in conjunction with Bsu-mediated DNA replication, high-throughput sequencing is capable of mapping OG sites in DNA.
We next applied the Sanger sequencing method to detect OG in three genes (VEGFA, TP53, and KRAS) from DNA of HeLa cells. Because the endogenous level of OG in genomic DNA is relatively low and may not be easily detected by Sanger sequencing, we also performed the analysis of OG in H2O2-treated DNA of HeLa cells. Shown in Fig. S11 in ESI† is the schematic illustration of the analytical procedure. The results showed a clearly increased thymine signal at certain sites of the H2O2-treated DNA of HeLa cells (Fig. 9), suggesting that this sequencing method can be applied for the analysis of OG at specific sites of interest in DNA. In addition, we noticed that the H2O2 treatment induced the formation of OG at the 5′-G in the GG motif and at the middle G of the GGG motif. This observation is in line with previous studies showing that the 5′-G in the GG motif and the 5′- or middle G of the GGG motif are more susceptible to oxidation.33,34
Sanger sequencing is appropriate for exploring the sequence information of small specific regions and is not suitable to afford sequence information of the whole genome. On the other hand, parallel analysis of the primer extension products with Bsu Pol and Tth Pol followed by high-throughput sequencing should be useful for detection of OG in genomic DNA. High-throughput sequencing can provide vast quantities of DNA sequencing data. However, sequencing errors can be introduced during PCR amplification including that mediated by OG.35 Lou et al.36 reported a circle-sequencing library preparation method that improved the error rate associated with high-throughput sequencing. They found that the addition of formamidopyrimidine-DNA glycosylase (Fpg) during rolling circle amplification dramatically eliminated the majority of errors caused by OG, which could be due to the removal of OG. In addition, Costello et al.37 and Chen et al.38 used the fact that OG leads to a global imbalance between variants detected in read 1 (R1) and read 2 (R2) in paired-end sequencing. The degree of this imbalance is correlated with the amount of OG present in a sample, which can be potentially used for genome-wide mapping of OG in DNA.
However, these studies mainly focus on lowering the error rate associated with high-throughput sequencing, and the genome-wide mapping of OG in DNA has not yet been performed. Genome-wide mapping of OG in DNA with these methods entails a repair-enzyme system to repair OG followed by comparing base calls between repaired and un-repaired samples, which requires sophisticated algorithms to analyse the sequencing data and then identify the precise location of OG in genomic DNA. In addition, the repair enzyme Fpg does not act exclusively on OG. Apart from the OG base, other modified bases can also be recognized and removed by Fpg, such as fapy-adenine (4,6-diamino-5-formamidopyrimidine), 5-hydroxycytosine and 5-hydroxyuracil,39,40 which will complicate the readouts of sequencing. It was also reported that this enzymatic repair is not totally accurate and might be a source of error.41 Moreover, different DNA polymerases may have different properties in incorporating nucleotides opposite OG. Not all DNA polymerases incorporate an adenine nucleotide opposite OG during DNA replication and induce G-to-T transversion. Therefore, the properties of the DNA polymerases used in the replication of OG-containing DNA fragments in library preparation need to be carefully evaluated if these analytical strategies are to be used.
Extensive efforts have been made in recent years to modify standard high-throughput sequencing protocols to increase the fidelity. The sensitivity to detect rare variants has been reported in the range of 10−8 to 10−7 per base pair,42,43 with error rates being estimated to be as low as <10−11.42 Endogenous OG in DNA occurs at a frequency of approximately several OG per million nucleosides,25,44,45 which is in a similar range to those of 5-fdC (5-formyl-2′-deoxycytidine), 5-cadC (5-carboxy-2′-deoxycytidine), and m6dA (N6-methyl-2′-deoxyadenosine) in genomic DNA.46–51 Although the frequencies of these modifications are low in genomic DNA, genome-wide mapping of these modified nucleosides has been successfully achieved by high-throughput sequencing.49,52 Hence, genome-wide mapping of OG in DNA should be achievable by high-throughput sequencing. Along this line, antibody-based enrichment of OG-containing DNA fragments can be carried out in the library preparation in the same way as that for mapping m6dA in DNA.49 The enrichment can further improve the abundance of OG in the DNA sample and thereby alleviate the burden of high-throughput sequencing.
Since we found that Tth polymerase can faithfully insert a cytosine opposite OG, the repair-enzyme treatment system is not needed in the library preparation when using Tth polymerase. The parallel analysis of the primer extension products with Bsu and Tth polymerases followed by high-throughput sequencing can provide distinct detection of OG in DNA. The analytical strategy is straightforward and the analysis of sequencing data should be relatively easy to perform without the need for complex algorithms. Future application of this approach at the genome-wide scale will greatly increase our knowledge of the chemical biology of OG with respect to its epigenetic-like regulatory roles. Moreover, analysis of the dynamic changes of the contents and sites of OG in the genome under a variety of conditions (e.g., different stressors) will reveal the regulatory functions of OG in response to stresses.
| Footnotes | 
| † Electronic supplementary information (ESI) available. See DOI: 10.1039/c8sc04946g | 
| ‡ These authors contributed equally to this work. | 
| This journal is © The Royal Society of Chemistry 2019 |