Triplex-mediated analysis of cytosine methylation at CpA sites in DNA †

Modified triplex-forming oligonucleotides distinguish 5-methyl cytosine from unmethylated cytosine in DNA duplexes by differences in triplex melting temperatures. The discrimination is sequence-specific; dramatic differences in stabilisation are seen for CpA methylation, whereas CpG methylation is not detected. This direct detection of DNA methylation constitutes a new approach for epigenetic analysis.

The existence of 5-methyl cytosine (MeC) in genomic DNA has been known for more than 60 years.1 Although it has been established as an epigenetic marker involved in gene expression and transposable element suppression in mammals, plants and fungi, its full range of functions is still not understood.2 In mammals, cytosine methylation is estimated to occur at 70-80% of CpG sites in the genome.3 Non-CpG methylation also occurs, most notably in mammalian embryonic stem cells (ESCs),4 plants and insects.3,5 Higher relative levels of methylation of CpA sites are observed in pluripotent ESCs compared to differentiated cells, so CpA methylation is suggested to play a role in the origin and maintenance of the pluripotent lineage.4 In order to develop a complete understanding of the biological consequences of cytosine methylation in DNA, sequence-specific detection methods for MeC are urgently required. Several indirect methods exist,6 the most widespread being bisulphite sequencing. Treatment of DNA with bisulphite converts C to U, leaving MeC unchanged; subsequent sequencing of the treated and untreated samples allows for differentiation between C and MeC.7 However, this method is labour-intensive, much of the DNA is destroyed in the process, and incomplete conversion of C to U leads to sequencing errors.2c,6,7b Direct detection methods have been investigated,8 but are not in general use.
With a view to providing sequence-dependent detection of DNA methylation, we investigated the potential of triplex formation to discriminate MeC from C. We previously showed that a series of modified triplex-forming oligonucleotides (TFOs) yield stable triple helices at neutral pH with duplexes containing all four Watson-Crick base pair combinations (AT, TA, GC, CG).9 In these studies several analogues of N-methylpyrrolo-dC were investigated for improved CG recognition (Fig. 1).10 It was noted that when the modified nucleotide XP was placed opposite MeCG instead of CG, the thermal stability of the triplex was greatly decreased. This is illustrated by the denaturation curves in Fig. 2.
The incorporation of PhP into the TFO opposite a CG or MeCG base pair in the duplex provided the largest difference in triplex denaturation temperature relative to other XP monomers. Replacement of PhP by thymine (X = T) gave no discrimination between MeC and C (Table 1 and Table S1 †). The sequence dependence of triplex melting was next evaluated using TFOs containing PhP and several duplexes containing C and MeC. The epigenetically relevant CpG and CpA dinucleotide sequences were incorporated in some of the DNA duplexes (Table 2). The duplexes in set 3 hp and set 5 hp are hairpins linked via a hexaethylene glycol spacer.
The melting temperatures of the DNA duplexes (Table 2) containing MeC and C flanked by adenine or guanine are shown in Table 3. The differences in denaturation temperature (ΔTm) between equivalent MeC and C duplexes followed the trend AYA > GYA > AYG > GYG. Triplex discrimination of DNA sequences containing CpA is particularly striking. For the GYG-containing duplex (set 2), a much lower triplex melting temperature was observed irrespective of the incorporation of MeC or C in the duplex DNA. This is due to the presence of a TA inversion (G.TA triplet) in the triplex.
The triplex stability of set 2 was also investigated with X = T in place of PhP. As expected, no significant difference was seen in the Tm values (Tm = 16.9 °C when Y = C, Tm = 17.0 °C when Y = MeC). Set 3 and set 3 hp are identical in sequence (Table 2) and show little difference in triplex Tm, indicating that hairpin and two-stranded duplexes have very similar thermal properties. C/MeC discrimination in different triple helices with the same central tri- and dinucleotide sets is not quite identical, but the similarities between sequences with identical central motifs far outweigh the influence of the wider sequence context.
The same melting studies were performed at pH 5.8 and, as expected, the triplex melting temperatures were slightly higher than those at pH 6.2 (Table S5 †). This is because at pH 5.8 the cytosine base is protonated to a greater extent, which stabilises the triplex relative to higher pH values.11 The lower pH did not significantly affect triplex discrimination of C- and MeC-containing DNA duplexes. The guanidinylated pyrrolopyrimidine analogue GP (ref. 10) was then evaluated at pH 5.8 in set 2 and set 5 to determine whether it would show better discrimination towards MeCpG. This was not the case, and GP was not investigated further (Table S1 †).
Other epigenetic modifications of importance include 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC), and 5-carboxycytosine (caC). These are considered to be either intermediates in cytosine demethylation or epigenetic marks in their own right.12 To test the ability of triplex probes to differentiate between these cytosine modifications, the interactions between TFOs and DNA duplexes (sets 1 and 2, Table 2) containing hmC, fC and caC were evaluated at pH 5.8 (Table 4). For set 1, caC and especially fC were found to afford triplexes of higher thermal stability than MeC and hmC, though still lower than the Tm found for C. Although these differences are modest, there is promise that the triplex approach could be used to distinguish fC from the other C-analogues. The triplex melting for set 2 was the same irrespective of the cytosine modification incorporated in the duplex.
Table footnotes: Final Tm values are averages of at least three measurements. (a) A transition is also seen at lower temperature, most likely due to weak duplex secondary structure. It is also observed when the Tm is measured for the duplex only, and is not seen in the hairpin structure (Tables S2 and S3 †).
The key observation from the above studies is that TFOs containing PhP opposite CG base pairs at CpA sites strongly stabilise triplexes, but produce very unstable triplexes if the C is replaced by MeC, hmC, and to a lesser extent fC and caC. The structural basis of this has not yet been elucidated, and although the triplex structures in Fig. 1 explain the difference in stability between C and MeC triplexes, they do not explain the observed sequence dependence of triplex melting. This may be influenced by the stability of the triplex region immediately surrounding PhP in addition to other factors. It is noteworthy that 5-methylation of C enhances hydrophobic interactions and has been found to influence DNA intercalation.13 Intercalation of PhP in addition to, or instead of, hydrogen-bonding (Fig. 1) cannot be ruled out, in which case the intercalation energy could be influenced by methylation of the duplex. It is also known that CpA methylation affects duplex structure in a manner distinct from CpG methylation. While CpG methylation appears to increase the rigidity of DNA, methylation at CpA sites appears to cause local conformational changes and an overall increase in the curvature of DNA.14 It is possible that this structural change is responsible for the remarkable difference in stabilisation seen with PhP in the different triplex sequence contexts.
It is surprising that CpA and non-CG methylation in general (CpH methylation) has not previously been the focus of systematic investigation; CpA is the second most common methylation site in most cell types,4,14a,15 and CpA sites are more abundant than CpG sites.16,17 CpH methylation has been linked to silencing of cancer genes in lymphoma and myeloma cell lines,18 and significant levels of CpH methylation have also been found in stably integrated plasmids19 and in human skeletal muscle.20 Interestingly, bisulphite sequencing is thought to underestimate the amount of CpH methylation unless primers are carefully designed to take this into account.20 CpA is thus an interesting and important target for methylation detection.
In conclusion, when the PhP nucleobase analogue is incorporated in TFOs, its effect on triplex stability in certain sequence contexts is strongly dependent on the methylation status of the cytosine directly opposite in the purine strand of the duplex. This method of determining cytosine methylation status has the advantage of being non-destructive towards the analysed DNA and does not require denaturation of double-stranded helices. The findings reported here demonstrate the promise of triplex probes for the determination of methylation status at specific DNA duplex sequences, as well as detection of other epigenetic marks such as fC. Systematic studies are underway on chemically modified TFOs in an attempt to extend the range of cytosine sequences containing epigenetically relevant modifications that can be analysed by this novel approach. Work is also in progress on epigenetically modified systems to enable us to fully understand the observed sequence dependence of triplex stability.
This research was funded by BBSRC grant BB/I022791: "Detecting cytosine methylation at the single DNA molecule level." Assistance with oligonucleotide synthesis from ATDBio is gratefully acknowledged.

Fig. 1 Proposed structure for the XP.CG and XP.MeCG triplets. H-bonds are shown with dashed lines, while for C, a potential dipolar C-H···O interaction is shown with a hashed line. R1 = 2'-deoxyribofuranose. Also shown are the modified nucleotides incorporated in the triplex-forming oligonucleotide sequence and the DNA duplex (set 1) studied. M = MeC, Y = C or MeC.

Fig. 2 Thermal denaturation of set 1 triplexes with PhP. (A) UV absorption at 260 nm recorded as a function of temperature from 15 °C to 80 °C. (B) Smoothed first derivative of the thermal denaturation curves shown in A. The first transition represents triplex denaturation (TFO dissociation), the second is denaturation of the duplex. 10 mM phosphate buffer, 200 mM NaCl, 1 mM EDTA, pH 6.2. PhP.CG (-), PhP.MeCG ( ).
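
The way transition temperatures can be read from a smoothed first derivative, as in Fig. 2B, may be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the absorbance curve is synthetic (two sigmoidal transitions placed at 35 °C and 65 °C, not measured data), and the function names, the moving-average smoothing, the window size and the slope threshold are hypothetical choices; the smoothing procedure actually used for Fig. 2 is not specified in the text.

```python
import math


def moving_average(values, window=5):
    """Centred moving average; the window shrinks at both ends of the data."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def first_derivative(x, y):
    """Central differences in the interior, one-sided differences at the ends."""
    n = len(x)
    slopes = []
    for i in range(n):
        lo = max(0, i - 1)
        hi = min(n - 1, i + 1)
        slopes.append((y[hi] - y[lo]) / (x[hi] - x[lo]))
    return slopes


def transition_temperatures(temps, absorbance, window=5, min_slope=1e-3):
    """Return temperatures at local maxima of the smoothed dA/dT curve.

    min_slope (absorbance units per degree) is an assumed threshold that
    suppresses spurious maxima in the flat regions of the curve.
    """
    smooth_abs = moving_average(absorbance, window)
    deriv = moving_average(first_derivative(temps, smooth_abs), window)
    peaks = []
    for i in range(1, len(deriv) - 1):
        is_local_max = deriv[i] > deriv[i - 1] and deriv[i] >= deriv[i + 1]
        if is_local_max and deriv[i] > min_slope:
            peaks.append(temps[i])
    return peaks


def synthetic_melting_curve(tm_triplex=35.0, tm_duplex=65.0):
    """Synthetic A260 curve from 15 to 80 °C (illustration only, not real data)."""
    temps = [15.0 + 0.5 * i for i in range(131)]

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    absorbance = [
        0.50
        + 0.05 * sigmoid((t - tm_triplex) / 2.0)   # triplex to duplex step
        + 0.10 * sigmoid((t - tm_duplex) / 2.0)    # duplex to single strands
        for t in temps
    ]
    return temps, absorbance


temps, absorbance = synthetic_melting_curve()
# The lower value corresponds to the triplex transition, the higher to the duplex.
print(transition_temperatures(temps, absorbance))
```

Under these assumptions, a ΔTm between two otherwise equivalent duplexes would be obtained by subtracting the first (triplex) transition temperature found for one curve from that found for the other.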

Table 1 Melting temperatures of triplex to duplex transitions at pH 6.2 with modified nucleotides in the TFO. Tm data measured in °C. All experiments performed with oligonucleotide set 1. 10 mM phosphate buffer, 200 mM NaCl, 1 mM EDTA, pH 6.2. Final Tm values are averages of at least three measurements, except where marked *, which are averages of two measurements.

Table 2 Triplex sequences investigated. M = MeC, X = PhP, Y = C or MeC, H = hexaethylene glycol.

Table 3 Melting temperatures of triplex to duplex transitions at pH 6.2. Tm data measured in °C. Y = C or MeC. 10 mM phosphate buffer, 200 mM NaCl, 1 mM EDTA, pH 6.2.

Table 4 Melting temperatures of triplex to duplex transitions for duplexes with other epigenetic marks in addition to MeC at pH 5.8. Tm data measured in °C. Y = C, MeC, hmC, fC or caC as written. 10 mM phosphate buffer, 200 mM NaCl, 1 mM EDTA, pH 5.8. Final Tm values are averages of at least three measurements.