Bruce Chilton, Ruby J. Roach, Patrick J. B. Edwards, Geoffrey B. Jameson, Tracy K. Hale and Vyacheslav V. Filichev*
School of Food Technology and Natural Sciences, Massey University, Private Bag 11-222, Palmerston North 4442, New Zealand. E-mail: v.filichev@massey.ac.nz
First published on 27th August 2024
DNA G-quadruplexes (G4) formed in guanine-rich sequences play a key role in genome function and maintenance, interacting with multiple proteins. However, structural and functional studies of G4s within duplex DNA have been challenging because of the transient nature of G4s and thermodynamic preference of G-rich DNA to form duplexes with their complementary strand rather than G4s. To overcome these challenges, we have incorporated native nucleotides in G-rich sequences using commercially available inverted 3′-O-DMT-5′-O-phosphoramidites of native nucleosides, to give 3′-3′ and 5′-5′ linkages in the centre of the G-tract. Using circular dichroism and 1H nuclear magnetic resonance spectroscopies and native gel electrophoresis, we demonstrate that these polarity-inverted DNA sequences containing four telomeric repeats form G4s of parallel topology with one lateral or diagonal loop across the face of the quadruplex and two propeller loops across the edges of the quadruplex. These G4s were stable even in the presence of complementary C-rich DNA. As an example, G4 assemblies of inverted polarity were shown to bind to the hinge region of Heterochromatin Protein 1α (HP1α), a known G4-interacting domain. As such, internal polarity inversions in DNA provide a useful tool to control G4 topology while also disrupting the formation of other secondary structures, particularly the canonical duplex.
The G4 structure is composed of stacked arrangements of quartets of guanines, referred to as G-tetrads. A G-tetrad is a square arrangement of four hydrogen-bonded guanines arranged around a central cation, typically Na+ or K+ (Fig. 1A).4,5 Two or more of these tetrads can stack on top of each other, forming G4s which are highly polymorphic, with different orientations of strands and loops depending on the loop sequences, pH, salt concentration or level of molecular crowding. Parallel G4s feature G-tracts all oriented in the same direction and have purely anti-guanosine glycosidic linkages, whereas antiparallel topologies have G-tracts oriented in opposite directions and a mixture of syn- and anti-guanosine glycosidic bonds (Fig. 1B–D). Additionally, structures may differ in the length or orientation of loops, such as lateral, diagonal or propeller loops.
When duplex DNA is unwound, while the G-rich strand can form a G4, the complementary C-rich strand also has the potential to form a secondary structure known as an i-motif.6 This structure is typically detected at a slightly acidic pH (approximately 5.5) as cytosine has to be hemi-protonated, forming C:C+ base pairs (Fig. 1E). i-Motifs can also be formed at neutral pH.7
Both G4 and i-motif structures have been detected in vivo.8 Immunostaining human MCF-7 breast cancer cells showed that G4s were predominantly observed during the S phase and i-motifs were observed during G1 phase. Induced formation of one structure using ligands known to stabilise that secondary structure reduced formation of the other, suggesting that formation was mutually exclusive. In vitro, single-molecule force-ramp assays, using optical tweezers, were carried out on DNA containing complementary G- and C-rich regions on opposing strands.9 Structures were stretched and the forces required to unfold the secondary structures for sequences containing complementary G- and C-rich strands were compared in different buffers expected to allow formation of different structures. When both structures were expected to form, the force-extension curves indicated that only one secondary structure was ever present. However, when the two structures were offset sufficiently, rather than being directly opposite each other, both structures were observed. The prevalence of i-motif forming regions in the genome suggests that these structures could also have an important role in cellular function.6,7
G4s and i-motifs are recognised by many proteins found in cells, although the role of interactions of these structures with cellular components is not well understood from structural and functional perspectives. G4-interacting proteins that are necessary for genome function and maintenance include the protein HP1α, which maintains heterochromatin,3 DNA methyltransferases, such as DNMT3A,10 the DNA helicase Pif1,11 and many more.12 In vitro biophysical experiments have shown that these interactions are topology specific.3,13
Further studies of these interactions require the ability to trap G4 topologies within duplex DNA strands, creating a more accurate model of DNA in cell nuclei. Studying non-canonical structures in these contexts, including large nucleosomal arrays, will advance our understanding of their importance in the genome. However, the free energy (ΔG) values for G4 formation compared to duplex formation primarily indicate that under physiological pH, temperature, and salt concentrations the canonical duplex is favoured over G4 structures.14–16 Single-molecule FRET experiments indicate that in solutions containing G-rich sequences and complementary C-rich sequences, a mixture of secondary structures is observed.17 Molecular crowding has been reported to encourage G4 formation,18 potentially indicating some preference for G4 structures in packed cellular environments. This suggests that G4 formation is transient in nature and obtaining accurate data requires the creation of large, accurate models of genomic DNA containing thermodynamically stable G4s. Inducing and controlling G4 and i-motif formation in vitro and in vivo is therefore a subject of considerable research. Several small-molecule ligands have been shown to stabilise certain non-canonical secondary structures, including G4s,19 but these ligands could interfere with protein-binding sites. These factors have to be considered when studying structural aspects of G4–protein interactions. Modified nucleotides, such as 2′-fluororibonucleic acid, have also been considered to create kinetically trapped G4s,20 but over time and upon heating and cooling down in the presence of the complementary strand, DNA duplexes are formed as the thermodynamic product.21
We hypothesised that DNA duplexes containing a folded thermodynamically stable G4 in the presence of a complementary C-rich strand could be obtained by introducing a polarity inversion in the G-rich sequence, 5′-3′-3′-5′ or 3′-5′-5′-3′. Such sequences create mismatches in a canonical duplex without affecting G4 formation (Fig. 2A and B). Polarity inversion can be accomplished using commercially available 3′-O-DMT-5′-O-phosphoramidites of native nucleosides, where the canonical positions of the DMT protecting group at 5′ and the phosphoramidite linking group at 3′ are interchanged. These modifications would not affect glycosidic bond configurations and G-tract orientations compared to native DNA. The only differences are the interaction with complementary DNA and the presence of a diagonal loop, which is absent in simple 5′-3′ DNA sequences predisposed to form parallel G4s. Canonical duplexes are antiparallel meaning the strands are oriented in opposite directions. With polarity inversion, some nucleotides will always be in an unfavourable orientation for duplex formation.
We have demonstrated that these inversions not only stabilise G4s and affect their topology but also almost completely disrupt duplex formation. However, the resulting change to the topology and loop arrangement of the modified G4 has a negative impact on the binding of HP1α, the protein used to test the interactions of modified G4s with a known G4-binding protein.
Name | Sequence | Complementary sequence, 5′-3′ | Name of complementary sequence (a) |
---|---|---|---|
tel | 5′-(TAG3T)2-3′ | (AC3TA)2 | c-tel |
Ttel | 5′-T(TAG3T)2-3′ | (AC3TA)2A | c-Ttel |
telT | 5′-(TAG3T)2T-3′ | A(AC3TA)2 | c-telT |
telTA | 5′-(TAG3T)2TA-3′ | TA(AC3TA)2 | c-telTA |
invtel-tel | 3′-(TG3AT)2-5′—5′-(TAG3T)2-3′ | (AC3TA)2(ATC3A)2 | c-invtel-tel |
invTtel-Ttel | 3′-(TG3AT)2T-5′—5′-T(TAG3T)2-3′ | (AC3TA)2AA(ATC3A)2 | c-invTtel-Ttel |
telT-invtelT | 5′-(TAG3T)2T-3′—3′-T(TG3AT)2-5′ | (ATC3A)2AA(AC3TA)2 | c-telT-invtelT |
telTA-invtelTA | 5′-(TAG3T)2TA-3′—3′-AT(TG3AT)2-5′ | (ATC3A)2ATTA(AC3TA)2 | c-telTA-invtelTA |
letT-Ttel | 5′-(TG3AT)2TT(TAG3T)2-3′ | (AC3TA)2AA(ATC3A)2 | c-invTtel-Ttel |

(a) There is no polarity inversion in the C-rich sequences complementary to the G-rich sequences with polarity inversion.
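The shorthand used in the sequence table above can be checked mechanically. The following is a minimal sketch, assuming that a digit after a base means a repeated base (G3 = GGG) and a digit after a bracket means a repeated block; the helper names `expand` and `reverse_complement` are illustrative and not part of the original work. It expands the unidirectional (non-inverted) G-rich entries and confirms that their Watson–Crick reverse complements match the listed C-rich strands.

```python
import re

# Watson-Crick pairing partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def expand(shorthand):
    """Expand shorthand such as 'T(TAG3T)2' into a plain base string."""
    # Step 1: single-base repeats, e.g. G3 -> GGG.
    # A count that follows ')' is left alone because ')' is not a base.
    expanded = re.sub(r"([ACGT])(\d+)",
                      lambda m: m.group(1) * int(m.group(2)), shorthand)
    # Step 2: bracketed repeats, e.g. (TAGGGT)2 -> TAGGGTTAGGGT.
    expanded = re.sub(r"\(([ACGT]+)\)(\d+)",
                      lambda m: m.group(1) * int(m.group(2)), expanded)
    return expanded


def reverse_complement(sequence):
    """Return the antiparallel complementary strand, written 5'-3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


# Unidirectional G-rich strands (written 5'-3') and their listed complements.
NATIVE_ROWS = {
    "tel": ("(TAG3T)2", "(AC3TA)2"),
    "Ttel": ("T(TAG3T)2", "(AC3TA)2A"),
    "telT": ("(TAG3T)2T", "A(AC3TA)2"),
    "telTA": ("(TAG3T)2TA", "TA(AC3TA)2"),
    "letT-Ttel": ("(TG3AT)2TT(TAG3T)2", "(AC3TA)2AA(ATC3A)2"),
}

for name, (g_rich, c_rich) in NATIVE_ROWS.items():
    matches = reverse_complement(expand(g_rich)) == expand(c_rich)
    print(f"{name}: {expand(g_rich)} pairs with listed complement: {matches}")
```

The polarity-inverted entries are deliberately left out of the check: each half of such a strand runs in the opposite direction, so no single unidirectional strand can be antiparallel to both halves at once, which is the basis of the duplex disruption described in the text.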
Unmodified G4s showed the characteristic peaks of antiparallel G4s in Na+ buffer (Fig. 3A), but in K+ buffer they formed a mixture of parallel and antiparallel topologies with a positive peak at 265 nm, a negative peak at 240 nm and a positive shoulder or a peak at 290 nm (Fig. 3B). DNA melting data shown in Table 2 also showed distinct T1/2 values for individual structures. The control sequence letT-Ttel had, as expected, the signature of an antiparallel G4 in both buffers, which differs significantly from the native bimolecular G4s.
Sequence | T1/2 in Na+ buffer (°C, ±3 °C) | T1/2 in K+ buffer (°C, ±3 °C) |
---|---|---|
tel | 37 (ap) | 37 (ap), 48 (p) |
Ttel | 34 (ap) | 48 (p) |
telT | 38 (ap) | 40 (ap), 45 (p) |
telTA | 38 (ap) | 50 (p) |
invtel-tel | 76 (ap) | 83 (p)(i) |
invTtel-Ttel | 34 (ap), 75 (p) | 84 (p)(i) |
telT-invtelT | 68 (ap) | 74 (p) |
telTA-invtelTA | 62 (ap) | 73 (p) |
letT-Ttel | 51 (ap) | 58 (ap) |

G4 topology: (p) denotes parallel topology, (ap) denotes antiparallel topology. (i) Estimated value, as the complex was only partially melted at 90 °C.
In Na+ buffer the sequences containing polarity inversions gave antiparallel CD profiles similar to the parent unmodified sequences, except for invTtel-Ttel, which gave a small positive peak at 255 nm (Fig. 3C). The overall shape, position and relative intensities of CD peaks of invTtel-Ttel are similar to those in the CD spectrum of the G3T4G3 sequence, which forms a bimolecular antiparallel G4 with one G-quartet having the opposite hydrogen bond polarities to the other two G-quartets (PDB structure 1FQP).31,32 However, as with the unmodified sequences, the addition of K+ ions resulted in a significant shift towards the parallel topology. The invtel-tel and telT-invtelT sequences gave mixed topologies similar to the unmodified parent sequences, but invTtel-Ttel and telTA-invtelTA with lengthened diagonal loops shifted completely to a parallel topology in K+ buffer, lacking the shoulder at 290 nm that is diagnostic for the presence of G4 with antiparallel topology (Fig. 3D). The number of components present in all solutions was further analysed using singular value decomposition (SVD) of the melting profile, described below. The longer diagonal loop (six nucleotides for invTtel-Ttel and telTA-invtelTA) alleviates strain on a parallel G4 topology leading to only parallel G4 for these sequences in K+ buffer. Loop length is, therefore, critical to enhance the stability of parallel G4 topology for polarity-inverted sequences over native sequences.
The thermal stability of the resulting G4s was analysed with CD spectroscopy recorded at temperatures in the range of 15–90 °C. Melting temperature (T1/2) is defined as the temperature at which half of the oligonucleotides are folded into G4s. T1/2 values (Table 2, Fig. SI-3 and SI-4†) for unmodified sequences in Na+ buffer were between 35–40 °C, significantly lower than T1/2 of letT-Ttel (51 °C), which was expected from the change in molecularity and strand length. Stability of antiparallel G4s with polarity inversions in Na+ buffer was even greater than that of letT-Ttel, with T1/2 in the range 62–76 °C. invTtel-Ttel showed a second melting in Na+ buffer at high temperature that was assigned to the parallel G4 species present as the CD peak of positive ellipticity shifted from 255 nm to 265 nm during melting at lower temperatures.
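The definition of T1/2 as the temperature of half-folding can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the authors' procedure: it takes fraction-folded values as given, and finds the temperature at which they cross 0.5 by linear interpolation between neighbouring points. The function name `half_transition_temperature` and the example data are hypothetical.

```python
def half_transition_temperature(temperatures, fraction_folded, level=0.5):
    """Return the temperature at which the folded fraction crosses `level`.

    Uses linear interpolation between the two neighbouring data points.
    Returns None when the curve never crosses the level within the data,
    for example when a complex is only partially melted at the last point.
    """
    for i in range(len(temperatures) - 1):
        f0, f1 = fraction_folded[i], fraction_folded[i + 1]
        crosses = (f0 - level) * (f1 - level) <= 0
        if crosses and f0 != f1:
            t0, t1 = temperatures[i], temperatures[i + 1]
            return t0 + (level - f0) * (t1 - t0) / (f1 - f0)
    return None


# Hypothetical melting curve sampled every 2.5 degrees C (not measured data).
example_temperatures = [40.0, 42.5, 45.0, 47.5, 50.0, 52.5]
example_fractions = [0.95, 0.85, 0.70, 0.45, 0.25, 0.10]

t_half = half_transition_temperature(example_temperatures, example_fractions)
print(f"Estimated T1/2: {t_half:.1f} degrees C")

# A curve that is still mostly folded at the highest temperature has no crossing.
incomplete = half_transition_temperature([80.0, 85.0, 90.0], [0.9, 0.8, 0.7])
print("Incomplete melt:", incomplete)
```

Returning `None` for an incomplete transition reflects the situation flagged in Table 2, where some complexes were only partially melted at 90 °C and T1/2 could only be estimated.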
In K+ buffer the T1/2 of all G4s, parallel and antiparallel, was higher than in Na+ buffer. Moreover, the T1/2 values of the polarity-inverted sequences were higher than the T1/2 of letT-Ttel (58 °C) (Fig. SI-5†), with invtel-tel and invTtel-Ttel having T1/2 in excess of 83 °C. The sequences telT-invtelT and telTA-invtelTA, which had a lower T1/2 in Na+ buffer than invtel-tel and invTtel-Ttel, also showed lower T1/2 in K+ buffer. Bimolecular G4s typically have lower thermal stability than unimolecular G4s. This is shown in the increased T1/2 of letT-Ttel compared to unmodified G4s, but the T1/2 values of modified G4s are increased further while also maintaining the parallel topology. Both the initial CD results and the T1/2 measurements show that the presence of inversions encouraged formation of G4s of parallel topology with increased stability. The Ttel and invTtel-Ttel sequences were used to determine whether hysteresis occurs. In both cases, hysteresis was observed, but invTtel-Ttel reformed more completely after cooling at a rate of 0.625 °C min−1 (Fig. SI-6†). This fits the behaviour expected for a unimolecular G4 when compared to the bimolecular Ttel sequence.
The number of significant components in the solutions was assessed by applying SVD to the melting data using guidelines proposed by Gray and Chaires.33 Briefly,
$$D = U S V^{T} \qquad (1)$$
D is the CD-data matrix, in which each column represents a CD spectrum at a specific temperature. The column vectors of U contain the basis spectra showing the spectral form of each component. These are weighted by significance according to the singular values in the diagonal matrix S. The columns of V show the temperature-dependent contribution of each basis spectrum.
The number of components can be assessed from a combined consideration of: the location of an elbow in a plot of the singular values vs. component number; the relative variance of the singular values; the amplitude of the S-scaled U vectors; and the decay of the first-order autocorrelation functions of the U and V matrix columns.33 Typical behaviour of systems considered to contain three components, and of the more prevalent two-component systems, is shown in Fig. SI-7 and 8,† respectively. Samples containing two components typically contained the G4 topology indicated in Table 2 and a spectrum corresponding to unstructured DNA. If SVD analysis indicated more than two components were present, the third component was an additional G4 topology with significantly less weight in the overall spectra.
SVD analysis of melting data generally agreed with the above interpretation of CD melting data. The unmodified sequences tel, Ttel and telT were all indicated to contain three components (Fig. SI-7†). The primary component is an antiparallel G4 in Na+ buffer and a parallel G4 in K+ buffer. The second component was unstructured DNA and the third component in all cases appears to be G4 with the opposite topology, with approximately 10% the weight of the first component. This was not the case for the sequences with inverted polarity and telTA. SVD analysis of these sequences shows only two components, folded and unfolded structures (Fig. SI-8†), with topologies as indicated in Table 2. The primary exception is invTtel-Ttel in Na+-containing buffer, which indicates three components and appears to switch from an antiparallel topology to a parallel topology before melting completely. Additionally, telT-invtelT contains a third component, an antiparallel G4, in both Na+- and K+-containing buffers which is slightly below the 0.8 cutoff value. In general, detailed SVD analysis agrees with the initial analysis of CD spectra indicating that the introduction of polarity inversions stabilises the parallel topology and promotes the shift from three component, mixed G4 topologies to two component, parallel G4 topologies, particularly in K+-containing buffers.
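The component-counting criteria discussed above can be sketched with NumPy on synthetic data. This is only an illustration under stated assumptions: the data matrix is simulated (two invented basis spectra, a sigmoidal melting curve and Gaussian noise), the first-order autocorrelation is taken as the sum of products of adjacent elements in each unit-norm column of U and V, and a component is counted only while both autocorrelations stay at or above the 0.8 cutoff mentioned in the text. The function names are hypothetical, and the authors' actual analysis may differ in detail.

```python
import numpy as np


def first_order_autocorrelation(matrix):
    """Sum of products of adjacent elements in each column.

    For smooth, unit-norm columns the value approaches 1; for noise-like
    columns it scatters around 0.
    """
    return np.sum(matrix[:-1, :] * matrix[1:, :], axis=0)


def analyse_melting_matrix(data, cutoff=0.8):
    """SVD of a (wavelength x temperature) CD matrix, D = U S V^T."""
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    auto_u = first_order_autocorrelation(u)
    auto_v = first_order_autocorrelation(vt.T)
    relative_variance = s**2 / np.sum(s**2)
    # Count leading components whose U and V columns both look like signal.
    n_components = 0
    for a_u, a_v in zip(auto_u, auto_v):
        if a_u < cutoff or a_v < cutoff:
            break
        n_components += 1
    return n_components, s, relative_variance, auto_u, auto_v


# Synthetic two-state melt: invented spectra, not measured data.
rng = np.random.default_rng(0)
wavelengths = np.arange(220.0, 351.0, 1.0)   # nm, 131 points
temperatures = np.arange(15.0, 92.5, 2.5)    # degrees C, 31 points
folded = (10.0 * np.exp(-((wavelengths - 265.0) / 12.0) ** 2)
          - 4.0 * np.exp(-((wavelengths - 240.0) / 8.0) ** 2))
unfolded = 2.0 * np.exp(-((wavelengths - 275.0) / 15.0) ** 2)
fraction = 1.0 / (1.0 + np.exp((temperatures - 50.0) / 4.0))

data = np.outer(folded, fraction) + np.outer(unfolded, 1.0 - fraction)
data += rng.normal(0.0, 0.05, size=data.shape)  # measurement noise

n_components, s, rel_var, auto_u, auto_v = analyse_melting_matrix(data)
print("Significant components:", n_components)
print("Leading singular values:", np.round(s[:4], 2))
print("Autocorrelation of U columns:", np.round(auto_u[:4], 2))
print("Autocorrelation of V columns:", np.round(auto_v[:4], 2))
```

In practice the autocorrelation test would be read alongside the singular-value elbow and the relative variances rather than used on its own, as the combined criteria in the text indicate.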
To evaluate whether these G4s resolve to duplex DNA in the presence of their complementary strands (Fig. 4A–E) at room temperature, the G4 was formed and the complementary strand (24–28 nucleotides) was then added. To obtain the thermodynamic product, this mixture was heated at 90 °C for 5 min and then slowly cooled to 4 °C. To evaluate the stability of the G4, these mixtures were then separated by non-denaturing PAGE (Fig. 4A), along with the native and modified G4s and complementary strands as controls.
Fig. 4 (A) 20% Non-denaturing PAGE showing duplex formation of Ttel and invTtel-Ttel with various complementary sequences (a composite image from a single gel). Lanes 1 and 7: oligothymidylate ladder; lane 2: Ttel; lane 3: Ttel + complementary strand after heating; lane 4: complementary strand c-invTtel-Ttel (control); lane 5: invTtel-Ttel + complementary strand after heating; lane 6: invTtel-Ttel. See also Fig. SI-9 in the ESI.† Strand concentration: 100 μM, buffer: 1 × TBE buffer supplemented with 100 mM KCl. Thermodynamic product is formed by mixing a G4 and a complementary sequence and heating the mixture at 90 °C for 5 min, then cooling slowly to 4 °C. (B) Possible products of interaction of native and polarity-inverted DNA sequences with complementary strands: products B and C are canonical duplexes formed by native G-rich DNA (e.g. Ttel) and complementary strands (e.g. (A) c-Ttel and (B) c-invTtel-Ttel); product D is a possible canonical duplex formed by polarity-inverted G4 (e.g. invTtel-Ttel) with two short complementary strands (e.g. c-Ttel) arranged in opposite directions; product E shows an expected disruption of a canonical duplex composed of a sequence with inverted polarity (e.g. invTtel-Ttel) with a complementary unidirectional strand of the same length (e.g. c-invTtel-Ttel).
In Fig. 4A and SI-9†, the lanes containing only G-rich sequences (lane 2 for native Ttel and lane 6 for invTtel-Ttel) showed very similar mobilities (corresponding to approx. the T10 marker of the oligonucleotide ladder in lanes 1 and 7) regardless of strand length, which is commonly seen when compact G4s are formed. The results shown in Fig. 4 and SI-9A† suggest that all sequences with inverted polarity could disrupt duplex formation. The unmodified G4 controls, shown in lane 2 on each gel, each had mobility around that of the T10 marker, with a new band of lower mobility appearing in lane 3, indicating formation of the duplex products (Fig. 4C) in the presence of complementary DNA. These bands are clearly distinct from the native G4 and its complementary strand, indicating formation of a new species. This is most obvious in the Ttel sample, with new bands appearing around the T15 and T25 markers for short and long complementary sequences, respectively.
In contrast, upon addition of a complementary strand (e.g. lane 5 in Fig. 4A), multiple bands, corresponding to the individual components as seen in lanes 4 and 6 of Fig. 4A, were observed, rather than formation of a duplex product such as the one shown in lane 3 of Fig. 4A for the unmodified sequence. Similar gels for telTA-invtelTA and invtel-tel are shown in Fig. SI-9B and C,† respectively. For invTtel-Ttel only faint low retarding bands were observed in the thermodynamic product suggesting that the putative duplexes shown in Fig. 4D and E were not formed. For telTA-invtelTA and invtel-tel (Fig. SI-9B and C,† respectively), formation of duplexes C or D along with individual components were already observed at room temperature (lanes 9 and 11 for telTA-invtelTA and lane 6 for invtel-tel). Duplex products were more dominant after samples were heated and cooled down in the lanes for thermodynamic products. Preliminary CD and non-denaturing PAGE results demonstrate that the use of polarity-inverted nucleotide sequences in G4s inhibited duplex formation and improved G4 stability, particularly for the invTtel-Ttel sequence. However, the bands on native gels could not be quantified to show how effective these modifications were at disrupting duplex formation. Subsequently, 1H NMR spectroscopy was used as a method for more accurately ascertaining the secondary structure composition of our DNA mixtures.
For 1H NMR analysis we focused on the region from 10.5 to 14.5 ppm, which corresponds to the imino protons of guanosine. The chemical shift of these peaks differs considerably depending on hydrogen-bonding arrangements allowing for secondary structures to be readily distinguished, making this an ideal technique for comparing various secondary structures.34 G4s typically have chemical shifts of imino protons around 11–12 ppm, whereas for the canonical Watson-Crick duplex the imino protons appear around 13–14 ppm, and for i-motifs around 15–16 ppm. We also narrowed our initial focus to the invTtel-Ttel sequence which had shown a significant preference for parallel G4 topology, the greatest increase in T1/2 compared to the unmodified parent sequence, and almost no duplex formation on non-denaturing PAGE in the presence of its complements.
The 1H NMR spectra for Ttel, shown in Fig. 5A and D, were consistent with the expected G4 structure for this sequence. According to the spectra, in Na+ buffer a mixture of G4 topologies is present, as is evident from the appearance of multiple peaks of various intensities at 10.5–12 ppm. Upon addition of K+ (as KCl) a topology switch was observed for native Ttel, resulting in the symmetrical G4 reported for bimolecular tel sequences previously.35 This is also observed for invTtel-Ttel, suggesting that the polarity inversions did not significantly change the G-tetrad arrangement. On the other hand, the native control letT-Ttel does not exhibit a single conformation, instead forming a mixture of topologies in Na+ and K+ buffers. Variable temperature experiments were carried out (Fig. SI-10†), which confirm that the G4 containing polarity inversions is significantly more thermally stable than the unmodified Ttel G4.
Formation of symmetrical G4s is evident from the appearance of distinct imino signals with a 2:1:1:2 ratio of integrated signals at 10–12 ppm, which is half of what would be expected for a G4 formed by twelve guanosines. Interestingly, invTtel-Ttel also contains an additional peak at 14.5 ppm, which indicates the presence of Watson–Crick A–T base pairing occurring in the new elongated central loop. This peak is not observed for letT-Ttel, suggesting that this base pairing is only possible with the inversions present. Integration of this signal gives a value of 0.58, which means that there is one A–T base pair in a G4 with twelve guanines.
Next, the duplex-formation experiments analysed by non-denaturing PAGE were replicated using 1H NMR spectroscopy. When Ttel and c-Ttel were mixed, as shown in Fig. 6A, peaks began to appear around 13–14 ppm immediately, and after three days almost no peaks were visible in the 11–12 ppm region. Thus, G4 structures began to refold into duplexes almost instantly. Heating of samples followed by slow annealing resulted in a complete switch to the duplex, showing that the duplex is both the kinetic and thermodynamic product of mixing these two sequences. However, when invTtel-Ttel and c-invTtel-Ttel were mixed, as shown in Fig. 6B, almost no change was observed, and this was also the case after the mixture was heated and then cooled down. Peaks of low intensity were observable from 13–14 ppm, suggesting that an equilibrium is established, but this equilibrium strongly favours the G4 and not the duplex. One notable shift is the disappearance of the peak at 14.5 ppm, suggesting that the complementary sequence had some interaction with the loop but does not appear to interact with the G4. Overall, these results on adding (partially) complementary strands establish that the introduction of polarity inversions in G4s significantly diminishes Watson–Crick duplex formation, thereby increasing the stability of alternative secondary structures such as G4s.
Name | Sequence | Length (nt) | G4 topology (a) |
---|---|---|---|
Ttel-5′-Tail | [sequence image] | 40 | Parallel |
invTtel-Ttel-5′-Tail | [sequence image] | 40 | Parallel |
c-Ttel-5′-Tail | 5′-(AC3TA)2A2(ATC3A)2T4CA2TACATGC-3′ | 40 | — |
Ttel-3′-Tail | [sequence image] | 39 | Antiparallel |
invTtel-Ttel-3′-Tail | [sequence image] | 39 | Parallel |
c-Ttel-3′-Tail | 5′-G2CG2C2GCT4(AC3TA)2A2(ATC3A)2-3′ | 39 | — |
5′-Tail | 5′-GCATGTAT2G-3′ | 10 | — |
c-5′-Tail | 5′-CA2TACATGC-3′ | 10 | — |
3′-Tail | 5′-GCG2C2GC2-3′ | 9 | — |
c-3′-Tail | 5′-G2CG2C2GC-3′ | 9 | — |
Oligo B | 5′-TG3T2G3T2G3T2G3T2G3T2G3T2G3T2G3T-3′ | 40 | Parallel |
Oligo 2G | 5′-TG3T2AG3T2AG3T2AG3TG3T2AG3T2AG3T2AG3T-3′ | 45 | Antiparallel |

(a) 20 mM sodium phosphate buffer, 10 mM KCl, pH 7.0.
Sequences with inverted polarity favour a parallel G4 topology, even in Na+ buffer. In all cases the shoulder is observed at 290 nm, although this is least pronounced for invTtel-Ttel-5′-Tail. This shoulder is possibly caused by the presence of single-stranded DNA, which can give lower intensity CD signals at a range of wavelengths. The sequences added at either end were also tested independently with their complements and were shown to form stable duplexes, as shown by CD spectroscopy in Fig. 7C and 1H NMR spectroscopy in Fig. 8.
In 1H NMR experiments (Fig. 8, SI-11 and 12†), the unmodified controls do not have the same symmetrical features observed previously for the bimolecular G4s, instead giving spectra more similar to letT-Ttel. The symmetrical structure is still partially preserved in the case of polarity-inverted G4s, although individual peaks in the imino region are not as clearly defined as for the sequences without tails. Overall, this indicates that the original G4 structure was only preserved when inversions were introduced. The 9-mer 3′-Tail has five cytosines, which contribute to formation of a self-complementary duplex resulting in distinct peaks at 13–13.5 ppm for the G4 alone (Ttel-3′-Tail, Fig. 8A, spectrum ii). This structural feature is suppressed in the polarity-inverted sequence (invTtel-Ttel-3′-Tail, Fig. 8B, spectrum iii). Addition of the complementary strand to native sequences showed similar results to the bimolecular G4s, rapidly unfolding the G4 and forming antiparallel duplexes (Fig. 8A and C). In both cases the duplex appears to be both the kinetically and thermodynamically favoured product. Addition of complementary C-rich strands to the polarity-inverted sequences (Fig. 8B and D) showed the appearance of peaks at 12.5–14 ppm, corresponding to duplex formation, but the peaks in the 11–12 ppm range remained mostly unchanged. This indicates that duplexes are formed, but comparing these peaks to the tail-only controls suggests that duplex formation occurs primarily within the tails. This means that non-G4 regions formed antiparallel duplexes, whereas the polarity-inverted G-rich regions formed stable G4s and duplex formation in this region was disrupted. Heating and slow annealing of these samples also caused very little change in 1H NMR spectra, suggesting that the mixed G4/duplex structures are the thermodynamically favoured products.
As previously shown,3 HP1α binds to the parallel G4 formed by Oligo B but not to the antiparallel G4 (Oligo 2G), as shown in Fig. 9B. To our surprise, wild-type HP1α also showed little to no affinity for modified (polarity-inverted) G4s of parallel topology either before (Fig. 9B) or after (Fig. 9C) inclusion of duplex-forming tails.
Since HP1α's charged lysine patches at residues 89–91 and 104–106, located in the unstructured hinge region (Fig. 9A), have been implicated in G4 binding, we decided to test whether the HP1α hinge binds to our synthetic constructs. A His6-tagged hinge of HP1α (Fig. SI-13†) was recombinantly expressed in E. coli and purified. The hinge was immobilised on nickel sensor tips, and the tips were then immersed in a K+ solution containing a G4 with a polarity-inverted sequence, a control native G4 or a duplex. In this case, we observed strong binding of the HP1α hinge domain to G4s with a polarity-inverted sequence (Fig. 9D). Table 4 shows the dissociation constants, KD, which were highest for duplexes (i.e. very weak binding), followed by the native G4s, but, surprisingly, polarity-inverted G4s showed almost a five-fold increase in binding affinity over unmodified G4s (smallest KD). This suggests that lysine patches in the unstructured hinge region of HP1α facilitate binding to unmodified and polarity-inverted G4s whereas structured chromo- and chromoshadow domains are responsible for the selectivity of wild-type HP1α binding to G4s.
Name | Sequence | kon/(1000 M−1 s−1) | koff/(10−3 s−1) | KD (nM) |
---|---|---|---|---|
3′-Tail G4 control | [sequence image] | 85.6 (19) (a) | 0.92 (11) | 10.8 (13) |
5′-Tail G4 control | [sequence image] | 39 (3) | 15.4 (7) | 390 (3) |
3′-Tail duplex control | [sequence image] | 64 (8) | 53 (4) | 840 (13) |
5′-Tail duplex control | [sequence image] | 32 (11) | 28 (3) | 900 (3) |
3′-Tail Inv-G4 | [sequence image] | 85.2 (9) | 4.85 (9) | 56.9 (12) |
5′-Tail Inv-G4 | [sequence image] | 149 (9) | 39.1 (15) | 263 (19) |

(a) The error (estimated standard deviation) in the least significant digit(s) is provided in parentheses.
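The relationship between the tabulated rate constants and KD can be made explicit. The sketch below assumes the simple 1:1 binding relation KD = koff/kon and uses only the central values from the table above, ignoring the quoted uncertainties; the function name and dictionary layout are illustrative, and the authors may have obtained KD from a global fit rather than from this ratio.

```python
def dissociation_constant_nM(k_on_scaled, k_off_scaled):
    """KD in nM from k_on given in 1000 M^-1 s^-1 and k_off in 10^-3 s^-1."""
    k_on = k_on_scaled * 1e3     # M^-1 s^-1
    k_off = k_off_scaled * 1e-3  # s^-1
    return k_off / k_on * 1e9    # convert M to nM


# Central values as tabulated: (k_on / 1000 M^-1 s^-1, k_off / 10^-3 s^-1).
RATE_CONSTANTS = {
    "3'-Tail G4 control": (85.6, 0.92),
    "5'-Tail G4 control": (39.0, 15.4),
    "3'-Tail duplex control": (64.0, 53.0),
    "5'-Tail duplex control": (32.0, 28.0),
    "3'-Tail Inv-G4": (85.2, 4.85),
    "5'-Tail Inv-G4": (149.0, 39.1),
}

kd_values = {
    name: dissociation_constant_nM(k_on, k_off)
    for name, (k_on, k_off) in RATE_CONSTANTS.items()
}

for name, kd in kd_values.items():
    print(f"{name}: KD = {kd:.1f} nM")
```

The ratios computed this way lie close to the tabulated KD values, which suggests that the columns of the table are internally consistent with a 1:1 binding model.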
We proposed that several antiparallel G4s did not bind to HP1α due to the presence of lateral or diagonal loops blocking binding sites.3 We tentatively conclude that the location of the introduced polarity inversions creates an additional diagonal loop, causing the modified sequences to behave similarly to antiparallel G4s with respect to HP1α binding. Our telomeric G4s based on polarity-inverted sequences form a symmetrical G4 of parallel topology but all of them have a diagonal or lateral loop that shields one of the external G-tetrads (seen in Fig. 2D and E). This feature, which is typical for antiparallel G4s, might prevent our polarity-inverted G4 constructs from interacting with HP1α. We also infer that HP1α selectivity may be determined not just by G4 topology, but also by the specific loop arrangements seen primarily in antiparallel G4s.
We also explored the binding of these modified DNA strands to a known G4-binding protein, HP1α, an essential protein for chromatin maintenance. The polarity inverted DNA molecules did not bind to wild-type HP1α, but formed complexes with the isolated hinge region that connects the two structured domains of HP1α. This hinge region lacks some of the specificity of the wild-type protein, suggesting that the introduction of an additional loop, characteristic of antiparallel G4s, may explain the lack of binding of the wild-type HP1α to the modified species. However, the affinity of the hinge region suggests that parallel G4-forming sequences with polarity inversion do not inherently disrupt protein interactions. Some proteins have been shown to bind to antiparallel G4s, such as POT1, which showed specificity for antiparallel structures, or SP1, which bound both parallel and antiparallel G4s.12 The interactions of G4-binding proteins such as these may not be disrupted by the introduction of lateral or diagonal loops in the same way as HP1α, and their interactions with sequences containing inversions are worth investigating. Furthermore, these experiments demonstrate the ability of polarity-inverted nucleotides to stabilise G4s and disrupt duplex formation. These results could be applied to other known G4-forming sequences, obtaining G4s with similar properties without introducing undesirable structural elements.
The potential uses for this type of modified sequence are extensive. The initial intended application is to construct large DNA structures containing thermodynamically stable G4s. This would allow for better modelling of DNA as it exists in cells while ensuring the presence of thermodynamically unfavourable secondary structures (such as G4s) during longer duration experiments, such as NMR or small-angle X-ray scattering (SAXS). Other applications include those explored previously, both for polarity-inverted nucleotides and modified DNA in general. Increased stability of G4s could be used to inhibit the activity of proteins whose function is dependent on duplex DNA, a possibility that has already been explored in the context of G4-binding ligands.37,38 DNA-based inhibitors could be improved by using polarity-inverted nucleotide sequences. Similarly, the affinity of DNA aptamers could be improved through the use of broader nucleotide libraries, potentially including polarity-inverted nucleotides. The data presented here serve to demonstrate the effectiveness of these nucleotides for stabilising non-canonical, thermodynamically unfavourable DNA secondary structures in short sequences in the presence of complementary DNA. Further experimentation is necessary to determine the implications of these results for the development of new DNA technologies.
DNA synthesis was carried out using a Mermade 4 DNA/RNA automated synthesiser. Controlled pore glass supports carrying the first nucleoside were obtained from Dnature Diagnostics and Research Ltd (New Zealand). Standard 5′-O-DMT-3′-O-phosphoramidites of nucleosides were obtained from Innovassynth Technologies (I) Ltd (India) and inverted 3′-O-DMT-5′-O-phosphoramidites were obtained from Chemgenes Corporation (USA). The coupling time of inverted phosphoramidites was increased to 10 min. After the synthesis, oligodeoxynucleotides were cleaved from the solid support with 28% aq. ammonia at 55 °C for 12 hours. Ammonia was evaporated using a speed-vac Eppendorf Concentrator Plus and oligodeoxynucleotides were purified using Thermo Scientific UltiMate 3000 UHPLC with an Alltech 250 mm × 4.6 mm, 10 μm Hypersil Gold column (Reverse-Phase) or a TSKgel SuperQ-5PV 7.5 mm I.D. × 7.5 cm, 10 μm column (Ion-Exchange). Purified oligodeoxynucleotides were desalted using NAP-5 size exclusion columns. Sequences were verified using mass spectrometry recorded in 15% methanol/H2O using electrospray ionisation MS with a Thermo Scientific Q Exactive Focus Hybrid Quadrupole-Orbitrap Mass Spectrometer. Masses are reported in atomic mass units (a.m.u.).
Folded oligonucleotides (100 μM) were separated on a 20% polyacrylamide gel (PAGE) in 1 × TBE buffer (90 mM Tris-Borate and 2 mM Na2H2EDTA, pH 8.3), with either 100 mM KCl or NaCl added, at 4 °C. A 20% aq. glycerol solution was added to increase sample density for loading. After electrophoresis (5–10 W), the gel was rinsed with H2O, stained with 0.35% Stains-All (Merck) in 50% formamide:H2O for 15 min, destained in H2O, and then imaged. 20% denaturing PAGE was prepared in 1 × TBE buffer (pH 8.3) with 7 M urea. For denaturing PAGE, oligodeoxynucleotide samples were heated at 90 °C for 5 min and then cooled to room temperature prior to loading on the gel.
Circular dichroism (CD) spectra were recorded using a Chirascan CD spectrophotometer (150 W Xe arc) from Applied Photophysics with a Quantum Northwest TC125 temperature controller. Oligodeoxynucleotides were diluted to 10 μM in the reported buffers. Scans were taken over a range of 220–350 nm with a bandwidth of 1 nm and response of 0.25 s. A buffer-only baseline was subtracted from each CD spectrum before they were smoothed by averaging 10 neighbour points using software provided by Applied Photophysics Ltd. Melting experiments were performed by recording CD spectra every 2.5 °C with equilibration for 2.5 min at each temperature from 15 to 90 °C. The signal at maxima and minima was assessed and values were converted to fraction folded (θT) using the formula:
θT = (θ − θmin)/(θmax − θmin)
θ is the CD signal at that temperature, while θmax and θmin are the signals when completely folded and completely unfolded, respectively. T1/2 is the temperature at which half of the structure is unfolded (θT = 0.5). T1/2 is not reported for sequences which did not completely unfold within the specified temperature range. For SVD analysis, raw data were truncated to the range 225 to 340 nm to remove artifacts. Baseline correction was performed by subtracting the average intensity in the range 320–340 nm from each spectrum. Data were smoothed using a Savitzky–Golay filter with a 9-point window and polynomial order of 2. The Python script is provided in the ESI† along with instructions for using it on https://www.Jupyter.org.
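The data reduction described above can be sketched in a few lines of Python. This is a minimal illustration and not the script provided in the ESI: the function names (`baseline_correct`, `savgol9`, `fraction_folded`, `t_half`) and the example values are assumptions. For simplicity, the smoothing leaves the four points at each edge of the spectrum unchanged, and T1/2 is located by linear interpolation between measured temperatures, which the text does not specify.

```python
from typing import List, Optional, Sequence

# Standard 9-point, second-order Savitzky-Golay smoothing weights (they sum to 231).
SG9_QUADRATIC = (-21, 14, 39, 54, 59, 54, 39, 14, -21)


def baseline_correct(wavelengths: Sequence[float], signal: Sequence[float],
                     lo: float = 320.0, hi: float = 340.0) -> List[float]:
    """Subtract the mean signal in the lo-hi nm window from the whole spectrum."""
    window = [s for w, s in zip(wavelengths, signal) if lo <= w <= hi]
    offset = sum(window) / len(window)
    return [s - offset for s in signal]


def savgol9(signal: Sequence[float]) -> List[float]:
    """Smooth interior points with a 9-point quadratic Savitzky-Golay filter.

    Assumption: the 4 points at each edge are returned unsmoothed.
    """
    out = list(signal)
    for i in range(4, len(signal) - 4):
        acc = sum(c * signal[i + k - 4] for k, c in enumerate(SG9_QUADRATIC))
        out[i] = acc / 231.0
    return out


def fraction_folded(theta: Sequence[float], theta_folded: float,
                    theta_unfolded: float) -> List[float]:
    """theta_T = (theta - theta_min) / (theta_max - theta_min).

    theta_folded plays the role of theta_max, theta_unfolded that of theta_min.
    """
    span = theta_folded - theta_unfolded
    return [(t - theta_unfolded) / span for t in theta]


def t_half(temps: Sequence[float], fractions: Sequence[float]) -> Optional[float]:
    """Temperature at which the fraction folded crosses 0.5.

    Uses linear interpolation between neighbouring points (an assumption).
    Returns None if 0.5 is never crossed, mirroring the rule that T1/2 is
    not reported for sequences that do not unfold in the measured range.
    """
    for i in range(len(temps) - 1):
        f0, f1 = fractions[i], fractions[i + 1]
        if (f0 - 0.5) * (f1 - 0.5) <= 0 and f0 != f1:
            return temps[i] + (0.5 - f0) * (temps[i + 1] - temps[i]) / (f1 - f0)
    return None


# Illustrative values only (not measured data).
temps = [40.0, 50.0, 60.0]
theta = [10.0, 6.0, 2.0]
fractions = fraction_folded(theta, theta_folded=10.0, theta_unfolded=2.0)
print(t_half(temps, fractions))  # 50.0
```

In practice, a library routine such as `scipy.signal.savgol_filter` with a window of 9 and polynomial order of 2 would replace the hand-written filter and also handle the edge points.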
1H NMR spectra were recorded using a Bruker 700 MHz spectrometer using trimethylsilylpropanoic acid (TSP) as an internal standard. Chemical shifts are reported in parts per million (ppm) relative to the TSP methyl signal at 0.00 ppm. Spin multiplicities are described as: s (singlet), br.s. (broad singlet), d (doublet), dd (doublet of doublets), t (triplet), q (quartet), m (multiplet). Coupling constants are reported in Hertz (Hz).
Biolayer interferometry (BLI) using a BLItz system (ForteBio, USA) was used to examine the binding of HP1α to a range of oligodeoxynucleotides as indicated at room temperature. Ni-NTA biosensors (ForteBio) were hydrated with 1 × interaction buffer (IB) containing 100 mM KCl, 50 mM NaCl, 20 mM NaH2PO4/Na2HPO4, pH 8.0. G4s were folded in the same buffer using the method described above. 4 μL of 100 μg mL−1 his-tagged HP1α was used to load the Ni-NTA biosensor for 5 min to reach ∼4 nm of signal. The biosensor was then washed with 1 × IB. The association step was performed using 2 μM solutions of oligodeoxynucleotides prepared in 1 × IB or just 1 × IB (reference) for 5 min, then the dissociation step was performed using 1 × IB for 5 min. Reference runs were subtracted from test runs to account for dissociation of the protein. Oligodeoxynucleotides were tested for interaction with Ni-NTA biosensor tips prior to experiments. BLItz Pro 1.2 software was used for curve fitting and KD calculations.
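The reference subtraction and the relationship between rate constants and KD can be illustrated with a short Python sketch. It assumes a simple 1:1 binding model, which the text does not state and which may differ from the model applied by the BLItz Pro software. All function names and parameters here are hypothetical.

```python
import math
from typing import List, Sequence


def subtract_reference(test: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Subtract a buffer-only reference run from a test run, point by point."""
    return [t - r for t, r in zip(test, reference)]


def kd_from_rates(k_on: float, k_off: float) -> float:
    """Equilibrium dissociation constant for 1:1 binding: KD = k_off / k_on."""
    return k_off / k_on


def association_signal(t: float, conc: float, k_on: float, k_off: float,
                       r_max: float) -> float:
    """Signal during association under an assumed 1:1 model.

    R(t) = R_eq * (1 - exp(-(k_on * C + k_off) * t)),
    with R_eq = R_max * C / (C + KD).
    """
    k_obs = k_on * conc + k_off
    r_eq = r_max * conc / (conc + kd_from_rates(k_on, k_off))
    return r_eq * (1.0 - math.exp(-k_obs * t))
```

Under this model, an analyte concentration equal to KD gives half of the maximal signal at equilibrium.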
Footnote
† Electronic supplementary information (ESI) available: Supplementary experimental details about the synthesis of modified oligodeoxynucleotides and RP-HPLC profiles and HRMS (ESI) spectra of oligodeoxynucleotides, CD spectra, DNA melting experiments, PAGE, Python script and instructions for SVD analysis of CD melting curves of G4-DNA and examples of SVD analysis. See DOI: https://doi.org/10.1039/d3sc05432b
This journal is © The Royal Society of Chemistry 2024