A chair-type G-quadruplex structure formed by a human telomeric variant DNA in K+ solution

The chair-type G-quadruplex structure formed by human telomeric variant DNA.


Introduction
Telomeres are highly repetitive DNA regions located at the ends of linear eukaryotic chromosomes. Their function is to protect the terminal ends of chromosomes from being recognized as damaged DNA and to support faithful chromosome replication during each cell cycle [1,2]. Human telomeric DNA contains tandem repeats of the sequence 5′-GGGTTA-3′ [3]. Under physiological ionic conditions, this guanine-rich strand can fold into a variety of four-stranded G-quadruplex structures involving G-tetrads [4-9], which are important for telomere biology and are currently attractive targets for the development of anti-cancer drugs [10-13]. Many different G-quadruplex topologies are known [4-8] and the four-repeat human telomeric G-rich sequences can adopt a range of intramolecular G-quadruplex structures [8,14]. Eight different unimolecular G-quadruplex structures of various human telomeric DNA sequences containing four canonical GGGTTA repeats have been solved by NMR or X-ray crystallography under different experimental conditions (Fig. S1†) [15-20]. All of these structures contain the 21 nt human telomeric sequence d[(GGGTTA)3GGG], termed htel21, which should be the shortest sequence capable of forming an intramolecular G-quadruplex.
We asked how sequence variants with single or double nucleotide substitutions in the TTA loops, found as subtelomeric repeats in human chromosomes, affect the G-quadruplex fold. Based on bioinformatics studies and with the use of CD and NMR spectroscopy, we have found a variant telomeric DNA, htel21T18, that has a T substitution at A18 of htel21, and showed that it adopts a chair-type monomolecular G-quadruplex with three G-tetrad layers, a fold hitherto unknown among human telomeric quadruplex forms. In this structure, the loop-loop interactions are mediated by the reverse Watson-Crick A6·T18 base pair and, in addition, there is a hydrogen bond between T5 and T16. In the htel21T18 G-quadruplex the loops are successively edgewise; the glycosidic conformation of the guanines is syn·anti·syn·anti around each tetrad, and each strand of the core has two antiparallel adjacent strands.
Bioinformatics studies have shown localizations of htel21T18 and its repeats in the subtelomeric regions of human chromosomes 8, 11, 17, and 19 as well as in the subcentromeric region of chromosome 5. Interestingly, the sequence htel21T18 can also be localized in a DNase hypersensitive region, implying that this chromosome segment has a propensity to form a chair-like G-quadruplex in vivo. This novel G-quadruplex form expands the repertoire of known G-quadruplex folding topologies and may provide a potential target for structure-based anticancer drug design.

Results
A human telomeric variant DNA, htel21T18, forms a stable intramolecular G-quadruplex structure in K+ solution
We screened the human genome using htel21 variant DNAs containing single A-to-T, single T-to-A, or double TT-to-AA substitutions at the various thymine and adenine positions as inputs for BLAT search (http://genome.ucsc.edu/). As shown in Fig. S2,† ten 21 nt human telomeric variants were found in the human genome.
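The set of query sequences described above can be enumerated programmatically. The following is a minimal sketch, assuming that every thymine and adenine in the three TTA loops is substituted in turn and that each TT step is replaced by AA. The function name `loop_variants` and the naming of the double variants are hypothetical; the single-substitution names follow the htel21_A18T style used in the text. The sketch only builds query strings and does not perform the BLAT search itself.

```python
# htel21, d[(GGGTTA)3GGG], 21 nt
HTEL21 = "GGGTTA" * 3 + "GGG"


def loop_variants(seq=HTEL21):
    """Return {name: sequence} for single A-to-T, single T-to-A and
    double TT-to-AA substitutions (illustrative enumeration only)."""
    swap = {"A": "T", "T": "A"}
    variants = {}
    # Single substitutions at every thymine or adenine (1-based numbering).
    for i, base in enumerate(seq):
        if base in swap:
            name = f"htel21_{base}{i + 1}{swap[base]}"
            variants[name] = seq[:i] + swap[base] + seq[i + 1:]
    # Double substitutions: each TT step becomes AA (hypothetical naming).
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "TT":
            name = f"htel21_T{i + 1}T{i + 2}AA"
            variants[name] = seq[:i] + "AA" + seq[i + 2:]
    return variants


if __name__ == "__main__":
    for name, sequence in loop_variants().items():
        print(name, sequence)
```

Under these assumptions the enumeration yields nine single and three double variants; htel21_A18T corresponds to the htel21T18 sequence studied here.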
The htel21 appeared to form a mixture of G-quadruplex conformations in the presence of K+, as indicated by the 1D 1H NMR spectrum (Fig. 1a) and the CD spectrum (Fig. 1c).
The 1D 1H NMR spectra of the htel21T18 sample were recorded as a function of temperature. At 50 °C, the two peaks at ~13.4 ppm and ~9.6 ppm became broadened due to exchange with the solvent, while the remaining 12 peaks remained sharp, suggesting that these two peaks did not belong to the G-tetrad core (Fig. S3a†).
The 1D 1H NMR spectrum of the htel21T18 sequence in K+ solution showed 12 well-resolved imino proton resonances at 10-12 ppm with sharp line widths (Fig. 1b), clearly indicating the formation of a predominant unimolecular G-quadruplex structure. Minor conformations were also present, with peak intensities less than 5% of those of the major species, and thus did not interfere with the structural analysis of the predominant G-quadruplex structure.
To confirm the molecularity of htel21 and htel21T18, we performed gel electrophoresis using dimeric 93del, d[GGGGTGGGAGGAGGGT], and the monomeric human telomere d[TAGGG(TTAGGG)3] as references. The bands corresponding to htel21 and htel21T18 migrated at a similar position and faster than that of h-telo, indicating that both of these samples formed unimolecular G-quadruplex folds (Fig. S3b†).

CD signature
The circular dichroism (CD) spectrum (Fig. 1c) of htel21T18 in K+ solution displayed two positive peaks at ~250 and ~290 nm and a trough at ~260 nm. The 290 nm peak is characteristic of opposite-polarity stacking of G-tetrads [21], suggesting that the sequence largely adopts antiparallel G-quadruplexes in K+ solution [20,22,23]. As shown in Fig. 1c, the CD spectrum of htel21T18 displayed a profile similar to that of htel21, except for a shoulder at 270 nm for htel21 that is probably caused by conformational heterogeneity.

Resonance assignment and glycosidic torsion angle determination of the htel21T18 G-quadruplex in K+ solution
The presence of 12 imino peaks in the 1D proton spectrum of htel21T18 in K+ solution (Fig. 1b) showed that all 12 guanines were involved in the intramolecular G-quadruplex formation and that this G-quadruplex structure contained three layers of G-tetrads. The imino and H8 protons of guanosine bases were unambiguously assigned through the low-enrichment (2%) 15N site-specific labelling method and 2D HMBC experiments (for more details about NMR assignments see Supplementary Results in the ESI†).
An expanded region for base and sugar H1′ protons of the non-exchangeable proton NOESY spectrum is shown in Fig. 2a. Six strong cross-peaks in the H8-H1′ region of the 2D NOESY spectrum acquired at a 75 ms mixing time were interpreted as nucleotides with the syn glycosidic conformation, i.e., G1, G7, G8, G13, G19 and G20 (Fig. 2b), in contrast to the other six guanines, namely G2, G3, G9, G14, G15, and G21, which adopt the anti conformation in the quadruplex.

Assignment of the T4-T5-A6 and T16-T17-T18 loops of the htel21T18
As shown in Fig. 1, two extra peaks appeared with chemical shifts of ~13.4 ppm and ~9.6 ppm when A18 was substituted with T. Since all 12 guanines have been assigned through site-specific labelling, we hypothesized that these two peaks belonged to the T-T-A fragment. The chemical shift indicated that the peak at ~13.4 ppm should be the imino proton involved in the H···N hydrogen bonding of the A·T pair [24]. The peak at ~9.6 ppm should be the imino proton involved in the H···O (donor-acceptor) type hydrogen bond or

(c) Guanine imino-H8 NOE connectivities observed for G1·G21·G13·G9, G2·G20·G14·G8 and G3·G19·G15·G7 tetrads. The bases in the G1·G21·G13·G9, G2·G20·G14·G8 and G3·G19·G15·G7 tetrads are colored red, orange, and blue, respectively. (d) Schematic structure of a chair-type G-quadruplex observed for a human telomeric variant DNA sequence, htel21T18, in K+ solution. anti guanines are colored cyan, while syn guanines are colored magenta. The backbones of the core and loops are colored black and red, respectively.
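The syn and anti assignments and the tetrad alignments reported above can be cross-checked against the syn·anti·syn·anti arrangement stated for each tetrad. The following is a minimal sketch of such a consistency check; the residue sets are taken from the text, while the function names `pattern` and `alternates` are hypothetical.

```python
# Guanines assigned as syn from the strong H8-H1' cross-peaks (from the text).
SYN = {1, 7, 8, 13, 19, 20}

# Tetrads in the cyclic order given by the imino-H8 NOE connectivities.
TETRADS = [(1, 21, 13, 9), (2, 20, 14, 8), (3, 19, 15, 7)]


def pattern(tetrad, syn=SYN):
    """Return the glycosidic conformation of each guanine around a tetrad."""
    return tuple("syn" if g in syn else "anti" for g in tetrad)


def alternates(tetrad, syn=SYN):
    """True if syn and anti alternate around the (cyclic) tetrad."""
    p = pattern(tetrad, syn)
    return all(p[i] != p[(i + 1) % 4] for i in range(4))


if __name__ == "__main__":
    for tetrad in TETRADS:
        print(tetrad, pattern(tetrad), alternates(tetrad))
```

With the assignments listed in the text, all three tetrads pass this check.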

Overall solution structure of the htel21T18 G-quadruplex
Many inter-residue NOEs are observed in the 2D NOESY spectrum of htel21T18 in K+ solution. Critical inter-residue NOEs are schematically summarized in Fig. 4. These NOEs define the overall structure of the telomeric G-quadruplex in K+ solution and were used for structure calculations. Ten superimposed lowest-energy refined structures of the htel21T18 quadruplex are shown in Fig. 5a. The ribbon view of a representative refined structure of the htel21T18 quadruplex is shown in Fig. 5b. As shown in Fig. 5a, the G-quadruplex structure consists of three G-tetrads linked to form four antiparallel right-handed G-strands (G1-G2-G3, G7-G8-G9, G13-G14-G15 and G19-G20-G21) that are connected by three edgewise side loops (T4-T5-A6, T10-T11-A12, and T16-T17-T18). Both the T4-T5-A6 and T16-T17-T18 edgewise loops are located on the same side of the G-quadruplex core, whereas the edgewise T10-T11-A12 loop is on the opposite side. The structures of the edgewise loops T4-T5-A6, T10-T11-A12 and T16-T17-T18 are well defined, partly because of the A6·T18 base pair and a hydrogen bond between the imino proton H3 of T5 and the oxygen atom O4 of T16 (Fig. 5c). Our structure suggests that T10-T11-A12 capping the G1·G21·G13·G9 tetrad may contribute to the stability of the structure (Fig. 5d). Experimentally, we observed numerous NOEs between the loop T10-T11-A12 and the G1·G21·G13·G9 layer, such as G9H8-T10H7# and G21H8-A12H2, as well as between the bases of the loop T10-T11-A12, such as A12H8-T11H1′, A12H8-T11H2″, T10H1′-T11H1′ and others (Fig. 6a and b). The distance between the methyl groups of T10 and T11 is larger than 6.5 Å, which is consistent with the absence of a cross peak between them in the NOESY spectrum (Fig. 6b and c). The experimental data are in full accordance with the conformation of the loop T10-T11-A12 in the structures.

Substitution of A18 with T traps a chair-type htel21T18 G-quadruplex through the A6·T18 base pair
As shown in Fig. 5c and S6,† A6 and T18 form a reverse Watson-Crick trans A·T base pair which is almost parallel to the G-tetrad layer G3·G19·G15·G7. In the reverse Watson-Crick A·T base pair, the methyl group of T and the proton H2 of A should be on the same side. As shown in the schematic representation of inter-residue NOE contacts (Fig. 4), there are unambiguous NOEs such as CH3T18/H8G15, H2A6/H1G15 and H2A6/H8G7, indicating that the methyl group of T18 and the proton H2 of A6 are on the same side. These NOEs clearly demonstrate the formation of the reverse Watson-Crick A6·T18 base pair. Additionally, A6 and T18 are connected to G7 and G19, whose sugar-phosphate residues are parallel to each other. In this structural context, incorporation of a cis Watson-Crick pair is less feasible, because it would have to be accompanied by a locally left-handed backbone configuration at the A6/G7 step and a syn glycosidic torsion of the adenosine residue. However, none of these features are supported by experimental data. This A6·T18 base pair formation results in a peak at ~13.4 ppm originating from the imino proton of T18, a typical indicator of an imino proton hydrogen-bonded to a nitrogen acceptor (Fig. 1b). As shown in Fig. S5,† a NOE cross peak between H3 of T5 and the methyl group of T17 indicates that T5 and T17 are close in the G-quadruplex. In the calculated structure, there is a direct hydrogen bond between T5(H3) and T16(O4) in htel21T18, and T5 and T17 are almost capping the A6·T18 base pair and the G3·G19·G15·G7 layer (Fig. 5c). The hydrogen bond T5(H3)···T16(O4) corresponds to a sharp imino proton signal of T5 at ~9.6 ppm (Fig. S4†). The imino proton of T17 is responsible for the ~10.3 ppm signal close to the imino proton of G7 (Fig. S4†). However, we did not observe any hydrogen bonds for the base T17 in our calculated structure. As shown in Fig. S6,† there may be a potential water-mediated hydrogen bond formed between the imino proton of T17 and a deoxyribose oxygen or phosphate group of T5. It is possible that a potential T·T base pair may exist even though the H3-H3 NOE cross peak between T5 and T17 could not be observed in the NOESY spectra with different mixing times. Additionally, the well-resolved 1D NMR spectra and the similar melting temperatures (Tm) of htel21T18 (~73.1 °C) and htel21_A6T (~74.0 °C) (Fig. S2 and S7†) indicate that the formation of the A·T base pair is important for this chair-type G-quadruplex fold. However, the slightly higher Tm of htel21_A6T could possibly be caused by the heterogeneity indicated by its 1D 1H spectrum in Fig. S2.† Hence, the A·T base pair played an important role in trapping and stabilizing this chair-type G-quadruplex fold.

Fig. 4 Schematic diagram of inter-residue NOE connectivities of the htel21T18 quadruplex formed in K+ solution. The guanines with the syn and anti conformations are represented using gray and white rectangles, respectively. The NOE connectivities clearly define the G-quadruplex conformation and provide distance restraints for structure calculation.

Discussion
In this study, based on the bioinformatics search and with the use of CD and NMR spectroscopy, we have found a variant telomeric DNA, htel21T18, that has a T substitution at A18 of htel21, and showed that it adopts a chair-type G-quadruplex fold with three G-tetrad layers. As shown in Fig. 1b and S2,† htel21T18 favoured a major G-quadruplex structure and gave excellent NMR spectra suitable for NMR structural determination. Both the 1D 1H NMR spectrum and the CD spectrum clearly indicated the formation of a predominant three G-tetrad layered, antiparallel G-quadruplex structure by htel21T18 (Fig. 1). Stabilization of its loop-loop interactions by a reverse Watson-Crick A6·T18 base pair allows one to predict that an A6T mutant could form a similar structure, with a T6·A18 base pair of the same geometry in its two opposite lateral loops. Indeed, the 1D 1H spectra of htel21_A18T and htel21_A6T are very similar (Fig. S2b†). Besides htel21T18, two htel21 variants obtained by substituting T4 and T10 with A also favoured a single G-quadruplex form. However, the number of imino protons observed between 10.0 and 12.5 ppm indicated that the G-quadruplexes adopted by htel21_T4A and htel21_T10A contain two, not three, G-tetrad layers. Comparison with the 1H spectra of reported human telomeric G-quadruplexes in ESI Fig. S1† allows one to expect that the topology of htel21_T4A and htel21_T10A may be similar to that of the intramolecular basket-type G-quadruplex formed by the sequence d[(GGGTTA)3GGGT] [17]. It is known that base pairing and stacking in the loops often serve as stabilizing factors or affect the selection of a particular one among several possible forms for G-quadruplexes [27]. In the current structure of the chair-type G-quadruplex of htel21T18, the A6·T18 base pair is observed across two juxtaposed lateral loops capping the G-tetrad core on one side.
We note, however, that the G-core in the reported quadruplex topology is stable and withstands heating to 50 °C, the temperature at which the A6·T18 base pair is already melted (Fig. S3a†). In the htel21T18 structure, there is also a hydrogen bond T5(H3)···T16(O4), and T5 and T17 are almost capping the A6·T18 base pair and the underlying G3·G7·G15·G19 layer (Fig. 5c).
Since the imino proton of T17 shows a peak at ~10.3 ppm, but no suitable hydrogen bond is observed in our calculated structure, it might be an indication of a water-mediated hydrogen bond with an opposing oxygen on the sugar-phosphate backbone (Fig. S6†). We hypothesize that potential hydrogen bonds of the T·T base pair involving T17 are helpful in stabilizing the chair-type G-quadruplex fold. As far as the T10-T11-A12 loop is concerned, this fragment converged well in the ensemble of computed structures (Fig. 6), suggesting that the particular geometry adopted by the single edgewise loop T10-T11-A12 is energetically favorable and that its stacking interactions with the neighbouring G1·G21·G13·G9 tetrad may contribute to the stability of the structure as well (Fig. 5d). The conformations of nucleotides T4 and T16 are predominantly defined by van der Waals interactions, which position them in a rather restricted volume defined by the structured, covalently bound neighbouring nucleotides (Fig. 5a and S8†).
Subtelomeric occurrence in the human genome of the sequence htel21T18 (Table S1†), which is strongly prone to the chair-type quadruplex described here, indicates that this variant of human telomeric DNA may result from a single residue mutation and as such may change the equilibrium of G-quadruplex forms on pearls-on-string single-stranded G-telomeric overhangs, with accompanying changes in topologies and relative orientations of monomeric quadruplex units on such a string. The sequence htel21T18 does not exclusively localize in subtelomeric regions, but is also found in other regions, such as the subcentromeric region of chromosome 5. Interestingly, the occurrence of this sequence in a DNase hypersensitive region implies that in this part of the chromosome a chair-like quadruplex can easily form in vivo.
Until now, several chair-type G-quadruplex structures have been reported, such as the thrombin aptamer d(G2T2G2TGTG2T2G2), a sequence variant of the human telomeric sequence, d[AGGG(CTAGGG)3], and a Bombyx mori telomeric sequence, Bm-U16, which are all composed of two antiparallel G-tetrad layers [28-31]. In comparison with htel21T18, these structures differ in the loop length, the hydrogen-bond directionalities of the G-tetrad layers, etc. Recently, a four-layer antiparallel G-quadruplex in which the symmetry and strand orientation are similar to htel21T18 has been reported to be formed by an intronic hexanucleotide GGGGCC (G4C2) repeat of the C9orf72 gene in humans [32,33] (for more details see ESI†).

Conclusions
In summary, we have determined a novel three-layer chair-type G-quadruplex structure of a human telomeric variant DNA, htel21T18. The unique structure and fold of htel21T18 could enable selective recognition and binding of a ligand and may provide a potential target for the development of a specific drug molecule stabilizing this DNA conformation. Our result expands the repertoire of known G-quadruplex folding topologies and highlights the important role of the loops in the folding topology of G-quadruplexes.

Sample preparation
DNA synthesis was performed on a 1 μmol scale, using a 1000 Å LCAA-CPG solid support column on a MerMade 6 Nucleic Acid Synthesiser. Unlabeled and low-enrichment (2% 15N- or 7% 15N,13C-labeled) nucleotides were site-specifically introduced into the growing oligonucleotide chains. All sequences were fully deprotected in concentrated ammonium hydroxide at room temperature for 24 hours. The DNA samples were dried, redissolved in ~1 mL water and purified by gel-filtration chromatography on a Sephadex G-25 column. The DNA sample at 100 μM (single strands) was then re-annealed by heating to 95 °C for 15 min, followed by slow cooling to room temperature overnight in an annealing buffer of 70 mM KCl and 20 mM potassium phosphate (pH 7.0). The final NMR samples contained 0.1-2.5 mM DNA in 20 mM potassium phosphate buffer (pH 7.0) and 70 mM KCl.

Circular dichroism
Circular dichroism (CD) spectra were recorded at 25 °C on a JASCO-815 CD spectropolarimeter using a 1 mm path length quartz cuvette with a sample volume of 200 μL. The DNA oligonucleotides were prepared in 20 mM potassium phosphate buffer (pH 7.0) containing 70 mM KCl at a concentration of 20 μM (strands).

Polyacrylamide gel electrophoresis (PAGE)
Non-denaturing PAGE was carried out in a 25% polyacrylamide gel (acrylamide : bis-acrylamide 29 : 1), supplemented with 20 mM KCl in the gel and running buffer (0.5× TBE). The samples were prepared at a strand concentration of 100 μM. Bands were stained with Red-safe dye.

NMR spectroscopy
Experiments were performed on 500 MHz and 800 MHz Varian spectrometers. Imino proton resonances were assigned using samples with nucleotides site-specifically 15N labeled, one at a time, and by through-bond correlations at natural abundance [34,35]. Standard 2D NMR spectra, including NOESY, TOCSY and COSY, were collected at 5, 10 and 25 °C to obtain the complete proton resonance assignment [36]. The NMR experiments for samples in water solution were performed with Watergate or Jump-and-Return water suppression techniques. Spectra were processed with the program nmrPipe [37,38]. Spectral assignments were also carried out and supported by COSY, TOCSY and NOESY spectra. NOE peak assignments and integrations were made using peak fitting and volume integration implemented in the software Sparky (http://www.cgl.ucsf.edu/home/sparky/). Interproton distances involving exchangeable protons for htel21T18 were categorized as strong (1.8 to 3.8 Å), medium (1.8 to 4.5 Å), weak (2.8 to 5.5 Å) or very weak (2.8 to 6.8 Å) based on the cross-peak intensities recorded in three NOESY spectra (75, 150 and 300 ms mixing times) in H2O solution. Interproton distances involving non-exchangeable protons for htel21T18 were measured from NOE build-ups using NOESY experiments recorded at three mixing times (75, 125, and 300 ms) in D2O solution. The thymine base proton H6-H7# distance (2.99 Å) was used as a reference distance.
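The distance handling described above can be illustrated with a short sketch. It assumes the common isolated spin-pair approximation, in which the cross-peak volume scales as r^-6, so that an unknown distance follows from the reference distance as r = r_ref (V_ref / V)^(1/6). The text states only that build-ups and the 2.99 Å reference were used, so this scaling is an assumption rather than the documented procedure. The names `noe_distance`, `BOUNDS` and `satisfies` are hypothetical; the bounds are those listed in the text for exchangeable protons.

```python
R_REF = 2.99  # Å, thymine H6-H7# reference distance (from the text)

# Distance bounds in Å for exchangeable-proton NOE categories (from the text).
BOUNDS = {
    "strong": (1.8, 3.8),
    "medium": (1.8, 4.5),
    "weak": (2.8, 5.5),
    "very weak": (2.8, 6.8),
}


def noe_distance(volume, ref_volume, r_ref=R_REF):
    """Estimate a distance from a cross-peak volume, assuming the
    isolated spin-pair approximation: r = r_ref * (V_ref / V) ** (1/6)."""
    if volume <= 0 or ref_volume <= 0:
        raise ValueError("cross-peak volumes must be positive")
    return r_ref * (ref_volume / volume) ** (1.0 / 6.0)


def satisfies(category, distance):
    """True if a distance lies within the bounds of an NOE category."""
    lower, upper = BOUNDS[category]
    return lower <= distance <= upper


if __name__ == "__main__":
    # A peak with 1/64 of the reference volume corresponds to twice the distance.
    r = noe_distance(volume=1.0, ref_volume=64.0)
    print(round(r, 2), satisfies("very weak", r))
```

Because of the sixth-root dependence, even large errors in the integrated volume translate into small errors in distance, which is why broad bounds such as those above are typically sufficient.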

Structure calculations
The G-quadruplex structure of the sequence d[(GGGTTA)2GGGTTTGGG] was calculated using the X-PLOR program (NIH version) [39,40]. The initial folds guided by NMR restraints listed in Table S2† were obtained using torsion angle dynamics from an arbitrary extended oligonucleotide conformation. The structures were further refined by Cartesian dynamics [41]. Dihedral angle restraints were used to restrict the glycosidic torsion angle (χ) for the experimentally assigned syn- and anti-conformations, 60(±35)° and 240(±40)°, respectively [42-44]. Experimentally obtained distance restraints and G-tetrad hydrogen-bonding distance restraints were included during calculations.
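The glycosidic torsion restraints above define allowed windows of χ for syn and anti residues. The following is a minimal sketch of how a measured χ value could be tested against these windows, taking angular periodicity into account. The function names are hypothetical, and the flat-bottomed interpretation of the ± ranges (zero violation anywhere inside the window) is an assumption, not a description of the restraint form used by X-PLOR.

```python
# Target and tolerance in degrees for the glycosidic torsion chi (from the text).
CHI_RESTRAINTS = {"syn": (60.0, 35.0), "anti": (240.0, 40.0)}


def angular_deviation(chi, target):
    """Smallest absolute difference in degrees between two angles."""
    return abs((chi - target + 180.0) % 360.0 - 180.0)


def chi_violation(chi, conformation):
    """Degrees by which chi falls outside the allowed window (0 if inside)."""
    target, tolerance = CHI_RESTRAINTS[conformation]
    return max(0.0, angular_deviation(chi, target) - tolerance)


if __name__ == "__main__":
    print(chi_violation(100.0, "syn"))    # 40 degrees from 60, window is 35
    print(chi_violation(-120.0, "anti"))  # -120 is equivalent to 240
```

Expressing the deviation modulo 360° lets χ values reported in either the 0 to 360° or the -180 to 180° convention be checked with the same function.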

Torsion angle dynamics
In the heating stage, the regularized extended DNA chain was subjected to 60 ps of torsion-angle molecular dynamics at 40 000 K using a hybrid energy function composed of geometric and NOE terms. The van der Waals (vdW) component of the geometric term was set at 0.1, while the NOE term included NOE-derived distances with a scaling factor of 150. The structures were then slowly cooled from 40 000 K to 1000 K over a period of 60 ps, during which the vdW term was linearly increased from 0.1 to 1. In the third stage, the molecules were slowly cooled from 1000 K to 300 K for 6 ps of Cartesian molecular dynamics [45]. The structures with no restraint violations and minimal energies were selected for further refinement.
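The three stages above can be summarized as a schedule of temperature and vdW scaling versus simulation time. The following is a minimal sketch under stated assumptions: the temperature is taken to decrease linearly during both cooling stages (the text says only "slowly cooled") and the vdW scale is held at 1 in the final stage. The function name `tad_schedule` is hypothetical and the sketch is not an X-PLOR input script.

```python
def tad_schedule(t_ps):
    """Return (temperature in K, vdW scale) at time t_ps (0 to 126 ps).

    Assumes linear cooling; only the vdW ramp is stated to be linear
    in the text.
    """
    if not 0.0 <= t_ps <= 126.0:
        raise ValueError("time must lie between 0 and 126 ps")
    if t_ps <= 60.0:
        # Stage 1: 60 ps of torsion-angle dynamics at 40 000 K, vdW scale 0.1.
        return 40000.0, 0.1
    if t_ps <= 120.0:
        # Stage 2: cool to 1000 K over 60 ps while ramping vdW from 0.1 to 1.
        f = (t_ps - 60.0) / 60.0
        return 40000.0 + f * (1000.0 - 40000.0), 0.1 + f * (1.0 - 0.1)
    # Stage 3: cool from 1000 K to 300 K over 6 ps of Cartesian dynamics.
    f = (t_ps - 120.0) / 6.0
    return 1000.0 + f * (300.0 - 1000.0), 1.0


if __name__ == "__main__":
    for t in (0.0, 60.0, 90.0, 120.0, 126.0):
        print(t, tad_schedule(t))
```

Starting with a weak vdW term lets strands pass through one another at high temperature, so that the NOE restraints can establish the fold before steric repulsion is fully switched on.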

Distance restrained molecular dynamics
Cartesian molecular dynamics was initiated at 300 K and the temperature was gradually increased to 1000 K in 6 ps. The system was equilibrated at 1000 K for 18 ps and was then slowly cooled to 300 K in 14 ps. Subsequently, the system was equilibrated at 300 K for 12 ps. The coordinates saved every 0.5 ps during the last 4.0 ps were averaged. The resulting average structure was subjected to minimization until the gradient of energy was less than 0.1 kcal mol⁻¹. A soft planarity restraint (weight of 10 kcal mol⁻¹ Å⁻²) was imposed on the G-tetrads before the heating process and was removed at the beginning of the equilibration stage. The ten best structures were selected at this stage based on minimal energy terms and the absence of NOE violations. The statistics of the structure refinement and the quality of the final structures are summarized in Table S2† for htel21T18. The proton chemical shifts of htel21T18 are shown in Table S3.† All images of G-quadruplex structures in the figures were generated using PyMOL (http://www.pymol.org).
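The timing of the refinement protocol and the coordinate averaging step can be made explicit with a short sketch. It assumes that the four stages run back to back (50 ps in total) and that the averaging window contains eight frames, ending with the final frame; whether the frame at the start of the window is also included is not stated in the text. The names `STAGES`, `total_time`, `averaging_times` and `average_structure` are hypothetical.

```python
# (label, duration in ps, start temperature in K, end temperature in K)
STAGES = [
    ("heat", 6.0, 300.0, 1000.0),
    ("equilibrate_hot", 18.0, 1000.0, 1000.0),
    ("cool", 14.0, 1000.0, 300.0),
    ("equilibrate_cold", 12.0, 300.0, 300.0),
]


def total_time(stages=STAGES):
    """Total length of the protocol in ps."""
    return sum(duration for _, duration, _, _ in stages)


def averaging_times(total_ps, window_ps=4.0, step_ps=0.5):
    """Times (ps) of the frames saved during the final averaging window."""
    n_frames = round(window_ps / step_ps)
    start = total_ps - window_ps
    return [start + step_ps * (k + 1) for k in range(n_frames)]


def average_structure(frames):
    """Average a list of frames, each a list of (x, y, z) atom coordinates."""
    n = len(frames)
    n_atoms = len(frames[0])
    return [
        tuple(sum(frame[a][c] for frame in frames) / n for c in range(3))
        for a in range(n_atoms)
    ]


if __name__ == "__main__":
    print(total_time())
    print(averaging_times(total_time()))
```

A plain coordinate average of this kind can distort bond lengths and angles slightly, which is consistent with the subsequent energy minimization of the averaged structure described in the text.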

Genome screening for sequence occurrence
We screened the human genome using a single sequence repeat as the input for BLAT search (http://genome.ucsc.edu/).