Indraneel Ghosh*ᵃ, Cliff I. Stainsᵃ, Aik T. Ooiᵇ and David J. Segal*ᵇ
ᵃ Department of Chemistry, University of Arizona, Tucson, Arizona 85721. E-mail: ghosh@email.arizona.edu; Tel: 520-621-6331
ᵇ Genome Center and Department of Pharmacology, University of California, Davis, CA 95616. E-mail: djsegal@ucdavis.edu; Fax: 530-754-9658; Tel: 530-754-9134
First published on 28th September 2006
Methodologies to detect DNA sequences with high sensitivity and specificity have tremendous potential as molecular diagnostic agents. Most current methods exploit the ability of single-stranded DNA (ssDNA) to base pair with high specificity to a complementary molecule. However, recent advances in robust techniques for recognition of DNA in the major and minor groove have made possible the direct detection of double-stranded DNA (dsDNA), without the need for denaturation, renaturation, or hybridization. This review will describe the progress in adapting polyamides, triplex DNA, and engineered zinc finger DNA-binding proteins as dsDNA diagnostic systems. In particular, the sequence-enabled reassembly (SEER) method, involving the use of custom zinc finger proteins, offers the potential for direct detection of dsDNA in cells, with implications for cell-based diagnostics and therapeutics.
Indraneel Ghosh was born on 30th December, 1968 in India. He obtained his BS degree in chemistry at Hobart College, Geneva, NY. He received his PhD in Chemistry at Purdue University, IN in 1998 with Professor Jean Chmielewski working on designing peptide inhibitors of protein–protein interactions and self-replicating peptides. Following his doctoral studies, Neel was a joint postdoctoral fellow at Yale University, New Haven, CT with Professor Andrew Hamilton in the Department of Chemistry and Professor Lynne Regan in Molecular Biophysics and Biochemistry. At Yale, he developed the split-Green Fluorescent Protein reporter system for the in vivo detection of protein–protein interactions. He started as an assistant professor at the University of Arizona, Tucson, AZ in 2001 and has been engaged in interdisciplinary research that spans the chemistry and biology interface. His current research interests are focused upon the design and selection of new proteins, peptides and small molecules that can be useful in the construction of therapeutics, biosensors, and biomaterials. He has been a recipient of a Research Innovation Award from the Research Corporation and a Career Award from the National Science Foundation.
Cliff Stains was born in Lewistown, PA in 1979. He attended Millersville University, Millersville, PA where he majored in Chemistry with an emphasis in Biochemistry. Under the direction of Dr Sandra Turchi he developed methods for the separation of methylated nucleotides, leading to an undergraduate honors thesis. Cliff joined the research group of Dr Indraneel Ghosh at the University of Arizona in 2003 where he is currently pursuing a PhD in Chemistry with a focus on Biological Chemistry. His research interests include designing protein–protein and protein–DNA assemblies for use in biological and materials applications. He has been the recipient of an institutional pre-doctoral Ruth L. Kirschstein National Research Service Award and mid-career awards at the University of Arizona.
Aik Ooi was born in Penang, Malaysia in 1980. He obtained his BS degree in Chemistry at Mount Union College, Alliance, OH, with a minor in Biology. For his undergraduate thesis, he worked under the supervision of Dr Laura Beal on the synthesis and application of substituted flavonoids as multi-drug resistant pump inhibitors in S. aureus. In 2002, Aik enrolled in the University of Arizona for his graduate studies in Medicinal and Natural Products Chemistry. He joined the research group of Dr David Segal, where he is working on the design and development of sequence-specific double-stranded DNA detection utilizing custom-designed zinc fingers and split-enzyme complementation. He has been the recipient of the Yuma young investigator award and the Caldwell research award at the University of Arizona.
David Segal was born in Yonkers, NY in 1966. He obtained a BS with honors in Biology from Cornell University, Ithaca, NY, in 1989. He obtained a PhD in Biochemistry from the University of Utah, Salt Lake City, UT, in 1996 with Professor Dana Carroll working on the stimulation of homologous recombination by targeting double-strand breaks in DNA. Seeking better targeting methods, he joined the laboratory of Professor Carlos Barbas at The Scripps Research Institute, La Jolla, CA as a postdoctoral fellow, where he helped develop the most widely used methods for engineering custom zinc finger DNA-binding proteins. As an assistant professor at the University of Arizona, Tucson, AZ, from 2002–2005, and now as an assistant professor at the University of California, Davis, CA, his research continues to focus on the design of engineered zinc fingers and their application as diagnostic agents and therapeutics.
Many types of DNA diagnostic methodologies have been described. Some are in the very early stages of development while others are commercially available. One way to categorize the myriad techniques is to define how they address their common goals. Like all diagnostic technologies, DNA diagnostics require both a detection method and a signal transducer. Most current detection methods for the sequence-specific recognition of DNA make use of the special property of single-stranded DNA (ssDNA) to base pair with high specificity to a complementary molecule (Fig. 1A). The other molecule may be another ssDNA, ssRNA, peptide-nucleic acid (PNA),3 or other base-pairing molecular analog. Such specific annealing or hybridization forms the basis for such common technologies as PCR amplification with specific primer sets, Southern blot, Northern blot, DNA microarray, and fluorescent in situ hybridization (FISH).4 However, there are other ways to read the sequence information besides Watson–Crick base-pairing (Fig. 1B). For example, polyamides are small chemical compounds that can be designed to bind with high sequence specificity in the minor groove of double-stranded DNA (dsDNA).5 Similarly, triplex-forming DNA6 and zinc finger DNA-binding proteins7 can both be engineered to achieve specific base-pair recognition of dsDNA in the major groove.
Fig. 1 An overview of DNA diagnostic methods. Detection methods can read the sequence information by either (A) Watson–Crick base pairing with one strand (orange), which requires denaturation of the duplex and subsequent hybridization with a complementary probe (purple), or (B) direct detection of dsDNA by specific interaction with base edges in the major or minor groove. A signal transducer converts the detection event into a quantitative signal, such as fluorescence intensity (green).
The second component required is a signal transducer, which converts the sequence-specific recognition event into a signal that can be quantitatively measured (Fig. 1). Typically the signal is optical (colorimetric, fluorescent, luminescent, turbidimetric, etc.) or electrical (voltage, resistance, or current change). Transducers with fluorescent readouts are used most commonly, such as dyes that intercalate into dsDNA (ethidium bromide and SYBR green) or are attached to the base-pairing partner of ssDNA (labeled probes used in DNA microarrays,8 Taqman real-time PCR chemistry,9 or FISH). While the specificity of a DNA diagnostic will depend on the fidelity of the detection method, the sensitivity will largely be a function of the signal transducer. For example, PNA probes can be engineered to bind with extremely high specificity and affinity to their denatured chromosomal targets in a FISH assay.10 However, detection of unique genomic sequences is limited by the difficulty in detecting the weak signal of one fluorescent molecule over background. To improve sensitivity, several ingenious methods have recently been developed to sensitively detect the recognition event (such as hybridization-dependent current fluctuations across an α-hemolysin nanopore11,12), or amplify the transduced signal (such as hybridization-dependent release of barcode DNA from captured nanoparticles,13 or aggregation-enhanced fluorescence14). Some of these methods have proven to be extremely sensitive, able to detect molecules in the zeptomolar range (1–500 molecules per mL sample). Other strategies rely again on the special ability of nucleic acids to form specific base pairs and enzymatically amplify the DNA, either before or as part of the detection method (such as PCR, Strand Displacement Amplification (SDA),15 or Rolling Circle Amplification (RCA)16).
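To make the quoted sensitivity concrete, the conversion between a molecule count in a given volume and a molar concentration can be sketched as follows. This is an illustrative calculation only; the function names are hypothetical and are not taken from any of the cited studies.

```python
# Illustrative conversion between molecule counts and molar concentration.
# Function names are hypothetical; only Avogadro's constant is a fixed fact.

AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def molarity_from_count(molecules: float, volume_ml: float = 1.0) -> float:
    """Return the molar concentration (mol/L) of `molecules` in `volume_ml` mL."""
    moles = molecules / AVOGADRO
    litres = volume_ml / 1000.0
    return moles / litres


def molecules_per_ml(molar: float) -> float:
    """Return the number of molecules in 1 mL of a solution of `molar` mol/L."""
    return molar * AVOGADRO / 1000.0


if __name__ == "__main__":
    for count in (1, 500):
        zeptomolar = molarity_from_count(count) / 1e-21
        print(f"{count} molecule(s) per mL is about {zeptomolar:.1f} zM")
```

Under this arithmetic, one molecule per mL corresponds to roughly 1.7 zM and 500 molecules per mL to roughly 830 zM, which is consistent with describing the range as zeptomolar.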
For an overview of recent advances in hybridization-based DNA diagnostics, the reader is directed to several outstanding reviews on this topic.17–19 The scope of this review will be restricted to methods for the direct detection of dsDNA, meaning methods that do not require dsDNA denaturation and subsequent hybridization. Progress in this area has been slower because of the difficulty in engineering highly specific detection methods for the major or minor groove of DNA. However, the emergence of such technologies in the past decade has now enabled their application as dsDNA diagnostics. In some cases, the ability to use dsDNA as a substrate enables capabilities beyond what would be possible for hybridization-based methods.
Fig. 2 Polyamide minor groove binders. Left: structural representation of a polyamide (green) bound in the minor groove of dsDNA (black and orange). Middle: a schematic illustration of the binding interactions. The abbreviations are pyrrole (Py), hydroxypyrrole (Hp), and imidazole (Im). Right: chemical structure of thiazole orange, used as a signal transducer.
Laemmli and coworkers have recently utilized fluorescein-labeled polyamides as “chromosome paints” with the goal of visualizing AT-rich satellite regions and scaffold-associated regions in the genome of Drosophila melanogaster.23 In a second study, designed polyamides conjugated to Texas-Red were used to target and visualize telomeric repeats in insects (TTAGG) and vertebrates (TTAGGG) with high specificity.24 These results demonstrated that telomere-specific polyamide-dye conjugates might allow for the rapid estimation of telomere length. These elegant studies utilized fluorescence microscopy of fixed cells or of isolated nuclei, where excess labeled polyamides can be removed. However, the detection of dsDNA in live cells or whole animals would require a method for removing unbound labeled polyamides, as their background fluorescence would likely decrease the contrast.
With the goal of lowering background (signal from unbound labeled polyamides), Dervan and coworkers have recently designed and tested polyamides conjugated to intercalating dyes tetramethyl rhodamine and thiazole orange (Fig. 2, right).25,26 In these studies, several fluorescent conjugates were synthesized and tested against dsDNA targets, 5′-WGGGWW-3′, 5′-WGGCCW-3′, and 5′-WGWWCW-3′ (W = A or T). It was found that the designed conjugates with thiazole orange exhibit >1000-fold fluorescence enhancement only in the presence of specific target dsDNA, where the dye likely intercalates at an adjacent site. The lowest concentration of oligonucleotide detected in this study was 1 nM, although lower concentrations were not examined. Mismatched targets reduced the signal by >90%. As polyamide binding site sizes and sequence specificities are being further refined, these new dsDNA-sensitive dyes attached to appropriate targeting molecules will likely find use in probing dsDNA in a cellular setting.27
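The degenerate target notation used above (W = A or T) can be made concrete with a short sequence scan. The following is a minimal sketch, assuming only the three target motifs quoted in the text; the helper names and the example duplex are hypothetical, and the scan says nothing about actual polyamide binding affinity.

```python
# Minimal sketch: locate degenerate motifs such as 5'-WGGGWW-3' on both
# strands of a duplex. Helper names and the example sequence are hypothetical.
import re

# Only the IUPAC codes needed for the motifs quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile a motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif)
    return re.compile(f"(?=({body}))")


def find_sites(top_strand: str, motif: str) -> list:
    """Return (strand, start, match) tuples; start is in top-strand coordinates."""
    pattern = motif_to_regex(motif)
    hits = [("+", m.start(), m.group(1)) for m in pattern.finditer(top_strand)]
    bottom = reverse_complement(top_strand)
    for m in pattern.finditer(bottom):
        start_on_top = len(top_strand) - (m.start() + len(motif))
        hits.append(("-", start_on_top, m.group(1)))
    return hits


if __name__ == "__main__":
    duplex = "ACGTAGGGTTCA"  # hypothetical example sequence
    for motif in ("WGGGWW", "WGGCCW", "WGWWCW"):
        print(motif, find_sites(duplex, motif))
```

Because a dsDNA target can be read from either strand, the scan reports hits on the reverse complement as well, mapped back to top-strand coordinates.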
Fig. 3 Triplex-DNA major groove binders. Left: structural representation of a TFO (green) bound in the major groove of dsDNA (black and orange). Middle: a schematic illustration of the binding interactions. Right: two types of “padlock” TFO strategies, (top) linear TFO with ends joined using a “splint” oligonucleotide (blue), and (bottom) stem-loop TFO with ligated fluorescently labeled DNA (purple).
An early application of triplex-DNA as a diagnostic agent was to stain an alpha-satellite repeat in chromosomal spreads using an assay analogous to FISH (appropriately termed TISH).31 A 16 nt polypyrimidine TFO was designed to bind the 500–1000 repeats of the target sequence at pH 6 without denaturation. TFO binding was stabilized by crosslinking to the duplex via a tethered psoralen moiety. Signal transduction was accomplished by tagging the TFO with fluorescein isothiocyanate (FITC). The study found TISH comparable to FISH in both sensitivity and specificity. The dsDNA-based TISH was suggested to be more quantitative than FISH, since there was no competition between probe hybridization and duplex reannealing. Additionally, non-denatured DNA allowed TISH, but not FISH, to be compatible with G-banding chromosomal reference techniques.
Another application of triplex DNA is the so-called “padlock” TFO.32 The detection method is based on the central part of a linear TFO, which forms a triplex with the target duplex DNA. The ends of the linear molecule can be joined covalently by base-pairing with an additional “splint” oligonucleotide, followed by ligation with DNA ligase (Fig. 3, top right). The result is a circular ssDNA molecule that is topologically linked to the target duplex. Signal transduction can be accomplished by RCA,16 or another amplification method. The ends of the TFO can alternatively be stabilized non-covalently by a stem-loop structure (Fig. 3, bottom right).33,34 Signal transduction in this case can be accomplished by designing the stem loop to have a short terminal overhang, to which a fluorescently labeled DNA can be ligated. Because the TFO is physically wrapped around the duplex molecule (hence the name “padlock”), the affinity of this complex is far greater than that of the triplex alone. In one study, a polypurine/polypyrimidine tract in individually spread molecules of lambda phage DNA was visualized with a 59 nt stem-loop oligonucleotide probe (containing a 15 nt central triplex-forming region) and a 500 bp stem-loop-binding labeled duplex.33 For increased sensitivity, the 500 bp DNA was labeled with at least 20 molecules of AlexaFluor 546. The purified lambda DNA was stretched on glass slides using molecular combing methods. The sample had to be heated to unwind the stem-loop on the probe oligonucleotide, then slowly cooled to allow rewinding after triplex formation. The precise position of the target site along individual DNAs was easily observable by fluorescence microscopy. In another study, radiolabeled padlock TFOs were able to detect subfemtomolar concentrations of target dsDNA using a signal transduction method of gel electrophoresis followed by autoradiography of dried gels.34
A wide variety of technological improvements have been made to expand recognition beyond strictly polypurine tracts, improve affinity, reduce pH dependence, and reduce degradation in cells. For example, artificial base analogs can extend recognition to all possible base pairs.35 Chemical compounds such as BQQ can act as triplex stabilizing agents.36 Triplex formation involving any sequence of bases was suggested to occur at physiological pH in the presence of YOYO-1, an oxazole yellow homodimer whose fluorescence intensity increases over 1000-fold in the presence of dsDNA and 100 000-fold in the presence of triplex DNA.37 This approach was reported to distinguish SNPs and single base pair deletions in PCR-amplified fragments of the cystic fibrosis gene, the DNA repair gene hMSH2, and the tumor suppressor genes BRCA1 and P16. The assay was extremely rapid and simple; however, the postulated triplexes were not unequivocally demonstrated. Such modified triplex paradigms and their general applicability for direct dsDNA recognition deserve further study.
Peptide nucleic acids (PNAs) are synthetic nucleic acid homologs containing standard DNA bases but a polyamide backbone composed of N-(2-aminoethyl) glycine units. The nucleobases are attached with methylenecarbonyl linkers.40 The neutral backbone eliminates the charge repulsion in standard nucleic acid hybridization; therefore, PNA is able to bind DNA and RNA with extremely high affinity and specificity. In the most common application, a homopyrimidine PNA forms a structure in which one PNA molecule forms Watson–Crick base pair interactions with a polypurine DNA strand, while another PNA forms Hoogsteen interactions. The other DNA strand is displaced, forming a “P loop”. However, a variety of PNA:DNA2, PNA2:DNA, and PNA2:DNA2 structures have been observed under various experimental conditions.3
DNA detection applications have been developed using both RecA41 and PNA.42–44 Indeed, applications of PNA are so numerous that the reader is directed to a dedicated review on this subject.45 However, as both these technologies involve some aspect of local duplex disruption and are not strictly major or minor groove detection methods, they will not be discussed further here.
Beyond in vitro assays, the E. coli lac repressor was fused to green fluorescent protein (GFP) to visualize inserted repeats of the lac operator in living yeast and mammalian cells.49 Single chromosomal integrations of a vector containing 256 repeats of the lac operator could be observed by fluorescence microscopy in cells expressing the GFP–lac fusion protein, with a signal-to-noise (background nuclear fluorescence) ratio of 12 : 1. In the absence of DNA binding, the expression of GFP–lac produced only diffuse fluorescence. The sensitivity of this live cell imaging method was found to be comparable to immunostaining and FISH. The study went on to demonstrate the utility of this method to examine chromonema fibers in interphase nuclei. The authors noted that previous failures to appreciate large-scale chromatin substructure within chromosome domains were consistent with perturbations of chromatin structure resulting from standard in situ hybridization procedures. More recently, the GFP–lac system was used to perform mosaic analysis in living C. elegans.50
The GFP–lac studies demonstrate successful detection in an environment in which the direct detection of dsDNA is a clear advantage, the living cell. However, these studies also illustrate the technological challenges to advancing beyond the proof-of-concept stage. First, the use of natural DNA-binding proteins as the detection method severely restricts the spectrum of DNA sequences that can be recognized. Second, signal transduction was successful because 256 GFP–lac molecules were spatially restricted to one locus. Detection of spatially distributed or unique target sequences, as is more typically desired, would be impossible because the individual bound GFP–lac molecules would have the same signal intensity as unbound molecules. We have attempted to address both of these challenges in the recently described sequence-enabled reassembly (SEER) detection methodology.51–53 As will be described below, the detection method is based on engineered zinc finger proteins, the specificity of which can be programmed by the investigator. The signal transducer is based on the binding-dependent reassembly of a reporter protein, such that no signal should be present unless DNA-binding occurs.
Fig. 4 Zinc finger protein major groove binders. Top: structural representation of a three zinc finger protein (blue) bound in the major groove of dsDNA (black and orange). Bottom: recognition modules incorporated into the alpha helix of a zinc finger that will enable it to specifically bind the indicated 5′-ANN-3′, 5′-CNN-3′ or 5′-GNN-3′ DNA sequence.
Using phage display technology, DNA contacting amino acids in the zinc finger domain were randomized for the selection of new variants that recognize desired DNA sequences.55,56 This allowed the selection of modules to construct multi-domain zinc fingers to bind to specific DNA sequences. Currently, domains have been identified that facilitate binding to all 5′-GNN-3′, most 5′-ANN-3′ and 5′-CNN-3′, and some 5′-TNN-3′ type sequences, enabling targeting to an extremely wide spectrum of target sites.57–59 Multi-finger proteins based on these custom DNA binding domains can be assembled by PCR using overlapping oligonucleotides or commercial synthesis.60,61
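The modular logic described above, in which each zinc finger domain recognizes one 3 bp subsite, can be sketched as a simple decomposition of a candidate target into triplets. This is an illustrative sketch only: the coverage labels merely restate the summary in the text ("all", "most", "some") and are not a lookup of actual characterized modules, the function names are hypothetical, and the example uses the well-known 9 bp Zif268 site 5′-GCGTGGGCG-3′.

```python
# Illustrative sketch: split a candidate zinc finger target site into 3 bp
# subsites and label each by its 5' base family. Coverage labels restate the
# qualitative summary in the text; they are not a database of real modules.

COVERAGE_BY_FAMILY = {"G": "all", "A": "most", "C": "most", "T": "some"}


def split_into_triplets(site: str) -> list:
    """Split a DNA site (length divisible by 3) into consecutive triplets."""
    site = site.upper()
    if len(site) % 3 != 0:
        raise ValueError("site length must be a multiple of 3")
    if set(site) - set("ACGT"):
        raise ValueError("site must contain only A, C, G, T")
    return [site[i:i + 3] for i in range(0, len(site), 3)]


def assess_site(site: str) -> list:
    """Return (triplet, family, coverage) for each 3 bp subsite, 5' to 3'."""
    report = []
    for triplet in split_into_triplets(site):
        family = f"5'-{triplet[0]}NN-3'"
        report.append((triplet, family, COVERAGE_BY_FAMILY[triplet[0]]))
    return report


if __name__ == "__main__":
    zif268_site = "GCGTGGGCG"  # 9 bp site, i.e. three finger subsites
    for triplet, family, coverage in assess_site(zif268_site):
        print(f"{triplet}: {family} family, modules available for {coverage}")
```

A site with several TNN subsites would be flagged as harder to target under this summary, while an all-GNN site would be the most straightforward; an 18 bp site corresponds to six fingers.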
Fig. 5 The SEER method for the direct detection of dsDNA. (A) SEER-GFP, (B) SEER-LAC, (C) mCpG-SEER with GFP.
000-fold improvement in detection time over SEER-GFP, as it could differentiate target from non-target DNA sequences in 5 min. The colorimetric assay format had sufficient sensitivity to easily detect 20 nM of purified target DNA. The specificity of the system was high enough to distinguish a single base-pair mutation in the 18 bp binding site. The intensity of the signal remained the same in the presence of a mass of herring sperm DNA equal to that of the target oligonucleotide.
Towards the goal of a rapid method for the detection of known sites of hypermethylation at CpG islands, we have recently developed a new approach called mCpG-SEER (Fig. 5C).52 The mCpG-SEER system was designed based upon our existing SEER-GFP system, while incorporating a means for targeting methylated CpG sites. We chose the well-characterized human MBD2 protein, which has a binding affinity of 2.7 nM for mCpG sites and a 50–100-fold reduced binding affinity for unmethylated CpG sites. We hypothesized that this difference in binding affinities would allow us to selectively target mCpG sites versus unmethylated CpG sites. Since numerous sites on a genome are methylated, we needed to introduce sequence selectivity, which can be readily achieved by utilizing natural and designed zinc fingers as discussed. As proof of concept, we employed the Zif268 zinc finger to recognize a site next to the mCpG site. We found that the specificity of mCpG-SEER was >40-fold between a methylated versus a non-methylated CpG target site. We also found that the fluorescent signal was linear to 5 pmol of methylated target DNA in a 100 µL sample volume. Thus, mCpG-SEER represents a new and potentially useful method for the direct detection of CpG methylation, which may find numerous applications in delineating the epigenome and in cancer research.
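The quantities quoted for mCpG-SEER can be related with a short calculation. The sketch below converts 5 pmol in 100 µL to a concentration and derives the range of dissociation constants implied by a 50–100-fold weaker affinity for unmethylated sites. The occupancy function is a simple one-site binding isotherm that assumes the free protein concentration is known and not depleted by binding; that model, the function names, and the 10 nM protein concentration are illustrative assumptions and are not taken from the study.

```python
# Illustrative arithmetic for the mCpG-SEER figures quoted in the text.
# The one-site isotherm and the 10 nM protein concentration are assumptions.

def concentration_nM(amount_pmol: float, volume_uL: float) -> float:
    """Convert an amount in pmol and a volume in µL to nM (pmol/µL = µM)."""
    return amount_pmol / volume_uL * 1000.0


def fraction_bound(free_protein_nM: float, kd_nM: float) -> float:
    """One-site binding isotherm: fraction of sites occupied at equilibrium."""
    return free_protein_nM / (free_protein_nM + kd_nM)


KD_METHYLATED_NM = 2.7  # MBD2 affinity for mCpG, as quoted in the text
# A 50-100-fold reduced affinity implies a 50-100-fold larger Kd.
KD_UNMETHYLATED_RANGE_NM = (KD_METHYLATED_NM * 50, KD_METHYLATED_NM * 100)

if __name__ == "__main__":
    print(f"5 pmol in 100 µL = {concentration_nM(5, 100):.0f} nM")
    low, high = KD_UNMETHYLATED_RANGE_NM
    print(f"Implied Kd for unmethylated CpG: {low:.0f}-{high:.0f} nM")
    protein_nM = 10.0  # hypothetical free protein concentration
    for label, kd in (("mCpG", KD_METHYLATED_NM), ("CpG, 50-fold", low)):
        print(f"{label}: fraction bound = {fraction_bound(protein_nM, kd):.2f}")
```

The first conversion gives 50 nM, which matches the sensitivity listed for mCpG-SEER in the summary table.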
| Method | Assay | Sensitivity | Sequence restrictions | Likely to be useful in cells | Reference |
|---|---|---|---|---|---|
| Polyamide: | | | | | |
| Fluorescein conjugate | FISH-like | Highly repeated sequences | None | Yes | 23,24 |
| Thiazole orange | Oligo targets | 1 nMᵃ | None | Yes | 25 |
| Triplex: | | | | | |
| TISH | FISH-like | Highly repeated sequences | Polypurine tracts | Yes | 31 |
| Padlock-FITC | Spread molecules | Single molecule | Polypurine tracts | No, heat requirement | 33 |
| Padlock-radiolabeled | Southern | <1 fM | Polypurine tracts | No, heat requirement | 34 |
| YOYO-1 | Amplified DNA | 4 nMᵃ | None | No, intercalator | 37 |
| Protein: | | | | | |
| Hin-oxazole yellow | Oligo targets | 50 nMᵃ | Hin sites | Delivery? | 47 |
| EcoRI–nano | Spread molecules | Single molecule | EcoRI sites | No, cleavage | 48 |
| GFP–lac | Live cell | 256 tandem repeats | lac sites | Yes | 49 |
| SEER-GFP | Oligo targets | 2.5 µMᵃ | Few | Yes | 53 |
| SEER-LAC | Oligo targets | 20 nM | Few | Yes | 51 |
| mCpG SEER | Oligo targets | 50 nM | Few | Yes | 52 |

ᵃ Lowest concentration used in study; lower concentrations were not examined.
Footnote
† Abbreviations: ZF, zinc finger; GFP, green fluorescent protein; PNA, peptide nucleic acid; dsDNA, double-stranded DNA; TFO, triplex-forming oligonucleotide; SNP, single nucleotide polymorphism; SEER, sequence-enabled reassembly.
This journal is © The Royal Society of Chemistry 2006