M. Stuhlmüller a, J. Schwarz-Finsterle a, E. Fey b, J. Lux† a, M. Bach a, C. Cremer ac, K. Hinderhofer b, M. Hausmann *a and G. Hildenbrand ad
aKirchhoff-Institute for Physics, Heidelberg University, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany. E-mail: hausmann@kip.uni-heidelberg.de; Tel: +49-6221-549824; Fax: +49-6221-549112
bInstitute for Human Genetics, Heidelberg University, Im Neuenheimer Feld 366, 69120 Heidelberg, Germany
cInstitute for Molecular Biology, Ackermannweg 4, 55128 Mainz, Germany
dDept. Radiooncology, University Medical Centre Mannheim, Theodor-Kutzer-Ufer 1-3, 68135 Mannheim, Germany
First published on 2nd October 2015
Trinucleotide repeat expansions (like (CGG)n) of chromatin in the genome of cell nuclei can cause neurological disorders such as Fragile-X syndrome. The mechanisms by which these expansions develop during cell proliferation are not yet clearly understood. Therefore, in situ investigations of chromatin structures on the nanoscale are required to better understand supra-molecular mechanisms on the single cell level. By super-resolution localization microscopy (Spectral Position Determination Microscopy; SPDM) in combination with nano-probing using COMBO-FISH (COMBinatorial Oligonucleotide FISH), novel insights into the nano-architecture of the genome will become possible. The native spatial structure of trinucleotide repeat expansion genome regions was analysed and optical sequencing of repetitive units was performed within 3D-conserved nuclei using SPDM after COMBO-FISH. We analysed a (CGG)n-expansion region inside the 5′ untranslated region of the FMR1 gene. The number of CGG repeats for a full mutation causing the Fragile-X syndrome was found and also verified by Southern blot. The FMR1 promoter region was condensed similarly to a centromeric region, whereas the arrangement of the probes labelling the expansion region seemed to indicate a loop-like nano-structure. These results demonstrate for the first time that in situ chromatin structure measurements on the nanoscale are feasible. With further methodological progress it will become possible to estimate the state of trinucleotide repeat mutations in detail and to determine the associated structural changes of the chromatin strand on the single cell level. In general, the application of the described approach to any genome region will lead to new insights into genome nano-architecture and open new avenues for understanding mechanisms and their relevance in the development of hereditary diseases.
Microscopic nano-techniques have offered new insights into cell biology, especially of cell membranes, and are bridging the gap between the mechanistic understanding of cell function, e.g. immunological response, and medical diagnostics.4 Basic structural features of cell nuclei and their time dependent changes have been elucidated and their profound functional correlation has been demonstrated.5,6 However, comparable investigations of very small chromatin regions (nano-architecture of the cell nucleus) have so far failed because of limited probe resolution and specificity, even when modern optical super-resolution microscopy or electron microscopy was applied. The use of localization microscopy (e.g. PALM, STORM or, as applied here, Spectral Position Determination Microscopy [SPDM])7,8 in combination with COMBO-FISH (COMBinatorial Oligonucleotide Fluorescence In Situ Hybridization)9,10 allows, on the one hand, the investigation of the native spatial structure of chromatin, for instance of trinucleotide repeat expansions, and, on the other hand, the optical sequencing of repetitive sequences by counting the number of oligonucleotide probes bound to the target area within the genome of 3D-conserved cell nuclei.
Fragile-X syndrome (FXS) or Martin-Bell syndrome is one main cause of inherited mental retardation.1 FXS belongs to the group of the so-called ‘trinucleotide repeat expansion disorders’, which are characterized by the expansion of a trinucleotide sequence, here a (CGG)n-expansion in the 5′ untranslated region of the Fragile-X Mental Retardation 1 gene (FMR1) on the X-chromosome.11,12 The enlargement of the CGG triplet-repeat results in hypermethylation13 within the promoter region of the FMR1 gene, causing a change of the spatial structure of the chromatin.11,14 This leads to a deactivation of the gene and thus to a lack of FMR1 protein, which is thought to be required for synaptic plasticity.15 Normally approx. 30 triplets are found. If the number of triplet repeats exceeds 200, (CGG)>200, this is called a full mutation. Repeats in the range between (CGG)≥45 and (CGG)≤54 are known as intermediates or grey zone alleles, and repeats from (CGG)≥55 to (CGG)≤200 are called pre-mutations.16,17
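The repeat-number ranges quoted above can be written down as a small lookup. The following Python sketch is illustrative only; the function name and the label used for repeat numbers below 45 are assumptions, since the text gives only a typical value of approx. 30 triplets for that range.

```python
def classify_fmr1_allele(cgg_repeats: int) -> str:
    """Map a (CGG) repeat count to the allele classes named in the text.

    Thresholds follow the text: >200 full mutation, 55-200 pre-mutation,
    45-54 intermediate (grey zone). The label for counts below 45 is an
    assumption made for this sketch.
    """
    if cgg_repeats < 0:
        raise ValueError("repeat count must be non-negative")
    if cgg_repeats > 200:
        return "full mutation"
    if cgg_repeats >= 55:
        return "pre-mutation"
    if cgg_repeats >= 45:
        return "intermediate"
    return "below intermediate range"


if __name__ == "__main__":
    for n in (30, 50, 120, 477):
        print(n, classify_fmr1_allele(n))
```

A count close to 200, as obtained from a cell-averaged measurement, falls right on the boundary between the last two classes, which is where a single cell readout would be most useful.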
In current medical diagnostics, the length of the trinucleotide repeat expansions is usually estimated by Southern blotting and PCR amplification.11 This lengthy method is not only limited by the resolution of the Southern blot; it also averages the result over a population of cells. Therefore, it is hard to differentiate between pre-mutation and full mutation if the expansion on average approximates 200 (CGG) repeats.
Lukáš et al. published an article18 about sequestration of muscleblind-like protein 1 (MBNL1) in tissues of patients with myotonic dystrophy type 2, which is known to be caused by a (CCTG)n tetraplet repeat expansion in intron 1 of the zinc finger 9 gene (ZNF9) on chromosome 3q21.3.19 They used RNA-FISH and immunofluorescence-FISH to demonstrate a different frequency of nuclei containing the foci of the CCUG transcript, a different expression pattern of MBNL1 protein and a different sequestration of MBNL1 by (CCUG)n repeats in different tissues.18
Here, we present for the first time microscopic structural and sequence measurements of very small chromatin regions (down to the order of one kb) in 3D-conserved cell nuclei on the nano-scale using localization microscopy. With the design and application of specifically labelled oligonucleotides it is shown that COMBO-FISH (COMBinatorial Oligonucleotide FISH) allows counting of the number of CGG-repeats in an expansion region inside the 5′ untranslated region of the FMR1 gene. The number of repeats, verified by Southern blotting, indicated a full mutation in the Fragile-X syndrome case analysed here. With appropriate sets of oligonucleotides for labelling, we found in this case that the FMR1 promoter region was condensed similarly to the centromeric region, whereas the arrangement of the probes labelling the expansion region seemed to indicate a loop-like structure.
COMBO-FISH uses a combination of computationally designed oligonucleotide probes with a typical length of 15-30 nucleotides. Through database analysis of accessory binding sites within the whole genome, the probe design is optimized so that the specificity of the probe combination is increased by co-localization within the target region.20 We extracted multiplets of 6 trinucleotide units [(CGG)6 or (CCG)6 probes], referred to as expansion-probes, which show the highest specificity to the (CGG)-repeat expansion region within the 5′ untranslated region of the FMR1 gene with a minimum of accessory binding sites within the whole genome. Three accessory binding sites with a maximum of six probes within a 250 kb range were identified. Considering a probe length of six trinucleotide units with a linked fluorochrome at one end, simulations of the probes binding to the target region led us to expect 45 ± 2 linearly bound probes in an expansion region with 477 (CGG)-repeats (∼1.4 kb), as occurs in the cell line used here. In addition, a probe set of different oligonucleotides specifically co-localizing in the FMR1 promoter region, further referred to as promoter-set or promoter-probes, was designed. The design of this promoter-set is based on the following conditions: a minimum probe length of fifteen nucleotides, a melting-point difference between promoter-probes and expansion-probes of less than 10 K, fewer than three binding sites within the remaining genome and a distance between the binding sites of the probes of about 490 nucleotides. This search resulted in 50 suitable sequences. Out of this pool the 20 shortest probes were chosen as the promoter-set. Finally, a repetitive probe, unique and thus highly specific for centromere 9, was designed for control experiments.
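The selection rules for the promoter-set can be illustrated with a short filter. This is a minimal Python sketch under stated assumptions, not the software used for the actual probe design: the `Candidate` record, the function name and all parameter names are hypothetical, melting temperatures and genome-wide binding-site counts are assumed to have been computed beforehand, and the spacing criterion of about 490 nucleotides between binding sites is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one candidate oligonucleotide."""
    sequence: str          # probe sequence
    melting_temp: float    # melting temperature in K, computed elsewhere
    off_target_sites: int  # binding sites in the remaining genome


def select_promoter_set(candidates, expansion_melting_temp, set_size=20,
                        min_length=15, max_tm_diff=10.0, max_off_targets=3):
    """Keep candidates meeting the stated conditions, then take the shortest."""
    suitable = [
        c for c in candidates
        if len(c.sequence) >= min_length
        and abs(c.melting_temp - expansion_melting_temp) < max_tm_diff
        and c.off_target_sites < max_off_targets
    ]
    # sorted() is stable, so candidates of equal length keep their input order
    return sorted(suitable, key=lambda c: len(c.sequence))[:set_size]
```

With the default `set_size=20`, applying such a filter to a pool of suitable sequences mirrors the step of choosing the 20 shortest probes described above.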
After hybridization of the expansion- and promoter-sets, fluorescence signals (spots) were detected in more than 80% of the confocal microscopy images of the examined cells. By simultaneous co-localization of two expansion probe sets (same oligonucleotide stretches with different dyes) in a two colour experiment, we confirmed the specificity of the designed expansion-probes. The observed co-localization efficiency was over 70%. The same results were obtained in co-localization tests of the FMR1 promoter-set and the (CGG)-probes bound to the expansion region (Fig. 1).
For further examinations, such as the number of expansion-probes bound to the target area and their binding behaviour, dyes undergoing reversible photo-bleaching were used21 in order to acquire high resolution images by SPDM.6,7,22 Image time stacks were processed by specially developed MATLAB-based programs to detect molecular blinking events that represent the position of the individual dye molecules linked to the probes. As a result of this blinking detection a matrix is created containing the exact position and the localization precision of each blinking event. This matrix is used to create the localization images and for further cluster analysis.
One sub-script of these programs additionally offers the option to crop the localization images and the underlying matrix to a region of interest that contains the COMBO-FISH signals in a wide field image. Thereby, the signal-to-noise ratio is improved and the local background within the region of interest is reduced.
A cluster analysis applied to the data-matrix detects ‘cluster-points’ within the localization data according to user-defined parameters. These parameters are the radius of a circular region around each blinking event and the minimum number of fluorescence signals within this region. The mode of operation is shown schematically in Fig. 2. An event is identified as a ‘cluster-point’ if it has at least as many neighbours within the given radius as preset by the user (see Fig. 2d and e). An event is not counted as part of a cluster if the preset number of neighbours within this area is not reached (see Fig. 2b and c). After this analysis has been done for all events within the data-matrix, each final cluster is defined as the contiguous union of ‘cluster-points’ (Fig. 2f and g). For visualisation of the clusters as seen in Fig. 4c and g, the area of each single cluster is coloured. By this process not only the localization events are coloured but also the corresponding regions around each event. In addition, the regions around events identified as not being inside a cluster are excluded.
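The neighbour-counting rule described above can be sketched in a few lines of Python. This is an illustrative re-implementation, not the MATLAB-based program used for the analysis; the function and parameter names are hypothetical, and it assumes that two ‘cluster-points’ belong to the same cluster when they lie within the chosen radius of each other, which is one possible reading of the union step.

```python
import math


def find_clusters(events, radius, min_neighbours):
    """Label localization events by cluster; -1 marks events outside clusters.

    events: sequence of (x, y) positions, e.g. in nm.
    radius: radius of the circular region around each event.
    min_neighbours: minimum number of other events inside that region.
    """
    n = len(events)
    # For every event, list the other events inside its circular region.
    neighbours = [
        [j for j in range(n)
         if j != i and math.dist(events[i], events[j]) <= radius]
        for i in range(n)
    ]
    is_cluster_point = [len(nb) >= min_neighbours for nb in neighbours]

    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if not is_cluster_point[start] or labels[start] != -1:
            continue
        # Grow one cluster as the connected union of cluster-points.
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in neighbours[i]:
                if is_cluster_point[j] and labels[j] == -1:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels
```

Counting how often each non-negative label occurs, for instance with `collections.Counter(labels)`, then gives the number of signals per cluster. The pairwise distance search is quadratic in the number of events, which is acceptable for a cropped region of interest but would call for a spatial index on full-frame data.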
Additionally, the script provides further information about the identified clusters, such as the number of signals within the clusters, the cluster diameter and the distance distributions of signals inside and outside of clusters. By comparing and validating the position and size of clusters recorded by conventional fluorescence microscopy vs. SPDM, we determined the parameters that represent the clusters optimally. In this way the number of probes bound to the examined region is given by the number of signals within the specific clusters.
To verify this newly developed analysis procedure and optical sequencing strategy, we used human B-lymphocytes (Coriell GM06897), which were sampled from a male donor with an expansion of 477 (CGG)-repeats according to the provider. Analysing these cells with the common method based on Southern blotting revealed an expansion length of 490 ± 100 (CGG)-repeats (Fig. 3). This indicates a certain inhomogeneity of the cells and further motivates our single cell approach.
The application of our new method based on COMBO-FISH and SPDM resulted in 65 ± 5 signals within the clusters that were identified as bound probes (Fig. 4). Extending our simulations for 477 repeats to the worst case, a target region with 490 ± 100 (CGG)-repeats should lead to about 45 ± 10 linearly bound probes. According to the above-mentioned simulation, the maximum number of probes bound linearly to a target area of 590 (CGG)-repeats, which is about the maximum length of the expansion region based on the Southern blot analysis, should therefore be about 60, occurring with a probability of only 4%.
Measurements comparing the distances of the expansion-probes with those of the promoter-probes and centromere 9 probes (for probe design see ref. 10) indicated a significant difference in the signal densities and thus probe densities (Fig. 5). While the non-repetitive promoter- and centromere-probes typically showed a mean distance of 23.3 ± 3.7 nm (assuming a linear 0.34 nm per nucleotide: 69 ± 11 nucleotides) and 26.1 ± 4.3 nm (77 ± 13 nucleotides), respectively, between neighbouring fluorophores, the repetitive expansion-probes showed a mean distance of only 12.3 ± 3.3 nm (36 ± 10 nucleotides). The only difference between these probes was the character (repetitive or non-repetitive units) of their sequences and the structure of their binding sites. The difference in the probe distance between the theoretical value (490 nucleotides) and the measured value (69 ± 11 nucleotides) in the case of the promoter-probes points to an inactivated, i.e., dense DNA superstructure. This explanation is supported by the densely packed centromere-probes. One can assume that centromere regions and similarly densely packed chromatin regions represent the highest compaction form of chromatin. Therefore, the apparently “more densely packed” expansion-probe signals may, on the contrary, point to an extended, loop-like and therefore well accessible chromatin conformation that increases the real number of bound probes, so that the detected distances of the fluorochromes are apparently shortened. The difference between the measured number of probes bound to the target areas and the simulated average number indicates DNA superstructures, since the simulation assumes a linear binding behaviour only. However, a possible non-linear probe binding behaviour could not be strictly excluded for repetitive probe units (Fig. 6). For stoichiometric reasons, these binding formations would also require appropriate free volumes around the target strand, so that again a loop-like conformation seems to be preferred.
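The conversion between measured fluorophore distances and nucleotide numbers used above is a simple division by the assumed linear rise of 0.34 nm per nucleotide. The following Python sketch reproduces that arithmetic; the function names are hypothetical, and the compaction factor is an illustrative derived quantity that is not stated in the text.

```python
RISE_NM_PER_NUCLEOTIDE = 0.34  # linear DNA assumption used in the text


def nm_to_nucleotides(distance_nm, rise_nm=RISE_NM_PER_NUCLEOTIDE):
    """Convert a distance in nm to the equivalent length of linear DNA."""
    return distance_nm / rise_nm


def compaction_factor(design_spacing_nt, measured_nm,
                      rise_nm=RISE_NM_PER_NUCLEOTIDE):
    """Ratio of the expected linear distance to the measured distance."""
    return design_spacing_nt * rise_nm / measured_nm


if __name__ == "__main__":
    measurements = {                 # mean, standard deviation in nm
        "promoter": (23.3, 3.7),
        "centromere 9": (26.1, 4.3),
        "expansion": (12.3, 3.3),
    }
    for name, (mean_nm, sd_nm) in measurements.items():
        print(f"{name}: {nm_to_nucleotides(mean_nm):.0f} "
              f"± {nm_to_nucleotides(sd_nm):.0f} nucleotides")
    # Promoter probes were designed about 490 nucleotides apart.
    print(f"promoter compaction: {compaction_factor(490, 23.3):.1f}-fold")
```

Under the linear assumption, the designed promoter spacing of 490 nucleotides corresponds to about 167 nm, so the measured 23.3 nm implies a roughly sevenfold shortening.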
The combination of COMBO-FISH9,10 and SPDM,7,23,24 i.e. super-resolution microscopy, enables analyses of extremely small target areas of the genome and represents an important step towards the optical sequencing of trinucleotide repeat expansions or any other genome target region for which appropriate probes are available. Besides aspects of probe binding one should, however, also consider resolution limits in practical SPDM. Physical constraints influence localization accuracy and precision, especially for dye molecule distances in the range of 10 nm.25 The detection and recording systems currently used for SPDM may therefore run into registration problems under the labelling conditions applied here, so that small numbers of densely arranged probes have to be identified carefully. Our experimental requirements thus pose a technological challenge for front-line localization microscopy. However, novel developments in optics and detector sensitivity as well as sophisticated informatics are expected to overcome such shortcomings.
By simulating the binding of probes with a different number of (CGG)-repeats on targets of different lengths and simultaneously searching for accessory binding sites of these probes within the human genome, we ascertained that the optimal length for the COMBO-FISH probes is (CGG)6. At this length we expected 45 ± 10 probes to bind linearly within an assumed target region of the cells used. Additionally, the database search led us to expect three accessory binding sites in which three or six probes can bind within a range of 250 kb.28 These results for the (CGG)-probes can be transferred directly to the complementary (CCG)-probes.
According to these results three oligonucleotide probes with one fluorochrome molecule on the 5′ end were commercially synthesized. Two of these probe types are (CCG)6-probes, one labelled with Alexa 488 and one labelled with Alexa 568. The third kind are (CCG)6-probes labelled with Alexa 568.
The appropriate excitation wavelength was selected by Python-based software. The fluorescence was detected by using a Sensicam QE with 1376 × 1040 pixels and a pixel size of 6.45 μm (PCO). Typically, 2000 frames were acquired with an integration time of 50 ms each.
Additionally, some newly developed programs were used to select specified regions within the widefield images of the data stacks and to crop the raw data to these regions. This leads to a much better signal-to-noise ratio within the region of interest, so that it became much more likely to detect the COMBO-FISH signals within the data. Furthermore, an appropriate cluster detection algorithm was developed.
These results demonstrate that in situ chromatin nano-structure measurements are feasible in intact cell nuclei. Changes in these nano-structures will provide new information about the mechanisms behind the induction of genomic aberrations. In general, the applied methodological approach will lead to new quantifiable insights into mechanisms correlated with the genome nano-architecture and their functional consequences, especially with respect to their relevance for the development of hereditary diseases.
Footnote
† Present address: Institute of Environmental Physics, Heidelberg University, Im Neuenheimer Feld 229, 69120 Heidelberg, Germany.
This journal is © The Royal Society of Chemistry 2015