Open Access Article
Yangang Pan, Luda S. Shlyakhtenko* and Yuri L. Lyubchenko*
Department of Pharmaceutical Sciences, College of Pharmacy, WSH, University of Nebraska Medical Center, Omaha, Nebraska 68198-6025, USA. E-mail: ylyubchenko@unmc.edu; lshlyakhtenko@unmc.edu
First published on 4th September 2019
APOBEC3G (A3G) is a single-stranded DNA (ssDNA) binding protein that restricts the HIV virus by deamination of dC to dU during reverse transcription of the viral genome. A3G has two zinc-binding domains: the N-terminal domain (NTD), which efficiently binds ssDNA, and the C-terminal catalytic domain (CTD), which supports deaminase activity of A3G. Until now, structural information on A3G has been lacking, preventing elucidation of the molecular mechanisms underlying its interaction with ssDNA and deaminase activity. We have recently built a computational model for the full-length A3G monomer and validated its structure using data obtained by time-lapse High-Speed Atomic Force Microscopy (HS AFM). Here time-lapse HS AFM is applied to directly visualize the structure and dynamics of A3G in complexes with ssDNA. Our results demonstrate a highly dynamic structure of A3G, where two domains of the protein fluctuate between compact globular and extended dumbbell structures. Quantitative analysis of our data revealed a substantial increase in the number of A3G dumbbell structures in the presence of the DNA substrate, suggesting that the interaction of A3G with the ssDNA substrate stabilizes this dumbbell structure. Based on these data, we proposed a model explaining the interaction of globular and dumbbell structures of A3G with ssDNA and suggested a possible role of the dumbbell structure in A3G function.
Here the HS-AFM methodology29–32 is utilized to visualize the dynamics of monomeric A3G in complex with ssDNA. To unambiguously identify the A3G–DNA complexes, a hybrid-DNA approach17,28,33,34 was employed, and different types of DNA substrates were used to reveal the intramolecular dynamics of A3G. A3G was found to form complexes with ssDNA in both compact globular and dumbbell structures, but the population of dumbbell structures was considerably larger than for the free protein. The yield of dumbbell structures also depended clearly on the length of the ssDNA substrate, increasing with substrate length. The use of different ssDNA substrates allowed us to observe one of the domains transiently dissociating from ssDNA, demonstrating the highly dynamic behavior of A3G in the presence of the ssDNA substrate. Based on these results, we suggest a model to explain the role of A3G dynamics in the interaction with ssDNA and form a hypothesis for its role in protein function.
Fig. 1 AFM selected frames from Movie 1† illustrating the dynamics of A3G in complex with the 69 nt tail DNA. Frames 18 and 102 show the globular structure of A3G in complex with ssDNA. Frames 44, 56, 57, and 99 represent the dumbbell structure of A3G in complex with ssDNA. The average yield of dumbbell structures is 65%. The scale bar is 25 nm. The scan rate is 398 ms per frame.
The first striking observation for A3G in complex with the 69 nt tail ssDNA substrate was the high yield of the dumbbell structures. The average yield for the dumbbell structure was 65%, analyzed from 10 separate movies with a total of ∼600 frames. Note that this yield is four times greater than the yield of A3G dumbbells for A3G not bound to ssDNA.
For quantitative characterization of the dumbbell and globular structures of A3G in complex with the 69 nt tail ssDNA, several parameters were used, as shown in Fig. 2. For the dumbbell structure of A3G, the cross-sectional feature was selected, as shown in Fig. 2A (marked with a red line on the AFM image). Fig. 2B illustrates three parameters, calculated from the cross-section of the dumbbell structure of A3G. The height of each maximum is marked as h1 for Domain 1 and h2 for Domain 2; the center-to-center distance is marked as d between domains. For the globular structure of A3G, as shown in the AFM image in Fig. 2C, the ratio between two orthogonal diameters d1 : d2 was used, marked as blue and red lines, respectively. The plot in Fig. 2D illustrates measurements for two cross-sections of the globular structure.
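The parameters described above can be illustrated with a minimal sketch. It assumes a cross-section is available as a list of heights sampled at a fixed step; the function names, the peak-picking rule, and the synthetic profile are hypothetical and are not the authors' analysis procedure.

```python
from typing import List, Tuple


def dumbbell_parameters(profile: List[float], step_nm: float) -> Tuple[float, float, float]:
    """Return (h1, h2, d) from a cross-section sampled every step_nm nanometres."""
    # Local maxima: higher than the left neighbour, at least as high as the right one.
    peaks = [
        i for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("profile does not show two separate maxima")
    # Keep the two tallest maxima, then restore left-to-right order.
    tallest = sorted(peaks, key=lambda i: profile[i], reverse=True)[:2]
    first, second = sorted(tallest)
    h1, h2 = profile[first], profile[second]
    d = (second - first) * step_nm  # center-to-center distance
    return h1, h2, d


def globular_ratio(d1_nm: float, d2_nm: float) -> float:
    """Ratio of the two orthogonal diameters used for the globular structure."""
    return d1_nm / d2_nm


# Synthetic profile (heights in nm, one sample per nm); not measured data.
profile = [0.0, 0.4, 1.2, 2.0, 1.3, 0.9, 1.1, 1.8, 1.0, 0.3, 0.0]
h1, h2, d = dumbbell_parameters(profile, step_nm=1.0)
print(f"h1 = {h1} nm, h2 = {h2} nm, d = {d} nm")
```

On real images the profile would first need background flattening and probably smoothing; both are omitted here.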
Fig. 3 shows results from data analysis for the dumbbell and globular structures of A3G. Fig. 3A shows the dependence of the distance (d) between the two A3G domains on the frame number, calculated for the dumbbell structure of the A3G–69 nt tail ssDNA complexes. These data show a wide range of fluctuation in the distances between the two domains, between 3 nm and 8 nm. Fig. 3B provides a histogram for the distribution of the distance (d) between the two domains in the dumbbell structure of A3G, and the Gaussian fit gives the average distance of d = 5.1 ± 1.0 nm. Fig. 3C shows the result for globular A3G in the complex as a dependence of the d1 : d2 ratio on the frame number and as a d1 : d2 histogram (Fig. 3D). The Gaussian fit to the histogram produces a mean value for the d1 : d2 ratio of 1.3 ± 0.2, which resembles the data for free A3G.
Another important parameter, which can be obtained from the HS-AFM data, is the lifetime for the specific structure of A3G in the complex. Fig. 4A shows a plot for the dependence of the distance (d) between the two domains in the dumbbell structure (right axis, blue) and d1 : d2 for the globular structure (left axis, black) for the A3G–69 nt tail ssDNA complex on the frame number, obtained from one of the movies. Blue dots show changes in the distance (d) between A3G domains in the dumbbell structure, and black triangles represent fluctuations in the d1 : d2 ratio for the globular structure. Following frame-by-frame transitions between globular and dumbbell structures, the lifetime was calculated for each structure of A3G in the complex. The zoomed portion of the plot in Fig. 4A (marked by a red rectangle) is shown in Fig. 4B, where several consecutive, uninterrupted frames for the dumbbells characterize their lifetime (blue dots), and likewise, several uninterrupted frames for the globular structure (black triangles) characterize the lifetimes of the globular structure. Fig. S2† offers another example of the dynamic behavior of the dumbbell structure of A3G in the DNA complex. The plot in Fig. S2† illustrates an example of the long-lived dumbbell structure of A3G in complex with the 69 nt tail ssDNA substrate, with large fluctuations in the distance (d) between the two domains. Analysis of the lifetimes obtained from all assembled movies for the A3G–69 nt tail ssDNA complexes is shown as histograms in Fig. 4 for the dumbbells (C) and globular (D) A3G structures. The fit of these histograms with first-order exponential decay gives a lifetime of 0.64 ± 0.03 seconds for dumbbells and 0.39 ± 0.06 seconds for globular structures.
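The lifetime analysis above can be sketched as follows. The sketch assumes each frame has already been labelled as dumbbell ("D") or globular ("G"); the labels, function names, and example sequence are hypothetical. The text fits the dwell-time histograms with a first-order exponential decay, whereas this sketch uses the sample mean of the dwell times, which is the maximum-likelihood lifetime for an exponential distribution and is a simpler stand-in rather than the authors' fit.

```python
from itertools import groupby
from typing import Dict, List

FRAME_TIME_S = 0.398  # scan rate quoted in the figure captions (398 ms per frame)


def dwell_times(states: List[str], frame_time_s: float = FRAME_TIME_S) -> Dict[str, List[float]]:
    """Convert runs of consecutive, uninterrupted frames into dwell times in seconds."""
    dwells: Dict[str, List[float]] = {}
    for label, run in groupby(states):
        n_frames = len(list(run))
        dwells.setdefault(label, []).append(n_frames * frame_time_s)
    return dwells


def mean_lifetime(dwells: List[float]) -> float:
    """Sample mean of the dwell times (maximum-likelihood estimate for an exponential)."""
    return sum(dwells) / len(dwells)


# Hypothetical frame labels for one movie: D = dumbbell, G = globular.
states = list("DDDGDDGGDDDDG")
dwells = dwell_times(states)
for label, values in dwells.items():
    print(label, [round(v, 3) for v in values], round(mean_lifetime(values), 3))
```

The first and last runs of a movie are truncated by the recording window, and dwell times are quantized in units of one frame; a careful analysis would account for both.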
… d1 : d2 on the frame number (Fig. 6C); the histogram for d1 : d2 is shown in Fig. 6D. The calculated lifetimes for dumbbell and globular structures are shown in Fig. 6E and F, which show that the lifetime of the dumbbell structure for A3G in complex with the 25 nt tail ssDNA is less than that of the globular structure: 0.42 ± 0.01 seconds and 1.29 ± 0.09 seconds, respectively.
Fig. 7 presents selected frames from Movie 3,† where the transient dissociation of one of the A3G domains from the ssDNA substrate is unambiguously seen. Frames 21, 25, 47, and 56 show one smaller-sized domain unbound to the ssDNA substrate. Frames 175, 182, 187, and 196 show both domains, similar in size, bound to the ssDNA gap substrate. A3G also formed a globular, compact structure, as seen in frames 42 and 73.
Fig. 7 Selected frames from Movie 3† illustrating the different positions of A3G domains in complex with the 69 nt gap DNA. Frames 21, 25, 47, and 56 show the dumbbell structure of A3G with one domain unbound to the ssDNA substrate. Frames 42 and 73 represent the globular structure of A3G. Frames 175, 182, 187, and 196 represent the dumbbell structure of A3G with both domains located on the ssDNA substrate. The ratio of the height of Domain 1 to that of Domain 2 (h1 : h2) of A3G is inserted at the top of each frame. The scale bar is 50 nm. The scan rate is 398 ms per frame.
The smaller size of such a domain can be explained by its lack of binding to the ssDNA substrate, which may contribute to the overall size of the domain. To confirm this effect, the ratios of the heights of Domain 1 (h1) to those of Domain 2 (h2) were calculated (Fig. 2B). Data for the h1 : h2 ratio are incorporated into frames in Fig. 7. When both domains are in the dumbbell structure and bound to the substrate, DNA contributes equally to the sizes of the domains. Therefore, the ratio of heights of the domains h1 : h2 would be expected to be close to one, which is clearly seen in frames 175, 182, 187, and 196. Meanwhile, when one of the domains is unbound to the ssDNA substrate, the h1 : h2 ratio should increase due to the lack of binding of this domain with the ssDNA substrate, as seen in frames 21, 25, 47, and 56.
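The height-ratio criterion can be sketched as a simple per-frame classifier. The text states only that the ratio is close to one when both domains are bound and about 1.3 (a value reported later in the text) when one domain is unbound. The cutoff of 1.15 and the choice to divide the taller by the shorter domain are assumptions made for illustration, and the function name is hypothetical.

```python
from typing import Tuple

# Hypothetical cutoff: midway between the two reported ratio values (1.0 and 1.3).
RATIO_CUTOFF = 1.15


def classify_dumbbell(h1_nm: float, h2_nm: float, cutoff: float = RATIO_CUTOFF) -> Tuple[float, str]:
    """Return the height ratio (taller / shorter) and a binding-state label."""
    taller, shorter = max(h1_nm, h2_nm), min(h1_nm, h2_nm)
    if shorter <= 0:
        raise ValueError("domain heights must be positive")
    ratio = taller / shorter
    label = "both domains bound" if ratio < cutoff else "one domain unbound"
    return ratio, label


# Illustrative heights in nm; not measured values.
for heights in [(2.0, 2.0), (2.6, 2.0)]:
    ratio, label = classify_dumbbell(*heights)
    print(f"h1 : h2 = {ratio:.2f} -> {label}")
```

A fixed cutoff is only a convenience; the histograms discussed in the text separate the two populations by their peaks rather than by a single threshold.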
The yield of the dumbbell conformation of A3G also depends on the length of the ssDNA substrate. Table 1 summarizes data obtained from analyses of the dumbbell and globular structures of A3G in complex with 69 nt and 25 nt tail ssDNA substrates and free A3G. As seen in Table 1, the yield of dumbbells is highest in the presence of the long 69 nt ssDNA substrate (65%), drops to 35% for the shorter 25 nt ssDNA substrate, and is only 16% for free A3G. Together these data clearly demonstrate the effect of the ssDNA substrate on conformational changes of A3G domains and show the dependence of such changes on the length of the ssDNA substrate. The average distance between A3G domains for the dumbbell structures in A3G–ssDNA complexes changes only slightly, from 5.1 ± 1.0 nm for the long substrate to 4.7 ± 1.0 nm for the shorter one, with the smallest value, 4.4 ± 0.9 nm, for free A3G. Data for the globular structure show no difference between A3G–ssDNA complexes and free A3G, indicating that the ssDNA substrate does not affect the globular structure of A3G. Indeed, the d1 : d2 ratio remains equal to 1.3, indicating an elongated shape both for A3G in complex with ssDNA and for free A3G.
| Parameter | 69 nt tail DNA–A3G | 25 nt tail DNA–A3G | Free A3G |
|---|---|---|---|
| Globular d1/d2 | 1.3 ± 0.2 | 1.3 ± 0.2 | 1.3 ± 0.3 |
| Dumbbell yield | 65% | 35% | 16% |
| Dumbbell distance | 5.1 ± 1.0 nm | 4.7 ± 1.0 nm | 4.4 ± 0.9 nm |
HS-AFM data also reveal that the A3G domains in the dumbbell structure differ in their affinity for the DNA substrate. As seen in Fig. 7, one of the A3G domains in complex with ssDNA is capable of transiently dissociating from the ssDNA substrate. Quantitatively, for the dumbbell structure of A3G in the complex, this effect is illustrated by measuring the ratio of the height of Domain 1 to that of Domain 2 (h1 : h2). The h1 : h2 ratio is close to one when both domains are bound to the substrate, but when one of the domains is not bound to the ssDNA the h1 : h2 ratio is 1.3. These measurements were performed for both the 69 nt and 25 nt tail ssDNA substrates. Fig. 8A and B present the results of this analysis. Histograms for A3G complexes with 69 nt and 25 nt tail ssDNA substrates have two distinct peaks. The first peak, with almost equal heights of the domains, corresponds to cases when both domains are bound to the substrate. The second peak corresponds to cases when one of the domains is not bound to the substrate, with the h1 : h2 ratio close to 1.3, indicating the contribution of ssDNA to the size of the domain. By comparison, for free A3G (Fig. S3†), the histogram shows only one maximum for the h1 : h2 ratio, which is close to one. Another line of evidence for the contribution of ssDNA to the overall size of the A3G domains comes from directly measuring the heights of each domain for free A3G and A3G in complex with 69 nt tail ssDNA, as shown in Fig. S4.† Here, we assembled histograms for the heights of each domain in the dumbbell structure for free A3G (Fig. S4A and B†) and for A3G in complex with the 69 nt tail ssDNA (Fig. S4C and D†). Data demonstrate that the heights of the domains for free A3G are similar when compared to the heights of domains for A3G in the complex (Fig. S4C and D†). Note that the height of one of the domains for A3G in the complex with the ssDNA substrate is close to the height of both domains for free A3G (Fig. S4D†), which indicates that this domain is unbound to the ssDNA substrate (Fig. S4C†). Overall, the data presented here clearly demonstrate that one of the domains in the dumbbell structure of A3G is capable of transiently dissociating from the ssDNA substrate, supported by the lack of the contribution of ssDNA substrate to the size of the protein.
The diagrams in Fig. 8C and D summarize the analysis of all the results obtained here. The grey area in each diagram presents the yield of globular A3G structures, calculated to be 35% for the 69 nt tail ssDNA substrate (Fig. 8C) and 65% for the 25 nt tail ssDNA substrate (Fig. 8D). The estimated lifetime for the globular A3G structure in complex with the 69 nt tail ssDNA (∼0.39 ± 0.06 s) tends to be less than that with the 25 nt tail ssDNA substrate (∼1.29 ± 0.09 s). The shorter lifetime for the globular structure correlates with the reduced yield of the globular structure compared to the dumbbell structure for the A3G–69 nt ssDNA complexes. The blue and orange areas together show the yield of dumbbell structures for long and short ssDNA substrates to be 65% and 35%, respectively, with a tendency toward increased lifetimes for the dumbbell structures in complex with 69 nt ssDNA (∼0.64 ± 0.03 s) compared with the shorter ssDNA substrate (0.42 ± 0.01 s). These results show the correlation between the yield of dumbbell and globular structures of A3G and their lifetime on the different ssDNA substrates.
As shown in Fig. 8A and B, the two distinct peaks for h1 : h2 values, shown for the long and short ssDNA substrates, demonstrate different positions of A3G domains on the ssDNA substrate. Indeed, when both domains are bound to the ssDNA substrate, the h1 : h2 ratio is close to one, compared to an h1 : h2 ratio of 1.3 when one of the domains is not bound to the substrate. In addition to the position of the domains in dumbbell structures of A3G in A3G–ssDNA complexes discussed above, the areas under peak 1 and peak 2 (Fig. 8A and B) indicate the different numbers of events for bound and unbound domains for long and short ssDNA substrates. For a long substrate, the ratio between areas under peak 1 and peak 2 is 1.8, indicating almost twice as many events with both A3G domains positioned on the ssDNA as with one of the domains unbound. The blue and orange areas in Fig. 8C show this distribution to be 42% for both domains bound to the ssDNA substrate (blue area) vs. 23% for one domain unbound (orange area). For a short ssDNA substrate (Fig. 8D), the ratio between areas under peak 1 and peak 2 is 1.1, demonstrating a practically equal number of events for A3G domains positioned on the substrate and for one domain unbound, as shown in blue (18%) and orange (17%) areas in the diagram, respectively.
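The percentages quoted above follow from splitting the total dumbbell yield according to the ratio of the two peak areas. The short sketch below reproduces that arithmetic from the numbers given in the text; the function name is hypothetical, and the assumption is that the peak-area ratio directly partitions the dumbbell population.

```python
from typing import Tuple


def split_dumbbell_yield(dumbbell_yield_pct: float, peak_area_ratio: float) -> Tuple[float, float]:
    """Split the dumbbell yield into (both bound, one unbound) percentages.

    peak_area_ratio is area(peak 1) / area(peak 2), i.e. both-bound events
    divided by one-unbound events.
    """
    both_bound = dumbbell_yield_pct * peak_area_ratio / (1.0 + peak_area_ratio)
    one_unbound = dumbbell_yield_pct - both_bound
    return both_bound, one_unbound


# Yields and peak-area ratios as quoted in the text.
for label, yield_pct, ratio in [("69 nt", 65.0, 1.8), ("25 nt", 35.0, 1.1)]:
    both, one = split_dumbbell_yield(yield_pct, ratio)
    print(f"{label}: both bound {both:.0f}%, one unbound {one:.0f}%")
```

Rounded to whole percentages, this gives 42% and 23% for the 69 nt substrate and 18% and 17% for the 25 nt substrate, in agreement with the values shown in the diagrams.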
HS-AFM is not capable of identifying which domain remains in contact with the ssDNA and which is temporarily dissociated. Nevertheless, several lines of evidence allow us to posit that the CTD is the domain capable of transiently dissociating from the ssDNA. Computer analysis performed35 shows that the isoelectric point (pI) of the N-terminal domain (NTD) is 9.6, compared to 6.9 for the CTD. In addition, the number of aromatic amino acids in A3G essential for ssDNA binding is 9 for the NTD versus only 6 for the CTD. Taken together, these findings suggest tighter binding for the NTD than for the CTD. Also, more stable binding of the NTD with ssDNA than of the CTD has been reported.28,35,36 Moreover, it is demonstrated that the NTD is responsible not only for binding with ssDNA,35,36 but also for positioning and stabilizing active sites of the CTD for efficient deamination of ssDNA.37 Mutational studies38 suggest the following two steps for A3G binding with the ssDNA template: (1) initially, high affinity binding is carried out by the NTD with Kd in the nM range, (2) followed by the CTD with Kd in the μM range. In addition, the data obtained in ref. 39 and 40 have demonstrated that during A3G sliding, the CTD tends to dissociate from ssDNA. Therefore, we hypothesize that the CTD has greater conformational mobility compared to the NTD, and is capable of transiently dissociating from the ssDNA template.
Based on our data, we suggest a model where the substrate length is key in determining whether a dumbbell or globular structure will form on each ssDNA substrate. Fig. 9 illustrates such a model for long (A) and short (B) ssDNA substrates. The red ball represents the CTD, which forms a dumbbell structure and is unbound to the ssDNA, and the blue ball represents the NTD bound to ssDNA (state i). In this state (i), only the NTD is bound to the substrate, and A3G may dissociate from a long or short substrate with equal probability. This would explain the similar number of cases when only one domain is bound to the substrate for both long and short ssDNA substrates, 23% vs. 17%, respectively (Fig. 8C and D, orange area). If A3G does not dissociate, as in the case of a long substrate (A), the CTD may return to the substrate and preserve the dumbbell structure (grey arrows, state ii) with both domains bound to ssDNA; alternatively, the CTD may come close to the NTD to form a globular structure (purple arrow, state iii). In the case of a long substrate, A3G has a greater chance of holding the dumbbell structure with both domains bound to the ssDNA, as shown in Fig. 8C (blue area). Therefore, it is reasonable to propose that for a long substrate, the increased yield of dumbbell structures is primarily due to both domains being bound to the substrate. The situation differs for a short substrate (B). Here the CTD in state i may return to the NTD to form a globular shape (purple arrow, state iii), form a preserved dumbbell structure with both domains bound to the substrate (grey arrow, state ii), or remain dissociated from the substrate (orange arrow, state iv). For a short substrate, however, there is less possibility to preserve the dumbbell structure with two domains bound to the substrate, which comprises 18% (Fig. 8D, blue area), compared to 42% for a long substrate (Fig. 8C, blue area).
The conformational changes between domains, facilitated by an interdomain linker,28 are more easily achieved when A3G adopts a dumbbell structure and may facilitate functions of A3G such as sliding8,19,41 and intersegmental transfer16 and eventually the search for the deamination target of the ssDNA substrate. Our data demonstrate that one of the domains is capable of transiently dissociating from the substrate, and such dynamics may facilitate the search for the deamination target. Moreover, we suggest that the CTD is the domain that transiently dissociates from the substrate to facilitate this search. Based on our data, we posit that the dumbbell structure of A3G represents an active structure of the protein. Interestingly, a decrease in the yield for dumbbell structures with a short substrate correlates with the length dependence of deaminase activity of A3G.8,42,43 Indeed, it was shown8 that specific activity of A3G increases between 15 nt and 60 nt ssDNA lengths and remains unchanged thereafter. Despite the fact that both globular and dumbbell forms of A3G provide efficient binding with the ssDNA substrate, a correlation between length-dependence of deaminase activity and the yield of dumbbells supports our hypothesis that dumbbell structures of A3G represent an active form of the protein. Given that A3G is dynamic and in the extended dumbbell conformation occupies a space as long as ∼10 nm, this property of A3G is a factor that defines the interdomain dynamics of the protein. Indeed, 10 nm corresponds to an ssDNA length of ∼30 nt, and we did observe the decrease of dumbbell conformation for the 25 nt ssDNA substrate.
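The length argument in the last two sentences can be made explicit with a small calculation. The conversion factor below is simply the one implied by the statement that 10 nm corresponds to about 30 nt. It is an approximation, since ssDNA is flexible and its extension depends on conditions, and the names used here are hypothetical.

```python
# Rise per nucleotide implied by "10 nm corresponds to an ssDNA length of ~30 nt".
RISE_NM_PER_NT = 10.0 / 30.0

# Approximate extent of A3G in the extended dumbbell conformation (from the text).
A3G_EXTENDED_NM = 10.0


def ssdna_length_nm(n_nt: int, rise_nm_per_nt: float = RISE_NM_PER_NT) -> float:
    """Approximate length of an ssDNA segment of n_nt nucleotides."""
    return n_nt * rise_nm_per_nt


for n_nt in (25, 69):
    length = ssdna_length_nm(n_nt)
    fits = "can" if length >= A3G_EXTENDED_NM else "cannot"
    print(f"{n_nt} nt ~ {length:.1f} nm: {fits} span the extended dumbbell")
```

Under this estimate the 25 nt substrate (about 8 nm) is shorter than the extended dumbbell, while the 69 nt substrate (about 23 nm) is longer, which is consistent with the lower dumbbell yield observed on the short substrate.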
… : 1 ratio with a phosphorylated 23 nt oligo (Integrated DNA Technology, IA) to form a 20 bp DNA duplex with sticky ends. Later, the construct was ligated at 16 °C overnight with a previously gel-purified 356 bp DNA fragment with sticky ends. The ligated product was purified from the gel using a QIAquick Gel Extraction Kit (Qiagen) as described33 and re-suspended in TE buffer containing 10 mM Tris, pH 7.5, and 1 mM EDTA. The final product consists of the 69 nt ssDNA attached to a 379 bp dsDNA fragment as a tag.
… : 1 ratio and annealed with the bridge oligo. Next, the annealed product was ligated at 16 °C overnight. To remove the bridge oligo, the product was heated to 70 °C for 5 minutes and immediately placed on ice. Finally, the 69 nt gap DNA substrate was gel purified using a QIAquick Gel Extraction Kit (Qiagen), as described.33 The final product consists of 69 nt ssDNA flanked with 441 bp and 235 bp dsDNA, respectively.
… : 1 protein-to-ssDNA ratio in binding buffer containing 50 mM HEPES, pH 7.5, 100 mM NaCl, 5 mM MgCl2, and 1 mM DTT. The complex was incubated for 15 minutes at 37 °C before deposition on a mica surface. Fig. S1† schematically shows the positions of A3G on different ssDNA substrates.
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c9na00457b
This journal is © The Royal Society of Chemistry 2019