Mauricio Cortes Jr‡, Xindi Sun‡, Anusha, Emile Joseph Batchelder-Schwab, Jinyue Li, Naseem Siraj, Rishab Jampana, Yuchen Zhang, Yuntian Bai and Chengde Mao*
Department of Chemistry, Purdue University, West Lafayette, IN 47907, USA. E-mail: cortes5@purdue.edu; sun1208@purdue.edu; aanusha@purdue.edu; ebatchel@purdue.edu; li4195@purdue.edu; nkoomull@purdue.edu; rjampana@purdue.edu; zhangyuchen0620@gmail.com; bai121@purdue.edu; mao@purdue.edu
First published on 22nd May 2025
Accurate structure prediction is highly desirable for nanoengineering with DNA and other biomolecules. The newly launched AlphaFold 3 (AF3) provides a potential platform for this purpose. In this work, we have used AF3 to model a list of commonly used DNA nanomotifs and compared the AF3 structures with the experimentally observed structures reported in the literature. For asymmetric motifs, AF3 structures are consistent with the experimental observations; for symmetric motifs, however, AF3 structures are often substantially different from experimental observations. These failures can be rescued if the symmetric motifs are converted into corresponding asymmetric motifs by breaking the DNA sequence symmetry while maintaining the backbone symmetry. This study suggests that while AF3 is immensely helpful, we as experimentalists should use it (as it currently stands) with caution. In addition, AF3 needs further development to incorporate the existing experimental data on DNA nanostructures into its training dataset. At the current stage, a hybrid approach might be beneficial: theoretical modeling software calculates the detailed 3D DNA structures based on secondary DNA structures inspired by experimental observations.
New concepts
Programmable self-assembly of biomolecules, e.g., structural DNA nanotechnology, provides a superb approach for nanoconstruction. Its potential success critically depends on the structural prediction/design of the biomolecules. Such capabilities are generally missing. The machine-learning-based AlphaFold (AF) algorithm has demonstrated excellent capabilities for structural prediction/design for proteins. AF3, the newest version of AF, extends its capability to all major biomolecules (protein, DNA, and RNA) and could be an integrated algorithm for molecular design for all biomolecules. This manuscript provides a systematic evaluation of the application of AF3 to structural DNA nanotechnology. Based on this study, we provide our suggestions for the further development of such a modeling tool for structural DNA nanotechnology.
Fig. 1 shows the workflow for evaluating AF3 structural predictions of DNA nanomotifs, using a DNA 4-arm junction (4aJ) as an example.16–18 For the target DNA motif, a set of DNA strands with specific sequences was designed. With these DNA sequences as inputs, AF3 predicts the tertiary (3D) structure, from which the corresponding secondary structure can be derived. Note that the motifs used in this study have all been experimentally studied in the literature. Thus, the AF3-predicted structure can be directly compared with the structure observed in experiments for the same DNA motif. It is worth pointing out two additional issues in this study. (1) In this workflow, potential sequence-specific effects have not been considered. (2) In experiments, the buffers for assembly of DNA nanostructures generally include divalent cations, particularly Mg2+, to screen the strong electrostatic repulsion among negatively charged DNA backbones.16 Thus, we have included 10 Mg2+ and 10 Na+ in the modeling to reflect the experimental conditions, though the exact role and treatment of cations in the AF3 algorithm are not clear. The complete sets of data are provided in the ESI,† as Fig. S1–S31.
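The input step of this workflow can be sketched in code. The following is a minimal illustration, not the procedure used in this study: it assumes the job-file layout accepted by the public AlphaFold Server (a JSON list of jobs with `dnaSequence` and `ion` entries), and the arm sequences, the helper names (`four_arm_junction`, `build_job`) and the job name are hypothetical placeholders rather than the sequences of the 4aJ modeled here.

```python
import json

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def four_arm_junction(arms):
    """Build junction strands from one half-strand per arm.

    Strand k is arm k followed by the reverse complement of arm k+1
    (cyclic), so every arm becomes a duplex shared by two strands.
    """
    n = len(arms)
    return [arms[k] + reverse_complement(arms[(k + 1) % n]) for k in range(n)]


def build_job(name, strands, mg=10, na=10):
    """Assemble one job entry (assumed AlphaFold Server JSON layout)."""
    sequences = [{"dnaSequence": {"sequence": s, "count": 1}} for s in strands]
    if mg:
        sequences.append({"ion": {"ion": "MG", "count": mg}})
    if na:
        sequences.append({"ion": {"ion": "NA", "count": na}})
    return {
        "name": name,
        "modelSeeds": [],  # empty list: let the server pick the seed
        "sequences": sequences,
        "dialect": "alphafoldserver",
        "version": 1,
    }


# Hypothetical placeholder arm sequences (8 nt each), not from this study.
ARMS = ["GCACGAGT", "TGATACCG", "CCGAATGC", "GTCTGGAC"]
strands = four_arm_junction(ARMS)
job = build_job("4aJ_with_ions", strands)
print(json.dumps([job], indent=2))
```

Calling `build_job(..., mg=0, na=0)` would give the matching cation-free job, which is one way to set up a with/without-cation comparison.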
The AF3 results generally agree with the experimentally observed structures. For example, DAE is a classic DNA nanomotif and has been extensively used for various DNA nanoconstructions (Fig. 2).21,22 However, there is no direct structural study on this motif by conventional structural biology methods, such as X-ray diffraction, NMR, or cryoEM. AF3 provides a plausible, detailed structural model, which is consistent with all experimental observations so far and with our basic knowledge about DNA biophysics and structure. In DAE, all bases are in the correct base pairs and are organized into two duplexes that are roughly antiparallel and linked by strand crossovers. On the outside, four helical domains (boxed areas in Fig. 2c) significantly deviate from being parallel to minimize electrostatic repulsion between the negatively charged DNA backbones. Such structural features are consistent with numerous AFM observations of DAE-based DNA 2D arrays.22
Fig. 2 AF3 prediction of an antiparallel double crossover motif (DAE). (a) The motif design. (b)–(d) Three orthogonal views of the AF3-predicted structure.
The successful predictions include the following:
(i) Single multi-arm branched junctions: a 3-arm junction (3aJ, Fig. S1, ESI†),13–15 a 4-arm junction (4aJ, Fig. S2, ESI†),16–18 a 6-arm junction (6aJ, Fig. S3, ESI†),19 an 8-arm junction (8aJ, Fig. S4, ESI†),20 and a 12-arm junction (12aJ, Fig. S5, ESI†).20
(ii) Parallelly aligned multi-crossover motifs: an antiparallel double crossover with separation of an even number of half-turns (DAE, Fig. S6, ESI†),21,22 an antiparallel double crossover with separation of an odd number of half-turns (DAO, Fig. S7, ESI†),21,22 a triplex crossover (TX, Fig. S8, ESI†),23 a double 6-arm junction (D6aJ, Fig. S9, ESI†),25 a double 8-arm junction (D8aJ, Fig. S10, ESI†),25 a paranemic crossover (PX, Fig. S11, ESI†),26,27 and a 6-helix bundle (6HB, Fig. S12, ESI†).28
(iii) Double crossover-like (DXL) motifs: a symmetric DXL with 6-base pair (bp) separation (DXL-6, Fig. S13, ESI†),29 a symmetric DXL with 16-bp separation (DX-16, Fig. S14, ESI†),30 and an asymmetric DXL with 14/18-bp separation (DX-14/18, Fig. S15, ESI†).31
(iv) A parallelogram (Fig. S16, ESI†).32
(v) A symmetric 4-point star (s4PS) motif (Fig. S19, ESI†).25,36
(vi) A T-junction (Fig. S22, ESI†).39
(vii) A branched kissing loop (Fig. S23, ESI†).40
(viii) Polyhedra: a tetrahedron (Fig. S24, ESI†), and a cube (Fig. S25, ESI†).41,42
In addition to confirming the structures of well-established DNA nanomotifs, AF3 can make reasonable structural predictions for less characterized DNA nanostructures. Besides the common 3aJ and 4aJ, other multi-arm branched DNA junctions (6aJ, 8aJ, and 12aJ) have been designed and assembled.19,20 However, their 3D structures have never been experimentally studied. Here, we used AF3 to predict the 3D structure of a 6aJ (Fig. 3 and Fig. S3, ESI†). A 6aJ has six arms that stack pairwise with each other and are organized into three pseudo-continuous duplexes. Any two adjacent pseudo-continuous duplexes are rotated from each other by 60°. The minor groove of one duplex always fits into the major groove of the other duplex to minimize electrostatic repulsion. Most importantly, there is no open space at the center of the junction. All base pairs at the center are stacked with other base pairs and are not solvent accessible. Motifs 8aJ (Fig. S4, ESI†) and 12aJ (Fig. S5, ESI†) adopt similar conformations, and all resemble the structural features of 4aJ (Fig. 1). Such a conformation for multi-arm branched junctions has been speculated before; this AF3 modeling supports these speculations for the first time.
Fig. 3 AF3 prediction of a 6-arm junction motif (6aJ). (a) The motif design. (b), (c) Two orthogonal views of the AF3-predicted structure. (d) The corresponding secondary structure of the 6aJ.
For some DNA motifs, AF3 predictions are substantially different from the structures observed in experiments and contradict our biophysical knowledge about DNA molecules. One example is the symmetric DNA tensegrity triangle (Fig. 4 and Fig. S17, ESI†). The symmetric DNA tensegrity triangle is composed of three DNA duplexes that are connected through strand crossovers at three points, corresponding to the three vertices of the triangle. It contains one central strand, three identical red strands, and three identical, outer blue strands. With the inclusion of complementary sticky ends, the triangles can further associate with each other in three directions to form 3D crystals. Experimentally, tensegrity triangle crystals have been extensively studied by X-ray crystallography and the triangle structures have been solved.33 However, the AF3 structure is vastly different from the experimental results. The red strands cross over from one duplex to another in the AF3 structure; in contrast, the experimental observation shows the red strands running continuously along one duplex. The difference between the AF3 structure and the experimental data is dramatic, as both the secondary structure and the DNA topology have changed. The AF3 structure also does not make sense from the point of view of DNA biophysics: each strand crossover costs extra energy compared to a continuous DNA duplex. Such inconsistencies have also been observed for other DNA motifs, including the symmetric 3PS motif (Fig. S18, ESI†),34 the symmetric 5PS motif (Fig. S21, ESI†),37 and the symmetric 6PS motif (Fig. S22, ESI†).38 We attribute these failures to inadequate training data. AF3 is trained with high-resolution experimental data from X-ray crystallography, NMR, and cryoEM. Unfortunately, very little such experimental data is available for engineered DNA nanostructures. Instead, most available high-resolution data are for DNA/RNA structures with simple topologies, such as duplexes or single-stranded folds. Thus, multi-stranded DNA nanomotifs are challenging for AF3.
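The topological difference described above (a strand running continuously along one duplex versus crossing over to another) can be expressed as a simple count. The sketch below is illustrative only: it assumes that each nucleotide of a strand has already been assigned to a helix, which is a separate analysis step not shown here, and the two helix-assignment lists are hypothetical toy inputs, not data from this study.

```python
def count_crossovers(helix_of_nucleotide):
    """Count how often a strand switches helix between adjacent nucleotides.

    helix_of_nucleotide lists, in 5'->3' order, the label of the duplex
    in which each nucleotide of the strand is base paired.
    """
    pairs = zip(helix_of_nucleotide, helix_of_nucleotide[1:])
    return sum(1 for current, following in pairs if current != following)


# Hypothetical toy assignments for one 14-nt strand.
continuous_strand = [1] * 14           # stays on duplex 1 throughout
crossing_strand = [1] * 7 + [2] * 7    # switches from duplex 1 to duplex 2

print(count_crossovers(continuous_strand))  # 0
print(count_crossovers(crossing_strand))    # 1
```

Comparing such counts strand by strand between a predicted model and the designed secondary structure gives a quick flag for topology changes before any detailed 3D comparison.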
What structural features of a DNA nanomotif impact the performance of AF3? Across all AF3 modeling in this study, there is a general pattern. All asymmetric DNA motifs are correctly modeled by AF3. All the inconsistent modeling is associated with symmetric DNA motifs (in terms of both backbones and sequences). For example, AF3 gives an incorrect structural model for the symmetric 3PS (s3PS) motif (Fig. 5a–d and Fig. S18, ESI†). An s3PS motif contains a 3-fold rotational axis at the motif center; thus, the three branches are identical to each other. It is assembled from three unique strands: one black L strand with a 3-fold repeating sequence, three copies of the green strand, and three copies of the red strand. Each green strand is supposed to cross from one branch to another at the motif center (Fig. 5a). However, due to the 3-fold rotational symmetry, the green strand could, alternatively and wrongly, make a U-turn at the motif center and stay on the same branch (Fig. 5c). In both situations, all strands are (almost) fully base paired. The current version of AF3 apparently cannot distinguish between these two alternatives under such a complicated topology and predicts the wrong structure.
The above analysis prompts the hypothesis that AF3 will give correct models if a symmetric motif is converted into an asymmetric motif by breaking the sequence symmetry. We have tested this hypothesis, and the results support it (Fig. 5e–g and Fig. S19, ESI†). This provides a strategy to help the AF3 algorithm overcome the symmetry problem in structural modeling. Please note that this strategy relies on a general assumption: the exact sequence composition does not significantly change the 3D structures of DNA motifs as long as conventional Watson–Crick base pairs form.
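The idea of breaking sequence symmetry while keeping the backbone (strand layout) unchanged can be illustrated with a toy domain-level sketch. Everything here is an assumption made for illustration: the layout `LAYOUT` (one long strand spanning three domains plus three short complementary strands) is a simplified stand-in, not the actual strand routing of the 3PS motif, and the sequences are randomly generated placeholders.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_k_fold_repeat(seq, k):
    """True if seq consists of k identical consecutive units."""
    unit, remainder = divmod(len(seq), k)
    return remainder == 0 and seq[:unit] * k == seq


def has_sequence_symmetry(strands, k):
    """True if any strand occurs more than once or is a k-fold repeat."""
    duplicated = len(set(strands)) < len(strands)
    return duplicated or any(is_k_fold_repeat(s, k) for s in strands)


def realize(layout, domain_seqs):
    """Turn a domain-level layout into sequences ('x*' = complement of x)."""
    strands = []
    for strand in layout:
        parts = []
        for domain in strand:
            if domain.endswith("*"):
                parts.append(reverse_complement(domain_seqs[domain[:-1]]))
            else:
                parts.append(domain_seqs[domain])
        strands.append("".join(parts))
    return strands


def distinct_domains(names, length, rng):
    """Draw random domain sequences until all of them differ."""
    while True:
        seqs = {n: "".join(rng.choice("ACGT") for _ in range(length)) for n in names}
        if len(set(seqs.values())) == len(names):
            return seqs


# Toy layout: same backbone in both designs, only the sequences change.
LAYOUT = [["a1", "a2", "a3"], ["a1*"], ["a2*"], ["a3*"]]
rng = random.Random(0)
unit = "".join(rng.choice("ACGT") for _ in range(10))

symmetric = realize(LAYOUT, {"a1": unit, "a2": unit, "a3": unit})
asymmetric = realize(LAYOUT, distinct_domains(["a1", "a2", "a3"], 10, rng))

print(has_sequence_symmetry(symmetric, 3))   # True
print(has_sequence_symmetry(asymmetric, 3))  # False
```

Both designs share the same layout and strand lengths, so the intended base-pairing pattern is preserved; only the sequence degeneracy that allows alternative pairings is removed.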
The necessity of including sodium (Na+) and magnesium (Mg2+) in AF3 modeling of DNA nanomotifs is not clear. For most DNA nanomotifs that we have modeled using AF3, the modeling results are the same with or without 10 Na+/10 Mg2+. This is likely because AF3 is trained with data from experiments that already include such cations. Some exceptions exist, e.g., 12aJ (Fig. 6). Under both conditions, the 12 arms of the 12aJ stack pairwise onto each other to form 6 pseudo-continuous duplexes, akin to the 4aJ, and there is no open space at the center. However, the AF3 structure with Na+/Mg2+ (Fig. S5, ESI†) is packed more densely than the one without the extra Na+/Mg2+ (Fig. 6), consistent with the notion that cations screen electrostatic repulsion and allow the negatively charged backbones of DNA molecules to come close to each other.
In structural modeling, one important concern is reproducibility: will AF3 give the same structure for the same sequences in multiple, independent trials? To address this question, we used AF3 to model the DAE multiple times, as it is the most commonly used DNA nanomotif. The results show that AF3 models are highly reproducible (Fig. 7). Models from multiple rounds of AF3 prediction can be well superimposed, and the calculated root-mean-square deviations (RMSD) are in the range of 0.55–1.91 Å. The two helical domains between the two crossover points are nearly identical for all of the AF3 models. The variation mostly comes from the four helical domains beyond the crossover points.
Fig. 7 Good superimposition of AF3 models from eight rounds for a DAE motif. Every model is coded with a distinct color. (a)–(c) Three orthogonal views of the models.
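For reference, the RMSD between two models with N matched atoms is the square root of the mean squared distance between corresponding atoms. The sketch below shows how a range of RMSD values across several models could be tabulated. It is a simplified illustration under stated assumptions: the models are taken to be already superimposed with atoms listed in the same order (the superposition step itself is not shown), the comparison is done pairwise, which may differ from the procedure used for Fig. 7, and the coordinates are toy values, not AF3 output.

```python
import math
from itertools import combinations


def rmsd(coords_a, coords_b):
    """RMSD between two pre-superimposed models with matched atom order."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("models must contain the same, non-zero number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def pairwise_rmsd_range(models):
    """Smallest and largest RMSD over all pairs of models."""
    values = [rmsd(a, b) for a, b in combinations(models, 2)]
    return min(values), max(values)


# Toy coordinates (in angstroms) for three two-atom "models".
toy_models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)],
]
lowest, highest = pairwise_rmsd_range(toy_models)
print(f"RMSD range: {lowest:.2f}-{highest:.2f}")
```

In practice, the coordinate lists would be read from the predicted structure files and superimposed with a structural alignment tool before this calculation.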
AF3 is a universal modeling platform for all major biomacromolecules, including DNA and RNA. It allows modeling of DNA–RNA hybrid nanomotifs and RNA-only motifs (Fig. 8 and Fig. S27–S30, ESI†). To evaluate this feature, we used AF3 to model several such motifs. Fig. 8 shows the modeling of a DNA–RNA hybrid DAE motif and an RNA-only DAE motif, both of which have been experimentally used for nanoconstruction. Both DAE motifs are symmetric and each contains five strands: one long, central strand (L), two copies of the short, outside strand (S), and two copies of the medium, continuous strand (M). In the DNA–RNA hybrid DAE, the M strands are RNA and both the L and S strands are DNA. Thus, each helical domain is composed of one DNA strand and one RNA strand and is expected to adopt the A-form duplex conformation (Fig. 8a). The AF3-predicted structure is consistent with the experimental results. Equally successfully, AF3 has produced a structure that is consistent with the experimental data for an RNA-only DAE motif (Fig. 8b).
Footnotes
† Electronic supplementary information (ESI) available: Materials and detailed experimental methods; and a figure for additional AF3 structural predictions. See DOI: https://doi.org/10.1039/d5nh00059a
‡ Contributed equally.
This journal is © The Royal Society of Chemistry 2025