DNA brick self-assembly with an off-lattice potential

We report Monte Carlo simulations of a simple off-lattice patchy-particle model for DNA `bricks'. We relate the parameters that characterise this model with the binding free energy of pairs of single-stranded DNA molecules. We verify that an off-lattice potential parameterised in this way reproduces much of the behaviour seen with a simpler lattice model we introduced previously, although the relaxation of the geometric constraints leads to a more error-prone self-assembly pathway. We investigate the self-assembly process as a function of the strength of the non-specific interactions. We show that our off-lattice model for DNA bricks results in robust self-assembly into a variety of target structures.


Introduction
Self-assembling materials have been the subject of considerable scrutiny by researchers. [1][2][3] However, most selfassembling structures investigated thus far have been constructed using only a small number of distinct building blocks. The reason is that a system consisting of many different components usually fails to self-assemble due to self-poisoning. It was therefore rather surprising to the community when Peng Yin's group demonstrated that potentially thousands of distinct DNA molecules can reproducibly self-assemble into complex, fully addressable, nearly error-free target structures. 4,5 The prospect of such addressable complex self-assembly has captured the imagination of several groups, and much experimental and theoretical work has been undertaken to try and understand the principles and behaviours of such systems. [6][7][8][9][10][11] In the canonical DNA brick set-up, short single-stranded DNA molecules have sequences chosen in such a way that molecules with which they are designed to hybridise in the target structure are made to be complementary to each other. DNA molecules can hybridise whether or not they are completely complementary; however, the free energy of hybridisation depends strongly on the sequence and Watson-Crick pairing is much more favourable than other combinations of bases. Thus it is generally the case that 'designed' interactions (i.e. those interactions corresponding to hybridisation pairs that are present in the target structure, and which are designed to be complementary) are considerably stronger than all other ('incidental') interactions. This permits the large number of distinct DNA molecules to self-assemble into structures comprising potentially thousands of molecules, 4 although there is in principle an upper limit to the target structure size on entropic grounds. 6 To try to explain why selfassembly succeeded for such DNA-based structures whilst it failed for many other, similar -and often even considerably simpler -systems, we have recently developed simulationbased 12,13 and theoretical approaches [14][15][16][17] to studying the problem. The original experiments on DNA bricks 4 entailed short, 32-nucleotide single-stranded DNA molecules. Each molecule was divided into four domains, with each of the four domains designed to hybridise with a different neighbouring DNA molecule in the target structure. By designing which molecules hybridise with which other molecules, intricate target structures can be designed in a modular manner. 4 However, the choice of the length of the DNA molecules was not arbitrary: as domains form a double-stranded helix with a length of 8 base pairs, this results in a dihedral angle very close to 90 • , 4 since the normal ('B') form of DNA comprises a helix with 10.5 base pairs per turn. This creates a rectangular pattern of DNA helices, but the centres of mass of each of the singlestranded DNA molecules form a (distorted) diamond lattice. 4 We have previously used this fact to design a very simple 'patchy particle' potential, where each particle has four rigid tetrahedrally arranged patches, each with a distinct DNA sequence, to represent the four domains of a DNA molecule.
Our previous simulations with this simple lattice potential have confirmed the experimental hypothesis that nucleation plays a crucial role in the self-assembly process. The underlying idea is simple: at high temperatures, a dilute solution (effectively a 'vapour' phase) is thermodynamically stable, whilst at low temperatures, any incidental, undesigned interaction is favoured and large aggregates form instead of the target structure. At intermediate temperatures, incidental interactions are not yet dominant, but designed interactions are sufficiently favourable that the target structure can form. However, there is a free-energy barrier that prevents the target structure from forming en masse: the process is initiated by nucleation, which is crucial for the self-assembly process. Since the nucleation of a cluster involves crossing a not insignificant free-energy barrier, it is often described as a rare event. Crucially, this means that the clusters that become post-critical are on average very far apart from each other, meaning that they do not interact and do not have the opportunity to form larger incorrect structures. Moreover, the monomers do not all suddenly rush to form clusters, meaning that the monomers are not depleted too rapidly from the surrounding solution. 12,15 Intriguingly, although this process is not entirely unlike crys-tal growth, and crystals are well known to form by nucleation, the nucleation behaviour seen in DNA brick self-assembly is non-classical. Unlike in classical nucleation theory, where the nucleating cluster grows without limit once the maximum in the free-energy barrier has been crossed, in the self-assembly of finite structures, the fully assembled structure does not normally appear to be stable at the point at which the nucleation barrier just becomes surmountable. 15 The reason for this is that whilst there is only a single way to arrange all the particles in the target structure, there are numerous ways of constructing slightly smaller structures, since there are many distinct monomers which can be missing to arrive at a structure of a given incomplete size. At sufficiently high temperatures, this additional entropy wins over the enthalpic favourability of forming the target structure in full. DNA brick structures must therefore be prepared by following a cooling protocol: 15 once nucleation has occurred, the structure must be cooled further still in order to assemble the target structure to completion. We have gained a very considerable degree of insight by performing both lattice simulations and theoretical calculations. However, whilst we have begun to understand the underlying physics which permits DNA brick structures to form, there are several questions that remain unanswered. One particular weakness of the model we have previously proposed is the fact that we have assumed DNA molecules can only move on a lattice and can only adopt one of 24 fixed orientations. A similar constraint was applied in the theoretical approach. Clearly, such constraints have a significant effect on the entropy of the system, and it is therefore important to determine whether the self-assembly that is observed in our lattice model is robust when we go off-lattice. It is not at all clear a priori that just because a lattice model forms a finite ordered structure, an off-lattice analogue will as well: a lattice model cannot distinguish between a dense phase that is liquid-like and one that is crystalline. However, the difference between a truly ordered and simply a 'dense' structure is crucial in the study of selfassembly.
In this work, we propose a simple off-lattice potential that, while still very much a coarse-grained representation of DNA bricks, can capture more of the translational and orientational entropy of the structural building blocks. We present a general statistical mechanical derivation that results in a simple, yet realistic mapping of the model's parameters to experimental data. Finally, we show that such an off-lattice potential behaves in a way that is analogous to the lattice potential we had introduced previously and permits us to construct a variety of target structures in a similar manner to that already investigated, but with a few significant differences which we address below.

Matching an off-lattice potential to experimental data
One of the simplest possible off-lattice potentials that we can use to model the 'patchy' nature of the interaction of the DNA bricks introduced above is a Kern-Frenkel-type 18 potential, where ij is the interparticle vector of length r ij , i and j are the position vectors of particles i and j, respectively, and i and j are their orientations. This is effectively a square well potential, but with an additional angular dependence given by whereˆi α is the normalised position vector of patch α on particle i (see Fig. 1) and p is an optional penalty that can minimise the chance of generating interpenetrating lattices (in our simulations, p = 0 or p/k B = 100 K). 19 The parameter σ is the unit of length, whilst λ , ε and θ c are parameters yet to be determined. It is possible to parameterise a potential by finding a suitable mapping between the potential of interest and either experimental data or another potential for which the parameterisation is known; see e.g. Ref. 20. In this work, we consider the hybridisation of two single-stranded DNA molecules A and B to give a hybridised (double-stranded) molecule AB, for which the equilibrium constant can be written as where

the standard number density and
∆G -• is the standard Gibbs energy for the transformation given in Eqn (3) where 50 % of the monomers have hybridised. This hybridisation free energy can be obtained from the SantaLucia thermodynamic model. 21,22 We can write an equivalent expression to Eqn (4) for the simple Kern-Frenkel model presented above. Assuming that the solution of monomers and dimers is ideal, which is reasonable provided the concentration of each species is small, we can write the canonical partition function of each species x (where x can be A, B or AB) as where V is the volume of the container, N x is the number of particles of species x, Λ x is the de Broglie thermal wavelength of species x and q x is the internal partition function of species x. Note that the thermal wavelength of AB involves integrals over the momenta of both A and B and thus has the dimensions of area rather than length. Each of the chemical potentials can straightforwardly be calculated from the canonical partition function, r jβ At equilibrium, µ A + µ B = µ AB , which we can solve as We assume that the internal state of the monomeric units that bind is not affected by binding. We express this by setting the internal partition functions of the two monomers equal to unity, q A = q B = 1. 23 Moreover, because the AB molecule is described classically as a dimer of the A and B particles, the de Broglie thermal wavelengths cancel out, since the momenta of the two monomeric units in the dimer are uncoupled. The rotational partition function of the dimer is thus subsumed into the translational degrees of freedom of the constituting monomers, given that we integrate over the potential energy over all possible states; we include this contribution in the internal partition function q AB , which will therefore have dimensions of volume. We show in Appendix A that it is given by The equilibrium condition given by Eqn (7) can thus be written as Comparing this equation with Eqn (4) allows us to write Using typical dimensions of a DNA brick, 4 σ 3 ≈ 2.5 nm × 2.5 nm×2.7 nm, gives ρ -• σ 3 ≈ 10.1, leaving only the parameters λ , θ c and ε unaccounted for. Ideally, we might wish to choose ε = -∆G -• . However, at a reasonable bonding distance of λ σ = 2 1/3 σ = 1.26σ , Eqn (10) would lead to a patch width of θ c = 46 • . This is a very wide patch width, which would allow more than one simultaneous patch-patch interaction for any given patch, and thus lead to rather ill-defined structures. Instead, rather than fix ε and λ , we can set λ and θ c to reasonable values, for example θ c = 20 • and λ = 1.5. 24 We then find that ε = -∆G + 2.387 k B T. In other words, the energy of interaction now accounts for the fact that some entropy is being lost by constraining the bond angle.
The approach we have followed allows us to parameterise an off-lattice potential in a way that captures much of the fundamental physics of the system of interest without introducing a significant bias beyond that of the choice of the form of the potential. However, it ought to be borne in mind that the parameters are not uniquely determined by this mapping. In particular, λ , ε and θ c are interdependent. An unreasonably large choice of λ or θ c can mean that the assumptions we have made in the derivation can be inappropriate: for example, if more than one particle can bond to a single patch, the dimer assumption is clearly broken. By contrast, a very small patch width or cutoff radius can lead to exceedingly slow dynamics, and so the equilibrium situation may never be reached in simulations. The parameters must therefore be chosen with some consideration given to the practicalities of the required simulations.

Results
To verify that the model introduced above and parameterised to correspond roughly to experimentally-derived data represents a reasonable approach to simulating DNA brick selfassembly, we perform canonical Metropolis Monte Carlo 25 simulations with 'virtual moves' 26 accounting for the motion of clusters. Following the approach we have used with lattice simulations, 12 we have used umbrella sampling with adaptive weights, 27 with umbrella sampling steps performed every 200 000 Monte Carlo steps, 28 in order to determine the freeenergy barrier as a function of the size of the crystalline cluster in the system. Each particle type in the system has four patches arranged in a tetrahedral manner; each patch is assigned a random DNA sequence, but such that patches that point at each other in the target structure have complementary sequences. In every simulation reported here, a single instance of each particle type was placed in the simulation box, so that at most a single copy of the target structure can assemble.
The behaviour we observe is analogous to that seen in lattice simulations, and this in turn has been shown to correspond remarkably well to experimental results. 12,15 For example, we are able to self-assemble a range of relatively complex target structures in brute-force simulations, as shown in Fig. 2. The underlying behaviour we have proposed for this process in our previous work 12-16 is still predominantly unchanged: self-assembly in such systems is possible over a limited range of temperatures because of a free-energy barrier to nucleation that prevents immediate aggregation and monomer depletion. The structures shown in Fig. 2 correspond to some of the largest structures that spontaneously self-assemble in bruteforce simulations; while the majority of the target structure can be seen to have formed in each case, the structures are incomplete: as discussed above, the full target structure can be assembled by lowering the temperature after the nucleation process has taken place.
It is noteworthy that for a relatively short-ranged potential such as the one studied here, previous work suggests that the open diamond-like structure is only stable at low pressures and temperatures. 29 At the temperatures and densities we considered, the work of Romano et al. 30,31 suggests that for tetrahedral patchy particles with identical interactions, at equilibrium the mixture phase-separates into a gas and a diamond cubic crystal. In brute-force simulations of patchy particles where every particle is identical and all bonds equally strong, we find that the resulting phase is typically a vapour in equilibrium with a dense fluid, perhaps indicating that the nucleation barrier to forming a diamond-like phase is significant, as expected for patch widths as large as the one we are considering. 32,33 It appears that the fact that each particle is distinct and can only bond strongly with very specific other particles in the system plays a crucial role in enabling us to form tetrahedral structures even in conditions where single-component patchy particles cannot successfully self-assemble.
In addition to brute-force simulations, we have calculated free-energy barriers for small target structures (Fig. 3) in a range of conditions (Fig. 4). It is convenient in the first instance to compute the free-energy barrier for a system in which only the designed interactions are switched on, and they all have the same bonding energy. A free-energy profile for such a system is shown in Fig. 4(a). The behaviour observed is very similar to that seen in lattice simulations, and the basic features are essentially identical to those observed in lattice simulations 12 and in theoretical work: 14,15 the free energy initially increases with the cluster size, as the enthalpic gain of a single bond is insufficient to compensate for the entropic loss of binding a monomer from the vapour phase to the growing cluster. However, the completion of every 'cycle', i.e. a closed loop of particles that are bonded to one another, is a process in which two bonds are formed simultaneously, and this process is thermodynamically favoured. This gives the free-energy barrier as a function of cluster size a distinctive jaggedness, as the free energy decreases upon the formation of individual 'cycles' in the largest cluster. Although this behaviour is expected, the picture changes as interactions between patches that are not bonded in the target structure are switched on. We have investigated this behaviour further by studying a range of systems with pre-determined interaction strengths both for the 'designed' and the 'incidental' interactions (i.e. interactions that are present in the target structure and all other possible patch-patch interactions, respectively), whereby all designed patch-patch interactions contribute an energy of ε designed /k B = 4000 K and the incidental patch-patch interactions contribute an energy that ranges from ε incidental /k B = 200 K to 1800 K (where ε is defined in Eqn (2)), but in any one simulation all the designed and all the incidental interactions have the same strength. 34 Free-energy profiles for a selection of these systems are shown in Fig. 4(b). Clearly, the more significant the incidental interactions are, the smoother the free-energy profile becomes. The greater number of possible clusters in off-lattice simulations, including in particular cycles comprising fewer than six monomers, can stabilise the incomplete structures near the top of the free-energy profile in ways that are not possible in on-lattice simulations. Moreover, whilst both the vapour phase and the growing nucleus are stabilised by such incidental interactions, the growing nucleus is stabilised more, reducing the overall height of the nucleation free-energy barrier. 35 The free-energy behaviour of systems that can interact via incidental bonds is interesting because it demonstrates that the finer features of the free-energy profile can be lost when studying more realistic systems than the lattice potential we have previously used as a model for DNA brick self-assembly. Furthermore, because the free-energy barrier to nucleation is smaller for off-lattice systems including incidental interactions than it is for on-lattice analogues, the temperature window in which the nucleation barrier is surmountable but incidental interactions are still sufficiently weak for self-assembly to occur is likely to be even smaller than previously estimated. However, the key features of the non-classical nucleation behaviour we have identified previously remain: because the target struc- The nucleation free energy ∆A of the system relative to the vapour of monomers for a very small target structure of 26 particles [ Fig. 3(a)]. In (a), only designed interactions are included in the energy calculation. T = 318 K, ε designed /k B = 4000 K (or equivalently ε designed /k B T = 12.58). In (b), all designed interactions have a uniform energy of ε designed /k B = 4000 K (or equivalently ε designed /k B T = 12.31), while the incidental interaction strength varies as labelled (in units of k B T, ε inc /k B T is 0.62, 3.69 and 5.54). T = 325 K. Alternating line styles are used for the individual umbrella sampling windows and the initial brute force simulation. In (c), three free-energy profiles corresponding to the full interaction potential, with ε computed from the SantaLucia model depending on the DNA sequence of each patch, are shown: (i) corresponding to a 26-particle target structure [ Fig. 3(a)], and (ii) corresponding to a 56-particle target structure [ Fig. 3(b)]. The label 'D' means that only designed interactions were taken into account, whilst 'F' indicates that all interactions, designed and undesigned, were taken into account. (i) T = 308 K, (ii) T = 316 K. ture is fully addressable, there is only one possible target structure (even if it may now have more practical realisations because the particles are no longer fixed to lattice sites), whereas there are many possible ways in which to assemble partially formed structures. This means that, in conditions where a freeenergy barrier exists to prevent instantaneous nucleation, the target structure is not stable. In order to form the full target structure -which one can envisage is of crucial importance in experiment, where only the fully formed target structure may exhibit the functionality we desire -, it is still crucial that a selfassembly protocol be adopted, with the temperature gradually being reduced as the self-assembly proceeds. 15 Free-energy barriers for target structures simulated using the full potential described above, with interactions between any two patches, whether 'designed' or 'incidental', calculated using the longest complementary set of their associated DNA sequences, are shown in Fig. 4(c). Three free-energy profiles are plotted: the curves labelled (i)F and (i)D correspond to the same choice of DNA sequences, but differ in that the curve labelled D was computed in simulations where only designed interactions were taken into consideration, whilst the curve labelled F corresponds to the full interaction potential, including all incidental interactions. However, the incidental interactions calculated using the DNA sequences associated with each patch are quite weak, and including such weak incidental interactions only slightly stabilises high free-energy structures and thus somewhat reduces the free-energy barrier to nucleation. Finally, the free-energy curve labelled (ii)F in Fig. 4(c) corresponds to a system with a larger target structure. As the target structure size increases, the free-energy barrier to nucleation becomes noticeably smoother, since there are simply many more possible clusters that can form with the same number of building blocks.
The free-energy profiles shown here are not radically different from those we have previously reported for lattice simulations. While the free energy as a function of the largest cluster size is somewhat more difficult to interpret in such offlattice simulations, it remains the case that the self-assembly is controlled by nucleation, and brute-force simulations confirm that it is still possible to find conditions under which the free-energy barrier to nucleation is sufficiently small that nucleation can occur spontaneously, but large enough to be rate-limiting, as appears to be necessary for successful selfassembly.

Conclusion
In this work, we have introduced a very simple approach to obtaining a relatively sound parameterisation of a simple offlattice coarse-grained potential of DNA bricks. In particular, we have shown how a Kern-Frenkel-type potential can be fitted to the hybridisation free energy of two single-stranded DNA molecules that is known from experiment, which allows us to parameterise the potential with comparatively little effort. We have verified that an off-lattice model parameterised in this way gives a reasonable description of the self-assembly of DNA bricks.
The behaviour of DNA bricks that we previously studied using a lattice-based approach both in simulations and using a theoretical approach does not change significantly when simulated using this more realistic off-lattice potential, which helps to support the claim we have previously made that the majority of the underlying physics of self-assembly is captured by the simple patchy model we have previously studied. However, the different dependence on incidental interactions present in the system demonstrates that, not unexpectedly, the offlattice potential self-assembly is somewhat less robust than its on-lattice analogue. Moreover, comparing the lattice and offlattice approaches provides us with significant insight into the types of interaction that truly are fundamental and which can safely be coarse-grained away.
Although the computational model we have introduced is still very simple, its off-lattice nature allows us to relax the severe constraints on the geometry of the structures that were able to be assembled using our previous models. 12,13 The fact that the underlying self-assembly behaviour is not significantly different when studied using an off-lattice potential is very good news, particularly if a different experimental set-up were to be used to construct the kinds of many-component structures we have investigated. For example, DNA Holliday junctions and multi-arm motifs 36 could be used as building blocks instead of short single-stranded DNA; however, such structures might be expected to be much more floppy than canonical DNA bricks. Although our simulations show that relaxing the geometric constraints that the system must satisfy results in slower, more error-prone assembly, the target structures do in fact form reliably over a narrow range of conditions, which gives us some degree of confidence that alternative experimental strategies are certainly worth exploring in more detail.
In future work, it would be prudent to investigate the initial stages of the nucleation behaviour with a much more realistic model of DNA, perhaps of the order of complexity afforded by oxDNA, 37 to verify to what extent the predictions made by our simple coarse-grained potentials are reproduced by DNA. However, this will be a very challenging endeavour indeed, since more realistic potentials are prohibitively expensive to simulate over times sufficiently long to obtain representative behaviour.

A Dimer internal partition function
The internal partition function of the dimer corresponds to the volume available for bonding, which is a known result, 38 but we explicitly derive it here for reference. The internal partition function of the dimer depends on AB , the relative distance of the centres of the monomeric units. For notational simplicity, we define AB = r. We take into account the relative orientations by integrating over both A and B . Since the resulting integral is spherically symmetric in d AB , we can rewrite the volume element in spherical polar co-ordinates as d AB = dx AB dy AB dz AB = 4πr 2 dr, giving q AB = 4π r 2 dr d A d B e -β U .
To evaluate the remaining integrals over the orientational degrees of freedom, we define the Euler angles θ , ϕ and ψ as shown in Fig. 5. The angle θ measures deviations of the patch θ ϕ ψ Fig. 5 Definitions of Euler angles, where the neighbouring particle is implicitly assumed to be placed as in Fig. 1.
position from the interparticle vector; the angle ϕ measures the rotation about the patch position vector; and the angle ψ is the rotation of the patch vector around the interparticle vector. The ranges are thus θ ∈ [0, π], ϕ ∈ [0, 2π] and ψ ∈ [0, 2π].
The dot productˆA 1 ·ˆA B , which is used in the angular dependence of the Kern-Frenkel potential (Eqn (2)), is unaffected by rotations about either ϕ or ψ; this is to say that the projection of the patch vector onto the interparticle vector remains unchanged by either rotation. The potential energy in the Boltzmann exponent of Eqn (11) thus does not depend on either of these two angles, and we can therefore integrate them out; the orientational volume element is thus given by d = 1 2 sin θ dθ .
There are three possible scenarios to consider. Firstly, when r < σ , e -∞ = 0, so there is no contribution to the integral. Secondly, when r > λ σ , the dimer has dissociated, and so this also does not contribute to the internal partition function. The The upper limit for θ is θ c , the patch width; when θ > θ c , the particles no longer form a dimer, so we need not consider that situation. 41 The remaining integrals in Eqn (14) are readily evaluated, giving q AB = π 3 λ 3 -1 σ 3 (cos θ c -1) 2 e β ε , as used in the main text.