Self-assembled clusters of patchy rod-like molecules

Upon aggregation, particles lose part of their freedom of motion, since particles in a cluster have to satisfy the constraints imposed by cluster geometry (see Figure S1). These geometrical constraints can be used to approximate the configurational freedom of clusters. We start with describing the boundaries on rotation around the main axis, χi. From Figure S1, we can immediately see that the angle by which particles can rotate in the cluster can be approximated as


Introduction
The development of artificial building blocks which can be used in the self-assembly of large and complex structures remains challenging. 1,2The ability to control the formation of nanostructures by assembling biomolecules is of particular interest, due to their biocompatability and possible use in medicine.Nature offers a great deal of inspiration in the assembly of proteins into an incredibly large number of various nanomachines, tools, covers, and materials with exceptional properties.4][5][6][7] Therefore, it is important to explore and find the limits of protein assemblies and attempt to expand the available repertoire of protein folds.
There is a vast available space of possible sequences in protein design, which cannot be tested in experiments or simulations for peptides longer than 10 amino acids.An alternative method to explore the available assemblies is a top-down approach, where basic features and understanding are obtained via highly coarsegrained models, which can be further refined to sequence.We employed such a route and investigated how the aggregate structure depends on the shape and distribution of interaction sites on building blocks.To start with, we focused on amphiphilic helical peptides and related finite-size cluster assemblies called coiled-coils.

Simulations
2][13][14] The size of a spherocylinder is determined by its diameter, s = 1.2 nm, and length of the cylindrical part, l = 3 nm.The real space units of nm were chosen to connect to a-helices directly, but the results are valid to other particles with the same aspect ratios.The interacting patch is defined by the angular width a, which is numbered in subscript for spherocylinders with two patches.The angle between the two patches (their centers) is denoted as g (see Fig. 1A).In simulations, we varied the size of the patches and their position.For particles with two patches we kept both patches the same, a 1 = a 2 = 101, and only varied the angle between them.While the size of spherocylinders represents a typical amphiphilic a-helix with about 20 amino acids, the results are valid for any other particles with the same shape, aspect ratio, and the arrangement of interactions.
We used an implicit-solvent model, where interactions between attractive patches effectively include all interactions between the particles, e.g.hydrophobic, hydrogen bonds, saltbridges etc. Distance dependence of repulsive interactions between particles is represented by Weeks-Chandler-Andersen (shift and truncated Leonard-Jones) potentials which goes to a minimum at ffiffi ffi 2 6 p .The attractive part of the potential has a cos 2 distance dependence with a switching range of 0.3 nm, the distance at which the interaction smoothly goes from minima to zero (see Fig. 1B).Therefore, two particles start to have non-zero interaction at distance E1.6 nm between the central line segments.The switching range was chosen to represent a short-range hydrophobic interaction.The depth of attractive potential was set to À1.3333 e per unit length of interaction patch, i.e. two spherocylinders 3 nm in length have an interaction minimum of À4 e. Orientation dependence of interaction potential is a complex function calculated with line segments, which define the spherocylinder (all points with a distance smaller than 0.5s from the line segment are inside the spherocylinder).Attractive interaction between patches is calculated from lengths of line segments situated inside an angular wedge defined by each patch.For the detailed description of interaction potential, we refer the reader to the original paper. 11All the above parameters were the same in all simulations unless explicitly described.Unless stated otherwise, all systems were composed of 125 patchy speherocylindrical particles in a cubic box with dimensions of 25 Â 25 Â 25 nm 3 , which corresponds to a number density of 0.008 nm À3 and equal to a reduced density, 15 r* E 0.04, which is well below the nematic transition at r* = 0.60 for hard spherocylinders with an aspect ratio of 2.5.All simulations were done using periodic boundary conditions.
We studied the aggregation at different concentrations, temperatures, and interaction parameters such as the size of the interaction patch and range.All simulations were carried out using the Metropolis Monte Carlo method implemented in our in-house software (freely available at github.com/robertvacha/SC).
Simulations were done with an off-lattice model using translation and rotation moves.Single particle translations were done by random vectors from a homogeneous sphere of radius 0.212 nm.Rotations were done by rotating particles around the random axis on a unit sphere 16 with an angle in the range from 01 to 7.51.To enhance the sampling and convergence, all simulations were run in the NVT ensemble with parallel tempering.The temperature range captured compositions from those dominated by large aggregates to those with mostly monomers (0.1 to 0.65 kT/e).All simulations were started from random configurations.The systems were sampled for E10 8 sweeps after reaching a steady state.
Clusters were defined as aggregates, where all the particles interact, directly or via other particles, more strongly than a threshold of 1.4 e (see Fig. S6, ESI, † and Fig. 3).We defined different clusters based on the cluster geometry (see Fig. 2).In regular clusters, particles are aligned parallel to each other.8][19][20] Closed clusters are defined by patch saturation, i.e. all patches participate in a maximal number of interactions accessible to given patch in regular clusters.The advantage of the regular closed clusters is their symmetry, which makes them easier to analyze and understand.Irregular clusters form only a small portion of clusters in our simulations and should become negligible with increasing particle aspect ratio (see Fig. S8, ESI †).

Theory
The well-defined morphology of regular clusters enabled us to formulate a following theory for both closed and opened clusters.Let's assume an aggregation equilibrium of j particles, X: jX " X j .At low densities, the clusters can be considered as non-interacting, so we can use the law of mass action to express the equilibrium constant, K, as the ratio of cluster partition functions, , where n is the equilibrium cluster population and the subscript denotes the cluster size, z j is the cluster partition function, and z 1 is the partition function of a monomer.The cluster partition function is given in general form as follows:  where n ˆdenotes the orientation of particles, k ˆdenotes the angular orientation of the interaction patch around the axis of the particle, r denotes the position of a particle, U is the interaction energy, b = 1/k B T is the inverse temperature, where k B is the Boltzmann constant, and d n ˆj,dk ˆj,r j is a function equal to 1 for particles in the cluster and 0 otherwise.Eqn (1) could be formally rewritten such that we express the position of all the particles with respect to cluster center of mass.In that case we can separately integrate the center of mass of the cluster over the volume, and internal degrees of freedom of the cluster.Integral over volume then states (2) where v j is the volume accessible to the cluster of size j, s is the symmetry factor 21 that removes the double counting of cluster configurations which differ only by symmetry operations (particle exchange or rotation of particle by 1801 around patch vector), and z j is the internal configuration integral accounting for small fluctuations within the cluster.This approximation is satisfied for clusters with a well-defined structure, as in closed regular cluster where the particles interact with highly directional short range interactions.Internal configuration integral for spherocylinders could be approximated as where hEi is the mean interaction energy of the cluster, V i is the binding volume in which the particle can fluctuate within the cluster, w is the orientational freedom of the particle, and f is its rotational freedom (with respect to the particle axis).We assumed that all cluster conformations have the mean interaction energy and all the degrees of freedom of particles can be considered independently.Note that z 1 = 1.By substitution of eqn ( 2) and (3) in the law of mass action, one obtains an expression enabling one to calculate the population of all clusters with respect to monomers and to each other: For spherocylinders with one patch, we can approximate the internal degrees of freedom as a function of patch size: where a min is the minimal angle needed to assemble a cluster of a particular geometry (01 for the dimer, 601 for the trimer, 901 for the square tetramer, and 1201 or 601 for the rhomboid tetramer etc.), a i is the patch size, A i is the multiplicative factor including all other contributions independent of the patch size, g(x) = x for x 4 0 and 0 otherwise, and w is the scaling factor.The scaling factor captures deviation from the idealized behavior where all three degrees of freedom of particles in a cluster are independent (w = 1).Thus when certain rotational degrees of freedom are coupled or substantially reduced the value would decrease.For more details on this approximation, see the ESI.† For spherocylinders with two patches, the angle between the patches g defines the optimal cluster geometry (instead of a min for one patch spherocylinder) and min[a 1,2 ] defines the rotational freedom instead of a i À a min for one patch, so where Dg i is the deviation of the angle between the patches from the optimal arrangement.The above assumptions were validated by agreement between the theory and the simulation data at different concentrations and with different patch sizes (see the ESI †).The theory thus could be used to extrapolate the system composition under different conditions or to derive relative populations of cluster morphologies under given conditions (see Fig. S10, ESI †).

Particles with one interaction patch
Firstly, we investigated the behavior of systems composed of particles with one patch.The structure diagram determined as a function of the patch size and temperature is displayed in Fig. 4. The particles formed two types of aggregates: finite-size closed clusters and fibril-like open clusters (see Fig. 2).We focused on the finite-size clusters (see Fig. 3) because fibrillar structures occurring at low temperatures were the subject of our previous work. 11orphologies and amounts of clusters depend on the size of the interaction patch and the interaction range.The depicted points in the diagram correspond to conditions where the equilibrium populations of two cluster types are the same.The theoretical dependence fit the simulated points well.For short-range interactions, we obtained clusters up to tetramers in size with an occasional small number of larger non-regular clusters.The system at low temperatures is dominated by dimers up to a patch size of 601, followed by trimers up to 1201, and tetramers up to 1801.There is a 'triple' point where the population of monomers, dimers, and trimers is the same.It might be surprising that the size of the clusters does not follow a geometrical boundary of 901 for the tetramer.The explanations are in the cluster morphology which due to the short-range attraction is rhombic rather than a regular square (see Fig. 3).In the rhombic tetramer, two particles in the middle interact with the other three particles, so they need to have a patch of 1201, which is in agreement with our structural Fig. 3 Schematic representation of clusters observed in simulations for particles with one patch.The particle body is represented as a blue circle, while the interaction patch is shown as a yellow sector.The number of interactions per particle in a cluster is depicted by black dotted lines.
diagram.Close to the patch size boundaries of 601 and 1201, the particles can change the cluster morphology with temperature or interaction strength (see Fig. S10, ESI †).Note that since equipopulation conditions between dimers and trimers are achieved at higher temperature, then under the equipopulation conditions between monomers and trimers there are no conditions where the system is dominated by dimers for particles with a patch larger than a E 701 (the system is mainly composed of monomers at high temperatures and upon cooling it becomes dominated by trimers).
When the interaction is farther reaching, new cluster structures of square tetramers, cross tetramers, and pentamers appeared (see the structure diagram in Fig. 4B).Square and cross tetramers are formed for patches larger than 901, when the interaction reaches across the diagonal of a square ( ffiffi ffi 2 p diameter).This can be seen by comparison of Fig. 4A and B, where the interaction range was increased from E1.64 nm to 2.3 nm (E1.92 Â particle diameter).Despite the similarity in the arrangement of particles in cross and square tetramers, they behave differently due to the differences in their interaction patterns (see Fig. 3).While each particle interacts with all other particles in the square tetramer, each particle interacts with only two other particles in the cross tetramer.Effectively cross tetramers are composed of two dimers connected by interactions across diagonals of the square.Therefore, cross tetramers are less stable than square tetramers due to enthalpy and less stable than trimers due to entropy.As a result cross tetramers are abundant only in the small region of parameter space connected to dimers (see Fig. 4B).
The square tetramers are the most common tetramers as they are more stable than other tetramers in both enthalpy and entropy (see the number of interactions and the possibility of rotations in Fig. 3).Pentameric clusters were observed for patch angles larger than 1081, where regular pentamers compete with square tetramers.Pentamers are favoured by enthalpy and disfavoured by entropy compared to square tetramers, so pentamers were observed only at low temperatures.However, the situation should change at high densities, where the loss of translation entropy in pentamers decreases and they should become more stable than square tetramers (see Fig. S2, ESI, † for the concentration effect).Finite clusters larger then pentamers appeared occasionally at low temperatures.A further increase of the interaction range can lead to abundant larger finite clusters.However, such a range would be larger than two particle diameters, which would have to be modelled explicitly to avoid the unrealistic possibility of two particles interacting through a third particle.

Particles with two interaction patches
Another way to form larger finite clusters (hexamers, etc.) is to use particles with two separate patches forming a cluster with a hydrophilic core.For simplicity, we simulated particles with two symmetric patches with a size of 101.At this size, each patch accommodates an interaction with one other particle, resulting in closed or open chain-like clusters (see Fig. 2).The calculated structure diagram is depicted in Fig. 5.We found that various regular closed clusters could be formed, depending on the angle between the patches, g.The clusters of each size are the most stable when the angle between the patches corresponds to the inner angle of a regular polygon.We simulated systems with g of up to 1301, where we observed regular clusters with sizes up to tetramers dominated the system.Pentamers and hexamers were obtained in lower yields.Heptamers or octamers were obtained only rarely.The region of g angles, where the clusters can be formed, is determined by the patch size, a (see Fig. S9, ESI †).When the patch size is narrow (e.g.simulated 101) the equipopulation lines might not overlap creating an interesting region, where no closed clusters are formed and the system is composed of open chains (gray area in Fig. 5).Decreasing the overlap for larger clusters could be used to increase their yield.

Discussion
The model parameters used in our simulations were chosen to represent short amphiphilic a-helices aggregating in coiledcoils.We found that the general structural features of the clusters obtained in our simulations resemble those of coiledcoil clusters found in nature. 22Coiled-coil clusters up to pentamers could be represented by single-patch particles with a hydrophobic core.Pentamers are only available when the interaction is farther reaching, which corresponds to the presence of bulky residues, in agreement with known structures. 23Pentamer and hexamer coiled-coils with a hydrophilic or hollow hydrophobic interior are better represented by our two patch model.Our results are in very good agreement with previously estimated phase boundaries for the formation of finite clusters by one patch spherocylinders. 11Our results can be compared to extensively studied models of spherical patchy particles, where finite-size clusters and micelles were also observed at low concentrations and high temperatures. 248][19][20] In agreement with our findings, the same cluster morphologies of trimers, tetramers, and long 1D aggregates were observed at low temperatures for one patch spherical particles with a patch width E1801 and short range interactions (see Fig. 4A). 18This suggests that increasing the interaction range and decreasing the patch width could result in new cluster morphologies as we have found (see Fig. 4B).A comparison of our results to spherical patchy particles in 2D suggests that patchy spherocylinders shall assemble into quasi 2D crystals.However, the third dimension of spherocylinders could result in the closing of the 2D crystal in a large spherical cage, which was reported for amphiphilic helices similar to patchy spherocylinders. 25nfortunately, the scale of such structures cannot be reached by our simulations.
The derived theory fits the simulated data very well and can be used to determine the system composition under various conditions (e.g.concentrations, temperatures, interactions), where assumption of non-interacting clusters holds.It can also be used to calculate the conditions of the highest yield and evaluate the particle interaction parameters necessary for specific clusters.Our results suggest that very specific or directional interactions are needed to obtain larger clusters than pentamers.Such specific interactions include farther reaching interactions that can be realized in the form of hydrophobic chains for smaller particles or farther reaching depletion forces for large particles in specific solvents.For larger clusters, it would thus be advantageous to use more complex building blocks than single a-helices.The addition of end-to-end interactions would result in the formation of fibrils with the cross-section geometry of the clusters described above as previously observed. 11Furthermore, for particles connected by a flexible linker with a patch width of 651 or 1251 vitrimer like polymers might be obtained. 26,27

Conclusion
In summary, the aggregation of rod-like particles with one and two interaction patches was studied using Monte Carlo simulations and a derived theory based on statistical mechanics.The obtained system compositions at various temperatures and various particle parameters are displayed in the form of structure diagrams focused on clusters of different sizes and morphologies.We found that the larger the clusters, the more difficult it is to make them, because the region of parameters where they dominate the system is smaller.Clusters up to tetramers in size can be formed relatively easily by one-or two-patch particles.Pentamers need either farther reaching interactions and/or and two patches.Hexamers were only obtained in larger quantities for two-patch particles and in coexistence with other clusters.This suggests that larger clusters require more complex building blocks or very specific and/or directional interactions.Such interactions could also influence the accessibility and morphology of the cluster, which we have shown on tetramers with the morphology change from rhomboid to square geometry upon the increased interaction range.
The simulated data fit our analytically derived theory, which suggests that the theory could be used to extrapolate, or predict data from experiments.The employed spherocylinder model was parametrized to mimic amphiphilic a-helices, and our results are in very good agreement with naturally found coiled-coils.However, our model is generic, so the findings are generally applicable to various assemblies of prolate-shaped patchy particles including functionalized nanotubes, rod-like viruses, or even macroscopic objects.Note that the one patch particles with patch sizes close to the structure boundaries of 601 and 1201 can be used to switch between the aggregate structures.

Fig. 1 (
Fig.1(A) Schematic representation of a patchy spherocylinder with one and two interaction patches.Angles defining the patch size, a, and angle between the patches, g, are depicted.l is the length of the cylindrical body of the spherocylinder and s is its diameter.Note that interaction patch covers only the cylindrical part, not the end caps.(B) Distance dependence of interaction potential for particles with short (green line) and longer (blue line) range interactions.d 1 and d 2 are distances at which the potential starts to shift to zero for short and longer range interactions, respectively.l corresponds to the distance at which cos 2 function shifts to zero.

Fig. 2
Fig. 2 Types of distinguished clusters: (a) regular clusters (parallel particles), (b) irregular clusters, (c) closed clusters, and (d) open clusters.The spherocylinder body is shown in blue and the direction of the interaction patch is displayed as a yellow spherocylinder.

Fig. 4
Fig. 4 The structure diagram of the clusters formed by one patch spherocylinders with short-range (A) and farther range (B) interactions for various patch sizes.Areas in the graph are colored based on the most abundant cluster morphology found in simulations at given temperatures and at reduced density r* E 0.040.Every point, apart from circles, represents an equipopulation point where the population of two cluster types is the same in the simulation.Numbers in the legend refer to the size of clusters, N represent all clusters larger than all clusters explicitly mentioned, 4D refers to cross tetramers, and 4S corresponds to square tetramers.Empty and filled squares are points obtained for systems with 125 and 500 particles, respectively.The ''triple point'', where the populations of monomers, dimers, and trimers are the same, is at a = 701 and T E 0.28 for short-range interaction and at a = 801 and T E 0.40 for farther reaching interaction.Circular points represent single-point simulations to sample the most abundant clusters at low temperatures.Error bars represent one standard deviation in cluster population fluctuation.Lines are the equipotential curves based on the theory and fit parameters derived from equipopulation lines between monomers and clusters (fitted parameters are in the ESI †).The equipopulation lines between clusters of different sizes were calculated from theory using fitted parameters for equilibrium populations with monomers.

Fig. 5
Fig. 5 Structure diagram of particles with two symmetric patches of 101.Areas in the graph are colored according to the most abundant cluster morphology.Squares and triangles are equipopulation points where the populations of two cluster types are the same in the simulation.Squares are for the same populations of monomers and closed clusters.Triangles stand for the equipopulation points between closed clusters and all other clusters in system.Circles stand for the lowest simulated temperature.The circles are filled for the systems dominated by closed clusters.Lines are theoretical fits (see the ESI † -structure diagram fit).Error bars represent one standard deviation in cluster population fluctuation during the simulation.In the gray region, no regular clusters were obtained.