Molecular dynamics simulations reveal disruptive self-assembly in dynamic peptide libraries. Organic

There is signi ﬁ cant interest in the use of unmodi ﬁ ed self-assembling peptides as building blocks for func-tional, supramolecular biomaterials. Recently, dynamic peptide libraries (DPLs) have been proposed to select self-assembling materials from dynamically exchanging mixtures of dipeptide inputs in the presence of a nonspeci ﬁ c protease enzyme, where peptide sequences are selected and ampli ﬁ ed based on their self-assembling tendencies. It was shown that the results of the DPL of mixed sequences ( e.g. starting from a mixture of dileucine, L 2 , and diphenylalanine, F 2 ) did not give the same outcome as the separate L 2 and F 2 libraries (which give rise to the formation of F 6 and L 6 ), implying that interactions between these sequences could disrupt the self-assembly. In this study, coarse grained molecular dynamics (CG-MD) simulations are used to understand the DPL results for F 2 , L 2 and mixed libraries. CG-MD simulations demonstrate that interactions between precursors can cause the low formation yield of hexapeptides in the mixtures of dipeptides and show that this ability to disrupt is in ﬂ uenced by the concentration of the di ﬀ erent species in the DPL. The disrupting self-assembly e ﬀ ect between the species in the DPL is an important e ﬀ ect to take into account in dynamic combinatorial chemistry as it a ﬀ ects the possible discovery of new materials. This work shows that combined computational and experimental screening can be used complementarily and in combination providing a powerful means to discover new supramolecular peptide nanostructures.


Introduction
Peptide-based nanostructures have been studied over the last few decades as potential new supramolecular materials with promising applications in biomedicine and nanotechnology. 1There is increased interest in the use of unmodified, short peptides, as short as dipeptides, which form nanostructures spontaneously from minimalistic, easy to synthesize, building blocks. 2 Peptide derivatives, such as peptide amphiphiles (PAs) 3 and aromatic peptide amphiphiles (APAs), 4 take advantage of aliphatic or aromatic non-peptide groups, respectively, to enhance the self-assembling tendency of a peptide, allowing tripeptides, 5 dipeptides 6 and some modified single amino acids 7 to form nanostructures spontaneously in solution.
Clearly, peptides are attractive building blocks for nanostructures due to the chemical diversity of amino acids, which can in turn result in diverse structures and functions of the resulting self-assembled systems.
During the last few years, an effort has been made to search for the sequence space for minimalistic unmodified selfassembling peptides.2a,b,8 Frederix et al. employed coarse grained molecular dynamics (CG-MD) simulations using the MARTINI force field to investigate the self-assembling propensity of the 400 dipeptides and 8000 tripeptides.8c,d Firstly, the dipeptide work was used to demonstrate that CG-MD simulations are able to reproduce known supramolecular nanostructures.Once validated, this methodology was applied to map the space of the more numerous tripeptides and this led to the discovery of four new tripeptide hydrogelators, which were verified experimentally.8c In all of the gel-forming systems, the predicted self-assembled structure corresponded to nanofibers.
Complementary to the computational mapping of the sequence space, Pappas et al. implemented an experimental methodology to discover new self-assembling pure peptides using dynamic peptide libraries (DPLs). 9Although the use of DPLs has been previously implemented to study the different self-assembling tendencies of closely related self-assembling peptides, 10 this represented the first example of DPLs that allowed both exchange of the amide bonds and elongation of the peptide chain in both directions, with the aim to discover new, purely peptidic self-assembling molecules.Pappas et al.'s experiments consisted of exposing unprotected dipeptides to a nonspecific protease, which can break and form amide bonds.Although thermodynamically amide hydrolysis is favoured over condensation (in aqueous media), the energetically downhill self-assembling process in itself can overcome the bias for hydrolysis and drive the process to enhance the formation of the most stable self-assembling molecules.In these experiments both the conditions and the starting dipeptide composition were modified to discover a number of new self-assembling peptides, such as F 6 , L 6 , W 4 , F 2 L 2 and (FDFS) 2 . 9owever, some of the results in the experiments from Pappas et al. deserve further attention (Fig. 1). 9In particular, mixing of F 2 (30 mM) with thermolysin (Exp F ) shows a major formation of F 6 (75%).A similar result is observed when L 2 (30 mM) is mixed with thermolysin (Exp L ), where the major product is also the hexapeptide, L 6 , but with a lower yield (60%) compared to F 6 in the previous experiment.However, in a competitive dipeptide experiment, mixing F 2 (15 mM) and L 2 (15 mM) with thermolysin (Exp F-L ) there is no peptide that shows yields of formation higher than 5%, including those that previously yielded over 50% in the non-competitive experiments. 9The reduction in yield can, at least in part, be explained by simple mass action and equilibrium considerations through the use of reduced concentrations, but there may also be additional interactions between library components that could affect the outcome.In particular, the absence of significant formation of F 6 caught our interest.Hence, we decided to use CG-MD simulations to gain an understanding of the variations in self-assembling tendencies for mixtures of components, i.e., assessing the possible coassembly between the species in the mixture.
While the validity of CG-MD to study the self-assembly of pure peptides has been validated by Frederix et al., these systems consisted of di-and tripeptides, 8c,d and hence, this study will also assess the validity of the MARTINI force field for the study of longer self-assembling peptides.Furthermore, while the validity of the MARTINI force field for the co-assembly of multiple peptide species has been previously demonstrated by Guo et al., this study investigated the co-assemblies of peptides containing only phenylalanine, F 2 and F 3 . 11The current study extends the validation of the MARTINI force field for peptide co-assemblies containing homopeptides with different amino acids, in this case L and F.
Therefore, in this study CG-MD simulations with the MARTINI force field will be implemented to: (a) test its ability to reproduce the experimental results of Exp F and Exp L in order to evaluate its use as a predictive tool for peptides longer than three amino acids; (b) assess the ability of this method to model the co-assembly of multicomponent systems; and (c) to improve the understanding of the experimental results presented by Pappas et al. and determine why F 6 is not formed in Exp F-L .
The study starts with the CG-MD simulations of the three species that give significant yields in Exp F and Exp L .This is used to assess the ability of the model to reproduce the selfassembling behaviour of these peptides.The simulations were made at two concentrations to compare and evaluate the differences in the self-assembling tendency of F 6 and L 6 .After this, simulations that combine F 6 and L 2 at different compositions are presented to show how the presence of L 2 modifies the self-assembling tendency of F 6 to explain why this molecule does not appear as a thermodynamically favoured product in Exp F-L .

Systems
Zwitterionic peptides coordinate files were created in Avogadro 12 and converted to CG representation in the MARTINI force field (version 2.2) 13 using martinize.py. 14Zwitterionic peptides were used as it is the expected protonation state in solution at pH 7, which will determine the initial propensity of the peptides to aggregate or not.This approach is consistent with that used in previous studies on di-and tripeptides by Frederix et al. 8c,d While there is evidence that upon self-assembly the local environment of the peptides can result in variable protonation states of the residues, 15 the model is not able to capture these changes and as such the degree to which the final nanostructures obtained are consistent with the experimental structures can be affected by the local environment.However, in this work we focus on the initial ability of the systems to interact and aggregate into stable structures and the protonation state of the residues in the non-aggregated state will be a critical parameter for this.The coiled coil (C) secondary structure was used, which differs from the previous examples on di-and tripeptides where the extended (E) secondary structure was used.The main difference between C and E is the lack of a torsion potential to limit backbone flexibility in the former one.Therefore, this choice is less relevant in the previously studied systems but for those in this study C ensures the maximum flexibility of the tetraand hexapeptides.Previous studies showed that although a PA system (formed by 13 amino acids) experimentally showed the presence of β-sheet conformation (E in the MARTINI model), this was found not to be a deterministic parameter in the simulations. 16Therefore, if this parameter was not key for a PA formed by 13 amino acids, it can be expected to be less influential for the tetra-and hexapeptides of this study.
All systems were built in a cubic box 21.5 × 21.5 × 21.5 nm 3 and solvated using CG water.The first simulations were built at a constant concentration of amino acids, all the systems contain 3600 L or F. Therefore, the F 2 and L 2 systems consist of 1800 molecules, F 4 and L 4 consist of 900, and F 6 and L 6 of 600.This results in a model system concentration ca.ten times greater than the experimental conditions for the dipeptides (300 mM), which is consistent with previous work in this area.8c, 17 The tetrapeptide (150 mM) and hexapeptide (100 mM) systems correlate in the same way to the experimental concentrations while also assuming 100% conversion.The co-assembly systems were composed of F 6 and L 2 as shown in Table 1 (F-L 100-1500 , F-L 200-1200 , F-L 300-900 , F-L 400-600 and F-L 500-300 ), making the number of amino acids constant for all the simulations.Extra control simulations were run to assess the concentration effect and serve as a reference for the co-assembly systemsthe composition of the controls corresponds to the co-assembly systems with no L 2 (Table 1: F 100 6 , F 200 6 , F 300 6 , F 400 6 and F 500 6 ).This amount of product implies, for the total conversion of F 2 , initial concentrations of 50 mM, 100 mM, 150 mM (as used under the experimental conditions), 200 mM and 250 mM.
For some of the systems simulations were run doubling the concentration (600 mM for dipeptides, 300 mM for tetrapeptides and 200 mM for hexapeptides).These were built in half of the volume (17.1 × 17.1 × 17.1 nm 3 ) to keep the number of molecules constant.This was made to ensure that the effects were due to the higher concentration and not due to an increment in the number of molecules.
All visualizations were rendered using VMD. 18

Simulations
Simulations were carried out in GROMACS version 4.5.3. 19imulations used periodic boundary conditions.Lennard-Jones interactions shifted to zero in the range 0.9-1.2nm, and electrostatic interactions in the range of 0.0-1.2nm (using no Particle Mesh Ewald method).For screening the electrostatic interactions, a relative dielectric constant of 15 was used.In all the cases, the box was firstly minimized for 5000 steps or until forces in atoms converged under 200 pN.The box was then equilibrated for 500 000 steps using Berendsen algorithms to keep temperature (τ T = 1 ps) and pressure (τ P = 3 ps) around 303 K and 1 bar (NPT ensemble), respectively. 20A 25 fs per step was used.Simulations were run for 100 000 000 steps which corresponds to 10 µs effective time.13a,21 Analysis Co-assembly and control simulations were analysed using the radial distribution function (RDF) to assess the change of the molecular order on these systems.For the analysis, the third backbone bead of the F 6 molecules was used as is, with the fourth, the closest to the centre of mass.Root mean square deviation (RMSD) analysis was used in the non-competitive self-assembly simulations to assess the differences in mobility of the molecules with time.All analyses were performed in GROMACS 4.5.3. 19

Results and discussion
Non-competitive peptide self-assembly The simulations show the formation of nanostructures for the three F-containing peptides (Fig. 2a-c).Only even-numbered peptide lengths (2, 4, 6) were considered in this study as a clear preference for this behaviour was shown by Pappas et al. in their experimental work, which could be related to the formation of equal numbers of H-bonds on both faces, thus favouring unidirectional assembly for even-numbered peptides only.The snapshots show the well-known tubes for F 2 (Fig. 2a and d) 2a,22 and the F 6 fibres (Fig. 2c), recently discovered. 9In addition, the simulations show that F 4 also forms a fibre-like structure (Fig. 2b), which is consistent with the previous experimental work which reported the ability of F 4 to form nanostructures. 17b This demonstrates the ability of the MARTINI force field to reproduce experimental self-assembly results for tetra-and hexapeptides, and, to our knowledge, this is the first time this has been shown for peptides more than 3 amino acids in length.While both F 6 and F 4 form fibre-like structures, they show key differences.F 6 (Fig. 2c and f ) looks similar to the fibres shown by Frederix et al. 8c and by Scott et al., 23 with no clear patterns on the surface.However, F 4 on a closer inspection (Fig. 2e) shows a clear pattern, which reveals a bilayer-like arrangement situating the aromatic side chains (in white) between the layers of well-aligned backbone beads, in order to decrease the contact of the aromatic sidechains with water.The alignment of the backbones can be seen in the whole F 4 fibre (Fig. 2b) indicating that the fibre is composed of aggregated tapes.These differences are important as, experimentally, F 6 is thermodynamically favoured over F 4 and thus these structural insights highlight the key features of a nanostructure which provide thermodynamic stability.The patterns observed at this concentration for F 4 are found in the concentrated simulations of F 4 and F 6 (Fig. S9 and S10 †).The concentrated F 4 shows no movement through 9 µs (1-10 µs) which suggest that the molecules adopt a solid/crystalline-like structure.The RMSD analysis confirms this lack of movement, more precisely after 2 µs (Fig. S14c †) for F 4 but also for F 2 , while F 6 shows some fluctuations through the whole simulation.These results suggest that F 4 and F 2 have a higher tendency to precipitate than F 6 despite their lower hydrophobicity.
The L-containing peptides do not show formation of onedimensional nanostructures (Fig. 2g-i).The L 2 dipeptides do not aggregate, probably due to their higher solubility, relative to F 2 , (Fig. 2g).Although this is not consistent with the fibres shown by TEM, 9 as no cryo-TEM was reported for this system, it cannot be guaranteed that the L 2 fibres do not appear as a result of drying in the TEM sample preparation.The broad and weak amide I signal in the FT-IR of the initial L 2 solution also does not suggest highly ordered H-bonding, which is typically reported for short peptide fibers.2a,8d At these concentrations, simulations show spherical aggregates for both L 4 and L 6 (Fig. 2h and i).L 4 is not favoured in the experiments, suggesting that the self-assembling driving force is not strong enough or at least weaker than in the case of L 6 .Therefore, a lower tendency to form nanostructures correlates with a lower driving force and hence, the simulated L 4 result (Fig. 2h), showing the formation of an aggregate and not a fibre, is consistent with the low formation of L 4 in the experiments.However, at this concentration, the L 6 aggregate (Fig. 2i) is not consistent with the experimental results, which showed fibres.However, upon concentrating the system, both L 4 (Fig. 2k) and L 6 (Fig. 2l) form fibrous nanostructures, while even at higher concentrations L 2 does also not form aggregates in these simulations.This observation suggests that fibre formation is not simply a consequence of concentration but relates to specific, peptide length-dependent molecular interactions.The L 4 and L 6 fibres do not show patterns as those observed for F 4 in the less concentrated simulations.Nevertheless, there is a difference between the concentrated L 4 and L 6 which is the number of molecules in solution: while all the L 6 molecules are involved in forming the fibre, 10 L 4 molecules (1.1% of the total 900) remain in solution at the end of the simulation.This partition behaviour suggests a higher stability of the L 6 fibres relative to L 4 .In the case of F-peptides, only F 2 shows 10 molecules (0.6% of the total 1800) in solution after the 10 µs of simulation, but the shape of the nanostructure formed by F 2 already shows clear differences with F 4 and F 6 .
The fact that F 6 is able to form one-dimensional nanostructures at a lower concentration than that required by L 6 suggests the higher self-assembling tendency of the former, which is consistent with its higher experimental yield (F 6 = 75% and L 6 = 60%, Fig. 1).
These results confirm that the MARTINI force field can reproduce the overall self-assembly tendencies observed on these di-, tetra-and hexapeptides in non-competitive systems although concentration dependencies cannot be accurately replicated.are not able to form fibres (Fig. 3a-c) and can be considered to be below the simulated critical fibre concentration (SCFCwe distinguish here between the experimental and simulated CFC due to the difference in concentration required in the model and the experimental system), while the fibre shape is evident for the other two systems, F 400 6 and F 500 6 (Fig. 3d and e).Therefore, the concentration clearly affects the F 6 self-assembly tendency and could be sufficient to explain the experimental results.Because Exp F-L uses only 15 mM of F 2 (and 15 mM of L 2 , to give a total of 30 mM), which, with a total conversion of F 2 into F 6 would correspond to the F 300 6 simulated system (150 mM of F 2 and hence, 50 mM of F 6 , at 10 times the experimental concentration), this does not form a fibre-like structure (Fig. 3c).Although L 6 showed nanostructures at concentrations lower than the corresponding computational concentration, this result shows that there is a clear concentration effect which should be taken into account experimentally when carrying out DPL studies.When L 2 is added to the systems to study its co-assembly effect in the F 6 structures, only F-L 500-300 presents the formation of a fibre (Fig. 3j).Although F-L 400-600 has enough F 6 molecules to form a fibre, as the F 400 6 system demonstrated (Fig. 3d), it only forms an elongated aggregate (Fig. 3i).The RDF analysis (Fig. 4) further corroborates the differences between F 400 6 and F-L 400-600 showing differences in the intermolecular order.The trends for the peaks at r = 19 Å (Fig. 4c) and at r = 15 Å (Fig. 4d) show a change when they reach 400 F 6 molecules for the control simulations, F 400 6 , but this change is not observed for the co-assembly systems, F-L 400-600 .This trend change in the control simulations would suggest that the SCFC has been reached.However, this is not observed for the co-assembly system, suggesting that the L 2 molecules modify the SCFC of F 6 .
This difference in fibre formation for systems containing the same number of F 6 molecules evidences a L 2 disrupting effect on the F 6 self-assembly.This effect seems to be insuffi-cient to avoid the formation of the fibre when the F 6 concentration is higher than the L 2 , in F-L 500-300 (Fig. 3j).F-L 300-900 (Fig. 3h) and F 300 6 (Fig. 3c) show similar structures for the same number of F 6 molecules, which might suggest that the L 2 disrupting effect only breaks self-assembled structures and not aggregates, thus, modifying the SCFC, but not the simulated critical aggregation concentration (SCAC).F 300 6 is the system that corresponds to the experimental concentration and indicates that no fibre formation is possible under these conditions.
As shown in Fig. 3, the L 2 molecules interact and affect the F 6 assemblies.In all of the F-L systems the F 6 molecules aggregate (grey, Fig. 3f-j) and the L 2 molecules (green), which in the absence of other molecules remain dispersed (Fig. 2g), accumulate on the surface of the F 6 aggregate.Therefore, this behaviour of L 2 adhering to the fibre surface and increasing the solubility of the F 6 aggregate is what leads to the increase in the SCFC of F 6 and makes F-L 400-600 unable to form fibres while F 400 does.Therefore, we propose that L 2 plays a stabilizing surfactant-like role on the F 6 aggregate surface, which increases the SCFC such that F-L 400-600 cannot form fibres.
These results suggest that there is a disrupting effect of L 2 in the F 6 fibres.This reduces the self-assembling ability of the peptides in the mixture.As a result of this, in competition experiments, the thermodynamic gain from self-assembly is unable to drive the formation of longer peptide structures.The ability of the peptides to interact and affect the relative selfassembly tendencies is of special importance for library experiments as it suggests that these experiments might not always be selecting the same self-assembling molecule observed in non-competitive systems.Furthermore, the results show that the self-assembly tendency of F 6 in the presence of L 2 also depends on the relative concentration of both species.This suggests that, experimentally, some products might be unlocked by modifying the proportions of the different starting products in the mixtures.

Conclusions
In this work, it has been possible to demonstrate several important features of the methodology that is being increasingly applied in the literature to predict self-assembly.Namely: (a) The MARTINI force field can be used to predict the selfassembling tendency of peptides longer than three amino acids.However, this study also suggests that different concentrations might be required to ensure that molecules do not self-assemble at all (L 2 ) rather than excluding molecules with a relatively low self-assembling tendency (L 4 and L 6 ).
(b) As well as reproducing the cooperative effects, as shown by Guo et al., the MARTINI force field can reproduce destructive co-assembly effects in peptides.These disrupting coassembly effects are a potential explanation for the low yields in the DPL experiments with mixtures of components. 9n addition to the methodological insights for computational studies of these systems, this work also reveals several important features of peptide co-assembly that need to be taken into account when performing competition experiments.The results show that the different components of the libraries can interact and modify the tendency of other species to self-assemble and, as self-assembly is the driving force, for these components to be formed in the library.In this case, L 2 is found to disrupt the self-assembly of F 6 , increasing the SCFC.
Although recent studies have reported the combination of F 2 with F 3 and with an F 2 derivative, 11,24 the co-assembly in peptide based nanomaterials has not been extensively studied.However, the current work suggests that it could be critical when employing combinatorial chemistry.The possible coassembly or disruptive assembly of the different components should be taken into account when working with libraries because these effects have the potential to mask the selfassembling tendencies of the studied molecules.Finally, this work also provides a suggested approach for determining the nature of these effects in experimental systems.Changes in the relative concentrations of the different species may help to check the validity of the results.As the interaction between the peptides is concentration dependent, this can be varied in order to assess the effect of the interaction.In the specific case of the libraries for material discovery, the species ratio changes can help to unlock new species which do not give relevant yields due to disrupting self-assembly, or even on adding both dipeptides at the same concentration they were in the single dipeptide system.

Fig. 1
Fig. 1 Summary of the results obtained by Pappas et al. in three different DPLs with F 2 (Exp F ), with L 2 (Exp L ); and with a mixture of both (Exp F-L ).The protein structure above the reaction arrow represents the addition of the protease (thermolysin at 1 mg ml −1 ).

F 6 -
L 2 co-assemblyThe snapshots from the control systems show that F 100 6

Fig. 2
Fig. 2 Simulation snapshots at 10 µs for: (a) F 2 , (b) F 4 , (c) F 6 , (g) L 2 , (h) L 4 , (i) L 6 , ( j) concentrated L 2 , (h) concentrated L 4 , (i) concentrated L 6 ; (d) cross-section of F 2 at 10 µs; and the zoomed structures at 10 µs for (e) F 4 and (f ) F 6 .All images show backbone beads in red and side chain beads in white.(a-d; f-l) Periodic images are semi-transparent.(e) Highlighted bilayer-like arrangement showing the rest of the fibre as semi-transparent; the hydrophilic region of the bilayer-like is highlighted in purple and the hydrophobic region in black.(b) Alignment of backbone beads is highlighted with black arrows.

Fig. 4
Fig. 4 Normalized RDF of the F 6 molecules in the (a) control and (b) co-assembly simulation; and evolution of the RDF intensity, g(r), at (c) r = 19 and (d) r = 15 for the control (F 6 ) and co-assembly (F-L) systems as a function of the number of F 6 molecules.

Table 1
Composition and concentration of F 6 and L 2 in the co-assembly (F-L) and control (F 6 ) simulations