Structural self-sorting of pseudopeptide homo and heterodimeric disulfide cages in water: mechanistic insights and cation sensing

The molecular components of biological systems self-sort via different mechanisms to act in a cooperative manner and to avoid interfering with each other. Herein we describe mechanistic insights and a versatile strategy for the synthesis of water-soluble, pseudopeptide molecular cages based on disulfide bonds. The use of trifunctional thiols led to a dynamic combinatorial library (DCL) of readily isolable, multi-component homo- and hetero-cage-like architectures showing a degree of self-sorting related to the symmetry and size of the trithiol. Detailed kinetic studies and DFT molecular modeling give original insights into the properties of the disulfide cages. We also applied a selected cage system in the fluorometric detection of La³⁺ cations, which led to the generation of a strongly luminescent metal–organic assembly.


Introduction
The synthesis of functional molecular cages continues to pose an intriguing challenge due to the numerous applications of such structures 1 as chemical receptors, [2][3][4] nanocapsules for molecular transport, [5][6][7] or in molecular recognition. 84][15][16] Dynamic combinatorial chemistry (DCC) of disulfides is an effective approach to generate artificial biocompatible systems in water. 17,18 Disulfides have been extensively studied with linear, 19,20 macrocyclic [21][22][23][24][25] or interlocked topologies, [26][27][28][29][30][31] but far less is known about disulfide cages. 1][42] Hetero-component cages seem to be particularly understudied, a situation that is probably due to the difficulties in isolating pure cages from post-reaction mixtures. Our objective was thus to fill these gaps by developing an efficient synthesis of pure homo- and heterodimeric pseudopeptide, water-stable molecular cages, using thio-amino-acid functionalized organic platforms. This was expected to provide a robust testbed for exploring self-assembly and self-sorting in a dynamic multi-disulfide system. We also sought new insights into the structural and geometric factors crucial for disulfide cage formation, based on experimental data and DFT calculations. On this basis, we propose a theoretical model that could be applied to the effective design of further mono- and bicomponent disulfide cages. In addition, we describe a preliminary investigation of a strongly fluorescent metal-organic material obtained using a disulfide cage as a polyanionic ligand for La(III).

Design of building blocks
The three core platforms used were based on triamides formed from aromatic tricarboxylic acids and L-cysteine. The use of natural L-cysteine ensured the absence of diastereomeric products and the formation of chiral cages with stable absolute configurations. The set of aromatic platforms was selected to gradually increase the size (diameter) of the rigid core. Each was functionalised by amide formation with L-cysteine in order to introduce -SH (for disulfide formation) and -COOH (for water solubility) groups. All building blocks were obtained by a slightly modified three-step synthetic route 36 (see ESI, † Fig. S1). In the first stage, the reaction of each tricarboxylic acid with N-hydroxysuccinimide (NHS) and 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC·HCl) provided the corresponding NHS-activated esters. Next, formation of the amides by reaction with L-S-trityl-cysteine, followed by cleavage of the trityl groups under acidic conditions, led to the desired trifunctional components 1, 2 and 3, which have planar sizes in the order 3 > 2 > 1 (Fig. 1).
This journal is © The Royal Society of Chemistry 2021

Synthesis of homodimeric cages
Solutions of each component (5 mM) were prepared in aqueous 0.01 M NaOH and the pH was adjusted to 8.0 by addition of dilute HCl. The presence of thiolates in solution is required for their oxidation to disulfides as well as for dynamic disulfide exchange. These solutions of the basic components were then placed in loosely capped 2 mL vials and allowed to oxidize and equilibrate for 5 days. To ensure that all DCLs had reached thermodynamic equilibrium, each sample was further analysed by HPLC one week and one month after the end of the 5 day reaction time. These control analyses showed no quantitative or qualitative changes in any DCL. After this time, LC-MS analysis was performed. This showed that each post-reaction mixture contained only one product (more polar than the substrate), which was identified by mass spectrometry as the monoprotonated species with a mass of twice that of the reactant less the 6 hydrogen atoms lost in the formation of 3 disulfide links, i.e. the tris-disulfide dimeric cage. Compound 1-1 is a known species, identified previously by LC-MS and ¹H NMR. 37 However, we decided to include cage 1-1 in our studies for comparison and to establish its full characteristics. Further details of the structure of 1-1 were established from its ¹H NMR spectrum, which showed the D₃ symmetry expected for the homodimeric product (see ESI, † Fig. S19 and S20). The ¹H NMR spectra of the dimeric cages 2-2 and 3-3 showed the same symmetry. Comparison of the ¹H NMR spectra of the respective components with those of the cages, together with analysis of the optimised structures, revealed that the aromatic proton signals of the cages are shifted upfield, presumably as a result of the close proximity of the aromatic platforms in the cage structure. In contrast, the signals originating from the cysteine CH and CH₂ groups were found to be deshielded due to their position outside the cages (see ESI, † Fig. S19, S24, S29 and S60-S65). Acidification of the post-reaction mixtures to pH 2 yielded the three cages as the water-insoluble hexa-carboxylic acids, which could be easily isolated by filtration. However, the cages remained soluble down to the slightly acidic region of pH 5-6, which extends the range of their applications.
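
The mass relationship used above to assign the cages (twice the reactant mass, less two hydrogen atoms per disulfide link, plus one proton for the monoprotonated ion) can be written out as a short calculation. This is only an illustrative sketch: the function names are hypothetical and the monomer mass is a placeholder, not a value taken from this work.

```python
# Illustrative sketch: expected m/z of a disulfide-linked dimer from monomer masses.
# Function names and the placeholder monomer mass are assumptions for illustration.

H_ATOM = 1.00783  # monoisotopic mass of a hydrogen atom, Da
PROTON = 1.00728  # mass of a proton, Da


def dimer_mass(monomer_a, monomer_b, n_disulfides=3):
    """Neutral mass of two thiol components joined by n disulfide bonds.

    Each S-S bond formed by oxidation of two thiols removes two hydrogen atoms.
    """
    return monomer_a + monomer_b - 2 * n_disulfides * H_ATOM


def mz_monoprotonated(neutral_mass):
    """m/z of the singly protonated species [M + H]+."""
    return neutral_mass + PROTON


placeholder_monomer = 500.0  # Da, hypothetical value, not a measured mass
cage = dimer_mass(placeholder_monomer, placeholder_monomer)
print(f"neutral cage mass: {cage:.3f} Da")
print(f"[M + H]+ m/z:      {mz_monoprotonated(cage):.3f}")
```

Setting `n_disulfides` to 1 or 2 gives the corresponding masses for a dimer linked by one or two disulfide bonds.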

Synthesis of heterodimeric cages
The next step was to investigate whether the different building blocks would react with each other to form hetero-component cage structures. Equimolar solutions containing pairs of components were prepared at pH 8.0 and then stirred for five days in air. Three different products in equal proportions were detected in the post-reaction mixture derived from components 1 and 2 (Fig. 2). Based on the LC-MS analysis, two homodimeric cages, 1-1 and 2-2, were identified, while the third product had the composition expected for the heterodimeric cage 1-2. Similar results were obtained for the mixture of 2 and 3, which after the reaction contained cages 2-2, 3-3 and the heterodimeric cage 2-3. For the mixture of 1 and 3, only homodimeric structures were found in the post-reaction mixture, with none […] For this reason, we exclude the possibility that polymeric or insoluble products, which might not be detected by the HPLC DAD detector, have formed in the DCL. To isolate the heterodimeric cages, the experiments with the component pairs 1 + 2 and 2 + 3 were repeated on a larger scale and at a higher concentration (10 mM). Heterodimeric cages were isolated by semi-preparative HPLC on a scale of a few milligrams and characterized by NMR spectroscopy. ¹H and COSY NMR spectra confirmed their expected structures and C₃ symmetry. Comparison of the ¹H NMR spectra of the hetero cage 1-2 and the components 1 and 2 is also in line with the cage structure (Fig. 3). The upfield shifts of the doublets (cyan-green) from the triphenylamine moieties and the singlet (blue) from the central phenyl ring clearly indicate a strong interaction between the aromatic hydrogens and the proximity of the organic platforms of 1 and 2. This explanation is supported by analysis of the optimized structure of the 1-2 cage. In this structure, all of the aromatic hydrogens are additionally shielded by the accumulation of electron-rich sulfur (S-S bonds) and oxygen (amide) atoms.
The hydrodynamic radii deduced from DOSY experiments (see ESI † for details) coincide closely with the calculated results (Table 1). Finally, we performed an experiment with an equimolar mixture of all three components at once to check whether any new architectures would emerge in this three-component DCL. The resulting DCL contained an almost equimolar mixture of all five cages, without any traces of new species (see ESI, † Fig. S44). As in previous experiments, no products with a different architecture were found. This clearly established that our system, based on well-defined C₃-symmetric components, spontaneously selects the stable cage-like structures from the numerous possibilities of combinatorial connections between the three trifunctional components (see ESI, † Fig. S70).
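
The count of five cages follows from simple enumeration: three components can pair into six distinct dimeric cages, of which 1-3 was not observed. A minimal sketch of that bookkeeping is given below; the variable names are illustrative only.

```python
# Illustrative enumeration of the dimeric cages possible from components 1, 2 and 3.
from itertools import combinations_with_replacement

components = ["1", "2", "3"]

# Every unordered pair, including a component paired with itself (homodimers).
possible_cages = ["-".join(pair) for pair in combinations_with_replacement(components, 2)]

# Cage 1-3 was not detected in any of the experiments described in the text.
not_observed = {"1-3"}
observed_cages = [cage for cage in possible_cages if cage not in not_observed]

print("possible:", possible_cages)
print("observed:", observed_cages)
```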
Table 1 List of the 3D models and the comparison of experimental and calculated dimensions of cages
Although we have obtained pure products, we have been unable to obtain crystals suitable for X-ray structure determinations. 4][45] Based on this it can be concluded that the cage molecules prefer an ''open'' conformation, where a small internal space of the molecule is present and the disulfide groups are directed towards the inside of the molecule. The S-S torsion angles are typical for this bond in all optimized cages and averaged 107° (in the range 85° to 136°). The one structure with a ''closed'' conformation and disulfide groups directed outwards turned out to be that of the cage 2-3, which can be explained by the significant difference in the size of the aromatic organic platforms, which enforces such an arrangement of the cysteine arms and disulfide bonds. We also found that this structure is additionally stabilized by stacking interactions, due to the very close proximity of the aromatic cores (3.0-4.0 Å). The cage 3-3, composed of two large aromatic platforms, should also be stabilized in this way, since the angles between the arms and platforms are close to 90°, favoring extensive close overlap. For this reason, we have defined two geometric factors that can influence the structure. One, termed the trapezoidal angle ζ (zeta), is the angle between the platform plane and the cysteine arm, which defines the relationship between the size of the platform and the angle imposed on the binding arm. The second, θ (theta), is the degree of rotation between the platforms about the principal z-axis. This angle is important as a measure of the structural stress caused by the differences in the size of the platforms and for geometric considerations, e.g., steric hindrance. The angles were determined for four points: the centroid of the smaller platform, the alpha carbon of the arm of the smaller platform, the alpha carbon of the arm of the larger platform, and the centroid of the larger platform. Centroids have been designated as the central nitrogen atom in platform 2 or the geometric center of the benzene ring in platforms 1 and 3. High values of both of these angles thus seem to be universal factors determining the geometric barriers to formation of a given cage structure. For each homodimeric cage, the values of these angles are very similar (θ near 5° and ζ near 95°) and those compounds are formed smoothly. This two-angle factor is slightly more complicated for the two observed heterodimeric cages 1-2 and 2-3, but the values do not exceed 8° for theta and 115° for zeta. Cage 1-3 is the only one not formed during the experiments, and the reason for this may be the difficulty involved in attaining the cage geometry. The zeta angle takes on its highest value of 121° due to the significant difference between the sizes of the aromatic platforms of 1 and 3. The platforms can rotate about the z-axis to compensate for this stretching effect, but the optimal theta angle then turns out to be large enough to stop the cage from forming (Fig. 4).
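
Angles of this kind can be extracted from optimized Cartesian coordinates with elementary vector algebra. The sketch below is a generic illustration rather than the procedure used in this work: it computes a four-point dihedral (a natural reading of the theta rotation defined by the four points listed above) and a three-point angle (one possible way to express the arm-to-platform angle). The function names, the mapping of these quantities onto theta and zeta, and the example coordinates are all assumptions.

```python
# Generic geometry helpers; the mapping onto theta and zeta is an assumption.
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def dihedral_deg(p0, p1, p2, p3):
    """Unsigned dihedral angle (degrees) defined by four points p0-p1-p2-p3."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    b2_unit = tuple(x / _norm(b2) for x in b2)
    m1 = _cross(n1, b2_unit)
    return abs(math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2))))


def angle_deg(a, b, c):
    """Angle (degrees) at vertex b between the directions b->a and b->c."""
    v1, v2 = _sub(a, b), _sub(c, b)
    cos_value = _dot(v1, v2) / (_norm(v1) * _norm(v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_value))))


# Hypothetical coordinates (in angstroms) chosen only to exercise the functions:
# centroid A, alpha carbon A, alpha carbon B, centroid B.
centroid_a, c_alpha_a = (0.0, 0.0, 0.0), (3.0, 0.0, 0.5)
c_alpha_b, centroid_b = (3.2, 0.3, 3.5), (0.0, 0.0, 4.0)

print(f"dihedral: {dihedral_deg(centroid_a, c_alpha_a, c_alpha_b, centroid_b):.1f} deg")
print(f"angle:    {angle_deg(centroid_a, c_alpha_a, c_alpha_b):.1f} deg")
```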

Kinetic studies
Because the calculations indicated that the formation of cage 1-3 is the least preferred for geometric reasons, but structurally possible, we decided to supplement our work with kinetic studies to find any additional reasons for the absence of 1-3. We employed HPLC to monitor cage formation over time. Reactions were performed in 2 mL vials at pH 8 and 5 mM concentration at room temperature. Each reaction mixture was observed for 30 hours and sampled every 30 minutes. The first HPLC injection (t₀) for every sample was made immediately after dissolving the component in basic water and setting the pH to 8.0. Thus, we obtained a set of chromatograms illustrating component decay and cage growth over time based on the relative peak area (RPA). To ensure that the method was quantitative, absorption spectra were recorded for solutions of each component and the corresponding cage at the same concentration. The difference in registered absorbance between the component/cage pairs did not exceed 5%. Each reaction was repeated and monitored three times to confirm that the results were reproducible. For the homodimeric cages, the material distribution curves are shown in Fig. 5a-c. These show that in mono-mixtures the time for conversion to 100% product within experimental error was shortest for cage 3-3 (10 h), followed by 1-1 (20 h), and longest for 2-2 (26 h). We attribute this to the influence of subtle effects such as the differences in solvation of the polar cysteine moieties, in surface polarity and in the size of each component. ][48] LC-MS measurements taken 60 minutes after the start of the reactions provided a deeper understanding of the cage formation process. These measurements (ESI, † Fig. S45-S50) showed that, aside from unreacted trithiol, the dominant species present corresponded to the linkage of two trithiol molecules through one, two or three disulfide bonds. No species derived from more than two trithiol units were detectable. This is consistent with an initial intermolecular reaction giving the mono-disulfide, followed by rapid intramolecular reactions giving the macrocyclic bis(disulfide) and then the cage tris(disulfide). Competition by intermolecular processes after the first step appears to be ineffective. The fact that the cage dominates over the macrocycle at a time when the mono-disulfide is still present indicates that restrictions of motion in the macrocycle must lead to faster disulfide formation relative to that giving the macrocycle. Thus, in the first step, two molecules of trithiol form open-chain mono-disulfide adducts (1A, 2A, 3A). These are then further oxidized to the macrocyclic bis-disulfide products (1B, 2B, 3B), which in the third step are finally oxidized to the cages. (Intermediates are shown as gray and black lines in Fig. 5a-c.) In the case of cage 2-2, the bis(disulfide) intermediate was more abundant than the mono(disulfide) intermediate at all points where the reaction mixture was sampled (Fig. 5b). This is consistent with an acceleration in rate due to the intramolecular nature of the second reaction step, which is largely dependent on the chemical structure and shape of the component employed. This is supported by the molecular modeling results, which showed smaller conformational restrictions in the macromonocyclic bis(disulfide) intermediate 2B than those found for 1B and 3B (see ESI, † Fig. S66-S69). For cages 1-1 and 3-3, the observations are less complicated and the disappearance of the reactant in both cases can be well fitted to a single exponential decay. This implies that the reactant loss is a first-order process (seemingly irreversible), a situation which might arise if the rate-determining step of the reaction were the very first step involving the trithiol and dissolved oxygen. Although this step is formally a second-order process, it would obey pseudo-first-order kinetics if the agitation of the solution and its contact with the normal atmosphere were sufficient to maintain a constant concentration of dissolved oxygen. Once the first species is formed and then dimerizes, all subsequent reaction steps, as noted above, would be intramolecular and probably greatly accelerated as a result. Examination of the kinetics of cage formation in the mixed 1 + 3 DCL (Fig. 6a) provided evidence for one of the complications in these reactions. In this system cage 3-3 is exclusively formed first, then 1-1. Formation of 1-1 shows an induction period lasting until the essentially complete conversion of 3 to 3-3, and an intermediate species decaying to 3-3 is very rapidly formed in a substantial amount.
More significantly, the homodimeric cages 1-1 and 3-3 form almost twice as fast as in the single-component systems. Complete conversion to 3-3 takes about 4 h, while, once initiated, complete formation of 1-1 takes about 10 h. Since thiol oxidation is a slower process than disulfide exchange, 49,50 formation of the first S-S linkage between two trithiols must be the rate-determining step in the entire cage formation process. 51 What distinguishes the two systems (single vs. mixed component) is the formation of intermediate species, e.g. 1-3A (reaction III, Fig. 6b), and we assume that the presence of the latter is related to the observed acceleration of cage formation. Thus, the nucleophilic attack of a thiolate on the 1-S-S-3 disulfide (1-3A) results in the formation of a linear S-S-S transition state, which decomposes to a more stable disulfide product. 50 If 1 is a weaker acid than 3, then under the same pH conditions there must be a lower concentration of the 1-S⁻ thiolate, which results in both slower radical formation during oxidation and a preferred attack of the 3-thiolate on the intermediate 1-3A. In this case, the synthesis of 1-1 through either the 1-3A path (reactions III and V) or path IV can be effectively stopped until the system is essentially drained of 3. However, considering the complexity of this system, identifying the exact cause of this phenomenon is very challenging. 52 Where hetero-cage formation occurs, as in reactions involving 2, the kinetics become more complicated to analyze because both reactants are involved in competitive processes. By LC-MS analysis we also observed all the expected hetero-component intermediates, but at much lower concentrations (<5% RPA, LC-MS, ESI, † Fig. S51-S56). A schematic representation of possible reactions between a pair of different components leading to a mixture of homo- and heterodimeric cages is shown in Fig. 6a. We assume that for pairs 1 + 2 and 2 + 3, reactions IV, VII and XI occur with a similar probability, resulting in the observed equimolar mixture of homo and hetero products. In the case of the pair 1 + 3, reaction intermediates III (1-3A) and VIII (1-3B) were found at a low concentration (<2% RPA, LC-MS). Normalized concentration-time plots for the reactions 1 + 2 and 2 + 3 are shown in Fig. 5d and e. In the 1 + 2 system, cage 2-2 forms most rapidly, followed by 1-2, and 1-1 is the slowest. The total reaction time is much longer than in the 1 + 3 DCL and takes about 30 hours. Cage 2-2 reaches 50% of the target conversion after 4 hours, compared to 8 hours in the DCL with only 2.
As in the 1 + 3 system, the products involving a geometrically larger component (1-2 and 2-2) are created at higher rates.This effect is even more visible in the 2 + 3 system (Fig. 5e).Here, cages 2-3 and 3-3 are formed at almost the same rate and the initial rate is much higher than in the 1 + 2 DCL.Only when the DCL is dominated 80% by 2-3 and 3-3 (after about 4 hours), does the rate of formation of 2-2 increase.
Observations on the system with all three components present (1 + 2 + 3) were consistent with the results from the two-component systems, once again showing an apparent size effect on the cage formation (Fig. 5d-f).Products containing the largest component 3 formed faster than those containing 2 and the slowest formed were those with 1. Cages built of 1 and 2 form preferentially only when the DCL has already consumed most of the available 3.
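
The single-exponential (pseudo-first-order) treatment of reactant loss described in this section can be illustrated with a short least-squares fit of ln(fraction remaining) against time. The sketch below uses synthetic data generated from an arbitrary rate constant; none of the numbers are measurements from this study, and the function names are hypothetical.

```python
# Illustrative pseudo-first-order fit; the data are synthetic, not experimental.
import math


def fit_first_order(times_h, fractions):
    """Fit ln(fraction) = ln(A) - k*t by linear least squares; return (k, A)."""
    pairs = [(t, math.log(f)) for t, f in zip(times_h, fractions) if f > 0]
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pairs)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


def half_life(k_obs):
    """Half-life of a first-order decay with observed rate constant k_obs."""
    return math.log(2) / k_obs


# Synthetic decay sampled every 30 minutes over 10 h (arbitrary rate constant).
K_SYNTHETIC = 0.25  # 1/h, assumed value for demonstration only
times = [0.5 * i for i in range(21)]
fractions = [math.exp(-K_SYNTHETIC * t) for t in times]

k_fit, a_fit = fit_first_order(times, fractions)
print(f"k_obs = {k_fit:.3f} 1/h, A = {a_fit:.3f}, t1/2 = {half_life(k_fit):.2f} h")
```

With real relative-peak-area data, systematic curvature in the ln(fraction) plot would indicate a departure from simple first-order behaviour, as discussed above for the mixed systems.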

Fluorometric detection of La³⁺ cations and generation of a strongly luminescent metal-organic material
While working with the cages, we noticed that both 2 and cage 2-2 show strong fluorescence due to the triphenylamine core. We decided to explore this feature of 2-2 further, as a preliminary investigation of properties useful for potential applications of this particular cage. Comparison of the emission spectra of 2 and 2-2 shows almost four times stronger fluorescence for the cage (max. 456 nm) than for the component (max. 466 nm) and a blue shift of the emission maximum of 2-2 by 10 nm (Fig. 7a). We assume that this is due to an AIE-like (aggregation-induced emission) effect, because the closed cage structure prevents the rings of the NPh₃ chromophore from rotating freely, which results in increased emission intensity. Moreover, each cage is a polyanion with six CO₂⁻ groups concentrated in a small molecular area and available for coordination through O-donors.
Encouraged by these cage features, we decided to synthesize a metal complex based on 2-2. Mixing an aqueous solution of 2-2 as its sodium salt with aqueous La(NO₃)₃ results in the quantitative precipitation of a highly insoluble, amorphous metal-organic 2-2-La material (see ESI † for details). Taking into account the number and location of the chelating groups of the cage as well as the coordination preferences of the La³⁺ cation, formation of a cross-linked coordination assembly was expected. The composition and purity of the material in the solid state were confirmed by elemental (NHCS) analysis, which indicated a 1 : 1 metal : cage ratio (see ESI, † Fig. S72). This was further supported by ICP-MS analysis, which showed 10.6 wt% of La in 2-2-La against an expected 12.6 wt% (see ESI, † page 45). Unfortunately, despite many attempts, the insolubility of the material prevented a mass spectrum from being recorded. To determine the thermal stability of 2-2-La, TGA was conducted in the range of 30-600 °C and showed high stability of this material, which lost only 40 wt% with a fairly linear weight loss. In comparison, the organic compounds 2 and 2-2 turned out to be much less stable and already at 250 °C showed a major weight loss of about 30 wt% (over the full range of heating, losses of 65 wt% and 50 wt%, respectively, were recorded; Fig. 7c). While the volatile products of thermal decomposition were not characterised, it would be expected that the organic parts of the materials would undergo progressive decomposition, probably in multiple steps, to give a residue of La₂O₃ plus, possibly, Na₂SO₄ or Na₂CO₃ from 2-2-La, and the sodium salts alone from 2-2, so that the overall % changes in mass should be different. To gain visual insight into the morphology of the 2-2-La complex, SEM imaging was performed and revealed the amorphous state of the powdered material. The close-ups show a spongy, porous structure with numerous pores, about 0.1 mm in diameter, presumably due to the three-dimensional architecture of cage 2-2 with internal hydrophobic voids (Fig. 7d). Additionally, SEM-EDS analysis was used to map the surface distribution of elements in 2-2-La. The results confirmed the presence of all elements expected in the investigated material (Fig. 7d and ESI, † Fig. S71, S72). Since 2-2 exhibits interesting fluorescence behavior in aqueous solution, we decided to examine this property in the solid state and compare it with that of the metal-organic 2-2-La. These measurements show that the described system is photo-responsive, which opens several avenues towards its application in various technological fields. As presented in Fig. 7b, after excitation at 345 nm, the emission spectrum of 2-2-La shows a broad band with its maximum at 445 nm, while the emission spectrum of 2-2, apart from its seemingly lower intensity, has its band maximum at 500 nm, so that the maximum of 2-2-La is notably blue-shifted (by 55 nm) relative to that of 2-2 (for absorbance spectra see ESI, † Fig. S59). The difference in the solid-state fluorescence of the two materials under excitation at 325 nm is also clearly visible to the naked eye as a significant difference in color (Fig. 7b insets).

Conclusions
We have described here a facile approach to the synthesis of pure, water-soluble disulfide molecular cages based on trifunctional organic platforms, which can be formed easily in a one-pot reaction under very mild conditions. We have shown that high symmetry of its components pushes a DCL towards a self-sorted selection of cage species. If the difference in component size is small, the DCL self-sorts into a mixture of homo- and heterodimeric cages. The loss of geometrical compatibility with large size differences results in self-sorting of a mixture of two trithiols into a pair of homodimeric cages, with no other products detectable. Based on these observations and DFT calculations, we created a two-factor theoretical model to predict the tendencies of trifunctional thiols to self-assemble into homo- or heterodimeric cages. We have also provided original kinetic studies giving new insight into the formation mechanism of the disulfide cages through careful analysis of every intermediate product. As shown by the synthesis of 2-2-La in aqueous conditions, the cages have potential applications as novel lanthanide-complexing agents and functional fluorescent materials. It has been shown that the cage-like structure of these compounds is very stable in aqueous conditions, a highly desirable property for applications in molecular transport and drug delivery. The anionic form of the described cages at basic pH could potentially be used as an ionic receptor for molecular recognition. The applications of the cages could easily be extended to non-aqueous solvents by choosing appropriate organic counter-ions while maintaining the ionic character of the entire system. The methodological approach described here allows for easy and quick design of multi-functional disulfide cages with the promise of a wide range of applications in numerous branches of chemistry.

Fig. 2 (a) HPLC chromatograms showing the formation of heterodimeric cages (1-2 and 2-3) in the mixtures of component pairs. For the pair 1 + 3, cage 1-3 was not observed. Due to significant differences in the molar extinction coefficients of the components, the chromatograms do not show the real amounts of the products. The dashed line shows the actual content of each (see ESI, † Fig. S57 and S58). (b) Comparison of experimental and simulated mass spectra (ESI-MS) of all five cages.

Fig. 3 Comparison of ¹H NMR spectra (D₂O, 298 K, 600 MHz) of heterodimeric cage 1-2 and components 1 and 2. Dashed lines show chemical shifts of analogous signals from aromatic hydrogens from both organic platforms.