Open Access Article
Jaka Snoj ab, Fabio Lapenta ‡a and Roman Jerala *ac
aDepartment of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia. E-mail: roman.jerala@ki.si
bInterdisciplinary Doctoral Program in Biomedicine, University of Ljubljana, Kongresni trg 12, SI-1000 Ljubljana, Slovenia
cEN-FIST Centre of Excellence, Trg OF 13, SI-1000 Ljubljana, Slovenia
First published on 5th February 2024
The rational design of supramolecular assemblies aims to generate complex systems based on the simple information encoded in the chemical structure. Programmable molecules such as nucleic acids and polypeptides are particularly suitable for designing diverse assemblies and shapes not found in nature. Here, we describe a strategy for assembling modular architectures based on structurally and covalently preorganized subunits. Cyclization through spontaneous self-splicing of split intein and coiled-coil dimer-based interactions of polypeptide chains provide structural constraints, facilitating the desired assembly. We demonstrate the implementation of a strategy based on the preorganization of the subunits by designing a two-chain coiled-coil protein origami (CCPO) assembly that adopts a tetrahedral topology only when one or both subunit chains are covalently cyclized. Employing this strategy, we further design a 109 kDa trimeric CCPO assembly comprising 24 CC-forming segments. In this case, intein cyclization was crucial for the assembly of a concave octahedral scaffold, a newly designed protein fold. The study highlights the importance of preorganization of building modules to facilitate the self-assembly of higher-order supramolecular structures.
The use of CC modules enabled the design of large and complex nanostructures such as nanofibers,23 nanotubes,24 nanocages19,25,26 and nanomotors.27 Dimeric CC units, however, present some difficulties when used for the construction of large multimeric structures. Most notably, the flexibility imposed by the linkers between CC segments, while playing an important role in the assembly, may lead to different and at times heterogeneous oligomerization states.28 This has been shown, for instance, with short CC peptides designed to assemble into a large variety of complexes, such as bundles, triangles and squares, by the selection of linker length.29,30 Sets of coiled-coil dimer-forming peptides that interact specifically with each other (orthogonal sets) can be used for designing polyhedral protein nanostructures known as coiled-coil protein origami (CCPO).7 CC peptides in such structures are arranged in a defined order in the primary structure and guide the polypeptide to fold into a polyhedral topology, from either a single19,31–33 or multiple chains.34,35 Interestingly, despite the spectacular success of machine learning-based protein structure prediction and design,36–38 the fold of coiled-coil-based protein nanostructures cannot yet be efficiently predicted by these methods, presumably due to their complex fold topology and the absence of such structures from the training sets underlying machine learning algorithms.
We have previously shown how CCPO polyhedral shapes can self-assemble in cells during protein translation19 and undergo structural rearrangement according to which CC modules are selected and how they are placed within the sequence.34,35 However, expanding this design strategy further, and applying it to larger CCPO architectures composed of multiple chains, requires a significant shift in the design rules, especially of the inter-connecting linkers between the building blocks, whose length could lead to the formation of alternative, heterogeneous tertiary and quaternary structures.30 Moreover, the overall topology of CC-based proteins and the positioning of the terminal CC segments have to be taken into consideration when constructing large CC assemblies. In this regard, we have previously noted that CC peptide pairs with low stability tend to destabilize the nanostructures19 and that specific arrangements of the building blocks allow the interface of oligomeric CC-based nanostructures to adopt the desired conformation.35
We reasoned that covalent cyclization could introduce an additional constraint for the preorganization of building modules to facilitate the assembly. Although rare, natural cyclized proteins have increased stability against thermal stress and proteolysis.39,40 Protein cyclization has already been used in bioengineering, mainly to increase stability.41 Covalent bonds within polypeptide chains can be introduced by enzymatic tools such as transglutaminases,42 tyrosinases,43 split cellular anchoring proteins (SpyCatcher/SpyTag)44 and backbone cyclization with inteins. Inteins catalyse a posttranslational reaction, analogous to RNA splicing, in which they are excised from the flanking peptides and the remaining parts are ligated.45,46 Since their discovery in 1990 in yeast,47 additional inteins have been uncovered48 as well as split inteins, which consist of two inactive parts that regain the splicing capability after their spatial reconstitution.49,50 The important biotechnological potential of this class of proteins was soon recognized and applied to protein cyclization and other applications, such as tag removal, labelling and purification.51 Since the discovery of the first split intein, based on Npu DnaE found in the DNA polymerase III of the cyanobacterium Nostoc punctiforme,52,53 more efficient and stable split inteins have been discovered54 or engineered.55,56 Inteins are used to enhance protein stability by decreasing the conformational entropy of the unfolded state through protein cyclization.57 However, despite their versatility, CC-based nanostructures stabilized by intein splicing or intein cyclization have been used only to construct planar triangles.58 On the other hand, cyclization has already been implemented in designed DNA nanostructures and has been shown to increase their stability against exonucleases.59,60 CCPO designs, which, like DNA nanostructures, rely on discrete interactions between building block segments, could equally benefit from cyclization, acquiring higher stability and more conformational homogeneity.
In this work we approached the design of supramolecular assemblies, stabilized by cyclization using the in vivo splicing activity of the naturally occurring split intein gp41.54 Here, we propose that polypeptide chains with several unpaired CC segments can be used as building blocks once the interface and the termini of the CC-based subunit are stabilized in a prearranged conformation. We used this approach to construct CCPO topologies including a tetrahedral fold, composed of two chains as well as an irregular concave octahedral architecture composed of three chains, thus demonstrating the versatility of this strategy for designing multiple-chain assemblies from preorganized subunits.
Synthetic genes were purchased from Twist Bioscience (CA USA) or IDT (IA USA) and DNA oligonucleotides used in PCR reactions were purchased from IDT (IA USA). Genes coding for the proteins of interest were cloned in two different expression vectors: pET41a+ (Genscript, NJ USA) and pCIRCgp41-1, a gift from Barbara Di Ventura & Roland Eils (Addgene plasmid # 74227; http://n2t.net/addgene:74227; RRID: Addgene_74227).57 Reading frames were optimized for E. coli codon usage using proprietary software from IDT (IA USA).
Gibson assembly61 was used to introduce, substitute or delete DNA segments in the genes. DNA fragments were amplified using repliQa HiFi ToughMix® (Quantabio, Beverly, MA, USA) or Phusion® HotStart DNA polymerase (NEB, MA USA) in PCR reactions performed according to the manufacturer's instructions. Gibson assembly was performed with a mixture of the enzymes Taq Ligase (NEB, MA, USA), Phusion® Polymerase and T5 exonuclease (NEB, MA, USA) in reaction buffer (NEB, MA, USA) following the formulation provided by the manufacturer.61 The mixture was incubated for 1 h at 50 °C before transformation into competent E. coli cells. Plasmids were transformed by heat shock following the manufacturer's protocol. Single colonies were then cultured in Lysogeny broth (LB) media in the presence of the antibiotic kanamycin (Goldbio, MO, USA) at a final concentration of 50 μg ml−1.
The cellular pellets were resuspended in 10 ml of lysis buffer (50 mM Tris–HCl at pH 8.0, 150 mM NaCl, 10 mM imidazole, 18 U ml−1 Benzonase (Merck, Germany), 1 mM MgCl2, 2 μl ml−1 CPI (Protease Inhibitor Cocktails) (Millex Sigma-Aldrich, MO USA), 1 mM TCEP (Goldbio, MO USA)) per liter of culture. Lysis was completed by a thermal protocol, in which the lysate was incubated for 10 min in boiling water, cooled on ice, and supplemented with an additional 0.06 μl ml−1 of Benzonase (250 U ml−1) (Merck, Germany) before centrifugation.
The cellular lysates were centrifuged at 16 000 × g (4 °C) for 20 min and the soluble fraction was then filtered through 0.45 μm filter units (Sartorius Stedim, Germany) for further purification.
In this study, for purifying proteins, we employed a combination of nickel-nitrilotriacetic acid (Ni-NTA) resin and size exclusion chromatography (SEC). The bacterial lysates were first filtered and then incubated with 5 ml of Ni-NTA resin (Goldbio, MO USA) in buffer A (50 mM Tris–HCl pH 8.0, 150 mM NaCl, 10 mM imidazole) for 5 minutes. The resin was then washed with buffer A and buffer B (50 mM Tris–HCl pH 8.0, 150 mM NaCl, 20 mM imidazole) and the bound proteins were eluted with buffer C (50 mM Tris–HCl pH 8.0, 150 mM NaCl, 300 mM imidazole).
The eluted proteins were then subjected to SEC using 320 ml of HiLoad Superdex™ 200 resin (GE Healthcare, IL USA) packed in a 26/600 XK column (GE Healthcare, IL USA) and the same amount of HiLoad Superdex™ 75 resin (GE Healthcare, IL USA) packed in a 26/600 XK column (GE Healthcare, IL USA) equilibrated with filtered and degassed SEC buffer (20 mM Tris–HCl pH 7.5, 150 mM NaCl, 10% v/v glycerol). The samples after NiNTA were concentrated using centrifugal filters (3k or 10k) (Amicon-ultra, Millex Sigma-Aldrich, MO USA) and filtered through 0.22 μm syringe filters (Millex Sigma-Aldrich, MO USA) before being injected into the column. The chromatography was run using an AKTA™ pure FPLC system (GE Healthcare, IL USA) in SEC buffer at a linear flow rate of 2.6 ml min−1 and the eluted protein fractions were collected separately.
The heterotrimeric protein complexes (SB24-nnn, SB24-ncn) were obtained by combining the purified subunits SB9b and SB9c in an equimolar ratio at low concentrations (∼0.2 mg ml−1). The component SB6 or cySB6 was then added in 20–30% molar excess. The mixture was concentrated and purified by an additional SEC step. The heterotrimeric complex was collected after separation and further concentrated for additional characterization.
The heterodimeric protein complexes (DiTET nn, cc, cn) described in the article were obtained by combining the purified subunits in an equimolar ratio at low concentrations (∼0.2 mg ml−1) without additional chromatography passages. The complexes were afterwards concentrated using centrifugal filters (3k) (Amicon-ultra, Millex Sigma-Aldrich, MO USA) to a final concentration depending on the needs of the characterization methods and finally filtered through Durapore 0.1 μm centrifuge filters (Merck Millipore, MA, USA).
$$\text{Helical content (\%)} = \frac{\mathrm{MRE}_{222}}{\mathrm{MRE}^{\mathrm{H}}_{222}\,\left(1 - 2.57/n\right)}$$
500 deg cm2 dmol−1).63
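The helical-content formula above can be evaluated with a few lines of Python. This is a minimal sketch, not the authors' analysis script. The function name, the factor of 100 used to express the ratio as a percentage, and the example inputs are assumptions made for illustration. The reference ellipticity of an ideal helix is passed in explicitly, and the value of −39 500 deg cm2 dmol−1 used in the example is a commonly quoted literature value assumed here, not one taken from the text.

```python
def helical_content_percent(mre_222, n_residues, mre_helix_222):
    """Helical content (%) from the mean residue ellipticity at 222 nm.

    mre_222       -- measured MRE at 222 nm (deg cm2 dmol-1)
    n_residues    -- number of residues n in the polypeptide
    mre_helix_222 -- MRE at 222 nm of an infinitely long ideal helix
    """
    if n_residues <= 2.57:
        raise ValueError("n_residues must exceed 2.57 for the length correction")
    # Ellipticity expected for a fully helical chain of finite length n
    mre_full_helix = mre_helix_222 * (1.0 - 2.57 / n_residues)
    # Expressed as a percentage (assumption: ratio multiplied by 100)
    return 100.0 * mre_222 / mre_full_helix


# Hypothetical inputs for illustration only (not measured values from this study)
example = helical_content_percent(mre_222=-30000.0, n_residues=200,
                                  mre_helix_222=-39500.0)
print(f"{example:.1f} % helix")
```

The length correction (1 − 2.57/n) matters mainly for short peptides; for chains of several hundred residues it changes the result by about one percent.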
SAXS experiments at the PETRA III facility used an X-ray wavelength of 1.24 Å and a Pilatus 6M detector positioned 3 m from the sample. The scattering vector was recorded in the range of 0.028–7.3 nm−1. The single-chain protein SB6 was measured in batch mode using a robotic sample changer operating in flow-through mode. A dilution series of four concentrations ranging from 9.5 mg ml−1 to 1.2 mg ml−1 was used to evaluate concentration effects. For each dilution sample (40 μl), data were obtained over 20 exposures of 0.05 s each. The frames that did not exhibit radiation damage were then averaged and integrated using the SASFLOW pipeline.69 Before and after each sample, buffer scattering data were collected for background subtraction.
Size exclusion chromatography coupled SAXS (SEC-SAXS) was performed using a Superdex™ 200 increase 10/300 column (GE Healthcare, IL USA). The buffer used in the experiment was SEC buffer C (20 mM Tris–HCl (pH 7.5), 150 mM NaCl and 3% (v/v) glycerol) and the samples measured were DiTET-nn, DiTET-cc, DiTET-nc, SB24-nnn, SB24-ncn with their concentrations ranging from 11 to 18 mg ml−1. The mobile phase was run through the column at a 0.5 ml min−1 flow rate. During the experiments, 3000–3600 scattering frames were collected with an exposure time of 0.995 seconds.
The majority of batch measurements were performed at NIC on an Anton Paar SAXSpoint 5.0 instrument using a heated/cooled sample cell holder, a low-volume ASX autosampler and the detector positioned 600 mm from the measuring capillary. Samples RH1, cyRH1, RH2, cyRH2 and cySB6 had concentrations of 5.4 mg ml−1, 14 mg ml−1, 22.7 mg ml−1, 8.96 mg ml−1 and 8.74 mg ml−1, respectively. A dilution series of four concentrations was measured for each of these single-chain proteins to assess concentration effects. Each sample (20 μl) and its matching buffer were loaded into the 1 mm flow-through quartz capillary with the ASX autosampler and the data were collected over 7 exposures of 30 min each. The temperature of the sample during measurements was kept at 10 °C via a Peltier unit, while the waiting samples were kept at 4 °C. The software SAXS Analysis by Anton Paar (version 4.01) was used to transform and export the raw data: the center of the beam was determined, the beamstop was masked, the intensity was normalized by transmittance, the q transformation was performed, and the data were converted from 2D to 1D and exported in ATSAS format. Frames not displaying any radiation damage were manually averaged and buffer scattering was then subtracted as background. Analysis of scattering curves and ab initio modeling were performed using the ATSAS suite.70 Theoretical SAXS profiles were calculated from molecular models and compared to experimental data using Pepsi-SAXS.71 The agreement between theoretical and experimental curves was evaluated using the χ2 metric, with low values indicating a good fit. The models that agreed best with the experimental curve were further refined using SREFLEX,72 an online tool of the ATSAS suite,70 which performs flexible refinement of models to achieve a better fit.
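To illustrate how a χ2 value of this kind is obtained, the sketch below compares a model curve with an experimental curve after fitting a single scale factor. It is a simplified illustration under stated assumptions, not a reimplementation of Pepsi-SAXS, which also adjusts parameters such as the hydration layer. The function name and the normalization by N − 1 are assumptions.

```python
import numpy as np


def scaled_chi2(i_exp, sigma, i_model):
    """Reduced chi-square between experimental and model intensities.

    A single scale factor c is fitted by weighted least squares before
    the residuals are evaluated (assumed normalization: N - 1).
    """
    i_exp, sigma, i_model = (np.asarray(a, dtype=float)
                             for a in (i_exp, sigma, i_model))
    weights = 1.0 / sigma**2
    # Closed-form least-squares solution for the scale factor
    c = np.sum(weights * i_exp * i_model) / np.sum(weights * i_model**2)
    residuals = (i_exp - c * i_model) / sigma
    chi2 = np.sum(residuals**2) / (i_exp.size - 1)
    return chi2, c


# Synthetic data for illustration only
q = np.linspace(0.1, 3.0, 100)
model = np.exp(-q**2)
chi2, scale = scaled_chi2(3.0 * model, np.full_like(q, 0.01), model)
```

A value close to 1 means the deviations between model and data are comparable to the experimental errors, so the value is only meaningful when the error estimates are realistic.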
Experimental scattering profiles of the SB24 complexes were compared with the theoretical scattering of the ideal model using the volatility of ratio (VR), calculated with the web application Sibyls (https://sibyls.als.lbl.gov/saxs-similarity/).73 VR was calculated from the ratio of the two scattering profiles in the scattering vector range of 0.15–1.5 nm−1.
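The idea behind the volatility of ratio can be sketched as follows. The ratio of two profiles is normalized and binned, and the relative change between neighbouring bins is summed, so that two profiles differing only by a constant scale factor give a value of zero. This is an illustrative sketch and not the code of the web application. The function name, the number of bins and the normalization by the mean ratio are assumptions, and both profiles are assumed to share the same q grid.

```python
import numpy as np


def volatility_of_ratio(q, i_a, i_b, q_min=0.15, q_max=1.5, n_bins=50):
    """Sum of relative changes of the binned ratio I_a(q)/I_b(q).

    q is expected in nm^-1; the number of bins and the normalization
    by the mean ratio are assumptions of this sketch.
    """
    q, i_a, i_b = (np.asarray(a, dtype=float) for a in (q, i_a, i_b))
    mask = (q >= q_min) & (q <= q_max)
    ratio = i_a[mask] / i_b[mask]
    ratio = ratio / ratio.mean()  # remove the overall scale
    # Average the ratio within consecutive bins to reduce noise
    bins = np.array_split(ratio, n_bins)
    r = np.array([b.mean() for b in bins if b.size > 0])
    # Relative difference between neighbouring bins, summed over the range
    steps = (r[:-1] - r[1:]) / ((r[:-1] + r[1:]) / 2.0)
    return float(np.sum(np.abs(steps)))


# Synthetic profiles for illustration only
q = np.linspace(0.05, 2.0, 400)
profile = np.exp(-q**2)
vr_same = volatility_of_ratio(q, profile, 2.5 * profile)
vr_diff = volatility_of_ratio(q, profile, np.exp(-(1.3 * q)**2))
```

Because the ratio is compared bin to bin, the metric is insensitive to overall scaling and responds mainly to local differences in shape between the two curves.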
… 1:1 dissociation model with the software SedPhat.75 The data were visualized using Gussi software (Dr Chad Brautigam).
First, we paid attention to the arrangement of CC-forming peptides in individual subunits to minimize the number of undesired conformations of the final assembly. To prevent the undesired homodimerization of larger subunits, complementary CC pairs within a single subunit were strategically positioned at a minimal topological distance in the primary structure. In some subunits, only one intramolecular CC pair was introduced, and all other CC segments were left unpaired. These unpaired CC segments were designed to be complementary to the corresponding segments in the partner chain subunits.
Second, we designed subunits to have rigid geometrical faces, such as triangles, to avoid conformational variability such as in oblique shapes of four-sided faces. Third, we fixed the subunits in the preorganized shape to support the desired assembly. This was achieved by increasing the helical propensity of CC-forming modules by introducing salt bridges at positions b, c and f of the heptad repeats76 of a previously published orthogonal set of dimeric CC units5,15 (Fig. 1B).
Fig. 1 Design strategy for multi-chain CCPO structures and generation of preorganized building modules that include split intein cyclization and intrachain CC dimer. (A) Polyhedral shapes with edges formed by pairs of peptides were decomposed into multiple chains. (B) CC modules from the orthogonal library of dimeric CCs5,15 were selected to have a high helical propensity to increase their rigidity. This was achieved by the introduction of salt bridges between residues at the b, c and f positions of the CC dimer. (C) The splicing activity of split intein Gp41 was used to generate cyclic peptide chains in bacterial cells to preorganize the building modules to increase the predictability of the assembly. (D) Intramolecular CC dimer supported the preorganization of the cyclic chain to form merged bitrigonal modules. Molecular models were generated to predict the shape of the complexes.
For our initial design, we selected a 12 CC-segment tetrahedral structure called TET12SN19 and decomposed it into a heterodimeric complex (Fig. 1A) with each unit composed of 6-segments (Fig. 1D left). Initially, we designed the subunits as two linear chains, each preorganized only by the presence of an intramolecular CC dimer, that self-assembles into a trigon with the N-terminus of the first and C-terminus of the sixth segment unconstrained. Additionally, we designed the alternative preorganized subunit variants, where the termini were constrained by an intein-mediated cyclization (Fig. 1C) shaping each of the subunits into a merged bitrigon. This was achieved by the genetic fusion of both parts of a highly efficient naturally occurring split intein Gp41 to the C- and N-termini of each subunit. Protein splicing occurred spontaneously during biosynthesis in bacterial cells, and those building subunits were isolated from the soluble fraction. By intein cyclization we were able to covalently link the N- and C-terminal segment, essentially creating a preorganized subunit comprising two trigons connected across one edge with an internal homodimeric CC with the remaining four unpaired CC dimer-forming segments (Fig. 1C right). In this manner, we aimed to self-assemble the tetrahedron from two bitrigon-forming subunits (Fig. 1D).
The linear (RH1 and RH2) and cyclized (cyRH1 and cyRH2) subunits for the two-chain tetrahedral CCPO assembly were produced separately in E. coli. SEC-MALS analysis indicated that the subunits were monodisperse with masses corresponding to the predicted values (RH1 = 27 kDa, RH2 = 30 kDa, cyRH1 = 26 kDa, cyRH2 = 26 kDa) (Fig. 2, B, G and L, and S1A to D, Table S1†). CD measurements confirmed the predicted high helicity of the individual subunits (Fig. S2B, S3B and Table S1†). The CD signal at 222 nm (Fig. S4A–D†) showed a gradual loss of helicity with the increased temperature, as expected for the nonglobular helical structure. After cooling, the polypeptide regained high helicity, similar to the measurements before the denaturation (Fig. S5A–D†), indicating a reversible unfolding-refolding. Each linear subunit and its cyclic counterpart, comprising identical amino acid sequences, exhibited different biophysical characteristics, indicating that the intein splicing reaction indeed successfully cyclized the proteins. Specifically, linear subunits had a lower elution time on SEC-MALS (RH1 = 28.7 min, RH2 = 27.6 min vs. cyRH1 = 29.3 min, cyRH2 = 29.6 min) (Fig. S2A and S3A†) and a larger Dmax measured by Small-Angle X-ray scattering (SAXS) in comparison to the cyclic polypeptide chains (RH1 = 32.5 ± 0.1 nm, RH2 = 17.1 ± 0.3 nm vs. cyRH1 = 16.6 ± 0.3 nm, cyRH2 = 10.5 ± 0.1 nm) (Fig. S2E and S3E†). By transforming SAXS curves into Kratky plots it is possible to better assess the degree of unfolding in the samples; unfolded proteins show a plateau in the Kratky plot at high q, while compact, globular proteins are distinguished by a bell-shaped peak. According to the Kratky plot of the SAXS curves, the linear subunits assumed a less ordered conformational state in the solution (Fig. S2D and S3D†) than cyclic subunits. Moreover, proteolytic treatment of the proteins with elastase further confirmed the success of intein cyclization (Fig. S6†). 
Comparing the cyclic subunits and noncyclic counterparts (Fig. S2C and S3C†) we observed no significant difference in the thermal denaturation of secondary structure elements indicating that the stability of the CC units is not affected by the cyclization.
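The Kratky transformation used above is simply q²·I(q) plotted against q. The sketch below applies it to two synthetic curves: a Guinier-type curve standing in for a compact particle, and a crude Lorentzian-type approximation standing in for a flexible chain. Both curves and the Rg value are hypothetical and are used only to show the bell-shaped peak versus the plateau. They are not data from this study.

```python
import numpy as np


def kratky(q, intensity):
    """Return the Kratky transform q^2 * I(q)."""
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    return q**2 * intensity


rg = 3.0  # hypothetical radius of gyration (nm)
q = np.linspace(0.01, 3.0, 600)  # scattering vector (nm^-1)

# Compact particle: Guinier-type decay gives a bell-shaped Kratky peak
i_compact = np.exp(-(q * rg)**2 / 3.0)
peak_q = q[np.argmax(kratky(q, i_compact))]

# Flexible chain (crude approximation): q^2 * I(q) rises towards a plateau
i_chain = 1.0 / (1.0 + (q * rg)**2 / 2.0)
chain_curve = kratky(q, i_chain)
```

For the Guinier-type curve the maximum lies at q = √3/Rg, so a more compact particle shifts the peak to higher q, whereas the chain-like curve keeps rising towards a plateau at high q.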
Linear (RH1 and RH2) and cyclized (cyRH1 and cyRH2) subunits were mixed in all four different combinations to form dimeric complexes (DiTET-nn, DiTET-cc, DiTET-cn, DiTET-nc). The nomenclature of the complexes includes the number of subunits constituting the complex (Di), the initials of the intended polyhedron (TET), and letters at the end indicating the cyclic (c) or non-cyclic (i.e. linear) (n) nature of the first and second subunit. SEC-MALS of all complexes (Fig. 2B, G, L and S1K–N†) showed predominantly monodisperse dimers with masses around 50 kDa, corresponding to their theoretical size. Complex DiTET-nn had the lowest elution time (25.5 min), indicating a larger hydrodynamic diameter, while DiTET-cc had the highest elution time (27.1 min), pointing to a more compact shape. Both the monomeric TET12SN (26.6 min) and the DiTET-nc complex (26.3 min) had elution times between those of the DiTET-nn and DiTET-cc complexes (Fig. 3A). SEC-MALS chromatograms of DiTET-cc and DiTET-nc showed a shoulder to the right of the main peak, corresponding to the monomeric components, which indicates that the subunits of these complexes may have a slightly lower binding affinity than the rest. In the case of DiTET-cc the shoulder was reduced at high concentrations (S9). The formation of the complexes was monitored by ITC (Fig. 2C, H and M) and the data were fitted to a 1:1 model, with Kd determined in the nanomolar range (Table 1). In line with the SEC-MALS results, ITC confirmed that the binding affinity of the linear variants was higher than that of the cyclized versions.
… :1.5. SAXS data was analyzed to determine Dmax from a pair-distance distribution function and to determine the radius of gyration (Rg)
After proving the correct stoichiometry of the assembly, we then attempted to crystallize the complex and to obtain the structure using cryo-electron microscopy; however, the intrinsic flexibility of the loops on the vertices of the polyhedral structure prevented us from obtaining any high-resolution reconstruction of the complex. In this regard, we have already observed that CCPO structures seem inherently difficult to crystallize or vitrify; only recently has an X-ray structure of the CCPO triangle been determined, based on the fortuitous packing of CC modules into a crystal lattice.77 We therefore resorted to small-angle X-ray scattering (SAXS) (Fig. 2D, I and N, and Table 1), a method that allows determination of the molecular shape directly in solution. The experimental SAXS plot of the DiTET-nn complex matched the theoretical scattering of a molecular model with a flat and elongated conformation, suggesting misfolding. In contrast, assemblies DiTET-cc and DiTET-nc closely matched the theoretical scattering curve of the designed models with a tetrahedral shape (Fig. 2E, J and O). Both Dmax and radius of gyration (Rg) (Table 1) were in accordance with the SAXS fits, confirming that complexes containing cyclized subunits are more compact than the complexes without cyclized subunits.
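One standard route to a radius of gyration from a scattering curve is the Guinier approximation, ln I(q) ≈ ln I(0) − q²Rg²/3, valid at low q. The text does not state which estimator was used for the Rg values reported here, so the sketch below is only an assumed illustration of the principle. The function name, the q·Rg ≤ 1.3 validity limit and the synthetic data are assumptions.

```python
import numpy as np


def guinier_rg(q, intensity, q_rg_max=1.3):
    """Estimate Rg and I(0) by a linear fit of ln I against q^2.

    q must be sorted in ascending order. The fitted range is shortened
    from the high-q side until q_max * Rg <= q_rg_max.
    """
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    for n_pts in range(q.size, 2, -1):
        slope, intercept = np.polyfit(q[:n_pts]**2,
                                      np.log(intensity[:n_pts]), 1)
        if slope >= 0:
            continue  # no decay in this range, keep shortening
        rg = np.sqrt(-3.0 * slope)
        if q[n_pts - 1] * rg <= q_rg_max:
            return rg, float(np.exp(intercept))
    raise ValueError("no valid Guinier range found")


# Synthetic curve for illustration only: Rg = 4 nm, I(0) = 5
q = np.linspace(0.01, 1.0, 200)
i_synthetic = 5.0 * np.exp(-(q * 4.0)**2 / 3.0)
rg_fit, i0_fit = guinier_rg(q, i_synthetic)
```

For extended or flexible particles the Guinier range becomes very narrow, which is one reason Rg is often also derived from the pair-distance distribution function.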
SAXS curves of the three measured complexes were compared to the curve of the single-chain tetrahedral protein TET12SN (previously published data)19 (Fig. 3C and D). The scattering profile of DiTET-nn was the most divergent from the rest, lacking the scattering pattern characteristic of the TET12SN tetrahedral topology, indicating a misfolded state. Dmax determined by SAXS (Fig. 3B and Table 1) was the highest for DiTET-nn (14.7 ± 0.1 nm), followed by DiTET-nc (12.5 ± 0.2 nm). The Dmax of DiTET-cc (8.8 ± 0.2 nm) was similar to that of the monomeric TET12SN (9.8 ± 0.1 nm), whose larger size, in comparison to DiTET-cc, might be an indication of some flexibility at the termini.
In addition, the ab initio SAXS reconstruction based on the pair distance distribution function (Fig. 3E) confirmed these results and, for complexes DiTET-cc and DiTET-nc, featured a concave depression indicating an internal cavity, which is characteristic of this type of de novo CCPO protein design. These results therefore demonstrate that modular polypeptide assemblies can be achieved using preorganized cyclic subunits.
The assembly was divided into three pre-organized CC-based subunits to self-assemble into a concave irregular octahedral CCPO (Fig. 4). This heterotrimeric design consisted of two 9-segment pre-organized subunits (SB9b, SB9c) and a 6-segment polypeptide. We established the following nomenclature for the designed proteins: “SB” stands for “split boat” and the number denotes the number of CC-forming segments included in the protein or complex. The prefix “cy” indicates that the protein is cyclized, and the letters at the end of the complex's name indicate the cyclic (c) or non-cyclic (i.e. linear) (n) nature of the first (SB9b), second (SB6/cySB6) and third (SB9c) subunit. The two large subunits, each composed of nine CC-forming segments, contained three identical CC pairs (P3:4, P1:2, BCR:BCR) in a mirrored configuration. The three unpaired CC segments of each subunit constituted the binding interfaces between the SB9 subunits and a connecting smaller 6-CC-segment subunit. We were particularly interested to see how the short 6-CC-segment subunit, in either its linear or cyclized version (SB6 and cySB6, respectively), would affect the overall success of the assembly (SB24-nnn or SB24-ncn, respectively) and therefore cyclized the SB6 subunit using split intein splicing, as described above for the tetrahedron (Fig. 1C).
Polypeptide subunits SB9b, SB9c, SB6 and cySB6 were expressed separately in E. coli and purified. SEC-MALS analysis (Fig. S1E–H†) showed that each component of the complex was monodisperse, while CD measurements confirmed that all the subunits assumed a highly helical secondary structure (SB6 α = 73.7%, cySB6 α = 77.8%, SB9b α = 62.2%, SB9c α = 67.1%) (Table S1†) and were able to refold after thermal denaturation (Fig. S4E–H and S5E–H†). The amino acid sequences of subunits cySB6 and SB6 were similar except for the intein splicing scar and some point mutations in the CC modules (Table S2†); however, the biophysical characterization showed a difference in secondary and tertiary structure between the two (Fig. 5), suggesting that cyclization with split inteins caused a more compact and less disordered fold. While SEC-MALS showed that the proteins had masses similar to those theoretically predicted (SB6 = 30 kDa, cySB6 = 27 kDa) (Fig. S1E and F†), the longer elution time on SEC indicated a smaller hydrodynamic radius for the cyclized variant (30.2 min) in comparison to the linear variant (27.4 min) (Fig. 5C). CD analysis showed a difference in thermal stability between the cyclic and linear subunits (Fig. S10†), which could be due to the point mutation differences in the sequence. SAXS analysis of the two proteins (Fig. 5D, S7A and B†) indicated that SB6 had a larger maximum dimension than cySB6 (Dmax = 25 ± 0.1 nm and 9.4 ± 0.05 nm, respectively) (Fig. 5E, Table S1†). The Kratky plot of SB6 showed a wide bell-shaped peak, indicating a more unfolded conformation in solution than cySB6, which showed a narrower peak (Fig. 5F).
SEC-MALS experiments showed that a mixture of SB9b and SB9c had little to no tendency to form dimers (Fig. S1O†), even though they were complementary to each other across a single unpaired segment. On the other hand, mixing SB9b and SB9c with either SB6 or cySB6 in an equimolar ratio resulted in a predominantly monodisperse trimer (Fig. 6A and E) with an experimentally determined mass of 102 ± 0.5 kDa and 104 ± 0.6 kDa, respectively, corresponding to the theoretical size of the correctly assembled heterotrimers (SB24-nnn = 108.9 kDa, SB24-ncn = 108.6 kDa) (Fig. 6B and F, and S1I, J and Table S1†). The thermodynamics of the interaction between the subunits was analysed by ITC. Either SB6 or cySB6 was titrated into an equimolar mixture of SB9b and SB9c. The determined Kd pointed to a substantially higher affinity for the complex formed with the linear SB6 (Kd = 6.9 nM) compared to its cyclized counterpart cySB6 (Kd = 195.1 nM) (Fig. S8†). The linear subunit is more flexible in solution and has a higher conformational entropy than the cyclic subunit. This could contribute towards the higher affinity observed with the linear subunit, which, however, forms a misfolded assembly upon binding to the other two proteins.
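The two dissociation constants can be put on an energy scale with the standard relation ΔG° = RT ln(Kd), taking 1 M as the reference concentration. The sketch below is a back-of-the-envelope illustration. The temperature of 25 °C is an assumption, since the ITC temperature is not given in this section, and the function name is hypothetical.

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kJ mol^-1) from Kd, relative to 1 M.

    The default temperature of 25 degrees C is an assumption of this sketch.
    """
    return R_GAS * temperature_k * math.log(kd_molar) / 1000.0


dg_linear = binding_free_energy_kj(6.9e-9)    # SB6, Kd = 6.9 nM
dg_cyclic = binding_free_energy_kj(195.1e-9)  # cySB6, Kd = 195.1 nM
ddg = dg_cyclic - dg_linear                   # penalty for the cyclic subunit

print(f"dG(SB6)   = {dg_linear:.1f} kJ/mol")
print(f"dG(cySB6) = {dg_cyclic:.1f} kJ/mol")
print(f"ddG       = {ddg:.1f} kJ/mol")
```

Under these assumptions the roughly 28-fold difference in Kd corresponds to a difference of about 8 kJ mol−1 in binding free energy, a modest amount on the scale of the total binding energy.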
The size and shape of the complexes in solution were assessed by SEC-SAXS. The experimental SAXS profile of the complex SB24-nnn, comprising only linear chains, matched the theoretical scattering curve of a model with a collapsed subunit (χ2 = 1.97) (Fig. 6C and D). On the other hand, the complex SB24-ncn fitted best to the scattering profile of models with a concave octahedral conformation (χ2 = 1.64) (Fig. 6G and H), with a Dmax (14.7 ± 0.05 nm) lower than that of SB24-nnn (16.1 ± 0.3 nm) (Table S1†). The differences between the experimental scattering profiles of the complexes and the theoretical scattering profile of an “idealized” model of an irregular octahedron were analysed with the volatility of ratio (Fig. 7), a metric that can identify local changes in otherwise structurally similar particles from their scattering profiles.73 This revealed that the cyclized version of the complex, SB24-ncn, assumed a conformation closer to the ideal model of a designed octahedron, suggesting that cyclization of the smaller subunit is sufficient to stabilize the complex and allow it to assume a conformation close to the original design.
While particularly significant in the field of synthetic biology, where CC and CC-based structures have been widely employed, understanding the rules governing the folding and assembly of molecular architectures is relevant for the use of synthetic proteins in biomedicine and biotechnology. In this context, this study introduces a strategy that enhances the stability and predictability of CC-based assemblies.
The two-chain assembly of the tetrahedral topology demonstrated here clearly illustrates the advantages of structural preorganization via cyclization. While non-cyclized subunits assembled stoichiometrically, their interaction yielded a misfolded assembly as the interaction energy of the collapsed state of the complex was higher compared to when we employed cyclic subunits, which on the other hand, favored the correct folding of the heterodimeric complex. In this case, the assembly assumed the intended shape when at least one cyclized subunit was incorporated into the design, which was supported by an extensive biophysical characterization and SAXS analysis of the assembly in solution. Expanding the reach of the rational design we demonstrated the design and assembly of SB24, an irregular concave octahedral topology, so far not implemented at the molecular scale, crafted from 24 coiled-coil modules, which marks the first successful design from three pre-organized chains as well as the largest de novo 3D CC-based assembly to date.
In both cases (DiTET-cc and SB24-ncn), SAXS analysis confirmed with a high degree of certainty that the molecular shape agreed with the design. For the two-chain tetrahedron, the shape was further confirmed through ab initio reconstruction based on the scattering profiles. Attempts to determine the high-resolution structure of both assemblies by either X-ray crystallography or cryo-electron microscopy have not been successful, due to the intrinsic flexibility of the assemblies, which lack a compact core. Notably, SAXS was able to identify the misfolded assemblies despite their correct stoichiometry. It helped us to understand how the flexibility within the subunit of the complex SB24-nnn caused a partial collapse, and how replacing the flexible polypeptide chain with a cyclic chain reduced the flexibility of the CC segments at the termini. This approach enabled the construction of a CCPO architecture that assumes a defined asymmetric molecular envelope that would otherwise be unattainable.
Interestingly, ITC results revealed a slightly lower affinity between cyclized subunits than between linear ones. This discrepancy may be due to strain in the interacting cyclized chains. A likely explanation is that the collapse of linear chains, characterized by a higher conformational entropy, added a non-specific contribution to the interaction energy, which, however, came at the expense of misfolding.
Intein-mediated cyclization enhanced the structural stability and definition of protein–protein interaction surfaces. This method of subunit stabilization, utilizing highly efficient split intein splicing during bacterial production, requires no additional steps in polypeptide subunit isolation.
Recently, computational techniques based on machine learning, which benefit from the extensive 3D structural information on natural proteins collected to date,36,38,85 have significantly improved the design and prediction of globular natural-like proteins. In contrast, the reliable design of modular assemblies such as CCPO remains challenging due to their reliance on long-range topological contacts and CC dimer orthogonality.5,10,86 Interestingly, algorithms such as AlphaFold2 have not been able to predict the structure of CCPOs.77
The rigid nature of the triangular building blocks used here, combined with the rational arrangement of interacting CC segments, allows in principle for the design of larger multimeric CC-based architectures. Multimeric CCPO shapes could serve as scaffolds for the encapsulation of small cargo molecules, as the cavity of these structures is designable in shape and size and, most importantly, unlike cavities in virus-like particles, is not bound to the specific symmetry of the assembly and could harbor molecules with an asymmetric shape. Moreover, the cavity of SB24 is notably larger than those of previous CC-based designs, e.g. the monomeric tetrahedron19 and the dimeric bipyramid.35
The preorganization strategy for macromolecular self-assembly can be used to assemble asymmetric, yet precisely defined nanostructures in which each vertex or edge of the polyhedral structure could be individually addressed to provide a defined geometric arrangement of selected functionalities, e.g. presentation of antigens for the stimulation of an immune response, activation of cellular receptors, or positioning of catalytic centers or binding sites. Additionally, such CC complexes could be engineered to undergo a reversible conformational change in response to chemical signals such as metal ions,34,88 pH89 or biological molecules,90 all cues already used to regulate CC assemblies.
Here, we demonstrated an innovative approach to protein assembly that integrates cyclic subunits with intramolecular CC complementarity to overcome flexibility challenges. This strategy enabled the design of larger CC-based structures, marking a substantial advancement in the field. Split intein cyclization not only enhances the predictability in designing interaction interfaces but also stands out for its simplicity of implementation.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3sc06658d
‡ University of Nova Gorica, Vipavska 13, SI-5000 Nova Gorica, Slovenia
This journal is © The Royal Society of Chemistry 2024