DOI: 10.1039/D4PY00907J (Paper)
Polym. Chem., 2024, 15, 4864–4874
Sequence-defined structural transitions by calcium-responsive proteins†
Received 20th August 2024, Accepted 6th November 2024
First published on 8th November 2024
Abstract
Biopolymer sequences dictate their functions, and protein-based polymers are a promising platform to establish sequence–function relationships for novel biopolymers. To efficiently explore vast sequence spaces of natural proteins, sequence repetition is a common strategy to tune and amplify specific functions. This strategy is applied to repeats-in-toxin (RTX) proteins with calcium-responsive folding behavior, which stems from tandem repeats of the nonapeptide GGXGXDXUX in which X can be any amino acid and U is a hydrophobic amino acid. To determine the functional range of this nonapeptide, we modified a naturally occurring RTX protein that forms β-roll structures in the presence of calcium. Sequence modifications focused on calcium-binding turns within the repetitive region, including either global substitution of nonconserved residues or complete replacement with tandem repeats of a consensus nonapeptide GGAGXDTLY. Some sequence modifications disrupted the typical transition from intrinsically disordered random coils to folded β rolls, despite conservation of the underlying nonapeptide sequence. Proteins enriched with smaller, hydrophobic amino acids adopted secondary structures in the absence of calcium and underwent structural rearrangements in calcium-rich environments. In contrast, proteins with bulkier, hydrophilic amino acids maintained intrinsic disorder in the absence of calcium. These results indicate a significant role of nonconserved amino acids in calcium-responsive folding, thereby revealing a strategy to leverage sequences in the design of tunable, calcium-responsive biopolymers.
    
      Introduction
      Defining the sequence of a polymer is a powerful approach for tuning intramolecular conformations, intermolecular interactions, and material properties.1–3 Sequence-defined polymers have enhanced control over self-assembled structures,4–6 molecular recognition,7–9 and stimuli-responsive functions.10–12 As synthetic strategies for sequence-defined polymers continue to improve, tradeoffs emerge between exhaustive or efficient exploration of expansive design spaces.13–16 Such tradeoffs are mitigated by evolutionary processes in biological systems, in which genetic drifts and selective pressures can produce diverse traits. Natural macromolecules that have evolved to carry out specific functions are promising platforms to evaluate the level of sequence definition required to design functional polymers.
      A function of recent interest is the calcium-responsive folding of bacterial proteins, which critically enable cells to secrete pathogens,17–19 assemble pore-forming toxins,20,21 and crystallize cell-protective surface layers.22,23 Calcium responsiveness emerges from conserved, repetitive protein sequences. Specifically, the proteins contain tandem repeats of the “consensus” nonapeptide (GGXGXDXUX)n, where X can be any amino acid, U is an aliphatic amino acid, and n is the number of tandem repeats. The consensus nonapeptide is identified by aligning related protein amino acid sequences. When the same amino acid occurs at a given position with a high frequency, the residue is considered conserved (G, glycine and D, aspartic acid in the nonapeptide). When many different amino acids occur at a given position, the residue is considered nonconserved (X in the nonapeptide). The consensus nonapeptide is historically named the Repeats-in-Toxin (RTX) motif, but not all RTX-containing proteins are cytotoxic. In the absence of calcium ions, RTX regions adopt intrinsically disordered conformations.24 In the presence of calcium ions, RTX regions form β-roll structures that consist of parallel β-sheets connected by calcium-binding turns (Fig. 1A and B).25,26
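The consensus pattern described above can be written as a regular expression. The sketch below is illustrative only: the residue set assigned to U (A, V, I, L), the restriction of X to the 20 standard amino acids, and the helper names are assumptions rather than definitions taken from this work.

```python
import re

# Assumptions for illustration: X is any of the 20 standard amino acids,
# and U (aliphatic) is taken to be A, V, I, or L.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALIPHATIC = "AVIL"

X = f"[{AMINO_ACIDS}]"
U = f"[{ALIPHATIC}]"
RTX_MOTIF = re.compile(f"GG{X}G{X}D{X}{U}{X}")  # GGXGXDXUX


def is_rtx_repeat(nonapeptide: str) -> bool:
    """Return True if a 9-residue string matches GGXGXDXUX."""
    return RTX_MOTIF.fullmatch(nonapeptide) is not None


def count_tandem_repeats(sequence: str) -> int:
    """Length of the longest run of back-to-back motif matches (the n in (GGXGXDXUX)n)."""
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Step through the sequence nine residues at a time while the motif holds.
        while is_rtx_repeat(sequence[pos:pos + 9]):
            run += 1
            pos += 9
        best = max(best, run)
    return best
```

For example, `count_tandem_repeats("GGAGNDTLY" * 9)` counts the nine repeats of a fully specified nonapeptide. Natural RTX domains may deviate from strict register, so a scan of this kind is only a first-pass check.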
Fig. 1  RTX sequence variants were designed to screen the importance of sequence conservation, residue size, hydrophobicity, and electrostatics in calcium-responsive folding. (A) The RTX protein comprises an N-terminal domain (top, red) that is highly repetitive and a C-terminal capping domain (bottom, black) that initiates folding in response to calcium ions (yellow). (B) Top-down view of calcium-binding turns connected by β-sheets. The N-terminal domain is characterized by the repeat sequence GGXGXDXUX, where X indicates a variable amino acid and U is an aliphatic amino acid (PDB: 5CVW, ref. 18); structures were generated in PyMOL.49 (C) Primary sequence of CyaA Block V (wild type) and the C-terminal capping domain prior to mutation (left). Blue residues indicate positions selected for substitution. In global substitution variants (middle), all blue residues were replaced with the same amino acid. In consensus repeat variants (right), the entire N-terminal domain was replaced with 9 tandem repeats of the consensus sequence GGAGXDTLY.37 The C-terminal domain (black) was preserved in all sequence variants. Expressed proteins carried additional residues from the directional cloning strategy (gray), as well as a 6× His tag for purification (purple).
The calcium-responsive folding of the RTX motif inspired recent technological advances beyond the context of bacteria.27,28 These advances leveraged reversible changes in protein size and surface chemistry upon the introduction of calcium. RTX domains enabled switchable mesh sizes in protein networks,29,30 calcium-induced crosslinking of protein-based hydrogels,31–34 regulation of biomolecular recognition,35,36 column-free purification of recombinant proteins,37,38 and selective binding of lanthanide ions.39,40 While RTX-based technologies are promising, a potential limitation is the need for relatively high calcium concentrations (1–100 mM) to initiate folding.41 This need reflects the origin of the RTX motif, which folds in response to calcium concentrations that are relevant for bacteria. RTX proteins must remain disordered in intracellular environments, where calcium concentrations are less than 100 nM. Folding is only initiated upon translocation and secretion into extracellular environments, where calcium concentrations range from 10 μM to >10 mM.42,43
      The calcium binding affinities of RTX domains are sensitive to sequences, despite the conserved pattern underlying the nonapeptide repeats. This sensitivity is apparent in the well-studied adenylate cyclase toxin (CyaA) of Bordetella pertussis, which contains 40 RTX nonapeptide repeats that form five distinct blocks.26 The CyaA blocks are denoted with Roman numerals I to V, running from the N-terminus to the C-terminus. The fifth block of CyaA—hereafter denoted as “Block V”—binds most strongly to calcium ions. Calcium-responsive folding proceeds successively from the C-terminus to the N-terminus due to weaker affinities of Blocks IV, III, II, and I.19,44,45 Block V consists of nine tandem repeats of the RTX consensus nonapeptide and is flanked by a C-terminal capping domain (Fig. 1A). The capping domain initiates folding upon secretion through the type I secretion system in Gram-negative bacterial cells. Truncation or removal of the capping domain disrupts calcium-responsive folding, which can be recovered by entropic stabilization of the C-terminus.46,47 The importance of sequence patterning in Block V was demonstrated by rearranging the order of nonapeptide repeats, which reduced calcium binding affinities.48 This finding suggests that the consensus nonapeptide does not fully describe the requirements for calcium-responsive folding of RTX proteins.
      In this work, we modified the sequence of Block V to compare the roles of amino acid size, electrostatic interactions, hydrophobicity, and sequence repetition on the calcium-responsive folding of RTX proteins. We leveraged recombinant protein engineering to generate twelve sequence variants and systematically evaluate sequence-dependent secondary structures in the absence and presence of calcium. Many sequence variants formed secondary structures in the absence of calcium, in contrast to the intrinsically disordered Block V. Generally, sequence variants exhibited weaker calcium-responsive folding than Block V. Sequence variants that maintained disordered conformations in the absence of calcium underwent weaker folding transitions, revealing the importance of residue size and hydrophobicity in frustrating secondary structure formation. Sequence variants that adopted secondary structures in the absence of calcium underwent calcium-responsive structural rearrangements, revealing unexpected transitions between helical and sheet-like structures. The consistent calcium-bound structures of highly repetitive sequence variants suggest the importance of nonconserved residues in the final folded state of RTX proteins.
    
    
      Experimental
      
        Design of RTX sequence variants
        To determine the role of sequences in calcium-responsive folding, we produced twelve RTX sequence variants with modifications of the repeat domain Block V (Fig. 1C). All sequence variants preserved the native C-terminal capping domain to stabilize calcium-bound structures.46,47 One subset of sequence variants included global substitutions of the nonconserved residue X in the fifth position of the nonapeptide GGXGXDXUX, which was selected for its proximity to the highly conserved aspartic acid residue in the calcium-binding turn. Global substitution variants replaced nine residues throughout the Block V sequence with a single amino acid. The fifth position of each nonapeptide was globally replaced with alanine, histidine, serine, asparagine, aspartic acid, or glutamic acid—these options include the five amino acids that occur naturally in these positions throughout Block V, as well as glutamic acid for its chemical similarity to aspartic acid and potential to interact with calcium ions. Another subset of sequence variants replaced Block V with minimal consensus sequences GGAGXDTLY, which were derived from the most common amino acids in a set of RTX-containing proteins.37 The minimal consensus sequences differ from the consensus nonapeptide GGXGXDXUX by fully specifying all nine residues in the sequence. Consensus repeat variants included nine tandem repeats of each minimal consensus sequence to match the size of Block V. For each of the six consensus repeat variants, the fifth position X included one of the same six amino acids as the global substitution variants: alanine, histidine, serine, asparagine, aspartic acid, or glutamic acid. Complete amino acid sequences and DNA sequences are included in the ESI.†
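The two design strategies can be sketched in a few lines of code. This is a minimal illustration, not the design pipeline used here: the function names are hypothetical, the global substitution helper assumes a repeat domain that is exactly in nine-residue register, and the capping domain, cloning residues, and 6× His tag are omitted. The actual sequences are given in the ESI.

```python
# Minimal consensus nonapeptide with X at the fifth position (from the text).
CONSENSUS_TEMPLATE = "GGAGXDTLY"
N_REPEATS = 9  # matches the number of repeats in Block V

# One-letter codes for the six residues placed at the fifth position.
SUBSTITUTIONS = {
    "alanine": "A",
    "histidine": "H",
    "serine": "S",
    "asparagine": "N",
    "aspartic acid": "D",
    "glutamic acid": "E",
}


def consensus_repeat_domain(residue: str, n_repeats: int = N_REPEATS) -> str:
    """Tandem repeats of GGAGXDTLY with X fixed to one residue."""
    return CONSENSUS_TEMPLATE.replace("X", residue) * n_repeats


def substitute_fifth_position(domain: str, residue: str) -> str:
    """Replace the fifth residue of every nonapeptide in a register-aligned domain.

    Assumption: the domain length is a multiple of nine and the repeats are
    in register, which may not hold for a natural sequence.
    """
    if len(domain) % 9 != 0:
        raise ValueError("domain must consist of whole nonapeptides")
    chars = list(domain)
    for i in range(4, len(chars), 9):  # index 4 is the fifth position
        chars[i] = residue
    return "".join(chars)


# Repeat domains for all six consensus repeat variants.
consensus_variants = {
    name: consensus_repeat_domain(code) for name, code in SUBSTITUTIONS.items()
}
```

Each entry of `consensus_variants` is an 81-residue repeat domain, the same number of nonapeptide positions as nine repeats of Block V.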
      
      
        Cloning
        Genes encoding Block V and its 12 sequence variants were produced using directional cloning. Genes for each RTX sequence variant were flanked with restriction sites for directional cloning, codon optimized for Escherichia coli with scrambling to suppress recombination of repetitive regions,50 and purchased as gene fragments (Twist Bioscience). Genes were subcloned into pQE-9 using BamHI and HindIII restriction sites. All cloning was performed in NEB 5-alpha E. coli (New England Biolabs), which were prepared as chemically competent cells using Mix & Go transformation kits (Zymo Research). Plasmid DNA was purified by miniprep (ZymoPURE) to screen successful cloning through analytical digests at XbaI and SacI restriction sites prior to Sanger sequencing of the inserted region (GENEWIZ). All plasmids are available for use from Addgene.
      
      
        Protein expression
        RTX sequence variants were produced using recombinant protein expression in E. coli.51 All expression strains were purchased from New England Biolabs and prepared as chemically competent cells using Mix & Go transformation kits (Zymo Research). Most sequence variants were expressed in T7 Express lysY/Iq (NEB), with the exceptions of Block V in T7 Express, alanine global substitution in BL21(DE3), alanine, aspartic acid, and glutamic acid consensus repeats in BL21, and histidine and serine consensus repeats in NEBExpress Iq. Proteins were expressed by inoculating 10 mL of freshly grown overnight culture into 1 L LB media supplemented with 100 μg mL−1 ampicillin. Cultures were incubated at 37 °C until reaching an optical density at 600 nm between 0.8 and 1.0. Expression was induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG), and expression proceeded for 6 hours at 37 °C. The cells were harvested by centrifugation at 4000 rpm for 10 minutes. Pelleted cells were resuspended in 25 mL of denaturing lysis buffer (100 mM sodium phosphate, 10 mM Tris, 8 M urea, pH 8.0) and stored at −80 °C. To improve yield, lysis buffers for some expressions were supplemented with 1.0 M NaCl.52
      
      
        Protein recovery, purification, and validation
        Expressed proteins were recovered from cell pellets and then isolated in three steps: immobilized metal affinity chromatography to capture the 6× His-tagged proteins of interest, dialysis to remove excess ions, and lyophilization to remove water. To aid defrosting of cell pellets, an additional 25 mL lysis buffer supplemented with 20 mM imidazole was added prior to lysis by sonication. Crude lysates were clarified by centrifugation (8000 rpm for 1 hour) and filtration (0.45 μm). 6× His-tagged proteins were isolated using immobilized metal affinity chromatography, in which clarified lysates were incubated with HisPur™ Ni-NTA resin (ThermoScientific) for 2 hours at ambient temperature. Protein-bound resins were washed with lysis buffer supplemented with 10 mM to 25 mM imidazole prior to elution in lysis buffer supplemented with 250 mM imidazole. Eluted fractions were dialyzed against a chelating buffer (10 mM Tris, 1 mM EGTA, 50 mM NaCl, pH 8.0, 3 exchanges) and ultrapure water (18.2 MΩ cm, MilliQ, 7 exchanges). Water was removed by lyophilization,36 and purified proteins were stored at −20 °C. Typical protein expression yields ranged from 14 to 130 mg per 1 L culture. Protein purity was assessed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (Fig. 2A and Fig. S1†). Protein identity was confirmed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS, Bruker Microflex LRF) by comparing the measured molar mass to the expected molar mass based on the amino acid sequence (Table S1, Fig. S2–S14†).
Fig. 2  Synthesized RTX sequence variants adopted diverse secondary structures in the absence of calcium. (A) Recombinant protein expression in E. coli was tolerant to all designed mutations, as demonstrated by sodium dodecyl sulfate polyacrylamide gel electrophoresis of Ni-NTA purified RTX sequence variants (12% polyacrylamide, 200 V, 45 minutes). (B) Secondary structures emerged in the CD spectra of global substitution variants with alanine, serine, and aspartic acid. Histidine, asparagine, and glutamic acid variants were disordered and resembled Block V, as indicated by a negative peak in molar residue ellipticity (MRE) at 200 nm. (C) Secondary structures emerged in the CD spectra of consensus repeat variants with alanine, histidine, aspartic acid, and glutamic acid. Serine and asparagine variants were disordered and resembled Block V. Replicate spectra for all sequence variants are included in Fig. S15–S27.†
Circular dichroism (CD) spectroscopy
        Sequence-dependent and calcium-responsive structural changes were measured using CD spectroscopy. Lyophilized proteins were resuspended in 50 mM Tris (pH 7.5) supplemented with up to 100 mM CaCl2 at final protein concentrations between 5 μM and 10 μM. Concentrations were measured after filtration (0.2 μm polyethersulfone membrane) using UV-vis spectroscopy (see the ESI† for details). Triplicate CD experiments were conducted using a Jasco J-815 spectropolarimeter (Fig. S15–S27†). Samples were measured 5 to 15 minutes after mixing with CaCl2 (Fig. S28 and S29†). Samples were loaded into a 1 mm pathlength cuvette (Hellma) and held at 20 °C. Scans were performed from 250 nm to 190 nm with a step of 0.2 nm and an integration time of 2 s. Each spectrum was averaged over 10 scans, and triplicate solutions were measured for each sequence variant. All spectra were corrected by background subtraction of 50 mM Tris (pH 7.5) with the corresponding concentration of CaCl2.
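Background-subtracted CD signals are commonly converted to molar residue ellipticity (MRE) so that proteins of different length and concentration can be compared. The sketch below shows one standard form of that conversion. The procedure used in this work is described in the ESI, so the function names, the choice of normalizing by the number of residues (rather than the number of peptide bonds), and the example numbers are assumptions.

```python
def subtract_background(sample_mdeg, buffer_mdeg):
    """Point-by-point subtraction of a buffer spectrum from a sample spectrum (mdeg)."""
    if len(sample_mdeg) != len(buffer_mdeg):
        raise ValueError("spectra must share the same wavelength grid")
    return [s - b for s, b in zip(sample_mdeg, buffer_mdeg)]


def molar_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert ellipticity in mdeg to MRE in deg cm^2 dmol^-1.

    MRE = theta / (10 * c * l * N), with c in mol/L, l in cm, and N the
    number of residues (assumed normalization).
    """
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical example: -10 mdeg for a 100-residue protein at 10 uM
# in a 1 mm (0.1 cm) cuvette.
example_mre = molar_residue_ellipticity(-10.0, 10e-6, 0.1, 100)
```

With these placeholder inputs the denominator is 10 × 10⁻⁵ × 0.1 × 100 = 10⁻³, so a signal of −10 mdeg corresponds to an MRE of about −10 000 deg cm² dmol⁻¹.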
      
    
    
      Results and discussion
      
        Emergence of structures in RTX sequence variants without calcium
        In the absence of calcium ions, RTX proteins are typically disordered; however, several RTX sequence variants formed secondary structures that were characterized using CD spectroscopy (Fig. 2B). In Block V, the disorder was indicated by a prominent negative peak at 200 nm, consistent with random coil conformations.24 This peak persisted in global substitution variants with histidine, asparagine, and glutamic acid, which have bulky side chains that promote disorder. In the Block V sequence, these three amino acids each appear multiple times in the nonconserved position of interest. Interestingly, the asparagine variant was nearly indistinguishable from Block V. The reduced intensity of the peak in the histidine variant was attributed to UV absorption by aromatic side chains.53
        In contrast, global substitution variants with alanine, serine, and aspartic acid formed more ordered secondary structures without calcium, suggesting influences of amino acid size and electrostatic interactions. For these variants, the negative peak at 200 nm was replaced by a lower intensity negative peak between 205 nm and 208 nm, and a broad feature from 215 nm to 230 nm appeared. These spectral features suggested the formation of helical structures, which are commonly associated with negative peaks at 208 nm and 222 nm.54 The relative helical content between these variants contradicted the typical helix propensities of alanine (highest), serine (moderate), and aspartic acid (low), indicating the possible role of electrostatic stabilization in secondary structure formation.55–57
        Among the consensus repeat variants, polar variants mimicked the random coil conformations of Block V without calcium, whereas hydrophobic and charged variants formed more ordered secondary structures (Fig. 2C). Consensus repeat variants with serine and asparagine formed disordered structures that most resembled Block V, similar to the global substitution asparagine variant. These disordered structures indicate that polar uncharged residues promote random coil conformations in the absence of calcium. Meanwhile, hydrophobic consensus repeat variants with alanine and histidine adopted similar structures to the global substitution variant with alanine, namely a low intensity negative peak between 205 nm and 208 nm and a broad feature from 215 nm to 230 nm. Consensus repeat variants with aspartic acid and glutamic acid produced a broad negative feature from 210 nm to 230 nm, suggesting both helical and β-sheet characteristics. The diverse secondary structures formed by RTX sequence variants suggest an interplay between steric, hydrophilic, and electrostatic contributions to promote random coil conformations in the absence of calcium.
      
      
        Global substitution variants alter and weaken calcium-responsive folding
        CD spectra of Block V revealed a calcium-dependent structural transition, consistent with prior reports of RTX proteins (Fig. 3A).24,26,41,46 Below 0.5 mM CaCl2, Block V adopted a random coil conformation indicated by the negative peak at 200 nm. Above 0.5 mM CaCl2, Block V formed β-sheet structures indicated by the appearance of a negative peak at 218 nm and the disappearance of the negative peak at 200 nm. Deconvolution of CD spectra taken at 0 mM and 100 mM CaCl2 revealed an increase in the sheet content from 19.1% to 27.9% upon the addition of calcium ions (Fig. 3B), which is denoted as a 46% relative increase in sheet content. Block V also produced a 59% relative increase in turn content, consistent with the formation of a β-roll structure, characteristic of RTX proteins. A modest 17% relative increase in helical content was attributed to the folding of the capping domain. Spectral deconvolution was performed from 200 nm to 250 nm with CDPro software using the reference set SPD48, which is the largest available reference set that includes denatured proteins.58 The results from CDSSTR, CONTIN/LL, and SELCON3 methods were normalized and averaged to facilitate quantitative comparisons (Table S2†).
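The relative changes quoted above follow from simple arithmetic on the deconvolved fractions. The sketch below reproduces the 46% figure for sheet content and shows one plausible reading of "normalized and averaged": each method's fractions are rescaled to sum to one and then averaged per structure class. That reading, the function names, and the dictionary layout are assumptions, since the exact procedure is not spelled out in the text.

```python
def relative_change(before: float, after: float) -> float:
    """Fractional change relative to the starting value."""
    return (after - before) / before


def normalize(fractions: dict) -> dict:
    """Rescale secondary structure fractions so that they sum to 1."""
    total = sum(fractions.values())
    return {key: value / total for key, value in fractions.items()}


def average_methods(results: list) -> dict:
    """Average normalized fractions across deconvolution methods.

    `results` holds one dict per method (for example CDSSTR, CONTIN/LL,
    SELCON3), all with the same structure classes as keys.
    """
    normalized = [normalize(r) for r in results]
    keys = normalized[0].keys()
    return {k: sum(r[k] for r in normalized) / len(normalized) for k in keys}


# Sheet content of Block V at 0 mM and 100 mM CaCl2 (values from the text).
sheet_increase = relative_change(19.1, 27.9)
print(f"relative increase in sheet content: {sheet_increase:.0%}")
```

The same `relative_change` function applies to the turn, helix, and unstructured contents reported in the figures.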
Fig. 3  Block V formed β-roll structures in the presence of calcium ions. (A) CD spectroscopy of Block V revealed a transition from disordered random coils to β-roll structures between 0.5 mM and 1.0 mM CaCl2. Replicate spectra are included in Fig. S15.† (B) Spectral deconvolution quantified a structural transition at 1.0 mM CaCl2. This transition produced an increase in sheet, helix, and turn contents and a corresponding decrease in the unstructured content.
Many of the global substitution variants underwent calcium-responsive structural changes that were not characteristic of β-roll formation (Fig. 4A, top row). These unexpected structural rearrangements required higher calcium concentrations than those of Block V. For alanine, serine, and aspartic acid variants, the addition of 10 mM CaCl2 corresponded to the disappearance of the negative peak between 205 nm and 208 nm. The three variants produced distinct changes in the broad feature from 215 nm to 230 nm, such that the feature was enhanced for the alanine variant, relatively constant for the serine variant, and reduced for the aspartic acid variant. For all three variants, spectral deconvolution indicated a ≤10% relative increase in the sheet content from 0 mM to 100 mM CaCl2 (Fig. 4B).
Fig. 4  Calcium-responsive folding of global substitution variants was weaker than that of Block V. (A) Asparagine and glutamic acid variants formed the most similar calcium-responsive structures to Block V, whereas alanine, serine, histidine, and aspartic acid variants underwent qualitatively different structural changes. Replicate spectra of all global substitution variants are included in Fig. S16–S21.† (B) CD spectral deconvolution at 0 mM and 100 mM CaCl2 revealed that the greatest secondary structure changes in response to calcium occurred in variants that were most disordered without calcium.
Disordered global substitution variants with histidine, asparagine, and glutamic acid formed β-roll structures upon addition of sufficient calcium chloride (Fig. 4A, bottom row). For the histidine variant, addition of 5 mM CaCl2 corresponded to the disappearance of the negative peak at 200 nm and the appearance of a low-intensity negative peak at 225 nm. Spectral deconvolution indicated a 35% and 48% relative increase in sheet and turn contents between 0 mM and 100 mM CaCl2 (Fig. 4B), suggesting that UV absorbance by histidine obscured the typical signatures of β-roll formation. For the asparagine variant, a sharp transition near 1.0 mM CaCl2 resembled the Block V transition at 0.5 mM CaCl2. For the glutamic acid variant, folding occurred gradually from 5 mM to 100 mM CaCl2.
        The global substitution variants with asparagine and glutamic acid demonstrated weaker calcium affinity and reduced cooperativity compared to Block V (Table 1). To compare the calcium responsiveness of RTX sequence variants to Block V, the Hill–Langmuir equation was used to fit the fraction of proteins bound by calcium ions θ with respect to the total calcium concentration [Ca2+]:59,60
$$\theta = \frac{[\mathrm{Ca}^{2+}]^{n}}{K_{\mathrm{D}}^{n} + [\mathrm{Ca}^{2+}]^{n}}$$
where the Hill coefficient n describes the cooperativity of ligand binding, and the half-saturation dissociation constant KD indicates the calcium concentration at which half of protein binding sites are occupied. θ was calculated by normalizing the molar residue ellipticity at 218 nm, with the maximum absolute intensity corresponding to complete binding and β-roll formation. The asparagine variant resembled Block V, with similar KD values and positively cooperative binding (n > 1). However, cooperative binding was weaker for the asparagine variant than for Block V. The glutamic acid variant exhibited an order-of-magnitude weaker response to calcium, which may result from its noncooperative binding. This analysis was limited to the asparagine and glutamic acid variants, which maintained disordered structures in the absence of calcium and produced the characteristic β-roll signature at 218 nm. The remaining global substitution variants exhibited higher fractions of β sheets in calcium-free structures, which resulted in weaker spectral changes at 218 nm. Weaker signals prevented reliable quantification of the bound fraction θ.
        
Table 1 Global substitutions of Block V reduce binding to calcium
| Variant | KD (mM) | n |
| --- | --- | --- |
| Block V | 0.67 ± 0.08 | 4.1 ± 1.3 |
| Asparagine | 1.0 ± 0.8 | 1.8 ± 0.7 |
| Glutamic acid | 11 ± 3 | 1.0 ± 0.3 |
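To see what the fitted parameters imply, the bound fraction can be evaluated with the standard half-saturation form of the Hill–Langmuir equation, θ = [Ca2+]^n / (KD^n + [Ca2+]^n). The sketch below uses the central values from Table 1 and ignores their uncertainties. The function name and the choice of test concentrations are assumptions for illustration, and no fitting is performed.

```python
def hill_fraction_bound(ca_mM: float, kd_mM: float, n: float) -> float:
    """Fraction of protein bound at a total calcium concentration (mM)."""
    if ca_mM <= 0:
        return 0.0
    return ca_mM ** n / (kd_mM ** n + ca_mM ** n)


# Central values of KD (mM) and n from Table 1 (uncertainties omitted).
TABLE_1 = {
    "Block V": (0.67, 4.1),
    "Asparagine": (1.0, 1.8),
    "Glutamic acid": (11.0, 1.0),
}

# Bound fraction at a few calcium concentrations (mM) chosen for illustration.
for variant, (kd, n) in TABLE_1.items():
    fractions = [hill_fraction_bound(ca, kd, n) for ca in (0.5, 1.0, 10.0, 100.0)]
    print(variant, [round(f, 2) for f in fractions])
```

By construction θ equals 0.5 when the calcium concentration equals KD. A larger n steepens the transition around that point, which is why the fitted curve for Block V rises over a narrower concentration range than that of the glutamic acid variant.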
              
            
        The weaker calcium-responsive folding of global substitution variants relative to Block V emphasizes the importance of sequence evolution in natural RTX proteins. The consensus RTX sequence GGXGXDXUX highlights some necessary features for RTX proteins to function, such as glycine for flexibility in the calcium-binding turn, aspartic acid to stabilize electrostatic interactions with divalent cations, and aliphatic residues to form the characteristic β-roll structure.25 However, these features alone were not sufficient to facilitate calcium-responsive folding of RTX sequence variants. Sequence variants that formed secondary structures in the absence of calcium suggest that the fifth residue of GGXGXDXUX plays a role in frustrating protein folding. Frustrated proteins that adopt random coil conformations may sample a broader folding energy landscape that promotes ion-driven folding, whereas proteins with less frustration may fold prematurely into conformations with less favorable ionic interactions.61,62 This contrast is best highlighted by comparing the variants with serine and asparagine, which are both polar residues. The smaller serine residue stabilized secondary structures in the absence of calcium, while the larger asparagine residue promoted random coil conformations. This subtle difference in the residue structure led to drastically different calcium-responsive structural changes between these variants. A similar contrast emerges when comparing the variants with aspartic acid and glutamic acid, which have identical net charges but different side chain lengths. The smaller aspartic acid residue promoted electrostatic stabilization of secondary structures, whereas the bulkier glutamic acid residue promoted frustration and random coil conformations. These differences are consistent with reports that intrinsically disordered proteins are enriched in glutamic acid but not in aspartic acid.63–65 In other disordered proteins, aspartates support extended structures and form helical caps. In the variant with aspartic acid, helical conformations are attributed to the shorter side chain of aspartic acid, which can form hydrogen bonds with the peptide backbone. In Block V, strong cooperative binding likely results from a mix of both stabilizing and frustrating residues throughout the repeat domain.
      
      
        Consensus repeat variants undergo structural rearrangements to form consistent calcium-bound structures
        Despite a range of secondary structures in the absence of calcium, all consensus repeat variants adopted similar secondary structures in the presence of 100 mM CaCl2 (Fig. 5A). CD spectra showed a monotonic decrease in ellipticity from 200 nm to 220 nm to produce strong negative peaks near 225 nm (red curves). Spectral deconvolution revealed similar structural components, with calcium-bound structures demonstrating less variation than calcium-free structures (Fig. 5B). Calcium-bound consensus repeat variants also had 10–23% higher sheet content, in relative terms, than calcium-bound Block V.
Fig. 5  Consensus repeat variants formed consistent calcium-bound structures. (A) All consensus repeat variants produced similar circular dichroism spectra at 100 mM CaCl2 (red curves), which were characterized by a monotonic decrease from 200 nm to 220 nm and a broad negative peak near 225 nm. Replicate spectra of all consensus repeat variants are included in Fig. S22–S27.† (B) CD spectral deconvolution at 0 mM and 100 mM CaCl2 revealed a structural variation among consensus repeat variants in the absence of calcium, in contrast to quantitatively similar structures in the presence of 100 mM CaCl2.
Some consensus repeat variants underwent conformational changes from random coils to β-roll structures, whereas others underwent calcium-responsive structural rearrangements. In the absence of calcium, consensus repeat variants with serine and asparagine maintained the most disorder. The serine and asparagine variants underwent characteristic RTX folding transitions, respectively showing 26% and 30% relative decreases in the unstructured content between 0 mM and 100 mM CaCl2. Meanwhile, alanine, histidine, aspartic acid, and glutamic acid variants revealed unexpected structural transitions in response to calcium. For these variants, an increase in the sheet content was associated with a decrease in the helical content, resulting in the unstructured content remaining similar in the absence and presence of calcium for the alanine, histidine, and aspartic acid variants. The glutamic acid variant produced a 20% relative increase in unstructured content between 0 mM and 100 mM CaCl2. Interestingly, these calcium-responsive changes in secondary structure revealed transitions between helical and sheet-like structures that are unlike the conformational changes from random coils to β-rolls by Block V.
        Like global substitution variants, consensus repeat variants demonstrated weaker sensitivity to calcium compared to Block V. The consensus repeat variant with asparagine retained the greatest sensitivity, with conformational changes occurring between 1.0 mM and 3.0 mM CaCl2. For serine and histidine variants, structural transitions occurred gradually between 3.0 mM and 100 mM CaCl2. The alanine, aspartic acid, and glutamic acid variants exhibited the weakest calcium sensitivities, with structural transitions occurring between 10 mM and 100 mM CaCl2. The reduced calcium sensitivity of all sequence variants in this work suggests that nonconserved residues and sequence patterns are necessary to maintain the calcium sensitivity of Block V.
      
      
        General conclusions
        There remains much to learn from nature's design rules for calcium-responsive protein folding. To probe sequence effects, we modified the repetitive region of Block V—a naturally occurring RTX protein domain that binds to calcium by folding into a parallel β-roll. Global substitution variants altered the size, charge, and hydrophobicity of nonconserved residues in the calcium-binding turns of Block V, and consensus repeat variants replaced the repetitive region of Block V with tandem repeats of the nonapeptide GGAGXDTLY. All sequence mutations were tolerated during recombinant protein expression, which accelerates the rapid and accurate production of sequence-defined biopolymers.
        Despite changes to nonconserved residues, RTX variants adopted diverse, sequence-dependent secondary structures ranging from random coil conformations resembling Block V to more helical structures. In the global substitution variants, random coil conformations were achieved by the largest residues: histidine, asparagine, and glutamic acid. Unanticipated helical structures were observed for the global substitution variants with the smallest residues, alanine and serine. Residue size effects were further emphasized by unexpected helical structures formed by variants with aspartic acid, which contrasted with the random coil conformations of variants with glutamic acid. For the consensus repeat variants, those with the hydrophilic residues serine and asparagine most resembled Block V in the absence of calcium. In the nonconserved position of interest, bulkier and hydrophilic residues tended to frustrate protein folding, enabling the protein to maintain a disordered structure in the absence of calcium.
        RTX sequence variants that preserved intrinsic disorder in the absence of calcium underwent calcium-responsive folding transitions associated with β-roll formation. β-roll structures emerged for global substitution variants with histidine, asparagine, and glutamic acid, although each showed weaker calcium affinity and cooperativity than Block V. Consensus repeat variants with the polar residues serine and asparagine also underwent calcium-responsive folding. In contrast, sequence variants that adopted secondary structures in the absence of calcium underwent calcium-responsive structural rearrangements, in which an increase in the sheet content was offset by a decrease in the helical content. These transitions appear unlike the characteristic folding of random coils into β rolls by natural RTX proteins. Moreover, consensus repeat variants adopted different final structures from Block V, specifically with a higher sheet content in the presence of 100 mM CaCl2.
        Overall, our results highlight the versatility of recombinant protein engineering to map the sequence–function relationships of biopolymers. We establish the importance of the size and hydrophobicity of nonconserved residues in the RTX nonapeptide GGXGXDXUX. Asparagine strikes a particular balance between size and hydrophilic character, demonstrating the greatest calcium sensitivity within both the global substitution and consensus repeat variant sets. We anticipate that these insights will advance the use of RTX proteins as tunable, ion-responsive components of protein-based biomaterials and biotechnologies.
      
    
    
      Author contributions
      M.P.C. and D.J.M. conceptualized the study and designed experiments. M.P.C., W.H., G.M.S., and K.M.H. conducted molecular cloning, expressed and purified recombinant proteins, and validated the expressed proteins. M.P.C. conducted circular dichroism measurements. M.P.C. and D.J.M. analyzed data. M.P.C. and D.J.M. wrote the initial draft of the manuscript. All authors contributed to the revision and editing process. D.J.M. supervised the research.
    
    
      Data availability
      The data supporting this article have been included as part of the ESI.†
      All plasmids used to express RTX protein variants are available for use from Addgene.
    
    
      Conflicts of interest
      There are no conflicts to declare.
    
  
    Acknowledgements
      This material is based upon work supported by the Air Force Office of Scientific Research under award number FA9550-22-1-0241. This work was supported in part by the Linac Coherent Light Source (LCLS), SLAC National Accelerator Laboratory, under Contract No. DE-AC02-76SF00515 with the U.S. Department of Energy, Office of Basic Energy Sciences. We recognize support from the Department of Chemical Engineering, the Bio-X Summer Undergraduate Research Program, and the Office of the Vice Provost for Undergraduate Education at Stanford University. MALDI-TOF-MS measurements were supported by the Vincent Coates Foundation Mass Spectrometry Laboratory, Stanford University Mass Spectrometry (RRID: SCR_017801) utilizing a Bruker Microflex MALDI TOF mass spectrometer (RRID: SCR_018696). We thank Prof. Bradley Olsen for the generous gift of a pQE-9-mCherry plasmid, Prof. Possu Huang and Carla Perez for CD spectroscopy access, and Alana Gudinas and all members of the Mai Lab for helpful conversations.
    
    References
- J.-F. Lutz, The future of sequence-defined polymers, Eur. Polym. J., 2023, 199, 112465.
- A. J. DeStefano, R. A. Segalman and E. C. Davidson, Where Biology and Traditional Polymers Meet: The Potential of Associating Sequence-Defined Polymers for Materials Science, JACS Au, 2021, 1(10), 1556–1571.
- S. L. Perry and C. E. Sing, 100th Anniversary of Macromolecular Science Viewpoint: Opportunities in the Physics of Sequence-Defined Polymers, ACS Macro Lett., 2020, 9(2), 216–225.
- H. Yu, F. C. Kalutantirige, L. Yao, C. M. Schroeder, Q. Chen and J. S. Moore, Self-Assembly of Repetitive Segment and Random Segment Polymer Architectures, ACS Macro Lett., 2022, 11(12), 1366–1372.
- J. Park, A. Staiger, S. Mecking and K. I. Winey, Ordered Nanostructures in Thin Films of Precise Ion-Containing Multiblock Copolymers, ACS Cent. Sci., 2022, 8(3), 388–393.
- J. Babi, L. Zhu, A. Lin, A. Uva, H. El-Haddad, A. Peloewetse and H. Tran, Self-assembled free-floating nanomaterials from sequence-defined polymers, J. Polym. Sci., 2021, 59(21), 2378–2404.
- M. Szatko, W. Forysiak, S. Kozub, T. Andruniów and R. Szweda, Revealing the Effect of Stereocontrol on Intermolecular Interactions between Abiotic, Sequence-Defined Polyurethanes and a Ligand, ACS Biomater. Sci. Eng., 2024, 10(6), 3727–3738.
- B. M. Seifried, W. Qi, Y. J. Yang, D. J. Mai, W. B. Puryear, J. A. Runstadler, G. Chen and B. D. Olsen, Glycoprotein Mimics with Tunable Functionalization through Global Amino Acid Substitution and Copper Click Chemistry, Bioconjugate Chem., 2020, 31(3), 554–566.
- S. Celasun, D. Remmler, T. Schwaar, M. G. Weller, F. Du Prez and H. G. Börner, Digging into the Sequential Space of Thiolactone Precision Polymers: A Combinatorial Strategy to Identify Functional Domains, Angew. Chem., Int. Ed., 2019, 58(7), 1960–1964.
- B. M. Wirtz, A. G. Yun, C. Wick, X. J. Gao and D. J. Mai, Protease-Driven Phase Separation of Elastin-Like Polypeptides, Biomacromolecules, 2024, 25(8), 4898–4904.
- X. Yuan, H. W. Hatch, J. C. Conrad, A. B. Marciel and J. C. Palmer, pH response of sequence-controlled polyampholyte brushes, Soft Matter, 2023, 19(23), 4333–4344.
- C. Pan, S. K. Tabatabaei, S. M. H. Tabatabaei Yazdi, A. G. Hernandez, C. M. Schroeder and O. Milenkovic, Rewritable two-dimensional DNA-based data storage with machine learning reconstruction, Nat. Commun., 2022, 13(1), 2984.
- M. A. Webb, N. E. Jackson, P. S. Gil and J. J. de Pablo, Targeted sequence design within the coarse-grained polymer genome, Sci. Adv., 2020, 6(43), eabc6216.
- R. A. Patel, C. H. Borca and M. A. Webb, Featurization strategies for polymer sequence or composition design by machine learning, Mol. Syst. Des. Eng., 2022, 7(6), 661–676.
- P. S. Ramesh and T. K. Patra, Polymer sequence design via molecular simulation-based active learning, Soft Matter, 2023, 19(2), 282–294.
- E. C. Day, S. S. Chittari, M. P. Bogen and A. S. Knight, Navigating the Expansive Landscapes of Soft Materials: A User Guide for High-Throughput Workflows, ACS Polym. Au, 2023, 3(6), 406–427.
- I. Linhartová, L. Bumba, J. Mašín, M. Basler, R. Osička, J. Kamanová, K. Procházková, I. Adkins, J. Hejnová-Holubová, L. Sadílková, J. Morová and P. Šebo, RTX proteins: a highly diverse family secreted by a common mechanism, FEMS Microbiol. Rev., 2010, 34(6), 1076–1112.
- L. Bumba, J. Masin, P. Macek, T. Wald, L. Motlova, I. Bibova, N. Klimova, L. Bednarova, V. Veverka, M. Kachala, D. I. Svergun, C. Barinka and P. Sebo, Calcium-Driven Folding of RTX Domain β-Rolls Ratchets Translocation of RTX Proteins through Type I Secretion Ducts, Mol. Cell, 2016, 62(1), 47–62.
- L. Motlova, N. Klimova, R. Fiser, P. Sebo and L. Bumba, Continuous Assembly of β-Roll Structures Is Implicated in the Type I-Dependent Secretion of Large Repeat-in-Toxins (RTX) Proteins, J. Mol. Biol., 2020, 432(20), 5696–5710.
- A. J. Wallace, T. J. Stillman, A. Atkins, S. J. Jamieson, P. A. Bullough, J. Green and P. J. Artymiuk, E. coli Hemolysin E (HlyE, ClyA, SheA): X-Ray Crystal Structure of the Toxin and Observation of Membrane Pores by Electron Microscopy, Cell, 2000, 100(2), 265–276.
- M. D. Peraro and F. G. van der Goot, Pore-forming toxins: ancient, but never really out of fashion, Nat. Rev. Microbiol., 2016, 14(2), 77–92.
- J. Herrmann, F. Jabbarpour, P. G. Bargar, J. F. Nomellini, P.-N. Li, T. J. Lane, T. M. Weiss, J. Smit, L. Shapiro and S. Wakatsuki, Environmental Calcium Controls Alternate Physical States of the Caulobacter Surface Layer, Biophys. J., 2017, 112(9), 1841–1851.
- J. Herrmann, P.-N. Li, F. Jabbarpour, A. C. K. Chan, I. Rajkovic, T. Matsui, L. Shapiro, J. Smit, T. M. Weiss, M. E. P. Murphy and S. Wakatsuki, A bacterial surface layer protein exploits multistep crystallization for rapid self-assembly, Proc. Natl. Acad. Sci. U. S. A., 2020, 117(1), 388–394.
- A. Chenal, J. I. Guijarro, B. Raynal, M. Delepierre and D. Ladant, RTX Calcium Binding Motifs Are Intrinsically Disordered in the Absence of Calcium: Implication for Protein Secretion, J. Biol. Chem., 2009, 284(3), 1781–1789.
- U. Baumann, S. Wu, K. M. Flaherty and D. B. McKay, Three-dimensional structure of the alkaline protease of Pseudomonas aeruginosa: a two-domain protein with a calcium binding parallel beta roll motif, EMBO J., 1993, 12(9), 3357–3364.
- C. Bauche, A. Chenal, O. Knapp, C. Bodenreider, R. Benz, A. Chaffotte and D. Ladant, Structural and Functional Characterization of an Essential RTX Subdomain of Bordetella pertussis Adenylate Cyclase Toxin, J. Biol. Chem., 2006, 281(25), 16914–16926.
- B. Bulutoglu and S. Banta, Block V RTX Domain of Adenylate Cyclase from Bordetella pertussis: A Conformationally Dynamic Scaffold for Protein Engineering Applications, Toxins, 2017, 9(9), 289.
- M. P. Chang, W. Huang and D. J. Mai, Monomer-scale design of functional protein polymers using consensus repeat sequences, J. Polym. Sci., 2021, 59(22), 2644–2664.
- P. Ringler and G. E. Schulz, Self-Assembly of Proteins into Designed Networks, Science, 2003, 302(5642), 106–109.
- L. Liu, H. Wang, Y. Han, S. Lv and J. Chen, Using single molecule force spectroscopy to facilitate a rational design of Ca2+-responsive β-roll peptide-based hydrogels, J. Mater. Chem. B, 2018, 6(32), 5303–5312.
- K. Dooley, Y. H. Kim, H. D. Lu, R. Tu and S. Banta, Engineering of an Environmentally Responsive Beta Roll Peptide for Use As a Calcium-Dependent Cross-Linking Domain for Peptide Hydrogel Formation, Biomacromolecules, 2012, 13(6), 1758–1764.
- X.-R. Zhou, R. Ge and S.-Z. Luo, Self-assembly of pH and calcium dual-responsive peptide-amphiphilic hydrogel, J. Pept. Sci., 2013, 19(12), 737–744.
- K. Dooley, B. Bulutoglu and S. Banta, Doubling the Cross-Linking Interface of a Rationally Designed Beta Roll Peptide for Calcium-Dependent Proteinaceous Hydrogel Formation, Biomacromolecules, 2014, 15(10), 3617–3624.
- B. Bulutoglu, S. J. Yang and S. Banta, Conditional Network Assembly and Targeted Protein Retention via Environmentally Responsive, Engineered β-Roll Peptides, Biomacromolecules, 2017, 18(7), 2139–2145.
- B. Bulutoglu, K. Dooley, G. Szilvay, M. Blenner and S. Banta, Catch and Release: Engineered Allosterically Regulated β-Roll Peptides Enable On/Off Biomolecular Recognition, ACS Synth. Biol., 2017, 6(9), 1732–1741.
- W. Abdallah, K. Solanki and S. Banta, Insertion of a Calcium-Responsive β-Roll Domain into a Thermostable Alcohol Dehydrogenase Enables Tunable Control over Cofactor Selectivity, ACS Catal., 2018, 8(2), 1602–1613.
- O. Shur, K. Dooley, M. Blenner, M. Baltimore and S. Banta, A designed, phase changing RTX-based peptide for efficient bioseparations, BioTechniques, 2013, 54(4), 197–206.
- J. Hendrix, T. Read, J.-F. Lalonde, P. K. Jensen, W. Heymann, E. Lovelace, S. A. Zimmermann, M. Brasino, J. Rokicki and R. D. Dowell, Engineered Calcium-Precipitable Restriction Enzyme, ACS Synth. Biol., 2014, 3(12), 969–971.
- H. Jung, Z. Su, Y. Inaba, A. C. West and S. Banta, Genetic Modification of Acidithiobacillus ferrooxidans for Rare-Earth Element Recovery under Acidic Conditions, Environ. Sci. Technol., 2023, 57(48), 19902–19911.
- F. Khoury, Z. Su and S. Banta, Rare Earth Element Binding and Recovery by a Beta Roll-Forming RTX Domain, Inorg. Chem., 2024, 63(29), 13223–13230.
- T. Rose, P. Sebo, J. Bellalou and D. Ladant, Interaction of Calcium with Bordetella pertussis Adenylate Cyclase Toxin: Characterization of Multiple Calcium-Binding Sites and Calcium-Induced Conformational Changes, J. Biol. Chem., 1995, 270(44), 26370–26376.
- P. Gangola and B. P. Rosen, Maintenance of intracellular calcium in Escherichia coli, J. Biol. Chem., 1987, 262(26), 12570–12574.
- M. M. King, B. B. Kayastha, M. J. Franklin and M. A. Patrauchan, Calcium Regulation of Bacterial Virulence, in Calcium Signaling, ed. M. S. Islam, Springer International Publishing, Cham, 2020, pp. 827–855.
- H. Wang, X. Gao and H. Li, Single Molecule Force Spectroscopy Reveals the Mechanical Design Governing the Efficient Translocation of the Bacterial Toxin Protein RTX, J. Am. Chem. Soc., 2019, 141(51), 20498–20506.
- H. Wang, G. Chen and H. Li, Templated folding of the RTX domain of the bacterial toxin adenylate cyclase revealed by single molecule force spectroscopy, Nat. Commun., 2022, 13(1), 2784.
- A.-C. S. Pérez, J. C. Karst, M. Davi, J. I. Guijarro, D. Ladant and A. Chenal, Characterization of the Regions Involved in the Calcium-Induced Folding of the Intrinsically Disordered RTX Motifs from the Bordetella pertussis Adenylate Cyclase Toxin, J. Mol. Biol., 2010, 397(2), 534–549.
- M. A. Blenner, O. Shur, G. R. Szilvay, D. M. Cropek and S. Banta, Calcium-Induced Folding of a Beta Roll Motif Requires C-Terminal Entropic Stabilization, J. Mol. Biol., 2010, 400(2), 244–256.
- O. Shur and S. Banta, Rearranging and concatenating a native RTX domain to understand sequence modularity, Protein Eng., Des. Sel., 2012, 26(3), 171–180.
- The PyMOL Molecular Graphics System, Version 3.0, Schrödinger, LLC.
- N. C. Tang and A. Chilkoti, Combinatorial codon scrambling enables scalable gene synthesis and amplification of repetitive proteins, Nat. Mater., 2016, 15(4), 419–424.
- M. A. Morris, R. A. Bataglioli, D. J. Mai, Y. J. Yang, J. M. Paloni, C. E. Mills, J. D. Schmitz, E. A. Ding, A. C. Huske and B. D. Olsen, Democratizing the rapid screening of protein expression for materials development, Mol. Syst. Des. Eng., 2023, 8(2), 227–239.
- V. Yeong, E. G. Werth, L. M. Brown and A. C. Obermeyer, Formation of Biomolecular Condensates in Bacteria by Tuning Protein Electrostatics, ACS Cent. Sci., 2020, 6(12), 2301–2310.
- E. Peggion, A. Cosani, M. Terbojevich and E. Scoffone, Solution Properties of Synthetic Polypeptides. Circular Dichroism Studies on Poly-L-histidine and on Random Copolymers of L-Histidine and L-Lysine in Aqueous Solution, Macromolecules, 1971, 4(6), 725–731.
- N. J. Greenfield, Using circular dichroism spectra to estimate protein secondary structure, Nat. Protoc., 2006, 1(6), 2876–2890.
- K. T. O'Neil and W. F. DeGrado, A Thermodynamic Scale for the Helix-Forming Tendencies of the Commonly Occurring Amino Acids, Science, 1990, 250(4981), 646–651.
- M. Blaber, X.-j. Zhang and B. W. Matthews, Structural Basis of Amino Acid α Helix Propensity, Science, 1993, 260(5114), 1637–1640.
- C. N. Pace and J. M. Scholtz, A Helix Propensity Scale Based on Experimental Studies of Peptides and Proteins, Biophys. J., 1998, 75(1), 422–427.
- N. Sreerama and R. W. Woody, Estimation of Protein Secondary Structure from Circular Dichroism Spectra: Comparison of CONTIN, SELCON, and CDSSTR Methods with an Expanded Reference Set, Anal. Biochem., 2000, 287(2), 252–260.
- R. Gesztelyi, J. Zsuga, A. Kemeny-Beke, B. Varga, B. Juhasz and A. Tosaki, The Hill equation and the origin of quantitative pharmacology, Arch. Hist. Exact Sci., 2012, 66(4), 427–438.
- D. L. Nelson and M. M. Cox, Lehninger Principles of Biochemistry, W. H. Freeman, 2017.
- A. Kluber, T. A. Burt and C. Clementi, Size and topology modulate the effects of frustration in protein folding, Proc. Natl. Acad. Sci. U. S. A., 2018, 115(37), 9234–9239.
- S. Gianni, M. I. Freiberger, P. Jemth, D. U. Ferreiro, P. G. Wolynes and M. Fuxreiter, Fuzziness and Frustration in the Energy Landscape of Protein Folding, Function, and Assembly, Acc. Chem. Res., 2021, 54(5), 1251–1259.
- R. M. Williams, Z. Obradovic, V. Mathura, W. Braun, E. C. Garner, J. Young, S. Takayama, C. J. Brown and A. K. Dunker, The Protein Non-folding Problem: Amino Acid Determinants of Intrinsic Order and Disorder, Biocomputing 2001, 2000, 89–100.
- V. N. Uversky, The alphabet of intrinsic disorder, Intrinsically Disordered Proteins, 2013, 1(1), e24684.
- M. A. Roesgaard, J. E. Lundsgaard, E. A. Newcombe, N. L. Jacobsen, F. Pesce, E. E. Tranchant, S. Lindemose, A. Prestel, R. Hartmann-Petersen, K. Lindorff-Larsen and B. B. Kragelund, Deciphering the Alphabet of Disorder—Glu and Asp Act Differently on Local but Not Global Properties, Biomolecules, 2022, 12(10), 1426.
This journal is © The Royal Society of Chemistry 2024