An efficient pyrrolysyl-tRNA synthetase for economical production of MeHis-containing enzymes

Genetic code expansion has emerged as a powerful tool in enzyme design and engineering, providing new insights into sophisticated catalytic mechanisms and enabling the development of enzymes with new catalytic functions. In this regard, the non-canonical histidine analogue Nδ-methylhistidine (MeHis) has proven especially versatile due to its ability to serve as a metal coordinating ligand or a catalytic nucleophile with a similar mode of reactivity to small molecule catalysts such as 4-dimethylaminopyridine (DMAP). Here we report the development of a highly efficient aminoacyl tRNA synthetase (G1PylRSMIFAF) for encoding MeHis into proteins, by transplanting five known active site mutations from Methanomethylophilus alvus (MaPylRS) into the single domain PylRS from Methanogenic archaeon ISO4-G1. In contrast to the high concentrations of MeHis (5–10 mM) needed with the Ma system, G1PylRSMIFAF can operate efficiently using MeHis concentrations of ∼0.1 mM, allowing more economical production of a range of MeHis-containing enzymes in high titres. Interestingly G1PylRSMIFAF is also a ‘polyspecific’ aminoacyl tRNA synthetase (aaRS), enabling incorporation of five different non-canonical amino acids (ncAAs) including 3-pyridylalanine and 2-fluorophenylalanine. This study provides an important step towards scalable production of engineered enzymes that contain non-canonical amino acids such as MeHis as key catalytic elements.


Introduction
In nature, proteins perform a vast array of functions, including accelerating biochemical reactions, transporting molecules across membranes, providing structural support, and controlling signalling processes. Recent advances in high-throughput computation and experimentation have given us unprecedented control over protein sequence, structure and function, resulting in the development of a diverse array of engineered protein therapeutics, biocatalysts, and advanced biomaterials [1][2][3][4][5][6]. At present, our approaches to protein design, engineering and production typically only make use of nature's standard alphabet of twenty canonical amino acid building blocks. These standard amino acids are limited in their chemical diversity, which ultimately restricts our ability to develop proteins with new functions and desirable properties. To address this fundamental limitation, genetic code expansion (GCE) has emerged as a powerful and versatile technology to site-selectively install new functional elements into proteins as non-canonical amino acids (ncAAs) 7,8. GCE employs orthogonal aminoacyl tRNA synthetase (aaRS)/tRNA pairs to direct the incorporation of ncAAs in response to a reassigned codon (most commonly the amber codon UAG) introduced into a gene of interest. To date, a variety of aaRS/tRNA pairs have been developed which display the required orthogonality across a range of host organisms. Pyrrolysyl aaRS/Pyl tRNA CUA pairs from methanogenic archaea have proven especially versatile, having been re-engineered to encode several hundred ncAAs in bacteria, yeast and mammalian cell lines 9. These systems have allowed the development of new protein therapeutics, precision bioconjugates, responsive materials, protein-based vaccines and new biocontainment strategies.
The availability of an expanded genetic code also opens exciting new opportunities in enzyme design and engineering. For example, GCE has been used to improve enzyme activity and stability 10,11, to probe complex biological mechanisms [12][13][14], and to develop enzymes with functions and modes of catalysis beyond those found in nature [15][16][17][18][19][20]. The non-canonical histidine analogue, Nδ-methylhistidine (MeHis), has proven to be an especially versatile tool in enzyme design and engineering research, leading to catalytically modified enzymes with augmented properties and entirely new functions, as well as new biocontainment strategies (Figure 1) 15,20,21. For example, MeHis has been used as a metal chelating ligand to probe the mechanisms of heme enzymes including ascorbate peroxidase and cytochrome c peroxidase 10,12. MeHis ligands have also been used to augment metalloenzyme function, leading to improvements in both peroxidase and carbene transferase activities in engineered myoglobin variants 11,22,23. MeHis has also been shown to act as a potent catalytic nucleophile, leading to the development of artificial hydrolases and proficient enzymes for valuable non-biological transformations such as the Morita-Baylis-Hillman reaction 15,20.
Figure 1: Crystal structures of MeHis-containing enzymes. A) The crystal structure of the myoglobin variant Mb*(MeHis) (PDB: 6G5B), in which substitution of His93 with MeHis affords an oxygen tolerant carbene transferase 23. MeHis is shown as atom-coloured sticks, carbon in red, showing the Fe(III) carbenoid complex (atom-coloured stick, carbons in blue and yellow). B) The designed esterase OE1.3 uses MeHis23 as a catalytic nucleophile 15. The crystal structure (PDB: 6Q7R) shows MeHis23 (atom-coloured sticks, carbons in grey) alkylated with the mechanistic inhibitor bromoacetophenone (atom-coloured sticks, carbons in yellow). C) Snapshot from an MD simulation of the BH MeHis 1.8_Int2 complex. BH MeHis 1.8 uses MeHis23 as a catalytic nucleophile (PDB: 8BP0, atom-coloured sticks, carbons in blue). Int2 is shown in atom-coloured sticks with carbons in grey 20.
To capitalize on these recent advances, it is important that we are able to produce MeHis-containing enzymes in an efficient and economical manner. At present, the engineered aaRS/tRNA pairs used to encode MeHis are relatively inefficient, typically requiring 5-10 mM concentrations of the expensive ncAA to be supplemented to the culture medium. For context, for a 30 kDa protein produced at 100 mg per litre of culture, this equates to only 0.03-0.06% of MeHis being incorporated into the target protein. Furthermore, even at these high MeHis concentrations, a substantial proportion of undesired truncated protein is typically observed. It is therefore evident that for any future large-scale applications of MeHis-containing enzymes, the efficiency of modified protein production will have to be significantly improved. Here we report an efficient system for producing MeHis-containing proteins in high titres using only low concentrations of ncAA, providing an important step towards scalable production of these modified biocatalysts.
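The incorporation percentage quoted above follows from a simple molar ratio, which the short sketch below works through. The function name and the `sites` parameter are illustrative and not from the paper. The calculation assumes one MeHis residue per protein molecule and takes the supplemented concentration as the total available ncAA.

```python
def percent_ncaa_incorporated(yield_mg_per_l, mw_kda, ncaa_mm, sites=1):
    """Percentage of supplemented ncAA that ends up in the target protein.

    yield_mg_per_l : protein yield in mg per litre of culture
    mw_kda         : protein molecular weight in kDa
    ncaa_mm        : ncAA concentration supplied to the medium, in mM
    sites          : ncAA residues per protein molecule (assumed 1 here)
    """
    # mg/L divided by g/mol gives mmol/L, i.e. mM
    protein_mm = yield_mg_per_l / (mw_kda * 1000.0)
    return 100.0 * sites * protein_mm / ncaa_mm


if __name__ == "__main__":
    # 30 kDa protein at 100 mg/L, as in the worked example in the text
    for ncaa_mm in (10.0, 5.0, 0.1):
        pct = percent_ncaa_incorporated(100.0, 30.0, ncaa_mm)
        print(f"{ncaa_mm:>5} mM ncAA -> {pct:.3f}% incorporated")
```

With 5-10 mM MeHis the result is a few hundredths of a percent, in line with the range quoted in the text. The same protein yield at 0.1 mM corresponds to a few percent.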

Results and Discussion
MeHis is commonly incorporated into target proteins using engineered PylRS homologs from Methanosarcina mazei (Mm), Methanosarcina barkeri (Mb) or Methanomethylophilus alvus (Ma) 24,25. Two distinct sets of mutations have been reported to confer activity towards MeHis incorporation, either L121M, L125I, Y126F, M129A and V168F or L125I, Y126F, M129G, V168F and Y206F (based on Ma numbering), affording MaPylRS MIFAF or MaPylRS IFGFF respectively (Figure 2) [26][27][28]. A recent study showed that MaPylRS IFGFF gave modest improvements in protein yield compared with the analogous MmPylRS IFGFF variant, although high concentrations of MeHis are required in both cases. With the aim of identifying more efficient systems, we elected to explore a wider range of PylRS homologs, namely Methanogenic archaeon ISO4-G1 (G1) PylRS and Methanomassiliicoccales archaeon RumEn M1 (RumEn) PylRS 25. Similar to MaPylRS, these homologs lack the N-terminal tRNA binding domain that is essential for activity in MmPylRS and MbPylRS.
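As a bookkeeping aid, the two mutation sets can be written down as data. The sketch below is illustrative only. It assumes that the variant suffixes (MIFAF, IFGFF) are the mutant residues read in order of position, which matches the lists given above but is not stated explicitly in the text. All function names are hypothetical.

```python
# The two reported mutation sets, using Ma numbering as in the text.
MUTATION_SETS = {
    "MIFAF": ["L121M", "L125I", "Y126F", "M129A", "V168F"],
    "IFGFF": ["L125I", "Y126F", "M129G", "V168F", "Y206F"],
}


def parse_mutation(code):
    """Split e.g. 'L121M' into (wild-type residue, position, mutant residue)."""
    return code[0], int(code[1:-1]), code[-1]


def variant_suffix(mutations):
    """Concatenate the mutant residues in order of sequence position."""
    ordered = sorted(mutations, key=lambda m: parse_mutation(m)[1])
    return "".join(parse_mutation(m)[2] for m in ordered)


def shared_mutations(set_a, set_b):
    """Mutations present in both sets, sorted by position."""
    common = set(set_a) & set(set_b)
    return sorted(common, key=lambda m: parse_mutation(m)[1])


if __name__ == "__main__":
    for name, muts in MUTATION_SETS.items():
        print(name, "->", variant_suffix(muts))
    print("shared:", shared_mutations(MUTATION_SETS["MIFAF"], MUTATION_SETS["IFGFF"]))
```

Written this way, it is easy to see that the two sets share three substitutions (L125I, Y126F, V168F) and both alter position 129, but to different residues.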
The two sets of active-site mutations were transplanted into G1PylRS and RumEnPylRS, and the resulting G1PylRS MIFAF/IFGFF and RumEnPylRS MIFAF/IFGFF variants were evaluated using an established GFP production assay and their activity compared to the analogous Ma systems. Of the two existing Ma variants, MaPylRS IFGFF was shown to be slightly more effective in suppressing the UAG codon to produce full length GFP containing MeHis at position 150 (Figure 3A). The UAG suppression efficiency of RumEnPylRS MIFAF/IFGFF was substantially reduced compared to the Ma variants. In contrast, both active-site transplanted G1PylRS variants showed substantially improved UAG suppression efficiency. G1PylRS MIFAF displays both high activity and specificity for MeHis (purple bars) over incorporation of canonical amino acids (grey bars), whereas the G1PylRS IFGFF variant suffers from a high background of phenylalanine incorporation in cultures grown in the absence of MeHis. This newly engineered G1PylRS MIFAF variant produces approximately 4-fold more full length GFP than MaPylRS IFGFF when 0.5 mM MeHis is supplied to the culture medium. To further compare the G1PylRS MIFAF and MaPylRS IFGFF systems, we next evaluated GFP production across a range of MeHis concentrations (Figure 3B and C and Supplementary Figure 1). Remarkably, G1PylRS MIFAF can operate efficiently using a MeHis concentration of 0.1 mM, with detectable levels of GFP production even observed at 0.01 mM. Increasing the MeHis concentration to 0.5 mM or 1 mM led to only modest improvements in MeHis incorporation, suggesting that G1PylRS MIFAF is saturated at a concentration between 0.1 and 0.5 mM. For comparison, G1PylRS MIFAF was a more effective aaRS at 0.1 mM MeHis than MaPylRS IFGFF at 1 mM. Even using 50 times more MeHis (5 mM), MaPylRS IFGFF is only marginally more active.
To illustrate the efficacy of G1PylRS MIFAF, we further increased the stringency of GFP production assays by introducing an additional UAG codon at position 40 (Figure 4). The G1PylRS MIFAF system is able to efficiently read through two UAG codons to produce GFP containing MeHis at positions 40 and 150, with only minor reductions in protein yield (Figure 4B). In contrast, with the less efficient MaPylRS IFGFF system, yields of doubly modified GFP are extremely low. It is notable that many enzymes contain multiple catalytically important histidine residues [30][31][32][33]. The ability to efficiently produce proteins containing multiple MeHis residues will open up new opportunities to study and/or tune the functions of these enzymes.
Having established an efficient aaRS/tRNA pair, our attention now turned to the production of engineered enzymes that use MeHis as an important catalytic element. To this end we selected the engineered peroxidase APX2_MeHis, where MeHis serves as an axial ligand and leads to dramatically improved turnover numbers, and designed enzymes OE1.4 (stereoselective hydrolase) and BH MeHis 1.8 (Morita-Baylis-Hillmanase) that both employ MeHis as a catalytic nucleophile 10,15,20. Using only 0.1 mM MeHis, these engineered enzymes are all produced in >100 mg/L in standard laboratory Escherichia coli strains and culture conditions, corresponding to an impressive 4-6% of the total MeHis supplemented being incorporated into protein (Table 1). In all cases, protein yields achieved with G1PylRS MIFAF are substantially higher than those produced with MaPylRS IFGFF using 10 times higher MeHis concentrations (1 mM).
Given the high efficiency of G1PylRS MIFAF, we wondered whether this aaRS is highly specific for MeHis or whether it could also be used to encode other ncAAs. We therefore tested G1PylRS MIFAF activity towards a small panel of ncAAs (10 mM) using the aforementioned GFP production assay. In addition to MeHis, G1PylRS MIFAF is able to encode five of the other ncAAs tested (Figure 5, structures 4, 6, 7, 8, and 9). These substrates include the hydrophobic ncAAs 3-(2-naphthyl)alanine and 2-fluorophenylalanine, as well as 3-pyridylalanine, a potentially valuable histidine analogue that is a poor substrate for MaPylRS IFGFF and MaPylRS MIFAF (Supplementary Figure 2). The ability of G1PylRS MIFAF to efficiently discriminate between phenylalanine and 2-fluorophenylalanine is particularly notable.

Conclusion
In summary, we have developed a highly efficient aaRS for encoding MeHis by introducing five known active site mutations into a single domain PylRS from Methanogenic archaeon ISO4-G1. The successful development of G1PylRS MIFAF serves to highlight the importance of exploring a wider range of PylRS homologs when developing orthogonal translation components. G1PylRS MIFAF has allowed the efficient and economical production of a range of MeHis-containing enzymes using only 0.1 mM MeHis supplemented to the culture medium. Moving forward, there are several avenues for investigation to further enhance the production of valuable proteins containing MeHis to underpin any future commercial applications. Firstly, it is likely that even more efficient G1PylRS MIFAF descendants can be developed through directed evolution using established high-throughput assays. Switching to high-density fermentation technologies will also likely boost protein yields. Secondly, we can take advantage of engineered or synthetic E. coli strains that have been specifically tailored for more efficient ncAA incorporation 8,[34][35][36]. Alternatively, we can explore engineered yeast strains or mammalian cell lines that can be advantageous for selected protein applications [37][38][39]. Finally, considering MeHis is a naturally occurring amino acid, we can envision the development of engineered production hosts that contain the necessary biosynthetic machinery to produce MeHis and direct its selective incorporation into target proteins. For these reasons, we are optimistic that the work presented in this paper will provide an important step towards commercially viable production of MeHis-containing enzymes.

pEVOL_MaPylRS IFGFF/Ma Pyl tRNA CUA was available from a previous study 26. PylRS genes (G1PylRS MIFAF and RumEnPylRS MIFAF), optimized for E. coli expression, and the tRNAs (G1 Pyl tRNA CUA and RumEn Pyl tRNA CUA) were synthesized by Twist Bioscience. Two copies of each PylRS gene and their corresponding tRNA were cloned into their respective pEVOL vectors using NdeI/PstI and BglII/SalI restriction sites for PylRS genes and ApaLI/XhoI for the tRNA. To make MaPylRS MIFAF, G1PylRS IFGFF and RumEnPylRS IFGFF, primers introducing the required mutations were used to make gene fragments, which were combined using overlap extension PCR. Two copies of each gene were cloned into their respective pEVOL vectors using NdeI/PstI and BglII/SalI restriction sites.

GFP expression assays
Chemically competent E. coli BL21(DE3) cells containing the appropriate pEVOL vector were transformed with either pET28_GFP_Asn150TAG or pET28_GFP_Asn40TAG_Asn150TAG plasmid. Single colonies of freshly transformed cells were cultured in 5 mL of LB media containing 50 µg/mL kanamycin and 25 µg/mL chloramphenicol for 18 h at 30°C. Expression cultures were grown in 96-deep-well blocks sealed with a breathable membrane. 20 µL of the starter culture was used to inoculate 480 µL of defined auto-induction medium containing 50 µg/mL kanamycin and 25 µg/mL chloramphenicol (for the cultures with the Asn40TAG_Asn150TAG plasmid, IPTG was omitted from the auto-induction medium and added when the cultures reached an OD 600 of 0.6). Expression cultures were grown in the presence of the appropriate ncAA (0-10 mM) and incubated at 30°C with shaking at 850 rpm for 20 h. OD 600 and GFP fluorescence (λ excitation: 395 nm, λ emission: 509 nm) measurements were recorded using a BMG LabTech CLARIOstar spectrophotometer.
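One common way to summarise plate-reader data of this kind is to divide each well's fluorescence by its OD 600 and report the mean and standard deviation across replicates. The sketch below shows that calculation. The text states that OD 600 and fluorescence were both recorded and that error bars are standard deviations of triplicates, but it does not say how the two readings were combined. The normalisation, the optional blank subtraction, the function names and the example readings are therefore all assumptions.

```python
from statistics import mean, stdev


def normalised_fluorescence(fluorescence, od600, blank_fluorescence=0.0):
    """Blank-corrected GFP fluorescence per unit OD600 for a single well."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return (fluorescence - blank_fluorescence) / od600


def summarise_replicates(wells, blank_fluorescence=0.0):
    """Mean and sample standard deviation over replicate wells.

    wells : list of (fluorescence, od600) tuples, one per replicate
    """
    values = [normalised_fluorescence(f, od, blank_fluorescence) for f, od in wells]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical triplicate readings (arbitrary fluorescence units, OD600)
    triplicate = [(1000.0, 2.0), (1100.0, 2.0), (900.0, 2.0)]
    avg, sd = summarise_replicates(triplicate)
    print(f"normalised GFP signal: {avg:.1f} +/- {sd:.1f}")
```

Normalising by OD 600 corrects for differences in cell density between wells, which matters when high ncAA concentrations affect growth.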

Protein production and purification of MeHis-containing proteins
For the expression of APX2_MeHis, chemically competent E. coli BL21(DE3) cells containing either pEVOL_MaPylRS IFGFF/Ma Pyl tRNA CUA or pEVOL_G1PylRS MIFAF/G1 Pyl tRNA CUA were transformed with pET29b_APX_MeHis. A single colony of freshly transformed cells was cultured in 5 mL of LB media containing 50 µg/mL kanamycin and 25 µg/mL chloramphenicol for 18 h at 30°C. 300 µL of the starter culture was used to inoculate 30 mL of 2xYT medium supplemented with 50 µg/mL kanamycin, 25 µg/mL chloramphenicol, 5-aminolevulinic acid (1 mM final) and MeHis (0.1 or 1 mM final), and cultures were grown at 37°C, 200 rpm to an OD 600 of 0.6. Protein expression was induced with the addition of IPTG (0.1 mM final) and arabinose (5 mM final) and the cultures were grown for a further 20 h at 20°C.
For the expression of BH MeHis 1.8 and OE1.4, chemically competent E. coli DH10β cells containing either pEVOL_MaPylRS IFGFF/Ma Pyl tRNA CUA or pEVOL_G1PylRS MIFAF/G1 Pyl tRNA CUA were transformed with either pBbE8K_BH MeHis 1.8 or pBbE8K_OE1.4. A single colony of freshly transformed cells was cultured in 5 mL of LB media containing 50 µg/mL kanamycin and 25 µg/mL chloramphenicol for 18 h at 30°C. 300 µL of the starter culture was used to inoculate 30 mL of 2xYT medium supplemented with 50 µg/mL kanamycin, 25 µg/mL chloramphenicol, and MeHis (0.1 or 1 mM final), and cultures were grown at 37°C, 200 rpm to an OD 600 of 0.6. Protein expression was induced with arabinose (10 mM final) and the cultures were grown for a further 20 h at 20°C.
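The final concentrations listed in these protocols translate into pipetting volumes through the usual C1·V1 = C2·V2 relationship. The sketch below is a convenience calculation only. The stock concentrations are hypothetical, since the text gives final concentrations but not stock strengths, and the small volume change on addition is ignored.

```python
def stock_volume_ul(final_mm, culture_ml, stock_mm):
    """Volume of stock (in µL) needed to reach final_mm in culture_ml of medium.

    Uses C1*V1 = C2*V2 and neglects the volume added by the stock itself.
    """
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_mm * culture_ml * 1000.0 / stock_mm


def inoculum_percent(inoculum_ul, medium_ul):
    """Inoculum as a percentage (v/v) of the total culture volume."""
    return 100.0 * inoculum_ul / (inoculum_ul + medium_ul)


if __name__ == "__main__":
    culture_ml = 30.0
    # Hypothetical stock concentrations in mM; adjust to the stocks actually used
    additions = {
        "MeHis (0.1 mM final)": (0.1, 100.0),
        "IPTG (0.1 mM final)": (0.1, 1000.0),
        "arabinose (10 mM final)": (10.0, 1000.0),
    }
    for name, (final_mm, stock_mm) in additions.items():
        vol = stock_volume_ul(final_mm, culture_ml, stock_mm)
        print(f"{name}: {vol:.1f} µL of {stock_mm:.0f} mM stock")
    # 20 µL starter into 480 µL medium, as in the 96-deep-well GFP assay
    print(f"GFP assay inoculum: {inoculum_percent(20.0, 480.0):.1f}% v/v")
```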
The cells were harvested and the proteins purified as stated above for PylRS purification. Purified proteins were desalted using 10DG desalting columns (Bio-Rad) with PBS pH 7.4 and analysed by SDS-PAGE and protein MS. Protein concentrations were determined by measuring the absorbance at 280 nm using calculated extinction coefficients (ExPASy ProtParam).
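Converting an A280 reading into a protein concentration uses the Beer-Lambert law, A = ε·c·l. The sketch below illustrates the calculation. The extinction coefficient, molecular weight and absorbance in the example are placeholders rather than values from this study, and the function name is hypothetical.

```python
def concentration_from_a280(a280, ext_coeff_per_m_cm, mw_da, path_cm=1.0, dilution=1.0):
    """Protein concentration from absorbance at 280 nm (Beer-Lambert law).

    a280               : measured absorbance of the (possibly diluted) sample
    ext_coeff_per_m_cm : molar extinction coefficient in M^-1 cm^-1
    mw_da              : molecular weight in Da (g/mol)
    path_cm            : optical path length in cm
    dilution           : dilution factor applied before measurement

    Returns (micromolar concentration, mg/mL concentration) of the undiluted sample.
    """
    if ext_coeff_per_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 * dilution / (ext_coeff_per_m_cm * path_cm)  # mol/L
    mg_per_ml = molar * mw_da  # g/L is numerically equal to mg/mL
    return molar * 1e6, mg_per_ml


if __name__ == "__main__":
    # Placeholder values: 30 kDa protein, epsilon = 30,000 M^-1 cm^-1, A280 = 0.5
    micromolar, mg_ml = concentration_from_a280(0.5, 30000.0, 30000.0)
    print(f"{micromolar:.1f} µM, {mg_ml:.2f} mg/mL")
```

Multiplying the mg/mL value by the final sample volume and dividing by the culture volume gives the yield per litre of culture.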

Figure 2: Crystal structure overlay of MaPylRS and G1PylRS. An overlay of wild-type MaPylRS (red, PDB: 6JP2) 26 and wild-type G1PylRS (blue, PDB: 8IFJ) 29. Two distinct sets of mutations have been reported to confer activity towards MeHis incorporation in MaPylRS. The residues mutated in these two variants are shown as atom-coloured sticks with black carbons. The specific mutations present in each variant are detailed in the main body of text.

Figure 3: Production of MeHis-containing GFP using engineered PylRS homologs. A) Bar chart showing production of GFP containing MeHis at position 150, using either MaPylRS MIFAF/IFGFF / Pyl tRNA CUA, RumEnPylRS MIFAF/IFGFF / Pyl tRNA CUA or G1PylRS MIFAF/IFGFF / Pyl tRNA CUA pairs. Cultures were grown in the presence of 0.5 mM MeHis (purple bars) or with no MeHis supplemented (grey bars). Error bars represent the standard deviation of measurements made in triplicate. B) Bar chart showing GFP production in cultures containing varying MeHis concentrations (0-5 mM) using MaPylRS IFGFF / Pyl tRNA CUA. Error bars represent the standard deviation of measurements made in triplicate. C) Bar chart showing GFP production in cultures containing varying MeHis concentrations (0-1 mM) using G1PylRS MIFAF / Pyl tRNA CUA. Error bars represent the standard deviation of measurements made in triplicate.

Figure 4: Production of GFP containing two MeHis residues. A) Schematic representation of modified GFP expression, introducing two MeHis residues in response to UAG codons at positions 40 and 150. B) Bar chart showing GFP production containing MeHis at either position 150 (grey bars) or at positions 40 and 150 (purple