Palladium-unleashed proteins: gentle aldehyde decaging for site-selective protein modification

Protein bioconjugation frequently makes use of aldehydes as reactive handles, with methods for their installation being highly valued. Here a new, powerful strategy to unmask a reactive protein aldehyde is presented. A genetically encoded caged glyoxyl aldehyde, situated in solvent-accessible locations, can be rapidly decaged through treatment with just one equivalent of allylpalladium(II) chloride dimer at physiological pH. The protein aldehyde can undergo subsequent oxime ligation for site-selective protein modification. Quick yet mild conditions, orthogonality and powerful exposed reactivity give this strategy great potential in protein modification.

Aldehydes are a powerful yet underutilised tool for bioorthogonal chemistry: their high electrophilicity, coupled with good stability and low abundance in nature, makes them an attractive handle for protein modification. 1 Bioorthogonal reactions involving aldehydes have been developed to take advantage of the unique reactivity of this functional group 2-8 and an impressive array of bioconjugates is synthetically accessible, including antibody-drug conjugates, 9 protein-protein conjugates 10 and labelled live cells. 11 Access to such aldehydes, however, can be an impediment to their usage, with incorporation methods requiring enzyme recognition sequences, location at a protein terminus, or both. Use of formylglycine-generating enzyme (FGE), for example, will only form an aldehyde on the side chain of a cysteine in a CXPXR sequence (Fig. 1a). 12 Other strategies are less flexible still in aldehyde positioning: periodate-mediated oxidative cleavage of serine or threonine residues (Fig. 1b) 13 or transamination of glycine residues 14 occurs only at N-terminal residues, whilst tubulin tyrosine ligase ("Tub") requires a Tub tag on a protein C-terminus to append tyrosine derivatives such as m-formyl-L-tyrosine 1 (Fig. S1, ESI †). 15
Use of the pyrrolysine (Pyl) tRNA(CUA)/pyrrolysyl-tRNA synthetase (RS) pair from several species of archaeal methanogens for amber stop codon (TAG) suppression has allowed access to proteins containing a wide range of non-canonical functionality, including alkenes, 16-18 alkynes, 19-21 azides, 22 and aryl halides, 23 with generally excellent levels of site specificity; indeed, such unnatural amino acid (UAA) mutagenesis has become a widely utilised tool in chemical biology. 24 Notably, the aldehyde-containing UAA m-formyl-L-phenylalanine 2 (Fig. S1, ESI †) has been genetically encoded using an engineered pylRS variant from Methanosarcina mazei. 25
The wild-type Methanosarcina barkeri pyrrolysine tRNA-RS pair genetically encodes the 2-thiazolidine derivative ThzK, with the methyl ester 3 used as a suitable precursor for incorporation into proteins (Fig. 1c). 26 The thiazolidine group has seen use in peptide synthesis as an aldehyde-based protecting group for 1,2-aminothiols such as cysteine with various deprotection strategies, 27-29 and a 4-thiazolidine derivative has been used as a caged pseudo-N-terminal cysteine for protein CBT condensation ligations, 30 showing the potential for a 2-thiazolidine to be used to smuggle an aldehyde group into a protein. Notably, a thiazolidine will decage to yield a glyoxyl group, an aldehyde so reactive that more reliable, reproducible and standardised modification methodologies have been established for it than for any other protein aldehyde. 1 As reactive carbonyls have been documented as forming undesired adducts with 1,2-aminothiols in biological media, 31,32 the use of a protection/deprotection strategy avoids side reactions that may suppress genetic incorporation or conjugation yields. Indeed, UAA mutagenesis has been shown to exhibit excellent synergy with decaging strategies, where photodecaging 33 and metal-mediated decaging 34-36 reactions have expanded the range of functionality which can be genetically encoded. In this work a rapid yet mild palladium glyoxyl-decaging strategy is presented which reveals protein aldehydes from surface-exposed ThzK residues in a protein without the need for enzyme recognition sequences, using just a single equivalent of palladium (Fig. 1c). The new aldehydes are subsequently shown to be amenable to site-selective modification.
Green fluorescent protein (GFP) and superfolder green fluorescent protein (sfGFP) are highly useful test systems for protein modification due to ease of visualisation and highly optimised expression systems for use with UAA mutagenesis. Two GFP mutants containing amber stop codons at surface-exposed sites were selected: sfGFP(N150TAG) 37 and GFP(Y39TAG). 38 An advantage of using GFP is that successful stop codon suppression is readily confirmed and assayed through the green fluorescence of harvested cell pellets. 39 Genetic encoding of ThzK has previously made use of the M. barkeri pyrrolysine tRNA-RS pair, 26 although the promiscuity of the corresponding M. mazei pair is generally similar enough that either pair would be likely to genetically encode ThzK. 30 Protected ThzK 3 was synthesised in four straightforward steps via 4-9 with a cumulative yield of ca. 45% (Scheme S1, ESI †) following literature precedent. 26 Separate saponification is unnecessary as this amino acid can be delivered into protein expression systems in a mildly basic stock solution, exposing the C-terminus as needed for translation. Using an adapted general method for amber stop codon suppression in GFP expression, 38 both mutants were individually expressed with 3 supplemented at 1.6 mM in the growth media, and the green fluorescence of the harvested cell pellets confirmed the presence of full-length protein, demonstrating that ThzK can be encoded by the M. mazei pyrrolysine tRNA-RS pair. Following nickel affinity purification of the cell lysate, both GFP(Y39ThzK) 10 and sfGFP(N150ThzK) 11 could be isolated and their purity confirmed by SDS-PAGE and ESI-FTICR-MS (Fig. S2 and S3, ESI †).
Following successful ThzK incorporation, 10 and 11 were used as test beds for novel decaging strategies to yield GFP(Y39GlyoxylK) 12 and sfGFP(N150GlyoxylK) 13 (Fig. 2a). Inspired by examples of palladium-mediated deprotection in aqueous conditions, 35,36 including thiazolidine cleavage, 40 we opted to screen commercially available palladium sources as potential biologically compatible reagents for unmasking of glyoxyl aldehydes. Palladium complexes 14-17 were trialled over a range of concentrations and time intervals at 37 °C and pH 7.4 in order to maintain biocompatibility. These complexes comprise both Pd(0) and Pd(II) species 41 and at 100 equivalents have been used previously to decage thiazolidine peptides. 40 Addition of 100 equiv. of Pd(OAc)2 14 or allylpalladium(II) chloride dimer 16 led to protein precipitation, although some decaging was observed with 16. PdCl2(amphos)2 15 was found to be unreactive, neither decaging nor inducing precipitation. Success was first achieved using tris(dibenzylideneacetone)dipalladium(0) 17, with complete decaging observed after 48 hours. Further optimisation of conditions led to complete decaging within 24 hours. Curiously, under these conditions decaging of ThzK-containing 10 largely afforded glyoxyl-containing 12 as the aldehyde form, whereas glyoxyl aldehydes in aqueous conditions generally exist in the hydrated form. On closer inspection, the observation that allylpalladium(II) chloride dimer 16 induced precipitation of GFP seemed unusual given its successful use for other types of decaging with GFP. 36 Further screening using this palladium source showed that 16 causes GFP to precipitate at concentrations greater than 0.5 mM, but not at lower concentrations.
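The relationship between reagent equivalents and the precipitation threshold is simple arithmetic, sketched here in Python purely as an illustration. The 0.5 mM threshold for 16 is taken from the screening described above; the protein concentration of 50 µM is a hypothetical value chosen for the example (the working concentration is not stated here), and the function names are likewise assumptions.

```python
# Precipitation threshold for allylpalladium(II) chloride dimer 16, as reported in the text (mM).
PRECIPITATION_THRESHOLD_MM = 0.5


def reagent_concentration_mm(protein_um, equivalents):
    """Return the reagent concentration in mM for a protein concentration in µM."""
    return protein_um * equivalents / 1000.0


def exceeds_threshold(protein_um, equivalents, threshold_mm=PRECIPITATION_THRESHOLD_MM):
    """True if the reagent concentration is strictly above the precipitation threshold."""
    return reagent_concentration_mm(protein_um, equivalents) > threshold_mm


if __name__ == "__main__":
    protein_um = 50.0  # hypothetical protein concentration, not a value from the text
    for equivalents in (100, 10, 1):
        conc = reagent_concentration_mm(protein_um, equivalents)
        flag = "above" if exceeds_threshold(protein_um, equivalents) else "not above"
        print(f"{equivalents:>3} equiv. -> {conc:.2f} mM ({flag} threshold)")
```

Under this assumed protein concentration, 100 equivalents would place 16 well above the threshold, whereas 10 equivalents or fewer would not, which is consistent with lowering the equivalents at fixed protein concentration.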
Hence further screening was carried out using fewer equivalents with an identical concentration of GFP(Y39ThzK) 10 and, pleasingly, full decaging of the ThzK group to a glyoxyl could be observed within one hour at room temperature using just one equivalent of allylpalladium(II) chloride dimer 16 (Fig. 2b). Notably, use of tris(dibenzylideneacetone)dipalladium(0) 17 under the same conditions afforded no detectable decaging. One equivalent of 16 strikes the balance between minimal protein precipitation and maximum extent of decaging, with substoichiometric quantities of palladium resulting in poorer conversion. These optimised conditions were then applied to sfGFP(N150ThzK) 11 (Fig. 2c) and complete decaging was also achieved with no protocol alterations necessary, despite the altered position of the ThzK moiety within the protein scaffold. In this example, the decaged glyoxyl exists almost exclusively as the hydrate, indicating that differences in the microenvironment surrounding the thiazolidine may affect the electrophilicity and reactivity of the resulting glyoxyl species. The final optimised procedure using allylpalladium(II) chloride dimer 16 is simple and rapid, requiring only limited exposure to a very low loading of palladium reagent.
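The aldehyde and hydrate forms of a decaged glyoxyl differ by one water, so in principle they can be told apart from the caged protein and from each other by intact-protein mass. The Python sketch below estimates the expected mass shifts. It assumes that decaging amounts to net replacement of the thiazolidine's C2H5NS fragment (S, CH2CH2, NH) by a carbonyl oxygen, and it uses standard average atomic masses; this assignment and the function names are illustrative assumptions, not values reported in the text.

```python
# Standard average atomic masses (Da), rounded; suitable for deconvoluted protein spectra.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

# Assumed fragment lost from the thiazolidine ring on decaging: S, CH2CH2 and NH.
CAGE_FRAGMENT = {"C": 2, "H": 5, "N": 1, "S": 1}
WATER = {"H": 2, "O": 1}


def formula_mass(formula):
    """Sum average atomic masses for a formula given as {element: count}."""
    return sum(AVERAGE_MASS[element] * count for element, count in formula.items())


def decaging_shift(hydrated=False):
    """Mass change (Da) from caged ThzK to glyoxyl aldehyde, or to its hydrate."""
    shift = AVERAGE_MASS["O"] - formula_mass(CAGE_FRAGMENT)
    if hydrated:
        shift += formula_mass(WATER)  # the hydrate carries one additional water
    return shift


if __name__ == "__main__":
    print(f"ThzK -> aldehyde: {decaging_shift():+.2f} Da")
    print(f"ThzK -> hydrate:  {decaging_shift(hydrated=True):+.2f} Da")
```

Under these assumptions the aldehyde form would appear roughly 59.1 Da below the caged protein and the hydrate roughly 41.1 Da below it, a difference of about 18 Da between the two forms.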
As further confirmation of the exposed aldehyde reactivity, aniline-catalysed oxime ligation was performed upon protein glyoxyl species 12 and 13 using an aminooxy biotin probe 18 to afford biotinylated proteins 19 and 20 respectively (Fig. 3a). Pleasingly, complete ligation was observed with both proteins within 24 h, with a Western blot confirming the incorporation of the biotin probe in 19 (Fig. 3b). Further oxime ligation was performed using the fluorescent aminooxy dansyl probe 21 with 12 and 13 to afford dansylated proteins 22 and 23 respectively. Again full conversion was observed, and in-gel fluorescence of the denatured protein allowed the presence of a dansyl group in 23 to be unequivocally visualised (Fig. 3c). Irrespective of differences in protein and aldehyde/hydrate distribution, the aldehydes uncaged by the method reported here can be modified to completion following established protocols.
In summary, a new way to uncage a genetically encoded glyoxyl aldehyde precursor at physiological pH has been demonstrated using stoichiometric Pd(II), facilitating access to internally-modified proteins through aldehyde ligations without the need for an enzyme recognition sequence and hence minimising structural perturbations. This method requires only short reaction times under gentle conditions and the resulting aldehyde can be modified to completion. It is hoped that this latest addition to the chemical biologist's toolbox will open up opportunities for creating exciting new bioconjugates and for achieving a greater understanding of complex biological systems.
We thank Dr Ed Bergstrom and the CoEMS for support with protein MS. This work was supported by the University of York (R. L. B.) and an EPSRC DTG studentship (EP/M506680/1, R. J. S.).