Oleksandr Koniev (a, b) and Alain Wagner* (a)
(a) Laboratory of Functional Chemo-Systems (UMR 7199), Labex Medalis, University of Strasbourg, 74 Route du Rhin, 67401 Illkirch-Graffenstaden, France. E-mail: alwag@unistra.fr
(b) Syndivia SAS, 4 rue Boussingault, 67000 Strasbourg, France
First published on 22nd May 2015
Bioconjugation methodologies have proven to play a central enabling role in the recent development of biotherapeutics and chemical biology approaches. Recent endeavours in these fields have brought to light unprecedented chemical challenges in attaining the bioselectivity, biocompatibility, and biostability required by modern applications. In this review the current developments in various techniques of selective bond-forming reactions of proteins and peptides are highlighted. The utility of each endogenous amino acid-selective conjugation methodology in the fields of biology and protein science is surveyed with emphasis on the most relevant among the reported transformations; selectivity and practical use are discussed.
A large number of reactions exist to modify proteins.28 However, site-specific conjugation continues to attract considerable research efforts to develop new methodologies that match continuously increasing requirements of modern applications in terms of selectivity, stability, mildness, and preserving biomolecule integrity. For the purpose of this overview, the focus will remain on recent developments in bond-forming approaches in bioconjugation of native amino acid residues. Among about 20 different amino acids involved in protein composition, only a smaller number comprises appropriate targets for practical bioconjugation methods. In fact, only one-third of all amino acid residues represent chemical targets for the vast majority of bond-forming approaches.
The bioconjugation methodology of choice is selected according to the intrinsic reactivity of the targeted amino acid residue (acidity/basicity, electrophilicity/nucleophilicity, oxido-reductive characteristics) and its specific spatial environment (in-chain, N-terminal, C-terminal, location in a specific sequence, accessibility, etc.). In this review we thus present the known bioconjugation strategies with regard to these parameters, ranked in descending order of how frequently they are reported in the literature.
Deprotonated primary amines are the most nucleophilic of the functional groups present in a typical protein. However, protonation drastically decreases their reactivity. As a consequence, despite the generally higher intrinsic nucleophilicity of Lys ε-amino groups, these require higher pH values to be unmasked by deprotonation, which makes it possible to distinguish α- and ε-amino groups by adjusting the pH. That is to say, at higher pH, when both types of primary amines are deprotonated, Lys side chain amino groups are generally more reactive towards electrophiles, while at lower pH the opposite holds because the ε-amino groups are already protonated (Fig. 1). At acidic pH all amines are protonated and possess no significant nucleophilicity compared to other side chains present in proteins. In particular, free (non-disulfide-bonded) Cys residues are much stronger nucleophiles and, if accessible, will readily be modified by most amine-reactive reagents.
It should also be mentioned that nucleophilicity and basicity, as well as solvent exposure and accessibility of a particular amino group, are influenced by the microenvironment and can vary substantially depending on the substrate. For instance, Westheimer and Schmidt found the actual pKa of the amino group situated in the active site of acetoacetate decarboxylase to be 5.9, which is 4 pKa units lower than that of an “ordinary” ε-amino group of lysine.29
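The pH dependence described above follows from the Henderson–Hasselbalch relation. As a rough illustration, the Python sketch below computes the fraction of an amino group present as the free (nucleophilic) base at several pH values. Only the pKa of 5.9 is taken from the text; the values of 8.0 for an N-terminal α-amino group and 10.0 for an ordinary Lys ε-amino group are illustrative assumptions, and the function name is hypothetical. Real values shift with the microenvironment.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of an amine present as the free base R-NH2 (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Only pKa 5.9 comes from the text; the other two are assumed, typical values.
AMINES = {
    "N-terminal alpha-NH2 (assumed pKa 8.0)": 8.0,
    "Lys epsilon-NH2 (assumed pKa 10.0)": 10.0,
    "active-site Lys, acetoacetate decarboxylase (pKa 5.9)": 5.9,
}

if __name__ == "__main__":
    for ph in (5.0, 7.0, 8.5, 10.0):
        print(f"pH {ph:4.1f}")
        for label, pka in AMINES.items():
            print(f"  {label}: {fraction_deprotonated(ph, pka):.4f}")
```

Under these assumptions, a much larger share of the α-amino groups than of the ε-amino groups is deprotonated near neutral pH, which is the basis of pH-controlled selectivity. The sketch ignores intrinsic nucleophilicity and accessibility.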
Depending on reaction conditions, selective modification of either N-termini (see Section 2) or Lys ε-amino groups can be achieved by using various chemical reagents. They generally belong to one of the following classes (in order of relevance): activated esters (fluorophenyl esters, NHS (N-hydroxysuccinimides),30 sulfo-NHS, acyl azides), isothiocyanates, isocyanates,31 aldehydes, anhydrides, sulfonyl chlorides, carbonates, fluorobenzenes, epoxides and imidoesters. Among this wide variety of reactive functions, NHS esters (and their more soluble sulfo-NHS analogues) and imidoesters represent the most popular amine-specific functional groups that are incorporated into commercially available reagents for protein conjugation and labelling.28
Despite their name, amine-reactive reagents are not always entirely selective for amines. Firstly, as mentioned above, they will react with any stronger nucleophile that is present and accessible on the protein surface. This concerns cysteine, tyrosine, serine and threonine side chains in particular. Secondly, depletion of these highly activated reagents by hydrolysis is inevitable in aqueous solution. The rate of both side-reactions depends on the particular substrate, the conjugation partner, pH, temperature, and buffer composition. Evidently, buffers that contain free amines, such as TRIS (tris(hydroxymethyl)aminomethane), must be avoided when using any amine-reactive probes, since the rate of the reaction with buffer would greatly exceed that with protein amino groups.
Fig. 2 Synthesis of the p-maleimidophenyl isocyanate crosslinker via Curtius rearrangement proposed by Palumbo and associates.33
Several early studies were devoted to the elaboration of isocyanate conjugation methodology,34,35 but it proved to be especially laborious and complicated, mainly due to the high reactivity and low stability of isocyanates. Isocyanates are therefore of limited interest today, having been completely displaced by isothiocyanate-mediated approaches. Both isothiocyanates and isocyanates can be obtained from the corresponding aromatic amines upon reaction with thiophosgene and phosgene respectively (Fig. 3).36
Fig. 3 Synthesis of isothiocyanates and isocyanates from the corresponding aromatic amines.36
Isothiocyanate-based selective amino group modification was first reported in 1937 by Todrick and Walker,37 who found that the reaction of allyl isothiocyanate with cysteine in alkaline medium selectively gives the thiourea, the product of amine addition to the isothiocyanate. In 1950, exploiting the selectivity of amino-terminal labelling of peptides with phenylisothiocyanate, Edman developed a method for peptide sequencing that fundamentally changed protein science and is known today as Edman degradation.38 Only 30 years later did Podhradský et al. examine the reaction of isothiocyanates with complex substrates and demonstrate that the addition of the thiol and phenolate functions of cysteine and tyrosine residues is always prevalent, and that amino groups start to manifest themselves in the reaction only at pH > 5.39 While thiol and alcohol additions are reversible and give dithiocarbamates and O-thiocarbamates respectively, amines add irreversibly, thus shifting the reaction equilibrium towards thioureas (Fig. 4). One should however keep in mind that, despite the reversibility of the addition of thiols and alcohols to isocyanates, these nucleophiles can accelerate hydrolysis of the reagent to unreactive amines or ureas and therefore significantly decrease the yield of the conjugation. Moderately reactive but quite stable in water and most solvents, isothiocyanates thus represent an appropriate alternative to the unstable isocyanates. As a consequence, they are much more popular in bioconjugation.
Ever since the introduction of fluorescent isothiocyanate dyes as more stable analogues of corresponding isocyanates for fluorescent labelling of antibodies by Riggs et al.40 in 1958, they have found widespread use in research laboratories and proved to be an effective means for tagging proteins at specific sites.41
Fluorescein isothiocyanate (FITC) is arguably one of the most commonly used fluorescent derivatisation reagents for proteins. For instance, Tuls et al.42 reported that cytochrome P-450 can be selectively labelled by FITC with a 75% yield of a single-labelled Lys-338 conjugate in TRIS (a particularly inappropriate buffer for amine-reactive reagents, though) at pH 8.0 and 0 °C. Burtnick43 has described selective labelling of one out of 34 lysine residues of actin in borate buffer with a 35-fold excess of the reagent at pH 8.5. The origin of such high selectivity towards the Lys-61 residue over the other 33 lysine residues present in the protein (Fig. 5, shown in red) remains unclear, but it is hypothesised to be due to an anomalously low pKa value of this residue. Subsequent reports by Miki and collaborators44,45 further confirmed the selectivity of this labelling, yet without any explanation of such specificity. Bellelli et al.46 were able to covalently label ricin (pH 8.1, 6 °C for 4 h). In fact, the targets of the isothiocyanate-mediated labelling of proteins elaborated over the last 60 years are difficult even to enumerate. The approach has proven effective in diverse applications such as tagging of antibodies (usually in carbonate–bicarbonate buffer, pH 9),47–53 bleaching-based measurement of membrane protein diffusion in FITC-labelled cells (pH 9.5, 24 °C),54 surface topography of the Escherichia coli ribosomal subunit,55 α-actinin distribution in living and fixed fibroblasts,56 characterisation of a proton pump on lysosomes,57 and hematopoietic stem cells.58
Fig. 5 Selective fluorescent labelling of the Lys-61 residue (shown in magenta) of rabbit skeletal muscle G-actin (pdb: 2VYP) reported by Burtnick.43
The most stunning examples include 125I labelling by means of isothiocyanates, elaborated by Shapiro and colleagues59 for regional differentiation of the sperm surface (TRIS, pH 7.7, 12 °C for 30 min), and the application of a similar methodology by Schirrmacher et al.60 for 18F radioactive labelling of RSA, apotransferrin and bovine IgG (pH 9.0, room temperature for 10–20 min). Conjugation of antibodies with chelating agents for further radiometal labelling of antibodies has been described by several groups61–63 and is based on the use of phenylisothiocyanate-containing probes. Brechbiel et al.64 went even further by combining the chelating functionality with the biotin fragment in a scaffold of trifunctional conjugation reagents. The preparation of silica nanoparticles coated with isothiocyanate groups and their use in apoptosis detection has recently been elaborated.65
The classical protocol of isothiocyanate labelling involves the use of 5–10 equivalents at a slightly basic pH in the range of 9.0–9.5.66,67 Resulting thioureas are reasonably stable in aqueous medium and provide a suitable degree of conjugation.68 For example, Sandmaier and colleagues69 have recently demonstrated that radiolabelling of the anticanine CD45 antibody using isocyanate and isothiocyanate provides a more specific delivery to the targeted CD45-expressing cells than a method exploiting thiol–maleimide conjugation (see Section 1.3.2). However, it has been shown by Banks and Paquette70 that, compared to NHS ester based methodology, antibody conjugates prepared with isothiocyanates are less hydrolytically stable and deteriorate over time. Moreover, the reaction of NHS esters for amine labelling was found to be faster, to give more stable conjugates for both model amino acids and proteins, and to proceed readily at lower pH, compared to isothiocyanates. Consequently, NHS esters are preferable to isothiocyanates in many respects for synthesizing bioconjugates.
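The "5–10 equivalents" of the classical protocol translate into a simple stoichiometric calculation. The sketch below is illustrative only: the function name is hypothetical, and the example inputs (1 mg of an IgG of about 150 kDa and a 10 mM reagent stock) are assumptions, not conditions taken from the cited protocols.

```python
def reagent_volume_ul(protein_mg: float, protein_mw_da: float,
                      equivalents: float, stock_mm: float) -> float:
    """Volume (microlitres) of reagent stock that delivers the requested molar excess."""
    protein_mol = (protein_mg * 1e-3) / protein_mw_da   # g / (g/mol) -> mol
    reagent_mol = equivalents * protein_mol             # molar excess over protein
    stock_mol_per_l = stock_mm * 1e-3                   # mM -> mol/L
    return reagent_mol / stock_mol_per_l * 1e6          # L -> microlitres


if __name__ == "__main__":
    # Assumed example: 1 mg of an IgG (~150 kDa) and a 10 mM isothiocyanate stock.
    for eq in (5, 10):
        vol = reagent_volume_ul(protein_mg=1.0, protein_mw_da=150_000,
                                equivalents=eq, stock_mm=10.0)
        print(f"{eq} equivalents: {vol:.2f} uL of stock")
```

The calculation counts equivalents per protein molecule, not per lysine residue; if a protocol defines the excess per amino group, the protein amount would need to be multiplied by the number of accessible amines.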
N-Hydroxysuccinimide (NHS) activated esters were introduced in 1963 by Anderson et al. as a better alternative to phenyl esters in forming the peptide bond.30,76,77 Possessing high selectivity towards aliphatic amines, NHS esters are today considered among the most powerful protein-modification reagents. Although several studies drew attention to a certain reactivity of NHS-activated esters with tyrosine,78–83 histidine,84 serine and threonine (especially when situated in certain locations, see Section 1.2),85–90 these side reactions possess largely decreased rates compared to the reaction with free amines and do not generally hinder the amine-selective derivatisation. High concentrations of nucleophilic thiols should however be avoided because, similarly to isothiocyanates, they may increase the rate of probe degradation by forming more easily hydrolysable intermediates (Fig. 6).
The optimum pH for NHS-mediated labelling in aqueous systems was found to be lower than for other amine-selective reagents, ranging from pH 7 to 8 (compared with 9–9.5 for isothiocyanates), which makes it better suited to modifying alkaline-sensitive proteins. Several detailed studies of the kinetics,72 the stoichiometry,91 and the selectivity (in-chain versus N-terminal modification)92 of NHS-mediated protein tagging have recently been reported.
Depending on the pH of the reaction solution and temperature, NHS esters are hydrolysed by water (possessing a half-life of 4–5 hours at pH 7, 1 hour at pH 8 and 10 minutes at pH 8.6),84,93 but are stable to storage if kept well desiccated. Virtually any molecule containing an acid functionality, or a moiety which can give an acid, can be transformed into its N-hydroxysuccinimide ester. While the activation with NHS generally decreases the water-solubility of the carboxylate molecule, the utilisation of sulfo-NHS94 preserves or even increases the water-solubility of the modified molecule by virtue of the charged sulfonate group. The development of new reagents based on NHS chemistry can sometimes be challenging,95 but the derivatives are frequently very useful.96–100 Many NHS derivatives for the preparation of affinity reagents, fluorescent probes and cross-coupling reagents are now commercially available, enabling wide access to investigations.
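The half-lives quoted above give a feel for how much active ester survives a typical labelling time. The sketch below assumes simple first-order decay, which is an assumption of this example and not a statement from the cited studies, and takes 4.5 h as the midpoint of the quoted 4–5 h range at pH 7. The names are hypothetical.

```python
# Half-lives of NHS esters in water quoted in the text, in minutes.
# 4.5 h is the midpoint of the quoted 4-5 h range (an assumption).
HALF_LIFE_MIN = {7.0: 4.5 * 60, 8.0: 60.0, 8.6: 10.0}


def fraction_intact(t_min: float, half_life_min: float) -> float:
    """Fraction of ester not yet hydrolysed after t_min, assuming first-order decay."""
    return 0.5 ** (t_min / half_life_min)


if __name__ == "__main__":
    for ph, t_half in HALF_LIFE_MIN.items():
        left = fraction_intact(30.0, t_half)
        print(f"pH {ph}: {left:.1%} of the NHS ester left after 30 min")
```

The estimate considers hydrolysis alone; in a real labelling reaction the ester is also consumed by the protein amines, so the reagent disappears faster than this lower bound suggests.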
The formed conjugates are linked by means of a very stable aliphatic amide bond with a half-life in the range of 7 years in water.101 This excellent stability and the biocompatibility of the resulting bonds have given NHS esters exceptional importance in the field of bioconjugation.
NHS ester-mediated covalent conjugation for protein modification was first accomplished by Becker et al., who first studied biotin transport in yeast74 and then applied this technique to the covalent attachment of biotin to bacteriophage T4.102 Since then, NHS-mediated conjugation of proteins has steadily expanded into countless applications.
Cross-linking of proteins often involves NHS-containing homobifunctional or heterobifunctional cross-linking reagents. These have been used for the elucidation of protein–protein103–107 and protein–drug interactions,108 protein structural and subunit analysis,26,109 the creation of protein complex models,110 and the preparation of protein conjugates with enzymes, drugs or other macromolecules.111–113
Homobifunctional NHS cross-linkers are generally used in reaction procedures to randomly “fix” or polymerize peptides or proteins through their amino groups. Adding such crosslinkers to a cell lysate will result in the random conjugation of interacting proteins, protein subunits, and any other polypeptides whose Lys side chains happen to be in close proximity to each other. This represents a methodology for capturing a “snapshot” of all protein interactions at a certain instant of time. Using this approach, for instance, Sinz and collaborators were able to elucidate the binding of calmodulin to melittin, a polypeptide and the principal component of honeybee venom, without chromatographic separation techniques.114 Cross-linking of the proteins with a number of NHS-homobifunctional cross-linkers of different lengths, followed by digestion of the products with trypsin and analysis by HPLC, enabled three-dimensional structure modelling of the calmodulin–melittin complex (Fig. 7).
Fig. 7 Mode of binding of melittin in the calmodulin–melittin complex (pdb: 2MLT and 1CDL) calculated from ambiguous distance restraints derived from the cross-linking data by Sinz and associates.114
Several applications, however, require a precision of crosslinking which cannot be provided by homobifunctional crosslinkers. For example, the preparation of an antibody–drug conjugate (ADC) implies selective linking of a cytotoxic payload to each molecule of the antibody without causing any antibody-to-antibody linkages to form. Such applications require the combination of different selective approaches in one linker.
Therefore, heterobifunctional crosslinkers are designed to possess different reactive groups at either end. These reagents allow for sequential conjugations that diminish undesirable self-conjugation and polymerisation. Sequential procedures involve two-step processes, where the heterobifunctional reagent (often in excess to ensure high conversion levels) is reacted with one protein using the most labile group of the crosslinker first. After removal of the excess unreacted crosslinker, the second protein is added to the solution containing the modified first protein, and another reaction occurs with the second reactive group of the crosslinker. According to the Pierce website (Rockford, IL, USA), the most popular heterobifunctional crosslinkers are those having amine-reactive NHS esters at one end and thiol-reactive maleimides (see Section 1.3.2) at the other end. Because it is less stable in aqueous solution than the maleimide, the NHS ester group should usually be reacted first. Takeda and co-workers115 used a bifunctional reagent that contained an NHS function and a benzylthioester function to prepare a DNA–protein hybrid. One of the fastest growing fields requiring heterobifunctional crosslinkers today is targeted drug delivery with ADCs.22,116–120 These are constituted of three main components: a monoclonal antibody (mAb) targeting specific signs or markers of cancer cells, a cytotoxic agent, and a linker molecule that allows covalent binding of the drug to the mAb. The composition of trastuzumab emtansine (Kadcyla®, Genentech), an ADC in clinical use for the treatment of HER2-positive metastatic breast cancer, is depicted in Fig. 8.
Fig. 8 Structure of an antibody drug conjugate (ADC) Kadcyla®. SMCC linker (shown in blue) serves for conjugation of the antibody and cytotoxic payload (Mertansine, shown in red).
The first example of a “cleavable” NHS cross-linking reagent, DSP, was reported by Lomant and Fairbanks93 and allowed the previously conjugated fragments to be released under the mild conditions of disulfide bond reduction. Further advances in the field have resulted in various types of linkers, cleavable under mild nucleophilic conditions (EGS),121 at basic pH (BSOCOES),122 in the presence of periodate (DST),123 or enzymatically.124 These have found application in both basic and applied research. The reader is directed to a recent review by Leriche et al.125 that provides an overview of chemical functions that can be used as cleavable agents and to a publication by Jin Lee126 for an overview of commercially available cross-linking reagents.
Other combinations of functionalities have been studied over the last 20 years and resulted in elaboration of heterotrifunctional127 linkers usually combining two bioselective reactive groups and a functionality for anchoring the obtained conjugate (e.g. the biotin moiety).
Many chemical probes widely used in bioconjugation contain the NHS fragment in their structure and are designed to react with free amino groups of proteins. For example, biotinylation128–130 as well as PEGylation2 of proteins are most commonly achieved using NHS-activated probes today. It was recently reported by Anderson and collaborators that biotinylation of antibodies with NHS-biotin and their subsequent adsorption on the surface of nanocrystal quantum dots (QD) gives highly efficient QD-antibody conjugates for the detection of protein toxins.131 Other types of protein immobilisation on matrices have also been reported.84,132,133 The Bolton–Hunter reagent (SHPP),134 which allows the conjugation of tyrosine-like residues to increase the yield of subsequent (radio)iodination, is also based on N-hydroxysuccinimide chemistry.135–140 Photoactivable heterobifunctional probes for cross-linking experiments that are structurally similar to SHPP were elaborated in 1982 by Ji et al.141 and have been used in more than 100 studies since. The NHS ester-based strategy for isobaric, stable isotope labelling of peptides142–144 has recently found more widespread application in proteomic studies, with simultaneous developments in enhancing peptide detection by electrospray ionisation mass spectrometry.78,145 This list could be continued; arguably, NHS-mediated techniques can be found in all major fields of protein conjugation and represent a gold standard in bioconjugation.
Historically, the conjugation of oligosaccharides to proteins was the first target for this approach. In 1974, relying on the exceptional ability of the cyanoborohydride anion, described three years earlier by Borch,146 to selectively reduce Schiff bases generated in situ from an amine and an aldehyde, Gray demonstrated the possibility of mild synthesis of carbohydrate-coated bovine serum albumin (BSA) and P150 protein (Fig. 9).147 However, because of the slow kinetics of conjugation, only 4 out of 59 BSA lysine residues (presumably those possessing the lowest pKa values) were derivatised after 300 hours of reaction.
Fig. 9 First example of reductive amination of lactose by bovine serum albumin (BSA, pdb: 3V03; lysine residues are shown in magenta) described by Gray.147
Reductive amination of proteins proceeds most readily at pH 6.5–8.5 where the reduction of aldehydes and ketones is negligible, and, if feasible, in an alcoholic solution under dehydrating conditions where the rate-limiting formation of the imine is favoured. According to Allred and colleagues,148 the addition of sodium sulfate (500 mM) may largely improve the coupling efficiency in aqueous media.
To date, reductive amination has played a central role in the synthesis of carbohydrate–protein conjugates,20,149–151 which have been used for years to study the molecular recognition of carbohydrates.152 Among these conjugates, polysaccharide–protein conjugate vaccines such as Menactra, HIBTiter, and Prevnar are FDA approved and used routinely for the prevention of invasive bacterial infections (Fig. 10),20,153 and potential anti-infective and anti-cancer agents are currently in clinical trials.23,151,154,155 The reader is directed to a recent comprehensive review of Adamo et al.24 covering the current status and future perspectives of carbohydrate–protein conjugates.
Fig. 10 Structure of Prevnar 13 vaccine. The bacterial capsule sugars, a characteristic of the pathogens, are linked to CRM197, a nontoxic recombinant variant of diphtheria toxin (pdb: 4AE0), by reductive amination at lysine residues and N-terminus (shown in magenta).
Other reported applications of reductive amination include the preparation of an organic trialdehyde to be used as a template for the synthesis of three-helix bundle proteins,156 protein PEGylation157,158 and immobilisation.159
Reductive amination however possesses several drawbacks preventing it from being generally applicable to protein conjugation.160 The most important is the necessity to use water-sensitive sodium cyanoborohydride, which has the potential for reducing disulfide bonds within proteins. As an alternative, McFarland and Francis161 have reported a water-stable iridium catalyst (Fig. 11). However, the efficiency of the method is lower than that of the classical reduction with cyanoborohydride.
Sulfonyl halides are highly reactive but also very unstable, especially in aqueous media at the pH required for reaction with aliphatic amines. For example, Haugland and collaborators165 have demonstrated that the rate of hydrolysis of Texas Red (one of the most widely used long-wavelength fluorescent probes)166 and Lissamine rhodamine B sulfonyl chloride was much higher (complete hydrolysis within 5 minutes in pH 8.3 aqueous solution) than that of corresponding NHS esters (both retained most of their reactivity for more than an hour under the same conditions). Yet, the formed sulfonamide bonds are extremely stable and even survive amino acid hydrolysis,164,167 which makes sulfonamide conjugates useful for the applications where the stability of the conjugation bond is a crucial feature.
Optimal conditions of protein modification by sulfonyl chlorides are those under which free amino groups most effectively compete with water for a limited amount of the reagent. It is thus best done at low temperature at pH 8.5–9.5.168 At lower pH values, the unreactive protonated form of amines slows the labelling reaction relative to hydrolysis by water; above this range the reagent is hydrolysed too rapidly.169,170 In practical experiments, a several-fold excess of the reagent is usually added, provided that the unused probe is hydrolysed to the corresponding unreactive sulfonic acid after labelling. It must be borne in mind that, unlike other amine-selective reagents, sulfonyl chlorides are unstable in dimethylsulfoxide, which is classically used for the preparation of stock solutions, and should never be used in this solvent (Fig. 12).171
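The existence of a pH optimum can be rationalised with a toy competition model. The sketch below treats labelling and hydrolysis as parallel pseudo-first-order pathways for the reagent, so that the share of reagent ending up on the amine equals the ratio of the corresponding rate constants. Every number in it (the amine pKa of 10.0 and the three rate constants, in arbitrary units) is a hypothetical placeholder chosen for illustration, not a measured value, and the function names are likewise hypothetical.

```python
def free_amine_fraction(ph: float, pka: float = 10.0) -> float:
    """Fraction of the amine present as reactive free base (assumed pKa)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def labelling_fraction(ph: float, k_amine: float = 1.0, k_water: float = 0.01,
                       k_hydroxide: float = 1.0e4, pka: float = 10.0) -> float:
    """Share of reagent that labels the amine instead of being hydrolysed.

    All rate constants are hypothetical and in arbitrary units:
    k_amine     - pseudo-first-order labelling by fully deprotonated amine
    k_water     - pH-independent hydrolysis
    k_hydroxide - hydroxide-promoted hydrolysis, multiplied by [OH-] in mol/L
    """
    k_label = k_amine * free_amine_fraction(ph, pka)
    k_hydrolysis = k_water + k_hydroxide * 10.0 ** (ph - 14.0)
    return k_label / (k_label + k_hydrolysis)


if __name__ == "__main__":
    for ph in (6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0):
        print(f"pH {ph:4.1f}: fraction of reagent used for labelling = "
              f"{labelling_fraction(ph):.3f}")
```

With these placeholder constants the labelled share passes through a maximum at mildly basic pH: too little free amine at low pH, too much hydroxide-driven hydrolysis at high pH. The position and height of the maximum depend entirely on the assumed constants.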
Fig. 12 Reaction of sulfonyl chlorides with amino groups present in proteins. Hydrolysis of the starting material by water or dimethylsulfoxide171 (chlorodimethylsulfide, CDMS, is a leaving group) results in obtaining unreactive sulfonic acid.
Apart from being reported for fluorescent labelling of proteins,172 sulfonyl chlorides were used to incorporate a chelate moiety into proteins,173 to study hydrodynamic properties or introduce long-lived fluorescence labels into macromolecules using tagging with pyrene derivatives174,175 or as cross-linking reagents.176
Because of their very high reactivity towards nucleophiles, sulfonyl halides also form conjugates with tyrosine, cysteine, serine, threonine, and imidazole residues of proteins;177 therefore, they are less selective than either NHS esters or isothiocyanates. These conjugates are however unstable and can be completely hydrolysed under basic conditions.
Covalent immobilisation of proteins on hydroxyl-containing supports (such as agarose, cellulose, diol-silica, or polylactic acid films) is often accomplished by transforming the hydroxyl groups into the corresponding sulfonates (tosylate, mesylate, or tresylate),178,179 which serve as good leaving groups (Fig. 13).180–182 Albumin, cytokines and other therapeutic proteins and peptides were reported to undergo mild PEGylation by means of PEG tresylates.183–185 Although rather specific to amino groups, the chemistry of tresylate-mediated conjugation is neither unique nor well defined. For instance, Gais et al. have shown that PEG–tresylate conjugation can give a product that contains a degradable sulfamate linkage, resulting in a heterodisperse product mixture.186
Fig. 13 Schematic representation of the reaction protocol for immobilisation of protein on PLLA film surfaces described by Ma et al.181
Compared to other aryl halides, fluoro-substituted nitrobenzenes were found to be the most reactive in bimolecular nucleophilic substitution reactions.188 They are usually regarded as amino-selective reagents, despite their known reactivity towards thiolates, phenolates and imidazoles, as the products obtained in these reactions are either unstable at alkaline pH required for the reaction (Tyr and His) or can be thiolysed by excess β-mercaptoethanol (Cys).189
4-Fluoro-7-nitro-2,1,3-benzoxadiazole (NBD-F), which was introduced as a fluorogenic reagent more than 30 years ago by Imai and Watanabe,190 still remains important for several applications, mainly pre-column derivatisation and enrichment of peptides. The reader is referred to a recent review by Elbashir et al.191 providing an excellent overview of the applicability of NBD-F to the analysis of peptides and to a complete overview of NBD-mediated methodologies for the fluorescent labelling of amino acid residues by Imai and associates.192,193
An elegant approach for improving protein crystallizability, which remains a major challenge in protein structure research,194 was elaborated by Sutton and collaborators195 and consists in the introduction of a charged ammonium residue. It exploits the amine-selective derivatisation of the protein by 1-fluoro-2-nitro-4-trimethylammoniobenzene iodide (Fig. 14) and results in increased hydrophilicity of the protein. Using their approach, the authors were able to study the binding site196 and to obtain crystalline derivatives of modified bovine insulin,197 which is especially hard to crystallize without inducing structural changes.198 A similar protocol was used by Ladd et al.199 for chromophoric PEGylation of proteins with polyethylene glycol fluoronitrobenzene derivatives.
Fig. 14 Derivatisation of bovine insulin with 1-fluoro-2-nitro-4-trimethylammoniobenzene iodide described by Sutton et al.195 Only amine-containing residues of A1 and B1 chains are shown. Two of the four tyrosine residues present in chains A1 and B1 (not shown) also react with the probe under described conditions.
Fig. 15 Procedure of the cleavable crosslinking of the intact 30S ribosomes (pdb: 1J5E) described by Traut et al.201 Lysine residues (Lys-72 and Lys-156) were chosen randomly for simplicity purposes (TEASH stands for triethanolamine buffer adjusted to 3% 3-mercaptoethanol).
Imidoesters react with primary amines to form amidine bonds. A high specificity towards amines can be achieved when alkaline conditions (pH 10) and amine-free media, such as borate buffer, are used.202 This places imidoesters among the most specific agents for amine labelling. Because the resulting amidine bonds are protonated at physiological pH, positive charges near modified sites are preserved during the conjugation with lysines and N-termini. Consequently, as was first demonstrated by Wofsy et al.,203 such modifications produce little or no significant change in the conformational properties and biological activities of proteins.
Thiolates obtained after the ring opening of Traut's reagent by free amines enable a rich thiol-selective chemistry on modified α- or ε-amino groups of proteins (see Section 1.3). Although many imidoesters other than Traut's reagent are commercially available today (for example, DMA, DMP, or DTBP), the number of described imidoester labelling probes is rather small.
Schramm et al.204 have described the synthesis of fluorescent imidoester dyes from the corresponding nitriles; the approach was later used by Bozler et al.205 for the preparation of a dansyl-containing imidoester and the selective modification of lysine residues in the active site of glucose dehydrogenase. New readily available reagents for the attachment of sugars to proteins via imidoester linkage,206 hydrophilic spin probes for determining membrane protein interaction using EPR,207 immunoreactive probes,208 tyrosine-like probes for radioactive labelling with 125I,209 protein PEGylation reagents,210 and the immobilisation of trypsin, yeast alcohol dehydrogenase, and E. coli asparaginase onto several types of organic polymer beads211 were achieved via imidoester conjugation and proved to have several advantages over other existing methodologies, namely the absence of solubility issues and retention of positive charge at the reaction site.
Azetidinone chemistry has recently been demonstrated by Barbas and collaborators212,213 to have potential for selective lysine labelling of a particular IgG framework, containing a very reactive lysine residue with an unusually low pKa of about 6. Some detailed procedures are described for a smooth opening of a β-lactam moiety resulting in a β-alanine peptide bond.213
Discovered by Tietze et al. as a two-step sequential procedure for the coupling of amines,214,215 squaric acid diester amine–amine conjugation is now actively developed by Wurm et al., who have recently reported its successful use for the one-pot preparation of poly(glycerol)–protein216 and glyco–protein conjugates217 in aqueous media (Fig. 16).
Fig. 16 One-pot, two-step squaric acid diester mediated glycosylation of BSA (pdb: 3V03) described by Wurm et al.217 Up to 22 lysine residues of the 59 present in BSA (30–35 are available for post-modification) could be glycosylated with a 25-fold excess of squaric diester and glucosamine in 12 hours.
Dichlorotriazine derivatives were described for amine-selective conjugation mainly as fluorescent dyes218,219 and PEGylation probes.220–222 They were shown to possess high reactivity towards protein amines. However, as was demonstrated by Abuchowski et al.,220 because the hydrolysis of dichlorotriazine occurs readily under the slightly basic conditions (pH 9.2) needed for the reaction to take place with sufficient selectivity towards amines, a considerable excess of the probe must be used in the coupling reaction. Banks and Paquette70 have conducted a comparative study of three fluorescent probes, differing only in the moiety responsible for the reactivity with amines: CFSE (NHS ester), DTAF (dichlorotriazine) and FITC (isothiocyanate). It was found that the rate of conjugation is significantly faster for the NHS ester than for the dichlorotriazine probe, which, in turn, reacts faster than the isothiocyanate derivative. Each conjugate provided a satisfactory level of stability in solution over a period of 1 week at room temperature, although the hydrolysis of the remaining, relatively inert, chloro group of DTAF was observed (Fig. 17).
Fig. 17 Modification of BSA (pdb: 3V03) by the PEG-[14C]dichlorotriazine probe reported by Abuchowski and colleagues (14C atoms are marked with red stars).220
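The need for a considerable excess of a hydrolysis-prone probe such as a dichlorotriazine can be illustrated with a minimal kinetic sketch. It assumes, purely for illustration, that the probe P is consumed by two parallel reactions: second-order aminolysis (rate k_a[A][P]) and pseudo-first-order hydrolysis (rate k_h[P]). Dividing the two rate laws and integrating until the probe is exhausted gives the number of probe equivalents needed to label a fraction f of the amines, P0/A0 = f + r ln(1/(1 − f)), with r = k_h/(k_a A0). The function name and the values of r are hypothetical and are not taken from the cited studies.

```python
import math


def probe_equivalents(f, r):
    """Equivalents of probe (P0/A0) needed to label a fraction f of the amines.

    Assumed model (illustrative only): aminolysis, d[A]/dt = -k_a[A][P],
    competes with probe hydrolysis at rate k_h[P]; r = k_h / (k_a * A0)
    is the dimensionless hydrolysis-to-aminolysis ratio.
    """
    if not 0.0 <= f < 1.0:
        raise ValueError("f must lie in [0, 1)")
    if r < 0.0:
        raise ValueError("r must be non-negative")
    # f equivalents end up on the protein; the logarithmic term is the
    # amount of probe lost to hydrolysis along the way.
    return f + r * math.log(1.0 / (1.0 - f))


if __name__ == "__main__":
    for r in (0.0, 1.0, 5.0):  # hypothetical ratios, not measured values
        eq = probe_equivalents(0.9, r)
        print(f"r = {r}: {eq:.2f} equivalents for 90% labelling")
```

Under these assumptions, the excess required grows linearly with the hydrolysis-to-aminolysis ratio and logarithmically as the target labelling fraction approaches completion, which is consistent with the qualitative observation above.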
Arpicco et al.223 have prepared thioimidoester-activated PEG-containing derivatives and shown their superiority over the NHS-activated analogue for gelonin modification (the reaction was conducted in PBS at pH 7.4). In this particular case, PEGylation with the thioimidoester derivative, which is less reactive than the NHS ester, resulted in a gelonin conjugate with higher inhibiting activity. Ikeda and associates224 have recently described a protocol for the preparation of a glutaraldehyde-functionalised PEG reagent, allowing for protein PEGylation under mild reaction conditions. Similarly, the modified protein exhibited higher biological activity than when reacted with the corresponding NHS-activated PEGylation reagent.
α-Halocarbonyls, such as iodoacetamides, can modify lysine residues at pH > 7.0,225 but the reaction rate is much slower than the reaction with cysteine residues. Another class of reagents usually used in cysteine-selective conjugation – vinyl-sulfones226 (see Section 1.3.3) – was recently reported to be applicable for lysine labelling at slightly basic pH.227,228
Modification of Lys residues with acid anhydrides, including succinic, citraconic, maleic, trimellitic, cis-aconitic, and various phthalic anhydride derivatives, belongs to the pool of classically used protein modification methodologies.229 It transforms nucleophilic amines into acids and, as a result, enables carboxylate-selective chemistry on the modified residues.
For more details of the practical aspects of using the above-described methodologies, the reader is referred to a recent review by Brun and Gauzy-Lazo230 on the preparation of antibody–drug conjugates by lysine conjugation.
However, highly amine selective N-hydroxysuccinimide (NHS) esters have been documented to give occasional side reactions with hydroxyl side chains.83,90,92,231 In a series of experiments, Miller et al. have demonstrated that the presence of histidine in sequences of the type His-AA-Ser/Thr or His-AA-AA′-Ser/Thr (where AA and AA′ stand for any amino acid) can significantly increase the reactivity of hydroxyl groups toward classical amine labelling agents (Fig. 18).85,86,88,232
Similarly, Mädler and Zenobi have reported that the guanidinium group of arginine can contribute to the reactivity of hydroxyl groups toward NHS esters and catalyse the nucleophilic substitution.233 In both cases, it is hypothesised that the imidazolyl and guanidine moieties of histidine and arginine, respectively, catalyse the reaction by stabilizing the transition state by means of hydrogen bonds and electrostatic interactions. This promoting effect is thought to be responsible for side reactions on several substrates while using cross-linking reagents.92,233
Despite the fact that methodologies of selective in-chain serine and threonine labelling are rather scarce, these residues are of special interest for bioconjugation when located on the N-terminus (see Section 2.2).
In proteins, thiols can also be generated by selectively reducing cystine disulfides with reagents such as dithiothreitol (DTT),238 2-mercaptoethanol (β-mercaptoethanol), or tris[2-carboxyethyl]phosphine (TCEP).239,240 Generally, all these reagents must be removed before conducting thiol-selective conjugation, as they will otherwise compete with the target thiols in proteins.241 Unfortunately, removal of reducing agents is sometimes accompanied by air oxidation of thiols back to disulfides. Although, in contrast to the majority of thiol-reducing agents, TCEP does not contain a thiol group, there have been several reports that it can react with α-halocarbonyls or maleimides and that labelling is inhibited when TCEP is present in the reaction medium.242,243
Direct labelling of the thiol group is usually achieved by either a nucleophilic addition or a displacement reaction with the thiolate anion as the nucleophile. The substantially lower dissociation energy of sulfhydryl groups compared to the corresponding alcohols results in a much higher acidity of the former and, as a consequence, a wider availability of their highly nucleophilic anionic form at physiological pH.
Typically, the reaction of sulfhydryl groups with haloacetamides is conducted under physiological to alkaline conditions (pH 7.2–9.0). When iodoacetamides are used, the reaction is preferably carried out under subdued light in order to limit free iodine generation, which has the potential to react with Tyr, His and Trp residues. The reaction is most specific for sulfhydryl groups at pH 8.3. The iodoacetyl group is known to react with other amino acid side chains, especially when there is no cysteine present or if a gross excess of iodoacetyl is used. For instance, free amino groups, the thioether of methionine, and both imidazolyl side chain nitrogens will react with iodoacetyl groups above pH 7 and 5, although with much slower kinetics.246 This, however, can be resolved by the use of less reactive chloroacetamides247 or by cautious control of pH and incubation time.
It is to be noted that the local environment has a profound effect on the reactivity of cysteine residues in proteins. If moderately reactive reagents such as iodoacetamide are used for bioconjugation, this difference in reactivities makes it possible to discern different types of Cys moieties present in the protein. Almost half a century ago, Gerwin248 reported dramatic differences in the reactivities of chloroacetic acid and chloroacetamide in the modification of the active-site cysteine of streptococcal proteinase, which was found to be due to the influence of the neighbouring histidine residue. As a general trend, cysteine residues possessing lower pKa values are more reactive when reaction is conducted under neutral or slightly acidic conditions, owing to their greater degree of dissociation and, as a consequence, higher concentration of the corresponding thiolate anions in the medium. For instance, Kim et al.249 have described a method for selective biotinylation of low-pKa cysteine residues in proteins simply by conducting the reaction at slightly acidic pH (Fig. 19).
Fig. 19 Biotinylation of the low-pKa cysteine residue of rabbit muscle creatine kinase (CK, pdb: 2CRK) by BIAM.249 The charge interaction between the negatively charged thiolate and the positively charged amino acid residues nearby results in a significantly lower pKa value of the CYS-283 residue (6.5). Consequently, selective alkylation thereof becomes possible in the presence of three other cysteine residues with higher pKa values (8.0–9.0).
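The pKa-based selectivity described above can be made concrete with the Henderson–Hasselbalch relation, which gives the fraction of a thiol present as the reactive thiolate at a given pH. The sketch below uses the pKa values quoted for creatine kinase (6.5 for CYS-283, 8.0–9.0 for the other cysteines); the labelling pH of 6.5 is an assumption chosen for illustration, and the function name is hypothetical. Only the equilibrium thiolate population is considered, not differences in intrinsic nucleophilicity or accessibility.

```python
def thiolate_fraction(pka, ph):
    """Fraction of a thiol present as thiolate (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    ph = 6.5  # assumed slightly acidic labelling pH (illustrative)
    residues = (
        ("CYS-283 (pKa 6.5)", 6.5),
        ("other Cys, lower bound (pKa 8.0)", 8.0),
        ("other Cys, upper bound (pKa 9.0)", 9.0),
    )
    reference = thiolate_fraction(6.5, ph)
    for label, pka in residues:
        frac = thiolate_fraction(pka, ph)
        print(f"{label}: {100 * frac:.2f}% thiolate, "
              f"{reference / frac:.0f}-fold less than CYS-283")
```

At the assumed pH, half of the low-pKa cysteine is deprotonated, whereas only about 3% and 0.3% of thiols with pKa 8.0 and 9.0 are, a difference of roughly 16- to 160-fold in thiolate population.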
Davis and Flitsch250 described a procedure for the selective glycosylation of proteins at one or several sites by reacting the carbohydrate-tethered iodoacetamides with cysteine side chains, which allowed for preparing homogeneously glycosylated human erythropoietin251 and dihydrofolate reductase.252
In 1948, Mackworth253 published his study on the biochemical mechanism of the lachrymatory effect of certain war gases and first reported the reactivity of structurally relevant α-bromoacetophenones for the inhibition of several classes of thiol enzymes.
Despite advances made in the investigation of α-haloacetophenones and related ketoximes for the modification of the active sites of enzymes,256–259 their utility for conjugation is very limited because of various side reactions.
An interesting approach that allows the photochemical conversion of cysteine into the corresponding thioaldehyde, thought to be formed by Norrish type II cleavage, and then into the aldehyde was reported by Clark and Lowe.254,255 Photolysis of the enzyme alkylated by a bromoacetophenone derivative generates the thioaldehyde, which spontaneously loses hydrogen sulfide to give the corresponding aldehyde (Fig. 20). The latter can either be utilised as a locus for aldehyde-selective conjugation or be transformed into the corresponding serine or glycine residue by reduction or transamination, respectively.
The reason for the remarkable reactivity of maleimide towards thiolates merits discussion. In general, the electrophilicity of an alkene is defined by its ability to accept electron density from a nucleophile, and is thus related to the energy of the electrophile's π* orbital (its lowest unoccupied molecular orbital, LUMO). Generally speaking, the rule is simple: the lower the energy of the alkene's π* orbital, the faster its reaction with nucleophiles. There are two main approaches for decreasing an alkene's LUMO energy: the direct attachment of an electron-withdrawing group (EWG) and the straining of the double bond. Although they proceed via two different mechanisms (decreasing the energy of both frontier orbitals or diminishing the energy gap between them, respectively), both approaches lower the LUMO energy of the alkene and thus increase its reactivity (Fig. 21). The unique reactivity of the maleimide moiety is due to the fact that it exploits these two mechanisms together.262,263
Fig. 21 Influence of an electron-withdrawing group (EWG) and cycle strain on the frontier orbitals of alkenes, σ-orbitals omitted (the form of the orbital is presented approximately, based on the publication of Merchán et al.268). Decreasing the energy of the LUMO (lowest unoccupied molecular orbital) results in higher reactivity of the electrophile towards nucleophiles. Although via different mechanisms, both the EWG and strain of the cycle activate alkenes for the attack by nucleophiles.
To date, a large variety of maleimide-based modifying reagents are available from a number of leading biochemical companies with even more being synthesised in laboratories around the world for specific applications. The applications of these reagents strongly overlap those of iodoacetamides, although maleimides apparently do not react with methionine, histidine or tyrosine.264,265
The optimum reaction conditions for maleimide-mediated conjugation, namely near-neutral pH (pH 6.5–7.5, Fig. 22), prevent the reaction of maleimide with amines, because the latter requires a higher pH to occur. At pH above 8, hydrolysis of the maleimide itself gives a mixture of isomeric non-strained maleamic acids that are unreactive toward sulfhydryls, and can thus compete with thiol modification.28,266 Similarly, maleimide–thiol adducts hydrolyse, which either results in complete deconjugation or causes a significant change in the properties of the conjugate.266 Furthermore, especially at pH above 9, ring-opening by nucleophilic reaction with an adjacent amine may yield crosslinked products.267
Fig. 22 Conjugation of maleimide-functional poly(PEGMA) to the only free CYS-34 residue of BSA (pdb: 3V03).311
Schuber and co-workers269 have found that important kinetic discrimination can be achieved between the maleimide and bromoacetyl functions when the reaction with thiols is conducted at pH 6.5 and 9.0, respectively.
Maleimide–NHS heterobifunctional reagents are especially important for the formation of conjugates. Hydrolysis of both the maleimide moiety and the generated thioether linkage depends considerably on the type of chemical group adjacent to the maleimide. Interestingly, the cyclohexane ring was found to provide increased maleimide stability to hydrolysis due to its steric effects and its lack of aromatic character. For this reason, succinimidyl-4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC) and its water-soluble analogue (sulfo-SMCC) are today among the most popular crosslinkers in bioconjugation. They are often used in the synthesis of protein–protein or protein–probe assemblies such as antibody–enzyme or antibody–drug conjugates, respectively. Applications include enzyme immunoassays,270–274 carrier-protein conjugates,275–277 albumin-binding prodrugs,278,279 and even approved therapies.117,119,280,281
Short homobifunctional maleimides are commonly used to explore and characterize protein structure (i.e., oligomerisation) or protein interactions.282–288 Maleimide-mediated immobilisation of biomolecules is often achieved either by direct conjugation13,289,290 or by prior biotinylation of the molecule of interest.291 The latter approach has been used for protein enrichment,292 capture,293,294 and immobilisation on modified supports.295–298
Most thiol-selective fluorescent probes used as sensors for monitoring biological processes are maleimide-containing reagents.299–301 Another testament to the utility of maleimides is their use in glycosylation,302 radiolabelling,303,304 studying protein interactions,305–308 and quantitation of cysteine residues.309,310
Despite its successful application as a reagent for the chemical modification of proteins, the irreversibility of maleimide's addition makes it impossible to regenerate the unmodified protein by controlled disassembly of the conjugate. Such reversibility is, however, often desirable for in vitro or in vivo applications. Several studies were devoted to a mild and specific hydrolysis of the imido group in maleimide conjugates.312,313 These approaches turned the originally irreversible maleimide-mediated thiol conjugation into a cleavable methodology. However, the harsh reaction conditions of this cleavage (strongly basic conditions or the presence of a large amount of imidazole) make them incompatible with many fragile protein substrates.
Monobromomaleimide derivatives, introduced by Baker et al.314 in 2009, have expanded the class of reagents for the selective and reversible modification of cysteine (Fig. 23). In contrast to methanethiosulfonates (see Section 1.3.6),315 monobromomaleimides allow much more stable conjugation of thiolates; the resulting conjugates are nevertheless easily cleaved upon reaction with TCEP via an addition–elimination sequence.
Fig. 23 Conjugate addition of the L111C mutant of the SH2 domain of Grb2 (wild-type pdb: 1JYU) to bromomaleimide followed by second addition of glutathione resulting in the generation of the vicinal bis-cysteine adduct.316
Moreover, the initial modification of a protein gave a thiol–maleimide moiety, which was shown to be prone to a second thiol addition. This resolved another recognised drawback of maleimide-based methodologies, namely the presence of only two points of attachment.316 As with a non-substituted maleimide, the hydrolysis of the thiol–maleimide linkage results in a dramatic decrease in its reactivity towards thiolates, which can be used for “switching off” the linker after the first thiol addition (Fig. 24).317,318
Initially, VS-mediated approaches were used almost exclusively for PEGylation of proteins with end-functionalised PEG derivatives.322,324 Several studies on the immobilisation of macromolecules on solid supports using vinyl sulfones were reported, owing to the elaboration of new methods for the preparation of VS-modified surfaces.319,320,325–327 Versatile VS-containing probes, namely carbohydrates,227,328 chelating agents,329 fluorescent tags,330,331 and biotinylation reagents331 were recently developed and applied in the bioconjugation of proteins (Fig. 25). Ovaa and co-workers332 used a vinyl sulfone handle to conjugate enzymes to a ubiquitin-like protein. The applications of VS-tags in proteomics have recently gained popularity and have been reviewed by Lopez-Jaramillo et al.331
Fig. 25 Bifunctional labelling of horseradish peroxidase (HRP, pdb: 2ATJ) with vinyl sulfone-functionalised tags described by Morales-Sanfrutos et al.330
Vinyl sulfones react with thiols to form a stable thioether linkage to the protein under slightly basic conditions (pH 7–8).228 The reaction may proceed faster if the pH is increased, but this usually also increases the amount of side-products (namely the modification of the Lys ε-amino groups and the His imidazole rings).320 The main advantage of VS-tags is their elevated stability in aqueous solutions, compared to more reactive thiols and maleimides, which can be subjected to ring opening or addition of water across the double bond.185
The TEC conjugations (usually conducted in PBS-DMSO buffer at pH 7.0–7.5) are compatible with oxygen and aqueous media and are usually carried out upon irradiation at λmax 365 nm in the presence of Vazo44 (2,2′-azobis[2-(2-imidazolin-2-yl)propane]dihydrochloride) as an initiator. The resulting thioether linkage is biologically stable and robust.
The first approach to protein conjugation, namely glycosylation, via TEC was reported by Davis et al. in 2009.336 However, it consisted of photoinduced coupling of various glycosyl thiols with site-specifically introduced unnatural L-homoallylglycine. A complementary approach to peptide and protein glycoconjugation by photoinduced coupling on cysteines was first introduced almost at the same time by Dondoni and co-workers.337 The 66 kDa globular bovine serum albumin (BSA), possessing one free CYS-34 residue, was selected as a model protein. Surprisingly, the study revealed that not only the CYS-34 SH group, as expected, but also two more SH groups arising from the 75 ↔ 91 disulfide bond were modified. It was suggested that such hyperglycosylation was due to the well-documented degradation of disulfide bonds by UV irradiation,338 namely to an electron transfer process from photoexcited tryptophan residues. Furthermore, prolonged irradiation of the reaction mixture for up to 2 hours induced the introduction of seven glycoside residues into BSA. TEC thus has clear limitations: the necessity for UV irradiation, the ensuing side reactions, and often moderate yields.335 Nevertheless, in contrast to the majority of thiol-selective methodologies, TEC does not exploit the elevated nucleophilicity of the thiolate but rather its readiness to generate radicals, which makes it especially tolerant to a wide range of functional groups. For instance, Garber and Carlson339 have used this feature of TEC for selective capping of thiols in the presence of thiophosphorylated groups, free alcohols and amines.
Several approaches involving the combination of cysteine-selective methodologies have been recently reported. Stolz and Northrop340 studied the reactivity of N-allyl maleimides and found this scaffold to be appropriate for consecutive two-step conjugation of thiols: via (1) base-initiated Michael-addition to maleimide moiety and (2) radical-mediated TEC of allyl-fragment. Scanlan and associates341 developed the sequential NCL-TEC approach (for more details on NCL, see Section 2.3.1) for the functionalisation of the cysteine thiolate generated at the ligation site during native chemical ligation.
TYC occurs under the same reaction conditions as TEC and smoothly proceeds at room temperature in aqueous solutions. First trials on the applicability of TYC on peptides were reported by Dondoni and collaborators in 2010.344 The authors have demonstrated the possibility of dual glycosylation of a series of peptides (up to 8 residues). Later on, Davis and Dondoni have expanded the dual conjugation strategy for achieving sequential glycosylation and fluorescent labelling of BSA (Fig. 28).345 Just as with TEC, the reaction also occurs at cysteine residues of the 75 ↔ 91 disulfide bridge.
Fig. 28 Glycosylation and fluorescent labelling of bovine serum albumin (BSA, pdb: 3V03) by TYC (first step) followed by TEC (second step) with 2,2-dimethoxy-2-phenylacetophenone (DMPA) as a photoinitiator.345
The necessity of using a photo- or a chemical initiator in both TEC and TYC conjugations represents the main drawback of these methodologies, as the presence of free radicals results in a series of side reactions, namely oxidation and crosslinking of proteins.
Diverse disulfides have been extensively used in the past decade for the modification of cysteine by disulfide exchange. This reversible reaction involves attack of the cysteine thiolate at the disulfide, breaking the S–S bond, and subsequent formation of a new mixed disulfide. A well-known example of such a reaction is the colorimetric quantitation of free sulfhydryls with Ellman's reagent.348 Several symmetric disulfide-containing fluorescent probes such as BODIPY L-cystine and fluorescein L-cystine are commercially available. However, because there is no thermodynamic preference for this disulfide exchange to proceed in one direction or the other, labelling with non-activated disulfides generally requires a large excess of the probe to achieve sufficient levels of tagging.349 In contrast, related activated thiols, namely thiosulfates (R–S–SO3−), thiosulfonates (R–S–SO2–R′, MTS), sulfenyl halides (R–S–X),350 pyridyl disulfides,351,352 and TNB-thiols (derivatives of 5-thio-2-nitrobenzoic acid) contain good leaving groups, which tautomerise to give unreactive forms, thus shifting the reaction equilibrium (Fig. 29).
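Why a thermoneutral exchange calls for a large excess of probe can be seen from a simple equilibrium estimate. The sketch below assumes an idealised single exchange, P–SH + R–SS–R ⇌ P–SS–R + R–SH, with equilibrium constant K and an initial probe-to-thiol ratio n, so that the converted fraction x satisfies x²/((1 − x)(n − x)) = K; for K = 1 this reduces to x = n/(1 + n). The function name is hypothetical, and a large K is used only as a crude stand-in for an activated disulfide whose leaving group is removed from the equilibrium.

```python
import math


def exchange_conversion(excess, k_eq=1.0):
    """Equilibrium fraction of protein thiol converted to the mixed disulfide.

    Idealised model: P-SH + R-SS-R <=> P-SS-R + R-SH with constant k_eq;
    `excess` is the initial molar ratio [R-SS-R]0 / [P-SH]0.
    Solves (k_eq - 1) x^2 - k_eq (1 + n) x + k_eq n = 0 for the physical root.
    """
    if excess <= 0.0 or k_eq <= 0.0:
        raise ValueError("excess and k_eq must be positive")
    b = k_eq * (1.0 + excess)
    c = k_eq * excess
    disc = b * b - 4.0 * (k_eq - 1.0) * c
    # Numerically stable form of the smaller root; also valid when k_eq == 1.
    return 2.0 * c / (b + math.sqrt(disc))


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"K = 1,   {n:>3}-fold excess: {100 * exchange_conversion(n):.1f}% labelled")
    print(f"K = 100,   1-fold excess: {100 * exchange_conversion(1, 100.0):.1f}% labelled")
```

With K = 1, an equimolar amount of probe labels only 50% of the thiol, and a 10-fold excess is needed to reach about 91%; in this simplified picture, a reagent with K = 100 reaches the same level with a single equivalent.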
PEGylation, fluorescent and biotinylation probes containing thiosulfate (commercialised as TS-link reagents) and pyridyl disulfide motifs are widely commercially available today. Thiosulfonates were first introduced for bioconjugation by Davis and co-workers353,354 in their work on controlled glycosylation355 and further elaborated by Zhao et al.356 for site-selective PEGylation of proteins. Recent advances resulted in the further development of activated-thiol methodologies towards selenenylsulfides357,358 and even methanedithiosulfonates, allowing for the synthesis of trisulfide conjugates.359 Disulfide-based conjugation was recently reported for the preparation of antibody–drug conjugates and for studying the influence of the spacer length on their stability.360 Diselenide analogues of disulfide PEGylation reagents were proposed by Jevševar et al.361 as selective and fast alternatives for coupling. Although a high conversion yield required the use of a large molar excess of the probe, this elegant approach represents an interesting technology which deserves further investigation.
The main factor behind the popularity of methodologies yielding disulfide and selenenylsulfide linkages is the reversibility they afford. However, the resulting conjugates are generally less stable than those obtained using bromomaleimides315 (Section 1.3.2) and can be readily cleaved with classical reducing agents such as DTT, β-mercaptoethanol or TCEP. Yet, in the case of disulfides, the modified protein can be made more stable and resistant to reduction by converting it into the corresponding thioether-linked conjugate by means of HMPT-mediated desulfurisation, elaborated by Davis and associates (Fig. 30).362 The use of hindered disulfides represents another way to increase the resistance of the generated conjugates to cleavage.363
In an attempt to resolve this problem, Brocchini et al.371 developed a clever methodology for the PEGylation of protein disulfide bonds with α,β-unsaturated bis-thiol alkylating reagents. Covalent rebridging of the two thiols derived from the disulfide after its mild reduction gave modified proteins with retained tertiary structure and biological activity. Interferon α-2b (IFN) was used in the initial studies, because it is representative of four-helical-bundle proteins with accessible disulfide bonds. Following reduction of the disulfide in IFN, the two free cysteines were re-joined using a three-carbon linked functional PEG.368,371 The methodology was further expanded to PEGylation of therapeutic proteins,368,372,373 antigen-binding fragments of immunoglobulin G,374 and poly phosphocholine labelling of IFN.375 Simultaneously with the introduction of the previously mentioned monobromomaleimides, Baker and co-workers introduced a related class of reagents containing a highly reactive dibromomaleimide or dibromopyridazinedione scaffold, allowing rapid and efficient disulfide rebridging by installing a rigid two-carbon linker.316,376 This approach was first applied for equimolar PEGylation of the 32-amino acid salmon calcitonin (sCT, Fig. 31)377 and very recently for the preparation of homogeneous antibody conjugates.378,379
Fig. 31 One-pot reduction-PEGylation of the disulfide bridge of sCT (pdb: 2GLH) followed by rebridging using dibromomaleimide.377
Although very rapid (full conversion is achieved in less than 5 minutes), dibromomaleimide-based conjugation gave a small amount of multimers when complex polypeptides were modified.378 The more stable and less reactive dithiophenolmaleimides, developed by Baker et al.380 and Haddleton et al.,381 avoid this apparent drawback of dibromomaleimide probes. In combination with benzeneselenols, known for their efficiency in the catalysis of disulfide cleavage, the dithiophenolmaleimide approach allowed selective antibody fragment conjugation with no detectable formation of multimers while conserving a high reaction rate (Fig. 32).378 Haddleton and colleagues382 have recently reported an in situ one-pot preparation of oxytocin–polymer conjugates using dithiophenolmaleimide-containing probes.
Fig. 32 Dithiophenolmaleimide approach for in situ disulfide bridging of antibody fragments (pdb: 1QOK).378
Structurally similar to dibromomaleimides, but containing four attachment points instead of three, dibromopyridazinediones (PD) were recently described by Caddick et al.383,384 and provided a platform for IgG antibody dual labelling.384 Namely, the authors elaborated the preparation of a PD-linker containing two orthogonal reactive handles in its structure: (1) a strained alkyne, which readily reacts with azides in Cu-free strain-promoted azide–alkyne cycloaddition (SPAAC), and (2) a terminal alkyne, which reacts with azides in Cu-catalysed azide–alkyne cycloaddition (CuAAC). The construct obtained after rebridging a reduced antibody with the PD-linker was then used to selectively introduce two distinct functionalities (Fig. 33).
Fig. 33 Dual labelling of IgG antibody using dibromopyridazinediones (PD, shown in orange).384 The construct obtained after rebridging of a reduced IgG with a PD-probe containing a strained alkyne (reactive in strain-promoted azide–alkyne cycloaddition, SPAAC) and a terminal alkyne (reactive in Cu-catalysed azide–alkyne cycloaddition, CuAAC) was subsequently labelled with two azide-containing probes (N3–R1 and N3–R2).
An interesting disulfide stapling–unstapling strategy using dichloro-s-tetrazine was developed by Smith and collaborators (Fig. 34).385,386 In addition to their ability to be photochemically cleaved (i.e., unstapled, thus regenerating the reduced disulfide bonds; Fig. 34a), S,S-tetrazine macrocycles provide a possibility for labelling by exploiting the reactivity of the tetrazine in the inverse electron demand Diels–Alder reaction (Fig. 34b).
Fig. 34 Protein stapling–unstapling using dichloro-s-tetrazine described by Smith et al. (a) Rebridging of a disulfide bond of a 14-mer peptide using dichloro-s-tetrazine and recovery of the starting product by photocleavage of the resulting S,S-tetrazine macrocycle (irradiation at 312 nm) followed by quenching of the thiocyanate intermediate with an excess of cysteine. (b) Rebridging of thioredoxin (Trx, pdb: 2F51) after the disulfide bond reduction followed by tagging of the resulting S,S-tetrazine with a strained alkyne containing probe (BCN-fluorescein).
Organic arsenicals (similar to tetracysteine-selective biarsenical dyes initially developed by Tsien and co-workers, see Section 4) were recently exploited for efficient protein–polymer conjugation (Fig. 35).387 It is noteworthy that, in contrast to highly thiol-reactive dibromomaleimides, these reagents demonstrated enhanced selectivity for disulfide rebridging in the presence of free Cys residues. Namely, while dibromomaleimides reacted near quantitatively within 30 minutes with the free CYS-34 of native BSA, organic arsenicals exhibited limited reactivity and demonstrated only about 20% labeling over the same period of time.387 The authors hypothesised entropy-driven affinity of arsenicals for closely spaced dithiols to be the main reason of such specificity.
Fig. 36 Utilisation of Cys-Dha transformation for bioconjugation. (a) General mechanism of the transformation. (b) Representative methods for the generation of dehydroalanine.395
Site-selective incorporation of Dha into proteins may be achieved by a considerable number of chemical transformations. Historically, the first example of such a reaction was reported by Koshland and collaborators 50 years ago and consisted of the transformation of the nucleophilic serine of chymotrypsin into dehydroalanine via selective sulfonylation followed by base-mediated elimination.396–398 The method, however, exploited the particularly high nucleophilicity of the SER-195 residue located in the active site of chymotrypsin. Logically, more general methods based on the exceptional nucleophilic properties of cysteine received increased attention in the years to follow. These are represented in Fig. 36 and include: reduction–elimination, an often observed undesired side reaction during the reduction of disulfides; base-mediated elimination of an activated thiolate, which typically requires temperatures incompatible with protein substrates and is thus not of synthetic interest; oxidative elimination; and bis-alkylation–elimination.395 The last two approaches seem to us the most promising among the available Cys → Dha transformations, and it is to these methodologies that we now turn.
Fig. 37 Conjugation of mutant S156C SBL (wild-type pdb: 1GCI) containing a single, surface-exposed cysteine residue CYS-156 by oxidative elimination followed by conjugation with thiol probes.392
Basic conditions are generally required for the reaction to achieve high conversion yields. Other amino acid residues may also react with MSH and bromomaleimide, but the reaction rates are largely inferior to those of thiolates and resulting products are generally unstable in basic conditions and decompose back to their starting unmodified forms.
Fig. 38 Dehydroalanine-mediated conjugation on histone 3 mutant H3.393
SNAr substitution chemistry approaches for cysteine modification in proteins were reported by several research groups.395,414,415 Davis et al.395 utilised Mukaiyama's reagent (2-chloro-1-methylpyridinium iodide) to generate an arylated cysteine as an intermediate for its conversion to dehydroalanine. Pentelute and co-workers have expanded the approach towards perfluoroaryls for protein stapling and conjugation.414,415 Finally, Barbas and associates416 have developed a class of Julia–Kocienski-like methylsulfonyl-functionalised reagents that react rapidly and specifically with thiols at biologically relevant pH (5.8–8.0). Notably, the resulting conjugates possess superior hydrolytic stability compared to cysteine–maleimide conjugates, which makes this methodology appropriate for the preparation of stable protein conjugates and PEGylated proteins (Fig. 39).
Fig. 39 Labelling of the free cysteine residue of BSA (pdb: 3V03) with Julia–Kocienski-like reagents.417
A strategy exploiting selective cyclization of peptides containing three cysteines to generate combinatorial libraries of cysteine-rich bicyclic peptides was recently developed. This approach is based on utilisation of homotrifunctional linkers: TBMB (tribromomethyl-), TATA (triacryloyl-), or TBAB (tribromoacetamide-containing reagents).418,419
An efficient gold-catalysed allene-mediated coupling reaction has been recently developed by Che and colleagues.420 The method allowed direct thiol-selective functionalisation of model peptides and reduced RNase A (Fig. 40).
Reactions of thiols with electron deficient acetylenes have been known for decades, being, however, mostly conducted in organic solvents.421–425 Several examples of reactions in aqueous media have been recently reported.417,420,426,427 Che and co-workers417 have elaborated a versatile method for the selective cysteine labelling of unprotected peptides and proteins in aqueous media with arylalkynone reagents. Notably, modified peptides could be converted back into the unmodified peptides by treatment with thiols under mild reaction conditions (Fig. 41).
Fig. 41 Cleavable labelling of BSA (pdb: 3V03) with arylalkynone reagents elaborated by Che and associates.417
Interestingly, in contrast to arylalkynones, structurally similar electron deficient acetylenes – 3-arylpropiolonitriles (APN) – were recently reported as a prominent class of reagents for irreversible tagging of cysteine.428,429 A superior stability of resulting conjugates in aqueous and biological media opens an interesting prospect in many fields where stability of obtained conjugates is crucial, e.g. for preparation of antibody conjugates possessing increased plasma stability (Fig. 42).429
Fig. 42 Preparation of IgG (pdb: 1IGY, red sticks represent four interchain disulfide bonds) conjugates with increased plasma stability using 3-arylpropiolonitriles (APN) described by Wagner and collaborators.429
Oxanorbornadienedicarboxylates (OND reagents), strained adducts of furans and electron-deficient alkynes, were found to provide better water stability while retaining selective, rapid, and fluorogenic reactivity towards cysteine compared to corresponding alkynes.430 α,β-Unsaturated ketones and amides (typically acrylamides) can undergo Michael-addition.431,432 However, the rate of addition is not generally high enough to provide it with competitive advantages compared with other approaches. Internal Cys residues were reported to accelerate native chemical ligation (see Section 2.3.1), an especially selective approach for N-terminal cysteine conjugation, via cyclic transition states.433–438
Fig. 43 Selective labelling of the Trp side chain in an 8-mer peptide PTHIKWGD with malondialdehyde described by Foettinger et al.446 The peptide structure was simulated using the RaptorX web server.447
To overcome selectivity issues, namely a known side reaction with Arg side chains,448 the conditions for the reaction with MDA, the hydrazone formation, and the cleavage of the MDA derivative had to be optimised with respect to pH, buffer, temperature, and reagent. Side reactions of MDA were avoided only under strongly acidic conditions, such as aqueous TFA (80%). The subsequent hydrazone formation requires an approximately 50–100-fold molar excess of reagent at pH 5–7 and sometimes an increase in temperature to 50 °C. Although unstable under acidic conditions and when the reagent excess is removed, the hydrazone bond remains stable in alkaline medium (pH > 9). The optimal conditions for the cleavage were found using hydrazine (applied as the dihydrochloride salt) in ammonium acetate solution at pH ∼ 3. These rather harsh reaction conditions prevent this methodology from finding widespread use for sensitive protein targets, yet allow its application in proteomics on peptide digests.449
Fig. 44 Modification of horse heart myoglobin (pdb: 1YMB) with rhodium carbenoids described by Francis et al.450 A 100 μM solution of myoglobin was exposed to stabilised vinyl diazo precursor (10 mM) and Rh2(OAc)4 (100 μM) for 7 h. N- and C-derivatisation of indole rings of both Trp residues – TRP-14 and TRP-7 – were identified by the mass reconstruction. An excess of hydroxylamine hydrochloride (75 mM) is crucial for the efficiency of the conjugation, although its mode of action was not elucidated.
The initially reported reaction conditions tolerated several aqueous solvent systems and proceeded at room temperature. Yet, acidic conditions (pH 1.5–3.5) were still necessary for efficient protein labelling and stood out as the main drawback preventing this approach from being generally applicable. For instance, in the same work, the authors stated that myoglobin was denatured and the heme dissociated from the protein due to the high acidity of the medium.
To address these limitations, subsequent efforts of the same group aimed to broaden the pH range of the tryptophan modification methodology.452 Because hydroxylamine was found to be ineffective at generating rhodium carbenoids at pH ≥ 6, a wide screening of commonly used buffers, as well as additives structurally similar to hydroxylamine (H2NOH), was conducted in order to identify appropriate conditions. From these studies, tBuNHOH was found to be highly effective at promoting carbenoid addition. Although the precise mode of action of tBuNHOH remains unclear, the authors attributed the substantial increase in catalytic activity to a specific interaction between this additive and Rh2(OAc)4. They speculated that, in contrast to hydroxylamine, tBuNHOH binds to Rh2(OAc)4 through the oxygen rather than the nitrogen, the latter being disfavoured by the bulky tert-butyl substituent (Fig. 45a), and that this binding increases both the stability and the reactivity of the complex at neutral and slightly basic pH.
Fig. 45 Optimised metallocarbenoid-based approach, described by Francis and collaborators.452 (a) Proposed binding of tBuNHOH with Rh2(OAc)4 at pH 6.0. O-Coordination is favoured due to a lower steric hindrance of tert-butyl groups with acetate ligands. (b) Crystal structure of wild type FKBP (pdb: 1A7X), containing a single, buried tryptophan residue TRP-59 (shown in magenta), which is unavailable for modification under nondenaturing conditions. (c) A peptide tag (IQKQGQGQWG) incorporated into the FKBP fusion protein expressed in E. coli as a C-terminal intein fusion. The total level of modification was estimated to be in excess of 40%.
Interestingly, in the same work, the authors have demonstrated the key role of solvent accessibility of residues in determining the outcome of conjugation on tryptophan using rhodium metallocarbenes. Human FK506 binding protein (FKBP) was identified as a suitable substrate for the study. The only Trp residue (TRP-59) of wild type FKBP (containing an additional C-terminal threonine residue) is located at the base of the binding pocket, and is therefore unavailable for modification under nondenaturing conditions (Fig. 45b).
To overcome these difficulties, a labelling strategy based on tryptophan mutagenesis followed by chemoselective modification with rhodium carbenoids was utilised. Tryptophan-containing FKBP proteins were expressed in E. coli with C-terminal intein fusions containing a chitin binding domain for affinity purification and short tryptophan-containing peptides (Fig. 45c). Indeed, these newly obtained mutants with solvent-accessible Trp residues showed a significant level of conjugation (more than 40%) under optimised non-denaturing conditions at room temperature.
In 2010, further developing the rhodium-carbenoid methodology for selective tryptophan labelling described 6 years earlier by Francis et al.,453 Popp and Ball reported structure-selective modification of aromatic side chains (expanding its scope to include Tyr and Phe residues) using a proximity-driven approach (see Section 5 for details).454 Structurally similar to Rh2(OAc)4, metallopeptide complexes with a dirhodium center bound to two glutamate residues were envisioned to deliver the catalyst into close proximity of the reactive side chains by exploiting matched coiled-coil peptides455 for molecular peptide–peptide recognition (Fig. 46).
Fig. 46 Selective covalent labelling of the TRP-9 residue of the peptide QEISALEKWISALEQEISALEK with its complementary dirhodium metallopeptide KISALQKQKESALEQKISALKQ described by Popp and Ball.454 The rhodium cluster, chelated with two glutamate residues, is brought closer to the reactive Trp residue by coiled-coil self-assembly of the peptides, resulting in selective peptide modification on TRP-9. Peptide structures were simulated using the RaptorX web server.447
By combining residue-selective chemistry with secondary-structure recognition, the authors have provided a strategy for selective covalent modification of biomolecules. However, only simple diazo reagents without functional handles were used in controlled environments on model peptide substrates.
In the following year, the same group extended their initial studies to examine the reactivity of whole proteins in a complex, cell-like environment.456 For this, the proximity-driven catalysis approach was applied to a recombinant maltose binding protein (MBP) fused with the 21-amino-acid tryptophan-containing coil (almost identical to the one used in the initial publication, Fig. 46). Directly after expression, the lysate was subjected to metallopeptide-catalysed biotinylation. A single band in Western blot analysis indicated highly selective modification of the MBP fusion protein, with no nonselective modification observed.
Fig. 47 Tautomerisation equilibrium of the neutral imidazole side chain (base forms A and C) occurring through the acid form B.457 Form A is somewhat favoured over C at neutral and acidic pH, while at basic pH the form C is preferred.458
Fig. 48 Fluorescent probes, described by Li et al. for selective ligation of histidine.459,460
The authors suggest that these probes can be used for specific labelling of His residues in proteins if milder reaction conditions (lower reaction temperature but longer reaction time) are used, but no example of such an application was given. Moreover, considering the known reactivity of epoxides with primary amines, thiolates, and hydroxyl groups, such selectivity towards histidine at physiological pH seems improbable.461
An affinity-based labelling approach (see Section 4) based on the epoxide opening was developed by Hamachi and collaborators for selective histidine labelling of bovine carbonic anhydrase II.462,463 Labelling reagents investigated by the authors must consist of at least three major fragments: (1) a benzenesulfonamide ligand directing specifically to bCA, (2) a reactive electrophilic epoxide for protein labelling, and (3) an exchangeable hydrazone bond between the ligand and the epoxide group for removing the ligand by hydrazone/oxime-exchange and restoring the enzymatic activity (Fig. 49a). Further developing their approach,463 the authors added an iodophenyl or acetylene handle on the epoxide-containing fragment to enable the possibility of further derivatisation of the obtained conjugate by Suzuki coupling464 or Huisgen cycloaddition465 either after or before removing the ligand from the active site of the enzyme (Fig. 49b).
Fig. 49 Affinity labelling, hydrazone–oxime-exchange reaction, and Suzuki coupling reaction on the surface of bCA (pdb: 1V9E) described by Hamachi and collaborators.463 (a) Principal structural fragments of the probe: a benzenesulfonamide ligand responsible for targeting bCA; an electrophilic epoxide responsible for reactivity towards vicinal His residues HIS-2 and HIS-3 (shown in magenta); a cleavable hydrazone bond responsible for the recovery of the enzymatic activity; an iodoaryl moiety utilised for further bioorthogonal transformation via Suzuki coupling. (b) Dual labelling of bCA in situ: fluorescent labelling of the HIS-2 or HIS-3 alkylated intermediate by Suzuki coupling with a coumarin derivative, followed by biotinylation by hydrazone/oxime-exchange.
Fig. 50 Luminescent histidine-selective peptide tagging.467 (a) Labelling of an N-terminal histidine-containing HTat peptide (HRKKRRQRRR). (b) Labelling of a dihistidine motif placed in the middle of a P450dHTat peptide (MLAKGLPPKSVLVKGGHHGRKKRRQRRR).
Although counting the formation of coordination complexes among conjugation techniques would be stretching a point, they are nevertheless included in this survey because of the propensity of histidine for complexation and the increased stability of the resulting complexes towards decomposition.
Described by Pauly at the beginning of the 20th century,480,481 the reaction with diazonium salts has only been used for the colorimetric determination of His residues482 and was not further elaborated for bioconjugation. Other classical examples of chemical modification of histidine are mainly reactions with pyrocarbonate, sulfonyl chlorides, sulfonic esters, phenacyl- and acylbromides and activated esters.229 These electrophiles react readily with other nucleophilic groups present in proteins (thiols, amines, alcohols, or guanidino groups) and require careful tuning of the reaction conditions to achieve sufficient selectivity. For instance, at low pH (generally < 6.0) these reactions are quite selective for histidine, as the main side reaction with the ε-amino group of lysine proceeds very slowly.483
The reader should, nonetheless, be aware that these examples do not represent a general rule, but an exception to it. To avoid the reactions of more nucleophilic functions, the His residue must be located in a unique microenvironment476,477 or have an enhanced nucleophilic character,478 but even in this case, prior modification of highly reactive Cys residues is often inevitable.479
The reactivity of the tyrosyl moiety is easily influenced by its deprotonation, which is a function of the microenvironment inside the protein. All described methodologies take advantage of either the peculiar chemical properties of the electron-rich aromatic ring or the ease with which the tyrosine hydroxyl group is transformed into a highly reactive phenolate.
An elegant approach – affinity labelling – overcomes the selectivity issues of tyrosine acylation by ligand-tethered direction of the reaction. In the case of tyrosine-selective modification, an acyl transfer catalyst is connected to a ligand with a high affinity for the target protein.462,486 The acyl group activated by the anchored catalyst is brought to the binding pocket of the protein and transfers an acyl moiety onto the nucleophilic Tyr residue in close proximity. Utilising this methodology, Hamachi et al.486 demonstrated selective tagging of the Y51 residue of Congerin II (Fig. 52) using a suitable saccharide as a ligand to the target lectin (carbohydrate-binding protein) and DMAP (4-dimethylaminopyridine) as an acyl-transfer catalyst. In a similar way, Broo and collaborators487 have demonstrated the possibility of site-specific acylation of a tyrosine residue situated in the active site of human glutathione transferase (hGST).
Fig. 52 Sugar-DMAP (4-dimethylaminopyridine) assisted Y51-specific acylation of Congerin II described by Hamachi et al.486 Schematic representation is made using lactose-ligated Congerin II crystal structure published by Muramoto et al.488 (pdb: 1IS4).
Miller and collaborators have shown that biotinylation with NHS esters (see Section 1.1.2) may result in preferential O-acylation of hydroxyl-containing residues – serine, threonine and tyrosine (though to a greater extent for the first two) – when they are located two positions away from histidine (i.e. in sequences His-AA-Tyr, where AA refers to any amino acid).85,88
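The sequence rule above lends itself to a simple motif scan. The following Python sketch is only an illustration, not a method from the cited work: it assumes that the motif reads strictly as His, any residue, then Ser, Thr or Tyr, and the function name `his_xaa_hydroxyl_sites` and the example sequence are hypothetical.

```python
import re

# His, any residue, then a hydroxyl-bearing residue (Ser, Thr or Tyr).
# The lookahead lets overlapping motifs (e.g. "HHS") be reported as well.
MOTIF = re.compile(r"(?=H[A-Z]([STY]))")


def his_xaa_hydroxyl_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, residue) of each S/T/Y two positions after a His."""
    return [(m.start() + 3, m.group(1)) for m in MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical one-letter sequence, used only to exercise the scan.
    for position, residue in his_xaa_hydroxyl_sites("MAHGSKLHAYRT"):
        print(f"candidate O-acylation site: {residue}{position}")
```

Such a scan only flags candidate positions; whether a flagged residue is actually acylated still depends on accessibility and reaction conditions.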
Several approaches for labelling involve the initial modification of tyrosine and successive conjugation of the obtained intermediate. For instance, ortho-nitration of tyrosine with tetranitromethane (TNM)489 or peroxynitrite490 yields o-nitrotyrosine, which can then be reduced by sodium dithionite (Na2S2O4) to form o-aminotyrosine.
Although much less reactive than aliphatic amines at neutral pH, the aromatic amine of o-aminotyrosine can selectively react with amine-reactive reagents at lower pH.491,492 Namely, Nikov et al.492 have demonstrated that selective labelling of aminotyrosines is achievable in the presence of N-terminal and lysine ε-amino groups by using an NHS-activated ester under particular reaction conditions (acetate buffer, pH 5.0, 2 hours). Exploitation of the pKa difference between aminotyrosyl residues and other reactive groups in proteins (4.75 for aminotyrosine, whilst much higher values for N-terminal and side-chain amino groups, see Section 1.1) allows their selective labelling. The method was validated on model peptides and then applied to the modification of human serum albumin (Fig. 53).
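The pKa argument can be made semi-quantitative with the Henderson–Hasselbalch relation, which gives the fraction of an amine present in its reactive, deprotonated form at a given pH. The Python sketch below is a rough illustration under stated assumptions: the pKa of 4.75 for aminotyrosine is taken from the text, whereas the values of 8.0 for an N-terminal α-amine and 10.5 for the lysine ε-amine are typical textbook figures assumed here, not values from the cited work.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of an amine present as the free base (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    labelling_ph = 5.0  # acetate buffer pH quoted in the text
    amines = {
        "o-aminotyrosine (pKa 4.75, from the text)": 4.75,
        "N-terminal amine (pKa 8.0, assumed)": 8.0,
        "lysine epsilon-amine (pKa 10.5, assumed)": 10.5,
    }
    for name, pka in amines.items():
        print(f"{name}: {fraction_deprotonated(pka, labelling_ph):.2e} deprotonated")
```

Under these assumptions roughly two-thirds of the aminotyrosine amine is deprotonated at pH 5.0, against about one part in a thousand for the N-terminus and a few parts per million for lysine. The free-base fraction is only one factor, since the intrinsic nucleophilicity of each amine also differs.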
Fig. 53 Two-step biotinylation of HSA (pdb: 1AO6) described by Nikov et al.492 Preferential nitration of TYR-138 (shown in magenta) and TYR-411 residues of HSA with peroxynitrite was achieved using the protocol described by Jiao and colleagues.490 Aminotyrosine of the peptide 138YIYEIARK144, obtained after the step of reduction with sodium dithionite and subsequent digestion of nitrated HSA with trypsin, was selectively modified with a cleavable biotin-containing reactant at pH 5.0.
Despite the reaction of TNM and peroxynitrite with proteins being reasonably specific for tyrosine, side reactions with histidine, methionine and tryptophan have been reported, as has oxidation of sulfhydryl groups. The latter would seem to be the most common side reaction, as it can result in disulfide bond formation and the formation of oxidation products such as sulfone and sulfenic acid derivatives. As a general rule, it is normally assumed that the reaction of nitration reagents with Cys residues proceeds equally well at pH 6 and pH 8, while the reaction with tyrosine occurs at pH 8 and not at pH 6.
In a like manner, the phenol group in Tyr residues can be initially ortho-formylated with chloroform in an alkaline medium to a salicylaldehyde derivative, and then undergo a reaction with ortho-phenylenediamine derivatives to form fluorescent benzimidazoles as conjugation products (Fig. 54).493,494
Fig. 54 Two-step tyrosine modification by selective formylation of the tyrosyl residue at the ortho position of its phenolic moiety (reaction with chloroform in potassium hydroxide solution) and further derivatisation of the resulting aldehyde described by Kai et al. and Ishida et al.493,494
Further exploiting the methodology developed by Trost and Toste for selective O- and C-alkylation of phenols with π-allylpalladium complexes,495,496 Francis et al. have demonstrated the possibility of selective allylic alkylation of surface-exposed tyrosines of several full-size proteins.497
The photoactivable metal-catalysed version of tyrosine oxidation chemistry is significantly faster than the one achieved by Ni(II)–peptide complexes. It has been largely exploited by Kodadek and co-workers for cross-linking of closely associated proteins.503–505
These coupling reactions are hypothesised to occur through the addition of tyrosyl radicals to adjacent Tyr residues (Fig. 55). It is worth mentioning that in some cases nearby tryptophan and other nucleophilic side chains can also participate in oxidation.503 For more details, the reader is directed to a review by Bonnafous and a publication of Francis that provide an excellent overview of oxidative cross-linking techniques.507,508
Fig. 55 Transition metal catalysed oxidative cross-linking. (a) Schematic representation of the protein–protein cross-linking, described by Burlingame et al.506 The tyrosyl radical generated from the ecotin (pdb: 1ECZ) D137H mutant after the abstraction of an electron with the Ni(II)–GGH complex reacts with an additional tyrosine residue on a nearby protein, held in close proximity by a significant protein–protein interaction, to give a dimer. (b) Chemically activated by MMPP or oxone as stoichiometric oxidants, Ni(II)–peptide complexes can be used as efficient catalysts of cross-linking. (c) Photochemically activated by visible light in the presence of (NH4)2S2O8 or Co(III)(NH3)5Cl2+ as electron acceptors, transition metal complexes are used as catalysts of cross-linking.
The use of cerium(IV) ammonium nitrate (CAN) – a classical one-electron oxidant – for chemoselective ligation on tyrosine was demonstrated by Francis et al. (Fig. 56).509 After optimisation of the reaction conditions, the authors achieved modification of tyrosine-containing proteins in high yields at neutral pH and low substrate concentration, and applied this strategy to modify both native and introduced residues on proteins with polyethylene glycol (PEG) and small peptides, although the concurrent reaction of Trp residues had to be dealt with.509
Fig. 56 PEGylation of four solvent-accessible Tyr residues of chymotrypsinogen (pdb: 1EX3) reported by Francis et al.509 (only the Y171 residue is shown for simplicity). The intermediate tyrosyl radical, generated in the presence of cerium(IV) ammonium nitrate (CAN), gives two addition products with an electron-rich aniline derivative (O- is preferred over C-arylation).
Notwithstanding the issue of specificity, photo-oxidation and oxidation techniques for tyrosine ligation continue to be of considerable interest for the study of protein–protein interactions,510 the mapping of multi-protein complexes,511 and the assembly of macromolecules.512
The optimised conditions have nonetheless allowed its application for selective modification of tyrosines on the surface of bacteriophage MS2,515,516 the modification of the tobacco mosaic virus,517 and the direct conjugation on proteins.518 Francis and co-workers have demonstrated that highly reactive diazonium salts (i.e. containing electron-withdrawing groups in their structure) should be utilised in order to achieve efficient Tyr targeting and avoid the concurrent reaction with Lys and His residues (see Section 1.5.4).517 Recently described by Barbas et al., the formylbenzene diazonium hexafluorophosphate reagent519 represents an elegant example of a stable ready-to-use reagent for tyrosine labelling and introduction of an aldehyde bioorthogonal tag, capable of further bioorthogonal modifications (Fig. 57).
The three-component Mannich-type methodology – involving the in situ reaction between a Tyr residue, an amine and formaldehyde – was revived more than 50 years later by Francis et al.521 The authors demonstrated the possibility of selective modification of tyrosine residues of α-chymotrypsinogen A under mild conditions (pH 6.5, 25–37 °C) and at low concentration of the protein (20–200 μM). However, 18 hours of incubation were needed to reach a reasonable level of tagging (66% in the case of fluorescent labelling, Fig. 58). The same group then used this approach to incorporate synthetic peptides into full-sized proteins.522
Fig. 58 Three-component Mannich-type selective modification of the Y154 Tyr residue of α-chymotrypsinogen A (pdb: 1EX3) by a rhodamine dye, described by Francis et al.521
Despite the recognised selectivity issues of the three-component Mannich-type approach for tyrosine labelling,523 its main advantage is the possibility to easily vary the participating partners: an aldehyde (Fig. 58, shown in blue) and an aniline (Fig. 58, shown in violet). In the following publication on the subject, Francis et al. have demonstrated the viability of NMR-based characterisation of the conjugate isotopically enriched by incorporation of 13C-formaldehyde into the coupling reaction.523 Interestingly, while a reaction by-product arising from the tryptophan indole ring was revealed, the Cys moiety was found not to participate in the reaction, except in the case of a reduced disulfide, which formed a dithioacetal.
Using similar precursors – electron-rich aniline derivatives – Tanaka et al.524 demonstrated the potential of imines obtained in situ as fluorogenic probes for tyrosine labelling. While the starting materials, as well as the imine derivatives, exhibited weak or no fluorescence, the addition products had a significantly higher (more than 100-fold) level of fluorescence.
In a subsequent study, the same group expanded this approach to presynthesised cyclic imines, completely removing the need for an excess of highly reactive formaldehyde.525 Although the authors have clearly demonstrated the applicability of their methodology in water at room temperature over a wide pH range (pH 2–10) on a set of model phenols, no example of peptide or protein conjugation was given.
Fig. 59 Reaction of electron-rich arenes with azodicarboxyl compounds (shown in green). (a) First example described by Schroeter in 1969. (b) Conjugation of an HIV entry inhibitor aplaviroc (APL) containing a cyclic diazodicarboxamide derivative – 4-phenyl-3H-1,2,4-triazoline-3,5(4H)-dione (PTAD) – with the IgG antibody demonstrated by Barbas.537
However, these highly reactive reagents decompose rapidly in aqueous media, which makes them unsuitable for bioconjugation.535 On the other hand, the corresponding diazodicarboxyamide reagents are too stabilised and do not react with phenols in aqueous media.535 Cyclic diazodicarboxyamides like 4-phenyl-3H-1,2,4-triazoline-3,5(4H)-dione (PTAD) were recently reported by Barbas and collaborators and represent a good compromise between reactivity and stability of diazodicarboxyl-containing reagents.536,537 Diazodicarboxylate-mediated tyrosine conjugation is applicable over a wide pH range, although the highest labelling efficiency was observed at pH 7–10.536 A versatile class of stable PTAD precursors, possessing different functional groups, was developed and applied for selective tyrosine conjugation (Fig. 59b). Their use requires prior oxidation with 1,3-dibromo-5,5-dimethylhydantoin and the addition of a small amount of TRIS (2-amino-2-hydroxymethyl-propane-1,3-diol) during the conjugation step. The latter is of crucial importance for the coupling selectivity, as it is hypothesised to serve as a scavenger of a putative isocyanate by-product of PTAD decomposition, which is promiscuous in labelling.
The non-selective labelling of other aromatic side chains of proteins is the Achilles' heel of the vast majority of approaches described for tyrosine labelling. Careful tuning of reaction conditions is important for achieving appropriate levels of selectivity. In some cases where purely chemical distinction of the reactivity of amino acid moieties is not feasible, catalysis on the basis of molecular shape rather than local environment can be used to induce selectivity. This concept is routinely exploited by enzymes and enables reactivity that would otherwise be kinetically impossible. In 2010, Popp and Ball used dirhodium metallopeptide catalysts for selective conjugation on tyrosine and tryptophan using the concept of the proximity-driven mechanism (see Section 5).454 In the following year, Silverman et al. demonstrated a DNA-catalysed approach for selective labelling of tyrosine, although only on small peptide substrates.538
However, the pKa value of arginine was found to vary significantly with the microenvironment within certain proteins,542,543 allowing, in the terms of Leitner and Lindner,544 the grouping of arginines into “exposed” or “partially buried” residues, based on the difference in their reactivities.
Most of the described approaches for arginine labelling and modification exploit the chemistry of α-dicarbonyl compounds. For instance, introduced by Takahashi545 as an arginyl reagent, phenylglyoxal has been applied to the study of complex systems in the past decade.546–550 The reaction occurs under mild conditions and consists of two steps: the first addition of phenylglyoxal results in the formation of a hydrolytically unstable imidazolidine diol, and the second step results in a relatively stable addition product (Fig. 61).
Fig. 61 Reaction of phenylglyoxal with the arginine moiety (reaction condition: 0.2 M N-ethylmorpholine acetate buffer, pH 7–8).
Substituted phenylglyoxal analogues, such as p-hydroxyphenylglyoxal, p-nitrophenylglyoxal, 4-hydroxy-3-nitrophenylglyoxal, and p-azidophenylglyoxal (APG), were reported for spectrophotometric and cross-linking studies of the modification of arginine in proteins.551–555 None of these linkers has, however, been used in bond-forming conjugation. Because phenylglyoxal, like glyoxal, reacts with ε-amino groups at a significant rate,545 many efforts were made to increase its selectivity towards the guanidinium residue. Cheung and Fonda have studied the effect of buffers and pH on the reaction rate556 and found that the reaction of arginine is greatly accelerated in bicarbonate–carbonate buffer systems, possibly due to the stabilisation of the obtained diol.
Vicinal diones – namely 2,3-butanedione (introduced by Yankeelov)557,558 and 1,2-cyclohexanedione (introduced by Itano)559 – are other well-characterised reagents for the modification of Arg residues. The reaction progresses through a pathway similar to that of the phenylglyoxal addition. However, it was not until the observation that borate had a significant effect on the selectivity of the reaction that the use of these reagents became practical.560,561 The presence of borate in the solution shifts the equilibrium of the addition to a guanidine moiety through the stabilisation of the reversibly obtained diol (Fig. 62).
Fig. 62 Reaction of 2,3-butanedione with arginine residues of Carboxypeptidase A (pdb: 1HDU) in the presence of borate described by Riordan.560
In 2005, using this approach, Lindner and colleagues described a method for the selection of arginine-containing peptides from a tryptic digest of model proteins (BSA, lysozyme, ovalbumin) by solid-phase capture and release.562 First, arginine-containing peptides present in the digest were covalently modified on the guanidine moiety with 2,3-butanedione and phenylboronic acid under alkaline conditions. Polymeric materials allowing the immobilisation of phenylboronic acid were then used to capture the arginine-peptides on a solid support while washing away all arginine-free peptides, which are not covalently bound. Finally, the arginine-peptides were released from the boronic acid beads owing to the reversibility of the reaction. Photoactivable bifunctional reagents for cross-linking of arginine moieties have been elaborated by Ngo et al. and Politz et al. to study enzymes with an arginine at their active sites.555,563
Arginine-specific PEGylation of lysozyme using polyethylene glycols containing an α-oxo-aldehyde motif in borate buffer was recently reported by Gauthier and Klok564 and represents a mild and selective method for protein modification (Fig. 63). Other methods described to date565,566 possess selectivities that are not sufficient (especially in the presence of Lys moieties) to consider them suitable for bioconjugation.
![]() | ||
Fig. 63 Selective PEGylation of lysozyme's arginine side chains (pdb: 2LYZ) described by Gauthier and Klok.564 |
For more than half a century, carbodiimide-mediated activation has been the most extensively used methodology for the modification of free carboxylic acids in proteins.567,568 The reaction of carbodiimides with protonated carboxyl groups yields activated acylisoureas, which then react smoothly with a variety of nucleophiles, notably amines (Fig. 64).569 It is important to use weakly basic amines, which remain deprotonated and thus reactive at pH below 8.0, to avoid the protein cross-linking that occurs at higher pH values. For this reason, weakly basic hydrazides are often the reagents of choice in coupling reactions with activated carboxylic acids.570 Although water-insoluble carbodiimides (DCC, DIC) continue to be useful for acid-selective protein conjugation,571,572 most current reports exploit water-soluble carbodiimides such as 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC). Developed by Sheehan and Hlavka,573,574 these carbodiimides first proved particularly useful as zero-length cross-linking reagents to study proteins.574 Subsequent studies were devoted to the application of carbodiimides to the quantitation of accessible carboxyl groups in proteins,567,568,575 the preparation of antigenic conjugates,576 and protein immobilisation.577
![]() | ||
Fig. 64 (a) Carbodiimide-mediated activation of the carboxylic acid side chain of glutamate. (b) Relevant examples of activating reagents. |
As mentioned previously, the upper limit of the optimal pH for carboxylate conjugation is defined by the reactivity of the free amino groups present in the protein. The lower limit is mainly determined by the aqueous stability of the activating reagent. Borders and co-workers578 studied the stability of EDC in aqueous solution and found that EDC has a half-life (t1/2) of 37 hours (pH 7.0), 20 hours (pH 6.0), and 3.9 hours (pH 5.0) in 50 mM MES buffer at 25 °C; in the presence of 100 mM glycine, the t1/2 values were 15.8 hours, 6.7 hours, and 0.73 hours, respectively. This supports an optimal pH range of 6.0 to 7.0 for acid-selective conjugation. NHS (or its water-soluble analogue sulfo-NHS) is often included in coupling protocols to improve efficiency or to create a more stable intermediate. Possible side reactions involving activating reagents were recently reviewed by Valeur and Bradley.579
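The half-lives quoted above can be turned into an estimate of how much EDC survives a given coupling time. The following is a minimal sketch that assumes simple first-order decay (an assumption made here for illustration; the text reports only the half-lives). The function and dictionary names are hypothetical, and the 2-hour coupling time is an arbitrary example.

```python
import math

# Half-lives (hours) of EDC in 50 mM MES at 25 °C, as quoted in the text.
EDC_HALF_LIFE_H = {7.0: 37.0, 6.0: 20.0, 5.0: 3.9}
# Same buffer with 100 mM glycine added, as quoted in the text.
EDC_HALF_LIFE_GLY_H = {7.0: 15.8, 6.0: 6.7, 5.0: 0.73}


def fraction_remaining(half_life_h: float, time_h: float) -> float:
    """Fraction of reagent left after time_h, ASSUMING first-order decay."""
    k = math.log(2) / half_life_h  # first-order rate constant, 1/h
    return math.exp(-k * time_h)


if __name__ == "__main__":
    coupling_time_h = 2.0  # arbitrary illustrative coupling time
    for ph in sorted(EDC_HALF_LIFE_H):
        plain = fraction_remaining(EDC_HALF_LIFE_H[ph], coupling_time_h)
        gly = fraction_remaining(EDC_HALF_LIFE_GLY_H[ph], coupling_time_h)
        print(f"pH {ph}: {plain:.0%} left in MES; {gly:.0%} with glycine")
```

Under this assumption the loss of reagent at pH 5.0 is much faster than at pH 6.0–7.0, which is consistent with the pH window recommended in the text.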
Woodward's reagent K (N-ethyl-5-phenylisoxazolium-3′-sulfonate)580 and analogous substrates have been used as activating reagents for carboxyl groups for synthetic purposes. Bodlaender et al.581 used N-ethyl-5-phenylisoxazolium-3′-sulfonate and the N-alkyl derivatives of 5-phenylisoxazolium fluoroborate to activate carboxyl groups on trypsin for subsequent modification with methylamine or ethylamine.
Lastly, several studies revealed unexpected examples of carboxyl group modification with reagents that usually react far more effectively with other nucleophiles. For instance, p-bromophenacyls and iodoacetamides have been found to selectively alkylate carboxylic acid moieties of pepsin and ribonuclease T1, respectively.582–584 However, these reagents are not generally applicable and are suitable for specific substrates only.
All approaches described in the literature exploit alkylation of Met residues in acidic media. Although many nucleophilic functional groups present in proteins can react with alkylating reagents, at low pH all of them except methionine exist in protonated forms, which greatly decreases their reactivity.587 Consequently, alkylations of other nucleophilic functional groups, such as thiols, are commonly conducted at high pH,400 whereas methionine is the only functional group in proteins able to react with alkylating reagents at low pH.
Building on the pioneering studies of Toennies in the 1940s,589,590 Kramer and Deming recently reported a reversible chemoselective labelling of methionine in peptides and polypeptides.588 Treatment of the model peptide (PHCRKM) with alkylating reagents of different structures in 0.2 M aqueous formic acid (pH 2.4) gave a single product in which only the Met residue was alkylated. The resulting sulfonium salts can readily be dealkylated by addition of pyridine-2(1H)-thione (PyS) to give the starting peptide as the sole product along with the alkylated PyS byproduct. The removal reaction was found to be selective: dealkylation can be achieved using concentrations of PyS that do not react with the disulfide bond in cystine under identical conditions (Fig. 65).
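The protonation argument can be made semi-quantitative with the Henderson–Hasselbalch relation. The sketch below is illustrative only: the pKa values are typical textbook figures assumed here (they are not taken from this review and shift with the microenvironment of each residue), and the function name is hypothetical. Methionine is not listed because, as stated in the text, its thioether is not protonated under these conditions.

```python
# Typical textbook pKa values for the conjugate acids of protein nucleophiles.
# ASSUMED for illustration; real values depend on the residue's environment.
ASSUMED_PKA = {
    "Cys thiol": 8.3,
    "His imidazolium": 6.0,
    "N-terminal ammonium": 8.0,
    "Lys ammonium": 10.5,
}


def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch fraction of the deprotonated (reactive) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # pH 2.4 mirrors 0.2 M formic acid quoted in the text; 8.0 is a
    # representative "high pH" chosen here for comparison.
    for ph in (2.4, 8.0):
        print(f"pH {ph}")
        for group, pka in ASSUMED_PKA.items():
            print(f"  {group:<20} {fraction_deprotonated(pka, ph):.2e}")
```

With these assumed values, every listed nucleophile is almost entirely protonated at pH 2.4, whereas a substantial fraction of the thiol is deprotonated at pH 8.0.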
![]() | ||
Fig. 65 Selective reversible modification of a 6-amino-acid peptide (PHCRKM) via methionine alkylation reported by Kramer and Deming.588 No reactions with other amino acids were detected. |
In his classical work on the acetylation of growth hormone, Reid592 demonstrated for the first time that N-termini can be modified selectively if acetylation is performed with a relatively small amount of acetic anhydride. Further development of this approach has resulted in several protocols for selective labelling of the α-amino groups of proteins,3,593 peptides,594,595 and proteomes.596–598 Under optimal reaction conditions (a 5-fold excess of amine-reactive reagent in PBS, pH 6.5, at 4 °C), high levels of selectivity can be achieved after 2–24 hours of reaction. However, the preference for terminal amino groups achieved by control of pH is rather limited, mainly because the pKa of the amino group varies with the microenvironment and reaction conditions. Consequently, more efficient methods that rely upon the increased chelating ability of N-termini599 or the direct participation of the adjacent side chains or peptide bond600,601 were developed and to date represent the preferred approaches for bioconjugation.
![]() | ||
Fig. 66 Selective labelling of lysozyme (pdb: 2LYZ) described by Che.603 Moderate-to-high level of selectivity towards N-terminal residue (K1) was achieved in the presence of 5 other in-chain lysines (LYS-13, LYS-33, LYS-96, LYS-97 and LYS-116). The ketene was synthesised from the corresponding acid by a two-step protocol: transformation into a mixed anhydride (oxalyl chloride, DCM) followed by the transformation thereof into ketene upon the reaction with a base (TEA, THF). |
Inspired by the pioneering works of Metzler and Snell,607 and Cennamo and collaborators608,609 on the transamination of simple amino acids and peptides under harsh conditions (heating at 100 °C and pH 5.0), Dixon and Moret610,611 developed a method for mild transamination in the presence of copper(II) salts, which allowed the reaction to proceed at room temperature (Fig. 67). The isomerisation of the imine generated in situ by a catalysed 1,3-proton shift is the key step of the transformation and defines both its direction and the reaction rate. It is the activation of the α-protons of the N-terminus by the adjacent peptide bond and the metal ions that makes this reaction specific to α-amino groups. Interestingly, Dixon reports that no traces of reaction of lysine side chains were ever observed.600
![]() | ||
Fig. 67 General scheme of the transamination reaction activated by metal ions (typically Cu2+ and Ni2+).610 Principal steps of the reaction mechanism: (a) generation of the imine; (b) isomerisation of the obtained imine by proton removal; (c) hydrolysis of isomeric imine to generate transaminated reaction partners. |
A wide range of reaction conditions has been tested since then. The discovery that pyridine612 and acetate613 greatly accelerate the transamination of amino acids led to slightly milder reaction conditions, which, however, were still too harsh to maintain the folded structure of most proteins and were therefore more appropriate for sequence-analysis applications.614
Only recently, Francis and co-workers601 re-examined Cennamo's approach608 to amino acid transamination in the presence of pyridoxal-5-phosphate (PLP, vitamin B6) at 100 °C and found that much milder reaction conditions suffice for modifying the N-terminal residues of peptides (65 °C for the reaction to be over in 2 hours; 25 °C to achieve complete conversion in 24 hours). Screening experiments on different N-terminal amino acids of the peptides indicated that the aldehyde structure strongly influenced the reaction efficiency. Remarkably, PLP, which has been known for more than 50 years, emerged as the most effective aldehyde among the dozens screened, affording the highest yields under milder reaction conditions.
The mechanism of PLP-mediated transamination is depicted in Fig. 68. The reaction of PLP with the N-terminal amine results in the formation of a Schiff base aldimine (a). The α-proton of the amino acid is then transferred to the 4′ position of the pyridoxal unit (b and c). Finally, hydrolysis of the resulting ketimine leads to the desired α-ketoacid and pyridoxamine phosphate (d). The quinoid is an important intermediate in the transformation of aldimine to ketimine and can be found in all transamination reagents described to date.
![]() | ||
Fig. 68 Structure of PLP and mechanism of PLP-mediated transamination. The reaction pathway consists of (a) condensation between pyridoxal and the amine; (b and c) tautomerisation of the resulting aldimine, which is favourable because of the much lower intrinsic pKa value of the α-proton (shown in blue); (d) hydrolysis of the resulting ketimine, accompanied by decarboxylation in the case of aspartic acid (R = –CH2COOH).601 |
Under optimal reaction conditions (10–50 μM protein and 10–50 mM PLP at 37 °C in PB, pH 6–7), complete conversion is generally achieved after 2–24 hours. The resulting keto-proteins are generally rather stable and can be concentrated, stored, or lyophilised without any specific precautions.615 An example of the transamination-conjugation methodology was demonstrated by Francis in the initial publication on selective labelling of an N-terminal glycine residue of horse heart myoglobin and enhanced green fluorescent protein (EGFP, Fig. 69).601
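The concentration ranges quoted above imply a very large molar excess of PLP over protein. As a quick arithmetic check, the sketch below computes the two extremes of that ratio; the helper name is hypothetical and the calculation is simple unit conversion, not part of the published protocol.

```python
def molar_excess(reagent_mM: float, protein_uM: float) -> float:
    """Molar ratio of reagent to protein (1 mM = 1000 uM)."""
    return reagent_mM * 1000.0 / protein_uM


if __name__ == "__main__":
    # Extremes of the quoted ranges: 10-50 mM PLP and 10-50 uM protein.
    lowest = molar_excess(reagent_mM=10, protein_uM=50)
    highest = molar_excess(reagent_mM=50, protein_uM=10)
    print(f"PLP excess spans roughly {lowest:.0f}- to {highest:.0f}-fold")
```

The ratio therefore ranges from about 200-fold to 5000-fold, which is worth keeping in mind when planning removal of excess reagent before the second labelling step.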
![]() | ||
Fig. 69 Site-specific N-terminal labelling of EGFP (pdb: 2Y0G). Proteins possessing N-terminal carbonyl groups, obtained by PLP-mediated transamination in the first step, were labelled with hydroxylamine probes in the second step.601 |
Although the side chain of the N-terminal residue does not participate directly in the transamination mechanism, the reaction rates were found to vary significantly depending on the amino acid in the N-terminal position.616 Generally, the majority of N-terminal amino acids provide high yields of the desired transaminated products; however, some residues (His, Trp, Lys, and Pro) generate adducts with PLP itself, while others are incompatible with the technique because of known side reactions (Ser, Thr, Cys and Trp) or complete inertness (Pro).
To investigate the scope and limitations of the transamination reaction, Francis and collaborators prepared an 8000-member one-bead-one-sequence combinatorial peptide library in which the three N-terminal residues were varied.617 Interestingly, the Ala-Lys (AK) motif was found to be particularly favourable for transamination yields (the Lys residue is hypothesised to accelerate the 1,3-proton shift isomerisation step by acting as a general base). To demonstrate this, labelling of the Type III “Antifreeze” Protein and of its mutant presenting the AKT sequence at the N-terminus were benchmarked side by side. At every time point analysed, the AKT terminus outperformed the wild-type one (GNQ) at different concentrations of PLP.
Although the mild reaction conditions of PLP-mediated transamination render it amenable to the modification of intact proteins,618 the yields are generally not high and elevated temperatures are usually required, which largely limits the practical applicability of the approach. Given this situation, Francis and co-workers619 used the above-described combinatorial approach to identify other transamination agents. As a result, N-methylpyridinium-4-carboxaldehyde benzenesulfonate salt (Rapoport's salt, RS, Fig. 70) was identified as a highly effective alternative to PLP. Furthermore, it was found to be particularly efficient for glutamate-rich sequences.619,620 The fact that several antibody isotypes possess at least one glutamate-terminal chain makes RS particularly suitable for their selective conjugation. Remarkably, the difference in transamination rates between Glu- and non-Glu polypeptides was large enough to allow selective labelling of only the heavy chain of immunoglobulin G1 (containing the N-terminal Glu residue), while leaving the light chain unmodified. This was attributed to the higher steric hindrance of the already less-reactive substrate (N-terminal Asp). To confirm these results, an IgG1 mutant possessing the Asp–Asp–Ser sequence on both chains was prepared. Indeed, both sites of this mutant were modified when exposed to RS. To address another recognised drawback of the Francis methodology, namely its low efficiency for bulky amino acid termini (Leu, Ile, Val), Zhang et al.621 developed an efficient PLP analogue, FHMDP (Fig. 70), which demonstrated much higher efficiency in their transamination.
The above-mentioned transformations are just a few examples from the rapidly growing field of transaminative modification of proteins. Recent advances have also resulted in general approaches for protein immobilisation (Fig. 71),15,622 dual fluorescent modification of periplasmic solute binding proteins,623 protein PEGylation and PEG-like conjugation (e.g. OEGMAtion),624 preparation of phage conjugates,625,626 and N-terminus proteomics,627,628 and have enabled Wittig629 and Pictet–Spengler ligations on transaminated proteins.630,631
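The size of the library follows from simple combinatorics: three varied positions give 8000 sequences if each position is drawn from 20 building blocks. The sketch below enumerates such a sequence space, assuming the 20 canonical amino acids at every position (an assumption made here; the text states only the library size and the number of varied positions). The function names are hypothetical.

```python
from itertools import product

# One-letter codes of the 20 canonical amino acids (ASSUMED building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tripeptide_library(alphabet: str = AMINO_ACIDS) -> list[str]:
    """Enumerate every N-terminal tripeptide over the given alphabet."""
    return ["".join(residues) for residues in product(alphabet, repeat=3)]


def with_prefix(library: list[str], prefix: str) -> list[str]:
    """Select library members that start with a given motif, e.g. 'AK'."""
    return [seq for seq in library if seq.startswith(prefix)]


if __name__ == "__main__":
    library = tripeptide_library()
    print(len(library))                     # size of the sequence space
    print(len(with_prefix(library, "AK")))  # members carrying the AK motif
    print("AKT" in library, "GNQ" in library)
```

Under this assumption both termini compared in the benchmark, AKT and the wild-type GNQ, are members of the same enumerated sequence space.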
Rather than demonstrating any ability to transaminate, 2PCA exclusively showed conversion to a pair of cyclic imidazolidinone diastereomers upon reaction with model peptides. The key step of the reaction mechanism is a nucleophilic attack of the adjacent amide nitrogen on the electrophilic carbon of the initially formed N-terminal imine (Fig. 72a). The presence of nitrogen heterocycles (namely pyridines) was found to be crucial for the efficiency of this stereoelectronically disfavoured condensation. It is noteworthy that Lys residues are unreactive in such a pathway because they lack a neighbouring amide group suitable for cyclisation and have higher pKa values than α-amino groups. Furthermore, this methodology was also found to be compatible with the presence of free in-chain cysteines. This approach can therefore be considered orthogonal to Lys-selective (see Section 1.1) and Cys-selective methodologies (see Section 1.3).
![]() | ||
Fig. 72 N-terminal protein modification using 2-pyridinecarboxyaldehydes (2PCA) reported by Francis et al.632 (a) Mechanism of the reaction between the N-terminal amino group and 2PCA. (b) Modification of native RNase A (pdb: 7RSA). |
2PCA-mediated conjugation was found to be generally applicable for protein labelling (the authors demonstrated its application on a broad set of 12 different proteins including RNase A; Fig. 72b) except for N-acylated proteins (no imine formation) and peptides containing proline in position 2 (no cyclisation of the formed N-terminal imine).632 The resulting imidazolidinone-containing conjugates are generally moderately stable and decompose to starting protein by about 20–30% after 12 hours of incubation at 37 °C (in case of RNase A). This may limit the suitability of this methodology for several applications where stability of generated conjugates is crucial; however, the substrate variation could resolve the issue and such efforts are underway.
![]() | ||
Fig. 73 General scheme of the ligation strategy proposed by Liu and Tam.635 A model 50-residue peptide was obtained in good yield in the ligation reaction between a 32-mer peptide VVSHFNDCPDSHTQFEFHGTCRFLVQEDKPAR containing a C-terminal aldehyde function and a 17-mer peptide CHSGYVGARC(Ac-m)EHADLLA containing an N-terminal cysteine; in the case of Thr- and Ser-peptides, the reaction was moderately efficient. The peptide structures were simulated using the RaptorX web server.447 |
Although this methodology has demonstrated high potential for the ligation of unprotected peptides,635,638,639 the generation of a “non-native” heterocyclic fragment at the ligation site has led to its being almost completely displaced by other “native ligation” approaches, notably by the mechanistically similar native chemical ligation (NCL) (see Section 2.3.1).
Because it is based on periodic acid-mediated oxidation,642 the reaction occurs only when a target site exists with which periodate can form a cyclic intermediate, that is to say, when the N-terminal residue is a serine or a threonine, or when hydroxylysine (which rarely occurs in proteins) is present. Possible side reactions include oxidation of the side chains of Met, Trp, and His. However, the potential for side reactions can be diminished by using very low periodate-to-protein molar ratios, as demonstrated by Geoghegan and Stroh in their experiments on two model peptides, SIGSLAK and SYSMEHFRWG, and on recombinant murine interleukin-1α (an 18 kDa cytokine with N-terminal Ser, Fig. 74),641 or by conducting the oxidation at neutral pH.643
![]() | ||
Fig. 74 N-terminal serine labelling of recombinant murine interleukin-1α (pdb: 2KKI) with Lucifer yellow dye described by Geoghegan and Stroh.641 The method is also applicable if N-terminal threonine is present. |
As in the initial publication of Geoghegan and Stroh,641 the resulting glyoxyloyl group can serve as the locus for further chemical modification by aldehyde-selective reactions (e.g. through the formation of stable oxime, hydrazone or previously described oxazolidine moieties).605,606 Robin and colleagues demonstrated that this two-step methodology can be used to assemble two unprotected protein fragments: one oxidised to a glyoxyloyl-containing derivative and the other a hydrazide peptide derivative.644 Rose and co-workers645 exploited the reactivity of the generated glyoxyloyls towards O-alkyl hydroxylamine derivatives to synthesise a pentameric form of the cholera toxin subunit B.
Further investigation of periodate oxidation extended its use to site-selective tagging, PEGylation, preparation of protein conjugates, protein capture and synthesis of large protein dendrimers.643,646–651 It is worth mentioning that periodate oxidation is incompatible with a number of protein classes. For instance, glycoproteins undergo periodate-mediated cleavage of their polysaccharide chains as a side reaction.652
Lastly, glyoxyloyls can easily be transformed into the corresponding amines via a transamination reaction in the presence of copper(II) or nickel(II) salts.604,610,653 The reaction mechanism, as well as the need for both essential components of the system, the acceptor of the glyoxyloyl (usually aspartic acid or glycine) and the heavy metal cation, is explained in Fig. 75. Although of moderate interest for bioconjugation in itself, this approach initiated the development of a more general methodology for selective N-terminus modification: transaminative conjugation (Section 2.1.3). The reader is directed to a recent review by El Mahdi and Melnyk654 for a complete overview of glyoxyloyl transformations in bioconjugation.
![]() | ||
Fig. 75 Preparation of S1G mutant of corticotropin-releasing hormone (GRH) by regioselective transformation of N-terminal serine to glycine.604 Reaction steps: (a) selective serine oxidation by periodate; (b) formation of imine with amino group of glutamic acid precomplexated with Cu2+ (Cu-Glu); (c) base-assisted isomerisation of imine; (d) hydrolysis of imine, completed when aspartate and copper ions are removed. |
The inherent reactivity of N-terminal phosphorylated Ser or Thr residues was demonstrated to significantly facilitate amide bond formation with a range of C-terminal peptide thioesters. Although it is not yet clear which exact intermediate is formed during ligation, the authors hypothesised that rapid acyl migration to the N-terminal amine of the peptide occurs through the formation of an unstable acyl phosphate (Fig. 76).
![]() | ||
Fig. 76 Hypothesised mechanism of the phosphate-assisted ligation reported by Thomas et al.655 |
In 1994, confronted with this criticism, Kent and co-workers664 introduced a versatile approach to the linkage of peptide fragments through a native peptide bond: native chemical ligation (NCL). Based on the original principles of the chemical ligation methodology663 and the ability of thioesters to undergo S → N acyl shift discovered by Wieland et al.,665 NCL allows chemoselective formation of the amide bond in the presence of unprotected nucleophilic amino acid side chains such as alcohols (Ser/Thr), phenolates (Tyr), free amines (Lys), carboxylates (Glu/Asp), or other thiolates (Cys) present in the macromolecule.
By analogy with the previously developed O → N acyl shift, the exquisite regioselectivity of this methodology results from the reversibility of thioester–thiol exchange in the presence of an exogenous thiol additive, coupled with capture of the acyl segment by an S → N acyl shift, which is possible only when the acyl group is brought into close proximity to an amine, as in the N-terminal cysteine thioester intermediate. The product of this S → N acyl shift is a peptide consisting of two fragments linked by a native peptide bond at a cysteine residue (Fig. 78).
![]() | ||
Fig. 78 Synthesis of human interleukin 8 by native chemical ligation elaborated by Kent et al.664 The ligation reaction occurs between an unprotected peptide thioester fragment, IL-8(1–33 His-αCOSR), and a second unprotected peptide possessing an N-terminal cysteine, IL-8(34–72). The first step, thioester exchange, gives different thioester-linked intermediates, among which only the one formed at the N-terminal cysteine can undergo the subsequent irreversible S → N acyl shift, which yields a native amide bond at the linkage site. |
Typically, the reaction performed in PS or PBS buffer (pH 7.0–8.5) at 37 °C is complete in less than an hour and gives high yields.666,667 Solubilising agents such as guanidine hydrochloride or urea do not interfere with the ligation and are usually used to increase the concentration of the peptide segments and thus the reaction rate. It is important to prevent oxidation of the thiolate of the N-terminal cysteine, which gives a disulfide-linked dimer that is unreactive in the ligation. A reductant (e.g. TCEP) or an excess of the thiol corresponding to the thioester leaving group (4–5%, vol/vol) is generally added to keep the Cys residues in reduced form. Moreover, the latter greatly increases the overall rate of NCL by reversing the first transesterification step for in-chain intermediate adducts, which cannot undergo an S → N acyl shift to generate a stable amide bond.
The first step in synthesising a protein by NCL generally consists of defining the fragments to be used in the ligation reactions. Preferably, naturally occurring AA-Cys motifs in the native sequence should be chosen as the ligation sites (AA stands for any amino acid). Val, Ile, Asp, Asn, Glu, Gln and Pro are less favourable choices for AA because of lower ligation rates and possible side reactions,666 although these ligations can be accelerated either by transforming the corresponding thioesters into selenoesters668 or by tuning the reaction pH.669
Higher reaction rates were reported to be achievable with thiols that are good leaving groups, i.e. mildly acidic thiols such as thiophenol, 4-(carboxymethyl)thiophenol (MPAA), or 5-thio-2-nitrobenzoic acid (TNB, the reduced form of Ellman's reagent). The corresponding thioesters are generally generated in situ by thiol–thioester exchange from the relatively unreactive peptide- (αCOSCH2CH2CO)-Leu alkylthioester by adding an excess of the thiol (1–5%, vol/vol).666
Sequential native chemical ligation allows this limit to be extended further by means of peptides protected at the N-terminal cysteine. Three polypeptide fragments, a peptide1-COSRAr (N-terminal fragment), a protected PG-Cys-peptide2-COSRAr (middle fragment), and a Cys-peptide3 (C-terminal fragment), are thus assembled in a one-pot, three-step synthesis. First, the middle fragment and the C-terminal fragment are ligated under the classical reaction conditions of NCL. The protecting group is then removed, uncovering the N-terminal cysteine of the resulting polypeptide (middle plus C-terminal fragment), which undergoes the second NCL with the N-terminal fragment to give the target protein (Fig. 79).
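The fragment-definition step described above lends itself to a simple sequence scan. The following sketch lists every AA-Cys junction in a sequence and flags those whose preceding residue belongs to the less favourable set named in the text. It is a hypothetical helper for illustration, not a published tool, and it ignores other practical criteria such as fragment length and solubility. The 32-mer from Fig. 73 is used purely as a convenient input string.

```python
# Residues the text lists as less favourable on the N-terminal side of Cys.
LESS_FAVOURABLE = set("VIDNEQP")


def find_ligation_sites(sequence: str) -> list[tuple[int, str, bool]]:
    """Return (Cys position, preceding residue, favourable?) for each AA-Cys.

    Positions are 1-based. A Cys at position 1 has no preceding residue
    and is therefore not reported as a junction.
    """
    sites = []
    for index, residue in enumerate(sequence):
        if residue == "C" and index > 0:
            preceding = sequence[index - 1]
            sites.append((index + 1, preceding, preceding not in LESS_FAVOURABLE))
    return sites


if __name__ == "__main__":
    example = "VVSHFNDCPDSHTQFEFHGTCRFLVQEDKPAR"  # illustrative input only
    for position, preceding, favourable in find_ligation_sites(example):
        label = "favourable" if favourable else "less favourable"
        print(f"{preceding}-Cys{position}: {label}")
```

For this input the scan reports an Asp-Cys junction at position 8 (less favourable) and a Thr-Cys junction at position 21 (favourable).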
![]() | ||
Fig. 79 Synthesis of insulin-like growth factor 1 (IGF-1) via sequential native chemical ligation described by Sohma et al.672 The reversible protection of the α-amino group of the central peptide fragment IGF(18–47) prevents its self-reaction with the α-thioester moiety present in the same molecule. The thiazolidine protecting group can be easily removed by brief treatment with NH2OMe·HCl at pH 4. NCL reaction conditions used: PB (pH 6.7), 6 M GndCl, 10 mM MPAA, 20 mM TCEP. |
Since its introduction, sequential native chemical ligation has demonstrated its general applicability to the preparation of various complex assemblies. Consequently, several methodologies compatible with NCL for the protection of N-terminal Cys residues were elaborated. The most relevant among them are listed in Table 1.
| Entry | Protecting group | Deprotection | Examples |
|---|---|---|---|
| 1 | Thz ![]() | NH2OMe, pH 4 | IGF,672 HIV-1 Tat,674 crambin,675,676 EPO,677 PYP678 |
| 2 | Msc679 ![]() | pH 12 | SOD,680 Abl-SH3681 |
| 3a | Acm682 ![]() | AgOAc, DTT | Crambin,683 DAGK684 |
| 4 | Mapoc685 ![]() | hν > 300 nm | hBNP-32685 |

a Requires a preparative-HPLC step before removal; gives a lower overall yield compared with Thz.683
The reader is referred to a recent review by Melnyk and collaborators673 for a more complete overview of the sequential ligation strategies on proteins.
Because the kinetics of NCL with alkylthioesters are significantly slower than those with arylthioesters, it is possible to control the intrinsic dual reactivity of a bifunctional Cys-peptide2-COSRAlk so that it first reacts selectively with a peptide1-COSRAr and then undergoes a classical NCL (i.e. in the presence of an exogenous aryl thiol) with a third Cys-peptide3 to yield an assembled peptide1-Cys-peptide2-Cys-peptide3 with no need for protecting groups. Bang and collaborators applied this methodology to assemble the 46-residue protein crambin from six peptide fragments (Fig. 80).
![]() | ||
Fig. 80 Two final steps of the synthesis of V15A crambin (pdb: 3NIR) described by Bang and associates.675 The mutation was introduced to simplify the prior KCL step of assembling the second peptide. Kinetically controlled ligation of the SAr thioester occurs spontaneously in aqueous media in the absence of exogenous thiophenol, while native chemical ligation of the SAlk thioester must be accelerated by the addition of 1% PhSH. |
Further advancing the pioneering work of Botti et al.686 on in situ acyl migration, the KCL methodology has recently been extended from alkylthioesters to a full class of O-esters that, when exposed to a reducing agent, spontaneously produce a thioester through disulfide bond reduction followed by O → S acyl shift (Fig. 81).687
![]() | ||
Fig. 81 Schematic representation of the O → S acyl shift of O-esters containing a disulfide bond described by Zheng et al.687 |
Through a thorough investigation, Zheng et al.687 established that the structures of the O-esters have an important effect on their reactivity. The authors benchmarked their methodology side by side with the previously described KCL by synthesising the same V15A crambin (Fig. 80) through a one-pot, one-step condensation of peptide segments and found it applicable to this system. Readily available by Fmoc solid-phase synthesis, these O-ester scaffolds can expand the applicability of NCL to substrates whose thioester peptide fragments are hardly accessible.
Despite its recognised drawbacks due to hazardous acid treatment often leading to undesired side-reactions, the protocol of in situ neutralisation for Boc-based solid-phase peptide synthesis represents the most effective approach for the preparation of peptidyl thioesters.666,688–690 Alternatively, the Fmoc synthesis approach was investigated691–693 and found to be favoured when synthesizing phospho- and glycopeptides.
![]() | ||
Fig. 82 Principle of “crypted thioesters” demonstrated by Danishefsky and co-workers.697 Intramolecular O → S migration of acyl residue resulting in generation of highly active thioester (its “uncrypted form”) occurs only upon reductive cleavage of disulfide bond of reasonably stable “crypted form”. |
Synthesis of thioesters in situ from stable amides via N → S acyl transfer was demonstrated by Ohta et al.,699 who studied acylated oxazolidinones derived from S-protected cysteine. These possess a distorted amide planarity, provoking so-called ground-state destabilisation700 and, as a consequence, favouring acyl migration to the deprotected thiolate. Nakahara and collaborators studied and elaborated two classes of secondary amides amenable to the N → S acyl shift at low pH values: 5-mercaptomethyl prolines701 and N-alkyl cysteamides (Fig. 83a).702 Oxazolidinones699 and N-sulfanylethylanilides (SEAlides)703 described by Otaka and collaborators were found to possess a similar aptitude towards N → S acyl transfer at low pH (for a complete overview of N → S acyl-transfer systems described before 2010 see the review by Kang and Macmillan).704 Erlich et al.705 recently applied the N-alkyl cysteamide-based approach to the synthesis of a 76-residue ubiquitin thioester, while Otaka and collaborators demonstrated the high potential of SEAlides by conducting the chemical synthesis of the 162-residue active glycosylated GM2-activator protein.706
![]() | ||
Fig. 83 N → S migration of acyl residue. The thioester-amide equilibrium is shifted towards the thioester form at low pH due to the protonation of the secondary amides. In the presence of 3-mercaptopropionic acid (MPA) all intermediates produce MPA-thioester. (a) From left to right: reduced form of Danishefsky's ester,697 5-mercaptomethyl prolines,701 N-alkyl cysteamides,702 oxazolidinones,699 and N-sulfanylethylanilides.703 (b) SEAon/off approach by Melnyk and collaborators.711 |
Two research groups reported almost simultaneously a general approach based on the application of bis(2-sulfanylethyl)amides (SEA) as precursors for NCL.707–710 An interesting extension of the methodology that allows the reactivity of SEA to be switched, the so-called SEAon/off system, was further elaborated by Melnyk and collaborators.711 The transition between the reactive (SEAon) and unreactive (SEAoff) states is simply triggered by mild oxidation/reduction procedures (Fig. 83b). SEAoff can be easily switched on via TCEP reduction, while the reverse switching off is achieved by mild oxidation with iodine. After a few seconds, the excess of iodine is decomposed by the addition of dithiothreitol (DTT). Other amino acid residues susceptible to oxidation, such as methionine or tryptophan, are not affected because of this very short exposure to the oxidant. However, cysteine residues must be protected with tert-butylsulfenyl groups to remain unaffected. At low pH values these are not reducible by DTT, thus allowing reliable protection of cysteines during the cycles of oxidative–reductive SEA triggering.
To demonstrate the potential of this newly established and optimised SEAon/off system, Melnyk and collaborators synthesised an 85-residue domain of the hepatocyte growth factor (HGF 125–209).711
Fig. 84 Synthesis of 21-residue cyclic antibiotic peptide Microcin J25 via NCL-desulfurisation strategy resulting in alanyl-linked polypeptide chain developed by Yan and Dawson.712 Reaction conditions: cyclisation – Tris-HCl, 6 M GndCl; desulfurisation – NaOAc, pH 4.5, 6 M GndCl, H2 (Pd/Al2O3).
In order to extend the native chemical ligation-desulfurisation approach to these amino acids, an SH group should be attached to the carbon atom situated in the β-position relative to the amino group (in some cases in the γ-position). The new building block containing a β-mercapto-α-amino or γ-mercapto-α-amino fragment is then introduced at the N-terminal position during solid-phase peptide synthesis or by means of DNA-recombinant technologies.714 The subsequent ligation is conducted under classic NCL conditions: at pH 7.5–8.0 in the presence of TCEP as reducing agent and 1% of an exogenous thiol additive. Finally, desulfurisation of the thiolated residue gives the nascent residue of interest (Table 2).
Besides the aforementioned reduction on RANEY® nickel, various milder conditions such as nickel boride,712 Pd/Al2O3,729 or metal-free conditions730,731 were developed to achieve efficient desulfurisation. More recently, an in situ ligation-desulfurisation approach was also reported.732,733
The desulfurisation-based extension of NCL to other amino acid junctions has contributed in many ways to the preparation of proteins and posttranslationally modified analogues for biochemical and structural analyses ever since its introduction. However, despite this broad utility, carrying out desulfurisation of the linkage site in the presence of other Cys residues in the protein sequence usually requires the use of protecting groups.729,734 Not only does this necessity add an undesirable step to protein synthesis, but it also imposes some limitations on the applicability of the approach, mainly due to solubility issues.
The protocol for selective reduction of selenocysteines (Sec, U) in the presence of cysteine, developed by Dawson and collaborators,735 removed the need for protection of thiolates and expanded the already established field of Sec-mediated native chemical ligations (see Section 2.6).736–739
The sensitivity of Sec peptides to reduction was noted in several works on selenocysteine ligations.737,738,740 During their preceding work on the synthesis of seleno-glutaredoxin 3 analogues (Se-Grx3), Dawson and associates741 observed this incompatibility of Sec-containing proteins and peptides with TCEP-assisted native chemical ligation due to the generation of significant levels of a deselenised side product. In the subsequent publication, the authors successfully applied the ligation-deselenisation strategy to a model peptide system. Accordingly, N-terminal Sec-peptide1 (UGLEFRSI-amide), prepared in the form of a diselenide dimer, was ligated to the thioester peptide2 (Ac-LYRAG-SR) (Fig. 85) to produce the deselenised alanyl-peptide (Ac-LYRAGAGLEFRSI-amide) after treatment with a 50-fold excess of TCEP at pH 5.5. Importantly, an excess of 4-mercaptophenylacetic acid (MPAA, 200 mM) was needed for the ligation step. MPAA served both as a catalyst to activate the alkyl thioester and as a mild reducing agent to generate a small pool of free selenol to facilitate the ligation reaction.
Fig. 85 Traceless ligation of peptides using selective deselenisation described by Dawson and collaborators.735
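The sequence bookkeeping of the model ligation-deselenisation experiment can be checked with a minimal sketch. The function name and the use of one-letter strings are illustrative assumptions, terminal groups (Ac-, -SR, -amide) are omitted, and only the peptide sequences themselves are taken from the text; the code tracks residue identities and says nothing about reaction conditions or yields.

```python
def ligate_and_deselenise(acyl_donor: str, sec_peptide: str) -> str:
    """Return the one-letter sequence expected after ligation and deselenisation.

    acyl_donor  -- sequence of the C-terminal thioester fragment (hypothetical helper input)
    sec_peptide -- sequence of the fragment carrying an N-terminal selenocysteine (U)
    """
    if not sec_peptide.startswith("U"):
        raise ValueError("the second fragment must start with selenocysteine (U)")
    # Ligation: the acyl donor ends up N-terminal to the Sec fragment.
    ligated = acyl_donor + sec_peptide
    # Deselenisation: the junction Sec (U) becomes alanine (A).
    junction = len(acyl_donor)
    return ligated[:junction] + "A" + ligated[junction + 1:]


# Model system from the text: Ac-LYRAG-SR + UGLEFRSI-amide
product = ligate_and_deselenise("LYRAG", "UGLEFRSI")
print(product)
```

Run on the two model fragments, the sketch returns the 13-residue alanyl-linked sequence quoted above for the deselenised product.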
This selectivity of reduction with TCEP is hypothesised to be due to the weaker Se–C bond compared with the S–C bond, coupled with the higher propensity of selenols to form radicals.742 It should be mentioned, however, that cysteine can be desulfurised, albeit upon heating, in the absence of a radical initiator when treated with excess phosphine.743
The selenocysteine-mediated NCL-deselenisation procedure has recently been exploited for selective ligation at selenolphenylalanine744 and γ-selenolproline,720 which are easily transformed into phenylalanine and proline, respectively, at the ligation site by treating the Se-containing intermediates with TCEP or DTT. Analogously to N → S acyl transfer (Section 2.3.1.3), an N → Se acyl shift was recently observed by Adams and Macmillan and allowed NCL to take place at lower temperatures and on shorter time scales.745 The corresponding selenoesters are readily accessible by direct solid-phase synthesis.746
The NCL principle of S → N acyl transfer has found application in ligation assisted by the proximity effect (see Section 5),747 allowing conjugation of N-terminal residues other than cysteine by auxiliary-mediated acyl transfer. Furthermore, the Cys side chain thiolate introduced during NCL can also provide a synthetic handle for further functionalisation using cysteine-selective methodologies (see Section 1.2). It was recently demonstrated by Fang and co-workers that simple phenyl esters, which are more accessible than thioesters, can undergo native chemical ligation smoothly when promoted by imidazole.748 Lastly, Kemp's template-mediated thiol ligation,750–758 Tam's ligation by thiol/disulfide exchange,759–761 and other auxiliary-driven extensions of native chemical ligation,762,763 recently reviewed by Monbaliu and Katritzky,749 are of significant importance in the field of protein synthesis and can be considered appropriate for bioconjugation.
The thiazolidine-mediated ligation was first applied by Tam and associates to the preparation of peptide dendrimers765,767 by attaching unprotected peptide dendrons containing Cys residues at their N-termini to a branched core matrix with aldehyde functions. Botti et al.766 have transformed this approach into a general method for the preparation of cyclic peptides. Villain et al. have demonstrated that the thiazolidines obtained (sometimes referred to as pseudo-prolines)768 can be selectively cleaved by adding hydroxylamine derivatives, which react with the aldehyde functions protected in the form of thiazolidines to form oximes. The authors applied this methodology to the covalent capture of proteins possessing N-terminal Cys or Thr residues (Fig. 86).769 Interestingly, under the same conditions N-terminal Ser residues reacted only poorly.
Fig. 86 General scheme of covalent capture purification of N-terminal cysteine containing proteins developed by Villain et al.769
Because of the recent advances in the semisynthesis of proteins and the encoding of 1,2-aminothiols into recombinant proteins,662,770,771 thiazolidine-mediated conjugation is now experiencing a reappraisal of its potential for bioconjugation.772,773 For instance, Casi et al.772 have exploited thiazolidine formation for the preparation of antibody–drug conjugates by site-specific incorporation of a potent drug, containing an aldehyde moiety, to engineered recombinant antibodies displaying a Cys residue at their N-terminus, or a 1,2-aminothiol at their C-terminus.772 Lastly, thiazolidines represent one of the most often used N-terminal cysteine protecting groups for sequential NCL (see Table 1, Section 2.3.1.1).
The regeneration pathway of luciferin in fireflies was found to consist of the condensation of CBT with D-cysteine (Fig. 87).774 The reaction mechanism underlying this addition involves an initial attack on the cyano group of CBT by the cysteine thiolate. This results in the formation of an electrophilic imidothiolate, which is then attacked by the cysteine amino group to form the thiazoline ring with loss of ammonia.
Fig. 87 Synthesis of luciferin by the reaction between cyanobenzothiazole (CBT) and D-cysteine.774
Inspired by these early works, Rao and co-workers777 have further investigated the reaction of cyano-substituted aromatic compounds with amino-thiol substrates. They have demonstrated that the benzothiazole motif plays an important role in activating the addition of cysteine to the nitrile group. For instance, under optimal reaction conditions (PBS, pH 7.0–7.5), its replacement by another aromatic fragment such as picolinonitrile or benzonitrile largely decreases the reaction yield. All naturally occurring amino acids are unreactive towards CBT except for cysteine, which has a higher second-order rate constant than the six other amino-thiol substrates tested (9.19 M−1 s−1, which is significantly higher than those of the majority of biocompatible click reactions).778 Finally, the efficiency and specificity of CBT-based labelling of terminal cysteine residues were demonstrated on proteins in vitro (Fig. 88) as well as on cell surfaces.
Fig. 88 Labelling of Renilla luciferase (pdb: 2PSD) by CBT-based probes.777 N-terminal cysteine was generated by protease processing of the fusion protein.
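To give a feel for what a second-order rate constant of 9.19 M−1 s−1 means in practice, the following sketch estimates labelling half-lives under pseudo-first-order conditions. It assumes that the CBT probe is in large excess over the N-terminal cysteine, so that k_obs = k2 × [probe]; the probe concentrations and the function name are hypothetical illustration values and are not taken from the cited work.

```python
import math


def pseudo_first_order_half_life(k2_per_M_s: float, probe_conc_M: float) -> float:
    """Half-life (s) of the limiting reactant when the probe is in large excess.

    Assumes k_obs = k2 * [probe] and t1/2 = ln(2) / k_obs.
    """
    k_obs = k2_per_M_s * probe_conc_M
    return math.log(2) / k_obs


K2_CBT_CYS = 9.19  # M^-1 s^-1, value quoted in the text for CBT with cysteine

# Hypothetical probe concentrations, chosen only for illustration
for conc_uM in (10, 100, 1000):
    t_half = pseudo_first_order_half_life(K2_CBT_CYS, conc_uM * 1e-6)
    print(f"{conc_uM:>5} uM probe: t1/2 of about {t_half / 60:.1f} min")
```

Under these assumptions the half-life scales inversely with probe concentration, which is why a rate constant of this magnitude allows labelling on a time scale of minutes at sub-millimolar probe concentrations.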
In subsequent publications, Rao and colleagues have extended the applicability of CBT towards biocompatible condensations to create polymer assemblies in vitro and in living cells under the control of either pH, disulfide reduction or enzymatic cleavage.779,780 Yuan et al. have taken advantage of this approach and developed a method for the determination of glutathione (GSH) concentration in vitro and in HepG2 human liver cancer cells.781 Jeon et al.782 have elaborated a CBT-based 18F-probe for radiolabelling of N-terminal cysteine-bearing peptides and proteins. Two labelled substrates, a dimeric RGD-peptide ([18F]CBTRGD2) and Renilla luciferase bearing a cysteine at its N-terminus, have been synthesised with excellent radiochemical yields and have shown good in vivo molecular PET imaging efficiency. Proceeding efficiently under physiological conditions, CBT-mediated N-terminal Cys conjugation represents a useful alternative to existing approaches for protein labelling.783
Fig. 89 Ligation–desulfurisation at tryptophan, reported by Payne and collaborators.786 (a) Electrophilic sulfenylation of Trp in acidic conditions with DNPS-Cl and formation of the corresponding 2SH-Trp derivative. (b) Auxiliary-assisted native ligation of N-terminal 2SH-Trp peptide (2SH-WSPGYS-NH2) with a thioester of the peptide Ac-LYRANG-SPh resulting in 12-mer peptide Ac-LYRANGWSPGYS after the desulfurisation step.
The proposed mechanism of this approach is similar to that of NCL. It was hypothesised that the reaction would proceed via an initial transthioesterification of the peptide thioester with the indole 2-thiol functionality, followed by an S- to N-acyl shift through a 7-membered ring transition state to generate a native amide bond. The last step, desulfurisation of the 2-thiol Trp, gives a ligated product containing only naturally occurring amino acid residues (Fig. 89b). Although this methodology represents a clever chemoselective approach for the ligation of completely unprotected peptide fragments through a Trp moiety, the harshness of the sulfenylation and desulfurisation conditions limits it to peptide substrates and does not allow its application to complex biomolecules.
Ellman's reagent348 was used to activate the C-terminal peptide thiocarboxylic acid by forming an acyldisulfide derivative, which is then attacked by the N-terminal histidine as a nucleophile. Captured by the imidazole of the N-terminal histidine, the resulting Nim-acyl intermediate is hypothesised to undergo an Nim → Nα shift to form histidine at the ligation site (Fig. 90). However, the Nim-acyl intermediate has not been isolated and it is quite possible that regioselectivity is obtained simply because of anchimeric assistance of the proximal imidazole moiety at the ligation site.
Fig. 90 N-terminal histidine labelling described by Zhang and Tam.791 (a) Mechanism of thiocarboxylic acid activation by Ellman's reagent. (b) Labelling of bovine parathyroid hormone fragment (14–34, pdb: 1ZWC) with activated tetrapeptide thioacid.
Interestingly, no sign of coupling was observed when the corresponding non-activated C-terminal thiocarboxylic acid was used in the reaction instead of the acyldisulfide. The reaction pH plays an important role in the effectiveness of the reaction. Only when the pH is maintained at slightly acidic values (pH 5–6) and in the absence of thiol nucleophiles is the imidazolyl moiety of histidine the sole nucleophile present in the polypeptide. This methodology has been applied to generate histidine-containing peptides with yields up to 75%.
Sec ligation can be chemoselective when conducted at slightly acidic pH. The low intrinsic pKa of selenocysteine (5.2)795 and, consequently, its higher degree of dissociation at low pH endow this amino acid with unique biochemical properties, allowing regiospecific covalent conjugation with electrophilic compounds in the presence of the side chains of all other natural amino acids, including the thiol group of Cys (pKa 8.3). For instance, the reaction rate with selenocysteine was found to be 1000-fold higher than with cysteine at pH 5.0.736 Moreover, the lower pH generally suppresses β-elimination of the selenol group from selenocysteine, which would otherwise give unreactive dehydroalanine.392
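The effect of the pKa difference on the degree of dissociation can be illustrated with the Henderson–Hasselbalch relation. The sketch below uses only the pKa values and the pH quoted in the text; the function name is an illustrative assumption, and the calculation treats each side chain as an isolated monoprotic acid, ignoring the protein microenvironment and any difference in the intrinsic nucleophilicity of selenolate and thiolate.

```python
def deprotonated_fraction(pka: float, ph: float) -> float:
    """Fraction of an acid present as its conjugate base (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PKA_SEC = 5.2  # selenol of selenocysteine, value quoted in the text
PKA_CYS = 8.3  # thiol of cysteine, value quoted in the text
PH = 5.0       # pH at which the rate comparison in the text was made

f_sec = deprotonated_fraction(PKA_SEC, PH)
f_cys = deprotonated_fraction(PKA_CYS, PH)

print(f"selenolate fraction at pH {PH}: {f_sec:.3f}")
print(f"thiolate fraction at pH {PH}:   {f_cys:.5f}")
print(f"ratio selenolate/thiolate:      {f_sec / f_cys:.0f}")
```

At pH 5.0 this simple estimate gives a selenolate fraction several hundred times larger than the thiolate fraction, the same order of magnitude as the rate difference reported above. This is only a rough consistency check, not a derivation of the measured rate ratio.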
Initially, considerable efforts were made to show the applicability of selenocysteine NCL to the preparation of selenium-containing derivatives of enzymes and to the benchmarking of their activities. Hilvert and associates synthesised a C38U analogue of bovine pancreatic trypsin inhibitor (BPTI). Remarkably, the wild-type BPTI and its artificial analogue folded into similar conformations and showed similar inhibitory affinity towards trypsin and chymotrypsin.738 Raines et al. selected the 124-residue ribonuclease A (RNase A) as a model protein for their investigations.736 DNA recombinant technology was used to prepare a C-terminal thioester fragment corresponding to residues 1–109, while standard solid-phase peptide synthesis was used to obtain N-terminal Cys and Sec peptides corresponding to residues 110–124 (Fig. 91). Just as in the case of BPTI, the semisynthetic wild-type RNase A and C110U RNase A showed equivalent ribonucleolytic activities. Further advances in the field of Sec-NCL have resulted in the synthesis and investigation of other proteins such as seleno-glutaredoxin,796 azurin,797,798 and thioredoxin reductase.
Fig. 91 Selenocysteine native chemical ligation applied for the preparation of C110U mutant of RNase A (wild-type – pdb: 7RSA).736
The ease of post-ligation transformation of selenocysteine into alanine (by deselenisation), dehydroalanine (by β-elimination) or non-natural amino acids (by addition to dehydroalanine, see Section 1.3.8) spurred the further spread of Sec-mediated methodologies as very effective tools in the rational design of peptides and proteins. Quaderer and Hilvert799 exploited such transformations of selenocysteine to access a series of 16-residue cyclic peptides (Fig. 92).
In this initial report, the deselenisation step was conducted rather harshly (RANEY® Ni, H2), implying that all Cys residues (if there were any) would have been reduced as well. Recently, however, Dawson and collaborators735 have demonstrated that selenocysteine can be chemoselectively deselenised with TCEP in the presence of cysteines. This overcame the main limitation of the NCL-desulfurisation strategy (see Section 2.3.1.4), namely the inability to control the regioselectivity of desulfurisation if several cysteines are present in the peptide or protein, and yielded a pool of NCL-deselenisation strategies for the mild incorporation of alanine, phenylalanine and proline at the ligation site by classic NCL approaches (see Table 2). Finally, selenocysteine peptides were found to undergo reverse NCL efficiently at acidic pH and thus to be of particular interest for the generation of thioesters by sequential N → Se acyl transfer and substitution of the resulting selenoester by an exogenous thiol (see Section 2.3.1.3).745
Because incorporation of selenocysteine by the cellular translational machinery is generally very laborious,800,801 selenopeptides are mainly obtained by SPPS.736,737,794
Fig. 93 Selective modification of 1P-GFP (Pro residue is introduced on N-terminus) with o-aminophenol-PEG reagent (wild-type – pdb: 1GFL).802 (a) In situ oxidation of o-aminophenol derivative. (b) Reaction of the active coupling species with N-terminus of 1P-GFP.
Because of the rapid development of NCL (see Section 2.3.1), which requires C-terminal thioesters as reacting partners for N-terminal Cys proteins, these have become widely accessible, notably by means of SPPS. A promising approach exploiting these advances was proposed by Goody and collaborators,803 who developed a protocol for the selective transformation of C-terminal thioesters into the corresponding hydroxylamines, thus enabling the application of aldehyde- and ketone-selective methodologies at the C-terminus.
On the other hand, the unique position of protein C-termini has stimulated numerous efforts to target this location, which resulted in numerous enzymatic and intein-based approaches for C-terminal-selective protein modification.804–809 These methodologies are, however, not covered by this review devoted to chemical methods of bioconjugation.
Metal-chelation methodologies are perhaps the most elaborated among sequence-selective approaches (Fig. 94).810 The oligo-histidine sequence (usually H6), called His-tag, is known to interact robustly with transition-metal complexes, including the nitrilotriacetic acid (NTA) complex of Ni2+; the sequence is therefore widely used for protein purification by affinity chromatography. Similarly, oligo-aspartates (most often (D4)n, n = 1–3) were developed for selective labelling with multinuclear Zn2+ complexes.811,812 The tetracysteine motif (CCPGCC) was reported to chelate biarsenical dyes such as 4′,5′-bis(1,3,2-dithioarsolan-2-yl)fluorescein (FlAsH) especially effectively,5,813 while a structurally similar tetraserine sequence (SSPGSS) was found to be selective for diborate scaffolds.
Fig. 94 Strategies for the selective conjugation of proteins based on metal-chelation: tetracysteine/biarsenical system, oligohistidine/nickel-complex system, tetraserine–borate system, oligo-aspartate/zinc-complex system.27
Inherent reactivity can rarely be overridden, which largely limits the scope of known methods for bioconjugation of amino acids possessing low nucleophilicity. However, bringing the reaction partners into close proximity can accelerate a reaction that would not otherwise be possible because of the presence of other, more reactive species. Routinely exploited by enzymes, this approach enables selectivity on the basis of molecular shape rather than reactivity or the local environment (Fig. 95).
Developed by Hamachi and collaborators in 2006,462 the post-affinity-labelling approach can be used for the selective tethering of a functional molecule in the proximity of the active site of enzymes. This method was then used on numerous substrates and enlarged the scope of selective conjugation, especially on histidine and tyrosine (see Section 1.5.1).462,463,486,814 Despite these successes, the method can modify only residues situated in the vicinity of the ligand-binding pocket of the target protein, which prevents it from becoming a general approach for protein labelling.
In 2010, exploiting a similar idea, Popp and Ball454 envisioned the combination of two previously described techniques: the coiled-coil based molecular recognition of complementary peptides455,815–817 and the high catalytic activity of dirhodium complexes in carbene C–H insertion, previously reported by Francis and collaborators452 for the selective modification of tryptophan (see Section 1.4.2). Two complementary peptides are thus involved in this methodology: one containing a dirhodium catalytic centre (precomplexed through two glutamate side chains) and another containing the side chain to be modified (Fig. 96a). Because of the enforced proximity of the side chain of interest and the active catalytic centre in the resulting supramolecular assembly, the rhodium-catalysed C–H insertion is greatly accelerated (more than 10³-fold).818
Fig. 96 (a) Modification of c-Fos (shown in red) catalysed by Max(Rh2) (shown in blue) metallopeptide described by Popp and Ball.818 Two possible coiled-coil alignments result in modification of GLU-14 and GLU-21 residues, located in close vicinity to the dirhodium catalytic center. (b) Proposed product bond connectivity.
As a result, the conjugation of amino acid residues with lower reactivities becomes possible. This allowed the scope of the originally tryptophan-selective dirhodium carbene methodology to be expanded first to the other aromatic residues, phenylalanine and tyrosine,454 and then to over half of the naturally occurring amino acid residues (Fig. 96b).818 To date, dirhodium metallopeptides represent the only reported method for selective modification of Gln, Asn, and Phe side chains. The authors have also demonstrated the possibility of applying their methodology to chimeric proteins containing fused coils,456 as well as to full-sized natural proteins possessing coiled coils in their structures.818
Despite its considerable potential, the metallopeptide methodology is not devoid of drawbacks. Because binding to dirhodium is nonselective and thus cannot be performed in the presence of other carboxylate-containing peptides, rhodium-peptide complexes must be synthesised beforehand, which is often challenging, mainly due to their poor solubility.819 Moreover, the method is restricted to proteins containing coiled-coil fragments in their structures, which for the vast majority of targets would mean the need for resource- and time-consuming expression of fused proteins.
Another approach, developed by Silverman and colleagues, exploits the self-assembly of complementary DNA strands to bring two reacting fragments into proximity and has allowed, although only on simple substrates, the selective phosphorylation of tyrosine and serine, which is otherwise not feasible.538
Beyond coiled coils and DNA-based preorganisation, the principles of proximity-driven selectivity should be extended to other helix-binding protein domains and to biological molecular recognition generally. A significant broadening of the applicability of this elegant approach to protein modification, biochemistry and biomaterials engineering is anticipated in the near future.
Many, if not most, of these methods, however, possess drawbacks that limit their general applicability. This has become a particular concern with the rise of novel, demanding applications of bioconjugation, namely the preparation of new therapeutic conjugates, vaccines, and biomaterials. In addition, tremendous progress in the sensitivity of analytical methodologies, together with the need to work with ever smaller amounts of sample, often unstandardised patient samples, has highlighted the need for more efficient, selective and reliable bioconjugation methods.
Moreover, some parameters of the mode of conjugation, previously completely neglected, were recently revealed to be of paramount importance. For instance, the stability of the generated linkage and the distribution of products generated upon conjugation can be determining for the overall efficiency of the conjugate.
Overall, we believe that intensive ongoing research in the field of bioconjugation will result in more efficient and selective methodologies allowing specific conjugation of native proteins in complex biological media, and ultimately in living organisms.
This journal is © The Royal Society of Chemistry 2015