Developments and recent advancements in the field of endogenous amino acid selective bond forming reactions for bioconjugation.

Bioconjugation methodologies have proven to play a central enabling role in the recent development of biotherapeutics and chemical biology approaches. Recent endeavours in these fields shed light on unprecedented chemical challenges to attain the bioselectivity, biocompatibility, and biostability required by modern applications. In this review, current developments in techniques for selective bond-forming reactions of proteins and peptides are highlighted. The utility of each endogenous amino acid-selective conjugation methodology in the fields of biology and protein science is surveyed, with emphasis on the most relevant among the reported transformations; selectivity and practical use are discussed.


Introduction
Bioconjugation is a set of techniques allowing the site-specific creation of a covalent link between a biomolecule and an exogenous moiety that endows it with desirable properties. The resulting hybrid, which combines the properties of its individual components, can serve, for instance, as a more stable and efficient therapeutic, [1][2][3][4] an assembly for studying proteins in the biological context, [5][6][7][8][9][10][11] a new protein-based material, 12-17 a microarray, 18,19 a biological material, 20-24 a tool for immobilisation, 25 or a means of elucidating the structure 26,27 of proteins.
A large number of reactions exist to modify proteins. 28 However, site-specific conjugation continues to attract considerable research efforts to develop new methodologies that match continuously increasing requirements of modern applications in terms of selectivity, stability, mildness, and preserving biomolecule integrity. For the purpose of this overview, the focus will remain on recent developments in bond-forming approaches in bioconjugation of native amino acid residues. Among about 20 different amino acids involved in protein composition, only a smaller number comprises appropriate targets for practical bioconjugation methods. In fact, only one-third of all amino acid residues represent chemical targets for the vast majority of bond-forming approaches.
The bioconjugation methodology of choice is selected according to the intrinsic reactivity of the targeted amino acid residue (acidity/basicity, electrophilicity/nucleophilicity, oxido-reductive characteristics) and its specific special environment (in-chain, N-terminal, C-terminal, location in a specific sequence, accessibility, etc.). In this review we will thus present the known bioconjugation strategies with regard to these parameters, ranked in descending order of the frequency with which they are reported in the literature.

Lysine
The use of chemical groups that react with primary amines is one of the oldest and most versatile techniques for protein conjugation. Virtually all proteins contain primary amino groups in their structure. They can be divided into two groups: the α-amino group situated at the N-terminus of most polypeptide chains and the ε-amino groups of lysine residues (Lys, K). Because these amino groups possess pKa values of about 8 and 10 (for α- and ε-amines respectively), in a vast majority of cases they are protonated at physiological pH and, therefore, occur predominantly on the solvent-exposed outside surfaces of protein tertiary structures. As a result, they become easily accessible to conjugation reagents introduced into the aqueous media.
Deprotonated primary amines are the most nucleophilic among the available functional groups present in a typical protein. However, protonation drastically decreases their reactivity. As a consequence, despite the generally higher intrinsic nucleophilicity of Lys ε-amino groups, they require higher pH values to be uncovered by deprotonation, which allows α- and ε-amino groups to be distinguished by adjusting the pH. That is to say, at higher pH, when both types of primary amines are deprotonated, Lys side chain amino groups are generally more reactive towards electrophiles, while at lower pH it is the opposite because of their prior protonation (Fig. 1). At acidic pH all amines are protonated and possess no significant nucleophilicity compared to other side chains present in proteins. In particular, free (non-disulfide-bonded) Cys residues are much stronger nucleophiles and, if accessible, will readily be modified by most amine-reactive reagents.
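The pH dependence described above can be illustrated with the Henderson-Hasselbalch relation. The following is a minimal sketch, assuming the approximate pKa values quoted in this section (about 8 for α-amines and about 10 for ε-amines) and treating each amine as an isolated, ideal acid. The function and constant names are illustrative only, and real residues can deviate substantially because of their microenvironment.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of an amine present as the free base (R-NH2) at a given pH.

    Henderson-Hasselbalch: [R-NH2] / ([R-NH2] + [R-NH3+]) = 1 / (1 + 10**(pKa - pH))
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Approximate pKa values quoted in the text (assumed typical, not measured values)
PKA_ALPHA = 8.0     # N-terminal alpha-amino group
PKA_EPSILON = 10.0  # lysine epsilon-amino group

if __name__ == "__main__":
    for ph in (5.0, 7.4, 9.0):
        alpha = fraction_deprotonated(ph, PKA_ALPHA)
        eps = fraction_deprotonated(ph, PKA_EPSILON)
        # Report the free-base fraction of each amine type at this pH
        print(f"pH {ph:4.1f}: alpha {alpha:.4f}, epsilon {eps:.4f}")
```

Under these assumptions, the free-base fraction of the α-amine near neutral pH exceeds that of the ε-amine by close to two orders of magnitude, which is consistent with the preference for N-terminal modification at near neutral pH noted in the text.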
It should also be mentioned that, like any other parameters, nucleophilicity and basicity, as well as solvent exposure and accessibility of a particular amino group, are influenced by the microenvironment and can vary substantially depending on the substrate. For instance, Westheimer and Schmidt have found the actual pKa of the amino group situated in the active site of acetoacetate decarboxylase to be 5.9, which is 4 pKa units less than that of an "ordinary" ε-amino group of lysine. 29
Depending on reaction conditions, selective modification of either N-termini (see Section 2) or Lys ε-amino groups can be achieved by using various chemical reagents. They generally belong to one of the following classes (in order of relevance): activated esters (fluorophenyl esters, NHS (N-hydroxysuccinimides), 30 sulfo-NHS, acyl azides), isothiocyanates, isocyanates, 31 aldehydes, anhydrides, sulfonyl chlorides, carbonates, fluorobenzenes, epoxides and imidoesters. Among this vast variety of reactive functions, NHS esters (and their more soluble sulfo-NHS analogues) and imidoesters represent the most popular amine-specific functional groups that are incorporated into commercially available reagents for protein conjugation and labelling. 28
Despite their name, amine-reactive reagents are not always entirely selective for amines. Firstly, as already mentioned, they will react with any other stronger nucleophile, if the latter is present and accessible on a protein surface. This particularly concerns cysteine, tyrosine, serine and threonine side chains. Secondly, depletion of these highly activated reagents by hydrolysis is inevitable in aqueous solution. The rate of both side-reactions depends on the particular substrate, the conjugation partner, pH, temperature, and buffer composition.
Evidently, buffers that contain free amines, such as TRIS (tris(hydroxymethyl)aminomethane), must be avoided when using any amine-reactive probes, since the rate of the reaction with buffer would greatly exceed that with protein amino groups.
1.1.1 Isocyanates and isothiocyanates. Amines react readily with isocyanates to form stable ureas. However, because of their susceptibility to deterioration during storage, 32 isocyanates are much more difficult to handle and are thus not as readily available commercially as the corresponding isothiocyanates. They can, however, be easily prepared prior to use from more stable acyl azides by Curtius rearrangement.
Using this approach, for instance, Palumbo and colleagues have elaborated the synthesis of a heterobifunctional amine-thiol crosslinker containing an isocyanate group on one end and a thiol-reactive maleimide group on the other end (Fig. 2). 33 Several early studies were devoted to the elaboration of isocyanate conjugation methodology, 34,35 but it proved to be especially laborious and complicated, mainly due to the high reactivity and low stability of isocyanates. Isocyanates are therefore of limited interest today, having been completely displaced by isothiocyanate-mediated approaches. Both isothiocyanates and isocyanates can be obtained from the corresponding aromatic amines upon reaction with thiophosgene and phosgene respectively (Fig. 3). 36
Isothiocyanate-based selective amino group modification was first reported in 1937 by Todrick and Walker, 37 who found that the reaction of allyl isothiocyanate with cysteine in alkaline medium results selectively in the thiourea, the product of amine addition to the isothiocyanate. In 1950, exploiting the selectivity of amino-terminal labelling of peptides with phenylisothiocyanate, Edman developed a method for peptide sequencing that fundamentally changed protein science and is known today as Edman degradation. 38 Only 30 years later, Podhradský et al. examined the reaction of isothiocyanates on complex substrates and demonstrated that the addition of the thiol and phenolate functions of cysteine and tyrosine residues is always prevalent, and that only at pH > 5 do amino groups start to manifest themselves in the reaction. 39 While thiol and alcohol additions are reversible and give dithiocarbamates and O-thiocarbamates respectively, amines add irreversibly, thus shifting the reaction equilibrium towards thioureas (Fig. 4).
One should, however, keep in mind that, despite the reversibility of the addition of thiols and alcohols to isocyanates, these nucleophiles can accelerate hydrolysis of the reagent to unreactive amines or ureas and therefore significantly decrease the yield of the conjugation. Moderately reactive but quite stable in water and most solvents, isothiocyanates thus represent an appropriate alternative to the unstable isocyanates. As a consequence, they are much more popular in bioconjugation.
Fig. 1 Deprotonation of different types of amino groups present in protein structure (more nucleophilic amine is encircled in red). Lysine ε-amino groups are more nucleophilic, but also more difficult to deprotonate. Generally, a pH of 8.5-9.5 is optimal for modifying lysine residues, while near neutral pH favours selective modification of N-termini.
Fig. 2 Synthesis of the p-maleimidophenyl isocyanate crosslinker via Curtius rearrangement proposed by Palumbo and associates. 33
Fig. 3 Synthesis of isothiocyanates and isocyanates from the corresponding aromatic amines. 36
Ever since the introduction of fluorescent isothiocyanate dyes as more stable analogues of the corresponding isocyanates for fluorescent labelling of antibodies by Riggs et al. 40 in 1958, they have found widespread use in research laboratories and proved to be an effective means for tagging proteins at specific sites. 41 Fluorescein isothiocyanate (FITC) is arguably one of the most commonly used fluorescent derivatisation reagents for proteins. For instance, it was reported by Tuls et al. 42 that cytochrome P-450 can be selectively labelled by FITC with 75% yield of a single-labelled LYS-338 conjugate in TRIS (a particularly inappropriate buffer for amine-reactive reagents, though) at pH 8.0 and 0 °C. Burtnick 43 has described selective labelling of one out of 34 lysine residues of actin in borate buffer with a 35-fold excess of the reagent at pH 8.5. The origin of such a high level of selectivity towards the LYS-61 residue over the other 33 lysine residues present in the protein (Fig. 5, shown in red) remains unclear, but it is hypothesised to be due to an anomalously low pKa value thereof. Subsequent reports by Miki and collaborators 44,45 further confirmed the selectivity of this labelling, yet without any explanation of such specificity. Bellelli et al. 46 were able to covalently label ricin (pH 8.1, 6 °C for 4 h). In fact, the targets of isothiocyanate-mediated protein labelling elaborated over the last 60 years are too numerous to enumerate. It has proven effective in diverse applications such as tagging of antibodies (usually in carbonate-bicarbonate buffer, pH 9), [47][48][49][50][51][52][53] bleaching-based measurement of membrane protein diffusion of FITC-labelled cells (pH 9.5, 24 °C), 54 surface topography of the Escherichia coli ribosomal subunit, 55 α-actinin distribution in living and fixed fibroblasts, 56 characterisation of a proton pump on lysosomes, 57 and hematopoietic stem cells. 
58 The most striking examples include 125I labelling by means of isothiocyanates, elaborated by Shapiro and colleagues 59 for regional differentiation of the sperm surface (TRIS, pH 7.7, 12 °C for 30 min), and the application of a similar methodology by Schirrmacher et al. 60 for 18F radioactive labelling of RSA, apotransferrin and bovine IgG (pH 9.0, room temperature for 10-20 min). Conjugation of antibodies with chelating agents for further radiometal labelling of antibodies has been described by several groups [61][62][63] and is based on the use of phenylisothiocyanate-containing probes. Brechbiel et al. 64 went even further by combining the chelating functionality with the biotin fragment in a scaffold of trifunctional conjugation reagents. The preparation of silica nanoparticles coated with isothiocyanate groups and their use in apoptosis detection has recently been elaborated. 65
The classical protocol of isothiocyanate labelling involves the use of 5-10 equivalents at a slightly basic pH in the range of 9.0-9.5. 66,67 The resulting thioureas are reasonably stable in aqueous medium and provide a suitable degree of conjugation. 68 For example, Sandmaier and colleagues 69 have recently demonstrated that radiolabelling of the anticanine CD45 antibody using isocyanate and isothiocyanate provides a more specific delivery to the targeted CD45-expressing cells than a method exploiting thiol-maleimide conjugation (see Section 1.3.2). However, it has been shown by Banks and Paquette 70 that, compared to NHS ester-based methodology, antibody conjugates prepared with isothiocyanates are less hydrolytically stable and deteriorate over time. Moreover, the reaction of NHS esters for amine labelling was found to be faster, to give more stable conjugates for both model amino acids and proteins, and to proceed readily at lower pH, compared to isothiocyanates. Consequently, NHS esters are preferable to isothiocyanates in many respects for synthesizing bioconjugates.
Fig. 4 Reaction of isothiocyanates with nucleophilic amino acid residues present in proteins. Only the reaction of lysine and N-terminal residues (considerable at pH > 5) results irreversibly in obtaining the thiourea. Although the reactions of thiol and alcohol groups with isothiocyanates are reversible, they can largely accelerate the rate of isothiocyanate hydrolysis.
Fig. 5 [...] Burtnick. 43
1.1.2 Activated esters. Because of the poor leaving ability of alkoxy groups, alkyl esters of carboxylic acids are inert to amines in aqueous media. 71 However, replacing the alkoxy group with a good leaving group activates the carbonyl and renders it susceptible to nucleophilic attack. It is worth remarking that such activation not only increases the reactivity of these reagents towards free amino groups, but also often increases their tendency to degrade in the presence of water. 72 Although many activating moieties have appeared over the years, only a limited number of them are of significant importance in bioconjugation today. For instance, activated phenyl esters, formerly of significant importance, especially in the field of peptide synthesis, are almost of no use today in bioconjugation because of their slower kinetics and lower stability compared to succinimidyl esters. [73][74][75] However, they continue to reappear in certain studies. 23
N-Hydroxysuccinimide (NHS) activated esters were introduced in 1963 by Anderson et al. as a better alternative to phenyl esters in forming the peptide bond. 30,76,77 Possessing high selectivity towards aliphatic amines, NHS esters are today considered among the most powerful protein-modification reagents. 
Although several studies drew attention to a certain reactivity of NHS-activated esters with tyrosine, [78][79][80][81][82][83] histidine, 84 serine and threonine (especially when situated in certain locations, see Section 1.2), 85-90 these side reactions are much slower than the reaction with free amines and do not generally hinder the amine-selective derivatisation. High concentrations of nucleophilic thiols should, however, be avoided because, as with isothiocyanates, they may increase the rate of probe degradation by forming more easily hydrolysable intermediates (Fig. 6).
The optimum pH for NHS-mediated labelling in aqueous systems was found to be lower than for other amine-selective reagents, ranging from 7 to 8 (compared with 9-9.5 for isothiocyanates), which makes it better suited to the modification of alkaline-sensitive proteins. Several detailed studies of the kinetics, 72 the stoichiometry, 91 and the selectivity (in-chain versus N-terminal modification) 92 of NHS-mediated protein tagging have recently been reported.
Depending on the pH of the reaction solution and the temperature, NHS esters are hydrolysed by water (with a half-life of 4-5 hours at pH 7, 1 hour at pH 8 and 10 minutes at pH 8.6), 84,93 but are stable to storage if kept well desiccated. Virtually any molecule containing an acid functionality, or a moiety which can give an acid, can be transformed into its N-hydroxysuccinimide ester. While activation with NHS generally decreases the water solubility of the carboxylate molecule, the use of sulfo-NHS 94 preserves or even increases the water solubility of the modified molecule by virtue of the charged sulfonate group. The development of new reagents based on NHS chemistry can sometimes be challenging, 95 but the derivatives are frequently of great practical use. [96][97][98][99][100] Many NHS derivatives for the preparation of affinity reagents, fluorescent probes and cross-coupling reagents are now commercially available, making them widely accessible to researchers.
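The practical consequence of the hydrolysis half-lives quoted above can be estimated with a short calculation. This is a rough sketch only: it assumes simple first-order decay of the ester, takes 4.5 hours as the midpoint of the "4-5 hours" figure at pH 7, and ignores temperature, buffer composition and the competing reaction with amines. The names used are illustrative.

```python
def fraction_remaining(minutes: float, half_life_min: float) -> float:
    """Fraction of active NHS ester left after `minutes`, assuming first-order hydrolysis."""
    return 0.5 ** (minutes / half_life_min)


# Half-lives (in minutes) quoted in the text, keyed by pH.
# 4.5 h at pH 7 is an assumed midpoint of the reported 4-5 h range.
HALF_LIFE_MIN = {7.0: 4.5 * 60, 8.0: 60.0, 8.6: 10.0}

if __name__ == "__main__":
    for ph, t_half in HALF_LIFE_MIN.items():
        left = fraction_remaining(30.0, t_half)
        print(f"pH {ph}: {left:.1%} of the ester remains after 30 min")
```

Under these assumptions, a 30-minute incubation at pH 8.6 spans three half-lives and leaves about one-eighth of the reagent active, whereas most of it survives the same period at pH 7. This illustrates why NHS ester stock solutions are typically added to the protein immediately before the reaction.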
The resulting conjugates are linked by a very stable aliphatic amide bond with half-lives in the range of 7 years in water. 101 This excellent stability and the biocompatibility of the resulting bonds have given NHS esters exceptional importance in the field of bioconjugation.
NHS ester-mediated covalent conjugation for protein modification was first accomplished by Becker et al., who first studied biotin transport in yeast 74 and then applied this technique to the covalent attachment of biotin to bacteriophage T4. 102 Since then, NHS-mediated conjugation of proteins has steadily expanded into countless applications.
Cross-linking of proteins often implies using NHS-containing homobifunctional or heterobifunctional cross-linking reagents. These have been used for the elucidation of protein-protein [103][104][105][106][107] and protein-drug interactions, 108 protein structural and subunit analysis, 26,109 the creation of protein complex models, 110 and the preparation of protein conjugates with enzymes, drugs or other macromolecules. [111][112][113] Homobifunctional NHS cross-linkers are generally used in reaction procedures to randomly "fix" or polymerize peptides or proteins through their amino groups. Adding such crosslinkers to a cell lysate will result in the random conjugation of interacting proteins, protein subunits, and any other polypeptides whose Lys side chains happen to be in close proximity to each other. This represents a methodology for capturing a "snapshot" of all protein interactions at a certain instant of time. Using this approach, for instance, Sinz and collaborators were able to elucidate binding of calmodulin to melittin, a polypeptide and principal component of honeybee venom, without chromatographic separation techniques. 114 Cross-linking of the proteins with a number of NHS-homobifunctional cross-linkers of different lengths, followed by digestion of the obtained products with trypsin and analysis by HPLC, enabled three-dimensional structure modelling of the calmodulin-melittin complex (Fig. 7).
Several applications, however, require a precision of crosslinking that cannot be provided by homobifunctional crosslinkers. For example, the preparation of an antibody-drug conjugate (ADC) implies selective linking of a cytotoxic payload to each molecule of the antibody without causing any antibody-to-antibody linkages to form. For such applications, a combination of different selective approaches in one linker is needed.
Fig. 6 Preparation of activated NHS-esters and their reaction with nucleophilic amino acid residues present in proteins. Similarly to isothiocyanates, the reversible reactions of thiol and alcohol groups with NHS-esters largely increase the rate of hydrolysis thereof.
Fig. 7 Mode of binding of melittin in the calmodulin-melittin complex (pdb: 2MLT and 1CDL) calculated from ambiguous distance restraints derived from the cross-linking data by Sinz and associates. 114
Therefore, heterobifunctional crosslinkers are designed to possess different reactive groups at either end. These reagents allow for sequential conjugations that diminish undesirable self-conjugation and polymerisation. Sequential procedures involve two-step processes, where heterobifunctional reagents (often in excess to ensure high conversion levels) are reacted with one protein using the most labile group of the crosslinker first. After eliminating the excess of the nonreacted crosslinker, the second protein is added to a solution containing the modified first protein, and another reaction occurs with the second reactive group of the crosslinker. According to the Pierce website (Rockford, IL, USA), the most popular heterobifunctional crosslinkers are those having amine-reactive NHS esters at one end and thiol-reactive maleimides (see Section 1.3.2) at the other end. Because it is less stable in aqueous solution than the maleimide, the NHS-ester group should usually be reacted first. Takeda and co-workers 115 used a bifunctional reagent that contained an NHS function and a benzylthioester function to prepare a DNA-protein hybrid.
One of the fastest growing fields requiring heterobifunctional crosslinkers today is targeted drug delivery by means of ADCs. 22,[116][117][118][119][120] They are constituted of three main components: one monoclonal antibody (mAb), targeting specific signs or markers of cancer cells, one cytotoxic agent, and one linker molecule that allows covalent drug binding to the mAb. The composition of trastuzumab emtansine (Kadcyla®, Genentech), an ADC in clinical use for the treatment of HER2-positive metastatic breast cancer, is depicted in Fig. 8.
The first example of a "cleavable" NHS cross-linking reagent, DSP, was reported by Lomant and Fairbanks 93 and allowed the previously conjugated fragments to be separated again under the mild conditions of disulfide bond reduction. Further advances in the field have resulted in various types of linkers, cleavable under mild nucleophilic conditions (EGS), 121 at basic pH (BSOCOES), 122 in the presence of periodate (DST), 123 or enzymatically. 124 These have found application in both basic and applied research. The reader is directed to a recent review by Leriche et al. 125 that provides an overview of chemical functions that can be used as cleavable agents and to a publication by Jin Lee 126 for an overview of commercially available cross-linking reagents.
Other combinations of functionalities have been studied over the last 20 years and resulted in elaboration of heterotrifunctional 127 linkers usually combining two bioselective reactive groups and a functionality for anchoring the obtained conjugate (e.g. the biotin moiety).
Many chemical probes widely used in bioconjugation contain the NHS fragment in their structure and are designed to react with free amino groups of proteins. For example, biotinylation [128][129][130] as well as PEGylation 2 of proteins are most commonly achieved using NHS-activated probes today. It was recently reported by Anderson and collaborators that biotinylation of antibodies with NHS-biotin and their subsequent adsorption on the surface of nanocrystal quantum dots (QD) results in highly efficient QD-antibody conjugates for the detection of protein toxins. 131 Other types of protein immobilisation on matrices have also been reported. 84,132,133 The Bolton-Hunter reagent (SHPP), 134 allowing the conjugation of tyrosine-like residues for increasing the yield of subsequent (radio)iodination, is also based on N-hydroxysuccinimide chemistry. [135][136][137][138][139][140] The structurally similar photoactivatable heterobifunctional SHPP probes for cross-linking experiments, elaborated in 1982 by Ji et al., 141 have been used in more than 100 studies since. The NHS ester-based strategy for isobaric, stable isotope labelling of peptides [142][143][144] has recently found more widespread application in proteomic studies, with simultaneous developments in enhancing peptide detection by electrospray ionisation mass spectrometry. 78,145 This list could be continued; arguably, NHS-mediated techniques can be found in all major fields of protein conjugation and represent a gold standard in bioconjugation.
1.1.3 Reductive amination of aldehydes. Aliphatic and aromatic amines react under mild aqueous conditions with aldehyde groups to form an imine (known as a Schiff base). This intermediate can then be selectively reduced by a mild reducing agent, such as sodium cyanoborohydride, 146 to give a stable alkylamine bond. Although this approach for amine modification is not used in protein conjugations as frequently as the activated ester or isothiocyanate method, it is to be considered as preferable when the molecule to be attached has an aldehyde group (or can be easily converted to an aldehyde) because of its simplicity and mild reaction conditions.
Historically, the conjugation of oligosaccharides to proteins was the first target for this approach. In 1974, relying on the exceptional ability of the cyanoborohydride anion, described three years earlier by Borch, 146 to selectively reduce Schiff bases generated in situ from an amine and an aldehyde, Gray illustrated the possibility of a mild synthesis of carbohydrate-coated bovine serum albumin (BSA) and P150 protein (Fig. 9). 147 However, because of the slow kinetics of conjugation, only 4 out of 59 BSA lysine residues (presumably those possessing the lowest pKa values) were derivatised after 300 hours of reaction.
Reductive amination of proteins proceeds most readily at pH 6.5-8.5 where the reduction of aldehydes and ketones is negligible, and, if feasible, in an alcoholic solution under dehydrating conditions where the rate-limiting formation of the imine is favoured. According to Allred and colleagues, 148 the addition of sodium sulfate (500 mM) may largely improve the coupling efficiency in aqueous media.
To date, reductive amination has played a central role in the synthesis of carbohydrate-protein conjugates, 20,149-151 which have been used for years to study the molecular recognition of carbohydrates. 152 Among these conjugates, polysaccharide-protein conjugate vaccines such as Menactra, HIBTiter, and Prevnar are FDA approved and used routinely for the prevention of invasive bacterial infections (Fig. 10), 20,153 and potential anti-infective and anti-cancer agents are currently in clinical trials. 23,151,154,155 The reader is directed to a recent comprehensive review of Adamo et al. 24 covering the current status and future perspectives of carbohydrate-protein conjugates.
Other reported applications of reductive amination include the preparation of an organic trialdehyde to be used as a template for the synthesis of three-helix bundle proteins, 156 protein PEGylation 157,158 and immobilisation. 159
Reductive amination, however, possesses several drawbacks preventing it from being generally applicable to protein conjugation. 160 The most important is the necessity to use water-sensitive sodium cyanoborohydride, which has the potential for reducing disulfide bonds within proteins. As an alternative, McFarland and Francis 161 have reported a water-stable iridium catalyst (Fig. 11). However, the efficiency of the method is lower than that of the classical reduction with cyanoborohydride.
1.1.4 Sulfonyl halides and sulfonates. Introduced in 1952 by Weber 162 for fluorescent labelling of macromolecules, dansyl chloride (DNSC) was the first widely used sulfonyl chloride for the modification of proteins. It gained incontestable popularity for the study of proteins after Hartley and Massey successfully used it for the determination of the active centre of chymotrypsin. 163 DNSC-Edman degradation was proposed by Gray 164 to improve the ease and reproducibility of the classical isothiocyanate-based Edman degradation. 38
Sulfonyl halides are highly reactive but also very unstable, especially in aqueous media at the pH required for reaction with aliphatic amines. For example, Haugland and collaborators 165 have demonstrated that the rate of hydrolysis of Texas Red (one of the most widely used long-wavelength fluorescent probes) 166 and Lissamine rhodamine B sulfonyl chloride was much higher (complete hydrolysis within 5 minutes in pH 8.3 aqueous solution) than that of the corresponding NHS esters (both retained most of their reactivity for more than an hour under the same conditions). Yet, the formed sulfonamide bonds are extremely stable and even survive amino acid hydrolysis, 164,167 which makes sulfonamide conjugates useful for applications where the stability of the conjugation bond is a crucial feature.
Optimal conditions of protein modification by sulfonyl chlorides are those under which free amino groups most effectively compete with water for a limited amount of the reagent. It is thus best done at low temperature at pH 8.5-9.5. 168 At lower pH values, the unreactive protonated form of amines slows the labelling reaction compared to the hydrolysis by water; above this range, the reagent is hydrolysed too rapidly. 169,170 In practical experiments, a several-fold excess of the reagent is usually added, provided that the unused probe is hydrolysed to the corresponding unreactive sulfonic acid after labelling. It must be borne in mind that, unlike other amine-selective reagents, sulfonyl chlorides are unstable in dimethylsulfoxide, classically used for the preparation of stock solutions, and should never be used in this solvent (Fig. 12). 171
Apart from being reported for fluorescent labelling of proteins, 172 sulfonyl chlorides were used to incorporate a chelate moiety into proteins, 173 to study hydrodynamic properties or introduce long-lived fluorescence labels into macromolecules using tagging with pyrene derivatives, 174,175 or as cross-linking reagents. 176 Because of their very high reactivity towards nucleophiles, sulfonyl halides also form conjugates with tyrosine, cysteine, serine, threonine, and imidazole residues of proteins; 177 therefore, they are less selective than either NHS esters or isothiocyanates. These conjugates are, however, unstable and can be completely hydrolysed under basic conditions.
Covalent immobilisation of proteins on hydroxyl group-containing carrier supports (such as agarose, cellulose, diol-silica, or polylactic acid films) is often accomplished by transforming the latter into the corresponding sulfonates: tosylate, mesylate, or tresylate, 178,179 serving as good leaving groups (Fig. 13). [180][181][182] Albumin, cytokines and other therapeutic proteins and peptides were reported to undergo mild PEGylation by means of PEG tresylates. [183][184][185] Although rather specific to amino groups, the chemistry of tresylate-mediated conjugation is neither unique nor well defined. For instance, Gais et al. have shown that PEG-tresylate conjugation can produce a product that contains a degradable sulfamate linkage, resulting in heterodispersity of the reaction. 186
1.1.5 Fluorobenzenes. Despite their utmost importance for protein modification and amino group quantification since Sanger and Tuppy's work on the structure of insulin, 187 derivatives of fluoronitrobenzene are of limited usefulness for bioconjugation.
Compared to other aryl halides, fluoro-substituted nitrobenzenes were found to be the most reactive in bimolecular nucleophilic substitution reactions. 188 They are usually regarded as amine-selective reagents, despite their known reactivity towards thiolates, phenolates and imidazoles, as the products obtained in these reactions are either unstable at the alkaline pH required for the reaction (Tyr and His) or can be thiolysed by excess β-mercaptoethanol (Cys). 189 4-Fluoro-7-nitro-2,1,3-benzoxadiazole (NBD-F), which was introduced as a fluorogenic reagent more than 30 years ago by Imai and Watanabe, 190 still remains important for several applications, mainly pre-column derivatisation and enrichment of peptides. The reader is referred to a recent review by Elbashir et al. 191 providing an excellent overview of the applicability of NBD-F to the analysis of peptides and to a complete overview of NBD-mediated methodologies for the fluorescent labelling of amino acid residues by Imai and associates. 192,193
An elegant approach for improving protein crystallizability, which still remains a major challenge in protein structure research, 194 was elaborated by Sutton and collaborators 195 and consists in the introduction of a charged ammonium residue. It exploits the amine-selective derivatisation of the protein by 1-fluoro-2-nitro-4-trimethylammoniobenzene iodide (Fig. 14) and results in increasing the hydrophilicity thereof. Using their approach, the authors were able to study the binding site 196 and to obtain crystalline derivatives of modified bovine insulin, 197 which is especially hard to crystallize without inducing structural changes. 198 A similar protocol was used by Ladd et al. 199 for chromophoric PEGylation of proteins with polyethylene glycol fluoronitrobenzene derivatives.
1.1.6 Imidoesters. First investigated by Hunter and Ludwig in 1962, 200 the reaction of imidoesters with peptides reached its climax for protein modification ten years later, when Traut et al. 201 introduced the 2-iminothiolane reagent (today carrying Traut's name) for cross-linking. It allowed for producing disulfide-linked dimers of neighbouring proteins on the intact 30S ribosome from E. coli using a two-step procedure: the reaction of ribosomal amino groups with the imidoester function followed by the mild oxidation of the obtained thiolate-charged ribosomes (Fig. 15). An excellent review of the cross-linking studies for the determination of ribosomal structure was published by Nomura. 202 Imidoesters react with primary amines to form amidine bonds. A high specificity towards amines can be achieved when alkaline conditions (pH 10) and amine-free media, such as borate buffer, are used. 202 This places imidoesters among the most specific agents for amine labelling. Because the resulting amidine bonds are protonated at physiological pH, positive charges near modified sites are preserved during the conjugation with lysines and N-termini. Consequently, as was first demonstrated by Wofsy et al., 203 such modifications produce little or no significant change in the conformational properties and biological activities of proteins.
Fig. 14 Derivatisation of bovine insulin with 1-fluoro-2-nitro-4-trimethylammoniobenzene iodide described by Sutton et al. 195 Only amine-containing residues of A1 and B1 chains are shown. Two of the four tyrosine residues present in chains A1 and B1 (not shown) also react with the probe under described conditions.
Thiolates obtained after the ring opening of Traut's reagent by free amines enable a rich thiol-selective chemistry on modified α- or ε-amino groups of proteins (see Section 1.3). Although many imidoesters other than Traut's reagent are commercially available today (for example, DMA, DMP, or DTBP), the number of reported imidoester labelling probes is rather small.
Schramm et al. 204 have described the synthesis of fluorescent imidoester dyes from the corresponding nitriles; the approach was later used by Bozler et al. 205 for the preparation of a dansyl-containing imidoester and the selective modification of lysine residues in the active site of glucose dehydrogenase. Imidoester conjugation has also provided new, readily available reagents for the attachment of sugars to proteins via imidoester linkage, 206 hydrophilic spin probes for determining membrane protein interaction using EPR, 207 immunoreactive probes, 208 tyrosine-like probes for radioactive labelling with 125I, 209 and protein PEGylation reagents, 210 and has been used for the immobilisation of trypsin, yeast alcohol dehydrogenase, and E. coli asparaginase onto several types of organic polymer beads. 211 These approaches were shown to have several advantages over other existing methodologies, namely the absence of solubility issues and the retention of positive charge at the reaction site.
1.1.7 Miscellaneous amine-selective reagents. Several methods of amine-selective modification of proteins were not included in the main chapter, either because little information is available or because their applicability is limited to a specific substrate type and is not general.
Azetidinone chemistry has recently been demonstrated by Barbas and collaborators 212,213 to have potential for selective lysine labelling of a particular IgG framework, containing a very reactive lysine residue with an unusually low pKa of about 6. Some detailed procedures are described for a smooth opening of a β-lactam moiety resulting in a β-alanine peptide bond. 213 Discovered by Tietze et al. as two-step sequential procedures for coupling of amines, 214,215 squaric acid diester amine-amine conjugation is now actively developed by Wurm et al., who have recently reported their successful use for the one-pot preparation of poly(glycerol)-protein 216 and glycol-protein conjugates 217 in aqueous media (Fig. 16).
Dichlorotriazine derivatives were described for amine-selective conjugation mainly as fluorescent dyes 218,219 and PEGylation probes. [220][221][222] They were shown to possess high reactivity towards protein amines. However, as was demonstrated by Abuchowski et al., 220 because the hydrolysis of dichlorotriazine occurs readily under the slightly basic conditions (pH 9.2) needed for the reaction to take place with sufficient selectivity towards amines, a considerable excess of the probe must be used in the coupling reaction. Banks and Paquette 70 have conducted a comparative study of three fluorescent probes, differing only in the moiety responsible for the reactivity with amines: CFSE (NHS ester), DTAF (dichlorotriazine) and FITC (isothiocyanate). It was found that the rate of conjugation is significantly faster for the NHS ester than for the dichlorotriazine probe, which, in turn, reacts faster than the isothiocyanate derivative. Each conjugate provided a satisfactory level of stability in solution over a period of 1 week at room temperature, although the hydrolysis of the remaining, relatively inert, chloro group of DTAF was observed (Fig. 17).
Arpicco et al. 223 have prepared thioimidoester-activated PEG-containing derivatives and shown their superiority over the NHS-activated analogue for gelonin modification (the reaction was conducted in PBS at pH 7.4). In this particular case, PEGylation with the thioimidoester derivative, which is less active than the NHS ester, resulted in a gelonin conjugate with higher inhibiting activity. Ikeda and associates 224 have recently described a protocol for the preparation of a glutaraldehyde-functionalised PEG reagent, allowing for protein PEGylation under mild reaction conditions. Similarly, the modified protein exhibited higher biological activity than when reacted with a corresponding NHS-activated PEGylation reagent.
α-Halocarbonyls, such as iodoacetamides, can modify lysine residues at pH > 7.0, 225 but the reaction is much slower than that with cysteine residues. Another class of reagents usually used in cysteine-selective conjugation, vinyl sulfones 226 (see Section 1.3.3), was recently reported to be applicable for lysine labelling at slightly basic pH. 227,228 Modification of Lys residues with acid anhydrides, including succinic, citraconic, maleic, trimellitic, cis-aconitic, and various phthalic anhydride derivatives, belongs to a pool of classically used protein modification methodologies 229 and allows for transforming nucleophilic amines into acids and, as a result, enables carboxylate-selective chemistry thereof.
Fig. 15 Procedure of the cleavable crosslinking of the intact 30S ribosomes (pdb: 1J5E) described by Traut et al. 201 Lysine residues (Lys-72 and Lys-156) were chosen randomly for simplicity purposes (TEASH stands for triethanolamine buffer adjusted to 3% 3-mercaptoethanol).
For more details of the practical aspects of using the above-described methodologies, the reader is referred to a recent review by Brun and Gauzy-Lazo 230 on the preparation of antibody-drug conjugates by lysine conjugation.

Serine and threonine
With pKa values > 13, the hydroxyl groups of serine (Ser, S) and threonine (Thr, T) are rather poor nucleophiles close to physiological pH. No examples of direct conjugation of in-chain serine and threonine have therefore been reported to date.
However, highly amine-selective N-hydroxysuccinimide (NHS) esters have been documented to give occasional side reactions with hydroxyl side chains. 83,90,92,231 In a series of experiments, Miller et al. have demonstrated that the presence of histidine in sequences of the type His-AA-Ser/Thr or His-AA-AA′-Ser/Thr (where AA and AA′ stand for any amino acid) can significantly increase the reactivity of hydroxyl groups toward classical amine labelling agents (Fig. 18). 85,86,88,232 Similarly, Mädler and Zenobi have reported that the guanidinium group of arginine can contribute to the reactivity of hydroxyl groups toward NHS esters and catalyse the nucleophilic substitution. 233 In both cases, it is hypothesised that the imidazolyl and guanidine moieties of histidine and arginine, respectively, catalyse the reaction by stabilizing the transition state by means of hydrogen bonds and electrostatic interactions. This promoting effect is thought to be responsible for side reactions on several substrates while using cross-linking reagents. 92,233 Although methodologies for selective in-chain serine and threonine labelling are rather scarce, these residues are of special interest for bioconjugation when located on the N-terminus (see Section 2.2).

Cysteine
Cysteine (Cys, C) is perhaps the most convenient target for bioconjugation owing to the exceptionally high nucleophilicity of its sulfhydryl (-SH) side chain which, particularly in its deprotonated thiolate form (-S⁻), largely exceeds the reactivity of any other nucleophilic function in proteins. 234 Furthermore, its relative rarity in proteins present in living organisms (1-2%) 235,236 and the ease of its introduction into a specific site by site-directed mutagenesis allow access to protein assemblies with a single cysteine at a predetermined position. 237 Even in proteins with multiple cysteines, their number is usually much smaller than that of lysines, which makes thiol-reactive labelling the preferred approach over amine-reactive methodologies.
In proteins, thiols can also be generated by selectively reducing cystine disulfides with reagents such as dithiothreitol (DTT, D1532), 238 2-mercaptoethanol (β-mercaptoethanol), or tris[2-carboxyethyl]phosphine (TCEP). 239,240 Generally, all these reagents must be removed before conducting thiol-selective conjugation, as they will otherwise compete with target thiols in proteins. 241 Unfortunately, removal of reducing agents is sometimes accompanied by air oxidation of thiols back to disulfides. Although, in contrast to the majority of thiol-reducing agents, TCEP does not contain a thiol group, there have been several reports that it can react with α-halocarbonyls or maleimides and that labelling is inhibited when TCEP is present in the reaction medium. 242,243 Direct labelling of the thiolate group is usually achieved by either a nucleophilic addition or a displacement reaction with the thiolate anion as the nucleophile. The substantially lower dissociation energy of sulfhydryl groups compared to the corresponding alcohols makes the former much more acidic and, as a consequence, makes their nucleophilic anionic form more widely available at physiological pH.
1.3.1 α-Halocarbonyls. The first reports on the use of α-halocarbonyl electrophiles, namely iodoacetamides, date back to 1935, when Goddard and Michaelis 244 first reported their application in modifying and studying keratin. Even today, almost 80 years later, these electrophiles are still among the most widely used for the modification of cysteine, especially in mass spectral analysis and peptide mapping of cysteine-containing proteins. 245 Iodo compounds are typically used because iodide is the best leaving group among the halides, which gives the highest conjugation rates (the relative reactivity is I > Br > Cl > F). Iodoacetyl-containing crosslinkers, biotinylation reagents, immobilisation kits, and mass spectrometry tags are now commercially available (e.g. BIAM, SIAB, UltraLink™ Iodoacetyl Resin and Gel, iodoTMT™). Although the corresponding maleimide reagents are more popular because of their even higher reaction rates, haloacetyl-mediated conjugations are usually preferred for applications where the elevated stability and compact size of the generated linkage (compared to maleimide) are crucial. Indeed, such bioconjugates degrade to S-alkyl cysteine derivatives only during amino acid hydrolysis.
Typically, the reaction of sulfhydryl groups with haloacetamides is conducted under physiological to alkaline conditions (pH 7.2-9.0). When iodoacetamides are used, the reaction is preferably carried out under subdued light in order to limit free iodine generation, which has the potential to react with Tyr, His and Trp residues. The reaction is most specific for sulfhydryl groups at pH 8.3. The iodoacetyl group is known to react with other amino acid side chains, especially when there is no cysteine present or if a gross excess of iodoacetyl is used. For instance, free amino groups, the thioether of methionine, and both imidazolyl side chain nitrogens will react with iodoacetyl groups above pH 7 and 5, although with much slower kinetics. 246 This, however, can be resolved by the use of less reactive chloroacetamides 247 or cautious control of pH and incubation time.
It is to be noted that the local environment has a profound effect on the reactivity of cysteine residues in proteins. If moderately reactive reagents such as iodoacetamide are used for bioconjugation, this difference in reactivities makes it possible to discern different types of Cys moieties present in the protein. Almost half a century ago, Gerwin 248 reported dramatic differences in the reactivities of chloroacetic acid and chloroacetamide in the modification of the active-site cysteine of streptococcal proteinase, which was found to be due to the influence of the neighbouring histidine residue. As a general trend, cysteine residues possessing lower pKa values are more reactive when the reaction is conducted under neutral or slightly acidic conditions, owing to their greater degree of dissociation and, as a consequence, higher concentration of the corresponding thiolate anions in the medium. For instance, Kim et al. 249 have described a method for selective biotinylation of low-pKa cysteine residues in proteins simply by conducting the reaction at slightly acidic pH (Fig. 19).
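The pKa argument above can be made semi-quantitative with the Henderson-Hasselbalch relation. The following is a minimal sketch, not a published protocol: the function name `thiolate_fraction` is hypothetical, the low pKa of 6.5 is the value quoted for CYS-283 of creatine kinase in Fig. 19, and 8.5 is an assumed midpoint of the 8.0-9.0 range quoted there for the other cysteines. It also assumes that reactivity scales only with the thiolate fraction, ignoring differences in intrinsic nucleophilicity and accessibility.

```python
def thiolate_fraction(pka: float, ph: float) -> float:
    """Fraction of a thiol present as thiolate at a given pH.

    Henderson-Hasselbalch: [S-]/([SH] + [S-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    LOW_PKA = 6.5   # quoted for CYS-283 of creatine kinase (Fig. 19)
    HIGH_PKA = 8.5  # assumed midpoint of the quoted 8.0-9.0 range
    for ph in (6.0, 7.0, 8.3):
        low = thiolate_fraction(LOW_PKA, ph)
        high = thiolate_fraction(HIGH_PKA, ph)
        # The ratio is a crude estimate of the selectivity window only.
        print(f"pH {ph}: low-pKa Cys {low:.3f}, "
              f"high-pKa Cys {high:.4f}, ratio {low / high:.0f}")
```

Under these assumptions the ratio of thiolate fractions grows as the pH is lowered, which is consistent with the use of slightly acidic conditions for labelling low-pKa cysteines selectively.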
Davis and Flitsch 250 described a procedure for the selective glycosylation of proteins at one or several sites by reacting carbohydrate-tethered iodoacetamides with cysteine side chains, which allowed for preparing homogeneously glycosylated human erythropoietin 251 and dihydrofolate reductase. 252 In 1948, Mackworth 253 published his study on the biochemical mechanism of the lachrymatory effect of certain war gases and first reported the reactivity of structurally relevant α-bromoacetophenones for the inhibition of several classes of thiol enzymes.
Despite advances made in the investigation of α-haloacetophenones and related ketoximes for the modification of the active sites of enzymes, [256][257][258][259] their utility for conjugation is very limited because of various side reactions.
An interesting approach that allows photochemical conversion of cysteine into corresponding thioaldehyde and then to aldehyde thought to be formed by Norrish type II cleavage was reported by Clark and Lowe. 254,255 Photolysis of the enzyme, alkylated by a bromoacetophenone derivative, results in spontaneous loss of hydrogen sulfide from the generated thioaldehyde to give the corresponding aldehyde (Fig. 20), which can either be utilised as a locus for aldehyde-selective conjugation or be transformed into the corresponding serine or glycine residue by reduction or transamination respectively.
1.3.2 Maleimides. As early as 1949, maleic acid imides (maleimides), products of the reaction of maleic anhydride and amine derivatives, were introduced by Friedmann as cysteine-specific reagents. 260,261 Maleimides have steadily gained in popularity ever since and are today perhaps the most frequently used functional groups for bioconjugation. This is mainly due to their exceptionally fast kinetics and high selectivity toward the cysteine moiety in proteins.
Fig. 19 Biotinylation of the low-pKa cysteine residue of rabbit muscle creatine kinase (CK, pdb: 2CRK) by BIAM. 249 The charge interaction between the negatively charged thiolate and the positively charged amino acid residues nearby results in a significantly lower pKa value of the CYS-283 residue (6.5). Consequently, selective alkylation thereof becomes possible in the presence of three other cysteine residues with higher pKa values (8.0-9.0).
Fig. 20 Photolysis of the CYS-25 residue located in papain's active site (pdb: 1PPN) after its labelling with 2-bromo-2′,4′-dimethoxyacetophenone results in the formation of the unstable thioaldehyde, which readily hydrolyses to give the corresponding aldehyde. 254,255
The reason for such remarkable reactivity of maleimide towards thiolates is worth discussing. In general, the electrophilicity of alkenes is defined by their ability to serve as acceptors of a nucleophile's electron density, and is thus related to the energy of the electrophile's π* orbital (its lowest unoccupied molecular orbital, LUMO). Generally speaking, the rule is simple: the lower the energy of the alkene's π* orbital, the faster its reaction with nucleophiles. There exist two main approaches for decreasing an alkene's LUMO energy: the direct attachment of an electron-withdrawing group (EWG) and the straining of the double bond. Although proceeding via two different mechanisms, by decreasing the energy of both orbitals or by diminishing the energetic gap between them, either approach results in lowering the LUMO energy of the alkene and, as a result, in the increase of its reactivity (Fig. 21). The unique reactivity of the maleimide moiety is due to the fact that it exploits these two mechanisms together. 262,263 To date, a large variety of maleimide-based modifying reagents are available from a number of leading biochemical companies, with even more being synthesised in laboratories around the world for specific applications. The applications of these reagents strongly overlap those of iodoacetamides, although maleimides apparently do not react with methionine, histidine or tyrosine. 264,265 The optimum reaction conditions for maleimide-mediated conjugation, namely conducting the reaction under near neutral conditions (pH 6.5-7.5, Fig. 22), prevent the reaction of maleimide with amines, because the latter requires a higher pH to occur. At pH above 8 the hydrolysis of maleimide itself results in a mixture of isomeric non-strained maleamic acids unreactive toward sulfhydryls and can thus compete with thiol modification. 28,266 Similarly, maleimide-thiol adducts hydrolyse, which either results in complete deconjugation or causes a significant change in the properties of the conjugate. 
266 Furthermore, especially at pH above 9, ring-opening by nucleophilic reaction with an adjacent amine may yield crosslinked products. 267 Schuber and co-workers 269 have found that important kinetic discrimination can be achieved between the maleimide and bromoacetyl functions when the reaction with thiols is conducted at pH 6.5 and 9.0, respectively.
Maleimide-NHS heterobifunctional reagents are especially important for the formation of conjugates. Hydrolysis of both the maleimide moiety and the generated thioether linkage is considerably dependent on the type of chemical group adjacent to the maleimide. Interestingly, the cyclohexane ring was found to provide increased maleimide stability to hydrolysis due to its steric effects and its lack of aromatic character. For this reason, succinimidyl-4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC) and its water-soluble analogue (sulfo-SMCC) are today among the most popular crosslinkers in bioconjugation. They are often used in the synthesis of protein-protein or protein-probe assemblies such as antibody-enzyme or antibody-drug conjugates, respectively. These include enzyme immunoassays, [270][271][272][273][274] carrier-protein conjugates, [275][276][277] albumin-binding prodrugs, 278,279 and even approved therapies. 117,119,280,281 Short homobifunctional maleimides are commonly used to explore and characterize protein structure (i.e., oligomerisation) or protein interactions. [282][283][284][285][286][287][288] Maleimide-mediated immobilisation of biomolecules is often achieved either by direct conjugation 13,289,290 or by prior biotinylation of the molecule of interest. 291 The latter approach has been used for protein enrichment, 292 capture, 293,294 and immobilisation on modified supports. [295][296][297][298] Most of the optical thiol-selective fluorescent probes often used as sensors for monitoring biological processes are represented by maleimide-containing reagents. [299][300][301] Another testament to maleimide utility is its use in glycosylation, 302 radiolabelling, 303,304 studying protein interactions, [305][306][307][308] and quantitation of cysteine residues. 
309,310 Despite its successful application as a reagent for the chemical modification of proteins, the irreversibility of maleimide's addition makes it impossible to regenerate the unmodified protein by controlled disassembly of the conjugate. Such reversibility is, however, often desirable for in vitro or in vivo applications. Several studies were devoted to a mild and specific hydrolysis of the imido group in maleimide conjugates. 312,313 These approaches turned the originally irreversible maleimide-mediated thiol conjugation into a cleavable methodology. However, the harsh reaction conditions of this cleavage (strongly basic conditions or the presence of a large amount of imidazole) make them incompatible with many fragile protein substrates. In contrast to methanethiosulfonates (see Section 1.3.6), 315 monobromomaleimides allow much more stable conjugation of thiolates, giving conjugates that are nevertheless easily cleavable upon reaction with TCEP by the addition-elimination sequence (Fig. 23). Moreover, the initial modification of a protein resulted in a thiol-maleimide moiety, which was shown to be prone to a second thiol addition and resolved another recognised drawback of maleimide-based methodologies, namely the presence of only two points of attachment. 316 Similar to a non-substituted maleimide, the hydrolysis of the thiol-maleimide linkage results in a dramatic decrease in its reactivity towards thiolates, which can be used for "switching off" the linker after the first thiol addition (Fig. 24). 317,318
1.3.3 Vinyl sulfones. The Michael-type addition of vinyl sulfones (VS) is an attractive strategy for protein conjugation, because of the elevated water stability of the VS function and the almost quantitative yields of their reaction with thiolates. [319][320][321][322][323] The reaction of vinyl sulfones with lysine residues has been reported, 320,322 but occurs only at high pH values (pH > 9.3).
Initially, VS-mediated approaches were used almost exclusively for PEGylation of proteins with end-functionalised PEG derivatives. 322,324 Several studies on the immobilisation of macromolecules on solid supports using vinyl sulfones were reported, owing to the elaboration of new methods for the preparation of VS-modified surfaces. 319,320,[325][326][327] Versatile VS-containing probes, namely carbohydrates, 227,328 chelating agents, 329 fluorescent tags, 330,331 and biotinylation reagents 331 were recently developed and applied in the bioconjugation of proteins (Fig. 25). Ovaa and co-workers 332 used a vinyl sulfone handle to conjugate enzymes to a ubiquitin-like protein. The applications of VS-tags in proteomics have recently gained popularity and have been reviewed by Lopez-Jaramillo et al. 331 Vinyl sulfones react with thiols to form a stable thioether linkage to the protein under slightly basic conditions (pH 7-8). 228 The reaction may proceed faster if the pH is increased, but this usually also increases the amount of side-products (namely the modification of the Lys ε-amino groups and the His imidazole rings). 320 The main advantage of VS-tags is their elevated stability in aqueous solutions, compared to more reactive thiols and maleimides, which can be subjected to ring opening or addition of water across the double bond. 185
1.3.4 Thiol-ene coupling. Discovered at the beginning of the last century by Posner, 333 free-radical-based hydrothiolation of terminal alkenes, also called the thiol-ene coupling reaction (TEC), has emerged as a powerful approach for the chemoselective modification of both peptides and proteins. 334,335 The initial step of the reaction is the light- and/or initiator-induced generation of the thiyl radical. This radical adds to the alkene in an anti-Markovnikov fashion to yield the thioalkyl radical, which in turn propagates the radical chain by abstracting hydrogen from another thiol (Fig. 26).
The TEC conjugations (usually conducted in PBS-DMSO buffer at pH 7.0-7.5) are compatible with oxygen and aqueous media and are usually carried out upon irradiation at λmax 365 nm in the presence of Vazo44 (2,2′-azobis[2-(2-imidazolin-2-yl)propane]dihydrochloride) as an initiator. The resulting thioether linkage is biologically stable and robust.
The first approach to protein conjugation via TEC, namely glycosylation, was reported by Davis et al. in 2009. 336 However, it consisted of photoinduced coupling of various glycosyl thiols with site-specifically introduced unnatural L-homoallylglycine. A complementary approach to peptide and protein glycoconjugation by photoinduced coupling on cysteines was first introduced almost at the same time by Dondoni and co-workers. 337 The 66 kDa globular bovine serum albumin (BSA) possessing one free CYS-34 residue was selected as a model protein.
Surprisingly, it was found that not only the one CYS-34 SH group, as expected, but also two more SH groups arising from the 75-91 disulfide bond were modified. It was suggested that such hyperglycosylation was due to well-documented disulfide bond degradation by UV-irradiation, 338 namely to an electron transfer process from photoexcited tryptophan residues. Furthermore, prolonged irradiation of the reaction mixture up to 2 hours induced the introduction of seven glycoside residues into BSA. Despite the necessity for UV-irradiation, ensuing side-reactions, and often moderate yields, 335 TEC is especially tolerant to a wide range of functional groups because, in contrast to the majority of thiol-selective methodologies, it does not exploit the elevated nucleophilicity of the thiolate but its readiness to generate radicals. For instance, Garber and Carlson 339 have used this feature of TEC for selective capping of thiols in the presence of thiophosphorylated groups, free alcohols and amines.
Several approaches involving the combination of cysteine-selective methodologies have been recently reported. Stolz and Northrop 340 studied the reactivity of N-allyl maleimides and found this scaffold to be appropriate for consecutive two-step conjugation of thiols via (1) base-initiated Michael addition to the maleimide moiety and (2) radical-mediated TEC of the allyl fragment. Scanlan and associates 341 developed the sequential NCL-TEC approach (for more details on NCL, see Section 2.3.1) for the functionalisation of the cysteine thiolate generated at the ligation site during native chemical ligation.
1.3.5 Thiol-yne coupling. After the rise of TEC for bioconjugation, its sister reaction of the hydrothiolation of alkynes, also referred to as thiol-yne coupling (TYC), began to receive increased attention. 342 Discovered in 1949 by Jones and collaborators, 343 TYC allows introduction of two thiol fragments across a carbon-carbon triple bond via a free-radical mechanism similar to TEC. The first step, the anti-Markovnikov addition of a thiyl radical to the triple bond, yields an intermediate vinyl thioether capable of undergoing a second addition of the thiyl radical through the same mechanism, leading to the 1,2-dithioether (Fig. 27).
TYC occurs under the same reaction conditions as TEC and smoothly proceeds at room temperature in aqueous solutions. First trials on the applicability of TYC on peptides were reported by Dondoni and collaborators in 2010. 344 The authors have demonstrated the possibility of dual glycosylation of a series of peptides (up to 8 residues). Later on, Davis and Dondoni have expanded the dual conjugation strategy for achieving sequential glycosylation and fluorescent labelling of BSA (Fig. 28). 345 Just as with TEC, the reaction also occurs at cysteine residues of the 75-91 disulfide bridge.
The necessity of using a photo- or a chemical initiator in both TEC and TYC conjugations represents the main drawback of these methodologies, as the presence of free radicals results in a series of side reactions, namely oxidation and crosslinking of proteins.
1.3.6 Disulfide reaction. Simple air oxidation of two thiolates to form a disulfide bond is probably the most straightforward among cysteine-selective conjugation techniques. In its simplest form, it consists of stirring a protein possessing a free cysteine residue with a thiol-containing probe, open to air, for several days under basic conditions. 346 A large excess of the thiolate probe is required in order to reduce the likelihood of protein dimerisation. Treatment with iodine 347 was reported for activating cysteine toward the formation of mixed disulfides. However, restricted control of product distribution and long reaction times largely limit the applicability of these methods for bioconjugation.
Diverse disulfides have been extensively used in the past decade for the modification of cysteine by disulfide exchange. This reversible reaction involves attack of the cysteine thiolate at the disulfide, breaking the S-S bond, and subsequent formation of a new mixed disulfide. A well-known example of such a reaction is the colorimetric quantitation of free sulfhydryls with Ellman's Reagent. 348 Several symmetric disulfide-containing fluorescent probes such as BODIPY L-cystine and fluorescein L-cystine are commercially available. However, because there is no thermodynamic preference for this disulfide exchange to proceed one way or the other, labelling with non-activated disulfides generally requires use of a large excess of the probe to achieve sufficient levels of tagging. 349 In contrast, related activated thiols, namely thiosulfates (R-S-SO3⁻), thiosulfonates (R-S-SO2-R′, MTS), sulfenyl halides (R-S-X), 350 pyridyl disulfides, 351,352 and TNB-thiols (derivatives of 5-thio-2-nitrobenzoic acid) contain good leaving groups, which tautomerise to give unreactive forms, thus shifting the reaction equilibrium (Fig. 29). PEGylation, fluorescent and biotinylation probes containing thiosulfate (commercialised as TS-link reagents) and pyridyl disulfide motifs are today widely commercially available. Thiosulfonates were first introduced for bioconjugation by Davis and co-workers 353,354 in their work on controlled glycosylation 355 and further elaborated by Zhao et al. 356 for site-selective PEGylation of proteins. Recent advances resulted in the further development of thiol-activated methodologies towards selenenylsulfides 357,358 and even methanedithiosulfonates, allowing for synthesizing trisulfide conjugates. 359 Disulfide-based conjugation was recently reported for the preparation of antibody-drug conjugates and for studying the influence of the spacer length on their stability. 360 Diselenide analogues of disulfide PEGylation reagents were proposed by Jevševar et al. 
361 as selective and fast alternatives for coupling. Although high conversion yield required the use of a large molar excess of the probe, this elegant approach represents an interesting technology which deserves further investigation.
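The need for a large excess of a non-activated disulfide probe, noted above for disulfide exchange, can be illustrated with a simple equilibrium estimate. The sketch below is an idealised model and not a description of any cited experiment: it assumes a single exchange P-SH + R-SS-R ⇌ P-SS-R + R-SH with equilibrium constant `k_eq`, takes `k_eq = 1` as a stand-in for "no thermodynamic preference", and ignores protein dimerisation and further exchange steps. The function name `exchange_conversion` and its parameters are hypothetical.

```python
import math


def exchange_conversion(excess: float, k_eq: float = 1.0) -> float:
    """Equilibrium fraction x of protein thiol converted to mixed disulfide.

    Model (assumption): P-SH + R-SS-R <=> P-SS-R + R-SH, starting from
    1 equivalent of protein thiol and `excess` equivalents of probe, so
    k_eq = x**2 / ((1 - x) * (excess - x)).
    """
    if excess <= 0 or k_eq <= 0:
        raise ValueError("excess and k_eq must be positive")
    if math.isclose(k_eq, 1.0):
        # The quadratic term cancels for k_eq = 1: x = n / (1 + n).
        return excess / (1.0 + excess)
    # (1 - K) x^2 + K (1 + n) x - K n = 0; the root below is the one
    # lying between 0 and min(1, n) for both K < 1 and K > 1.
    a = 1.0 - k_eq
    b = k_eq * (1.0 + excess)
    c = -k_eq * excess
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"{n:>3} equiv probe, K = 1: {exchange_conversion(n):.3f}")
    # An activated probe is modelled here simply as a larger K (assumed value).
    print(f"  1 equiv probe, K = 100: {exchange_conversion(1, 100.0):.3f}")
```

Within this model, a non-activated probe needs roughly a tenfold excess to reach about 90% labelling, whereas a probe whose leaving group shifts the equilibrium reaches a similar level with a single equivalent.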
The main factor behind the popularity of methodologies yielding disulfide and selenenylsulfide linkages is the reversibility they afford. However, the resulting conjugates are generally less stable than those obtained using bromomaleimides 315 (Section 1.3.2) and can be readily cleaved with classical reducing agents such as DTT, β-mercaptoethanol or TCEP. Yet, in the case of disulfides, the modified protein can be made more stable and resistant to reduction by converting it into the corresponding thioether-linked conjugate by means of HMPT-mediated desulfurisation elaborated by Davis and associates (Fig. 30). 362 The use of hindered disulfides represents another way to increase the resistance of generated conjugates to cleavage. 363
1.3.7 Disulfide rebridging. Use of thiol-reactive reagents often requires recombinant introduction of a free cysteine into the protein 364 because most proteins do not have a free cysteine. 365,366 This new unpaired cysteine may cause disulfide scrambling, complicate protein refolding, 364 or lead to aggregation of the protein. 367 In contrast, most biologically relevant proteins possess at least one disulfide bond in their structure. 368 The direct reduction of disulfide bonds followed by conjugation with thiol-selective reagents is, however, usually inadmissible, since these bonds are responsible for the protein's structure, stability, or function 369,370 and must thus remain bridged after the modification in order not to alter protein tertiary structure.
In an attempt to resolve this problem, Brocchini et al. 371 developed a clever methodology for the PEGylation of protein disulfide bonds with α,β-unsaturated bis-thiol alkylating reagents. Covalent rebridging of the two thiols derived from the disulfide after its mild reduction gave modified proteins with retained tertiary structure and biological activity. Interferon α-2b (IFN) was used in the initial studies because it is representative of four-helical-bundle proteins with accessible disulfide bonds. Following reduction of the disulfide in IFN, the two free cysteines were re-joined using a three-carbon linked functional PEG. 368,371 The methodology was further expanded to the PEGylation of therapeutic proteins 368,372,373 and antigen-binding fragments of immunoglobulin G, 374 and to poly phosphocholine labelling of IFN. 375 Simultaneously with the introduction of the previously mentioned monobromomaleimides, Baker and co-workers introduced a relevant class of reagents containing a highly reactive dibromomaleimide or dibromopyridazinedione scaffold, which allow rapid and efficient disulfide rebridging by installing a rigid two-carbon linker. 316,376 This approach was first applied to the equimolar PEGylation of the 32-amino acid salmon calcitonin (sCT, Fig. 31) 377 and very recently to the preparation of homogeneous antibody conjugates. 378,379 Although very rapid (full conversion is achieved in less than 5 minutes), dibromomaleimide-based conjugation produced a small amount of multimers when complex polypeptides were modified. 378 The more stable and less reactive dithiophenolmaleimides developed by Baker et al. 380 and Haddleton et al. 381 avoid this drawback of dibromomaleimide probes. 
In combination with benzeneselenols, known for their efficiency in catalysing disulfide cleavage, the dithiophenolmaleimide approach allowed selective antibody fragment conjugation with no detectable formation of multimers while conserving a high reaction rate (Fig. 32). 378 Haddleton and colleagues 382 have recently reported an in situ one-pot preparation of oxytocin-polymer conjugates using dithiophenolmaleimide-containing probes.
Structurally similar to dibromomaleimides, but containing four attachment points instead of three, dibromopyridazinediones (PD) were recently described by Caddick et al. 383,384 and provided a platform for IgG antibody dual labelling. 384 Namely, the authors elaborated the preparation of a PD-linker containing two orthogonal reactive handles in its structure: (1) a strained alkyne, which readily reacts with azides in Cu-free strain-promoted azide-alkyne cycloaddition (SPAAC), and (2) a terminal alkyne, which reacts with azides in Cu-catalysed azide-alkyne cycloaddition (CuAAC). The construct obtained after rebridging a reduced antibody with the PD-linker was then used to selectively introduce two distinct functionalities (Fig. 33).
Fig. 30 Two-step protocol for the preparation of a thioether linkage via disulfide exchange followed by HMPT-mediated desulfurisation (P(NMe2)3, hexamethylphosphorous triamide) of the glycosylated mutant S156C SBL (wild-type pdb: 1GCI). 362
An interesting disulfide stapling-unstapling strategy using dichloro-s-tetrazine was developed by Smith and collaborators (Fig. 34). 385,386 In addition to being photochemically cleavable (i.e., they can be unstapled, thus regenerating the reduced disulfide bonds; Fig. 34a), S,S-tetrazine macrocycles can be labelled by exploiting the reactivity of the tetrazine in the inverse electron demand Diels-Alder reaction (Fig. 34b).
Organic arsenicals (similar to the tetracysteine-selective biarsenical dyes initially developed by Tsien and co-workers, see Section 4) were recently exploited for efficient protein-polymer conjugation (Fig. 35). 387 It is noteworthy that, in contrast to highly thiol-reactive dibromomaleimides, these reagents demonstrated enhanced selectivity for disulfide rebridging in the presence of free Cys residues. Namely, while dibromomaleimides reacted nearly quantitatively within 30 minutes with the free CYS-34 of native BSA, organic arsenicals exhibited limited reactivity and gave only about 20% labelling over the same period. 387 The authors hypothesised that an entropy-driven affinity of arsenicals for closely spaced dithiols is the main reason for this specificity.
1.3.8 Transforming to dehydroalanine. β-Elimination of thiolate from the cysteine moiety turns one of the most strongly nucleophilic side chains into a dehydroalanine moiety (Dha, Fig. 36), which represents an electrophilic centre for reactions with nucleophiles. Such "umpolung" in terms of nucleophilicity-electrophilicity opens an extremely interesting prospect for transformations whose applicability is otherwise largely restricted by the generally nucleophilic nature of amino acid side chains, namely Pd- and Rh-catalysed reactions, [388][389][390][391] Michael addition, 392,393 and hydroboration. 394 Site-selective incorporation of Dha into proteins may be achieved by a considerable number of chemical transformations. Historically, the first example of such a reaction was reported by Koshland and collaborators 50 years ago and consisted of the transformation of the nucleophilic serine of chymotrypsin into dehydroalanine via selective sulfonylation followed by base-mediated elimination. [396][397][398] The method, however, exploited the particularly high nucleophilicity of the SER-195 residue located in the active site of chymotrypsin. Logically, more general methods based on the exceptional nucleophilic properties of cysteine received increased attention in the years that followed. These are represented in Fig. 36 and include: reduction-elimination, an often observed undesired side reaction during the reduction of disulfides; base-mediated elimination of activated thiolate, which typically requires temperatures incompatible with protein substrates and is thus not of synthetic interest; oxidative elimination; and bis-alkylation-elimination. 395 The last two approaches seem to us the most promising among the available Cys → Dha transformations, and it is to these methodologies that we now turn.
Fig. 33 Dual labelling of IgG antibody using dibromopyridazinediones (PD, shown in orange). 384 The construct obtained after rebridging of a reduced IgG with a PD-probe containing a strained alkyne (reactive in strain-promoted azide-alkyne cycloaddition, SPAAC) and a terminal alkyne (reactive in Cu-catalysed azide-alkyne cycloaddition, CuAAC) was subsequently labelled with two azide-containing probes (N3-R1 and N3-R2).
1.3.8.1 Oxidative elimination. Oxidative elimination of thiolates is readily achievable, but the high temperatures and severe reaction conditions often required rendered the methodology incompatible with fragile protein substrates. 399 However, recent efforts to find milder conditions for these desulfurisations have resulted in two promising classes of chemical reagents: O-mesitylenesulfonylhydroxylamine (MSH, Fig. 37) 392,400 and bromomaleimides (mentioned above for the generation of biscysteine adducts). 314 Basic conditions are generally required for the reaction to achieve high conversion. Other amino acid residues may also react with MSH and bromomaleimide, but the reaction rates are much lower than those of thiolates, and the resulting products are generally unstable under basic conditions and decompose back to their unmodified forms.
1.3.8.2 Bis-alkylation-elimination. Conversion of cysteine to dehydroalanine by a bis-alkylation-elimination approach was first introduced by Holmes and Lawton. 401 Initially, this transformation required rather harsh reaction conditions and was compatible with only a restricted number of protein substrates. Only recently have Davis and collaborators 395 reported a more general method for the Cys → Dha transformation using water-soluble α,α′-dibromoadipyl(bis)amide (DBAA), which generates the Dha moiety under mild conditions (37 °C, pH 7.0-8.0) in sufficiently high yields. This approach was evaluated on several model proteins, including SBL (see above), 395 the single-domain antibody cAb-Lys3 A104C mutant, 395 histone H3 (Fig. 38), 393 the AurA kinase domain, 402 and a GFP mutant. 403
1.3.8.3 Other approaches. Despite recent advances in dehydroalanine-mediated conjugation methodologies, their inherent limitations preclude their general use for peptide and protein modification. None of these approaches enables general, chemo- and site-selective incorporation of dehydroalanine into proteins without prior incorporation of an accessible Cys residue. Several other approaches designed to overcome this problem continue to appear. These include oxidative elimination of aryl-selenocysteine, 399,404-408 the use of lacticin synthetase, 409 and transformation to selenocysteine thioethers. 410
1.3.9 Miscellaneous thiol-selective reagents. An example of a simple alkylation reaction that remains relevant in bioconjugation is aminoethylation. Known for more than half a century, 411,412 it transforms cysteine thiolates into lysine-mimicking thialysine residues by means of bromoethylamines or aziridines. The thialysines obtained were validated as suitable substrates for further amine-selective transformations (see Section 1.1). Furthermore, the method was recently shown to provide access to more unusual methylated lysine analogues. 413 The reaction is typically conducted at pH > 8.5 to ensure a high level of cysteine deprotonation.
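The alkaline pH used for aminoethylation can be rationalised with the Henderson-Hasselbalch relationship. The sketch below estimates the pH needed to reach a chosen thiolate fraction. The thiol pKa of 8.3 is an assumed, typical value for an unperturbed cysteine side chain and is not taken from the cited studies; the pKa of a particular protein cysteine may differ considerably. The function and constant names are hypothetical.

```python
import math

ASSUMED_CYS_PKA = 8.3  # assumption: typical free cysteine thiol, not from the source


def ph_for_thiolate_fraction(fraction, pka=ASSUMED_CYS_PKA):
    """pH at which `fraction` of a thiol with the given pKa is deprotonated.

    Rearranged Henderson-Hasselbalch: pH = pKa + log10(f / (1 - f)).
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return pka + math.log10(fraction / (1.0 - fraction))
```

Under this assumption half of the thiol is deprotonated at pH 8.3 and about 90% near pH 9.25, which is in line with running the reaction at alkaline pH.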
SNAr substitution chemistry approaches for cysteine modification in proteins were reported by several research groups. 395,414,415 Davis et al. 395 utilised Mukaiyama's reagent (2-chloro-1-methylpyridinium iodide) to generate an arylated cysteine as an intermediate for its conversion to dehydroalanine. Pentelute and co-workers have expanded the approach towards perfluoroaryls for protein stapling and conjugation. 414,415 Finally, Barbas and associates 416 have developed a class of Julia-Kocienski-like methylsulfonyl-functionalised reagents that react rapidly and specifically with thiols at biologically relevant pH (5.8-8.0). Notably, the resulting conjugates possess superior hydrolytic stability compared with cysteine-maleimide conjugates, which makes this methodology appropriate for the preparation of stable protein conjugates and PEGylated proteins (Fig. 39).
Fig. 37 Conjugation of mutant S156C SBL (wild-type pdb: 1GCI) containing a single, surface-exposed cysteine residue CYS-156 by oxidative elimination followed by conjugation with thiol probes. 392
A strategy exploiting selective cyclization of peptides containing three cysteines to generate combinatorial libraries of cysteine-rich bicyclic peptides was recently developed. This approach is based on utilisation of homotrifunctional linkers: TBMB (tribromomethyl-), TATA (triacryloyl-), or TBAB (tribromoacetamide-containing reagents). 418,419 An efficient gold-catalysed allene-mediated coupling reaction has been recently developed by Che and colleagues. 420 The method allowed direct thiol-selective functionalisation of model peptides and reduced RNase A (Fig. 40).
Reactions of thiols with electron deficient acetylenes have been known for decades, being, however, mostly conducted in organic solvents. [421][422][423][424][425] Several examples of reactions in aqueous media have been recently reported. 417,420,426,427 Che and co-workers 417 have elaborated a versatile method for the selective cysteine labelling of unprotected peptides and proteins in aqueous media with arylalkynone reagents. Notably, modified peptides could be converted back into the unmodified peptides by treatment with thiols under mild reaction conditions (Fig. 41).
Interestingly, in contrast to arylalkynones, structurally similar electron-deficient acetylenes, 3-arylpropiolonitriles (APN), were recently reported as a prominent class of reagents for irreversible tagging of cysteine. 428,429 The superior stability of the resulting conjugates in aqueous and biological media opens an interesting prospect in many fields where conjugate stability is crucial, e.g. for the preparation of antibody conjugates possessing increased plasma stability (Fig. 42). 429 Oxanorbornadienedicarboxylates (OND reagents), strained adducts of furans and electron-deficient alkynes, were found to provide better water stability than the corresponding alkynes while retaining selective, rapid, and fluorogenic reactivity towards cysteine. 430 α,β-Unsaturated ketones and amides (typically acrylamides) can undergo Michael addition. 431,432 However, the rate of addition is generally not high enough to give them a competitive advantage over other approaches. Internal Cys residues were reported to accelerate native chemical ligation (see Section 2.3.1), an especially selective approach for N-terminal cysteine conjugation, via cyclic transition states. 433-438

Tryptophan
Tryptophan (Trp, W) is, after cysteine, the second low-abundance amino acid, with a frequency of about 1% (depending on the living organism), 235 but approximately 90% of proteins contain at least one Trp residue in their sequence. 439 Achieving specific reactivity at tryptophan in proteins is one of the most challenging problems in bioconjugation. In spite of the variety of reagents introduced over the years for selective modification of tryptophan, only a few can be used for conjugation. For instance, classically used species such as Koshland's reagent (2-hydroxy-5-nitrobenzyl bromide) 440 or chlorosulfonium ions 441 show a high degree of cross-reactivity with nucleophilic side chains, although they are still used in numerous studies. These include, for example, investigation of the role of tryptophan in the active sites of enzymes, 442 estimation of its content in proteins, 443 and determination of the surface accessibility of Trp residues in proteins. 444
1.4.1 Malondialdehydes. In 2007, further exploring the reactivity of dicarbonyl compounds towards tryptophan described by Teuber and colleagues, 445 Foettinger et al. 446 reported the selective reaction of malondialdehydes (MDA) with the indole nitrogen of the Trp side chain of the 8-mer peptide PTHIKWGD under acidic conditions (Fig. 43). The resulting substituted acrolein moiety, which retains a reactive aldehyde group, can be further converted into a hydrazone using hydrazide compounds or derivatised by other aldehyde conjugation methodologies. Hydrazine, phenylhydrazine, and secondary amines such as pyrrolidine were reported to act as cleavage reagents that release the free tryptophan after conjugation.
To overcome selectivity issues, namely a known side reaction with Arg side chains, 448 the conditions for the reaction with MDA, the hydrazone formation, and the cleavage of the MDA derivative had to be optimised with respect to pH, buffer, temperature, and reagent. Side reactions of MDAs were absent only under strongly acidic conditions, such as aqueous TFA (80%). The subsequent hydrazone formation requires an approximately 50-100-fold molar excess of reagent at pH 5-7 and sometimes an increase in temperature to 50 °C. Although unstable under acidic conditions and when the reagent excess is removed, the hydrazone bond remains stable in alkaline medium (pH > 9). The optimal conditions for the cleavage were found to be hydrazine (applied as the dihydrochloride salt) in ammonium acetate solution at pH ~3. These rather harsh reaction conditions prevent this methodology from finding widespread use for sensitive protein targets, yet allow its application in proteomics on peptide digests. 449
1.4.2 Metallocarbenoids. The same drawback is shared by another approach, involving vinyl metallocarbenoids, described in 2004 by Francis and collaborators. 450 The authors showed that two Trp residues of horse heart myoglobin can be selectively tagged by a stabilised vinyl diazo compound in the presence of Rh2(OAc)4 (Fig. 44). Difficulties related to the known instability of the rhodium carbenoid intermediate in aqueous media 451 were overcome by using an unusual additive, hydroxylamine hydrochloride, which was found to facilitate the reaction and enhance the efficiency of the tryptophan modification pathway relative to hydrolysis of the metallocarbene. However, an excess of the corresponding diazo compound is usually required (at least 4 equivalents), and 2 : 3 mixtures of N-alkylated versus 2-alkylated products (Fig. 44) are generally obtained in moderate yields. 450,451 The initially reported reaction conditions tolerated several aqueous solvent systems and proceeded at room temperature. Yet acidic conditions (pH 1.5-3.5) were still necessary for efficient protein labelling and stood out as the main drawback preventing this approach from being generally applicable. For instance, in the same work, the authors stated that myoglobin was denatured and the heme dissociated from the protein owing to the high acidity of the medium.
To address these limitations, the same group subsequently sought to extend the pH range of the tryptophan modification methodology. 452 Since hydroxylamine was found to be ineffective at generating rhodium carbenoids at pH ≥ 6, a wide screening of commonly used buffers, as well as of additives structurally similar to hydroxylamine (H2NOH), was conducted in order to identify appropriate conditions. From these studies, tBuNHOH was found to be highly effective at promoting carbenoid addition. Although the precise mode of action of tBuNHOH remains unclear, the authors attributed the substantial increase in catalytic activity to a specific interaction between this additive and Rh2(OAc)4. They speculated that, in contrast to hydroxylamine, tBuNHOH binds to Rh2(OAc)4 through the oxygen rather than the nitrogen, the latter being disfavoured by the bulky tert-butyl substituent (Fig. 45a), and thereby increases both the stability and the reactivity of the complex at neutral and slightly basic pH.
Interestingly, in the same work, the authors demonstrated the key role of the solvent accessibility of residues in determining the outcome of conjugation on tryptophan using rhodium metallocarbenes. Human FK506 binding protein (FKBP) was identified as a suitable substrate for the study. The only Trp residue (TRP-59) of wild-type FKBP (containing an additional C-terminal threonine residue) is located at the base of the binding pocket and is therefore unavailable for modification under non-denaturing conditions (Fig. 45b).
To overcome these difficulties, a labelling strategy based on tryptophan mutagenesis followed by chemoselective modification with rhodium carbenoids was utilised. Tryptophan-containing FKBP proteins were expressed in E. coli with C-terminal intein fusions containing a chitin binding domain for affinity purification and short tryptophan-containing peptides (Fig. 45c). Indeed, these newly obtained mutants with solvent-accessible Trp residues showed a significant level of conjugation (more than 40%) under optimised non-denaturing conditions at room temperature.
Fig. 43 Selective labelling of the Trp side chain in the 8-mer peptide PTHIKWGD with malondialdehyde described by Foettinger et al. 446 The peptide structure was simulated using the RaptorX web server. 447
Fig. 44 Modification of horse heart myoglobin (pdb: 1YMB) with rhodium carbenoids described by Francis et al. 450 A 100 mM solution of myoglobin was exposed to a stabilised vinyl diazo precursor (10 mM) and Rh2(OAc)4 (100 mM) for 7 h. N- and C-derivatisation of the indole rings of both Trp residues, TRP-14 and TRP-7, was identified by mass reconstruction. An excess of hydroxylamine hydrochloride (75 mM) is crucial for the efficiency of the conjugation, although its mode of action was not elucidated.
In 2010, further developing the rhodium carbenoid methodology for selective tryptophan labelling described six years earlier by Francis et al., 453 Popp and Ball reported structure-selective modification of aromatic side chains (expanding the scope to include Tyr and Phe residues) using a proximity-driven approach (see Section 5 for details). 454 Structurally similar to Rh2(OAc)4, metallopeptide complexes with a dirhodium centre bound to two glutamate residues were envisioned to deliver the catalyst into close proximity of the reactive side chains by exploiting matched coiled-coil peptides 455 for molecular peptide-peptide recognition (Fig. 46).
By combining residue-selective chemistry with secondary-structure recognition, the authors provided a strategy for selective covalent modification of biomolecules. However, only simple diazo reagents without functional handles were used, in controlled environments and on model peptide substrates.
In the following year, the same group extended their initial studies to examine the reactivity of whole proteins in a complex, cell-like environment. 456 For this, the proximity-driven catalysis approach was applied to a recombinant maltose binding protein (MBP) fused with the 21-amino-acid tryptophan-containing coil (almost identical to the one used in the initial publication, Fig. 46). Directly after expression, the lysate was subjected to metallopeptide-catalysed biotinylation. A single band in Western blot analysis indicated highly selective modification of the coil-fused MBP, with no non-selective modification observed.

Histidine
Histidine (His, H) is the only amino acid with a pKa in the physiological range; it is hence often found in the active sites of enzymes and is of crucial importance in mechanisms where abstraction or donation of a proton is needed. Because of its pKa value, both the acid and base forms are present at physiological pH (Fig. 47). Most studies on the catalytic activity of enzymes and on protein-protein interactions involving histidine-containing active centres were done by measuring the influence of site-specific modifications of His residues on the activity of the macromolecule.
Fig. 46 Selective covalent labelling of the TRP-9 residue of the peptide QEISALEKWISALEQEISALEK with its complementary dirhodium metallopeptide KISALQKQKESALEQKISALKQ described by Popp and Ball. 454 The rhodium cluster, chelated by two glutamate residues, is brought close to the reactive Trp residue by self-assembly of the peptide coils, resulting in selective peptide modification on TRP-9. Peptide structures were simulated using the RaptorX web server. 447
Fig. 47 Tautomerisation equilibrium of the neutral imidazole side chain (base forms A and C) occurring through the acid form B. 457 Form A is somewhat favoured over C at neutral and acidic pH, while at basic pH form C is preferred. 458
The authors suggest that these probes can be used for specific labelling of His residues in proteins if milder reaction conditions (lower reaction temperature but longer reaction time) are used, but no example of such an application was given. Moreover, considering the known reactivity of epoxides with primary amines, thiolates, and hydroxyl groups, such selectivity towards histidine at physiological pH seems improbable. 461 An affinity-based labelling approach (see Section 4) based on epoxide opening was developed by Hamachi and collaborators for selective histidine labelling of bovine carbonic anhydrase II. 462,463 The labelling reagents investigated by the authors consist of at least three major fragments: (1) a benzenesulfonamide ligand that directs the reagent specifically to bCA, (2) a reactive electrophilic epoxide for protein labelling, and (3) an exchangeable hydrazone bond between the ligand and the epoxide group, which allows the ligand to be removed by hydrazone/oxime exchange and the enzymatic activity to be restored (Fig. 49a). Further developing their approach, 463 the authors added an iodophenyl or acetylene handle to the epoxide-containing fragment to enable further derivatisation of the conjugate by Suzuki coupling 464 or Huisgen cycloaddition 465 either before or after removal of the ligand from the active site of the enzyme (Fig. 49b).
1.5.2 Complexes with transition metals. The affinity of transition metal ions for histidine in aqueous solution has been known for decades. 466 Copper and nickel ions have the greatest affinity for histidine, a property most often used for protein purification by immobilised-metal affinity chromatography (IMAC), which exploits the synergetic coordination effect of oligo-histidine tags (see Section 4). Meanwhile, the histidine-specific iridium(III) probe for peptide labelling recently reported by Wang et al. is an excellent example of selectivity based on the exceptional coordination properties of only one or two His residues (Fig. 50). 467 Further exploiting the previously described iridium(III) complex, used by Wong and colleagues for luminescent labelling of histidine-rich proteins, 468 the authors showed its applicability for histidine labelling in cell-imaging studies.
Although counting the formation of coordination complexes among conjugation techniques would be stretching a point, they are nevertheless included in this survey because of the propensity of histidine for complexation and the increased stability of the resulting complexes towards decomposition.

1.5.3 Michael addition. Several examples of histidine-selective Michael addition to the carbon-carbon double bond of conjugated aldehydes (2-alkenals) were found during studies of the oxidative modification of proteins. 469,470 Even though alkenals are known to modify the other basic amino acid residues in proteins, 471 Zamora et al. achieved a high level of histidine ligation in bovine albumin by incubation in PBS buffer (pH 7.4) at 37 °C (Fig. 51); however, reaction of Lys residues was also observed. 470 Using similar conditions, incubation in phosphate buffer (pH 7.2) at 37 °C, Uchida and Stadtman 469 were able to selectively tag insulin (which contains no sulfhydryl groups) with 4-hydroxynon-2-enal. In both studies, the authors suggest that only His residues are modified, but definitive evidence on this point is absent. The conjugates obtained contain active aldehyde residues and represent examples of protein carbonylation, allowing their derivatisation with aldehyde-selective reagents. 472
1.5.4 Miscellaneous histidine-selective reagents. Some examples of selective histidine tagging by reagents that, in general, react more avidly with other nucleophilic residues have been reported to date. For instance, Pramanik and colleagues achieved predominant PEGylation of rh-interferon-α2A on histidine at mildly acidic pH with a classical amine PEGylation succinimidyl carbonyl precursor. 473 Another group reported selective reaction of His residues of D-amino acid oxidase with dansyl chloride in 0.05 M phosphate buffer at pH 6.6. 474 These reaction conditions resulted in virtually complete inactivation of the enzyme and in its complete reactivation after reaction with 0.5 M hydroxylamine (NH2OH). Such reactivation excludes reaction of primary amino groups, and amino acid analysis suggested that the reaction had not occurred with an oxygen nucleophile such as serine or tyrosine. 
Even α-halo carbonyl compounds (phenacyl bromides, α-halo carboxylic acids and amides), classically known as electrophiles selective towards thiols, were found to be histidine-selective on several substrates and under carefully tuned conditions. [475][476][477][478][479] Described by Pauly at the beginning of the 20th century, 480,481 the reaction with diazonium salts has been used only for colorimetric determination of His residues 482 and was not further elaborated for bioconjugation. Other classical examples of chemical modification of histidine are mainly reactions with pyrocarbonate, sulfonyl chlorides, sulfonic esters, phenacyl and acyl bromides, and activated esters. 229 These electrophiles react readily with other nucleophilic groups present in proteins (thiols, amines, alcohols, or guanidino groups) and require careful tuning of the reaction conditions to achieve sufficient selectivity. For instance, at low pH (generally < 6.0) these reactions are quite selective for histidine, as the main side reaction with the ε-amino group of lysine proceeds very slowly. 483 The reader should nonetheless be aware that these examples represent not a general rule but an exception to it. To avoid reactions of more nucleophilic functions, the His residue must be located in a unique microenvironment 476,477 or have an enhanced nucleophilic character, 478 but even in this case, prior modification of highly reactive Cys residues is often inevitable. 479

Tyrosine
Tyrosine (Tyr, Y) is an important amino acid residue that is known to be the active centre in many enzymes (notably tyrosine-specific protein kinases) and is used in signal transduction and cell signalling. 484,485 Occurring with intermediate to low frequency in native proteins, tyrosine is often considered an attractive target in bioconjugation, despite often being partially or completely buried owing to the amphiphilic nature of the phenolic group.
The reactivity of the tyrosyl moiety is easily influenced by its deprotonation, which is a function of the microenvironment inside the protein. All described methodologies take advantage of either the peculiar chemical properties of the electron-rich aromatic ring or the ease with which the tyrosine hydroxyl group is transformed into the highly reactive phenolate.
1.6.1 O-Derivatisation. Although O-acetylation of the tyrosyl residue with acetic anhydride and N-acetylimidazole is arguably the most widely used technique for tyrosine modification, 229 its application for conjugation is rather limited. Mainly, these limitations are due to low selectivity of acylation in the presence of other nucleophilic amino acid residues and modest stability of obtained conjugates.
An elegant approach, affinity labelling, overcomes the selectivity issues of tyrosine acylation by ligand-tethered directing of the reaction. In the case of tyrosine-selective modification, an acyl transfer catalyst is connected to a ligand with a high affinity for the target protein. 462,486 The acyl group activated by the anchored catalyst is brought to the binding pocket of the protein and is transferred to a nucleophilic Tyr residue in close proximity. Using this methodology, Hamachi et al. 486 demonstrated selective tagging of the Y51 residue of Congerin II (Fig. 52) using a suitable saccharide as a ligand for the target lectin (carbohydrate-binding protein) and DMAP (4-dimethylaminopyridine) as an acyl transfer catalyst. In a similar way, Broo and collaborators 487 demonstrated the possibility of site-specific acylation of a tyrosine residue situated in the active site of human glutathione transferase (hGST).
Miller and collaborators have shown that biotinylation with NHS esters (see Section 1.1.2) may result in preferential O-acylation of hydroxyl-containing residues (serine, threonine, and tyrosine, though the first two to a greater extent) when they are located two positions away from a histidine (i.e. in sequences His-AA-Tyr, where AA refers to any amino acid). 85,88 Several approaches for labelling involve the initial modification of tyrosine and subsequent conjugation of the intermediate obtained. For instance, ortho-nitration of tyrosine with tetranitromethane (TNM) 489 or peroxynitrite 490 gives o-nitrotyrosine, which can then be reduced by sodium dithionite (Na2S2O4) to form o-aminotyrosine.
Although much less reactive than aliphatic amines at neutral pH, the aromatic amine of o-aminotyrosine can react selectively with amine-reactive reagents at lower pH. 491,492 Namely, Nikov et al. 492 demonstrated that selective labelling of aminotyrosines is achievable in the presence of N-terminal and ε-amino groups of lysines by using an NHS-activated ester under particular reaction conditions (acetate buffer, pH 5.0, 2 hours). Exploiting the pKa difference between aminotyrosyl residues and other reactive groups in proteins (4.75 for aminotyrosine versus much higher values for N-terminal and side-chain amino groups, see Section 1.1) allows their selective labelling. The method was validated on model peptides and then applied to the modification of human serum albumin (Fig. 53).
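The pKa argument can be made semi-quantitative with a short calculation. The sketch below uses the Henderson-Hasselbalch relationship to compare the unprotonated (nucleophilic) fractions of the two amine types at the labelling pH. The pKa of 4.75 for aminotyrosine and the pH of 5.0 come from the text; the lysine ε-amino pKa of 10.5 is an assumed typical value, and equating reactivity with the free-amine fraction is a deliberate simplification that ignores differences in intrinsic nucleophilicity. All names are hypothetical.

```python
def free_amine_fraction(ph, pka):
    """Fraction of an amine present in its unprotonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PH_LABELLING = 5.0         # acetate buffer pH quoted in the text
PKA_AMINOTYROSINE = 4.75   # value quoted in the text
PKA_LYSINE_ASSUMED = 10.5  # assumption: typical lysine epsilon-amino pKa

aminotyrosine = free_amine_fraction(PH_LABELLING, PKA_AMINOTYROSINE)
lysine = free_amine_fraction(PH_LABELLING, PKA_LYSINE_ASSUMED)

print(f"aminotyrosine free amine: {aminotyrosine:.2f}")
print(f"lysine free amine:        {lysine:.1e}")
print(f"ratio:                    {aminotyrosine / lysine:.1e}")
```

Under these assumptions roughly two-thirds of the aminotyrosine amine is unprotonated at pH 5.0, against only a few parts per million of the lysine side-chain amine, a difference of about five orders of magnitude in available nucleophile.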
Although the reaction of TNM and peroxynitrite with proteins is reasonably specific for tyrosine, side reactions with histidine, methionine and tryptophan have been reported, as has oxidation of sulfhydryl groups. The latter appears to be the most common side reaction; it can result in disulfide bond formation and in oxidation products such as sulfone and sulfenic acid derivatives. As a general rule, it is assumed that nitration reagents react with Cys residues equally well at pH 6 and pH 8, whereas the reaction with tyrosine occurs at pH 8 but not at pH 6.
In a like manner, the phenol group of Tyr residues can first be ortho-formylated with chloroform in an alkaline medium to give a salicylaldehyde derivative, which then reacts with ortho-phenylenediamine derivatives to form fluorescent benzimidazoles as conjugation products (Fig. 54). 493,494 Further exploiting the methodology developed by Trost and Toste for selective O- and C-alkylation of phenols with π-allylpalladium complexes, 495,496 Francis et al. have demonstrated selective allylic alkylation of surface-exposed tyrosines of several full-size proteins. 497
1.6.2 O-Oxidative coupling. In 1995, Kodadek and co-workers 498 reported that Ni(II) complexed with the tripeptide Gly-Gly-His (GGH) was chemically activated by oxone (KHSO5) or magnesium monoperoxyphthalate (MMPP). A Ni(III) oxo intermediate is hypothesised to promote protein cross-linking. 499 Other peptides, notably His6, 500,501 or the entire ribonuclease A protein, 502 which can be incorporated into proteins of interest at the genetic level, have been shown to be effective ligands for nickel-catalysed oxidative cross-coupling.
The photoactivatable metal-catalysed version of tyrosine oxidation chemistry is significantly faster than that achieved with Ni(II)-peptide complexes. It has been widely exploited by Kodadek and co-workers for cross-linking of closely associated proteins. 503-505 These coupling reactions are hypothesised to occur through the addition of tyrosyl radicals to adjacent Tyr residues (Fig. 55). It is worth mentioning that in some cases nearby tryptophan and other nucleophilic side chains can also participate in oxidation. 503 For more details, the reader is directed to a review by Bonnafous and a publication by Francis, which provide an excellent overview of oxidative cross-linking techniques. 507,508
Fig. 53 [caption start missing] 492 Preferential nitration of TYR-138 (shown in magenta) and TYR-411 residues of HSA with peroxynitrite was achieved using the protocol described by Jiao and colleagues. 490 The aminotyrosine of the peptide 138 YIYEIARK 144, obtained after reduction with sodium dithionite followed by digestion of nitrated HSA with trypsin, was selectively modified with a cleavable biotin-containing reactant at pH 5.0.
The use of cerium(IV) ammonium nitrate (CAN), a classical one-electron oxidant, for chemoselective ligation on tyrosine was demonstrated by Francis et al. (Fig. 56). 509 After optimisation of the reaction conditions, the authors achieved modification of tyrosine-containing proteins in high yields at neutral pH and low substrate concentration, and applied this strategy to modify both native and introduced residues on proteins with polyethylene glycol (PEG) and small peptides, albeit with a concurrent reaction of Trp residues. 509 Notwithstanding the issue of specificity, photo-oxidation and oxidation techniques for tyrosine ligation continue to be of considerable interest for the study of protein-protein interactions, 510 mapping of multi-protein complexes, 511 and assembly of macromolecules. 512
1.6.3 Diazonium reagents.
The diazonium reaction of tyrosine has been of special interest ever since its introduction by Pauly in 1915. 481 In 1959, inspired by these pioneering efforts, Higgins and Harrington advanced the use of this methodology and tried to apply it to complex proteins. 513 The authors concluded that the reaction was not confined to tyrosine and emphasised its competitive nature and strong dependence on the relative concentrations of protein and diazonium salt. Moreover, the strongly acidic conditions generally required for the preparation of diazonium salts from anilines 514 are not compatible with pH-sensitive proteins. Together with the relative instability of diazonium salts and the need to prepare them just prior to use, these drawbacks have prevented the widespread use of this methodology.
Optimised conditions have nonetheless allowed its application to the selective modification of tyrosines on the surface of bacteriophage MS2, 515,516 the modification of the tobacco mosaic virus, 517 and direct conjugation on proteins. 518 Francis and co-workers have demonstrated that highly reactive diazonium salts (i.e. those containing electron-withdrawing groups) should be used in order to achieve efficient Tyr targeting and to avoid a concurrent reaction with Lys and His residues (see Section 1.5.4). 517 The formylbenzene diazonium hexafluorophosphate reagent recently described by Barbas et al. 519 is an elegant example of a stable, ready-to-use reagent for tyrosine labelling that introduces a bioorthogonal aldehyde tag suitable for subsequent modifications (Fig. 57).
1.6.4 Mannich-type reaction. Albeit with no control of selectivity, tyrosine conjugation via a Mannich-type cross-linking reaction was first reported by Fraenkel-Conrat and Olcott; 520 it proceeded through the reaction of tyrosines with imines formed in situ by condensation of lysine amino groups with formaldehyde. The reaction conditions, namely the need for high concentrations of formaldehyde and significant heating, limit the utility of this approach for the vast majority of biological applications.
The three-component Mannich-type methodology, involving the in situ reaction between a Tyr residue, an amine and formaldehyde, was revived more than 50 years later by Francis et al. 521 The authors demonstrated selective modification of tyrosine residues of α-chymotrypsinogen A under mild conditions (pH 6.5, 25-37 °C) and at low protein concentration (20-200 µM). However, 18 hours of incubation were needed to reach a reasonable level of tagging (66% in the case of fluorescent labelling, Fig. 58). The same group then used this reaction to incorporate synthetic peptides into full-sized proteins. 522 Despite the recognised selectivity issues of the three-component Mannich-type approach to tyrosine labelling, 523 its main advantage is the ease with which the participating partners can be varied: the aldehyde (Fig. 58, shown in blue) and the aniline (Fig. 58, shown in violet). In a subsequent publication, Francis et al. demonstrated the viability of NMR-based characterisation of a conjugate isotopically enriched by the use of 13C-formaldehyde in the coupling reaction. 523 Interestingly, while a reaction by-product arising from the tryptophan indole ring was revealed, Cys was found not to participate in the reaction, except in the case of a reduced disulfide, which formed a dithioacetal.
Using similar precursors, electron-rich aniline derivatives, Tanaka et al. 524 demonstrated the potential of imines obtained in situ as fluorogenic probes for tyrosine labelling. While the starting materials, as well as the imine derivatives, exhibited weak or no fluorescence, the addition products showed significantly higher (more than 100-fold) fluorescence.
In a subsequent study, the same group extended this approach to presynthesised cyclic imines, completely removing the need for an excess of highly reactive formaldehyde. 525 Although the authors clearly demonstrated the applicability of their methodology in water at room temperature over a wide pH range (pH 2-10) on a set of model phenols, no example of peptide or protein conjugation was given.
1.6.5 Dicarboxylates and dicarboxamides. As early as 1969, the reaction of electron-rich arenes with acyclic diazodicarboxylates was reported by Schroeter (Fig. 59a). 526 Numerous examples of electron-deficient diazodicarboxylates were established in further studies, which mainly focused on their synthetic usefulness for electrophilic amination in organic solvents in the presence of activating protic or Lewis acid additives. 527-534 However, these highly reactive reagents decompose rapidly in aqueous media, which makes them unsuitable for bioconjugation. 535 On the other hand, the corresponding diazodicarboxamide reagents are too stabilised and do not react with phenols in aqueous media. 535 Cyclic diazodicarboxamides such as 4-phenyl-3H-1,2,4-triazoline-3,5(4H)-dione (PTAD), recently reported by Barbas and collaborators, represent a good compromise between the reactivity and stability of diazodicarboxyl-containing reagents. 536,537 Diazodicarboxylate-mediated tyrosine conjugation is applicable over a wide pH range; however, the highest labelling efficiency was observed at pH 7-10. 536 A versatile class of stable PTAD precursors bearing different functional groups was developed and applied to selective tyrosine conjugation (Fig. 59b). Their use requires oxidation with 1,3-dibromo-5,5-dimethylhydantoin prior to use and the addition of a small amount of TRIS (2-amino-2-hydroxymethyl-propane-1,3-diol) during the conjugation step. The latter is of crucial importance for the selectivity of the coupling, as it is hypothesised to scavenge a putative isocyanate by-product of PTAD decomposition, which labels promiscuously.
The non-selective labelling of other aromatic side chains of proteins is the Achilles' heel of the vast majority of approaches described for tyrosine labelling. Careful tuning of the reaction conditions is important for achieving appropriate levels of selectivity. In cases where a purely chemical distinction between the reactivities of amino acid moieties is not feasible, catalysis on the basis of molecular shape rather than local environment can be used to induce selectivity. This concept is routinely exploited by enzymes and enables reactivity that would otherwise be kinetically impossible. In 2010, Popp and Ball used dirhodium metallopeptide catalysts for selective conjugation on tyrosine and tryptophan using a proximity-driven mechanism (see Section 5). 454 In the following year, Silverman et al. demonstrated a DNA-catalysed approach for selective labelling of tyrosine, although only on small peptide substrates. 538

Arginine
With a pKa value above 12, arginine (Arg, R) is mainly present in its protonated form in acidic, neutral, and even most basic environments. Effective delocalisation of the positive charge between the nitrogen lone pairs and the double bond favours the formation of hydrogen bonds 539,540 and makes the guanidinium side chain of arginine the least acidic cationic group among all 20 natural amino acids (Fig. 60). 541 However, the pKa value of arginine was found to vary significantly with the microenvironment within certain proteins, 542,543 allowing, in the terms of Leitner and Lindner, 544 the grouping of arginines into "exposed" and "partially buried" residues on the basis of their different reactivities.
Most of the described approaches for arginine labelling and modification exploit the chemistry of α-dicarbonyl compounds. For instance, phenylglyoxal, introduced by Takahashi 545 as an arginyl reagent, has been applied to the study of complex systems in the past decade. 546-550 The reaction occurs under mild conditions and consists of two steps: first, addition of phenylglyoxal results in the formation of a hydrolytically unstable imidazolidine diol; the second step then gives a relatively stable addition product (Fig. 61).
Substituted phenylglyoxal analogues, such as p-hydroxyphenylglyoxal, p-nitrophenylglyoxal, 4-hydroxy-3-nitrophenylglyoxal and p-azidophenylglyoxal (APG), were reported for spectrophotometric and cross-linking studies of arginine modification in proteins. 551-555 However, none of these linkers has been used in bond-forming conjugation. Because phenylglyoxal, like glyoxal, reacts with ε-amino groups at a significant rate, 545 many efforts were made to increase its selectivity towards the guanidinium residue. Cheung and Fonda studied the effect of buffers and pH on the reaction rate 556 and found that the reaction of arginine is greatly accelerated in bicarbonate-carbonate buffer systems, possibly due to stabilisation of the diol formed.
Vicinal diones, namely 2,3-butanedione (introduced by Yankeelov) 557,558 and 1,2-cyclohexanedione (introduced by Itano), 559 are other well-characterised reagents for the modification of Arg residues. The reaction proceeds through a pathway similar to that of phenylglyoxal addition. However, it was not until the observation that borate had a significant effect on the selectivity of the reaction that the use of these reagents became practical. 560,561 The presence of borate in the solution shifts the equilibrium of the addition to the guanidine moiety through stabilisation of the reversibly formed diol (Fig. 62).
In 2005, using this approach, Lindner and colleagues described a method for the selection of arginine-containing peptides from tryptic digests of model proteins (BSA, lysozyme, ovalbumin) by solid-phase capture and release. 562 First, the arginine-containing peptides present in the digest were covalently modified on the guanidine moiety with 2,3-butanedione and phenylboronic acid under alkaline conditions. Polymeric materials allowing the immobilisation of phenylboronic acid were then used to capture the arginine peptides on a solid support while all arginine-free peptides, which are not covalently bound, were washed away. Finally, the arginine peptides were released from the boronic acid beads owing to the reversibility of the reaction.
Photoactivatable bifunctional reagents for cross-linking of arginine moieties have been developed by Ngo et al. and Politz et al. to study enzymes with an arginine in their active sites. 555,563 Arginine-specific PEGylation of lysozyme using polyethylene glycols containing an α-oxo-aldehyde motif in borate buffer was recently reported by Gauthier and Klok 564 and represents a mild and selective method for protein modification (Fig. 63). The selectivities of other methods described to date 565,566 are not sufficient (especially in the presence of Lys moieties) for them to be considered suitable for bioconjugation.

Aspartic and glutamic acids
Carboxylic acid groups are found in proteins either at the C-terminus or as the side chains of Asp and Glu. Owing to the low reactivity of carboxylates in water, it is usually difficult to selectively conjugate proteins at these moieties. The carboxylic acid should thus generally be converted to a more reactive ester by means of so-called activating reagents. For more than half a century, carbodiimide-mediated activation has been the most extensively used methodology for the modification of free acids in proteins. 567,568 The reaction of carbodiimides with protonated carboxyl groups yields activated acylisoureas, which then react smoothly with a variety of nucleophiles, notably amines (Fig. 64). 569 It is important to use weakly basic amines, which remain deprotonated and thus reactive at pH below 8.0, to avoid the protein cross-linking that occurs at higher pH values. For this reason, weakly basic hydrazides are often the reagents of choice in coupling reactions with activated carboxylic acids. 570 Although water-insoluble carbodiimides (DCC, DIC) continue to be useful for acid-selective protein conjugation, 571,572 most current reports exploit water-soluble carbodiimides such as 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC). Developed by Sheehan and Hlavka, 573,574 these carbodiimides first proved especially useful as zero-length cross-linking reagents for studying proteins. 574 Subsequent studies were devoted to the application of carbodiimides to the quantitation of accessible carboxyl groups in proteins, 567,568,575 the preparation of antigenic conjugates, 576 and protein immobilisation. 577
As mentioned previously, the upper limit of the optimal pH for carboxylate conjugation is defined by the reactivity of the free amino groups present in the protein. The lower limit is mainly determined by the aqueous stability of the activating reagent. Borders and co-workers 578 studied the stability of EDC in aqueous solution.
It was found that EDC has a half-life (t1/2) of 37 hours (pH 7.0), 20 hours (pH 6.0), and 3.9 hours (pH 5.0) in 50 mM MES buffer at 25 °C; in the presence of 100 mM glycine, the t1/2 values were 15.8 hours, 6.7 hours, and 0.73 hours, respectively. This supports an optimal pH for acid-selective conjugation in the range from 6.0 to 7.0. NHS (or its water-soluble analogue sulfo-NHS) is often included in coupling protocols to improve efficiency or to create a more stable intermediate. Possible side reactions involving activating reagents were recently reviewed by Valeur and Bradley. 579
Woodward's reagent K (N-ethyl-5-phenylisoxazolium-3′-sulfonate) 580 and analogous compounds were used as activating reagents of carboxyl groups for synthetic purposes. Bodlaender et al. 581 used N-ethyl-5-phenylisoxazolium-3′-sulfonate and the N-alkyl derivatives of 5-phenylisoxazolium fluoroborate to activate carboxyl groups on trypsin for subsequent modification with methylamine or ethylamine.
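To illustrate what the EDC half-lives quoted above imply for a coupling experiment, the following minimal sketch converts them into the fraction of reagent remaining after a given time. It assumes simple first-order decay, which is an assumption made here for illustration and is not stated in the text; the 2-hour reaction time and all function names are hypothetical.

```python
import math

# Half-lives (hours) quoted in the text for EDC in 50 mM MES buffer at 25 °C.
HALF_LIFE_H = {7.0: 37.0, 6.0: 20.0, 5.0: 3.9}


def rate_constant(half_life_h: float) -> float:
    """First-order rate constant k = ln(2) / t_half, in 1/hour."""
    return math.log(2) / half_life_h


def fraction_remaining(half_life_h: float, t_h: float) -> float:
    """Fraction of reagent left after t_h hours, assuming first-order decay."""
    return math.exp(-rate_constant(half_life_h) * t_h)


if __name__ == "__main__":
    reaction_time_h = 2.0  # hypothetical coupling time, chosen for illustration
    for pH, t_half in sorted(HALF_LIFE_H.items()):
        left = fraction_remaining(t_half, reaction_time_h)
        print(f"pH {pH}: {left:.0%} of EDC left after {reaction_time_h} h")
```

Under this assumption most of the reagent survives a short coupling at pH 6.0-7.0, whereas a noticeably larger share is lost at pH 5.0, in line with the pH window recommended above.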
Lastly, several studies revealed unexpected examples of carboxyl group modification by reagents that usually react far more effectively with other nucleophiles. For instance, p-bromophenacyls and iodoacetamides have been found to selectively alkylate carboxylic acid moieties of pepsin and ribonuclease T1, respectively. 582-584 However, these reagents are not generally applicable and are appropriate for specific substrates only.

Methionine
Despite often being considered a rather simple target for chemical modification (mainly through oxidation and reaction with α-halo acetic acids and their derivatives), 585,586 only a handful of conjugation methodologies involving methionine (Met, M) have been described to date.
All approaches described in the literature exploit alkylation of Met residues in acidic media. Although many nucleophilic functional groups present in proteins can react with alkylating reagents, at low pH all of them except methionine exist in protonated forms, which greatly decreases their reactivity. 587 Consequently, alkylations of other nucleophilic functional groups, such as thiols, are commonly conducted at high pH, 400 while methionine is the only functional group in proteins able to react with alkylating reagents at low pH.
Building on pioneering studies by Toennies in the 1940s, 589,590 Kramer and Deming have recently reported a reversible chemoselective labelling of methionine in peptides and polypeptides. 588 Treatment of the model peptide (PHCRKM) with alkylating reagents of different structures in 0.2 M aqueous formic acid (pH 2.4) gave a single product in which only the Met residue was alkylated. The resulting sulfonium salts can readily be dealkylated by addition of pyridine-2(1H)-thione (PyS) to give the starting peptide as the sole product along with the alkylated PyS by-product. The dealkylation was found to be selective and can be carried out using concentrations of PyS that do not react with the disulfide bond in cystine under identical conditions (Fig. 65).

α-Amino groups
The most important thing to note about N-terminal amino groups is that they are the only primary amines in the protein structure that possess an adjacent amide bond in the α-position, which slightly influences their reactivity. Consequently, the majority of classical methodologies exploit this peculiarity of the N-terminus.
2.1.1 Classical approaches. As rather unpolarisable nucleophiles, amines react preferentially with hard electrophiles such as acid anhydrides and acyl halides. 591 The anchimeric influence of the adjacent amide bond consists in lowering the pKa value of the amine by electron withdrawal. Consequently, N-terminal amino groups can be distinguished from the ε-amino groups of Lys residues by working at a pH close to their pKa values, i.e. under slightly acidic conditions. Thus, virtually all methodologies described in Section 1.1 of this review are, to some extent, applicable to the selective labelling of N-terminal amino groups of proteins.
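The trade-off behind pH-controlled N-terminal selectivity can be sketched numerically. The example below is only an illustration under stated assumptions: the pKa values of 8.0 for the α-amino group and 10.5 for the lysine ε-amino group are assumed typical figures, not values from this review, and reactivity is taken to be proportional to the free-base fraction alone. All names are hypothetical.

```python
def free_base_fraction(pH: float, pKa: float) -> float:
    """Fraction of an amine that is unprotonated at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


# Both pKa values are illustrative assumptions, not data from this review.
PKA_N_TERMINUS = 8.0   # assumed alpha-amino group
PKA_LYSINE = 10.5      # assumed epsilon-amino group


def selectivity_ratio(pH: float) -> float:
    """Ratio of free-base N-terminus to free-base lysine, per site."""
    return (free_base_fraction(pH, PKA_N_TERMINUS)
            / free_base_fraction(pH, PKA_LYSINE))


if __name__ == "__main__":
    for pH in (5.5, 6.5, 7.5, 8.5, 9.5):
        f_n = free_base_fraction(pH, PKA_N_TERMINUS)
        print(f"pH {pH}: N-terminal free base {f_n:.3f}, "
              f"N-term/Lys ratio {selectivity_ratio(pH):.0f}")
```

Lowering the pH raises the per-site ratio towards its limit of 10^(pKa difference) but simultaneously reduces the absolute amount of reactive N-terminus, which slows the reaction. The ratio is also per site, so a protein carrying many lysines and a single N-terminus will show lower overall selectivity than the ratio suggests.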
In his classical work on acetylation of growth hormone, Reid 592 demonstrated for the first time that N-termini can be modified selectively if acetylation is performed with a relatively small amount of acetic anhydride. Further development of this approach has resulted in several protocols for selective labelling of α-amino groups of proteins, 3,593 peptides, 594,595 and proteomes. 596-598 Under optimal reaction conditions, namely a 5-fold excess of amine-reactive reagent in PBS (pH 6.5) at 4 °C, high levels of selectivity can be achieved after 2-24 hours of reaction. However, the preference for terminal amino groups achieved by pH control is rather limited, mainly because the pKa of the amino group varies with the microenvironment and reaction conditions. Consequently, more proficient methods that rely upon the increased chelating ability of N-termini 599 or the direct participation of adjacent side chains or of the peptide bond 600,601 were developed and are to date the preferred approaches for bioconjugation.
2.1.2 Ketene-mediated conjugation. A method for selective N-terminal modification of proteins by ketenes was introduced by Che and co-workers 602 and consisted in the N-terminal ligation of peptides through oxidative amide bond formation using the "[Mn(2,6-Cl2TPP)Cl]/alkyne/H2O2" system. Initially, the method was tested on a set of six peptides and proved applicable. However, inevitable oxidation of Cys and Met residues hindered the application of the protocol in bioconjugation. Only after conducting mechanistic studies of this approach did the authors realise that ketenes generated in situ were the key intermediates accounting for the reactivity; preparing them beforehand would therefore remove the need for an oxidant. In a subsequent publication, Che 603 introduced a general approach for the modification of N-terminal α-amino groups of a series of proteins and peptides using an isolated alkyne-functionalised ketene (Fig. 66). Interestingly, in contrast to classical approaches, increasing the pH of the reaction mixture did not significantly affect the N-terminal selectivity of the conjugation. For comparison, the ketene reagent was benchmarked side by side with the corresponding NHS ester. Remarkably, much poorer N-terminal selectivity was obtained for all studied peptide substrates when the NHS ester was used. The reason for this impressive specificity of ketenes, however, remains to be elucidated.
2.1.3 Transamination. It was after the successful conversion of glyoxyloyl groups into glycyl groups (see Fig. 75) 604 that Dixon and co-workers realised that if a terminal Gly residue can be made by transamination, then a terminal residue of any kind might be transformed into the corresponding carbonyl-containing residue by transamination. These introduced carbonyl groups are not naturally occurring functionalities in proteins and can therefore be used as unique loci for the attachment of synthetic groups through the formation of hydrazone or stable oxime bonds. 605,606 Inspired by the pioneering works of Metzler and Snell, 607 and Cennamo and collaborators 608,609 on the transamination of simple amino acids and peptides under harsh conditions (heating at 100 °C and pH 5.0), Dixon and Moret 610,611 developed a method for mild transamination in the presence of copper(II) salts, which allowed the reaction to proceed at room temperature (Fig. 67). The isomerisation of the imine generated in situ by a catalysed 1,3-proton shift is the key step of the transformation, which defines both its direction and the reaction rate. It is clearly the activation of the α-protons of the N-terminus by the adjacent peptide bond and the metal ions that makes this reaction specific to α-amino groups. Interestingly, Dixon reports that no traces of reaction of lysine side chains were ever observed. 600 Quite a wide range of reaction conditions has been tested since. The discovery that pyridine 612 and acetate 613 greatly accelerate the transamination of amino acids led to slightly milder reaction conditions, which, however, were still too harsh to maintain the folded structure of most proteins and were therefore more appropriate for sequence-analysis applications. 614
Fig. 65 Selective reversible modification of a 6-amino-acid peptide (PHCRKM) via methionine alkylation reported by Kramer and Deming. 588 No reactions with other amino acids were detected.
Only recently, Francis and co-workers 601 re-examined Cennamo's approach 608 to amino acid transamination in the presence of pyridoxal-5-phosphate (PLP, vitamin B6) at 100 °C and found that much milder reaction conditions suffice when modifying the N-terminal residues of peptides (65 °C for the reaction to be over in 2 hours; 25 °C to achieve complete conversion in 24 hours). Screening experiments on different N-terminal amino acids of the peptides indicated that the aldehyde structure strongly influenced the reaction efficiency. Amusingly, PLP, known for more than 50 years, emerged as the most effective aldehyde among the dozens screened, affording the highest yields under milder reaction conditions.
The mechanism of PLP-catalysed transamination is depicted in Fig. 68. The reaction of PLP with the N-terminal amine results in the formation of a Schiff base aldimine (a). Then the α-proton of the amino acid is transferred to the 4′ position of the pyridoxal unit (b and c). Finally, hydrolysis of the resulting ketimine leads to the desired α-keto acid and pyridoxamine phosphate (d). The quinoid species is an important intermediate in the transformation of aldimine to ketimine and is common to all transamination reagents described to date.
Under optimal reaction conditions (10-50 µM protein and 10-50 mM PLP at 37 °C in PB, pH 6-7), complete conversion is generally achieved after 2-24 hours. The resulting keto-proteins are generally rather stable and can be concentrated, stored, or lyophilised without any specific precautions. 615 An example of the transamination-conjugation methodology was given by Francis in the initial publication, with selective labelling of the N-terminal glycine residue of horse heart myoglobin and of enhanced green fluorescent protein (EGFP, Fig. 69). 601 Although the side chain of the N-terminal residue does not participate directly in the transamination mechanism, the reaction rates were found to vary significantly depending on the amino acid in the N-terminal position. 616 Generally, the majority of N-terminal amino acids provide high yields of the desired transaminated products; however, some residues (His, Trp, Lys, and Pro) generate adducts with PLP itself, while others are incompatible with the technique because of known side reactions (Ser, Thr, Cys and Trp) or complete inertness (Pro).
In an attempt to investigate the scope and limitations of the transamination reaction, Francis and collaborators prepared an 8000-member one-bead-one-sequence combinatorial peptide library in which the three N-terminal residues were varied. 617 Interestingly, the Ala-Lys (AK) motif was found to be especially favourable for the transamination yield (the Lys residue is hypothesised to accelerate the 1,3-proton shift of the isomerisation step by acting as a general base). To demonstrate this, labelling of the Type III "Antifreeze" Protein and of its mutant presenting the AKT sequence at the N-terminus were benchmarked side by side. At every time point analysed, the AKT terminus outperformed the wild-type one (GNQ) at different concentrations of PLP.
Although the mild reaction conditions of PLP-mediated transamination render it amenable to the modification of intact proteins, 618 the yields are generally not high and elevated temperatures are usually required, which largely limits the practical applicability of the approach. Given this situation, Francis and co-workers 619 used the above-described combinatorial approach to identify other transamination agents. As a result, N-methylpyridinium-4-carboxaldehyde benzenesulfonate salt (Rapoport's salt, RS, Fig. 70) was identified as a highly effective alternative to PLP. Furthermore, it was found to be particularly efficient for glutamate-rich sequences. 619,620 The fact that several antibody isotypes possess at least one chain with an N-terminal glutamate makes RS particularly suitable for their selective conjugation.
Fig. 67 General scheme of the transamination reaction activated by metal ions (typically Cu2+ and Ni2+). 610 Principal steps of the reaction mechanism: (a) generation of the imine; (b) isomerisation of the obtained imine by proton removal; (c) hydrolysis of the isomeric imine to generate the transaminated reaction partners.
Fig. 68 Structure of PLP and mechanism of PLP-mediated transamination. The reaction pathway consists of (a) condensation between pyridoxal and the amine; (b and c) tautomerisation of the obtained aldimine, which is favourable because of the much lower intrinsic pKa value of the α-proton (shown in blue); (d) hydrolysis of the resulting ketimine, accompanied by decarboxylation in the case of aspartic acid (R = -CH2COOH). 601
Fig. 69 Site-specific N-terminal labelling of EGFP (pdb: 2Y0G). Proteins possessing N-terminal carbonyl groups obtained by PLP-mediated transamination in the first step were labelled with hydroxylamine probes in the second step. 601
Remarkably, the difference in transamination rates between Glu- and non-Glu-terminal polypeptides was large enough to allow selective labelling of only the heavy chain of immunoglobulin G1 (containing the N-terminal Glu residue), while leaving the light chain unmodified. This was attributed to the higher steric hindrance of the already less reactive substrate (N-terminal Asp). To confirm these results, an IgG1 mutant possessing the Asp-Asp-Ser sequence on both chains was prepared. Indeed, it underwent modification of both sites when exposed to RS. Addressing another recognised drawback of the Francis methodology, namely low efficiency for bulky N-terminal amino acids (Leu, Ile, Val), Zhang et al. 621 developed an efficient PLP analogue, FHMDP (Fig. 70), which showed much higher efficiency in their transamination.
The above-mentioned transformations are just a few examples from the rapidly growing field of transaminative modification of proteins. Recent advances have also resulted in general approaches to protein immobilisation (Fig. 71), 15,622 dual fluorescent modification of periplasmic solute-binding proteins, 623 protein PEGylation and PEG-like conjugation (e.g. OEGMAtion), 624 preparation of phage conjugates, 625,626 N-terminal proteomics, 627,628 and Wittig 629 and Pictet-Spengler ligations on transaminated proteins. 630,631
2.1.4 2-Pyridinecarboxyaldehydes (2PCA). A promising approach for one-step N-terminal selective modification of proteins using 2-pyridinecarboxyaldehyde (2PCA) derivatives was recently reported by Francis and co-workers. 632 Owing to its structural similarity to pyridoxal-5-phosphate and Rapoport's salt, 2PCA was discovered by chance by the authors during a screening of various aldehyde reagents for their reactivity in the transamination reaction (see Section 2.1.3).
Rather than demonstrating any ability to transaminate, 2PCA exclusively showed conversion to a pair of cyclic imidazolidinone diastereomers upon reaction with model peptides. The key step of the reaction mechanism is a nucleophilic attack of the adjacent amide nitrogen on the electrophilic carbon of the initially formed N-terminal imine (Fig. 72a). The presence of nitrogen heterocycles (namely pyridines) was found to be crucial for the efficiency of this stereoelectronically disfavoured condensation. It is noteworthy that Lys residues are unreactive in such a pathway, because they lack a neighbouring amide group suitable for cyclisation and have higher pKa values than α-amino groups. Furthermore, this methodology was also found to be compatible with the presence of free in-chain cysteines. This approach can therefore be considered orthogonal to other Lys-selective (see Section 1.1) and Cys-selective methodologies (see Section 1.3).
2PCA-mediated conjugation was found to be generally applicable to protein labelling (the authors demonstrated its application on a broad set of 12 different proteins including RNase A; Fig. 72b), except for N-acylated proteins (no imine formation) and peptides containing proline in position 2 (no cyclisation of the formed N-terminal imine). 632 The resulting imidazolidinone-containing conjugates are generally moderately stable and decompose to the starting protein by about 20-30% after 12 hours of incubation at 37 °C (in the case of RNase A). This may limit the suitability of this methodology for applications where the stability of the generated conjugates is crucial; however, substrate variation could resolve the issue and such efforts are underway.
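To put the reported stability figure into perspective, the short sketch below converts "20-30% decomposition after 12 hours" into an approximate half-life. It assumes simple first-order decay, which is an assumption made here for illustration and is not stated in the source; the function name is likewise illustrative.

```python
import math


def first_order_half_life(fraction_lost: float, hours: float) -> float:
    """Half-life in hours, ASSUMING first-order decay (illustrative assumption).

    fraction_lost: fraction of conjugate decomposed after `hours` (0 < f < 1).
    """
    if not 0.0 < fraction_lost < 1.0:
        raise ValueError("fraction_lost must lie strictly between 0 and 1")
    # First-order decay: remaining = exp(-k * t)  =>  k = -ln(remaining) / t
    rate_constant = -math.log(1.0 - fraction_lost) / hours
    return math.log(2.0) / rate_constant


if __name__ == "__main__":
    # 20-30% loss after 12 h at 37 degrees C, as quoted for RNase A conjugates
    for lost in (0.20, 0.30):
        print(f"{lost:.0%} lost in 12 h -> half-life ~ {first_order_half_life(lost, 12.0):.1f} h")
```

Under this assumption the conjugate half-life would fall roughly between one and one and a half days, which illustrates why long incubations could be problematic.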

Serine and threonine
In 1960 Waller and Dixon described the first procedure for selective N-terminal modification of peptides. It consisted of the preparation of corticotrophin selectively acetylated on its terminal serine. 633 Although possible only under the highly denaturing conditions of alkali exposure, the approach allowed for a spontaneous intramolecular O→N acyl transfer 634 on the N-terminal Ser residues, while the O-acetyl groups of in-chain Ser, Thr and Tyr residues were hydrolysed. The general idea of overcoming the entropy barrier of an otherwise slow intermolecular process by bringing the two reacting partners together through a covalent linkage, so as to initiate an intramolecular reaction, was later developed into a large variety of selective modification reactions.
2.2.1 O→N shift of oxazolidines. Thirty-five years after the early publication of Waller and Dixon, 633 Liu and Tam [635][636][637] extended the applicability of O→N acyl transfer mediated methodologies by developing an approach to the chemical ligation of unprotected peptide segments bearing N-terminal serine, threonine, or cysteine, with no need for protecting groups. To make their methodology widely applicable, the authors proposed a general way to introduce an aldehyde function onto the C-terminus by enzymatic coupling of a masked aldehyde, followed by chemical hydrolysis of the obtained intermediate. 635 The key step of the process is the formation of the peptide bond through an intramolecular rearrangement between the two closely neighbouring carboxyl and secondary amino groups of the oxazolidine formed (when Ser and Thr residues are involved) or of the thiazolidine (in the case of a Cys residue, see Section 2.3.2) (Fig. 73). However, the reaction was clean, with no side products observed, only when a Cys-peptide was involved. In contrast, when Thr- and Ser-peptides were used, refinement was required to ensure better yields. 636 Although this methodology demonstrated high potential for the ligation of unprotected peptides, 635,638,639 the generation of a "non-native" heterocyclic fragment at the ligation site led to its almost complete displacement by other "native ligation" approaches, notably the mechanistically similar native chemical ligation (NCL) (see Section 2.3.1).

Periodate oxidation.
A mild version of serine and threonine modification involves their prior conversion into glyoxyloyl derivatives via periodate oxidation, first described by Fields and Dixon 640 and later transformed into a general method for site-directed modification of proteins with N-terminal Ser or Thr by Geoghegan and Stroh. 641 Based on periodic acid mediated oxidation, 642 the reaction occurs only when a target site exists with which the periodate can form a cyclic intermediate, that is to say, when the N-terminal residue is a serine or a threonine, or when hydroxylysine (rarely occurring in proteins) is present. Possible side reactions include oxidation of the side chains of Met, Trp, and His. However, the potential for side reactions can be diminished by using very low periodate-to-protein molar ratios, as demonstrated by Geoghegan and Stroh in their experiments on two model peptides, SIGSLAK and SYSMEHFRWG, and on recombinant murine interleukin-1α (an 18 kDa cytokine with N-terminal Ser, Fig. 74), 641 or by conducting the oxidation at neutral pH. 643 As in the initial publication of Geoghegan and Stroh, 641 the obtained glyoxyloyl group can serve as the locus for further chemical modification involving aldehyde-selective reactions (e.g. through the formation of stable oxime, hydrazone or the previously described oxazolidine moieties). 605,606 Robin and colleagues demonstrated the possibility of using this two-step methodology for assembling two unprotected protein fragments: one oxidised to a glyoxyloyl-containing derivative and the other a hydrazide peptide derivative. 644 Rose and co-workers 645 exploited the reactivity of the generated glyoxyloyls towards O-alkyl hydroxylamine derivatives to synthesise a pentameric form of the cholera toxin subunit B.
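The selection rule described above (an N-terminal Ser or Thr is required, while Met, Trp and His side chains are potential sites of side oxidation) can be expressed as a small sequence check. This is only a sketch using one-letter codes; the function names are illustrative, hydroxylysine is not representable in the standard one-letter alphabet and is therefore ignored, and the check says nothing about actual yields.

```python
# Residues whose side chains are named in the text as prone to side oxidation
OXIDATION_SENSITIVE = {"M", "W", "H"}


def has_periodate_target(sequence: str) -> bool:
    """True if the N-terminal residue is Ser (S) or Thr (T)."""
    return bool(sequence) and sequence[0] in ("S", "T")


def oxidation_sensitive_residues(sequence: str) -> list:
    """Sorted list of Met/Trp/His one-letter codes present in the sequence."""
    return sorted(set(sequence) & OXIDATION_SENSITIVE)


if __name__ == "__main__":
    # The two model peptides quoted from Geoghegan and Stroh
    for peptide in ("SIGSLAK", "SYSMEHFRWG"):
        print(peptide, has_periodate_target(peptide), oxidation_sensitive_residues(peptide))
```

For the two model peptides, both carry the required N-terminal Ser, but only SYSMEHFRWG contains residues for which a low periodate-to-protein ratio would matter.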
Further investigation of periodate oxidation extended its use to site-selective tagging, PEGylation, preparation of protein conjugates, protein capture and synthesis of large protein dendrimers. 643,[646][647][648][649][650][651] It is worth mentioning that periodate oxidation is incompatible with a number of protein classes. For instance, glycoproteins will undergo periodate-based cleavage of polysaccharide chains as a side reaction pathway. 652 Lastly, glyoxyloyls can easily be transformed into the corresponding amines via a transamination reaction in the presence of copper(II) or nickel(II) salts. 604,610,653 The reaction mechanism, as well as the need for both essential components of the system (the acceptor of the glyoxyloyl, usually aspartic acid or glycine, and the heavy metal cation), is explained in Fig. 75. Despite being of only moderate interest for bioconjugation by itself, this approach initiated the development of a more general methodology for selective N-terminus modification: transaminative conjugation (Section 2.1.3). The reader is directed to a recent review by El Mahdi and Melnyk 654 for a complete overview of glyoxyloyl transformations in bioconjugation.

Phosphate-assisted ligation.
A conceptually appealing phosphate-assisted ligation at serine and threonine was recently reported by Payne and Thomas. 655 The inherent reactivity of an N-terminal phosphorylated Ser or Thr residue was demonstrated to significantly facilitate amide bond formation with a range of C-terminal peptide thioesters. Although it is not yet clear exactly which intermediate is formed during ligation, the authors hypothesised that rapid acyl migration to the N-terminal amine of a peptide occurs through the formation of an unstable acyl phosphate (Fig. 76).
[Fig. 73, beginning of caption missing] ... Tam. 635 A model 50-residue peptide was obtained in good yield in the ligation reaction between a 32-mer peptide VVSHFNDCPDSHTQFEFHGTCRFLV-QEDKPAR containing a C-terminal aldehyde function and a 17-mer peptide CHSGYVGARC(Ac-m)EHADLLA containing an N-terminal cysteine; in the case of Thr- and Ser-peptides, the reaction was moderately efficient. The peptide structures were simulated using the RaptorX web server. 447
Fig. 74 N-terminal serine labelling of recombinant murine interleukin-1α (pdb: 2KKI) with Lucifer yellow dye described by Geoghegan and Stroh. 641 The method is also applicable if an N-terminal threonine is present.
2.2.4 Indirect approaches. Ligation at Ser/Thr can also be achieved by two distinct indirect approaches. Firstly, as demonstrated by Danishefsky and co-workers, 656 the NCL-desulfurisation methodology can be used to access threonine at ligation sites (see Section 2.3.1.4). Secondly, as an alternative to the NCL-desulfurisation sequence, the cysteine obtained during NCL can be chemically transformed into serine by methylation followed by activation of the obtained S-methylcysteine with cyanogen bromide (BrCN, Fig. 77). 657

Cysteine
Generation of an N-terminal Cys residue for native chemical ligation can be accomplished using solid-phase peptide synthesis, 659 proteolytic processing, 660 or the spontaneous hydrolysis of an intein fusion protein. 661 Moreover, genetically directed, site-specific incorporation of 1,2-aminothiol handles into proteins has recently been reported by Chin and associates. 662
2.3.1 Native chemical ligation. The very principle of "chemical ligation" was coined by Kent in the early 1990s and consisted in an approach for the covalent condensation of unprotected peptide segments by means of "unique, mutually reactive functionalities designed to react only with each other and not with any of the functional groups found in peptides". 663 That is, a general method that would enable the application of chemical tools to the world of proteins. However, the original ligation chemistries exploited the reciprocal reactivity of chemical functions that are not present in native proteins; their prior introduction onto the reacting partners is thus required and is often associated with synthetic difficulties. Moreover, a non-native linkage is generated at the ligation site; therefore, many scientists remained sceptical about the validity of using such "analogous" proteins as tools for understanding the molecular basis of protein function.
In 1994, confronted with this criticism, Kent and co-workers 664 introduced a versatile approach to the linkage of peptide fragments through a native peptide bond: native chemical ligation (NCL). Based on the original principles of the chemical ligation methodology 663 and on the ability of thioesters to undergo an S→N acyl shift discovered by Wieland et al., 665 NCL made it possible to achieve chemoselective formation of the amide bond in the presence of unprotected nucleophilic amino acid side chains such as alcohols [...]
By analogy with the previously developed O→N acyl shift, the reversibility of the thioester-thiol exchange in the presence of an exogenous thiol additive, coupled with the capture of the acyl segment by an S→N acyl shift (possible only when the acyl segment is brought into close proximity to an amine in an N-terminal cysteine thioester intermediate), results in the exquisite regioselectivity of this methodology. The product resulting from this S→N acyl shift is a peptide consisting of two fragments linked by a native peptide bond through a cysteine residue (Fig. 78).
Typically, the reaction, performed in PS or PBS buffer (pH 7.0-8.5) at 37 °C, is complete in less than an hour and gives high yields. 666,667 Solubilising agents such as guanidine hydrochloride or urea do not interfere with the ligation and are usually used to enhance the concentration of peptide segments, and thus increase the reaction rate. It is important to prevent oxidation of the N-terminal cysteine thiolate to a disulfide-linked dimer, which is unreactive in the ligation.
A reductant (e.g. TCEP) or an excess of the thiol corresponding to the thioester leaving group (4-5%, vol/vol) is generally added to keep the Cys residues in reduced form. Moreover, the latter largely increases the overall rate of NCL by reversing the first step of transesterification for in-chain intermediate adducts, which cannot undergo an S→N acyl shift to generate a stable amide bond.
The first step in synthesising a protein by NCL generally consists in defining the fragments to be used in the ligation reactions. Preferentially, naturally occurring AA-Cys motifs in the native sequence should be chosen as the ligation sites (AA stands for any amino acid). Val, Ile, Asp, Asn, Glu, Gln and Pro represent less favourable choices, because of lower ligation rates and possible side reactions; 666 these ligations can, however, be accelerated either by transformation of the corresponding thioesters into selenoesters, 668 or by tuning the reaction pH. 669 Higher reaction rates were reported to be achievable when using good thiol-containing leaving groups, i.e. mildly acidic thiols such as thiophenol, 4-(carboxymethyl)thiophenol (MPAA), or 5-thio-2-nitrobenzoic acid (TNB, the reduced form of Ellman's reagent). These are generally generated in situ by thiol-thioester exchange from the relatively unreactive peptide-(αCOSCH2CH2CO)-Leu alkylthioester by adding an excess thereof (1-5%, vol/vol). 666
2.3.1.1 Sequential NCL. The complexity of proteins that can be obtained by the NCL technique is limited by the maximum size of the accessible synthetic peptide segments. The two main approaches used today for their preparation are solid-phase peptide synthesis pioneered by Merrifield 659 and recombinant DNA expression elaborated by Lobban et al. 670 and Jackson et al. 671 These allow for the preparation of peptide fragments containing up to 50 and up to 150 amino acid residues, respectively. Sequential native chemical ligation extends this limit further by means of peptides protected at the N-terminal cysteine. Three polypeptide fragments, namely a peptide 1 -COSR Ar (N-terminal fragment), a protected PG-Cys-peptide 2 -COSR Ar (middle fragment), and a Cys-peptide 3 (C-terminal fragment), are thus assembled in a one-pot three-step synthesis.
Firstly, the middle fragment and the C-terminal fragment are ligated under the classical reaction conditions of NCL. Then the protecting group is removed, uncovering the N-terminal cysteine of the obtained polypeptide (middle plus C-terminal fragment), which undergoes the second NCL with the N-terminal fragment to give the target protein (Fig. 79).
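The fragment-definition step described above can be sketched as a simple scan for AA-Cys motifs, flagging those where the preceding residue is one of the less favourable choices named in the text (Val, Ile, Asp, Asn, Glu, Gln, Pro). The sequence used here is a made-up toy example, and the function names are illustrative; a real ligation plan would also weigh fragment length, solubility and thioester accessibility, none of which is modelled.

```python
# Residues named in the text as less favourable when they precede the Cys
LESS_FAVOURABLE = set("VIDNEQP")


def cys_ligation_sites(sequence: str) -> list:
    """Return (index, preceding_residue, favourable) for every non-N-terminal Cys."""
    sites = []
    for index, residue in enumerate(sequence):
        if residue == "C" and index > 0:
            previous = sequence[index - 1]
            sites.append((index, previous, previous not in LESS_FAVOURABLE))
    return sites


def split_at_sites(sequence: str, cut_indices: list) -> list:
    """Cut the sequence immediately before each chosen Cys index.

    Every fragment after the first then starts with Cys, as NCL requires.
    """
    bounds = [0] + sorted(cut_indices) + [len(sequence)]
    return [sequence[start:end] for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy_sequence = "MKTAYCGLVCSDPCQRAW"  # hypothetical sequence, for illustration only
    for site in cys_ligation_sites(toy_sequence):
        print(site)
    # Three fragments, as in a sequential (two-ligation) assembly
    print(split_at_sites(toy_sequence, [5, 13]))
```

In the toy sequence only the Tyr-Cys junction is flagged as favourable; cutting at two sites yields an N-terminal fragment, a middle fragment and a C-terminal fragment, mirroring the three-segment scheme of sequential NCL.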
Since its introduction, sequential native chemical ligation has demonstrated its general applicability to the preparation of various complex assemblies. Consequently, several NCL-compatible methodologies for the protection of N-terminal Cys residues have been elaborated. The most relevant among them are listed in Table 1.
The reader is referred to a recent review by Melnyk and collaborators 673 for a more complete overview of sequential ligation strategies on proteins.
[...] nature was exploited by Bang et al. 675 to introduce a convergent strategy for the synthesis of native peptides: kinetically controlled ligation (KCL).
The fact that NCL with alkylthioesters is significantly slower than with arylthioesters makes it possible to control the intrinsic dual reactivity of a bifunctional Cys-peptide 2 -COSR Alk so that it selectively reacts with a peptide 1 -COSR Ar and then undergoes a classical NCL (i.e. in the presence of an exogenous aryl thiol) with a third Cys-peptide 3 to yield an assembled peptide 1 -Cys-peptide 2 -Cys-peptide 3 with no need for protecting groups. Bang and collaborators applied this methodology to assemble the 46-residue protein crambin from six peptide fragments (Fig. 80).
Further advancing the pioneering work of Botti et al. 686 on in situ acyl migration, the KCL methodology has recently been extended from alkylthioesters to a full class of O-esters that spontaneously transform into a thioester when exposed to a reducing agent, through disulfide bond reduction followed by an O→S acyl shift (Fig. 81). 687 Through a thorough investigation, Zheng et al. 687 established that the structures of the O-esters have an important effect on their reactivity. The authors benchmarked their methodology side by side with the previously described KCL by synthesising the same V15A crambin (Fig. 80) through a one-pot, one-step condensation of peptide segments, and found it applicable to this system. Readily available by Fmoc solid-phase synthesis, these O-ester scaffolds can expand the applicability of NCL to substrates with hardly accessible thioester peptide fragments.
Fig. 79 Synthesis of insulin-like growth factor 1 (IGF-1) via sequential native chemical ligation described by Sohma et al. 672 The reversible protection of the α-amino group of the central peptide fragment IGF prevents its self-reaction with the α-thioester moiety present in the same molecule. The thiazolidine protecting group can be easily removed by brief treatment with NH2OMe·HCl at pH 4. NCL reaction conditions used: PB (pH 6.7), 6 M GndCl, 10 mM MPAA, 20 mM TCEP.
Fig. 80 Two final steps of the synthesis of V15A crambin (pdb: 3NIR) described by Bang and associates. 675 The mutation was introduced to simplify the prior KCL step of assembling the second peptide. Kinetically controlled ligation of the S Ar thioester occurs spontaneously in aqueous media in the absence of exogenous thiophenol, while native chemical ligation of the S Alk thioester must be accelerated by the addition of 1% PhSH.
2.3.1.3 Access to C-terminal peptide thioesters
Classical approaches. The preparation of the C-terminal α-thioesters involved in native chemical ligation is often associated with synthetic difficulties. Being especially reactive species, they either have to be introduced at the end of the synthetic pathway or be kept in the hidden form of thioester surrogates possessing higher stability.
Despite its recognised drawbacks due to hazardous acid treatment often leading to undesired side-reactions, the protocol of in situ neutralisation for Boc-based solid-phase peptide synthesis represents the most effective approach for the preparation of peptidyl thioesters. 666,[688][689][690] Alternatively, the Fmoc synthesis approach was investigated 691-693 and found to be favoured when synthesizing phospho-and glycopeptides.
Expressed protein ligation (EPL). Introduced by Muir et al. in 1998, 694 EPL represents another approach for the preparation of α-thioesters. It provides the recombinant protein thioester by thiolysis of an intein fusion protein and thus opens a large pool of established recombinant protein techniques to NCL. The reader is directed to several recent reviews of this area for more details. 695,696
X→S acyl transfer. An elegant approach for keeping a thioester in a relatively inert "hidden" form, ready to be uncovered when required, was first introduced by Danishefsky and colleagues in 2004 (Fig. 82). 697 Several years later, the name "crypted thioesters" was coined by Otaka 698 as a general term for such compounds. This approach is of particular interest because it enables the assembly of peptide segments in the N-to-C direction, which is rather rare and often difficult to achieve. 673 Indeed, all the above-described methodologies rely on assembling the polypeptide chain in the C-to-N direction: for instance, sequential native chemical ligation per se consists of an iterated cycle of ligation-deprotection-ligation, i.e. adding new peptide fragments onto N-termini after their deprotection (see Section 2.3.1.1).
Synthesis of thioesters in situ from stable amides via N→S acyl transfer was demonstrated by Ohta et al., 699 who studied acylated oxazolidinones derived from S-protected cysteine. These possess a distorted amide planarity, provoking so-called ground-state destabilisation 700 and, as a consequence, favouring acyl migration to a deprotected thiolate. Nakahara and collaborators studied and elaborated two classes of secondary amides amenable to the N→S acyl shift at low pH values: 5-mercaptomethyl prolines 701 and N-alkyl cysteamides (Fig. 83a). 702 Oxazolidinones 699 and N-sulfanylethylanilides (SEAlides) 703 described by Otaka and collaborators were found to possess a similar aptitude towards N→S acyl transfer at low pH (for a complete overview of N→S acyl-transfer systems described before 2010 see the review by Kang and Macmillan). 704 Erlich et al. 705 recently applied the N-alkyl cysteamide-based approach to the synthesis of a 76-residue ubiquitin thioester, while Otaka and collaborators demonstrated the high potential of SEAlides by conducting the chemical synthesis of the 162-residue active glycosylated GM2-activator protein. 706 Almost simultaneously, two research groups reported a general approach based on the application of bis(2-sulfanylethyl)amides (SAM) as precursors for NCL. [707][708][709][710] An interesting extension of the methodology that allows the reactivity of SAM to be switched, the so-called SEA on/off system, has been further elaborated by Melnyk and collaborators. 711 The transition between the reactive (SEA on) and unreactive (SEA off) states is simply triggered by mild oxidation/reduction procedures (Fig. 83b). SEA off can be easily switched on via TCEP reduction, while the reverse switching off is achieved by mild oxidation with iodine. After a few seconds, the excess of iodine is decomposed by the addition of dithiothreitol (DTT).
Other amino acid residues susceptible to oxidation, such as methionine or tryptophan, are not affected because of this very short exposure to the oxidant. However, cysteine residues must be protected with tert-butylsulfenyl groups to remain unaffected. At low pH values these are not reducible by DTT, thus allowing reliable protection of cysteines during the cycles of oxidative-reductive SAM triggering.
[...] 712 have introduced an approach that, for the first time, allowed NCL to be conducted virtually at alanine. While the ligation still occurred at an N-terminal cysteine, its subsequent desulfurisation with freshly prepared Raney nickel produced the native target sequence containing an alanine residue at the ligation site (Fig. 84). This strategy has inspired the development of a large pool of ligation junctions that includes phenylalanine, 713,714 glycine, 715 valine, 716,717 leucine, 674,718 threonine, 656 serine, 657 proline, 719-721 aspartate, 722 glutamine, 723 homocysteine, 724,725 methionine (by subsequent S-methylation of the ligated homocysteinyl product by p-nitrobenzenesulfonate), 724 arginine, 726 and lysine (α- or ε-selective) ligations. 727,728 In order to extend the native chemical ligation-desulfurisation approach to these amino acids, an SH group should be attached to the carbon atom situated in the β-position relative to the amino group (in some cases in the γ-position). The new building block containing a β-mercapto-α-amino or γ-mercapto-α-amino fragment is then introduced at the N-terminal position during solid-phase peptide synthesis or by means of recombinant DNA technologies. 714 The subsequent ligation is conducted under classical NCL conditions: at pH 7.5-8.0 in the presence of TCEP as reducing agent and 1% of exogenous thiol additive. Finally, desulfurisation gives the nascent residue of interest (Table 2).
Besides the aforementioned reduction on Raney nickel, various milder conditions such as nickel boride, 712 Pd/Al2O3, 729 or metal-free conditions 730,731 were developed to achieve efficient desulfurisation. More recently, an in situ ligation-desulfurisation approach was also reported. 732,733 Ever since its introduction, the desulfurisation-based extension of NCL towards other amino acid junctions has contributed in many ways to the preparation of proteins and post-translationally modified analogues for biochemical and structural analyses. However, despite this broad utility, carrying out desulfurisation of the linkage site in the presence of other Cys residues in the protein sequence usually requires the use of protecting groups. 729,734 Not only does this necessity add an undesirable step to protein synthesis, but it also imposes some limitations on the applicability of the approach, mainly due to solubility issues. The protocol for selective reduction of selenocysteines (Sec, U) in the presence of cysteine, developed by Dawson and collaborators, 735 made it possible to bypass the need for protection of thiolates and expanded the already established field of Sec-mediated native chemical ligations (see Section 2.6). [736][737][738][739] The sensitivity of Sec peptides to reduction was noted in several works on selenocysteine ligations. 737,738,740 During their preceding work on the synthesis of seleno-glutaredoxin 3 analogues (Se-Grx3), Dawson and associates 741 observed this incompatibility of Sec-containing proteins and peptides with TCEP-assisted native chemical ligation, due to the generation of significant levels of a deselenised side product. In the subsequent publication, the authors successfully applied the ligation-deselenisation strategy to a model peptide system. Accordingly, N-terminal Sec-peptide 1 (UGLEFRSI-amide), prepared in the form of a diselenide dimer, was ligated to the thioester peptide 2 (Ac-LYRAG-SR) (Fig. 85) to produce the deselenised alanyl-peptide (Ac-LYRAGAGLEFRSI-amide) after treatment with a 50-fold excess of TCEP at pH 5.5. Importantly, an excess of 200 mM 4-mercaptophenylacetic acid (MPAA) was needed for the ligation step. MPAA serves both as a catalyst to activate the alkyl thioester and as a mild reducing agent to generate a small pool of free selenol that facilitates the ligation reaction.
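At the sequence level, the ligation-deselenisation example above amounts to joining the two segments at the Sec residue and then converting that junction Sec (U) into Ala (A). The sketch below reproduces only this bookkeeping on one-letter codes; the function names are illustrative, terminal modifications (Ac-, -amide, -SR) are left out of the strings, and no chemistry is modelled.

```python
def ligate(thioester_segment: str, n_terminal_segment: str):
    """Join two segments; the second must start with Cys (C) or Sec (U).

    Returns the ligated sequence and the index of the junction residue.
    """
    if not n_terminal_segment or n_terminal_segment[0] not in ("C", "U"):
        raise ValueError("the N-terminal residue of the second segment must be C or U")
    return thioester_segment + n_terminal_segment, len(thioester_segment)


def deselenise(sequence: str, position: int) -> str:
    """Replace the Sec (U) at `position` by Ala (A), leaving any Cys untouched."""
    if sequence[position] != "U":
        raise ValueError("no selenocysteine at the given position")
    return sequence[:position] + "A" + sequence[position + 1:]


if __name__ == "__main__":
    # Segments quoted in the text: Ac-LYRAG-SR and UGLEFRSI-amide
    ligated, junction = ligate("LYRAG", "UGLEFRSI")
    print(ligated)                        # LYRAGUGLEFRSI
    print(deselenise(ligated, junction))  # LYRAGAGLEFRSI
```

The final string matches the alanyl-peptide reported in the text (Ac-LYRAGAGLEFRSI-amide, written here without its terminal groups).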
This selectivity of reduction with TCEP is hypothesised to be due to the weaker Se-C bond compared with the S-C bond, coupled with the higher propensity of selenols to form radicals. 742 It should be mentioned, however, that cysteine can also be desulfurised in the absence of a radical initiator when treated with excess phosphine, albeit upon heating. 743 The selenocysteine-mediated NCL-deselenisation procedure has recently been exploited for selective ligation at selenolphenylalanine 744 and γ-selenolproline, 720 which are easily transformed into phenylalanine and proline, respectively, at the ligation site by treating the Se-containing intermediates with TCEP or DTT. Analogously to the N→S acyl transfer (Section 2.3.1.3), an N→Se acyl shift was recently observed by Adams and Macmillan and allowed NCL to take place at lower temperatures and on shorter time scales. 745 The corresponding selenoesters are readily accessible by direct solid-phase synthesis. 746 The NCL principle of S→N acyl transfer has found application in ligation assisted by the proximity effect (see Section 5), 747 allowing for the conjugation of N-terminal residues other than cysteine by auxiliary-mediated acyl transfer. Furthermore, the Cys side chain thiolate introduced during NCL can also provide a synthetic handle for further functionalisation using cysteine-selective methodologies (see Section 1.2). It was recently demonstrated by Fang and co-workers that simple phenyl esters, which are more accessible than thioesters, can undergo native chemical ligation smoothly when promoted by imidazole. 748 Lastly, Kemp's template-mediated thiol ligation, [750][751][752][753][754][755][756][757][758] Tam's ligation by thiol/disulfide exchange, [759][760][761] and other auxiliary-driven extensions of native chemical ligation, 762,763 recently reviewed by Monbaliu and Katritzky, 749 are of significant importance in the field of protein synthesis and can be considered appropriate for bioconjugation.
2.3.2 Thiazolidine formation. Further expanding the applicability of the reaction of cysteine with formaldehyde described by Ratner and Clarke, 764 Tam and collaborators elaborated a method for the selective conjugation of N-terminal cysteines with aldehydes, resulting in stable thiazolidines. 636,643,[765][766][767] The reaction of 1,2-aminothiols readily occurs at a slightly acidic pH of 4-5, while the competing reaction of free amines with aldehydes gives Schiff bases only reversibly under the same conditions.
The thiazolidine-mediated ligation was first applied by Tam and associates to the preparation of peptide dendrimers 765,767 by attaching unprotected peptide dendrons containing Cys residues at their N-termini to a branched core matrix bearing aldehyde functions. Botti et al. 766 transformed this approach into a general method for the preparation of cyclic peptides. Villain et al. demonstrated that the obtained thiazolidines (sometimes referred to as pseudo-prolines) 768 can be selectively cleaved by adding hydroxylamine derivatives, which react with the aldehyde functions protected in the form of thiazolidines to form oximes. The authors applied this methodology to the covalent capture of proteins possessing N-terminal Cys or Thr residues (Fig. 86). 769 Interestingly, under the same conditions N-terminal Ser residues reacted only poorly.
Because of the recent advances in the semisynthesis of proteins and the encoding of 1,2-aminothiols into recombinant proteins, 662,770,771 thiazolidine-mediated conjugation is now experiencing a reappraisal of its potential for bioconjugation. 772,773 For instance, Casi et al. 772 have exploited thiazolidine formation for the preparation of antibody-drug conjugates by site-specific incorporation of a potent drug, containing an aldehyde moiety, to engineered recombinant antibodies displaying a Cys residue at their N-terminus, or a 1,2-aminothiol at their C-terminus. 772 Lastly, thiazolidines represent one of the most often used N-terminal cysteine protecting groups for sequential NCL (see Table 1, Section 2.3.1.1).

2-Cyanobenzothiazoles (CBT).
The reaction of 2-cyanobenzothiazole (CBT) with D-cysteine was first conducted by Field and collaborators 774 for the preparation of synthetic luciferin: a compound found in various living organisms and responsible for emitting light after being oxidised by a specific enzyme, luciferase. Since the isolation of luciferin (9 mg from 15 000 firefly lanterns), 775 its enzymatic oxidation has been studied for the last 50 years. 776 The regeneration pathway of luciferin in the firefly was found to consist of the condensation of CBT with D-cysteine (Fig. 87). 774 The mechanism underlying this addition begins with attack of the cysteine thiolate on the cyano group of CBT. This results in the formation of an electrophilic imidothiolate, which undergoes a second attack by the cysteine amino group to form the thiazole structure after release of ammonia.
Fig. 85 Traceless ligation of peptides using selective deselenisation described by Dawson and collaborators. 735
Inspired by these early works, Rao and co-workers 777 further investigated the reaction of cyano-substituted aromatic compounds with amino-thiol substrates. They demonstrated that the benzothiazole motif plays an important role in activating the addition of cysteine to the nitrile group. For instance, under optimal reaction conditions (PBS, pH 7.0-7.5) its replacement by another aromatic fragment such as picolinonitrile or benzonitrile largely decreases the reaction yield. All naturally occurring amino acids are unreactive towards CBT except for cysteine, which showed the highest second-order rate constant among the six other tested aminothiol substrates (9.19 M−1 s−1, which is significantly higher than those of the majority of biocompatible click reactions). 778 Finally, the efficiency and specificity of CBT-based labelling of terminal cysteine residues was demonstrated on proteins in vitro (Fig. 88) as well as on cell surfaces.
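To give a feel for what a second-order rate constant of 9.19 M−1 s−1 means in practice, the sketch below estimates the half-life of an N-terminal cysteine substrate when CBT is present in large excess (pseudo-first-order conditions). The excess-reagent assumption and the CBT concentrations are illustrative choices made here, not values from the source.

```python
import math

K2_CBT_CYS = 9.19  # second-order rate constant quoted in the text, in M^-1 s^-1


def pseudo_first_order_half_life(k2: float, reagent_molar: float) -> float:
    """Substrate half-life in seconds, ASSUMING the reagent is in large excess.

    Under that assumption k_obs = k2 * [reagent] and t_1/2 = ln(2) / k_obs.
    """
    if k2 <= 0 or reagent_molar <= 0:
        raise ValueError("rate constant and concentration must be positive")
    return math.log(2.0) / (k2 * reagent_molar)


if __name__ == "__main__":
    # Illustrative CBT concentrations (assumed): 1 mM and 100 uM
    for concentration in (1e-3, 1e-4):
        seconds = pseudo_first_order_half_life(K2_CBT_CYS, concentration)
        print(f"[CBT] = {concentration * 1e3:.1f} mM -> t1/2 ~ {seconds / 60:.1f} min")
```

Under these assumptions the substrate half-life is on the order of a minute at millimolar CBT and of about ten minutes at 100 µM, consistent with the description of the reaction as fast for a biocompatible conjugation.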
In subsequent publications, Rao and colleagues extended the applicability of CBT to biocompatible condensations that create polymer assemblies in vitro and in living cells under the control of either pH, disulfide reduction or enzymatic cleavage. 779,780 Yuan et al. took advantage of this approach and developed a method for the determination of glutathione (GSH) concentration in vitro and in HepG2 human liver cancer cells. 781 Jeon et al. 782 elaborated a CBT-based 18F-probe radiolabelling of N-terminal cysteine-bearing peptides and proteins. Two labelled substrates, a dimeric RGD-peptide ([18F]CBTRGD2) and Renilla luciferase bearing a cysteine at its N-terminus, were synthesised with excellent radiochemical yields and showed good in vivo molecular PET imaging efficiency. Proceeding efficiently under physiological conditions, CBT-mediated N-terminal Cys conjugation represents a useful alternative to existing approaches for protein labelling. 783
2.4 Tryptophan
2.4.1 Sulfenylation-coupling. Encouraged by early reports from Scoffone and colleagues, who examined the site-selective modification of the nucleophilic 2-position of the tryptophan indole ring through electrophilic sulfenylation with various sulfenyl chlorides, 784,785 Payne and collaborators have recently revived a classical reagent for Trp-selective modification: 2,4-dinitrophenylsulfenyl chloride (DNPS-Cl). 786 The authors demonstrated that, under acidic conditions, all nucleophilic amino acid side chains except that of tryptophan (which gives moderate yields of about 50% after 16 hours) either remained unmodified, as in the case of serine, threonine and the ε-amino groups of lysine, or were reversibly modified, as in the case of cysteine, which forms an easily reducible asymmetric disulfide. Further thiolytic cleavage of the resulting 2-Trp thioether derivatives with an external thiol nucleophile affords the corresponding 2-thiol Trp derivatives (2SH-Trp, Fig. 89a) in good yields. Interestingly, when placed at the N-terminus of peptides, 2SH-Trp scaffolds were found to enhance the kinetics of native ligation with peptide thioesters, and could thus serve as Nα acyl transfer auxiliaries (see Section 2.3.1). 787 The proposed mechanism is similar to that of NCL. It was hypothesised that the reaction proceeds via an initial transthioesterification of the peptide thioester with the indole 2-thiol functionality, followed by an S→N acyl shift through a 7-membered ring transition state to generate a native amide bond. The last step, desulfurisation of the 2-thiol Trp, gives a ligated product containing only naturally occurring amino acid residues (Fig. 89b). Although this methodology represents a clever chemoselective approach for the ligation of completely unprotected peptide fragments through a Trp moiety, the harshness of the sulfenylation and desulfurisation conditions limits it to peptide substrates and precludes its application to complex biomolecules.
2.4.2 Pictet-Spengler reaction. Another approach for the ligation of unprotected peptides was proposed by Li et al. 788 and exploits the Pictet-Spengler reaction, an acid-catalysed intramolecular condensation between an iminium ion and an aromatic C-nucleophile first described in 1911. 789 This non-natural ligation (the peptides are linked by a non-natural bond) involves the reaction of two peptide partners in acetic acid: one containing a Trp residue at its N-terminus and the other bearing a C-terminal aldehyde function. The latter generally has to be introduced by solid-phase peptide synthesis on an acetal resin (described by the same authors).
2.4.3 N-Acyl tryptophan isopeptides. Lastly, an interesting example of native N-terminal Trp ligation, mechanistically very similar to NCL (see Section 2.3.1), was recently reported by Popov et al. 790 The key intermediates of this methodology -N-acyl tryptophan isopeptides -undergo selective acyl transfer to yield natural peptides. These are, however, not accessible directly by methods reported so far, which substantially restricts the applicability of the methodology.

Histidine
Despite its well-known importance for acyl transfer in many enzymatic processes, 665 histidine has rarely been used for bioconjugation. The only study describing the ligation of N-terminal His peptides with an activated thioester was reported by Zhang and Tam. 791 Ellman's reagent 348 was used to activate the C-terminal peptide thiocarboxylic acid by forming an acyl disulfide derivative, which is then attacked by the N-terminal histidine. Once the acyl group has been captured by the imidazole of the N-terminal histidine, the resulting N^im-acyl intermediate is hypothesised to undergo an N^im-to-N^α acyl shift to form histidine at the ligation site (Fig. 90). However, the N^im-acyl intermediate has not been isolated, and it is quite possible that regioselectivity arises simply from anchimeric assistance by the proximal imidazole moiety at the ligation site.
Interestingly, no coupling was observed when the corresponding non-activated C-terminal thiocarboxylic acid was used instead of the acyl disulfide. The reaction pH plays an important role in the effectiveness of the reaction. Only when the pH is kept slightly acidic (pH 5-6), and in the absence of thiol nucleophiles, is the imidazolyl moiety of histidine the sole nucleophile present in the polypeptide. This methodology has been applied to generate histidine-containing peptides in yields of up to 75%.

Selenocysteine
Selenium and sulphur belong to the same main group of elements; therefore, the 21st proteinogenic amino acid, selenocysteine, and Cys residues exhibit rather similar reactivity in bioconjugation. 792-794 For instance, both Hilvert's 738 and Raines's 736 groups demonstrated that C-terminal peptide thioesters react smoothly with peptide fragments containing an N-terminal selenocysteine in exactly the same manner as with the corresponding cysteine analogues. The ligation presumably proceeds through the same mechanism as NCL (Section 2.3.1): in the first step, the selenolate attacks the thioester to give a selenoester intermediate, which subsequently rearranges to give a native amide bond.
Sec ligation can be chemoselective when conducted at slightly acidic pH. The low intrinsic pKa of selenocysteine (5.2) 795 and, consequently, its higher degree of dissociation at low pH endow this amino acid with unique biochemical properties, allowing regiospecific covalent conjugation with electrophilic compounds in the presence of the side chains of all other natural amino acids, including the thiol group of Cys (pKa 8.3). For instance, the reaction with selenocysteine was found to be 1000-fold faster than with cysteine at pH 5.0. 736 Moreover, the lower pH generally suppresses β-elimination of the selenol group from selenocysteine, which would otherwise give unreactive dehydroalanine. 392

Initially, considerable efforts were made to show the applicability of selenocysteine NCL to the preparation of selenium-containing derivatives of enzymes and to benchmark their activities. Hilvert and associates synthesised a C38U analogue of bovine pancreatic trypsin inhibitor (BPTI). Interestingly, wild-type BPTI and its artificial analogue folded into similar conformations and showed similar inhibitory affinity towards trypsin and chymotrypsin. 738 Raines et al. selected the 124-residue ribonuclease A (RNase A) as a model protein for their investigations. 736 Recombinant DNA technology was used to prepare a C-terminal thioester fragment corresponding to residues 1-109, while standard solid-phase peptide synthesis was used to obtain N-terminal Cys and Sec peptides corresponding to residues 110-124 (Fig. 91). Just as in the case of BPTI, the semisynthetic wild-type RNase A and C110U RNase A showed equivalent ribonucleolytic activities. Further advances in the field of Sec-NCL have resulted in the synthesis and investigation of other proteins such as seleno-glutaredoxin, 796 azurin, 797,798 and thioredoxin reductase.
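The pH argument made above for Sec versus Cys can be put in numbers with the Henderson-Hasselbalch relation. The sketch below computes the fraction of each side chain present as the nucleophilic anion (selenolate or thiolate) at pH 5.0, using the pKa values quoted in the text. It is only a rough illustration: it assumes that these pKa values apply unchanged in the peptide context and it ignores any difference in the intrinsic nucleophilicity of the two anions, so the ratio it gives should not be read as a predicted rate ratio.

```python
def fraction_deprotonated(pka, ph):
    """Fraction of an acid present in its deprotonated (anionic) form,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values quoted in the text.
PKA_SEC = 5.2  # selenol of selenocysteine
PKA_CYS = 8.3  # thiol of cysteine

PH = 5.0  # slightly acidic ligation conditions

sec_anion = fraction_deprotonated(PKA_SEC, PH)
cys_anion = fraction_deprotonated(PKA_CYS, PH)
anion_ratio = sec_anion / cys_anion

print(f"Selenolate fraction at pH {PH}: {sec_anion:.3f}")
print(f"Thiolate fraction at pH {PH}:   {cys_anion:.2e}")
print(f"Selenolate/thiolate ratio:      {anion_ratio:.0f}")
```

At pH 5.0 roughly 39% of the selenol is ionised against about 0.05% of the thiol, a ratio of several hundred, which is of the same order of magnitude as the 1000-fold rate difference reported.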
The ease of post-ligation transformation of selenocysteine into alanine (by deselenisation), dehydroalanine (by β-elimination) or non-natural amino acids (by addition to dehydroalanine, see Section 1.3.8) spurred the further spread of Sec-mediated methodologies as very effective tools in the rational design of peptides and proteins. Quaderer and Hilvert 799 exploited such transformations of selenocysteine to access a series of 16-residue cyclic peptides (Fig. 92).
In this initial report, the deselenisation step was conducted under rather harsh conditions (Raney Ni, H2), implying that any Cys residues present would have been reduced as well. Recently, however, Dawson and collaborators 735 have demonstrated that selenocysteine can be chemoselectively deselenised with TCEP in the presence of cysteines. This overcame the main limitation of the NCL-desulfurisation strategy (see Section 2.3.1.4), namely the inability to control the regioselectivity of desulfurisation when several cysteines are present in the peptide or protein, and gave rise to a set of NCL-deselenisation strategies for the mild incorporation of alanine, phenylalanine and proline at the ligation site by classic NCL approaches (see Table 2). Finally, selenocysteine peptides were found to undergo reverse NCL efficiently at acidic pH and thus to be of particular interest for the generation of thioesters by sequential N-to-Se acyl transfer and substitution of the resulting selenoester by an exogenous thiol (see Section 2.3.1.3). 745 Because the incorporation of selenocysteine by the cellular translational machinery is generally very laborious, 800,801 selenopeptides are mainly obtained by SPPS. 736,737,794

Proline
The oxidative coupling of o-aminophenols and o-catechols recently reported by Francis and collaborators 802 represents an interesting approach for selective N-terminal conjugation on proteins. The strategy consists in the in situ oxidation of o-aminophenols and o-catechols to active coupling species using potassium ferricyanide (Fig. 93a), followed by their reaction with a protein (Fig. 93a). This approach was shown to work particularly well with proline (owing to its increased nucleophilicity) and can therefore be considered an N-terminal Pro-selective methodology. 802 Its key advantage over the majority of N-terminal selective methodologies is its fast second-order kinetics in a single-step transformation, which does not require prior oxidation of the N-terminus. Free cysteine residues, however, are also reactive during the oxidative coupling and must therefore be protected, which represents the main drawback of the approach.

C-terminal conjugation
Chemical methods for C-terminal conjugation are rather scarce. Admittedly, the previously mentioned approaches for the conjugation of side-chain carboxylates (see Section 1.5) can also be applied to modify terminal carboxylates. Selectivity, however, would be the main issue, since to date no method has been described for even moderately selective activation of C-terminal carboxylates by any activating reagent.
Because of the rapid development of NCL (see Section 2.3.1), which requires C-terminal thioesters as reaction partners for N-terminal Cys proteins, these have become widely accessible, notably by means of SPPS. A promising approach exploiting these advances was proposed by Goody 803 and collaborators, who developed a protocol for the selective transformation of C-terminal thioesters into the corresponding hydroxylamines, thus enabling the application of aldehyde- and ketone-selective methodologies at the C-terminus.
On the other hand, the unique position of protein C-termini has stimulated numerous efforts to target this location, which have resulted in numerous enzymatic and intein-based approaches for C-terminal-selective protein modification. 804-809 These methodologies are, however, not covered by this review, which is devoted to chemical methods of bioconjugation.

Sequence-selective approaches
Several especially useful methodologies in bioconjugation exploit neither a specific property of a residue nor its particular position in the protein, but rather the synergistic effect of a group of neighbouring amino acid residues. An example of such selectivity is the aforementioned selective modification of His-AA-Ser and His-AA-Thr peptide motifs (see Section 1.2) by usually promiscuous activated esters. In this case, the His imidazolyl side chain located in close proximity to the Ser/Thr side chains increases their reactivity towards electrophiles.

Proximity-driven modifications
All the above-mentioned methodologies are mainly residue-specific. That is to say, they exploit the specific reactivity of the functional group of interest or of an assembly of residues. As a result, bioconjugations of the highly nucleophilic Cys, Lys, and Tyr residues with electrophilic reagents are prevalent among the described methodologies.
Inherent reactivity is rarely overcome, which largely limits the scope of known methods for the bioconjugation of amino acids possessing low nucleophilicity. However, bringing the reaction partners into close proximity can accelerate a reaction that would not otherwise be possible because of the presence of other, more reactive species. Routinely exploited by enzymes, this approach enables selectivity on the basis of molecular shape rather than reactivity or the local environment (Fig. 95).

Fig. 94 Strategies for the selective conjugation of proteins based on metal-chelation: tetracysteine/biarsenical system, oligohistidine/nickel-complex system, tetraserine-borate system, oligo-aspartate/zinc-complex system. 27

Developed by Hamachi and collaborators in 2006, 462 the post-affinity-labelling approach can be used for the selective tethering of a functional molecule in the proximity of the active site of enzymes. This method was then used on numerous substrates and enlarged the scope of selective conjugation, especially on histidine and tyrosine (see Section 1.5.1). 462,463,486,814 Despite these successes, the method can only modify residues situated in the vicinity of the ligand-binding pocket of the target protein, which prevents it from becoming a general approach for protein labelling.
In 2010, exploiting a similar idea, Popp and Ball 454 envisioned the combination of two previously described techniques: the coiled-coil-based molecular recognition of complementary peptides 455,815-817 and the high catalytic activity of dirhodium complexes in carbene C-H insertion, previously reported by Francis and collaborators 452 for the selective modification of tryptophan (see Section 1.4.2). The methodology thus involves two complementary peptides: one containing a dirhodium catalytic centre (precomplexed through two glutamate side chains) and another containing the side chain to be modified (Fig. 96a). Because the side chain of interest and the active catalytic centre are forced into proximity in the resulting supramolecular assembly, the rhodium-catalysed C-H insertion is greatly accelerated (by a factor of more than 10³). 818 As a result, the conjugation of amino acid residues with lower reactivities becomes possible. This allowed the scope of the originally tryptophan-selective dirhodium carbene methodology to be expanded first to the other aromatic residues, phenylalanine and tyrosine, 454 and then to over half of the naturally occurring amino acid residues (Fig. 96b). 818 To date, dirhodium metallopeptides represent the only reported method for the selective modification of Gln, Asn, and Phe side chains. The authors have also demonstrated that their methodology can be applied to chimeric proteins containing fused coils, 456 as well as to full-sized natural proteins possessing coiled coils in their structures. 818

Despite its considerable potential, the metallopeptide methodology is not devoid of drawbacks. Because binding to dirhodium is nonselective and thus cannot be performed in the presence of other carboxylate-containing peptides, rhodium-peptide complexes must be synthesised beforehand, which is often challenging, mainly because of their poor solubility. 819 Moreover, the method is restricted to proteins containing coiled-coil fragments in their structures, which for the vast majority of targets would mean the need for resource- and time-consuming expression of fused proteins.
Another approach, developed by Silverman and colleagues, exploits the self-assembly of complementary DNA strands to bring two reacting fragments into proximity and has allowed, although only on simple substrates, the selective phosphorylation of tyrosine and serine, which is otherwise not feasible. 538 Beyond coiled coils and DNA-based preorganisation, the principles of proximity-driven selectivity should be extendable to other helix-binding protein domains and to biological molecular recognition in general. A significant broadening of the applicability of this elegant approach to protein modification, biochemistry and biomaterials engineering is anticipated in the near future.

Conclusions
The field of bioconjugation has expanded over the last 100 years, progressing from the blind modification of proteins found in nature to a well-established independent domain full of approaches that allow the precise and reliable introduction of various tags into protein structures.
Many, if not most, of these methods, however, possess drawbacks limiting their general applicability. This fact has taken on special importance with the rise of novel, demanding applications for bioconjugation, namely the preparation of new therapeutic conjugates, vaccines, and biomaterials. In addition, tremendous progress in the sensitivity of analytical methodologies, together with the need to work with ever smaller amounts of often unstandardised patient samples, has highlighted the need for more efficient, selective and reliable bioconjugation methods.
Moreover, some parameters of the mode of conjugation, previously completely neglected, were recently revealed to be of paramount importance. For instance, the stability of the generated linkage and the distribution of products generated upon conjugation can determine the overall efficiency of the conjugate.
Overall, we believe that intensive ongoing research in the field of bioconjugation will result in more efficient and selective methodologies allowing specific conjugation of native proteins in complex biological media, and ultimately in living organisms.