Peptide-tags for site-specific protein labelling in vitro and in vivo

Visualizing the localization and trafficking of proteins in living systems is the key to understand protein functions in their native environment. Tracking and visualization by fluorescence microscopy require methods that allow the site-specific labelling of a protein of interest (POI). Established almost 20 years ago, the genetic fusion of a POI to an autofluorescent protein such as the green fluorescent protein (GFP) revolutionized protein imaging. The fusion of POIs with engineered, self-modifying enzymes such as SNAP-, CLIP-, TMP-, or Halo-tags significantly extended the scope to customized labelling by enabling the use of reporter units with unique spectroscopic signatures. Enzyme-based tags provide high specificity and enable fast labelling reactions, which is an advantage when fast biological processes ought to be analysed in time-lapse experiments. Yet, the fusion of enzymes brings along a massive increase in size (18–33 kDa). This can perturb protein trafficking and may impair protein–protein interaction networks that rely on clustering. To avoid potential interference with protein localization and protein function, major efforts are spent to decrease the tag size. The smallest tags currently accessible are provided by the incorporation of unnatural amino acids (UAA) by amber stop codon or related technologies. In addition, the ligand-directed protein labelling or traceless affinity labelling is independent of any tag. However, the establishment and incorporation of UAA in mammalian cells still is very challenging and the traceless affinity labelling requires a POI with a native binding pocket as well as the exact knowledge of the binding site and ligand. Peptide-based recognition tags offer an attractive alternative for protein labelling. The genetic fusion of peptide-tags is straightforward, the size of 0.6–6 kDa is considerably small and a variety of labelling chemistries allow the introduction of virtually arbitrary reporter groups. 
Peptide-tag based labelling methods are multifaceted and can be categorised into three classes: (i) enzyme-mediated labelling, (ii) metal ion- or small molecule-dependent labelling and (iii) labelling based on peptide–peptide interactions. We will discuss the characteristics of the different approaches with respect to specificity, speed, stability, tag size, toxicity and versatility. Applications as well as shortcomings will be explained. The review covers the literature until 2015 and particularly emphasizes the most recent and most frequently used developments. Additional information can be found in other comprehensive reviews.


Introduction
Visualizing the localization and trafficking of proteins in living systems is key to understanding protein functions in their native environment. Tracking and visualization by fluorescence microscopy require methods that allow the site-specific labelling of a protein of interest (POI). Established almost 20 years ago, 1 the genetic fusion of a POI to an autofluorescent protein such as the green fluorescent protein (GFP) revolutionized protein imaging. The fusion of POIs with engineered, self-modifying enzymes such as SNAP-, 2 CLIP-, 3 TMP-, 4,5 or Halo-tags 6 significantly extended the scope to customized labelling by enabling the use of reporter units with unique spectroscopic signatures. Enzyme-based tags provide high specificity and enable fast labelling reactions, which is an advantage when fast biological processes are to be analysed in time-lapse experiments. Yet, the fusion of enzymes brings along a massive increase in size (18-33 kDa). This can perturb protein trafficking and may impair protein-protein interaction networks that rely on clustering. To avoid potential interference with protein localization and protein function, major efforts are spent to decrease the tag size. The smallest tags currently accessible are provided by the incorporation of unnatural amino acids (UAA) by amber stop codon or related technologies. 7,8 In addition, ligand-directed protein labelling or traceless affinity labelling is independent of any tag. 9,10 However, the establishment and incorporation of UAA in mammalian cells is still very challenging, and traceless affinity labelling requires a POI with a native binding pocket as well as exact knowledge of the binding site and ligand.
Peptide-based recognition tags offer an attractive alternative for protein labelling. The genetic fusion of peptide-tags is straightforward, the size of 0.6-6 kDa is considerably small and a variety of labelling chemistries allow the introduction of virtually arbitrary reporter groups.
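The quoted size range can be checked with a rough mass estimate. The following is a minimal sketch, assuming standard average residue masses and an unmodified linear peptide; the function name and the selection of example tags are illustrative choices, and the sequences are those quoted elsewhere in this review.

```python
# Average residue masses in Da for the 20 proteinogenic amino acids
# (standard textbook values; one water is added for the free termini).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02


def peptide_mass_da(sequence):
    """Approximate average mass (Da) of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


# Tag sequences as quoted in this review (labels are illustrative).
TAGS = {
    "SrtA motif": "LPETG",
    "tetracysteine": "CCGPCC",
    "AnkX motif": "TITSSYYR",
    "FLAG tag": "DYKDDDDK",
    "ReacTR": "CCGGGSKVILFEGPAGRWTWEPISEGAPGSKVILFEGGPG",
}

if __name__ == "__main__":
    for name, seq in TAGS.items():
        kda = peptide_mass_da(seq) / 1000.0
        print(f"{name:14s} {len(seq):3d} aa  ~{kda:.2f} kDa")
```

The estimate ignores post-translational modifications and any linker residues, so it only indicates the order of magnitude of the added mass.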
Peptide-tag based labelling methods are multifaceted and can be categorised into three classes: (i) enzyme-mediated labelling, (ii) metal ion- or small molecule-dependent labelling and (iii) labelling based on peptide-peptide interactions. We will discuss the characteristics of the different approaches with respect to specificity, speed, stability, tag size, toxicity and versatility. Applications as well as shortcomings will be explained. The review covers the literature until 2015 and particularly emphasizes the most recent and most frequently used developments. Additional information can be found in other comprehensive reviews. [11][12][13][14]

Peptide-tag based labelling by enzymes - mechanisms and characteristics
Enzymes offer a naturally occurring sequence-specificity for site-selective modifications. Ligases such as biotin ligase, lipoic acid ligase or tubulin tyrosine ligase are capable of transferring molecular probes to a particular amino acid (aa) within a recognition sequence. Similarly, transferases can be used to transfer functional groups from a donor substrate to a specific amino acid within a peptide sequence, as demonstrated for the phosphopantetheinyl transferases Sfp and AcpS, transglutaminase or the phosphocholine transferase AnkX. Another possibility is to use transpeptidase activity. Sortase A exchanges a part of the recognition sequence with a separate peptide conjugated to a labelling moiety. The specific modification of an amino acid side chain can also be achieved with formylglycine-generating enzyme, which naturally converts a cysteine-derived thiol group into an aldehyde function (see Table 1).
As early as 1999, a 15 aa long acceptor peptide, termed AP-tag, was reported to serve as a substrate for E. coli derived biotin ligase (BirA). This enzyme promotes the conjugation of biotin to a lysine side chain within the AP-tag (Fig. 1A). 15 BirA-mediated biotinylation was demonstrated for many different AP-tagged proteins in cell lysates including the transcription factors GATA-1 and EKLF. 16 Co-expression of AP-tagged and BirA-tagged proteins allows so-called proximity biotinylation to study protein-protein interactions. 17 This method is widely used and has enabled investigations of homo- and heterodimerization of the chemokine receptors CXCR4, CCR2 and CCR5 in RPE cells 18 or glycoprotein interactions during herpesvirus entry into cells. 19 Of note, the method also enables investigations of proximity between two proteins expressed on different cells. For example, BirA-mediated biotinylation was used to characterize the synaptic connectivity between the presynaptic adhesion protein NRX-1 and the postsynaptic transmembrane protein NLG-1 in vivo in Caenorhabditis elegans. 20 In a recent study, BirA-mediated biotinylation exposed the anterograde trafficking of calcium-activated potassium channels (KCa3.1) in polarized epithelial cells (including HEK293, MDCK, LLC-PK1 and Caco-2 cells). The AP-tag was introduced into the protein and BirA was fused to an ER-retaining sequence (KDEL) to specifically label AP-tagged KCa3.1 along the anterograde pathway. 21 Furthermore, the endocytosis of the apical Na+/K+/2Cl- co-transporter (NKCC2) equipped with an AP-tag between the 5th and 6th transmembrane domain was measured in real time by total internal reflection fluorescence (TIRF) microscopy. After BirA-mediated biotinylation, fluorescent staining was performed with fluorescent streptavidin. 22 BirA also accepts a biotin derivative containing a ketone group.
Table 1 Characteristics of enzymatic labelling techniques. One-step labelling achieves the direct modification of a POI with a reporter group (e.g. fluorescent probes). Two-step labelling involves in a first step the transfer of a chemically reactive functional group (e.g. azide, alkyne, biotin, hydrazine conjugates), which is targeted in a second step by reaction with a suitably modified reporter group (e.g. via SPAAC, biotin-streptavidin binding or hydrazone formation).
[Table body not preserved. Surviving fragments: column headers "Enzyme" and "Recognition tag"; entries "Labelling with 1-10 mM CoA-derivative for 20-40 min; reaction temperature: 37 °C 38,39", "In vitro, in vivo (cell surface labelling with fixed or live cells)", "One-step and two-step", "In vitro, in vivo (cell surface labelling with live cell imaging)".]

A comparison of biotin ligases from different organisms revealed that the enzymes from Saccharomyces cerevisiae and Pyrococcus horikoshii accept alkyne and azide derivatives of biotin. 23,24 The biotin modifications enable subsequent labelling of proteins such as epidermal growth factor receptor (EGFR) on live cells by formation of acyl hydrazones or Staudinger ligation. 23 A BirA cross-compatible technique uses E. coli derived lipoic acid ligase (LplA), which catalyses the ligation of lipoic acid to a sequence-specific lysine side chain in an ATP-dependent manner (Fig. 1B). 25,26 LplA specifically recognizes the LAP-tag, which consists of 22 aa, and accepts different lipoic acid derivatives containing alkyne, azide, trans-cyclooctene, aryl aldehyde or aryl hydrazine groups as well as fluorophores. 25,[27][28][29] Strain-promoted alkyne-azide cycloaddition (SPAAC), hydrazone formation or Diels-Alder cycloaddition 30 enabled the subsequent chemical modification of the lipoic acid appendage on LAP-tagged proteins such as CFP, 25 BFP, 28 low-density lipoprotein receptor (LDLR), 25 neurexin-1β 27 or actin. 28 The LplA mutant LplA(W37V) has been reported to provide higher ligation yields than the wild-type enzyme. 27 In contrast to BirA, LplA accepts fluorescently labelled conjugates and permits one-step labelling of the POI. LplA-mediated protein labelling was demonstrated on HEK293T, HeLa and COS-7 cells. A minimized LAP-tag with only 13 aa 31 was used to incorporate a p-iodophenyl derivative as a new bio-orthogonal handle into proteins, which could be selectively addressed in a second step by palladium-catalysed Sonogashira cross-coupling. 32 Wombacher et al. reported norbornene-conjugated lipoic acid analogues, which were introduced into LAP-tagged CFP and dihydrofolate reductase from E. coli (eDHFR) and fluorescently labelled via inverse-electron demand Diels-Alder reaction.
30 The enzyme tubulin tyrosine ligase (TTL) allows the C-terminal labelling of proteins with a 14 aa hydrophilic recognition sequence called Tub-tag. TTL is a regulator of microtubule homeostasis and usually promotes the coupling of tyrosine to the α-tubulin C-terminus. Hackenberger and Leonhardt showed that TTL is also able to incorporate tyrosine derivatives containing azide, formyl, amine or nitro groups (Fig. 1C). 33 This enabled the two-step labelling of GFP-specific nanobodies with biotin, fluorescent probes or polyethylene glycol (PEG) chains by means of SPAAC, Staudinger ligation, Staudinger-phosphite reactions and hydrazone or oxime forming reactions. Labelling also succeeded in E. coli lysate.
The use of transferases offers interesting alternatives to ligase-mediated labelling. Phosphopantetheinyl transferases (PPTase) such as Sfp and AcpS catalyse the transfer of a phosphopantetheinyl (Ppant) group derived from coenzyme A (CoA) to a serine residue within a specific recognition sequence (Fig. 2A). 34 Owing to the impressive substrate promiscuity of Sfp and AcpS, CoA can be modified at its terminal thiol with small molecules, including biotin, sugars, peptides, porphyrin and fluorophores, the latter allowing one-step labelling similar to LplA. [34][35][36][37] Furthermore, a set of three specific tags is available: ybbR-tag (11 aa), S6-tag (12 aa, preferentially recognized by Sfp) and A1-tag (12 aa, preferentially recognized by AcpS). 38 The ybbR-tag was successfully introduced at the N- and C-terminus, or within the protein sequence, as demonstrated for eGFP. 37 Orthogonal cell surface labelling with fluorescent CoA conjugates on living HeLa cells was demonstrated for S6-tagged EGFR or A1-tagged transferrin receptor 1 (TfR1). 38 The S6/A1 pair of tags was also used to enable labelling of the tropomyosin receptor kinase TrkA and the p75 neurotrophin co-receptor (P75NTR) in SH-SY5Y cells.
39 The A1-P75NTR construct was first labelled with biotin-CoA using AcpS and then stained with streptavidin-QDot525. Subsequently, the S6-TrkA construct was biotinylated by using Sfp followed by labelling with streptavidin-QDot655. To enable imaging of interactions between chemokines and their fluorescent protein (FP)-tagged receptors on live CHO-K1 cells, the chemokines CXCL12, CCL2 and CCL21 were expressed with a S6-tag at the C-terminus and subjected to PPTase-mediated labelling with a fluorophore-CoA conjugate. 40 Co-localization of signals from the fluorophore and the FP indicated receptor-chemokine interactions.
Transglutaminases (TGase) catalyse the formation of an amide bond between the carboxamide of a glutamine side chain and the ε-amino group of a lysine side chain. TGases have been used to attach small molecules to antibodies, 41 modify proteins by PEGylation 42 or lipidation, 43 site-specifically conjugate proteins, 44 as well as immobilize proteins on solid support. 45 CFP and EGFR were labelled in vitro and on the surface of living cells by means of guinea pig liver TGase (gpTGase), which promotes the conjugation of labelled cadaverine (lysine analogue conjugated to biotin, AlexaFluor586 or photoaffinity moieties) with 6-7 aa long glutamine-rich Q-tags. 46 Recent studies described the use of tissue TGase (TG2) or microbial TGase (mTGase). 47,48 The latter provides the advantage of calcium-independent labelling. Transfer of propargylamine with subsequent CuAAC modification was demonstrated for maltose binding protein using mTGase.
Two enzymes from Legionella pneumophila were recently investigated by Hedberg et al. for the reversible covalent labelling of proteins. The first enzyme AnkX transfers a phosphocholine moiety from a cytidine diphosphate choline (CDP-choline) derivative to the second serine residue within the eight amino acid recognition sequence TITSSYYR (Fig. 2B). The second enzyme Lem3 is able to remove the label upon hydrolytic cleavage of the transferred phosphocholine from the serine side chain. The method tolerates fusion of the recognition sequence to the N-or C-terminus, as well as into internal loops as demonstrated for fluorescein labelling of model proteins such as maltose binding protein, small ubiquitin-like modifier and DrrA enzyme. 49 Remarkably small tag sizes are found for sortase A and formylglycine-generating enzyme. The transpeptidase sortase A (SrtA) derived from Staphylococcus aureus specifically recognizes the short pentapeptide sequence LPxTG. SrtA cleaves the amide bond between threonine and glycine, forms a reactive thioester between the carboxyl group of threonine and an enzyme-derived cysteine and promotes the ligation of this reactive intermediate with an amino group of an oligoglycine substrate by formation of a new amide bond (Fig. 3A). First reports on SrtA as new protein engineering tool described modifications of LPETG-tagged GFP with a set of various oligoglycine substrates conjugated to peptides including unnatural amino acids, proteins or small molecules like folate. 50 Subsequently, the substrate diversity was extended to PEGylation, peptide-glycosylphosphatidylinositol conjugates or cyclised peptides. [51][52][53] SrtA-mediated labelling has been used to label proteins with fluorophores, photocrosslinkers, biotin, alkyl chains and cholesterol on intact cells such as HEK293T, CHO and HeLa cells. [54][55][56] In a noteworthy study the bacterial toxin aerolysin was labelled with fluorophores and biotin. 
The ability to form homo-heptameric pores in membranes of target cells after binding to cell surface receptors was not impaired and SrtA-mediated biotinylation helped to identify aerolysin-receptor interactions, which are responsible for aerolysin uptake. 57 Recently LPETG-tagged lipopeptides enabled the visualization of liposome trafficking into lung cancer. 58 The formylglycine-generating enzyme (FGE) naturally co-or post-translationally modifies a cysteine within the conserved recognition motif CxPxR to an aldehyde-bearing formylglycine (Fig. 3B). 59 The hexapeptide sequence LCxPxR, termed ''aldehyde tag'', was included at N-terminal, C-terminal or internal sites as shown for maltose binding protein, human growth hormone, bacterial sulfotransferase or antibodies. Because FGE is endogenously expressed in most prokaryotes and eukaryotes, the aldehyde moiety is formed without additional intervention. For labelling with fluorophores, biotin, peptides or PEG probes the aldehyde group was addressed in hydrazone or oxime forming reactions. 59,60 The endogenous location of FGE in the endoplasmic reticulum was exploited to specifically trace aldehyde-tagged proteins such as IgG antibody, platelet-derived growth factor receptor (PDGFR) and glycoprotein CD4 along the secretory pathway. 61 Recently, a natural glycosylation site within an Fc fragment was replaced by an aldehyde tag, which allowed the introduction of tailor-made glycoforms by means of oxime-forming reactions. 60 Recent work with recombinantly expressed FGE suggests that Cu II is required as a cofactor in order to achieve high turnover in formyl generating reactions. 62 Enzymatic labelling techniques provide high specificity, relatively small tag sizes and covalently bound labels. Yet there are disadvantages. To achieve high labelling yields rather long reaction times and/or high concentrations of substrate are required. 
One of the fastest approaches to stain proteins in living cells was reported for BirA with a total labelling time of 20 min. 23 However, the second step of this two-step labelling method had to be performed at 4 °C to reduce endocytosis of the target protein EGFR. Labelling with SrtA and fluorescent oligoglycine conjugates was reported to be achieved within 15 min on living HEK293T cells, but the concentration of the oligoglycine substrate was considerably high at 10 mM. 56 At high concentrations of substrate the specificity of labelling can be at stake and excessive washing may be required to remove non-covalently bound substrate. PPTase-mediated protein labelling is even performed at 1-150 mM concentration of biotin-CoA, 36,37 AlexaFluor488- or TexasRed-CoA 38 probes. Most other enzyme-based labelling methods require higher concentrations of substrate. For example, 1 mM ketone-containing biotin derivative is applied in ligations by BirA. 23 Some enzyme-catalysed reactions, such as the transpeptidase reaction catalysed by SrtA, are reversible, which decreases labelling yields if the substrate is applied at low concentration. 52,53 Many enzymes depend on cofactors such as ATP (e.g. BirA, 23,24 LplA, 25 48), which may interfere with the biological process under investigation. With regard to the labelling time, the two-step procedures significantly prolong the process. Usual incubation times in the first step are 10-60 min (e.g. LplA, 25,28 TTL, 33 AnkX, 49 SrtA, [54][55][56] BirA, 23 TGase 46,47 ). The bio-orthogonal labelling reactions in the second step require up to 3 h (e.g. SPAAC, Staudinger ligation) 33 or even overnight treatment (Cu-promoted alkyne-azide cycloaddition). 48 However, rapid labelling at low concentrations of labelling agent is desirable in studies of fast biological processes.
This feature can be obtained with peptide-tags that have high affinity for reactive metal complexes, small molecules or peptides and thereby enable reactions at high effective molarity, which provides for efficient labelling with reagents at nanomolar concentrations.

Recognition of metal ions and small molecules by peptide-tags - mechanisms and characteristics
The selectivity of labelling can be ensured by the mutual recognition between electron-deficient metal ions and electron-rich ligands. The binding of small metal ion probes, in most cases by a chelating effect, facilitates a one-step labelling of the target protein (see Table 2).
The shortest peptide-tags comprise tetracysteine or tetraserine motifs. In a pioneering study, Tsien et al. showed in 1998 that organobisarsenic acid thioesters form covalent bonds with peptide-tags containing a tetracysteine core CCxxCC, most commonly CCGPCC. Upon binding to the thiol groups of the peptide-tag, the Fluorescein-Arsenical-Helix-Binders (FlAsH) increase their fluorescence approximately 50 000-fold (Fig. 4A). In order to reduce the cytotoxicity and non-specific binding of FlAsH, micromolar concentrations of 1,2-ethanedithiol (EDT) have to be administered to the cells simultaneously. 63,64 Derivatives such as ReAsH, 64 CHoXAsH, 64 CrAsH 65 and Cy3As 66 expand the scope of available dyes. In many cases, flanking sequences have been added to the CCxxCC motif to increase the selectivity. 67 When administered in 2.5 mM concentration, the bisarsenical dyes permeate the cell membrane and enable intracellular labelling of CCxxCC-tagged proteins. 64 Background signals due to interactions with other thiol-containing proteins often require time-consuming washing procedures. Furthermore, the incorporation of such cysteine-rich sequences can promote incorrect disulphide-bond formation within the POI and disturb the integrity of the protein. Schepartz presented bisboronic acid dyes as alternatives to the bisarsenical dyes (Fig. 4B). A rhodamine-derived bisboronic acid (RhoBo) showed nanomolar affinity for the tetraserine motif SSPGSS and provided increases in fluorescence after binding. 68 But in contrast to the tetracysteine motif, the target sequence SSPGSS is found in more than 100 human proteins, amongst them the highly abundant myosin heavy chain. Improved specificity can be achieved by metal-chelating tags, which recognise rather unnatural repetitive sequences such as the His6-tag, the His10-tag or the oligo-aspartate tag.
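The specificity concern raised for SSPGSS can be examined for any candidate tag by scanning protein sequences for the motif. The following is a minimal sketch using regular expressions; the two example sequences are made up for illustration, and a real check would iterate over a proteome file instead.

```python
import re

# Motifs as quoted in the text; "x" (any residue) is written as "." in the regex.
MOTIFS = {
    "tetracysteine CCxxCC": "CC..CC",
    "tetraserine SSPGSS": "SSPGSS",
}


def find_motif(pattern, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A zero-width lookahead lets re.finditer report overlapping occurrences.
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence)]


def scan(sequences, pattern):
    """Map each sequence name to its hit positions, skipping sequences without hits."""
    hits = {}
    for name, seq in sequences.items():
        positions = find_motif(pattern, seq)
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    toy_proteome = {
        "tagged_POI": "MAGSCCGPCCGSAEELLK",
        "off_target": "MKTSSPGSSLLQ",
    }
    for label, pattern in MOTIFS.items():
        print(label, scan(toy_proteome, pattern))
```

Counting hits in the host proteome before choosing a tag gives a first impression of the expected background; it does not account for accessibility or abundance of the matching proteins.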
The hexahistidine tag (His6-tag), originally developed as an affinity tag for protein purification, is as short as the tetracysteine or tetraserine motif and known to interact tightly with transition metal complexes, e.g. Ni(II):nitrilotriacetic acid (Ni(II):NTA). 69 Since the interactions between metal ion and ligand are non-covalent, high mutual affinities are required to achieve stable labelling. To further increase the stability of the label, Tampé, Piehler and colleagues developed multivalent complexes carrying two to four NTA moieties (called bis-, tris- and tetrakis-NTA, Fig. 5A). 70 The trisNTA ligands have exceptionally high affinities for His-tags (Ni(II):trisNTA·His10: KD = 0.1 nM). 71 The attachment of bright fluorophores allowed the single-molecule super-resolution microscopy of actin and lamin filaments in CHO-K1 cells. 72 Allbritton designed a trifunctional probe comprised of a Ni(II):NTA guiding unit and a photo-reactive arylazide group, which enabled the covalent linkage with a His6-tagged murine dihydrofolate reductase (mDHFR) in vitro. The third functionality was either a biotin or a DNA strand for further conjugation or immobilization of the POI in vitro. 73 Trifunctional probes containing a reporter dye were used by Auer et al. for the covalent labelling of a His6-tagged interleukin-4 receptor on the surface of living cells. 74 A covalent modification of His-tagged proteins was recently introduced by Tørring and Gothelf. The so-called DNA-templated protein conjugation (DTPC) describes the binding of a trisNTA-oligonucleotide to a His6-tagged POI. This allows site-selective recognition by a second complementary oligonucleotide carrying a reactive N-hydroxysuccinimide (NHS) moiety, which enables subsequent crosslinking to a lysine residue in close proximity within the POI. 75 This was demonstrated for His6-tagged serotransferrin and native IgG1 antibodies, which contain a histidine cluster. Tsien et al.
found that the hexahistidine tag not only binds Ni(II):NTA probes, but also a Zn(II) complex (HisZiFiT-Zn(II)) with a KD ≈ 40 nM. 76 Since the diamagnetic Zn(II) is not a fluorescence quencher (unlike the paramagnetic Ni(II)), fluorescein was derivatized in such a way that it directly binds two Zn2+ ions. Incubation with 100 nM HisZiFiT in buffer with 1 mM free Zn2+ and two subsequent washing steps allowed the labelling of cell surface-exposed His6-STIM1-CFP. Hamachi et al. introduced a binuclear Zn(II):iminodiacetic acid (IDA) complex to which a Cy5 fluorophore was attached via an amide bond. Two His10-tagged G protein-coupled receptors (B2R and m1AChR) could be stained on the surface of HEK293 cells when treated with 0.5 mM Zn(II):IDA-Cy5 conjugate for 10 min. 77 In 2006 Hamachi et al. introduced the oligo-aspartate tag (D4 tag) and the corresponding multinuclear Zn(II):bis((dipicolylamino)methyl)tyrosine (Zn(II):DpaTyr) complexes. 78 A binuclear Zn(II):DpaTyr had a moderate affinity for the D4 tag (KD = 1.4 mM). But by elongation of the tag (D4x3 tag) and utilization of the dimeric Zn(II):DpaTyr-FITC complex (harbouring four Zn(II) ions) at 2 mM concentration, the fluorescence labelling of the muscarinic acetylcholine receptor (m1AChR) was feasible on live cells. However, the interaction is non-covalent and the label will dissociate at prolonged times after labelling. To prevent undesired label losses, a nucleophilic cysteine residue was added to the D4 tag and the binuclear Zn(II):DpaTyr complex was equipped with an α-chloroacetyl group (Fig. 5B). After incubation for 12 h with 20 mM of the reactive Zn(II):DpaTyr-TAMRA conjugate, the probes were found to enter E. coli cells and covalently label intracellular CA6D4-tagged maltose binding protein. 79 The D4/Zn(II):DpaTyr system was reported to be orthogonal to the His6/Ni(II):NTA complex.
78 Later it was described that Ni(II):DpaTyr complexes show remarkably high affinity for the FLAG tag (DYKDDDDK), the D3 tag (DDD) and the D3x2 tag (DDDXXDDD), with binding affinities around 100-fold stronger than for the D4/Zn(II):DpaTyr pair. 80,81 More recently, Komatsu et al. designed a Tb3+-binding peptide (LBP) de novo with aspartic acid as Tb3+-binding sites and tryptophan as sensitizing dye. The 15-mer sequence DDDWDDDWDDDWDDD was genetically fused to the C-terminus of glutathione S-transferase (GST) and used for affinity purification on a Tb3+-loaded column, as well as for in-gel detection of the fusion protein, rendering the LBP especially useful due to its dual functionality. 82 While transition metal ions such as Ni(II), Co(II) and Zn(II) confer high affinity for recognition tags as described in the labelling techniques above, lanthanide ions add further desirable features: long-lived luminescence in the presence of sensitizing chromophores, excellent X-ray scattering power and, for some of them, magnetic properties useful for NMR studies (e.g. Gd(III), Dy(III), Tb(III), Tm(III)). 83 This prompted a search for peptide-tags with high affinity for lanthanides. Imperiali et al. optimized a sequence found in a loop region of the calmodulin protein family by means of a split-and-pool library and identified an 18-mer lanthanide-binding tag (LBT, ACADYNKDGWYEELECAA). With a KD value of around 220 nM for the binding of Tb3+, the LBT tag could be used for concentration determination in cell lysate and for in-gel visualization of an LBT-ubiquitin fusion protein. 84 Later, Imperiali also found cysteine-free LBTs with nanomolar affinities for Tb3+ (YIDTNNDGWYEGDELLA, KD = 57 nM; 85 FIDTNNDGWIEGDELLLEEG, KD = 19 nM 86 ).
The search for peptide sequences that enable fluorescence labelling of proteins without added enzymes was extended from metal-binding tags to tags that directly bind small-molecule fluorescent probes. As early as 1998, Nolan et al. discovered by phage display a panel of Texas red, Rhodamine red, Oregon green 514 and fluorescein binding peptides, named "Fluorettes" (Fig. 6). 87 Amongst them, the TexasRed-binding peptides (TR) showed the highest, subnanomolar affinities. Jäschke converted the TR peptide into a reactive tag (ReacTR, sequence: CCGGGSKVILFEGPAGRWTWEPISEGAPGSKVILFEGGPG) by adding two cysteine residues, which can undergo a proximity-induced nucleophilic substitution reaction with the N-α-chloroacetamide-conjugated TexasRed. After incubation with 10 mM reactive TexasRed probe for 1 h, the tagged maltose binding protein (MBP-ReacTR) showed bright fluorescence in the cytosol of transfected E. coli cells. 88 Barbas et al. subjected phage-displayed peptide libraries to a reaction-based selection. A 21-mer peptide (rpf1368, sequence: CHNHQKATCRRMRSRETSVKK) was identified which formed an enaminone with 1,3-diketone derivatives. The reaction rates were rather slow, with a labelling yield of 90% after 10 h. 89 Also employing phage display, Weiss et al. searched for peptides that react with hydrazides. The resulting hydrazide-reactive peptide (HyRe; sequence: HKTNHSCHKREQEHCRVTTT) was fused to the T4 lysozyme and underwent selective modification with 1 mM biotin hydrazide as well as with 1 mM rhodamine B hydrazide after 1 h reaction time in a crude cell lysate. 90 The rational design of a maleinimide-reactive α-helical peptide led to the dC10α tag (sequence: LSAAECAAREAACREAAARAGGK), in which two cysteine residues are separated by two turns of an α-helix.
The minimalistic probe molecule comprises a coumarin fluorophore and a dimaleinimide moiety that quenches the fluorescence until both maleinimide groups undergo thiol addition during the labelling reaction. Owing to its small size, the probe was able to enter HEK293T cells after incubation for 30 min at 10 mM concentration and label both C-terminally dC10α-tagged histone H2B and N-terminally tagged actin. 91 The small molecule-binding peptide-tags provide properties that allow protein labelling in complex mixtures. However, to achieve high affinity, metal complexes seem to require tags with fewer amino acid residues than small molecules do (e.g. 6 aa tag for FlAsH vs. 20 aa HyRe tag for rhodamine B hydrazide). High affinity usually provides faster labelling at lower concentrations of labelling agent and, therefore, the reported labelling times for metal-chelating tags in living cells can be as short as 5-7 min in the case of oligo-aspartate tagged (D4)3-m1AChR 78 or 20 min for tetracysteine-tagged proteins. 64 Moreover, metal-chelating tags require significantly lower amounts of labelling probe, with the lowest being 100 nM, as demonstrated for the His10/Ni(II):trisNTA pair. 72 These characteristics are advantageous, also in comparison to enzymatic labelling techniques. However, the labelling of the POI remains non-covalent and thus reversible, hindering unambiguous analyses of low-abundance proteins. The reported attempts to establish a covalent linkage between tag and probe employed moderately reactive groups to avoid background labelling. The low reactivity typically resulted in slow rates of covalent labelling, which is disadvantageous for pulse-chase-type experiments.
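The link between affinity and the probe concentration needed for labelling can be made concrete with a simple equilibrium estimate. The sketch below assumes 1:1 binding at equilibrium with the probe in large excess over the tagged protein, so that the bound fraction is [probe]/([probe] + KD); the function name is illustrative, and the KD values are those quoted in the text for three non-covalent tag/probe pairs.

```python
def fraction_bound(probe_nM, kd_nM):
    """Equilibrium fraction of tag occupied by probe (1:1 binding, probe in excess)."""
    return probe_nM / (probe_nM + kd_nM)


# Dissociation constants in nM as quoted in the text.
KD_NM = {
    "His10 / Ni(II):trisNTA": 0.1,
    "His6 / HisZiFiT-Zn(II)": 40.0,
    "LBT (18-mer) / Tb3+": 220.0,
}

if __name__ == "__main__":
    probe_concentrations_nM = (10, 100, 1000)
    for pair, kd in KD_NM.items():
        row = ", ".join(
            f"{c} nM: {fraction_bound(c, kd):.2f}" for c in probe_concentrations_nM
        )
        print(f"{pair:26s} KD = {kd:6.1f} nM -> {row}")
```

The estimate says nothing about kinetics or about label loss after washing, which depends on the dissociation rate; it only illustrates why sub-nanomolar affinities permit near-complete occupancy at 100 nM probe.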
Peptide and protein based recognition by peptide-tags - mechanisms and characteristics
One way to improve the binding affinity of a tag sequence for the labelling agent is to increase its size. This led to the notion of peptide-binding peptide-tags. Peptides can bind other peptides with high mutual affinity (see Table 3). One example of high-affinity peptide-peptide interactions is the coiled-coil motif, in which helical peptides wrap around each other to form a superhelix. In 1996, Hodges et al. designed a coiled-coil peptide de novo. 92 The dimerization domain consists of five repeating heptads of the sequences EVSALEK for the negatively charged E-coil and KVSALKE for the positively charged K-coil. Electrostatic interactions drive the high-affinity heterodimer formation (KD = 1 nM). Using the E-coil as a genetically encoded tag, the peptides were used for the purification, immobilization and detection of peptides expressed in E. coli. In 2008, Matsuzaki reported the artificial, heterodimeric coiled-coil peptides K3 (KIAALKE)3, K4 (KIAALKE)4, E3 (EIAALEK)3 and E4 (EIAALEK)4, originally developed by Hodges et al., 93 as tag/probe pairs for the live cell imaging of cell surface receptors. The stability of the E3/K3 coiled-coil is in the nanomolar range (KD ≈ 70 nM). With a KD ≈ 6 nM the E3/K4 coiled-coil provided even higher stability. This allowed the labelling of an E3-tagged prostaglandin E2 receptor EP3β subtype (EP3βR) upon incubation with only 20 nM TMR-conjugated K3 and K4 probe peptide within 5 min. 94 The method was used to investigate the oligomerization state of two membrane receptors by staining the E3-tagged receptors with a mixture of fluorophore-modified K4 peptides. 95,96 To avoid a potential loss of label during prolonged imaging periods, the coiled-coil motif was repurposed as a scaffold that instructs reactions to form a covalent bond. Ball et al.
substituted amino acids at the interface of the E3/K3 coiled-coil pair to position a dirhodium catalyst on one coil so that addition of a diazo agent induces the alkylation of a tryptophan residue in the other coil. 97 The E3gW-tagged maltose binding protein was selectively biotinylated in cell lysate by treatment with 5 µM K3a,eRh2 catalyst for 16 h in the presence of 100 µM biotin-diazo reagent (Fig. 7A). 98 Xia et al. also harnessed the concept of proximity-induced reactivity by modifying the E3-coil with a nucleophilic thiol provided by a cysteine residue within the E3-coil (CCE probe) and the K3-coil with an electrophilic α-chloroacetyl moiety (CCK probe) (Fig. 7B). Positioning of the two reactive residues at the interface of the coiled-coil promoted the ligation between probe peptide and CCE-tagged cell-surface receptors. For increased reaction rates, the labelling probe had to be used as the CCK-1-dimer, which still required more than 2 h to reach complete ligation and resulted in a 79 aa long protein label. 99 In 2015, Matsuzaki et al. presented a reduction-free, amine-reactive crosslinking reaction between modified E/K coiled-coil peptides. A carboxy sulfosuccinimidyl ester linker was added to the peptide meant to serve as the labelling agent (R3CL probe) and a lysine residue was introduced into the E-peptide (ER3 tag) (Fig. 7C). Within 20 min, significant crosslinking was observed between 150 µM R3CL probes and ER3-tagged β2-adrenergic receptors on the surface of CHO cells. 100 With the aim of finding a very fast biocompatible reaction for covalent protein labelling that adds only little mass, Seitz et al. designed a coiled-coil-promoted acyl transfer reaction, which also bypasses the ligation between tag and probe. For this purpose, an N-terminal cysteine residue was added to the E3 tag and fused to the POI. A thioester-linked fluorophore was conjugated to the N-terminus of the K3 probe.
101 The formation of the parallel coiled-coil triggered the proximity-induced transfer of the dye to the E3 tag according to a mechanism known from nucleic acid templated native chemical ligation. 102 Due to the high reactivity of the arylmercapto-linked thioesters and the high effective molarity induced by the end-of-helix arrangement of functional groups, the transfer of the fluorophore from the K3 probe to the Cys-E3-tag succeeded within 2 min reaction time at 100 nM labelling probe (Fig. 8). The approach was applied to live cell imaging studies, in which a variety of G protein-coupled receptors (including the human neuropeptide Y receptors 1, 2, 4 and 5, the human neuropeptide FF receptors 1 and 2 and the human dopamine receptor 1) were labelled at the cell surface. The modular nature of the thioester-containing K3 probe facilitates the use of different fluorophores or reporter groups, as shown for TAMRA, ATTO488, AF350 and biotin labels. 103 In this study, the authors also demonstrated that the covalent linkage between reporter dye and target protein was important for monitoring receptor internalization and recycling in living HEK293 cells. Another small and stable coiled-coil scaffold is the leucine zipper. Based on the crystal structure of a coiled-coil trimer of GCN4, in 2009 Tamamura et al. developed artificial leucine zipper peptides with high mutual affinity (ZIP tag/probe pair, KD = 18 nM) for the fluorescence imaging of ZIP-fused membrane proteins. The 23 aa probe helix is equipped with an environmentally sensitive dye (4-nitrobenzo-2-oxa-1,3-diazole, NBD); the other two tag helices are connected via a loop sequence and offer a hydrophobic pocket for the dye (A2-tag, 49 aa) (Fig. 9A). The chemokine receptor CXCR4 was genetically fused to the A2-tag and expressed in CHO cells. Incubation of the cells with 1 µM NBD-probe peptide led to a fluorescence change. 104 To increase the chemical and biological stability, Tamamura et al.
later incorporated an α-chloroacetyl group into the probe peptide that can crosslink to a cysteine residue in the loop region of the A2-tag. 105 For intracellular protein targets, an octaarginine sequence was attached to the C-terminus of the probe peptide. 106 Peptide-based recognition of genetically encoded protein tags beyond coiled-coil interactions was introduced by Huganir et al. in 2004. The labelling system relied on α-bungarotoxin (BTX, 74 aa), a small protein from snake venom. 107 Peptide sequences that bind to BTX were initially selected by phage display and modified for high-affinity binding. 108 The resulting 13 aa BTX-binding site (BBS, sequence: WRYYESSLEPYPD, KD ≈ 60 nM 109 ) was placed in the extracellular domain of the AMPA receptor for analysing the expression and membrane trafficking of the tagged receptor. At the same time, Sanes et al. used not only the BTX-binding site, but also the streptavidin-binding sequence (SBP, sequence: MDEKTTGWRGGHVVEGLAGELEQLRARLEHHPQGQREP, 110 KD ≈ 25 nM 109 ) to stain the C-terminally tagged vesicle-associated protein (VAMP2) on the surface of live cells by incubation with the respective fluorophore-protein (BTX or SA) conjugates. 109 Attempts to use the BTX-binding site or SBP for covalent labelling have not been reported and seem rather challenging given the size of BTX and SA.
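As a rough illustration of how the dissociation constants quoted in this section translate into tag occupancy, the following Python sketch evaluates a simple 1:1 equilibrium binding model. It assumes that the probe is in large excess over the tagged protein and that equilibrium is reached; neither condition is guaranteed in a live-cell experiment. The function name fraction_bound and the dictionary KD_NM are illustrative choices and not part of any cited work; the KD values are those given in the text.

```python
# Minimal sketch: equilibrium occupancy of a peptide tag for 1:1 binding.
# Assumption (not from the cited studies): probe is in large excess over
# the tagged protein, so the free probe concentration equals the total.


def fraction_bound(probe_nM: float, kd_nM: float) -> float:
    """Return the fraction of tag occupied at equilibrium: [P] / ([P] + KD)."""
    if probe_nM < 0 or kd_nM <= 0:
        raise ValueError("probe concentration must be >= 0 and KD must be > 0")
    return probe_nM / (probe_nM + kd_nM)


# Dissociation constants (nM) as quoted in the text for each tag/probe pair.
KD_NM = {
    "E3/K3": 70.0,
    "E3/K4": 6.0,
    "ZIP tag/probe": 18.0,
}

if __name__ == "__main__":
    probe = 20.0  # nM, the probe concentration reported for E3-tagged EP3βR
    for pair, kd in KD_NM.items():
        print(f"{pair}: KD = {kd:g} nM, occupancy = {fraction_bound(probe, kd):.2f}")
```

Under these assumptions, a lower KD gives a higher occupancy at a given probe concentration, consistent with the higher stability reported for the E3/K4 pair. The model describes equilibrium only and says nothing about how fast labelling proceeds.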
The development of peptide-based tag/probe pairs for protein labelling seems to be characterised by a trade-off between the specificity of a tag and its size. Split intein-mediated protein labelling circumvents this correlation because the intein excises itself after ligation of the flanking sequences (Fig. 9B). One intein fragment should be short in order to allow efficient solid-supported peptide synthesis. While such split inteins have been created artificially, they usually lack sufficient splicing performance and solubility. 111 The naturally occurring Npu DnaE intein is one of the fastest inteins reported, with a half-life of only 1 min. 112 In 2012, Camarero et al. equipped the 36 aa C-intein (Cint) of Npu DnaE with a FRET pair in order to quench the fluorescence in the unspliced state. With a commercial protein delivery system, intracellular labelling of the transcription factor Yin Yang 1 by the FRET-quenched protein trans-splicing (PTS) reaction took place within 2 h. 113 In 2015, Kwon engineered an Npu DnaE intein and synthesized a Cint with a photo-protected ester linkage for controlled activation, a FRET pair for turn-on fluorescence detection and an octaarginine as cell-penetrating peptide (Fig. 9C). 114 As a proof of concept, labelling of maltose binding protein via PTS was carried out in HeLa cells. Fluorescence was detected within the cells after 30 min incubation time. Besides the Npu DnaE intein, other fast split inteins have been described, such as the Ava DnaE intein. 115 In a recent study, histone H2B was fused to a 102 aa long Ava DnaE N-intein (Nint) and expressed in HEK293T cells. Live cell imaging was demonstrated after 1 h incubation with Npu DnaE Cint carrying different reporter groups. Npu DnaE Cint was used instead of Ava DnaE Cint due to higher synthesis yields. 116 Mootz et al. engineered an atypical naturally occurring split intein with an Nint fragment consisting of only 25 aa and a Cint of 100 aa.
117 Recently, Tampé et al. significantly improved the affinity of an Ssp DnaB M86 split intein composed of an Nint with 11 aa and a Cint with 143 aa. The Nint was C-terminally modified with trisNTA moieties and the Cint was N-terminally fused to a His6-tag, which increased the affinity up to 40-fold and allowed protein trans-splicing at nanomolar concentrations. 118 Intein technology is interesting because the tag is excised upon labelling. Yet, to retain the benefits of small peptide-tags, one short, POI-connected intein fragment has to be combined with a rather large intein fragment that brings in the labelled extein. This construct must be recombinantly expressed and conjugated with a reporter group, which may limit the application.
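To put the reported splicing half-life into perspective, the Python sketch below converts a half-life into a first-order rate constant and into the expected fraction of spliced product over time. It assumes idealised first-order kinetics for pre-associated intein fragments; the function names are hypothetical, and the only value taken from the text is the half-life of about 1 min reported for the Npu DnaE intein.

```python
import math

# Minimal sketch: first-order kinetics derived from a reported half-life.
# Assumption (idealisation): the intein fragments are already associated,
# so splicing follows a single exponential with k = ln(2) / t_half.


def rate_constant(t_half_min: float) -> float:
    """First-order rate constant (per minute) for a given half-life."""
    if t_half_min <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / t_half_min


def fraction_spliced(t_min: float, t_half_min: float) -> float:
    """Fraction of product formed after t_min minutes: 1 - exp(-k * t)."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    return 1.0 - math.exp(-rate_constant(t_half_min) * t_min)


if __name__ == "__main__":
    t_half = 1.0  # min, half-life quoted in the text for the Npu DnaE intein
    for t in (1, 2, 5, 10):
        print(f"t = {t:>2} min: fraction spliced = {fraction_spliced(t, t_half):.3f}")
```

Under this idealisation, splicing would be more than 95% complete after 5 min. The much longer incubation times of 30 min to 2 h reported for intracellular labelling suggest that other steps, such as probe delivery, may be rate-limiting in cells.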

Conclusion
The ideal protein labelling method should proceed with high tag specificity, require only small tag sizes (to avoid interference with protein function), succeed within short incubation times (to enable investigations of fast biological processes in pulse-chase-type experiments), confer stability of labelling (preferably by formation of covalent linkages to allow measurements over prolonged time scales), enable versatility as far as the choice of labelling groups is concerned and place little toxic burden on the cell or the cellular process under scrutiny. None of the tag-based methods discussed is without shortcomings and there is, at present, no silver bullet available for protein labelling. Enzymatic methods provide high specificity, small tag size and covalent labels, but with the enzymes currently used, rather high substrate concentrations and long incubation times are required. The stability and acquisition of the enzymes themselves are not always without problems and labelling of intracellular proteins remains challenging. Nevertheless, the growing number of publications in which enzymes such as BirA are used, for example to study protein-protein interactions by proximity biotinylation, demonstrates that the problems are manageable and that the method allows applications that are difficult to perform by other means. Notably short incubation times are feasible with metal ion recognition tags. Recent studies demonstrate that the placement of bright dyes such as AlexaFluor 647 in the immediate vicinity of the protein via His6–10 tags and high-affinity Ni(II):trisNTA complexes facilitates super-resolution microscopy of fixed and permeabilized cells. Ideally, the peptide-tag would be targeted by a membrane-permeable probe.
Though numerous studies demonstrate the feasibility of this approach, the available peptide-tags must be rather long to confer affinity for fluorescent dyes, and even then staining requires probe concentrations of ≥10 µM, which may lead to high background. Higher tag specificities are achievable with peptide-based affinity reagents. For example, the de novo designed E3/K3 coiled-coil offers high affinity and selectivity at a comparably small tag size (21 aa). This enables rapid labelling at low concentrations of probe. A recently developed peptide-templated acyl transfer technique combines the speed of complex formation via coiled-coil interactions with the advantages of covalent labelling. The method involves a highly reactive thioester-linked fluorophore-K3 conjugate, which rapidly reacts with a cysteine-terminated E3-coil. The covalent labelling is extremely fast (2–5 min) and only little cargo is added to the protein of interest. Yet at present, the method is restricted to N-terminal labelling of cell surface proteins.
It is obvious that the peptide-tag based labelling methods available today still do not offer the robustness and wide applicability known from autofluorescent proteins and enzyme-based tags (e.g. SNAP, CLIP, Halo). Yet, given the scientific need for less perturbing labelling methods, considering the pace of current developments in the field and taking into account that an ever-increasing number of chemists is joining the race, the authors are convinced that peptide-based tags will become game-changing enablers in the not too distant future.