Biomolecular Chemistry Site-directed spin labeling of proteins for distance measurements in vitro and in cells

Site-directed spin labeling (SDSL) in combination with electron paramagnetic resonance (EPR) spectroscopy allows studying the structure, dynamics, and interactions of proteins via distance measurements in the nanometer range. We here give an overview of available spin labels, the strategies for their introduction into proteins, and the associated potentials for protein structural studies in vitro and in the context of living cells.


Introduction
Site-directed spin labeling (SDSL) is the concept of introducing paramagnetic centers into a biomacromolecule of interest at user-defined sites. Pioneered by Hubbell and co-workers more than 25 years ago, 1,2 this technique is widely used for studying the structure and function of macromolecules by EPR spectroscopy. 3 This includes proteins, but also synthetic polymers, 4,5 nucleic acids 6 and lipids. 7 Though natural paramagnetic centers such as metal ions are present as cofactors in certain classes of proteins and can be used as probes for EPR spectroscopy, most proteins are diamagnetic and thus EPR silent. Hence, SDSL provides insights into protein structure and dynamics at sites of interest virtually without background. Four main sources of information are typically exploited: 8 label dynamics, solvent accessibility, polarity of the microenvironment, and distance distributions between two spin labels in the nanometer range, on which we will particularly focus in the following. A variety of spin labels can be used for such studies, with nitroxides being the most popular class. 9

EPR spectroscopy
SDSL enables detailed studies of a large variety of protein properties. In this review, we will give only a short overview of most of these properties and will focus on distance measurements as the main application. For a detailed discussion of all properties, we refer to the references cited in the corresponding paragraphs.
Rotational motions of spin labels, often characterized by one or several rotational correlation times, result in characteristic spectral shapes. Three contributions to the overall dynamics can be distinguished: dynamics of the spin label linker itself, dynamics of the protein's secondary structure element at the incorporation site, and rotation of the entire protein or protein complex. The spectra represent a superposition of all three dynamic processes. [10][11][12] Since the linker dynamics of the spin label can be restricted depending on the steric properties of the molecular environment, 10 changes in this environment can be monitored. These can arise from e.g. conformational changes, ligand binding, or differences in the exposure to membrane lipids or intra/extracellular environments in the case of transmembrane proteins. Spectral characteristics can also be assigned to different types of secondary structure elements. 13,14
The solvent accessibility of a labeled site can be measured via Heisenberg exchange. When exposed to paramagnetic quenchers like molecular oxygen or water-soluble nickel complexes (e.g. Ni-EDDA), Heisenberg exchange processes result in a significant increase of spin-lattice relaxation. 15 Since this interaction depends strongly on close contact with the quencher, choosing a quencher of suitable polarity can reveal specific regions of the target molecule: nonpolar molecular oxygen highlights e.g. membrane-bound areas of a protein. In contrast, polar quenchers like Ni-EDDA complexes will only interact with parts of the target molecule that are exposed to the solvent. 16
Analysis of EPR spectra reveals the polarity of the microenvironment. Via SDSL, labeling successive sites and thus mapping the local polarity along e.g. a protein chain is possible. [17][18][19][20]
Finally, interactions between two spin labels can be employed for distance measurements. 21 Additionally, the orientation of the labels with respect to each other can be determined.
EPR not only allows the determination of a mean distance, but also gives access to a fully quantitative distance distribution representing the complete conformational ensemble. The obtained distance distributions are quantitatively derived without the need for assumptions. Intermolecular distances can be measured between singly spin-labeled proteins for analyzing supramolecular aspects such as oligomerization, formation of fibrils or complex formation. SDSL of a protein at two sites can be used to determine intramolecular distance constraints for protein structure determination (Fig. 1). As different secondary structures will lead to a predictable change in the spin-spin distance, folding and unfolding processes, for instance, can be observed.
EPR distance measurements are complementary to Förster resonance energy transfer (FRET), where distances between a donor and an acceptor fluorophore can be detected when these labels are positioned at a distance close to the Förster radius. FRET allows single molecule detection. However, precise distances or distance distributions are usually not measured, and the FRET pairs have to be carefully selected based on expected distances. EPR distance measurements, on the other hand, allow for precise determination of distance constraints, distance distributions and their shape over a wide range of distances. Spin labels are significantly smaller than typical fluorophores, and have been shown not to perturb the structure of proteins in various cases. 22 Moreover, the possibility to use two identical spin labels instead of a dedicated pair of fluorescence donor and acceptor significantly simplifies labeling approaches.
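The different distance dependences of the two techniques can be illustrated numerically. The following is a minimal sketch, not taken from the cited literature: it assumes the standard Förster expression E = 1/(1 + (r/R0)^6) and the r^-3 scaling of the dipole-dipole coupling. The Förster radius of 5 nm and all function names are hypothetical choices made for illustration only.

```python
def fret_efficiency(r_nm, r0_nm):
    """Foerster transfer efficiency E = 1 / (1 + (r/R0)^6)."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def relative_dipolar_coupling(r_nm, r_ref_nm):
    """Dipole-dipole coupling relative to its value at r_ref (scales as r^-3)."""
    return (r_ref_nm / r_nm) ** 3


if __name__ == "__main__":
    R0_NM = 5.0  # hypothetical Foerster radius, for illustration only
    print("r / nm   FRET efficiency   dipolar coupling (rel. to 5 nm)")
    for r in (2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0):
        e = fret_efficiency(r, R0_NM)
        d = relative_dipolar_coupling(r, R0_NM)
        print(f"{r:5.1f}   {e:15.3f}   {d:15.3f}")
```

In this sketch the FRET efficiency saturates close to 1 or 0 once the distance departs from the assumed Förster radius, whereas the dipolar coupling changes smoothly with r^-3 over the whole range.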
For EPR distance measurements, the sample is usually shock-frozen in liquid nitrogen and the subsequent measurements are carried out at cryogenic temperature. By shock-freezing, a snapshot of the complete conformational ensemble in a solution-like environment is generated. Low temperatures facilitate distance measurements, as they result in reduced spin relaxation and slow down the reorientation of the spin-spin vector between the labels, a motion that otherwise deteriorates the signal. Current developments aim for measurements at physiological temperatures using either nitroxides [23][24][25] or triarylmethyl (trityl or TAM) spin labels. 26 By EPR, a large range of distances between around 1.2 and 10 nm can be measured. Suitable methods have to be chosen carefully depending on the occurring distances and the choice of spin labels, as each method has limiting factors. 27 Techniques for distance measurements are e.g. double-quantum coherence (DQC) 28,29 or T1 relaxation measurements, 30-32 but the most commonly applied method in the field is double electron electron resonance (DEER), also called pulsed electron-electron double resonance (PELDOR). [33][34][35][36][37][38] When the distance between spin labels is less than approximately 2.5 nm, the resulting dipole-dipole coupling significantly broadens the EPR spectrum, which can be directly resolved. For distances smaller than approximately 1.2 nm, a lower limit for precise distance determination is reached because of large exchange contributions. For distances larger than 2.5 nm, the dipole-dipole interaction cannot be directly resolved in the EPR spectrum and thus has to be separated from all other contributions. 39 Therefore, pulsed EPR methods are applied, which have been extensively reviewed. 40 For a summary of distance determination methods and feasible ranges, we refer to recent literature. 39
DEER allows distance determinations of usually up to 8 nm; in perdeuterated samples with an improved transverse relaxation time T2, more than 10 nm are feasible. 41,42 Initially started as an X-band technique, Q-band measurements with increased sensitivity are on the rise. 43 The experimental raw data show a distance-dependent modulation, but also contain a background contribution that can be determined separately. Data analysis after removal of the background leads to the desired distance distributions. A key advantage of DEER measurements is that they can be carried out virtually background-free even in complex environments as encountered e.g. in membranes or in the cytoplasm of cells.
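To make the relation between distance and DEER modulation concrete, the following sketch computes the dipolar frequency of two electron spins and a simulated DEER trace for a single distance. It is an illustration under stated assumptions rather than the analysis procedure of the cited works: both spins are taken to have the free-electron g value, the orientations are averaged over an isotropic powder, the signal is modeled as the product of an intramolecular form factor and an exponential background, and the modulation depth and background decay rate are hypothetical parameters.

```python
import math

# Physical constants (SI units, CODATA 2018 values)
MU_0 = 1.25663706212e-6   # vacuum permeability, N A^-2
MU_B = 9.2740100783e-24   # Bohr magneton, J T^-1
H = 6.62607015e-34        # Planck constant, J s
G_E = 2.0023              # free-electron g value (assumed for both labels)


def dipolar_frequency_mhz(r_nm, g1=G_E, g2=G_E):
    """Dipolar frequency nu_dd = mu0 muB^2 g1 g2 / (4 pi h r^3), in MHz."""
    r_m = r_nm * 1e-9
    nu_hz = MU_0 * MU_B ** 2 * g1 * g2 / (4.0 * math.pi * H * r_m ** 3)
    return nu_hz / 1e6


def deer_form_factor(t_us, r_nm, depth=0.3, n_theta=500):
    """Powder-averaged form factor for one distance.

    depth is a hypothetical modulation depth; the average runs over
    cos(theta) in [0, 1] using the midpoint rule.
    """
    nu = dipolar_frequency_mhz(r_nm)
    acc = 0.0
    for i in range(n_theta):
        x = (i + 0.5) / n_theta  # x = cos(theta)
        acc += math.cos(2.0 * math.pi * nu * (1.0 - 3.0 * x * x) * t_us)
    kernel = acc / n_theta
    return 1.0 - depth * (1.0 - kernel)


def deer_trace(t_us, r_nm, depth=0.3, k_per_us=0.1):
    """Form factor multiplied by an assumed exponential background."""
    background = math.exp(-k_per_us * abs(t_us))
    return deer_form_factor(t_us, r_nm, depth) * background


if __name__ == "__main__":
    for r in (2.0, 4.0, 8.0):
        print(f"r = {r:.1f} nm -> nu_dd = {dipolar_frequency_mhz(r):.3f} MHz")
    for t in (0.0, 0.5, 1.0, 2.0):
        print(f"t = {t:.1f} us -> V(t) = {deer_trace(t, 4.0):.4f}")
```

Because the frequency falls off with r^-3, doubling the distance slows the modulation eightfold, which is why long distances require long observation windows and thus slow transverse relaxation.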

Site directed spin labeling
A variety of protein classes, such as flavoproteins and photosystems, are intrinsically paramagnetic and can thus be directly monitored via EPR techniques. 44,45 However, most proteins are intrinsically diamagnetic and become accessible for DEER distance measurements only by SDSL. The two central aspects of SDSL are the chemical and spectroscopic properties of the spin label itself, and the strategy used for its introduction into the protein under study. A large number of requirements apply to both aspects, and currently there is no general strategy that satisfactorily fulfills all of them. This makes it necessary to customize the SDSL strategy to the specific demands of the envisaged experiment. [46][47][48] A critical property of spin labels is the conformational flexibility of both the label scaffold itself and of the linker between this scaffold and the protein backbone. This contributes to the overall dynamics of the paramagnetic center, which complicates the analysis of protein dynamics.
A second important property of spin labels is the chemical stability of the label itself and of the linker, most importantly in the context of reducing and biological environments such as the cytoplasm of living cells. Therefore, nitroxide radicals (that are typically prone to e.g. reduction) are kinetically stabilized by steric shielding via bulky, quaternary carbon centres in α-position to the nitrogen atom with four methyl, ethyl or even larger alkyl substituents. However, larger spin labels expectedly have an increased potential to perturb the structure of the labeled protein. Spin labels based on chelated metal cations feature a high stability, but are often both bulky and charged. [49][50][51] Besides these chemical and spectroscopic properties of the spin label, the strategies for its introduction into the target protein strongly differ in the scope of labels and target proteins that can be used, the applicability for in vitro or in cell studies, and last but not least the technical ease as well as labeling efficiency and selectivity.
For example, peptide synthesis (Fig. 2A) offers a wide scope of labels that can be introduced, including labels with excellent spectroscopic properties that cannot be introduced by other methods. 46,47 However, the scope of applicable proteins is limited by the inherent difficulties of solid phase synthesis in the production of large proteins, and by the lack of natural folding processes and posttranslational modifications that may be required for natural function. Ligation of peptides to expressed proteins (Fig. 2B) 52 offers access to larger proteins that can be at least in part naturally folded and modified, but still has limitations with respect to the possible incorporation sites and technical ease.
Labeling of expressed proteins by chemoselective conjugation reactions at specific amino acids (Fig. 2C) offers the modification of fully natural, endogenous proteins with great technical ease. [46][47][48] On the other hand, the scope of applicable spin labels for chemoselective labeling is reduced compared to approaches based on peptide synthesis. In particular, paramagnetic centers introduced via chemoselective labeling have a minimum flexibility that is dictated by the side chain of the targeted amino acid. Moreover, labeling canonical amino acids (most importantly cysteines) requires mutations to introduce cysteines at user-defined sites and is thus not applicable to proteins that rely on cysteines for their natural function, e.g. as part of catalytic centers or of disulfide bridges for structural stability. Targeting amino acids other than cysteine (such as lysine) suffers from analogous limitations. One solution to this problem is the use of peptide tags (such as His-tags) for noncovalent introduction of spin labels via metal chelation (Fig. 2D), 53,54 but at the cost of introducing comparably large changes into the protein, and with limitations with respect to the applicable sites.
The incorporation of noncanonical amino acids (ncAA) with unique reactivity by nonsense suppression, i.e. by translation with an expanded genetic code, 55,56 offers chemoselective labeling of endogenous proteins irrespective of the presence of cysteines, while introducing only minimal changes into the protein (Fig. 2E). 57,58 Moreover, spin labels can be directly genetically encoded as ncAA during translation in vivo [59][60][61] (and using microinjection with chemically aminoacylated tRNAs with lower protein yields in vitro 62,63), without the need for conjugation reactions at the ncAA (Fig. 2F). Besides technical ease, a direct genetic encoding in vivo conceptually offers a high potential for intracellular studies, since proteins are biosynthesized directly in the cell, whereas chemical labeling procedures require the transformation or microinjection of labeled proteins into cells. The latter is not expected to lead to a state that resembles that of endogenous proteins, which are e.g. processed and transported in and between specific subcellular compartments.

Nitroxide spin labels
The most commonly employed spin labels are nitroxides 9 that exist in various designs and differ in overall ring structure, substituents in α-position to the nitrogen atom, and linkers (Fig. 3A shows a general structure that covers most nitroxides currently in use). Besides the mobility of the paramagnetic center, in particular the chemical stability of nitroxides differs widely, and is controlled by ring size and α-substituents as two major factors. Here, nitroxides with five-membered rings tend to be more stable than ones with six-membered rings. Moreover, increased steric shielding by larger alkyl α-substituents (e.g. ethyl, propyl) can contribute to stability. [64][65][66][67][68][69][70] Among the large variety of nitroxides that can be introduced into synthetic peptides by direct solid phase synthesis or postsynthetic labeling, the most widely used one is the 4-amino-1-oxyl-2,2,6,6-tetramethyl-piperidine-4-carboxylic acid (TOAC 1, Fig. 3B) spin label. 71,72 This label is exceptionally rigid and allows flexibility of the paramagnetic center only by flipping of the piperidine moiety. However, this rigidity also bears the potential of perturbing peptide secondary structures.
By far the most popular nitroxide for SDSL of proteins by conjugation to canonical amino acids is the sulfhydryl-reactive methanethiosulfonate spin label 3 (MTSSL, Fig. 3C). 74 MTSSL readily reacts with accessible cysteine residues in proteins under formation of the side chain R1 (Fig. 3C). The conformational properties of R1 are well characterized and the label has been used in a vast variety of proteins. 75,76 Moreover, the 1-oxyl-2,2,5,5-tetramethyl-pyrroline moiety of MTSSL is comparably stable in reducing/biological environments. 64 However, since R1 bears a disulfide linkage that is by itself redox-sensitive, alternative labels with stable linkers for cysteine labeling have been introduced, among them the maleimides and iodoacetamides 4 and 5 (Fig. 3C). Compared to TOAC and TOPP, all cysteine-reactive nitroxides result in side chains with increased flexibility. However, an interesting concept to improve the rigidity of such labels is the use of double sulfhydryl-reactive labels that can be conjugated to two cysteines in spatial vicinity. 77 Here, the side chain RX formed by reaction with 6 (Fig. 3C) has been introduced into different secondary structural elements of T4 lysozyme and exhibited reduced flexibility compared to R1.
To overcome inherent limitations of cysteine labeling, ncAA with reactive side chains for bioorthogonal conjugation reactions can be incorporated into proteins in vivo by nonsense codon suppression using orthogonal tRNA/aminoacyl-tRNA-synthetase (aaRS) pairs. 55,56 The first study in this direction employed an evolved tyrosyl-tRNA-synthetase from Methanocaldococcus jannaschii and the genetically encoded ketone-bearing ncAA p-acetyl-L-phenylalanine (p-AcF). This ncAA was reacted with the aminooxy-functionalized nitroxide 7 via oxime formation in the context of T4 lysozyme. 57 This resulted in side chain K1 (Fig. 3D) that exhibited useful spectroscopic properties, though with increased flexibility compared to R1.
The requirement for low pH or for the presence of a high concentration of catalyst for oxime formation and the moderate kinetics of the reaction, however, make it desirable to employ faster and more bioorthogonal chemistries such as the copper-free click reactions available for genetically encoded ncAA. 56,78 Indeed, a proof of principle has been demonstrated for the conjugation of the cyclooctyne-bearing nitroxide 8 that was reacted with the aryl azide ncAA p-azido-L-phenylalanine (p-AzF) in T4 lysozyme, resulting in side chain T1. 60 However, the very large size of T1 is suggestive of a high flexibility, and the spectroscopic properties of T1 have not yet been thoroughly studied. Finally, the nitroxide ncAA 9 (SLK-1, Fig. 4) has been genetically encoded using an evolved Methanosarcina mazei pyrrolysyl-tRNA-synthetase mutant, 79 allowing for the cotranslational incorporation of nitroxides directly in E. coli cells. 59,61 SLK-1 was introduced into multiple sites of GFP and thioredoxin and exhibited a flexibility similar to that of MTSSL in simulations and in DEER distance measurements. 60

Spin labels based on metal cations
A major drawback of nitroxide-based spin labels is their limited lifetime in intracellular environments, 60,67 which limits their use for in-cell EPR structural studies on proteins (see below). A promising approach to circumvent this limitation involves the application of reduction-stable, paramagnetic metal cations and their attachment to proteins using conjugation or complexation strategies. Lanthanide tags based on the complexation of Gd³⁺ ions have proven highly useful for EPR distance measurements on proteins in vitro, 49 but also in intracellular environments. 50,51,80 A prerequisite for the application of paramagnetic cations in SDSL is a suitable complexation approach. Since free Gd³⁺ ions decrease data quality in DEER experiments, 81 high affinity chelating agents are necessary to stably complex the Gd³⁺ ion. Affinity is largely controlled by the denticity of the complex, imposing the necessity of rather bulky chelating ligands. This increases the overall size of the label and thus the potential to perturb the structure of the protein under study. A variety of Gd³⁺ spin labels have been developed and have been shown to be applicable for protein SDSL when introduced via sulfhydryl-directed conjugation chemistries (Fig. 4).
The first report of Gd³⁺-labeled proteins in DEER measurements was based on pre-activating cysteine residues with Ellman's reagent (5,5′-dithiobis-(2-nitrobenzoic acid)) and subsequently conjugating 4-mercaptomethyldipicolinic acid 10 (4MMDPA) followed by Gd³⁺ complexation (Fig. 4). 82 To avoid titration with Gd³⁺ ions after labeling and thus to reduce free Gd³⁺ species, phenylethyl-derived Gd³⁺-DOTA 11 (1,4,7,10-tetraazacyclododecane-1,4,7,10-tetraacetic acid) pre-complexes were also applied in SDSL approaches (Fig. 4). [83][84][85] The bulkiness of this complex sterically reduces the mobility of the paramagnetic Gd³⁺ center; assuming that the protein structure remains unperturbed by the size of the complex, the steric restriction may provide a narrower distance distribution. 85 The activated disulfide-bond reagent 13 (C1-tag, Fig. 4) was applied for labeling unique cysteine residues, and DEER measurements were performed at W-band to investigate the chaperone ERp29. 83 Further improving the spectroscopic properties of these labels, a more rigid linker was then established and employed in the cysteine-reactive label 14 (C9-tag, Fig. 4) to resolve rather narrow distance distributions in DEER experiments with high resolution. 85 Gd³⁺-DOTA complexes 15 have also been introduced into proteins via methanethiosulfonate 84,86 and maleimide 51 moieties 16 and 17 for cysteine conjugation, the latter strategy offering redox-stable linkages for in cell EPR measurements. 51 Moreover, the sulfhydryl-reactive vinyl-derived Gd³⁺ complex 18 has been shown to be applicable for in-cell distance measurements. 50
Similar to nitroxide spin labeling, the range of conjugation chemistries has also been extended for metal cation-based labeling, from targeting cysteine to targeting ncAA with alternative bioorthogonal chemistries. By employing an evolved orthogonal tyrosyl-tRNA synthetase from M. jannaschii for nonsense codon suppression in vivo, the ncAA p-AzF was site-specifically incorporated into various sites of the E. coli aspartate/glutamate binding protein. 87 Labeling with the alkyne-modified label 12 (C3-tag, Fig. 4) by copper-catalyzed click chemistry resulted in a conformational fixation of the complex via the participation of the formed triazole ring in Gd³⁺ complexation, facilitating theoretical predictions due to the rigidity of the complex. 88 Though being potentially interesting for in cell applications, this approach suffers from chemical conversion of the azide moiety inside cells and could thus only be applied using a cell-free protein synthesis approach. 89 Until today, the labeling of proteins using Gd³⁺ tags could not be demonstrated inside cells, making invasive microinjection or transfection protocols necessary for intracellular studies.
Besides Gd³⁺, the transition metals Cu²⁺ and Mn²⁺ have been shown to be interesting paramagnetic centers in protein SDSL EPR experiments. In a first study, Mn²⁺-EDTA methanethiosulfonate tags 19 were covalently attached to unique cysteine residues on the death domain of a neurotrophin receptor, and subjected to Mn²⁺-Mn²⁺ distance measurements (Fig. 5A). 90 To overcome limitations concerning the stability of the formed disulfide bonds, the Mn²⁺ binding ligands 20-22 based on a PEDTA chelator (N-(pyrid-2-yl-methyl)ethylenediamine-N,N′,N′-triacetic acid) were developed to introduce the Mn²⁺ label via C-S bond formation into proteins (L1, L2 and L3, Fig. 5A). 91 Finally, Mn²⁺ ions have been introduced site-specifically into proteins via the genetically encoded ncAA 8-hydroxyquinoline alanine that was co-translationally incorporated into proteins in vivo by nonsense suppression (Fig. 5B). 92,93 Labeling proteins with Cu²⁺ ions has been achieved by EDTA and TETAC complexation (Fig. 5A and C) 94 or, recently, by exploiting a rigid, double-histidine binding motif in combination with iminodiacetate to resolve distances with remarkable resolution (Fig. 5D). 54 These transition metal labeling strategies can also be applied in combination with nitroxide spin labels to measure distance distributions. 23

Trityl-based spin-labels
Since EPR distance measurements usually have to be performed at cryogenic temperatures, trityl radical spin-labels were investigated in SDSL and distance measurements at physiologically relevant temperatures to overcome this limitation. Measurements at ambient temperature can potentially resolve conformational sub-states, which might not be accessible following traditional measurements in frozen solution. Especially tetrathiatriarylmethyl-based spin-labels feature beneficial spectroscopic properties, showing one narrow signal with a line width of 90 mG (ref. 95) and a prolonged transverse relaxation time in the microsecond range, making them suitable candidates for measurements at ambient temperature. 96 Interestingly, increased stabilities in human blood were reported, featuring half-lives of up to 24 h. 97 Besides their applicability as oxygen sensors, [97][98][99] trityl-based spin labels have been applied in distance measurements of polymers 100 and nucleic acids (at 37°C). 101 Recently, the first report of distance measurements using a doubly TAM-labeled protein demonstrated that protein distance measurements in liquid solution can be feasible. A cysteine-reactive TAM reagent (Fig. 6) was applied to label the rigid C helix of T4 lysozyme, which was immobilized and subjected to double quantum coherence (DQC) distance measurements at 4°C. 26 Although a sharp distance distribution at 2 nm could be resolved, further improvements to increase the Tm of the TAM label are required to expand the range of accessible distances beyond 2 nm.
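The link between the phase memory time and the accessible distance range can be estimated with a short calculation. The sketch below is a rough illustration, not a criterion stated in the cited works: it assumes that a distance is resolvable only if at least one full dipolar oscillation fits into the observation window, and it uses the dipolar frequency of about 52.04 MHz nm³ / r³ that holds for two spins with the free-electron g value. The function names and the one-period criterion are assumptions.

```python
# Dipolar frequency constant for two g = 2.0023 spins, in MHz * nm^3
DIPOLAR_CONSTANT_MHZ_NM3 = 52.04


def required_window_us(r_nm, periods=1.0):
    """Observation time (in microseconds) needed to see `periods` oscillations."""
    nu_mhz = DIPOLAR_CONSTANT_MHZ_NM3 / r_nm ** 3
    return periods / nu_mhz


def max_distance_nm(t_max_us, periods=1.0):
    """Longest distance whose oscillation fits `periods` times into t_max."""
    return (DIPOLAR_CONSTANT_MHZ_NM3 * t_max_us / periods) ** (1.0 / 3.0)


if __name__ == "__main__":
    for r in (2.0, 4.0, 8.0):
        print(f"r = {r:.1f} nm needs about {required_window_us(r):.2f} us")
    for t in (0.2, 2.0, 10.0):
        print(f"t_max = {t:4.1f} us reaches about {max_distance_nm(t):.1f} nm")
```

Under this assumed criterion the required window grows with the cube of the distance, so a modest gain in distance range demands a disproportionately longer usable evolution time.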

In cell SDSL EPR studies of proteins
In cell EPR and in particular in cell DEER distance measurements in combination with SDSL can provide insights into the structural dynamics of proteins in living cells as the basis of their physiological function. This field has so far been approached using both nitroxide and metal cation-based spin labels, and using different labeling strategies. First studies relied on the technically demanding transfection of model cells (e.g. Xenopus oocytes) with organic macromolecules that had previously been chemically spin-labeled in vitro. [102][103][104] Apart from the considerable experimental effort of such experiments, Xenopus oocytes represent an artificial environment for all non-Xenopus proteins, i.e. without natural processing of the target protein by folding, transport, post-translational modifications and degradation.
A bacterial outer membrane protein was spin-labeled via traditional MTSSL labeling of unique cysteine residues in live E. coli cells, and the protein's conformational flexibility upon ligand binding was studied. 105,106 Here, the difficulties concerning the reduction of nitroxide radicals (and disulfide bridges) in cells were circumvented by studying proteins on the cell surface rather than in the cytoplasm. In contrast, the nitroxide ncAA 9 (SLK-1) was used for the intracellular biosynthesis of E. coli thioredoxin in the cytoplasm of live E. coli cells by translation with an expanded genetic code. However, the experiment was limited to the selective detection of the protein in live E. coli cells and no in cell DEER distance measurements were demonstrated. The limited stability of nitroxides in cells has led to strategies using redox-stable Gd³⁺ labels for in cell DEER distance measurements. This approach has led to successful in cell distance measurements in proteins in several cases, highlighting the superior stability of the employed Gd³⁺ complexes in the tested cell types. However, these labels can so far only be introduced chemically into proteins and thus require microinjection or transfection procedures to deliver the labeled protein into living cells. 50,51 This does not allow for studying proteins in their truly natural state, that is, after natural translation, folding, transport, modification and degradation.

Conclusions and outlook
Though a large variety of spin labels and strategies for their introduction into proteins are now available for SDSL EPR studies, there is still no ideal approach for all applications. Instead, available approaches have unique profiles with specific strengths and weaknesses with respect to label size, flexibility, stability, and the potential to study proteins in their natural biological environments. One of the most exciting perspectives of SDSL EPR is the emerging field of in cell EPR. Here, the use of stable Gd³⁺ labels has recently enabled important breakthroughs, 50,51,107 and is expected to greatly advance the significance of EPR for cell biology. However, the current requirement to deliver chemically labeled proteins into cells does not fully meet the ultimate goal of in cell EPR, that is, studying endogenous, naturally translated and processed proteins directly in their natural host cells. The perspective of studying membrane proteins by chemical spin labeling, though limited to the cell surface, does offer this potential, and could deliver insights into biological processes based on proteins that are otherwise difficult to study due to size and crystallization properties. Moreover, the use of genetically encoded ncAA offers advancements for bioorthogonal cell surface (and even cytoplasmic) labeling by rapid click chemistries, using either nitroxide or Gd³⁺ labels. Alternatively, the direct genetic encoding of stable spin-labeled ncAA offers the potential for further simplifications and may allow the investigation of proteins directly in their natural cellular environment with minimal perturbation. These combined developments have the potential to redefine the role of EPR structural studies for cell biology by providing precise insights into the physiological function of proteins that are otherwise difficult to access.