The synthesis and application of a diazirine-modified uridine analogue for investigating RNA–protein interactions †

The roles played by many ncRNAs remain largely unknown. Similarly, relatively little is known about the RNA binding proteins involved in processing ncRNA. Identification of new RNA/RNA binding protein (RBP) interactions may pave the way to a better understanding of the complex events occurring within cells during gene expression and ncRNA biogenesis. The development of chemical tools for the isolation of RBPs is of paramount importance. In this context, we report on the synthesis of the uridine phosphoramidite U Dz that bears a diazirine moiety on the nucleobase. RNA probes containing U Dz units were irradiated in the presence of single-stranded DNA binding protein (SSB), which is also known to bind ssRNAs, and shown to efficiently (15% yield) and selectively cross-link to the protein. The corresponding diazirine-modified uridine triphosphate U Dz TP was synthesized and its capacity to act as a substrate for the T7 RNA polymerase was tested in transcription assays. U Dz TP was accepted with a maximum yield of 38% for a 26mer RNA containing a single incorporation and 28% yield for triple consecutive incorporations. Thus, this uridine analogue represents a convenient biochemical tool for the identification of RNA binding proteins and for unraveling the role and function played by ncRNAs.


Introduction
2][3] Examples of ncRNAs include not only the classical transfer RNA (tRNA) 4 and ribosomal RNA (rRNA) 5 involved in translation, but also small interfering RNA (siRNA), 6 micro RNA (miRNA), 6,7 small nucleolar RNA (snoRNA), 8 small nuclear RNA (snRNA) 8 and numerous others that do not fit into a distinct category. Each of these classes of ncRNAs fulfils specific functions in all three domains of life (i.e. Eukarya, Bacteria, and Archaea) during gene expression, early development, 9 and cell maintenance, with an ever-growing list of diseases [10][11][12][13] related to unusual ncRNA behavior hinting at a crucial involvement of these RNAs in correct cellular function. 14 Difficulties in identifying the function of these ncRNAs are often connected to complications in differentiating mRNAs from ncRNAs in bioinformatic approaches, 15 alongside problems generating enough material with sufficient separation to distinguish unique ncRNAs in experimental procedures. 16 RNA pathways are extensively regulated by proteins called RNA binding proteins (RBPs) [17][18][19] and vice versa, 20 and together RNA and RBPs function as ribonucleoprotein (RNP) complexes, the best known of which are the spliceosome 21 and ribosome. 5 The inherent association of RNA with proteins has led to the development of experimental approaches for identifying ncRNA and RBP function based on the interactions between the two groups of biomacromolecules. Computational analysis, [22][23][24] X-ray crystallography [25][26][27] and cross-linking [28][29][30][31][32][33][34] methods of isolation have all been employed to investigate RNA/RBP interactions. One advantage of cross-linking is that no prior knowledge of an interaction between a protein and RNA is required. It may also be used to identify networks of RNA and proteins involved in forming RNP complexes.
Cross-linking between oligonucleotides and proteins may be achieved by the addition of external cross-linking reagents, [35][36][37] or by inducing the cross-link photochemically with UV light. 38,39 The advantage of photochemical cross-linking is that the natural binding is preserved, whereas it may be disrupted by the addition of external reagents. Unfortunately, the wavelengths of UV light required to induce cross-linking between nucleic acids and proteins (~254 nm) are damaging to the molecules and thus have limited applicability in vivo. 40 The introduction of photoreactive groups into nucleic acids allows for the modulation of the wavelength of UV light needed to achieve reasonable yields of cross-linking and minimizes the potential damage caused to biomolecules. 41 Such modifications have included halides, 28,42 inclusion of sulfur atoms, [43][44][45] and the addition of more sophisticated moieties such as azides, 46,47 and benzophenones. 4856 Particularly, π-stabilized carbenes, 57,58 such as those generated by exposure of 3-trifluoromethyl-3-phenyldiazirine 54,59 to 365 nm irradiation, have been used in a wide range of applications. 56,60 Examples include: cross-linking of DNA-protein interactions; [49][50][51][52]61,62 photoactive amino acids and peptides for substrate identification; [63][64][65] and diazirine-modified sugars to capture glycoprotein binding events. 66 Further to this, a recent study of commonly used crosslinkers has identified the trifluoromethyldiazirine moiety as superior to benzophenone and phenylazide derivatives, as it produces simple and accurate cross-linking profiles of amyloid nanostructures. 67 The use of diazirines to identify DNA binding protein interactions hints at the possible application of this chemistry to identify RNA binding proteins. Although 3-trifluoromethyl-3-phenyldiazirine has been used to cross-link RNA previously, 68 it involved coupling the photoactive moiety to a 2-thiocytidine residue already incorporated into a tRNA anticodon loop before cross-linking it to the ribosome. This strategy has shortcomings as it can lead to poor efficiencies of cross-linking, and the external addition requires accessibility to the modified base before cross-linking can be achieved, thus limiting its application. Moreover, modified RNA triphosphates (N*TPs) have been previously synthesized as photoactive substrates for polymerases, although the modifications have generally been made on the DNA level, rather than RNA. 69 Photo-labelling of viral proteins during transcription with 5-azidouridine triphosphate 70 has been described, but the azido functionality is incompatible with phosphoramidite chemistry, limiting its applicability. In addition, N*TPs have efficiently served in in vitro selection experiments for the generation of modified aptamers and ribozymes, 71 and this methodology could be exploited to detect previously unknown interactions between RNA and proteins and polymerases.
Here, we report on the synthesis of the RNA building block U Dz bearing a diazirine moiety on the nucleobase (Fig. 1) with the intent of developing a new chemical tool for the investigation of the interaction of RNAs with RBPs.

Design of the probe
We postulated that by incorporating the diazirine moiety within an RNA probe, through attachment at the C5-position of the uridine nucleobase, the photoactive RNA probe would be compatible with a wider range of applications. The choice of the location of the diazirine photoreactive group was dictated by the fact that appending modifications at the C5-position of pyrimidines minimizes structural perturbation of duplexes. 72 In addition, the photoreactive group was connected to the nucleobase via a pentynol linker arm that was designed so as to minimize undesired self-binding to the RNA probe. Therefore, a diazirine-containing uridine analogue (U Dz) was designed to be incorporated into RNA probes using standard phosphoramidite chemistry (Fig. 1). The DNA analogue dU Dz has been previously synthesized by Carell et al. and incorporated into DNA strands opposite a UV-induced pyrimidine lesion, and was able to selectively cross-link the DNA lesion repair enzyme DNA photolyase in the presence of other proteins. 48,62

Synthesis of the phosphoramidite U Dz and nucleoside triphosphate U Dz TP
The synthesis of phosphoramidite U Dz followed the synthetic pathway developed for the corresponding DNA building block. 62 Briefly, diazirine 1 (ref. 54, 73 and 74) was coupled to 4-pentyn-1-ol in the presence of NaH, before Sonogashira coupling to the protected nucleoside 3 (which was prepared via standard DMT protection of 5-iodouridine) to generate 4, a common precursor to both U Dz and U Dz TP (Scheme 1 and ESI †). Indeed, in order to generate phosphoramidite U Dz, 4 was selectively protected at the 2′-OH with TBDMS before phosphitylation under standard conditions to yield U Dz (56% over 2 steps; Scheme 2).
Moreover, U Dz TP could be accessed by first bis-acetylating compound 4 (71%) and then removing the DMT protecting group under acidic conditions (Scheme 3). The suitably protected intermediate 7 was then converted to the corresponding triphosphate U Dz TP by application of the Ludwig-Eckstein protocol 75,76 and was obtained in 21% yield after a thorough semipreparative RP-HPLC purification. 76
This journal is © The Royal Society of Chemistry 2014

Oligonucleotide synthesis and thermal stability
The 5′-biotinylated 21mer RNA probes were designed for capture on streptavidin magnetic beads after cross-linking and contained single U Dz units at different locations within the sequence (ONs 2-4 in Table 1 and ESI †).
Moreover, the sequence composition was chosen so as to exclude the formation of dimers and tertiary structures. UV thermal melting measurements for each sequence versus both the RNA and DNA complements were recorded to assess whether the modification caused any disturbance in the duplex stability (Fig. 2). The impact of the modification on the duplex stability with both the complementary DNA and RNA strands seems to be negligible (ΔTm in the range of −1.4 to +0.5 °C per modification; Table 2). A similar behavior has been observed for other C5-modified uridine analogues, [77][78][79] consistent with modifications located at the C5 position of pyrimidines being well accommodated in the major groove and not interfering with duplex formation and stability. 80,81

Cross-linking experiments
Initial irradiation experiments with the modified oligonucleotides were performed in the presence of the single-stranded DNA binding protein (SSB). 82 Although SSB is a DNA binding protein with a 10-fold lower affinity for ssRNA compared to ssDNA, 83,84 it was deemed sufficient for a "proof of concept" experiment. ON1-4 were 3′-end 32P-radiolabelled with T4 RNA ligase for visualization on the gel and to avoid photobleaching of potential fluorescent tags. 85 The probes and negative controls were irradiated (λ = 375 nm for 30 minutes, in pH 7.8 buffer) in the presence of a slight excess of the SSB protein, and eluted on a 10% tricine SDS PAGE, 86 followed by phosphorimaging (Fig. 3). The irradiation conditions were chosen so that the diazirine would photolyze into reactive carbenes, which in turn can insert irreversibly into nearby C–H bonds of the target protein. 48,62 The appearance of a slower running radioactive band (highlighted in red) in Lane 4 of each gel is consistent with cross-linking of the modified RNA after irradiation. Indeed, this band was not observed in the natural control (Lane 5) or in the absence of irradiation (Lane 2). Moreover, the appearance of a faint, faster running band in Lane 2 (highlighted in green) indicates that a higher molecular weight species than the free RNA is formed under irradiation conditions, in the absence of the protein.
We hypothesize that interstrand cross-linking might occur by insertion of the carbene generated under irradiation conditions into CH bonds of an adjacent RNA strand, although care was taken when designing the oligonucleotide sequences to ensure no hairpin formation or dimerization was likely.The disappearance of this band in the presence of the protein underscores the high selectivity and effectiveness of the diazirine photoreactive group incorporated in the RNA sequences.
As a second conrmation of the presence of RNA crosslinked to the protein, samples of ON1 and ON4 were irradiated in the presence of SSB, before purication with a size exclusion spin column (30 kDa cut off), then 3 0 -end 32 Pradiolabelled.The resulting samples were eluted on 10% tricine SDS PAGE, alongside the ON4 sample prepared in the rst experiment (see ESI Fig. S45 †).The radioactive band corresponding to the RNA-SSB crosslinks in both samples with ON4 eluted at the same rate, and the selective nature of radiolabelling using an RNA ligase conrms the presence of RNA in the modied samples, with no such corresponding band in the natural control, or in the absence of irradiation.
The yields for cross-linking to SSB obtained by quantification of the respective bands can be found in Table 3. The yield of cross-linking for each modified strand of RNA was consistently around 15%, which compares favorably to the cross-linking yields obtained for DNA probes equipped with diazirine groups and DNA photolyases. 62 The main side reaction competing with the carbene C–H bond insertion in this experiment is the insertion of H2O. The insertion of H2O is unavoidable in the unbound RNA; thus, only the probes in contact with the protein at the moment of excitation are cross-linked. The advantage of the carbene being quenched by small molecules is that the resulting RNA will not interfere with data obtained for cross-linking to macromolecules, as it will be resolved easily under electrophoretic conditions, giving specific cross-linking profiles for a mixture of components. Isolation with streptavidin beads was never achieved. Perhaps the bulk of the protein prevented the biotin from interacting with the beads.
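The band quantification behind these yields can be illustrated with a short calculation. This is only a sketch: the text does not state the exact formula, so the yield is assumed here to be the cross-linked band's share of the total lane signal, and the intensities are hypothetical placeholders rather than measured values.

```python
def crosslink_yield(crosslinked: float, free: float) -> float:
    """Percentage of the total lane signal found in the cross-linked band.

    Assumption: yield = cross-linked / (cross-linked + free RNA) x 100.
    """
    total = crosslinked + free
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * crosslinked / total


# Hypothetical background-corrected phosphorimager counts (arbitrary units).
example = crosslink_yield(crosslinked=1500.0, free=8500.0)
print(f"cross-linking yield: {example:.1f}%")
```

With this definition, small-molecule-quenched RNA counts towards the free band, which matches the observation that it is resolved from the cross-linked species on the gel.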

Transcription assays
8][89] In the context of ribonucleoside triphosphates (N*TPs), the modifications are usually anchored at the 2′-OH of the sugar moiety and only rarely at the level of the nucleobase. 89 Thus, U Dz TP was synthesized and used in transcription assays in order to assess the compatibility of this heavily modified uridine analogue with an RNA polymerase. During transcription, RNA polymerases use a double-stranded DNA template to generate single-stranded RNA transcripts. E. coli T7 RNA polymerase is commonly used in molecular biology 90,91 and has been used to incorporate N*TPs including 2′-fluoro- and 2′-amino-NTPs; 92 unnatural base pairs; 93,94 stabilized RNA analogues; 95 photoactive monomers such as 8-azido ATP, 96 5-bromo and 5-iodo UTP; 28,97 and other 5′-modified UTP analogues. 98,99 Matsuda et al. used a 43 base-pair dsDNA template containing a minimal T7 promoter followed by the sequence of the transcripts to generate 26mer products containing 4′-thioRNA nucleotides. 95 This dsDNA template was the basis for the design of the system used to test the incorporation of U Dz TP into RNA oligonucleotides (Fig. 4). Indeed, the templates generate 26mer RNA transcripts containing a single (Product P1) or triple consecutive (Product P2) incorporation of U Dz nucleotides. Moreover, in order to assess the efficiency of the incorporation of the modified U Dz TP in lieu of its natural counterpart, the transcription efficiencies were measured relative to that obtained using all natural NTPs, which was defined as being 100%.
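The normalization described here, with the all-natural transcription set to 100%, can be written out as a small sketch. The lane names and band intensities below are hypothetical and chosen purely for illustration; only the normalization rule itself comes from the text.

```python
def relative_efficiencies(signals: dict[str, float], reference: str) -> dict[str, float]:
    """Express each lane's full-length band signal as a percentage of the reference lane."""
    ref = signals[reference]
    if ref <= 0:
        raise ValueError("reference signal must be positive")
    return {lane: 100.0 * value / ref for lane, value in signals.items()}


# Hypothetical full-length band intensities (arbitrary units), not measured data.
lanes = {"natural NTPs": 20000.0, "U Dz TP": 7600.0}
for lane, pct in relative_efficiencies(lanes, "natural NTPs").items():
    print(f"{lane}: {pct:.0f}%")
```

The reference lane always evaluates to 100%, so values for the modified triphosphate can be compared directly across templates.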
When the standard conditions for T7 RNA polymerase-mediated transcription reactions were used, little full-length product could be observed when U Dz TP was included (data not shown). On the other hand, optimized reaction conditions, i.e. increased concentrations of U Dz TP and longer reaction times (vide infra), led to the appearance of bands corresponding to full-length products when U Dz TP was used as a substitute for UTP (Lane 3 in Fig. 5A and B). This confirms the ability of T7 RNA polymerase to accept this heavily modified N*TP as a substrate. Moreover, the absence in Lane 3 of abortive sequences similar to those seen in the negative control experiments (Lanes 2 and 4) is in agreement with this statement and shows that the polymerization reactions proceed with high yields. Interestingly, the T7 RNA polymerase could readily incorporate three modified U Dz TP in a row (Fig. 5B) without stalling, clearly showing that this N*TP is a rather good substrate for the polymerase.
However, the transcription efficiencies are considerably lower than those obtained with the unmodified UTP (Table 4), hinting at a slowing of transcription due to the presence of the N*TP. This is confirmed by the fact that the reaction times had to be extended (to at least 8 and 24 hours for the dsDNA templates T1/T1′ and T2/T2′, respectively) during optimization to allow the synthesis of the modified RNA. The polymerase was unable to generate the full-length transcript with the exact template used by Matsuda et al., which should lead to six incorporations of U Dz TP (data not shown). The ability of the enzyme to incorporate U Dz TP also seems to be position dependent, since using a template containing complementary A nucleotides at positions 10, 15 and 21 (template T3/T3′, Table S2, ESI †) only led to abortive sequences (data not shown).
However, a longer template T4/T4′ designed to generate a 52mer RNA product (consisting of the sequences of templates T1/T1′ and T3/T3′ appended onto one another; Table S2, ESI †) containing 4 modifications dispersed throughout the transcript did generate full-length transcripts, in a similar yield (37%) and time scale (26 hours) as the other templates (Fig. S46, ESI †).

Conclusions
With the intent of developing a new tool for the investigation of RNA-protein interactions, a photoreactive diazirine-containing uridine analogue was designed so that it could be incorporated into RNA, either synthetically via phosphoramidite chemistry (U Dz ) or by enzymatic polymerization (U Dz TP).
The modied RNA oligonucleotides obtained by incorporation of the phosphoramidite building block during solid-phase synthesis served two purposes: (1) assessing the effect of the modication on duplex stability and (2) showing that the inclusion of the photoreactive diazirine moiety into RNAs could serve as a versatile tool for the investigation of RNA-protein   Efficiency compared to natural transcript (%) 38  28  interactions.Not surprisingly, UV-melting experiments revealed that the modication did not cause any disruption of the Watson-Crick base-pairing and was well-tolerated in the major groove of the duplexes.Moreover, the resulting synthetic probes were assessed for cross-linking potential to single-stranded binding protein (SSB) in an irradiation assay.The RNA probes containing the modied uridine analogue were selectively cross-linked to SSB.The selectivity of this assay derives from the main competitor of the CH bond insertion of the protein, being the insertion of H 2 O, thus only molecules in direct contact with the modied RNA will be cross-linked, preventing non-specic cross-linking.Even though the SSB mainly served as a paradigm for the development of photo-crosslinking RNA probes, this approach can be extended to investigate on the binding mode of other single-stranded nucleic acid binding proteins.
Indeed, the virus-encoded single-stranded DNA binding protein ICP8 was shown to bind to ssRNA with an affinity only ~5-fold lower than that of ssDNA, hinting at a possible implication of ICP8 in the regulation of viral replication. 84 Finally, other yet unknown proteins binding to various forms of coding or noncoding RNAs could be easily targeted by application of this general method. The ability of T7 RNA polymerase to accept the triphosphate analogue was also investigated in transcription assays, generating modified RNA containing up to four incorporations of the diazirine-modified uridine. The efficient transcription reactions with U Dz TP show that bulkier modifications are also tolerated at this particular location. In this context, various investigations have shown that UTP analogues equipped with smaller modifications such as hydrophobic moieties, 98 amines, thiols, 79,100 and azides 101 acted as good substrates for the T7 RNA polymerase. Thus, the T7 RNA polymerase is rather tolerant to a broad variety of modifications anchored at position C5 of pyrimidine nucleoside triphosphates.
However, the rate of transcription seems to be retarded by the presence of the modified UTP. A kinetic investigation on the effect of the analogue U Dz TP on the catalytic efficiency (kcat/KM) is currently under way and will shed some light on this phenomenon. Finally, the substrate acceptance of the noncanonical U Dz TP analogue might be further improved by using mutant RNA polymerases. 102 Overall, the good substrate acceptance of triphosphate U Dz TP and the photo-crosslinking ability of the probes containing the uridine analogue U Dz bode well for the isolation and identification of RNA binding proteins and ncRNAs as well as understanding their interactions.

Cross-linking experiments
Single-stranded DNA binding protein (SSB) (1 µg, 10 pmol) and oligonucleotide (3 pmol of radiolabelled RNA) were incubated in the irradiation buffer (50 mM Tris-HCl, 10 mM MgCl2, 10 mM DTT, pH 7.8) in a total volume of 15 µl in a PCR vial at room temperature for at least 30 minutes prior to irradiation. The sample was irradiated at 375 nm for 30 minutes. Loading buffer (15 µl; 70% formamide, 50 mM EDTA, bromophenol blue and xylene cyanol FF, each 0.1%) was then added and the sample was heated to 95 °C for 5 minutes. The resulting RNA-protein complexes were then resolved on 10% SDS-PAGE at 10 W for 2 hours. The gel was exposed on an imaging plate at −18 °C overnight before visualizing on a phosphorimager.
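For orientation, the amounts in this procedure can be converted to final concentrations and a protein-to-RNA ratio. This is a minimal sketch that takes the reaction volume as 15 µl; the function and variable names are illustrative only.

```python
def conc_nM(amount_pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in µl to nM (pmol/µl = µM)."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol / volume_ul * 1000.0


SSB_PMOL = 10.0    # SSB amount from the procedure
RNA_PMOL = 3.0     # radiolabelled RNA amount from the procedure
VOLUME_UL = 15.0   # total irradiation volume, taken as µl

ssb_nM = conc_nM(SSB_PMOL, VOLUME_UL)
rna_nM = conc_nM(RNA_PMOL, VOLUME_UL)
molar_ratio = SSB_PMOL / RNA_PMOL

print(f"SSB: {ssb_nM:.0f} nM, RNA: {rna_nM:.0f} nM, SSB:RNA = {molar_ratio:.1f}:1")
```

Keeping the amounts as named constants makes it easy to see how the ratio changes if either component is scaled.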

Transcription assays
General procedure: 40 pmol of ds template DNA were annealed by heating to 95 °C and then gradually cooling down to room temperature. Natural NTPs (0.4 mM), U Dz TP (0.6 mM), T7 RNA polymerase (100 units), PPiase (0.6 units), RNasin (0.5 µl), and α-32P-ATP (10 µCi) were then added at 0 °C to the optimized transcription buffer composed of 40 mM Tris-HCl (pH 7.9), 6 mM MgCl2, 10 mM NaCl, 2 mM spermidine, 10 mM DTT and 0.05% of the polysorbate detergent Tween, and made up to a total volume of 20 µl with ultra-pure H2O. Reactions were incubated at 37 °C under 20 µl of mineral oil for different reaction times (template T1, 8 hours; template T2, 24 hours). The reactions were quenched by adding 20 µl of stop solution (70% formamide, 50 mM EDTA, 0.1% bromophenol blue, 0.1% xylene cyanol) and resolved on 15-20% PAGE.
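The reaction composition can be cross-checked with a short calculation of absolute amounts. This sketch assumes a total volume of 20 µl and that 0.4 mM refers to each natural NTP, which the procedure does not state explicitly; the names are illustrative.

```python
def nmol_from_mM(conc_mM: float, volume_ul: float) -> float:
    """Amount in nmol for a given concentration (1 mM = 1 nmol/µl)."""
    return conc_mM * volume_ul


VOLUME_UL = 20.0       # total reaction volume, taken as µl
TEMPLATE_PMOL = 40.0   # annealed dsDNA template

ntp_nmol = nmol_from_mM(0.4, VOLUME_UL)      # each natural NTP (assumed per NTP)
udz_tp_nmol = nmol_from_mM(0.6, VOLUME_UL)   # U Dz TP
template_uM = TEMPLATE_PMOL / VOLUME_UL      # pmol/µl = µM
excess = udz_tp_nmol / ntp_nmol              # U Dz TP relative to a natural NTP

print(f"each NTP: {ntp_nmol:.0f} nmol, U Dz TP: {udz_tp_nmol:.0f} nmol")
print(f"template: {template_uM:.0f} µM, U Dz TP excess: {excess:.1f}x")
```

The 1.5-fold excess of U Dz TP over each natural NTP reflects the increased analogue concentration used in the optimized conditions.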

Table 3
Quantitation (in percentage) of cross-linking of RNA ON2-4 with the SSB protein

Table 4
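Relative transcription efficiency with dsDNA templates
Efficiency compared to natural transcript (%): P1 (single incorporation), 38; P2 (triple consecutive incorporation), 28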