Identification of a selective DNA ligase for accurate recognition and ultrasensitive quantification of N6-methyladenosine in RNA at one-nucleotide resolution

Here we establish an ultrasensitive quantitation assay for accurately determining N6-methyladenosine at one-nucleotide resolution in RNA.

Introduction
N6-Methyladenosine (m6A) is the most frequent form of posttranscriptional modification in RNA. Recent studies have shown that m6A occurs not only in eukaryotic and bacterial messenger RNA (mRNA) 1,2 but also in viral RNA, 3,4 ribosomal RNA (rRNA) and long non-coding RNA (lncRNA). 5 The discovery of methyltransferases 6-9 and a demethylase 10 for m6A suggests that m6A modification is reversible and dynamically regulated, and further indicates that m6A modification is associated with gene regulation, including mRNA translation efficiency, 11 stability and splicing. 12,13 Therefore, m6A modification plays a critical role in many biological processes and is linked to human health. 14 However, the biology of m6A remains largely unexplored, owing to the lack of sensitive and robust methods to quantitatively determine the degree of m6A modification at a precise location.
In 2012, by combining m6A-specific methylated RNA immunoprecipitation with massively parallel sequencing (MeRIP-seq), the groups of S. R. Jaffrey and D. Dominissini developed high-throughput sequencing methods to analyze m6A modification on a transcriptome-wide scale. 15,16 These studies showed that more than 7000 mammalian mRNAs contain m6A and that the m6A sites preferentially appear near stop codons and in 3′ UTRs. However, these sequencing methods generally locate m6A residues only to within approximately 200 nucleotides and cannot quantify the m6A modification fraction at a precise location. As the MeRIP-seq methods provide the distribution of m6A on a transcriptome-wide scale, quantitative methods for determining site-specific m6A in RNA have become increasingly important for revealing the biological functions of m6A and the association between m6A and human disease and clinical diagnosis. Liu et al. first reported a method for quantifying m6A at a specific site, which required site-specific RNase H cleavage and radioactive labelling followed by ligation-assisted extraction and thin-layer chromatography (SCARLET). 17 The multiple enzymatic steps and the separation make the process very laborious and involve expensive reagents and radioactive hazards. More recently, E. T. Kool et al. identified a DNA polymerase (Tth DNA Pol) with selectivity for the incorporation of thymidine opposite unmodified A over m6A. 18 Zhou et al. also revealed that m6A can significantly hinder RNA-directed DNA synthesis by Bst DNA polymerase. 19 On the basis of the selectivity of DNA polymerases, probe extension-based methods have been established for quantifying m6A at specific sites. Although these methods also need electrophoresis separation and radioactive or fluorescent labels, the procedures have been simplified.
Of particular note is that the sensitivity of all the methods mentioned above is too low to quantify m6A in low-abundance mRNA or lncRNA, owing to the lack of an amplification step: with these methods, about 10 fmol of m6A-containing RNA can be detected in a large amount of total RNA sample. The reason may be that the products of probe extension and enzymatic cleavage are difficult to amplify specifically.
Generally, DNA ligase shows high specificity compared to DNA polymerase, and the ligated DNA products can be specifically amplified. Dai et al. have described a T4 DNA ligase-based method to discriminate between A and m6A in RNA. 20 However, no attempt was made to detect m6A in RNA samples, possibly due to the low selectivity of T4 DNA ligase (see below). In this work, we reveal for the first time that T3 DNA ligase has strong selectivity to discriminate m6A from A in RNA. By ligating the DNA probes with T3 DNA ligase using RNA as the template and amplifying the ligated DNA products with PCR, as little as 4 fM (corresponding to 40 zmol in 10 μL solution) lncRNA containing m6A can be accurately determined, and the selectivity for discriminating m6A in RNA can be improved up to 54.1-fold.

Ligation of DNA probes using a template of RNA
The ligation reaction mixture A consisted of 20 nM probe L, 20 nM probe R, the ligation buffer (66 mM Tris-HCl, 10 mM MgCl2, 1 mM ATP, 1 mM DTT, and 7.5% polyethylene glycol (PEG 6000) at pH 7.6 and 25 °C) and an appropriate amount of the RNA target. The ligation reaction mixture B consisted of 0.9 U ligase and the ligation buffer. Mixture A was heated at 85 °C for 3 min and then incubated at 35 °C for 10 min; then the ligation reaction mixture B was added. The final volume of the ligation reaction mixture was 10 μL. The ligation reaction mixture was then incubated at an appropriate temperature to carry out ligation. After ligation, the reaction mixture was put on ice immediately. The reaction conditions for T3 DNA ligase, T4 DNA ligase, T7 DNA ligase, T4 RNA ligase 2 and Taq DNA ligase were 35 °C for 15 min, 25 °C for 30 min, 35 °C for 15 min, 37 °C for 30 min, and 35 °C for 15 min, respectively.

PCR amplication reaction and real-time uorescence measurements
A volume of 2 mL of the ligation product was transferred to a PCR reaction mixture with a nal volume of 10 mL. The PCR reaction mixture contained a forward primer, a reverse primer (the concentration of each was 200 nM), 250 mM dNTPs, 0.4Â Super Green I, 0.5 U JumpStart™ Taq DNA Polymerase and the PCR buffer (10 mM Tris-HCl at pH 8.3, 50 mM KCl, 1.5 mM MgCl 2 , and 0.001% (w/v) gelatin). The PCR reaction was carried out with a StepOne Real-Time PCR System (Applied Biosystems, USA) using a hot start of 94 C for 2 min, followed by 45 cycles of 94 C for 20 s and 60 C for 30 s. The real-time uorescence intensity was monitored at 60 C.
The DMEM was removed from the Petri dish once the bottom of the dish was covered with cells. The cells were then washed three times with cold PBS buffer (10 mM sodium phosphate buffer, 0.1 M NaCl, at pH 7.4 and 25 °C). Next, 6 mL RNAiso solution was added to the Petri dish, and the cells were incubated at room temperature for 2 min. After incubation, the cell lysates were collected in a centrifuge tube containing 1.2 mL trichloromethane. The centrifuge tube was shaken gently at room temperature for 2 min, the cell lysates were centrifuged for 15 min at 12000 rpm and 4 °C, and the aqueous phase was collected in a new tube. Next, 3 mL isopropyl alcohol was added to the aqueous phase and the mixture was incubated at room temperature for 10 min. The mixture was then centrifuged for 20 min at 12000 rpm and 4 °C. The precipitate was washed once with 75% ethanol and dissolved in DEPC-treated water. Finally, the concentration of total RNA was determined with a NanoDrop 2000 (Thermo Scientific).

Extraction of poly A+ RNA from total RNA
The structure of most lncRNAs is similar to that of mRNA in that they contain a poly(A) tail. 21,22 The m6A to be detected is located in MALAT1 lncRNA, which has a poly(A) tail. Therefore, MALAT1 lncRNA can be obtained by the extraction of poly A+ RNA.
Biotin-oligo (dT) binding on streptavidin (STV) of magnetic beads. First, the magnetic beads in the tube were washed three times with an STV-biotin binding buffer (150 mM KCl, 1.5 mM MgCl2, 10 mM Tris-HCl, 0.05 mM DTT, and 0.05% NP-40). Next, 500 μL STV-biotin binding buffer and 5 μL of 50 μM biotin-oligo (dT) were added to the tube. The tube was then kept at 4 °C for 4 h with gentle shaking, so that the biotin-oligo (dT) bound to the STV on the magnetic beads. The free biotin-oligo (dT) was washed out with washing buffer (10 mM Tris-HCl and 150 mM NaCl). Finally, 500 μL 2× binding buffer (300 mM KCl, 3 mM MgCl2, and 20 mM Tris-HCl) was added to the tube.
Hybridization with poly A+ RNA. The magnetic beads and total RNA (500 μL) were heated separately at 85 °C for 3 min; the magnetic beads were then transferred into the total RNA and the mixture was heated at 85 °C for another 3 min. Next, the mixture was shaken for 30 min at room temperature, during which time the 3′ poly(A) region of the poly A+ RNA hybridized with the biotin-oligo (dT).
Elution of poly A+ RNA. The magnetic beads were washed three times with washing buffer. The poly A+ RNA attached to the magnetic beads was then eluted with DEPC-treated water. The concentration of poly A+ RNA was determined with a NanoDrop 2000 (Thermo Scientific) and the poly A+ RNA was stored at −80 °C.

Determination of melting temperatures ($T_m$) of the hybrids of the RNA segments and the probes
To determine the melting temperature ($T_m$) of the hybrids of the RNA targets and the DNA probes in the ligation reaction, 200 nM RNA target (RNA2577-A or RNA2577-m6A) was mixed with 200 nM probe L1 and 200 nM probe R1D in a 10 μL mixture which contained T3 DNA ligase reaction buffer (66 mM Tris-HCl, 2 mM ATP, 10 mM MgCl2, 1 mM DTT, and 3.75% PEG 6000) and 0.4× Super Green I. The mixture was placed in a StepOne Real-Time PCR System (Applied Biosystems, USA) to record the change in the fluorescence signal as the temperature was increased from 25 °C to 95 °C. Fig. S1(a) and (b)† show the derivatives of the fluorescence intensity signal as a function of temperature produced by RNA2577-A/(probe L1-probe R1D) and RNA2577-m6A/(probe L1-probe R1D), respectively. Probe R1D was then replaced by probe R1 and the same experiment was conducted. The derivatives of the fluorescence intensity signal as a function of temperature produced by RNA2577-A/(probe L1-probe R1) and RNA2577-m6A/(probe L1-probe R1) are depicted in Fig. S1(c) and (d),† respectively.
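The melting temperature is commonly read from such a derivative plot as the temperature at which the fluorescence falls fastest. The following is a minimal Python sketch of that idea, assuming a simple central-difference derivative; the function name `melting_temperature`, the synthetic sigmoidal curve and its hypothetical midpoint of 44.0 °C are illustrative assumptions, not the instrument's algorithm or measured data.

```python
import math


def melting_temperature(temps_c, fluorescence):
    """Return the temperature at which -dF/dT is largest (central differences)."""
    if len(temps_c) != len(fluorescence) or len(temps_c) < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps_c) - 1):
        # Negative derivative: positive where the dye signal drops on melting.
        slope = -(fluorescence[i + 1] - fluorescence[i - 1]) / (
            temps_c[i + 1] - temps_c[i - 1]
        )
        if slope > best_slope:
            best_temp, best_slope = temps_c[i], slope
    return best_temp


# Synthetic (not measured) melt curve: a sigmoid centred on a hypothetical 44.0 °C.
temps = [25.0 + 0.5 * i for i in range(141)]  # 25 °C to 95 °C in 0.5 °C steps
signal = [1.0 / (1.0 + math.exp((t - 44.0) / 1.5)) for t in temps]

tm = melting_temperature(temps, signal)
print(f"Estimated Tm: {tm:.1f} °C")
```

Real melt curves are noisier than this synthetic one, so in practice some smoothing before differentiation would probably be needed.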

Results and discussion
Principle of the ligation-dependent PCR method for RNA m6A detection
The general outline of the proposed ligase-dependent PCR assay for m6A determination is schematically illustrated in Fig. 1. The 2577th site of MALAT1 lncRNA, which has been demonstrated to carry an m6A modification, 17 is employed as the model target in this study. lncRNAs containing m6A and A at the 2577th site are defined as RNA2577-m6A and RNA2577-A, respectively. For detection of RNA2577-m6A, DNA probe L1 (left probe) and probe R1 (right probe) were designed (see the sequences in Table S1†). Each probe contains a universal primer-specific sequence used for PCR amplification (red and orange) and a target-specific sequence (blue and black); the target-specific sequences are complementary to the RNA target immediately downstream and upstream of the 2577th site, respectively. Probe L1 is modified with a phosphate group at its 5′-terminus and probe R1 is modified with two ribonucleotides at its 3′-terminus. As shown in Fig. 1(I), in the presence of RNA2577-A, probe L1 and probe R1 hybridize adjacently to the RNA around the 2577th A site, and thus they can be ligated by T3 DNA ligase. The ligated products are subsequently amplified with PCR using the universal forward and reverse primers. When the 2577th A is methylated at the N6 position, the m6A at the 2577th site significantly hinders the ligation of probe L1 and probe R1 (Fig. 1(II)). Therefore, in the presence of RNA2577-m6A, the ligated products and the final PCR products are greatly reduced, and RNA2577-m6A can thus be detected. In order to quantitatively determine the m6A modification fraction at the 2577th site in MALAT1 lncRNA, the A at the 2488th site of the same MALAT1 lncRNA molecule, which is known to contain only A, is selected as the reference site (Fig. 1(III)). 15,16 Probe L2 and probe R2 are designed to be the same as probe L1 and probe R1, respectively, except for the target-specific sequences, which are complementary to the RNA target immediately downstream and upstream of the 2488th site, respectively. In the presence of MALAT1 lncRNA, probe L2 and probe R2 are ligated and the PCR signal of the ligated products accurately indicates the lncRNA concentration. As demonstrated below, by comparing the real-time fluorescence PCR signals at the 2577th site and the 2488th site, the m6A modification fraction can be precisely determined.

Identication of a ligase with substantial selectivity towards m 6 A
The selectivity of the ligases for discriminating m 6 A from A is the key issue for the ligase-dependent PCR assay to detect m 6 A in RNA. So, we rst identify the substantial selectivity of the ligases towards m 6 A. A pair of 69 mer synthetic MALAT1 lncRNA segments containing either m 6 A or A at the 2577 th site are employed as the templates. Probe L1 and probe R1 D, which is the same as probe R1 but without modication at its 3 0terminus with two ribonucleotides, are rst hybridized with the RNA2577-A or RNA2577-m 6 A segments, and are then ligated through catalysis by various ligases. The ligated products are amplied with PCR and the PCR products are detected with gel electrophoresis. As demonstrated in Fig. 2(a), when the pixel intensity of the PCR products produced from the RNA2577-A segment is dened as 100, the ratio of the pixel intensity produced from the RNA2577-m 6 A segment to that produced from the RNA2577-A segment using different ligases is 24.5 : 100 (T3 DNA ligase), 32.1 : 100 (T4 RNA ligase 2), 79.1 : 100 (T4 DNA ligase), 64.4 : 100 (T7 DNA ligase) and 77.7 : 100 (Taq DNA ligase), in which the T3 DNA ligase and T4 RNA ligase 2 show good selectivity for recognizing m 6 A from A in RNA. Using an RNA molecule as the template to ligate the DNA probes, it has been reported that modication with two ribonucleotides of one DNA probe at its 3 0 -terminus can greatly improve the specicity and efficiency of the ligase reaction. 23 Therefore, probe L1 and probe R1 with the modication of two ribonucleotides at the 3 0 -terminus are used to further investigate the selectivity of T3 DNA ligase and T4 RNA ligase 2. As shown in Fig. 2(b), using probe L1 and R1, the ratio of the pixel intensity produced from the RNA2577-m 6 A segment to that produced from the RNA2577-A segment reaches 6.87 : 100 (T3 DNA ligase) and 30.5 : 100 (T4 RNA ligase 2). 
These results indicate that T3 DNA ligase has the strongest selectivity to discriminate m 6 A from A in RNA and modication of probe R1 with two ribonucleotides can greatly improve the selectivity.
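The comparison of ligases can be made explicit by turning each pixel-intensity ratio into a single number. The short Python sketch below uses only the ratios quoted above; the "discrimination factor" (signal from the A template divided by signal from the m6A template) and the helper names are illustrative conventions assumed here, not quantities defined in the paper.

```python
# Pixel-intensity ratios quoted in the text (m6A template : A template, A = 100).
RATIOS_PROBE_R1D = {
    "T3 DNA ligase": 24.5,
    "T4 RNA ligase 2": 32.1,
    "T4 DNA ligase": 79.1,
    "T7 DNA ligase": 64.4,
    "Taq DNA ligase": 77.7,
}
RATIOS_PROBE_R1 = {
    "T3 DNA ligase": 6.87,
    "T4 RNA ligase 2": 30.5,
}


def discrimination_factor(m6a_signal, a_signal=100.0):
    """Assumed illustrative metric: A-template signal over m6A-template signal."""
    return a_signal / m6a_signal


def rank_ligases(ratios):
    """Return ligase names ordered from most to least selective."""
    return sorted(
        ratios, key=lambda name: discrimination_factor(ratios[name]), reverse=True
    )


for label, ratios in (("probe R1D", RATIOS_PROBE_R1D), ("probe R1", RATIOS_PROBE_R1)):
    for name in rank_ligases(ratios):
        print(f"{label:9s} {name:16s} {discrimination_factor(ratios[name]):5.1f}-fold")
```

On this gel-based measure, T3 DNA ligase ranks first with either probe, and replacing probe R1D with probe R1 raises its factor from roughly 4-fold to roughly 15-fold.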
The m6A modification is a single methylation of adenosine at the N6 position. The methyl group is small and chemically inert. More importantly, the methylated N6 position retains its ability to donate a hydrogen bond, so that m6A can still form Watson-Crick base pairs. 17 Therefore, it is extremely difficult to specifically distinguish m6A from A in RNA. 18 Fortunately, as demonstrated above, we have revealed that different ligases differ in their ability to identify m6A, and that T3 DNA ligase shows the strongest selectivity, which is ascribed to its ability to distinguish between m6A:T and A:T. We also demonstrate that T3 DNA ligase discriminates even better between m6A:U and A:U, that is, when probe R1 is modified with two ribonucleotides at its 3′-terminus. The melting temperatures ($T_m$) of the hybrids of the RNA segments and the probes were measured, and are 44.3 °C (RNA2577-A/probe L1-probe R1D), 43.8 °C (RNA2577-m6A/probe L1-probe R1D), 43.7 °C (RNA2577-A/probe L1-probe R1), and 41.4 °C (RNA2577-m6A/probe L1-probe R1) (see Fig. S1†). Accordingly, the m6A modification decreases the $T_m$ value by only 0.5 °C for the m6A:T versus A:T pair, whereas for the m6A:U versus A:U pair the $T_m$ value decreases by 2.3 °C, which also makes T3 DNA ligase more selective in distinguishing m6A from A in RNA.

Assessment of the effects of core sequences on specificity
In 2012, Dan Dominissini et al. found that all sequences with m6A conform to the degenerate consensus RRACH (A = m6A, R = A or G, H = U or A), 15 and the commonest core sequences for m6A are GGACU, AAACU, AGACU and GGACA. The core sequence of RNA 2577 is GGACU. To assess the effects of the core sequences on the specificity of the proposed ligase-dependent PCR assay, the core sequence of RNA 2577 (GGACU) is replaced with AAACU, AGACU, and GGACA, respectively. The corresponding RNAs containing A or m6A are defined as RNA AAACU-A or RNA AAACU-m6A, RNA AGACU-A or RNA AGACU-m6A, and RNA GGACA-A or RNA GGACA-m6A, respectively. These RNA molecules were used to perform the specificity experiments, following the same procedures with T3 DNA ligase as described above. The results are shown in Fig. 3; the ratio of the pixel intensity produced from the m6A segment to that produced from the A segment reaches 7.20 : 100 (RNA AAACU), 7.70 : 100 (RNA AGACU), and 5.46 : 100 (RNA GGACA). These results indicate that the core sequence has little effect on the specificity of the proposed ligase-dependent PCR assay.
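The RRACH consensus maps directly onto a regular expression, which can help when scanning an RNA sequence for candidate sites. The Python sketch below is only an illustration: the function name `find_rrach_sites` and the short test sequence are assumptions made here, and H is encoded as U or A exactly as defined in the text (the standard IUPAC code H also includes C).

```python
import re

# R = A or G, H = U or A (as defined in the text); the central A is the candidate
# m6A. A lookahead is used so that overlapping motifs are all reported.
RRACH = re.compile(r"(?=([AG][AG]AC[AU]))")


def find_rrach_sites(rna):
    """Return (0-based index of the central A, matched motif) for every RRACH hit."""
    sequence = rna.upper()
    return [(m.start(1) + 2, m.group(1)) for m in RRACH.finditer(sequence)]


# The four commonest core sequences named in the text all satisfy the consensus.
for core in ("GGACU", "AAACU", "AGACU", "GGACA"):
    print(core, find_rrach_sites(core))

# A short made-up sequence (not a MALAT1 fragment) with one embedded GGACU motif.
print(find_rrach_sites("CCGGACUAA"))
```

A motif hit marks only a candidate position; whether the central adenosine is actually methylated still has to be measured experimentally.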

Analytical performance of the ligation-dependent PCR method for RNA m6A detection
With the proposed ligase-dependent PCR assay, we systematically investigate the influence of experimental parameters, including the ligation temperature, ligation time and amount of T3 DNA ligase, on the selectivity and sensitivity for detecting m6A, and optimize the experimental conditions (see Fig. S2-S4†). Under the optimum experimental conditions, the analytical performance of the ligase-dependent PCR assay is evaluated using real-time fluorescence PCR measurements, in which Super Green I is utilized as the fluorescent dye for real-time detection of the PCR products. As shown in Fig. 4(a), well-defined real-time fluorescence curves are produced by the RNA2577-A segment in the concentration range from 4 fM to 4 nM. When the $C_T$ values (the number of cycles required for the fluorescence signal of each reaction to reach the threshold value) are plotted against the logarithm (lg) of the RNA2577-A concentration, as depicted in Fig. 4(b), an excellent linear relationship is obtained. The linear regression equation is $C_T = -8.59 - 3.14 \lg C_{\text{RNA2577-A}}$ (M), with a corresponding correlation coefficient, $R^2$, of 0.998. At the same time, as demonstrated in Fig. 4(c), the $C_T$ values produced by RNA2577-m6A are much higher than those produced by RNA2577-A at the same concentrations. According to the calibration curve shown in Fig. 4(b), the $C_T$ value produced by 4 pM RNA2577-m6A corresponds to an RNA2577-A concentration of 73.9 fM. Therefore, the selectivity of the ligase-dependent PCR assay for detecting m6A in RNA is up to 54.1-fold. It is particularly worth noting that the real-time fluorescence signal produced by 400 fM RNA2577-m6A is almost the same as that of the blank. That is to say, when the total concentration of the RNA target is less than 400 fM, RNA containing m6A produces a negligible signal and makes no contribution to the real-time fluorescence signal. This means that, when the total RNA target concentration is less than 400 fM, the RNA2577-m6A concentration equals the total RNA target concentration minus the RNA2577-A concentration, the latter of which can be determined with the ligase-dependent PCR assay.
This journal is © The Royal Society of Chemistry 2018
Subsequently, we tested whether the ligase-dependent PCR assay can be employed for quantitative detection of m6A-containing RNA. The RNA2577-A segment and the RNA2577-m6A segment were first mixed at a total concentration of 400 fM to give synthetic samples in which the RNA2577-m6A proportions were 0%, 10%, 30%, 50%, 80%, and 100%. Then, the RNA2577-A concentration was determined with the ligase-dependent PCR assay according to the calibration curve shown in Fig. 4(b). The RNA2577-m6A concentration was taken as 400 fM minus the RNA2577-A concentration. As shown in Table S2,† the detected RNA2577-m6A concentrations are in good agreement with the added ones, indicating that the proposed assay can be used for quantitative evaluation of the extent of m6A modification at a specific site in RNA samples when the total RNA concentration is less than 400 fM.
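The arithmetic behind this quantification can be sketched in a few lines of Python. The slope and intercept are those of the RNA2577-A calibration line reported for Fig. 4(b); the function names are illustrative assumptions, and the "sample" threshold cycle in the round-trip example is generated from the calibration line itself for an idealised 50% mixture, not taken from Table S2.

```python
import math

# Calibration line for RNA2577-A reported in the text: CT = -8.59 - 3.14 * lg(C / M).
INTERCEPT = -8.59
SLOPE = -3.14


def ct_from_concentration(conc_molar):
    """Threshold cycle predicted by the calibration line for a concentration in M."""
    return INTERCEPT + SLOPE * math.log10(conc_molar)


def concentration_from_ct(ct):
    """Invert the calibration line: apparent RNA2577-A concentration in M."""
    return 10.0 ** ((ct - INTERCEPT) / SLOPE)


def m6a_concentration(total_molar, ct_sample):
    """m6A-RNA concentration as total minus apparent unmethylated RNA (valid < 400 fM)."""
    return total_molar - concentration_from_ct(ct_sample)


# Selectivity quoted in the text: 4 pM RNA2577-m6A reads as 73.9 fM RNA2577-A.
selectivity = 4e-12 / 73.9e-15
print(f"selectivity: {selectivity:.1f}-fold")

# Idealised round trip (not measured data): a 400 fM mixture with 200 fM RNA2577-A.
total = 400e-15
ct_sample = ct_from_concentration(200e-15)
m6a = m6a_concentration(total, ct_sample)
print(f"CT = {ct_sample:.2f}, m6A-RNA = {m6a * 1e15:.1f} fM")
```

The subtraction is justified only in the regime described in the text, where the m6A-containing RNA itself contributes a negligible signal.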

Detection of m6A in real poly A+ RNA samples
For determination of RNA containing m6A in real biological samples, the vital issue is how to obtain the total concentration of the RNA target. To address this issue, as described above, we select an adenosine (A) site in the same RNA target molecule that is known to contain only A, such as the 2488th site in the MALAT1 lncRNA, as the reference site. The ligase-dependent assay can easily determine the total concentration of the RNA target molecule at the reference site, which also allows the total RNA target concentration to be kept below 400 fM when analysing real biological samples.
The designed RNA2488-A-specific probe L2 and probe R2 are used to perform the ligation reaction catalysed by T3 DNA ligase, and the ligated products are then amplified with PCR. A synthetic RNA2488-A segment is employed to construct the calibration curve. As demonstrated in Fig. 5(a) and (b), there is an excellent linear relationship between the $C_T$ value and the lg of the RNA2488-A concentration in the range from 4 fM to 4 nM. The regression equation is $C_T = -6.39 - 3.16 \lg C_{\text{RNA2488-A}}$ (M) and the correlation coefficient, $R^2$, is 0.990. Finally, we apply the ligase-dependent PCR assay to determine the m6A modification fraction at the 2577th site of the MALAT1 lncRNA in 85 ng poly A+ RNA extracted from HeLa cells. As shown in Fig. 5(b) and (c), the total concentration of MALAT1 lncRNA in the sample is determined to be 285.9 fM ($C_{\text{RNA2488-A}}$) with the ligase-dependent PCR assay at the 2488th site. As shown in Fig. 5(d) and Fig. 4(b), the RNA2577-A concentration is determined to be 55.2 fM ($C_{\text{RNA2577-A}}$) with the ligase-dependent PCR assay at the 2577th site. Therefore, the RNA2577-m6A concentration is calculated as 230.7 fM from the formula $C_{\text{RNA2488-A}} - C_{\text{RNA2577-A}}$, and the m6A modification fraction at the 2577th site is estimated to be 80.7%, which agrees well with that obtained by the SCARLET assay (~80%). 17 The same method is applied to measure the m6A modification fraction at the 2577th site of the MALAT1 lncRNA in poly A+ RNA extracted from HEK293T cells; the fraction is estimated to be 51.3% (Fig. S5†), which also agrees well with that obtained by the SCARLET assay (~51%). 17 The above results indicate the good applicability of the ligase-dependent PCR assay for quantitation of m6A in real biological samples.
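The HeLa result can be reproduced from the two concentrations given above. The Python sketch below only restates that arithmetic; the function name `m6a_fraction` and its input checks are illustrative assumptions.

```python
def m6a_fraction(total_fm, unmethylated_fm):
    """Return (m6A-RNA concentration in fM, m6A modification fraction).

    total_fm: concentration measured at the reference site (here the 2488th site).
    unmethylated_fm: apparent unmethylated concentration at the target site (2577th).
    """
    if total_fm <= 0:
        raise ValueError("total concentration must be positive")
    if not 0 <= unmethylated_fm <= total_fm:
        raise ValueError("unmethylated concentration must lie between 0 and total")
    methylated_fm = total_fm - unmethylated_fm
    return methylated_fm, methylated_fm / total_fm


# Values reported in the text for poly A+ RNA from HeLa cells.
m6a_fm, fraction = m6a_fraction(total_fm=285.9, unmethylated_fm=55.2)
print(f"RNA2577-m6A: {m6a_fm:.1f} fM, modification fraction: {fraction * 100:.1f}%")
```

With the reported inputs this gives 230.7 fM and 80.7%, matching the values stated above.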

Conclusions
In summary, we have revealed for the first time that T3 DNA ligase, a commercially available ligase, has strong selectivity for the discrimination of m6A from A in RNA molecules. On this basis, a highly sensitive and selective ligase-dependent PCR assay has been established, which can accurately determine the m6A modification fraction at any specific site in RNA. Most importantly, owing to its ultrahigh sensitivity, the proposed assay can be used to quantify m6A in low-abundance cellular RNA in real biological samples, which is currently not possible with existing techniques because they lack a practical nucleic acid amplification step. In addition, the ligase-dependent PCR assay requires only common and readily available lab equipment and materials. Therefore, we believe that the proposed assay is promising for applications in m6A-related fundamental studies and clinical diagnosis.

Conflicts of interest
There are no conflicts to declare.