Open Access Article
Crystal M. Han ab, David Catoe b, Sarah A. Munro bc, Ruba Khnouf de, Michael P. Snyder f, Juan G. Santiago e, Marc L. Salit *b and Can Cenik *fg
aDepartment of Mechanical Engineering, San Jose State University, San Jose, CA 95192, USA
bJoint Initiative for Metrology in Biology, National Institute of Standards and Technology, Stanford, CA, USA. E-mail: msalit@stanford.edu
cMinnesota Supercomputing Institute, University of Minnesota, MN 55455, USA
dDepartment of Biomedical Engineering, Jordan University of Science and Technology, Irbid, Jordan
eDepartment of Mechanical Engineering, Stanford University, Stanford, CA 94305, USA
fDepartment of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
gDepartment of Molecular Biosciences, University of Texas at Austin, Austin, TX 78705, USA. E-mail: ccenik@austin.utexas.edu
First published on 22nd July 2019
We present an on-chip method for the extraction of RNA within a specific size range from low-abundance samples. We use isotachophoresis (ITP) with an ionic spacer and a sieving matrix to enable size-selection with a high yield of RNA in the target size range. The spacer zone separates two concentrated ITP peaks, the first containing unwanted single nucleotides and the second focusing RNA of the target size range (2–35 nt). Our ITP method excludes >90% of single nucleotides and >65% of longer RNAs (>35 nt). Compared to size selection using gel electrophoresis, ITP-based size-selection yields a 2.2-fold increase in the amount of extracted RNAs within the target size range. We also demonstrate compatibility of the ITP-based size-selection with downstream next generation sequencing. On-chip ITP-prepared samples reveal higher reproducibility of transcript-specific measurements compared to samples size-selected by gel electrophoresis. Our method offers an attractive alternative to conventional sample preparation for sequencing with shorter assay time, higher extraction efficiency and reproducibility. Potential applications of ITP-based size-selection include sequencing-based analyses of small RNAs from low-abundance samples such as rare cell types, samples from fluorescence activated cell sorting (FACS), or limited clinical samples.
Coupling ITP with sieving matrices enables size-selective purification of nucleic acids.5–9 Sieving matrices decrease nucleic acid mobilities in a size-dependent manner, yet have minimal effect on the mobility of small ions.10 Previous approaches using a sieving matrix for size-selection achieved exclusion of RNAs >40 nt with 5.5% PVP,6,7 miRNA detection with 4% polyacrylamide,9 and exclusion of 66 nt synthetic RNA with 30% Pluronic F-127.5 ITP sample focusing can also be used with ‘spacer ions’ to create separation of mixed samples.11 In the latter mode, multiple peak-mode ITP zones are separated from each other by the spacer ion zones in plateau mode. Previous studies have shown separation of single- and double-stranded DNA by a single spacer zone8 and separation of multiple serum lipoproteins by several spacer zones formed by a carrier ampholyte.12
ITP-extracted nucleic acids are compatible with downstream analyses such as RT-qPCR,13,14 microarray hybridization15 and sequencing.16,17 Among these methods, sequencing offers unmatched advantages including digital quantification, the ability to identify novel transcripts, high throughput, and single-base resolution.18 Many sequencing-based methods have been developed to address a wide range of questions in RNA biology.19 For example, UV cross-linking and immunoprecipitation (CLIP-Seq)20 and RNA immunoprecipitation followed by deep sequencing (RIP-Seq)21 measure protein–RNA interactions. Similarly, sequencing based methods can probe RNA secondary structure22 or unveil mRNA fragment sequences protected by ribosomes during translation.23 All of these approaches require extraction of specific-size RNA fragments with as high yield as possible. Compared to conventional sample preparation methods such as column-based RNA purification followed by gel electrophoresis, ITP purification has the potential to offer higher yield especially with lower input amounts and shorter nucleic acids.5,17 Furthermore, ITP may provide better consistency, fewer hands-on steps, and faster processing time.
To our knowledge, all previously reported size-selective ITP methods demonstrated exclusion of RNA longer than a certain cutoff size using combinations of TE ions and sieving matrices.5–9 We know of no reported ITP method that is size-selective between both low and high limits of RNA length. Having a lower limit in the size selection is especially critical for sequencing applications since the presence of very small nucleic acid fragments leads to significant contamination in sequencing reads. Importantly, the presence of mononucleotides in the sample inhibits the library preparation required in several of the aforementioned sequencing methods. In these methods, ribonucleases are used to digest mRNAs that are not protected by ribosomes or RNA binding proteins, which results in a 3′ phosphate group on the protected RNA fragments. Sequencing library preparation for RNA fragments generated by ribonucleases starts with an initial dephosphorylation. The most commonly used enzyme for the dephosphorylation reaction is T4 PNK.24–28 However, in the presence of ATP, T4 PNK functions as a kinase and instead adds a 5′ phosphate to the substrate.29 Given that intracellular ATP concentration can be as high as 10 mM,30 its removal from the lysate is essential for the dephosphorylation step. In conventional methods, size range selection is typically performed by denaturing polyacrylamide gel electrophoresis.24,31 However, despite its ubiquitous adoption, this method is severely limited because it offers low yield, is time-consuming, and cannot be easily parallelized;32 hence alternative methods are needed.
In the current study, we present the first on-chip ITP method for selecting a range of RNA sizes using an ionic spacer, sieving matrix, and a two-step collection method. We demonstrate >90% removal of single nucleotides and >65% removal of RNA longer than 35 nt in the extracted sample. Our method performs RNA extraction and size selection simultaneously in a single on-chip process within 10 min. We also compare our method to size-selection by denaturing gel electrophoresis and demonstrate a 2.2 fold increase in yield. Lastly, we demonstrate the compatibility of ITP-extracted RNA with high-throughput sequencing.
Fig. 1 (a) Schematic representation of RNA size-selection. Initially, the chip is loaded with LE including sample (S) in the sample section of the channel and LE including sieving matrix in the separation channel (time t1). At time t2, an electrical current of 300 μA is applied to the main channel and a 30 μA current is applied in the branch channel. The spacer zone forms between ITP peaks of the two fluorescent dyes. At time t3, the first peak arrives at the collection reservoir. Fraction 1 collected from the collection reservoir contains single nucleotides and is discarded. The reservoir is refilled with fresh collection buffer and the same current is applied again. At time t4, the second peak arrives and fraction 2 containing the RNAs of target sizes is collected. Longer RNA molecules remain in the channel (fraction 3). (b) Visualization of the two ITP peaks separated by a spacer zone. The first peak is visualized by AF488 and includes single nucleotides. In the second peak, DyLight488 and RNA in the target range of 2–35 nt co-focus. The snapshot was captured from Video S1† at 3:20 s.
As the ITP zones enter the separation channel, the sieving matrix reduces the RNA mobility while the mobilities of small ions are negligibly affected. Consequently, RNA molecules rearrange based on their sizes such that single nucleotides remain focused in the first peak, and ∼2–35 nt RNAs are focused in the second peak. RNAs that are longer than 35 nt defocus and travel electrophoretically behind the second peak. When the first peak arrives at the collection reservoir (at time t3), we temporarily suspend the current and collect the contents from the collection reservoir (fraction 1). Fraction 1 contains single nucleotides and is discarded. We then re-apply current to reinitiate ITP. At time t4 (typically about a minute after re-applying current) the second peak arrives at the collection reservoir, and the sample containing RNA of the desired size range (fraction 2) is collected. Longer RNAs remain in the channel and are not collected except when we analyze the contents of fraction 3, which can be retrieved by applying the electric field for an additional 70 s.
:1 (w/w) ratio precursor-to-curing-agent (Sylgard 184, Dow Corning, Menlo Park, CA) was poured on the mold taped on a 10 mm Petri dish. After degassing for 20 min in a desiccator chamber connected to a vacuum pump, the Petri dish was placed in an oven at 50 °C for at least 5 h. We then cut out and peeled off the PDMS slab from the mold, and punched four 6-mm diameter holes to form the TE, LE, branch, and collection reservoirs. The surfaces of the PDMS slab and a microscope glass slide were cleaned with Scotch tape and plasma-treated using a plasma cleaner (PDC-32G, Harrick Plasma, Ithaca, NY) connected to a vacuum pump (PDC-VPE, Harrick Plasma, Ithaca, NY) at high RF power for 90 s. Immediately after the plasma treatment, we bonded the PDMS substrate on the glass slide and waited at least 2 h to ensure a leak-free bond before using the channel.
000 rpm. The sucrose layer was discarded and the pellet was resuspended in 700 μl of Qiazol reagent from the miRNeasy kit (Qiagen), followed by RNA extraction according to the manufacturer's instructions.
000), 8 M urea, 20 mM HCl, 130 mM Bis-Tris (measured pH 7.2). LE buffer was made fresh daily. Reservoir LE buffer consisted of 35% Pluronic F-127, 50 mM HCl, and 200 mM Bis-Tris, and reservoir TE buffer consisted of 35% Pluronic F-127, 200 mM Bis-Tris, 50 mM MOPS, and 25 mM caproic acid. Solutions containing >25% Pluronic F-127 are liquid at temperatures below 4 °C and solid otherwise. We stored the solutions containing Pluronic F-127 in a 4 °C refrigerator before experiments and on ice during experiments. Pluronic F-127-containing solutions were loaded quickly and carefully while in the cold, liquid state as they solidify quickly at room temperature. Sample buffer included 250 nM Alexa Fluor 488 (AF488), 750 nM DyLight 488, 0.5% PVP, 20 mM HCl, 130 mM Bis-Tris, and varying contents of RNA. Collection buffer was 20 mM HCl, 130 mM Bis-Tris, 0.1% PVP, and 0.4 U μl⁻¹ SUPERase In RNase inhibitor.
For the single nucleotide exclusion experiment, we included 500 μM rATP (NEB) and 1 μM synthetic 26 nt RNA in the sample buffer. The sequence of 26 nt synthetic RNA was 5′-AUGUACACGGAGUCGACCCAACGCGA-3′ (IDT). For all other experiments, we used RNA from LCL extracts as detailed above. We purchased PVP, HCl, Bis-Tris, urea, MOPS, caproic acid, Pluronic F-127, NaOH, and Triton X-100 from Sigma-Aldrich. AF488 (A20000), DyLight488 (46402), and SUPERase In (AM2694) were purchased from Thermo Fisher Scientific. All solutions were prepared with UltraPure DNase/RNase free distilled water (10977015, Thermo Fisher Scientific).
After loading, we placed platinum electrodes in the TE and LE reservoirs, and applied 300 μA in the main channel and 30 μA in the branch channel with a high voltage source meter (Keithley 2410, Tektronix, Beaverton, OR). For the current applied in the main channel, any lower current can be used at a cost of increased assay time. We decided to use 300 μA at which we observed no temperature increase due to Joule heating during 10 minutes of our current protocol. We visualized fluorescence from the dyes (AF488 and DyLight 488) with a blue light transilluminator (DR22A, Clare Chemical Research, Dolores, CO). A video of the visualization process was captured by a cell phone camera (iPhone 6S, Apple, Cupertino, CA). The first ITP peak was visualized by AF488 and the second peak was visualized by DyLight488.
We performed three replicates of ITP size-selection experiments. After collecting fraction 1 and fraction 2, we quantified the RNA concentration in both fractions as well as in the initial sample using A260 absorbance (Nanodrop ND-1000) and fluorescence (Qubit 2.0 Fluorometer) measurements. Because of the mutually exclusive dynamic range of the Qubit (0.02–0.5 ng μl⁻¹) and Nanodrop (0.5–3000 ng μl⁻¹), the concentration of 26 nt RNA was only measured by Qubit and that of rATP was only detected by Nanodrop. In this test, we used the same 17 μl volume for the sample input and the collection. After each collection, we validated that the output volume was indeed 17 μl by directly measuring the volume using a pipettor.
We next estimated the 26 nt RNA extraction efficiency. We found that the 26 nt RNA concentration in fraction 2 was 79.7% of that in the initial sample. Our results are consistent with previously reported recovery efficiencies of ∼80% for DNA inputs ranging from 0.25 to 250 ng using ITP.38 The high efficiency of ITP extraction compares favorably to gel extraction, which suffers from low yields, especially with low-input samples.
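The recovery and exclusion percentages discussed here reduce to a mass balance between the input sample and a collected fraction. The following is a minimal sketch of that arithmetic, assuming equal 17 μl input and collection volumes as stated above; the function names and the example concentrations are hypothetical and are not taken from the measurements reported in this work.

```python
def recovery_percent(conc_fraction, conc_input,
                     vol_fraction_ul=17.0, vol_input_ul=17.0):
    """Percent of the input analyte mass found in a collected fraction.

    Concentrations must share one unit (e.g. ng/ul); volumes are in ul.
    The 17 ul defaults mirror the equal input and collection volumes
    described in the text.
    """
    if conc_input <= 0 or vol_input_ul <= 0:
        raise ValueError("input concentration and volume must be positive")
    mass_in = conc_input * vol_input_ul
    mass_out = conc_fraction * vol_fraction_ul
    return 100.0 * mass_out / mass_in


def exclusion_percent(conc_fraction, conc_input, **volumes):
    """Percent of the input analyte that is absent from the fraction."""
    return 100.0 - recovery_percent(conc_fraction, conc_input, **volumes)


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    rna_input, rna_fraction2 = 0.40, 0.32      # ng/ul, e.g. fluorometer
    atp_input, atp_fraction2 = 170.0, 12.0     # ng/ul, e.g. A260
    print("26 nt RNA recovered in fraction 2 (%):",
          recovery_percent(rna_fraction2, rna_input))
    print("rATP excluded from fraction 2 (%):",
          exclusion_percent(atp_fraction2, atp_input))
```

When the input and collection volumes are equal, the volume terms cancel and the recovery is simply the ratio of the two concentrations.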
We observed replicate-to-replicate variability in the size selection. For example, fraction 3 of the center panel included a negligible amount of RNA >100 nt, while RNA >100 nt was detected in fraction 3 of the other replicates. We hypothesize that in the experiment associated with the center panel, RNAs electro-migrated more slowly than in the other ITP experiments. This hypothesis is supported by the narrower RNA size range observed in fraction 2 of the center panel compared to the others. Slower RNA migration can occur if LE containing the sieving matrix smears into the sample channel during the loading step. In such cases, 70 s of fraction 3 recovery is insufficient for RNAs larger than 100 nt to arrive at the collection reservoir. In addition, indirect monitoring of the RNA locations using fluorescent dyes may contribute to run-to-run variability. The repeatability of our method may be improved with the use of labeled RNA markers of specific sizes.
We further analyzed the data by quantifying the fluorescence signal from two size groups: 17–35 nt and 36–150 nt. We chose the size range of 17–35 nt since it coincides with the size range that is relevant for many sequencing analyses including ribosome profiling and RNA-binding protein footprinting techniques. RNAs longer than 35 nt represent undesired longer fragments that we aim to remove. For each size range, we calculated the mean percentage of RNAs contained in each fraction such that the sum of the values from all three fractions constitutes 100%. As shown in Fig. S3,† 75.3% of the total signal in the size range of 17–35 nt was from the ITP-extracted sample (fraction 2). The observed percentage is consistent with the recovery efficiency estimated using the synthetic 26 nt RNA as described above. In the size range of 36–150 nt, we found that 68.5% of the signal was from fraction 3 and the rest of the signal was predominantly from fraction 2. We attribute the presence of longer RNAs in fraction 2 to a varying range of mobilities due to both RNA secondary structures39 and long RNAs outpacing the ITP interface due to their starting location far ahead (near the branch channel) of the initial ITP interface (TE reservoir). This observation suggests that the percentage of long RNAs in the collected sample may be reduced further by using a longer separation channel relative to the sample channel, at the cost of longer assay time and potential Joule heating problems.
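The per-fraction percentages described above can be illustrated with a short normalization: within one size range, the signal from each fraction is divided by the summed signal of all three fractions, and the resulting percentages are averaged across replicates. This is only a sketch under that reading of the text; the function names and the signal values are hypothetical.

```python
def fraction_percentages(signal_by_fraction):
    """Normalize per-fraction signal so the fractions sum to 100%."""
    total = sum(signal_by_fraction.values())
    if total <= 0:
        raise ValueError("total signal must be positive")
    return {name: 100.0 * signal / total
            for name, signal in signal_by_fraction.items()}


def mean_percentages(replicates):
    """Average the normalized percentages over replicate experiments."""
    normalized = [fraction_percentages(rep) for rep in replicates]
    names = list(normalized[0])
    return {name: sum(rep[name] for rep in normalized) / len(normalized)
            for name in names}


if __name__ == "__main__":
    # Hypothetical integrated fluorescence (arbitrary units) in one
    # size range, for three replicate runs.
    replicates_17_35_nt = [
        {"fraction 1": 4.0, "fraction 2": 31.0, "fraction 3": 6.0},
        {"fraction 1": 3.0, "fraction 2": 28.0, "fraction 3": 7.0},
        {"fraction 1": 5.0, "fraction 2": 33.0, "fraction 3": 5.0},
    ]
    for name, pct in mean_percentages(replicates_17_35_nt).items():
        print(name, round(pct, 1))
```

Normalizing each replicate before averaging keeps a replicate with a larger total signal from dominating the mean.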
In addition, we carried out experiments which demonstrate the efficacy of our methods in size selection of RNA from cell lysate (data shown in ESI†). We applied ITP extraction to micrococcal nuclease (MNase)-digested chronic myelogenous leukemia cell lysate (K562). Since we used an endonuclease, the digested lysates included significant amounts of mono- and oligo-nucleotides that were initially part of the mRNAs that were not protected by ribosomes or RNA binding proteins. The sample preparation protocol and the experimental data are presented in section S1 of ESI.† Fig. S5† includes the bioanalyzer electropherograms of fraction 2 collected from three replicates of ITP size selection experiment using the digested cell lysate. This data provides evidence that our method is directly applicable to complex RNase-digested cell lysates for simultaneous purification and size selection.
ITP size selection excluded the majority of RNAs >35 nt, as they remained in the channel (see Fig. 3). Yet, a broader size range of RNA was observed in the ITP sample compared to the gel electrophoresis method (Fig. 4a). To quantify the yield of RNA extraction, we calculated the RNA concentration in the size range of 17–35 nt using Bioanalyzer software (2100 Expert). Compared to the gel electrophoresis method, the ITP method yielded a 2.2-fold higher amount of RNA in the desired size range (Fig. 4b). The loss in the gel electrophoresis method is mainly attributed to the gel extraction step, which is highly dependent on RNA size and sample amount.
For these experiments, we used RNA extracts from cells that were treated with ribonucleases (Methods). This sample preparation enriches for RNA fragments in the 17–35 nt range as ribosomes protect these mRNA fragments from nuclease digestion. Consequently, we expected a large fraction of the reads to map to the coding regions of transcripts as opposed to 5′ or 3′ untranslated regions. As expected, more than 82% of the transcriptome-mapping reads mapped to the coding regions (Fig. 5a). We note that the mappability of reads to the transcriptome was low given that a significant portion of reads mapped to rRNA in both methods (range 89.8–92.9%). This is consistent with previous studies that do not employ rRNA depletion, where approximately 80–95% of all reads are typically assigned to rRNA fragments.25,40 Although an rRNA depletion step can be easily incorporated, we intentionally decided not to include it in order to reflect the entire pool of captured RNAs and to avoid potential sequence-specific biases due to additional selection.41
Fig. 5 Comparison between high-throughput sequencing results of RNA fragments recovered by ITP and gel electrophoresis methods. (a) Reads mapping to the coding region, 3′ and 5′UTR were counted and plotted for both methods (three replicates each). (b) The mean number of reads per transcript across the three replicates of each method was calculated and compared. See Fig. S8† for replicate-level comparisons. (c) For each transcript, the mean and standard deviation of log2 read count were calculated across the replicates. A cubic spline was fitted and plotted for each method. See Fig. S11† for the individual data points showing each transcript.
We next compared transcript-level quantification obtained by the two methods using the reads mapping to the coding regions. Specifically, we selected the subset of transcripts with read counts per million (cpm) greater than one in at least two of the six libraries. The mean number of reads per transcript correlated strongly (Spearman rank correlation: 0.97) between libraries prepared from ITP and gel electrophoresis methods (Fig. 5b). When we analyzed the pairwise rank correlations between replicates of each method, we similarly observed very high Spearman rank correlations ranging from 0.93 to 0.95 (Fig. S8†). We describe the systematic identification of transcripts with large deviations between the quantifications from the two methods in section S3 of ESI.† In short, we used an MA-plot (Fig. S9†) to identify 230 transcripts with the largest deviations between the two methods. Functional enrichment analysis of these “outliers” suggested higher read counts for transcripts associated with the nucleosome (Table S1†). In a plot of M-values as a function of transcript length (Fig. S10†), we observed a very weak relationship (Spearman correlation ρ = 0.12).
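The expression filter and rank correlation described above can be sketched in a few lines of standard-library Python. The code below is illustrative only: it assumes raw counts are stored per library in a shared transcript order, computes cpm against each library's own total, and implements Spearman correlation as the Pearson correlation of tie-averaged ranks. The function names and the toy counts are hypothetical and do not reproduce the analysis pipeline used in this work.

```python
def counts_per_million(counts):
    """Scale raw read counts by library size to counts per million."""
    total = sum(counts)
    return [1e6 * c / total for c in counts]


def filter_transcripts(libraries, min_cpm=1.0, min_libraries=2):
    """Return indices of transcripts with cpm > min_cpm in at least
    min_libraries libraries. `libraries` maps a library name to a list
    of raw counts; all lists share the same transcript order."""
    cpm = [counts_per_million(c) for c in libraries.values()]
    n_transcripts = len(cpm[0])
    return [i for i in range(n_transcripts)
            if sum(lib[i] > min_cpm for lib in cpm) >= min_libraries]


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either
    input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Toy per-transcript mean counts for the two methods (hypothetical).
    itp_mean = [12.0, 340.0, 55.0, 9.0, 1200.0]
    gel_mean = [10.0, 310.0, 70.0, 11.0, 900.0]
    print("Spearman rho:", spearman(itp_mean, gel_mean))
```

In practice an established routine such as `scipy.stats.spearmanr` would normally be preferred; the explicit version here only makes the ranking and tie handling visible.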
While correlation coefficients are ubiquitously adopted in the literature for assessing reproducibility, they are susceptible to dynamic-range-dependent biases. Hence, we evaluated reproducibility by analyzing the mean to variance relationship of the quantifications for each transcript. We fitted splines to standard deviation of log2 read counts as a function of their mean. In Fig. 5c, we plotted these best-fit lines for the gel electrophoresis and ITP method, and individual data points for all transcripts are presented in Fig. S11.† A method with higher reproducibility should yield lower standard deviation of read counts across the range of mean of log2 read counts. We observed that results from ITP method had higher reproducibility across the three replicates for the entire range of mean transcript expression (Fig. 5c). We attribute the higher reproducibility of ITP compared to gel extraction to the higher yield of ITP. With higher yield, we sample more molecules from RNA prepared by ITP extraction, and thus the uncertainty of the mean is expected to be smaller for the same underlying distribution.
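The mean-to-variance comparison can be illustrated as follows. For each transcript, the mean and sample standard deviation of log2 read counts are computed across replicates, and the trend of standard deviation against mean is then summarized. Two assumptions are made here and are not taken from the text: a pseudocount of 1 is added before taking log2 so that zero counts are defined, and a simple binned average stands in for the cubic spline fit, to keep the sketch free of external dependencies. All names and counts are hypothetical.

```python
import math
import statistics


def log2_mean_sd(counts_by_replicate, pseudocount=1.0):
    """Per-transcript (mean, sample SD) of log2(count + pseudocount).

    `counts_by_replicate` is a list of replicates, each a list of counts
    in the same transcript order. At least two replicates are required.
    """
    result = []
    for per_transcript in zip(*counts_by_replicate):
        logs = [math.log2(c + pseudocount) for c in per_transcript]
        result.append((statistics.mean(logs), statistics.stdev(logs)))
    return result


def binned_trend(mean_sd, n_bins=3):
    """Average SD within bins of transcripts ordered by mean expression.

    A stand-in for the spline fit: returns up to n_bins pairs of
    (mean of means, mean of SDs), ordered by increasing expression.
    """
    ordered = sorted(mean_sd)
    size = math.ceil(len(ordered) / n_bins)
    trend = []
    for start in range(0, len(ordered), size):
        chunk = ordered[start:start + size]
        trend.append((statistics.mean([m for m, _ in chunk]),
                      statistics.mean([s for _, s in chunk])))
    return trend


if __name__ == "__main__":
    # Hypothetical counts: three replicates, six transcripts each.
    method_a = [[10, 40, 90, 300, 800, 2000],
                [12, 38, 95, 310, 790, 2100],
                [11, 42, 88, 295, 820, 1950]]
    method_b = [[6, 50, 70, 350, 700, 2300],
                [15, 30, 110, 260, 900, 1800],
                [10, 44, 85, 310, 780, 2050]]
    for name, data in (("method A", method_a), ("method B", method_b)):
        print(name, binned_trend(log2_mean_sd(data)))
```

A method whose trend lies lower across the whole range of mean expression would be read as the more reproducible one under this criterion.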
Detailed analyses on the sequencing read lengths can be found in ESI† section S2. In Fig. S6,† we provide the size distribution of all sequencing reads after the removal of the adapter sequence. Both methods exhibit specific peaks predominantly at 23 nt and 35–36 nt. These corresponded to rRNA contaminants, which were bioinformatically filtered. We also observed that ITP samples had a higher proportion of reads of length <20 nt, which are removed from analysis as described in Materials and methods. In Fig. S7,† we present size distribution of only the reads that aligned to the transcriptome for the two methods. We found that sequencing libraries from both methods contained only a small number of reads longer than 35 nt (<3% for ITP; <2% for gel electrophoresis) as expected.
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c9lc00311h
This journal is © The Royal Society of Chemistry 2019