Nicolas Milon (a,b), Juan-Luis Fuentes Rojas (a), Adrien Castinel (d), Laurent Bigot (c), Géraud Bouwmans (c), Karen Baudelle (c), Audrey Boutonnet (b), Audrey Gibert (d), Olivier Bouchez (d), Cécile Donnadieu (d), Frédéric Ginot (b) and Aurélien Bancaud (*a)
(a) CNRS, LAAS, 7 avenue du colonel Roche, F-31400, Toulouse, France. E-mail: abancaud@laas.fr; Tel: +33 5 61 33 62 46
(b) Adelis Technologies, 478 Rue de la Découverte, 31670 Labège, France
(c) Univ. Lille, CNRS, UMR 8523 - PhLAM - Physique des Lasers Atomes et Molécules, F-59000 Lille, France
(d) INRA, US 1426 GeT-PlaGe, INRA Auzeville, F-31326, Castanet-Tolosan Cedex, France
First published on 25th November 2019
In third generation sequencing, the production of quality data requires the selection of molecules longer than ∼20 kbp, but the size selection threshold of most purification technologies is smaller than this target. Here, we describe a technology operated in a capillary with a tunable selection threshold in the range of 3 to 40 kbp controlled by an electric field. We demonstrate that the selection cut-off is sharp, the purification yield is high, and the purification throughput is scalable. We also provide an analytical model that predicts the actuation settings of the filter. The selection of high molecular weight genomic DNA from the melon Cucumis melo L., a diploid organism of ∼0.45 Gbp, is then reported. Linked-read sequencing data show that the N50 phase block size, which scores the correct representation of two chromosomes, is enhanced by a factor of 2 after size selection, establishing the relevance and versatility of our technology.
The process flow for DNA sequencing starts with a purification step (Fig. 1A), which is most commonly operated by controlling the binding of genomic DNA on a matrix and subsequently releasing it into an appropriate buffer,8 or by DNA precipitation. Matrix binding and precipitation protocols have performed equally well for the production of quality sequencing data with SGS.9 However, molecules of more than ∼20 kbp tend to be degraded during these purification protocols,10 and a size selection step is usually carried out to remove the low MW residues before library preparation in TGS. If the MW of these by-products is very low, typically a few nucleotides, they can be readily eliminated by solid-phase reversible immobilization.11,12 The selection of higher MW residues is however more laborious, because it is often performed by pulsed-field gel electrophoresis for separation followed by band excision and electroelution.13–15 Furthermore, the yield of this size selection process is lower than 50% for molecules of 20 to 50 kbp, imposing the purification of large initial amounts of genomic material, which are not always accessible e.g. for single cell sequencing studies.16 The development of fast and high-yield size selection technologies is therefore direly needed to enhance and speed up the extraction of quality data in TGS.
Fig. 1 Principle of the DNA size selection filter for TGS. (A) The panel shows the main steps of the process flow for TGS. Our study is focused on the size selection of genomic DNA and the analysis of the resulting material by linked-read sequencing (see Methods in the ESI†). (B) The sketch represents the response of DNA in a capillary system. Transport is controlled by a Poiseuille flow opposite to the electrophoretic force (blue and red arrows, respectively). The balance between hydrodynamic and electrophoretic forces is favorable to hydrodynamic transport upstream of the constriction and to electrophoresis downstream, defining a region of DNA accumulation represented in green. By tuning the electric field (middle panel), we allow the leak of low MW DNA molecules whereas longer ones remain trapped in the concentrator. The high MW fraction is collected by switching off the electric field (bottom panel). (C) The electron micrograph shows a multicapillary system with 61 channels of 46 μm, and the sketch presents the processing of DNA in this system. Photographs of the devices shown in panels (B) and (C) are shown in ESI† Fig. S1.
Microfluidic technologies offer unique solutions for the manipulation and purification of high MW DNA. For instance, DNA electrophoresis in artificial separation matrices made out of periodic arrays of obstacles etched in glass or silicon demonstrated its relevance for the sorting of ∼100 kbp DNA in 15 s.17 Because DNA molecules are not entrapped in the separation matrix, they can subsequently be sorted and purified by size, as recently reported with a selection cut-off of ∼2 kbp.18 Alternatively, we recently developed the “μ-Laboratory for DNA Analysis and Separation” (μLAS) technology for DNA analysis in the size range 0.1 to 200 kbp.19,20 This technology is operated in a capillary electrophoresis system by controlling the fluid flow and using a counter electrophoretic force.21 The key component of μLAS is a constriction, in which DNA can be concentrated before its analysis with a limit of detection of 10 fg μL−1.19,22 Because the technology is operated without separation matrices, it can be used for DNA size selection.20 Here, we set out to develop a tunable size selection filter to rapidly produce 10 to 20 ng of DNA for TGS. The first section of this report describes the operating principle of this filter, which is subsequently operated in a monocapillary system. The selection cut-off can be tuned with excellent precision in the size range 3 to 40 kbp by adjusting the electric field in the range of 3 to 7 kV m−1. We then prove that the throughput of the filter is scalable by parallelizing the technology in a multicapillary system of 61 channels. These settings allow us to produce 17 ng of high MW genomic DNA in two hours.
The resulting sample is analyzed by linked-read sequencing, which consists of partitioning long DNA molecules into one million droplets, each containing a specific barcode and the material for library preparation, before standard short-read sequencing.23 For haplotype resolution, the sequencing data are significantly improved by the purification. We finally discuss the different applications for which the μLAS size selection technology may offer advantages to obtain quality sequence data.
More quantitatively, we showed that the transverse viscoelastic force FVE increased linearly with the fluid maximum velocity V0 and the electrophoretic velocity Ve = μ0E with μ0 being the mobility,24,25 following a scaling in the form:
![]() | (1) |
The force in eqn (1) is equivalent to an elastic spring that keeps the molecule near the wall, allowing us to deduce the average position of DNA from the wall based on Boltzmann statistics:24,25
![]() | (2) |
The threshold size Nc corresponds to the situation of null velocity with balanced hydrodynamic and electrophoretic velocities at the constriction:
![]() | (3) |
By plugging eqn (2) into eqn (3), we deduce that:
![]() | (4) |
Because the electric field can be tuned with excellent precision, the size selection threshold Nc appears to be highly tunable. The cubic dependence on electric field also suggests that Nc can be adjusted over a broad range with small variations of the electric field.
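The cubic sensitivity to the electric field can be illustrated with a minimal sketch that rescales a known operating point, assuming only the proportionality Nc ∝ V0/E³ stated in the text. The function name, its arguments, and the choice of reference point (a cut-off of ∼7 kbp at 5.4 kV m−1, as reported for the monocapillary in this paper) are illustrative assumptions; the prefactor of eqn (4) is not used.

```python
def rescale_cutoff(nc_ref_kbp, e_ref, e_new, v_ref=1.0, v_new=1.0):
    """Rescale a reference cut-off, assuming Nc is proportional to V0 / E**3.

    Hypothetical helper: only ratios of fields and velocities matter, so any
    consistent units can be used (here kV/m for E and mm/s for V0).
    """
    if min(e_ref, e_new, v_ref, v_new) <= 0:
        raise ValueError("fields and velocities must be positive")
    return nc_ref_kbp * (v_new / v_ref) * (e_ref / e_new) ** 3


# Reference point taken from the text: ~7 kbp at 5.4 kV/m (flow velocity unchanged).
cutoff_3kv = rescale_cutoff(7.0, e_ref=5.4, e_new=3.0)
print(f"Predicted cut-off at 3.0 kV/m: {cutoff_3kv:.1f} kbp")
```

Under these assumptions, lowering the field from 5.4 to 3 kV m−1 raises the predicted cut-off to roughly 41 kbp, in line with the ∼40 kbp cut-off reported at that field. The pure scaling is only indicative, since measured cut-offs scatter around the model.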
The prediction of the size selection threshold (eqn (4)) is expected to be relevant for a multicapillary system, which consists of parallel channels of equal diameter (Fig. 1C, see Methods in the ESI† for details on the fabrication protocol). Indeed, the flow velocity and electric field are the same in each narrow channel, and they can be set to the same regime of concentration as in the monocapillary system (green rectangle in Fig. 1C). Notably, because the total flow rate is the sum of that flowing in each channel, we expect the multicapillary system to be a scalable technology for processing larger volumes and larger DNA quantities.
The multi-capillary system is based on the technology developed to manufacture photonic crystal optical fibers by the stack-and-draw method.33 More precisely, a meter-long silica tube of 25 mm outer diameter (OD) and 15 mm ID was elongated into several tens of millimeter-sized capillaries. This operation was performed on a drawing tower, a vertical apparatus consisting of a preform feeding unit, a high temperature furnace, and a tractor unit. The resulting monocapillaries were then assembled manually to form a hexagonal stack inserted into a tube of 25 mm OD and 19 mm ID. This stack was eventually drawn into a multi-capillary cane using the same drawing tower. A fiber of ∼900 μm in OD comprising 61 capillaries of 45.8 μm in diameter (standard deviation 0.6 μm) was obtained. In order to fabricate the multicapillary system, we fitted a 2 cm long section of the multicapillary fiber into a glass vial insert of 300 μL (# 4025 GF-625, J.G. Finneran Associates, Vineland, NJ) and sealed it with a UV-curable acrylate glue.
The multicapillary device was operated on a custom prototype with two electrodes connected to the insert and collection vial. Liquid flows characterized by a mean flow velocity of 1.0 mm s−1 were controlled by gravity (see the ESI† for hydrodynamic modelling). The samples were manually loaded at the entrance of the fiber and a carousel with several tubes was placed at the outlet for sample fractionation.
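To give an order of magnitude for the throughput, the following sketch sums the flow rate over the parallel channels, assuming that the mean velocity of 1.0 mm s−1 applies inside each of the 61 channels of 45.8 μm. The function name is hypothetical, and the per-channel flow rate is simply taken as cross-section times mean velocity.

```python
import math


def total_flow_rate_nl_per_s(n_channels, diameter_um, mean_velocity_mm_s):
    """Total flow rate as the sum of identical per-channel flow rates."""
    area_um2 = math.pi * (diameter_um / 2.0) ** 2      # channel cross-section
    velocity_um_s = mean_velocity_mm_s * 1000.0        # mm/s -> um/s
    per_channel_um3_s = area_um2 * velocity_um_s       # volume per unit time
    return n_channels * per_channel_um3_s / 1e6        # 1 nL = 1e6 um^3


q_total = total_flow_rate_nl_per_s(61, 45.8, 1.0)
print(f"{q_total:.0f} nL/s, i.e. {q_total * 60 / 1000:.1f} uL/min")
```

With these values the 61 channels carry about 100 nL s−1 in total, i.e. close to 6 μL min−1, whereas a single channel of the same diameter would carry 61 times less.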
After size selection operations with the mono- or multicapillary systems, the different sample fractions were characterized with the μLAS separation and titration protocol described in ref. 20. The sizing and quantification errors are 3% and 10%, respectively.
During the second step, we collected the retained high MW fraction by hydrodynamics only for ten minutes in a fresh vial containing 10 μL of buffer. The final volume of the collected sample was 11.5 μL. The leak and retained fractions were subsequently analyzed using the μLAS high MW separation method developed in ref. 20 in order to determine their molecular composition and concentration. In Fig. 2A, we report the reference ladder together with the leak and retained fractions (black, green, and red curves, respectively), as obtained with a selection filter actuated with an electric field of 5.4 kV m−1. The size of the cut-off appeared to be ∼7 kbp with a sharp separation between the bands of 6 and 8 kbp. The superposition of the different curves showed that the yield of this purification process was higher than ∼80% for all the bands in the sample.
Fig. 2 Tunable DNA size filtration in a monocapillary. (A) The chromatograms show the separation of the DNA ladder with the μLAS technology using the separation settings reported in ref. 20. The dashed black curve corresponds to the reference ladder with a few annotated bands, and the leak and retained fractions are plotted in green and red, respectively. The electric field was set to 5.4 kV m−1 during the selection phase, corresponding to a cut-off size of ∼7 kb. The red and green curves have been multiplied by the dilution factors 11.5 and 30, respectively. (B) The same ladder shown in (A) is fractionated into four fractions using three consecutive settings for the electric field of 7.5, 5.4, and 4.2 kV m−1, and a final collection step without any electric field (as indicated in the legend).
We then modulated the electric field in order to fractionate the same ladder sample into four fractions. Using three electric field settings of 7.5, 5.4, and 4.2 kV m−1, we performed three consecutive purification phases of one hour each, and finally collected the remaining fraction for 10 minutes. The resulting four fractions were analyzed to evaluate the three size selection cut-offs of 4, 7 and 12 kbp (Fig. 2B). The final retained fraction plotted in green contained the peaks of 15, 20, and 50 kbp, suggesting that our technology was adequate for high MW DNA purification in the context of TGS technologies. Furthermore, the size selection threshold could be adjusted to ∼40 kbp, i.e. with the purification of the 50 kbp band from the rest of the ladder (ESI† Fig. S2), by setting the electric field to 3 kV m−1.
We then checked the validity of our model to predict the size selection cut-off (eqn (4)). We performed a series of 13 experiments using different settings for the flow velocity and the electric field in the ranges 0.3 to 1.5 mm s−1 and 3 to 7 kV m−1, respectively. We estimated the cut-off size Nc with a precision determined by the size difference between the last peak in the leak fraction and the first peak in the retained fraction. We plotted Nc as a function of the electric field (Fig. 3A), showing a non-linear decrease as the electric field increased. We also noted that Nc increased with the flow velocity at a constant electric field, in agreement with the prediction of our model in eqn (4). The relevance of our model was confirmed by plotting the size cut-off as a function of V0/E³ (Fig. 3B): we detected a linear trend associated with a Pearson coefficient R² of 0.94 (dashed line in Fig. 3B). Altogether, we demonstrate the principle of a versatile and predictable DNA selection filter with a tunable cut-off size in the range of 3 to 40 kbp controlled by an electric field delivered by a commercial capillary electrophoresis system.
Fig. 3 Control of the size selection threshold by the electric field. (A) The graph presents the size selection cut-off as a function of the electric field for various flow velocities, as indicated by the color bar. Note that five experiments were carried out with a constant electric field of 3.8 kV m−1 and different flow velocities. (B) The same dataset can be cast on a master curve (dashed line) using the normalization suggested by our model (eqn (4)).
Fig. 4 Saturation of the DNA filter. (A) The graph shows the yield of the leak and retained fractions (represented as dashed and solid lines, respectively) as a function of DNA size in logarithmic scale. We used the same ladder as that in Fig. 2 and set the cut-off to 9 kbp (green line). The concentration is determined with an error of 10%, as shown in ref. 20. (B) The fluorescence micrographs present a time-lapse recording of the response of high MW DNA during the size selection process. Molecules are concentrated at the entry of the narrow capillary (shown with dashed white contour lines). The bright cluster marked in red is not retained and leaks away from the constriction.
In order to investigate the saturation of the size selection filter, we performed live fluorescence microscopy analysis of the dynamics of DNA concentration at the constriction. We injected 8 ng of 50 kbp DNA, i.e. forty times more than the saturation threshold, and set the cut-off to 10 kb with an electric field and hydrodynamic flow velocity of 5.4 kV m−1 and 1.5 mm s−1, respectively. We detected the accumulation of DNA molecules in the narrow capillary, and the formation of bright clusters of heterogeneous sizes (Fig. 4B). These clusters could not be stably retained, as shown by one escape event associated with hydrodynamics-dominated transport of a bright mass of DNA along the narrow channel (Fig. 4B). By approximating the region where DNA accumulates as a cylinder of 50 μm in diameter and 200 μm in height, we evaluated its volume to be ∼0.4 nL. Taking the saturation threshold to be 0.2 ng, we then deduced that the size selection filter saturated for a DNA concentration of ∼0.5 mg mL−1. Note that this estimate is likely too low because DNA accumulates close to the walls and not evenly across the capillary section. The DNA concentration at saturation then appears to be lower than the solubility limit of this biomolecule in the range 10–100 mg mL−1,28 but larger than the threshold of DNA aggregation driven by AC electric fields of 0.05 mg mL−1.29 Consequently, irrespective of the mechanism of clustering, the saturation of the size selection filter appears to arise from the confinement of the retained fraction in a narrow volume and the ensuing molecular aggregation.
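The estimate of the saturation concentration can be reproduced with a short calculation, given here as a sketch that uses only the cylinder dimensions and the 0.2 ng threshold quoted in the text. The function name is hypothetical.

```python
import math


def cylinder_volume_nl(diameter_um, height_um):
    """Volume of a cylinder in nanoliters (1 nL = 1e6 um^3)."""
    return math.pi * (diameter_um / 2.0) ** 2 * height_um / 1e6


volume_nl = cylinder_volume_nl(50.0, 200.0)
# 1 ng/nL is the same concentration as 1 mg/mL.
saturation_mg_per_ml = 0.2 / volume_nl
print(f"Volume: {volume_nl:.2f} nL, saturation: {saturation_mg_per_ml:.2f} mg/mL")
```

The calculation gives a volume of about 0.39 nL and a concentration of about 0.51 mg mL−1, matching the rounded values of ∼0.4 nL and ∼0.5 mg mL−1.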
We first checked that the size selection threshold was similar by performing sample fractionation with an electric field set to 4 kV m−1 during the retention phase. We expected a size cut-off of 12 kb, and measured it at 9 kbp (ESI† Fig. S4A), confirming that the operating principle of the technology remained nearly the same in the mono- and multi-capillary systems. The quality of the selection filter's cut-off was slightly decreased, as we detected residual amounts of the 8 and 6 kbp bands in the retained fraction of 7 and 3%, respectively. In order to select high MW DNA for sequencing applications, we then set the size selection cut-off to ∼40 kbp by adjusting the electric field to 2.9 kV m−1 (Fig. 5). For this experiment, we spiked the DNA ladder with an additional DNA fragment of 100 kbp, which was collected in the retained fraction together with the 50 kbp band (red curve in Fig. 5). Because the threshold was close to 50 kbp, we also detected a small proportion of 50 kbp DNA in the leak fraction (blue curve in Fig. 5). These experiments hence showed that the selection technology could be operated with a multicapillary system with the same electro-hydrodynamic actuation parameters as those in the monocapillary.
We then evaluated the saturation threshold of the multicapillary device by injecting increasing amounts of the kb Extend DNA ladder from 5 to 50 ng, i.e. 1.5 ng to 15 ng retained at the constriction, with the same actuation parameters as those in ESI† Fig. S4A. The collection yield appeared to be lower in the multicapillary vs. monocapillary system. We indeed measured collection yields of 75% and 55% for the leak and retained fractions, respectively (ESI† Fig. S4B). The origin of this apparent loss of DNA remains unclear, but we suspect that some molecules remain trapped on the outer glass shell of the multicapillary of 250 μm (Fig. 1C), where the flow velocity and electric fields are nearly null. The selection cut-off appeared to be sharp for 1.5 and 4.5 ng of high MW DNA at the constriction, but the size selection cut-off broadened for 15 ng of DNA. The presence of the 10 to 50 kbp fragments in the leak fraction indicated the saturation of the multicapillary system, likely associated with the formation of aggregates during the size selection process. Consequently, the scale up of our technology to 61 capillaries allowed us to increase the saturation threshold by a factor of 4/0.2 ≈ 20, meeting the objective of increasing the throughput without changing the actuation settings.
We first processed a low quantity of 2.5 ng of the melon DNA sample using the same electric field of 2.9 kV m−1 as that for the DNA ladder. As expected, the retained fraction was larger than ∼40 kbp (red curve in Fig. 6A) and the leak fraction was centered at ∼25 kbp (blue curve in Fig. 6A). Next, we aimed to increase the collection to reach the ∼15 ng required for sequencing operations. Because the proportion of molecules of more than 40 kbp represented ∼20% of the sample and the collection yield was ∼50%, we performed the operation with an initial input of 100 ng. In addition, we repeated the size selection operation 4 times consecutively so as to retain 5 ng at the constriction each time and avoid saturation. At the end of this selection process, which took place in 2 hours, we collected 25 μL of material at a concentration of 0.7 ng μL−1. Hence, we obtained ∼17 ng of genomic DNA, in good agreement with our initial specifications. The size distribution of the purified sample was not as sharp as that in the calibration experiment, as shown by the presence of molecules of 15 kb (red curve in Fig. 6B), but the removal of DNA fragments shorter than ∼10 kbp was clearly achieved. Because the selection threshold was lower than expected, we note that the recovery yield was lower than initially expected. This degradation of size selection performance may be due to the rapid saturation of the filter caused by the presence of very high MW molecules (see more below). We indeed noticed that the saturation threshold decreased with increasing DNA MW, as for instance exemplified by the saturation at 5 ng with a fragment of 0.2 kbp (data not shown).
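The bookkeeping of this experiment can be written as a short sketch, using only the figures quoted above (100 ng input, ∼20% of molecules above 40 kbp, 4 consecutive runs, 25 μL collected at 0.7 ng μL−1). The function names are hypothetical and the calculation is plain arithmetic rather than a model of the filter.

```python
def retained_load_per_run_ng(input_ng, high_mw_fraction, n_runs):
    """Mass expected at the constriction in each run if the input is split evenly."""
    return input_ng * high_mw_fraction / n_runs


def collected_mass_ng(volume_ul, concentration_ng_per_ul):
    """Mass recovered in the collection vial."""
    return volume_ul * concentration_ng_per_ul


per_run = retained_load_per_run_ng(100.0, 0.20, 4)
collected = collected_mass_ng(25.0, 0.7)
print(f"Load per run: {per_run:.1f} ng, collected: {collected:.1f} ng")
```

This reproduces the 5 ng retained at the constriction in each run and the ∼17 ng collected at the end of the process.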
We sequenced processed vs. unprocessed genomic DNA samples, and obtained the same assembly size of ∼366 Mbp, representing ∼81% of the genome (Table 1). This size was comparable to the 375 Mbp obtained by pyrosequencing.30 The average length of the sequenced DNA fragments was also comparable in both samples (second line in Table 1), in apparent contradiction with the size selection process. Conversely, the number of contigs longer than 50 kbp was 63% greater after size selection by μLAS. These results can be explained if some high MW molecules are eliminated during the selection process, probably because they form clusters and tend to leak during the selection phase. The sequencing of these long molecules increases the coverage of the genome, which explains why the scaffold N50 size was 68% longer without purification. Nevertheless, the narrowing of the size distribution after size selection (red curve in Fig. 6B) allows us to obtain a higher number of contigs of more than 50 kbp. Interestingly, during library preparation, the partitioning of the unprocessed sample with its broad size distribution creates discrepancies in the depth of coverage,31 likely resulting in low quality sequencing data for long molecules. In contrast, the narrow size distribution of DNA fragments obtained with μLAS ensures homogeneous depth of sequencing and quality data for the identification of single nucleotide polymorphisms. This proposition explains the doubling of the N50 phase block size, which scores the accurate representation of the two chromosomes from sequencing data, after size selection, as well as the better performance in terms of the average number of uncalled bases (last line in Table 1). Altogether, because the resolution of haplotypes is a key asset of the Chromium technology, we conclude that the size selection of genomic DNA with μLAS is not indispensable for sequencing but enables us to obtain quality sequencing data.
Metric | Without size selection | μLAS |
---|---|---|
Assembled size (Mbp) | 367 | 366 |
Length-weighted mean (kbp) | 25.0 | 24.6 |
Contigs > 50 kbp | 258 | 420 |
Largest contig (Mbp) | 6.8 | 7.2 |
Scaffold N50 (Mbp) | 2.58 | 1.54 |
N50 phase block (Mbp) | 8.1 | 16.5 |
Average number of uncalled bases per kbp | 3.0 | 2.5 |
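The relative changes quoted in the text follow directly from the entries of Table 1, as the following sketch shows. The variable names are illustrative; each pair is (without size selection, μLAS).

```python
# Values copied from Table 1: (without size selection, with uLAS).
contigs_over_50kbp = (258, 420)
scaffold_n50_mbp = (2.58, 1.54)
n50_phase_block_mbp = (8.1, 16.5)

# Gain in the number of long contigs after size selection.
contig_gain_pct = (contigs_over_50kbp[1] / contigs_over_50kbp[0] - 1) * 100
# Scaffold N50 is longer WITHOUT purification, so the ratio is inverted.
scaffold_excess_pct = (scaffold_n50_mbp[0] / scaffold_n50_mbp[1] - 1) * 100
# Fold change of the N50 phase block size after size selection.
phase_block_fold = n50_phase_block_mbp[1] / n50_phase_block_mbp[0]

print(f"Contigs > 50 kbp: +{contig_gain_pct:.0f}% with uLAS")
print(f"Scaffold N50: +{scaffold_excess_pct:.0f}% without purification")
print(f"N50 phase block: x{phase_block_fold:.2f} with uLAS")
```

The three figures come out at about 63%, 68%, and a factor of 2.0, consistent with the values discussed above.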
Future lines of development concern the better processing of DNA molecules of more than ∼100 kbp. Our data indeed indicate that these molecules tend to be eliminated during size selection. Consistently, this limitation has not been detected during the calibration steps, which have been performed with a DNA ladder whose largest band is 100 kbp. Hence, specific improvements should be made to define operating conditions in the range of 50 to 300 kbp with a dedicated ladder. While we recently showed that DNA separation could be performed for fragments of up to 200 kbp,20 the saturation of our technology remains to be evaluated in this size range. In the low MW size limit, size selection would also be valuable for other applications, including enhanced analysis of circulating cell-free DNA in blood plasma.32 An adequate formulation of the viscoelastic buffer has already been reported for the separation and concentration of low MW DNA,19 and promising results of purification have been obtained with a favorable saturation threshold of 5 ng in a monocapillary system (data not shown).
Regarding future applications, the potential of μLAS for purification of minute amounts of genomic material and its sensitivity of 10 fg μL−1 (ref. 20) may be particularly useful for single cell sequencing studies.16 In this context, the number of molecules is minimal and the collection yield is critical. Because one human cell contains a few pg of DNA, saturation is not expected to be an issue. Hence, purification operations may be performed in the monocapillary system, which shows the best performance. The preservation of long chromosome fragment integrity throughout the size selection process should be carefully evaluated, requiring the development of specific solutions for quality control of minute samples of very high MW. The resulting technologies for quality control and size selection may contribute to the better analysis of genomic heterogeneities and the interplay between allele variation and gene expression.
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c9lc00965e
This journal is © The Royal Society of Chemistry 2020