G. Sampath
59 Washington Street #178, Santa Clara, CA 95050, USA. E-mail: sampath_2068@yahoo.com
First published on 17th October 2014
A tandem electrolytic cell with the structure [cis1, upstream nanopore (UNP), trans1 = cis2, downstream nanopore (DNP), trans2], an exonuclease enzyme attached to the downstream side of UNP, and a chemical adapter in or a profiled voltage over DNP can be used with bandwidths of a few kHz to sequence bases in ssDNA in natural order with high accuracy and without loss in trans1/cis2 or regression into DNP from trans2.
In ‘strand sequencing’ negatively charged bases (A, T, C, G) in a single strand of DNA (ssDNA) passing from cis to trans are identified by their current blockade levels.12 Base discrimination with this method is limited by, among other things, the length of the pore and the speed of strand translocation. In ‘exonuclease sequencing’13 an exonuclease enzyme that is in cis and adjacent to the nanopore cleaves single bases (more correctly dNMPs or ‘deoxynucleoside monophosphates’,13 where N = A, T, C, or G; the term ‘base’ is commonly used instead) in ssDNA and drops them into the pore for identification. This method can discriminate among the base types (including methylated bases) with 99.8% accuracy13 but it has a major problem: a cleaved base may diffuse back into the cis bulk and be missed or called out of order.14
In this communication an electrolytic cell structure with two nanopores in tandem is proposed for DNA sequencing. Formally it is written as the pipeline [cis1, upstream nanopore (UNP), trans1 = cis2, downstream nanopore (DNP), trans2]. An exonuclease enzyme covalently bonded to the downstream side of UNP cleaves the leading base from ssDNA threading through UNP. The cleaved base is detected as it passes through DNP while being slowed down by a chemical adapter inside DNP or a profiled voltage applied over a segment of DNP. A Fokker–Planck model of the tandem cell shows that it avoids the difficulties encountered in strand and exonuclease sequencing (as it is currently known13) and can, with probability approaching 1, sequence ssDNA without loss to diffusion or errors in sequence order. In particular, homopolymers (repeat bases) do not present a problem.
With this approach, sequencing efficiency depends only on the level of discrimination in DNP among the four (or more, if modifications like methylation13 are considered) base types. Possible biological and biological-synthetic implementations are considered. With a biological DNP and a cyclodextrin adapter, up to 99.8% accuracy is possible.13 Only the essential features of the proposed structure and its mathematical model are presented here; the details are given in the ESI,† which also contains brief notes on materials and on other two-pore sequencing methods that have been reported.
Fig. 1 shows a schematic of the tandem cell. Most of the potential difference V05 (∼99%) drops across the two pores.14 Here it is assumed that with a large enough V05 a strand of ssDNA is captured in the mouth of UNP, threads through UNP, and presents itself to the exonuclease for cleaving on the trans1/cis2 side of UNP. With V05 = 0.4 V and 49.5% of it dropping across each of the two pores, 198 mV across UNP is sufficient to ensure capture-threading (see Fig. 7, ref. 1), and a similar 198 mV across DNP is sufficient for a cleaved base to translocate through DNP and be detected there during its passage.13 Sequencing will be accurate if: (a) cleaved bases arrive at and are captured by DNP in their natural order; (b) DNP identifies each and every base as it passes through; and (c) the detected base exits DNP without regressing. All these conditions are satisfied in the tandem cell based on the following informal rationale (see below for more formal arguments based on a mathematical model): (1) if the leading base is cleaved by the enzyme at a rate of 1 every 10–80 ms,14,15 then with L23 = 1 μm, mobility μ = 2.4 × 10⁻⁸ m² V⁻¹ s⁻¹, and diffusion constant D = 3 × 10⁻¹⁰ m² s⁻¹, the mean time for a cleaved base to translocate through trans1/cis2 is <1.667 ms (= L23²/2D, the mean with zero drift voltage; see ESI†). If the translocation time spread is not too high then successive bases will not enter DNP out of order; (2) a cleaved base cannot regress into cis1 because the remaining DNA strand blocks its passage; (3) as in (1), the mean translocation time for a cleaved base through DNP is ∼0.167 μs ≪ 10 ms = minimum time between two successive bases arriving at DNP. Therefore two bases do not occupy DNP at the same time; (4) likewise, a detected base exiting into trans2 under the influence of V05 cannot regress into DNP from trans2 for large enough V05.
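The arithmetic behind this informal rationale can be checked with a few lines of Python. This is a minimal sketch using only the values quoted above; the variable and function names are illustrative, and L34 = 10 nm is assumed (the upper end of the 8–10 nm range given for DNP).

```python
# Back-of-envelope check of the informal rationale (values taken from the text).
V05 = 0.4               # total applied potential difference, V
PORE_FRACTION = 0.495   # fraction of V05 dropping across each pore
D = 3e-10               # diffusion constant of a cleaved base, m^2/s
L23 = 1e-6              # length of trans1/cis2, m
L34 = 10e-9             # length of DNP, m (assumed: upper end of 8-10 nm)
T_CLEAVE_MIN = 10e-3    # shortest interval between cleavages, s


def zero_drift_mean_time(length_m: float, diffusion: float) -> float:
    """Mean time to diffuse from a reflecting wall to an absorbing wall
    a distance L away with no drift: L^2 / (2D)."""
    return length_m ** 2 / (2.0 * diffusion)


v_pore = V05 * PORE_FRACTION
t_trans1 = zero_drift_mean_time(L23, D)
t_dnp = zero_drift_mean_time(L34, D)

print(f"voltage per pore:       {v_pore * 1e3:.0f} mV")
print(f"trans1/cis2 mean time:  {t_trans1 * 1e3:.3f} ms")
print(f"DNP mean time:          {t_dnp * 1e6:.3f} us")
print(f"cleavage interval / trans1 time: {T_CLEAVE_MIN / t_trans1:.1f}")
```

The zero-drift value is an upper bound on the mean when the drift field points downstream, which is why the text quotes the trans1/cis2 time as "<1.667 ms".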
The likelihood of regression can be further reduced with a reinforcing drift field over trans2 using an additional electrode V4 at the top of trans2 and a voltage difference applied between V4 and V5.
More formally, the behavior of a cleaved base as it translocates through trans1/cis2 and DNP can be studied via the trajectory of a particle whose propagator function G(x,y,z,t) is given by a linear Fokker–Planck (F–P) equation14,16 in one dimension (DNP) or three (trans1/cis2).
(1) Translocation of cleaved base through DNP. A 1-d approximation is applied to DNP resulting in an initial-value boundary-value problem that can be solved analytically.16,17 A cleaved base is treated as a particle that is released at z = 0, t = 0; reflected at z = 0, t > 0; and captured at z = L34, t > 0.
The probability density function (pdf) of the first passage time (translocation time) for a particle to diffuse-drift from z = 0 to z = L34 and get absorbed at z = L34 is obtained using standard methods (see ESI† for details). Equivalently, results from a recently published model14 of exonuclease sequencing have been modified (see ESI†) to obtain analytical expressions for the translocation time (T) statistics (mean and standard deviation) of a cleaved base passing through DNP. Fig. 2 shows the mean E(T) for different voltages (including negative voltages, see below for more on using these to slow down a translocating base) over DNP. (The standard deviation σ(T) is very close to the mean and is not shown; see ESI.†) With a biological DNP based on α-hemolysin (AHL) and using the optimum potential difference of ∼0.18 V across the pore of a conventional cell as determined in exonuclease sequencing studies,13 E(T) and σ(T) are ∼10⁻⁸ s, which is too fast for the detector electronics.
Fig. 2 Mean of time for particle to translocate from time of entry into DNP (negligible cross-section and length L34 = 8–10 nm) to time of exit into trans2. (Standard deviation very close to mean, see ESI.†) Parameter values used: mononucleotide mobility μ = 2.4 × 10⁻⁸ m² V⁻¹ s⁻¹, diffusion constant D = 3 × 10⁻¹⁰ m² s⁻¹. Calculations are for absolute potential differences in the range 0.0–0.3 V. Negative field across DNP results in markedly increased translocation times, see text below.
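The mean translocation time plotted in Fig. 2 can be illustrated with the standard closed form for the mean first passage time of a drift-diffusion particle released at a reflecting boundary and absorbed a distance L away. The sketch below assumes a uniform field, so that the drift velocity is v = μV/L; that this textbook expression coincides with the ESI derivation is an assumption, and the function name is illustrative.

```python
import math

MU = 2.4e-8   # mononucleotide mobility, m^2 V^-1 s^-1 (from the text)
D = 3e-10     # diffusion constant, m^2 s^-1 (from the text)


def mean_first_passage_time(voltage: float, length: float,
                            mobility: float = MU,
                            diffusion: float = D) -> float:
    """Mean time from a reflecting wall at z = 0 to an absorbing wall at z = length.

    Assumes a uniform field, i.e. drift velocity v = mobility * voltage / length.
    Closed form: E(T) = L/v - (D/v^2) * (1 - exp(-v*L/D)),
    which tends to L^2 / (2D) as v -> 0.
    """
    peclet = mobility * voltage / diffusion       # equals v*L/D, dimensionless
    if abs(peclet) < 1e-6:
        return length ** 2 / (2.0 * diffusion)    # zero-drift limit, avoids 0/0
    # Same formula rewritten as (L^2/D) * (Pe - 1 + exp(-Pe)) / Pe^2;
    # expm1 keeps precision when Pe is small.
    return (length ** 2 / diffusion) * (peclet + math.expm1(-peclet)) / peclet ** 2


# Positive (forward) and negative (opposing) voltages over a 10 nm pore.
for volts in (0.2, 0.0, -0.18):
    t = mean_first_passage_time(volts, 10e-9)
    print(f"V34 = {volts:+.2f} V -> E(T) = {t:.3e} s")
```

With these parameters the expression gives roughly 20 ns for +0.2 V over 10 nm, and a time in the millisecond range when −0.18 V is applied uniformly over the same length. The three-segment profiled field discussed later is not modeled in this sketch.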
(2) Translocation of a cleaved base through trans1/cis2. This is modeled in three dimensions. Assuming trans1/cis2 to be a rectangular box-shaped region (0 ≤ z ≤ L23; −d/2 ≤ x, y ≤ d/2), a particle is released at the top and translocates to the bottom of the compartment where it is ‘absorbed’. That is, the particle is detected when it reaches z = L23 independent of x and y and moves into DNP without regressing into trans1/cis2. Its behavior in trans1/cis2 is described by an F–P equation in three dimensions with the following initial and boundary conditions: (a) the particle is released at position (0,0,0) at t = 0; (b) it is reflected at x, y = ±d/2 and at z = 0 for all t > 0; (c) it is ‘absorbed’ at z = L23, t > 0. Since the initial condition G(x,y,z,0) = δ(x,y,z) = δ(x)δ(y)δ(z) is separable in x, y and z, the propagator function G(x,y,z,t) can be written as the product of three independent propagator functions.16 Based on the definition of ‘absorption’ as given above, it can be shown that diffusion in the x and y directions has no effect on the first passage time, so that the first passage time distribution in the three-dimensional case reduces to that in the one-dimensional case. The derivation can be found in the ESI.† Fig. 3 shows the dependence on pore voltage of the mean translocation time E(T) of a cleaved base through a trans1/cis2 compartment of length L23 = 1 μm. (Once again, the standard deviation is not shown; see ESI.†)
Fig. 3 Mean of translocation time for particle (cleaved base) released by exonuclease at top of trans1/cis2 (3-dimensional box with height 1 μm and cross-section 1 μm²) to move to entrance of DNP. (Standard deviation in same range as mean, not shown; see ESI.†) Parameter values used are same as in Fig. 2. Calculations are for potential differences V05 in the range 0.0–0.3 V, with 1–2 mV dropping across trans1/cis2.
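The reduction of the three-dimensional problem in trans1/cis2 to one dimension can be sketched as follows. The sketch assumes, as in the text, drift along z only, reflecting side walls, and absorption at z = L23 regardless of x and y. The symbols G_x, G_y, G_z, S(t), and f_T(t) are notation introduced here for illustration; the full derivation is in the ESI.

```latex
\begin{align}
G(x,y,z,t) &= G_x(x,t)\,G_y(y,t)\,G_z(z,t), \\
S(t) &= \int_{-d/2}^{d/2} G_x\,dx \;\int_{-d/2}^{d/2} G_y\,dy \;\int_{0}^{L_{23}} G_z\,dz
      \;=\; \int_{0}^{L_{23}} G_z(z,t)\,dz, \\
f_T(t) &= -\frac{dS(t)}{dt}.
\end{align}
```

Here S(t) is the probability that the particle has not yet been absorbed at time t. The second equality holds because reflecting walls conserve probability in x and y, so each lateral integral equals 1 for all t. The first passage time density f_T(t) therefore depends only on G_z, which is the one-dimensional propagator.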
The F–P model summarized above is a piecewise model that does not consider the behavior of the particle at the interface between two sections. In reality a particle oscillates at an interface because of diffusion. Formal probabilistic arguments can be used (see ESI†) to show in the case of trans1–DNP that with sufficiently large V05 the particle eventually passes into DNP, such passage being aided directly by the positively directed drift potentials in both sections (and indirectly by the reflecting boundaries in trans1/cis2). The behavior at the interface between DNP and trans2 is similar. The tapered geometry of trans1/cis2 shown in Fig. 1 aids drift into DNP. As with the box geometry considered above it can be modeled with an F–P equation and boundary conditions appropriate to such a geometry. Similarly the abrupt increase in cross-section from DNP to trans2 decreases the probability of a detected particle regressing into DNP from trans2.
The translocation of a base through DNP is too fast for the detector electronics. Methods to slow down translocation include the use of magnetic or optical tweezers,18 alternative electrolytes and/or increased salt concentration,19 and molecular brakes.20 With a biological DNP (AHL) a covalently attached adapter13 inside the pore can be used for slowdown; the bandwidth required is then ∼20 kHz.13 An alternative approach for use with a synthetic DNP, based on a profiled electric field over DNP (Fig. 4), is considered here. To obtain such a profile, voltages V34-1 and V34-2 are applied to lateral electrodes in DNP at aL34 and aL34 + bL34, dividing the pore into three segments of length aL34, bL34, and cL34 with a + b + c = 1, V0 < V34-1, V34-1 > V34-2, and V34-2 < V5. Calculations show that translocation over the segment [aL34, aL34 + bL34] is considerably slowed down by the negative field, which also dominates the total translocation time over DNP. With L34 = 10 nm, a = c = 0.3, b = 0.4, Va = Vc = V34-1 − V0 = 0.05 V, and Vb = V34-2 − V34-1 = −0.18 V, the translocation time goes up from ∼20 ns (for V34 = 0.2 V) to ∼2.9 ms. See Fig. 2.
The maximum voltage difference that can be applied over any stage in the tandem cell is set by the value of the breakdown field for distilled water (∼70 MV m⁻¹); thus voltage differences of up to ∼0.7 V are possible over a length of 10 nm. The optimum values for V05, V34-1, and V34-2 can be determined from experiment. With AHL for DNP the optimum over DNP13 to prevent regression into DNP is ∼0.18 V.
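As a quick consistency check, the field magnitude in each segment of the profiled DNP can be compared with the breakdown limit. The sketch below assumes the voltage varies linearly within each segment (uniform field per segment); the function name and the tuple layout are illustrative choices, and the numbers are the example values quoted above.

```python
BREAKDOWN_FIELD = 70e6  # V/m, approximate breakdown field of distilled water (from the text)


def segment_fields(length, fractions, voltage_drops):
    """Return the field magnitude (V/m) in each segment of a piecewise-linear profile.

    length        -- total pore length, m
    fractions     -- segment length fractions (a, b, c); must sum to 1
    voltage_drops -- voltage difference across each segment, V (sign ignored)
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("segment fractions a, b, c must sum to 1")
    return [abs(dv) / (f * length) for f, dv in zip(fractions, voltage_drops)]


# Example values from the text: L34 = 10 nm, a = c = 0.3, b = 0.4,
# Va = Vc = 0.05 V, Vb = -0.18 V.
fields = segment_fields(10e-9, (0.3, 0.4, 0.3), (0.05, -0.18, 0.05))
for name, field in zip("abc", fields):
    status = "ok" if field < BREAKDOWN_FIELD else "exceeds breakdown"
    print(f"segment {name}: {field / 1e6:.1f} MV/m ({status})")

# Largest voltage difference that stays below breakdown over 10 nm.
max_drop_10nm = BREAKDOWN_FIELD * 10e-9
print(f"maximum drop over 10 nm: {max_drop_10nm:.2f} V")
```

Under this assumption all three segments stay below the breakdown field, with the negative-field segment the closest to the limit.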
Results from the model show that with probability approaching 1 cleaved bases enter DNP in their natural order and that no more than one base occupies DNP at any time. The next three paragraphs apply to a biological or synthetic DNP.
Let bases be cleaved at a rate of one every T seconds, where T is a random variable with a distribution of values15 in the range 10–80 ms. Let base 1 be cleaved at time t = 0 and base 2 at t = T. Let Ti = time for base i to diffuse-drift over trans1/cis2 and P = translocation time through the pore. Let the Ti be independent and identically distributed (i.i.d.) random variables, independent of P, with respective means μTi and μP, standard deviations σTi and σP, and finite support equal to 6σ. The following sufficient condition holds for the two bases to arrive in order:
$$T > (\mu_{T_1} + 3\sigma_{T_1}) - \max(0,\ \mu_{T_2} - 3\sigma_{T_2}) \tag{1}$$
Using a minimum value of 10 ms for T and the data in Fig. 3 (as well as standard deviation data from the ESI†), V23 = 1.5 mV, μT1 = μT2 = 1.6 ms, and σT1 = σT2 = 1.3 ms, eqn (1) is satisfied. Thus bases arrive sequentially at DNP. Using similar arguments, it can be shown that detected bases passing into and through trans2 do so in their natural order.
The condition for two bases not being in the pore at the same time can be obtained as
$$T + \max(0,\ \mu_{T_2} - 3\sigma_{T_2}) > \mu_{T_1} + 3\sigma_{T_1} + \mu_P + 3\sigma_P \tag{2}$$
With no slowdown in DNP, using T = 10 ms and the data in Fig. 2 (plus standard deviation data from the ESI†), V23 = 1.6 mV, V34 = 0.2 V, μT1 = μT2 = 1.6 ms, σT1 = σT2 = 1.3 ms, μP ≈ 2 × 10⁻⁸ s, and σP = 0.68 × 10⁻⁸ s, eqn (2) is satisfied, so two bases cannot be in the pore at the same time. This also means that repeat bases (homopolymers) can be identified without difficulty.
Conversely, the minimum interval required between the release of two successive cleaved bases on the exit side of UNP so that they do not occupy DNP at the same time is
$$T_{\min} = \mu_{T_1} + 3\sigma_{T_1} + \mu_P + 3\sigma_P \tag{3}$$
Using the data given earlier, Tmin ≈ 5.5 ms. With suitable controls (temperature, salt concentration, etc.) the enzyme can be set to cleave bases at time intervals >10 ms.15
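Eqns (1)–(3) can be evaluated directly. The following is a minimal sketch that plugs in the no-slowdown values quoted above (T = 10 ms, μT1 = μT2 = 1.6 ms, σT1 = σT2 = 1.3 ms, μP ≈ 2 × 10⁻⁸ s, σP = 0.68 × 10⁻⁸ s); the function names are illustrative and not taken from the ESI.

```python
def in_order(t_cleave, mu_t1, sigma_t1, mu_t2, sigma_t2):
    """Eqn (1): sufficient condition for base 2 to reach DNP after base 1."""
    return t_cleave > (mu_t1 + 3 * sigma_t1) - max(0.0, mu_t2 - 3 * sigma_t2)


def single_occupancy(t_cleave, mu_t1, sigma_t1, mu_t2, sigma_t2, mu_p, sigma_p):
    """Eqn (2): sufficient condition for DNP to hold at most one base."""
    earliest_arrival_2 = t_cleave + max(0.0, mu_t2 - 3 * sigma_t2)
    latest_exit_1 = mu_t1 + 3 * sigma_t1 + mu_p + 3 * sigma_p
    return earliest_arrival_2 > latest_exit_1


def t_min(mu_t1, sigma_t1, mu_p, sigma_p):
    """Eqn (3): minimum cleavage interval for single occupancy of DNP."""
    return mu_t1 + 3 * sigma_t1 + mu_p + 3 * sigma_p


# Values quoted in the text for the case with no slowdown in DNP (seconds).
T = 10e-3
MU_T, SIGMA_T = 1.6e-3, 1.3e-3
MU_P, SIGMA_P = 2e-8, 0.68e-8

print("eqn (1) satisfied:", in_order(T, MU_T, SIGMA_T, MU_T, SIGMA_T))
print("eqn (2) satisfied:", single_occupancy(T, MU_T, SIGMA_T, MU_T, SIGMA_T, MU_P, SIGMA_P))
print(f"Tmin = {t_min(MU_T, SIGMA_T, MU_P, SIGMA_P) * 1e3:.2f} ms")
```

Because μT2 − 3σT2 is negative for these values, the max(0, ·) term vanishes and both conditions reduce to comparing T with the 3σ upper bound on the first base's transit time.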
With a negative electric field over part of a synthetic DNP a rough estimate can be obtained for the probability of two bases being in DNP at the same time. With L34 = 10 nm, V23 = 1.6 mV, b = 0.4, Vb = −0.2 V, Va = Vc = 0.05 V, and using data from Figs. 2 and 3 and standard deviation values from the ESI,† μT1 ≈ 1.6 ms, σT1 ≈ 1.3 ms, and μP ≈ σP ≈ 0.46 ms, from which Tmin ≈ 7.35 ms, which is within the range of turnover rates achieved with exonuclease.15 The detector bandwidth required (including digital processing and noise filtering) would be on the order of 5 kHz, which is not far from the 1 ms translocation time criterion2 for effective detection. Similar results are possible for an AHL pore but the available tolerance is considerably less (see ESI, Section S–7†).
The tandem cell can be implemented in biological form (AHL or MspA for both UNP and DNP, with a cyclodextrin adapter inside DNP for slowdown) or hybrid biological-synthetic form (AHL/MspA for UNP, synthetic pore for DNP). Several implementation issues are discussed next.
Accuracy. With AHL for DNP and a cyclodextrin adapter inside, base types can be distinguished with up to 99.8% accuracy using bandwidths ∼20 kHz.13 The accuracy with synthetic DNP may be obtained by experimentation. Additional discrimination information may also be present in the inter-arrival times of bases at DNP.
Positioning the enzyme. The enzyme on the exit side of UNP needs to be in the path of the threading DNA sequence such that the first base of the remaining sequence is presented to it. Failure to cleave is indicated if current blockade pulses due to bases passing through DNP are totally absent or stop occurring.
Voltage drift. With an ion-selective DNP, ion current changes, which are typically <100 pA, can lead to the pore voltage drifting over time. Methods commonly used in electronic measurements can be used to solve the voltage drift problem. Alternatively the trans1/cis2 and trans2 compartments and DNP can be drained periodically and refilled with electrolyte.
Solid state pore for DNP. A solid state pore has the advantages of scaling and integration in fabrication. It has been studied widely both experimentally and theoretically in the context of DNA sequencing. While such pores are not useful in strand sequencing for single-nucleotide discrimination because of their thickness (currently they have a minimum thickness of 20 nm and an hourglass shape5 with actual pore thickness of 10 nm), with exonuclease sequencing using a tandem cell this may not be a problem because of the near zero probability of two nucleotides being in the nanopore at the same time.
Negative field over DNP. A negative field can be implemented over a synthetic DNP with a pair of electrodes in the form of graphene sheets with nanopores8 alternating with three layers of silicon pores leading to the structure [Si pore, graphene electrode, Si pore, graphene electrode, Si pore]. Si++ or molybdenum sulphide21 (MoS2) layers with nanopores may also be used for the electrodes. Other possibilities are considered in the ESI.†
Sticky bases. A cleaved base may stick to a side wall while diffusing inside trans1/cis2 or DNP. The probability of a base sticking to trans1/cis2 can be calculated using the model described above. One way to prevent such sticking is to hold the side walls of trans1/cis2 at a slightly negative potential with respect to V1 thereby creating a reflecting wall for the negatively charged base. Alternatively a compartment or pore wall may be chemically treated to prevent sticking, as shown in recent sequencing studies with solid-state pores22 and graphene sheets.23 Another option is to replace graphene with MoS2, which is non-sticking and also provides better discrimination, as recent simulation studies have shown.21
Recovering the original strand. The original strand may be re-synthesized with a modified tandem cell. With AHL for DNP a detected base passing through DNP could be processively added to a primer on a template strand by a polymerase enzyme attached to the trans2 side of DNP. With a synthetic DNP a third (biological) pore (TNP) and trans3 compartment may be added to the cell, leading to the structure [cis1, UNP, trans1 = cis2, DNP, trans2 = cis3, TNP, trans3]. The template–primer–enzyme complex is attached to the trans3 side of TNP. This effectively transforms exonuclease-based sequencing with a tandem cell into a non-destructive process. It is similar to a recently reported sequencing-by-synthesis (SBS) method in which bases with heavy base-specific tags are incorporated into a strand in the cis side of a conventional cell using DNA polymerase.24 The tags are cleaved during base incorporation and drop into the pore where the blockade levels are used to identify the base.
Parallelization. An arbitrary number of tandem cells could be implemented in parallel. With a sequencing rate of 30–100 s⁻¹ (equal to the enzyme turnover rate), an array of 10 000 cells can potentially sequence 10⁹ bases in under an hour.
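The throughput estimate follows from simple division. The sketch below assumes every cell runs continuously at the stated per-cell rate with no idle time between strands, which is an idealization; the function name is illustrative.

```python
def sequencing_time_s(n_bases: float, n_cells: int, rate_per_cell: float) -> float:
    """Time in seconds for n_cells working in parallel to read n_bases,
    assuming each cell sequences rate_per_cell bases per second without pause."""
    return n_bases / (n_cells * rate_per_cell)


# 10^9 bases on an array of 10 000 cells at the two ends of the quoted rate range.
for rate in (30, 100):
    minutes = sequencing_time_s(1e9, 10_000, rate) / 60
    print(f"{rate} bases/s per cell: {minutes:.1f} min")
```

At both ends of the 30–100 s⁻¹ range the idealized time for 10⁹ bases comes out below 60 minutes.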
Footnote
† Electronic supplementary information (ESI) available: Mathematical details and other implementation-related notes are given in a Supplement. See DOI: 10.1039/c4ra10326b
This journal is © The Royal Society of Chemistry 2015