Guohao Xi,a Jinmeng Su,ab Jie Ma,a Lingzhi Wu,c and Jing Tu*a
aState Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China. E-mail: jtu@seu.edu.cn
bMonash University-Southeast University Joint Research Institute, Suzhou, 215123, China
cCollege of Science, Nanjing University of Posts and Telecommunications, Nanjing, 210046, China
First published on 24th February 2025
Solid-state nanopores represent a powerful platform for the detection and characterization of a wide range of biomolecules and particles, including proteins, viruses, and nanoparticles, for clinical and biochemical applications. Typically, nanopores operate by measuring transient pulses of ionic current during translocation events of molecules passing through the pore. Given the strong noise and stochastic fluctuations in ionic current recordings during nanopore experiments, signal processing based on the statistical analysis of numerous translocation events remains a crucial issue for nanopore sensing. Based on parallel computational processing and efficient memory management, we developed a novel signal processing procedure for translocation events to improve the signal identification performance of solid-state nanopores in the presence of baseline oscillation interference. By using an adaptive threshold within a sliding window, we could correct the baseline determination process in real time. As a result, the features of translocation event signals could be identified more accurately, especially for the intermittent occurrence of high-density complex signals. The program also demonstrated good signal differentiation. As ready-to-use software, the program is efficient and compatible with diverse nanopore signals, making it suitable for more complex nanopore applications.
Over decades of study, several typical nanopore signal processing programs have been developed, such as AutoNanopore, Open Nanopore, Cavro Nanopore Sensing, Transalyzer, NanoAnalyzer, NanoPlex,24 EventPro25 and others.23,26–29 AutoNanopore and Open Nanopore employ adaptive threshold methods to detect low SNR events, but their performance may be limited under high-noise or temporally attenuating signal conditions. Cavro Nanopore Sensing and NanoAnalyzer offer high throughput and parallel analysis capabilities, yet they may struggle with noise interference in complex signal environments, especially under high salt concentrations. Transalyzer and MOSAIC provide high resolution for signal precision but may encounter challenges in event recognition accuracy when dealing with signal attenuation or significant noise. NanoPlex is characterized by good noise suppression and adaptability to low SNR events; however, its moderate flexibility limits its application in diverse experimental setups. EasyNanopore is known for its simplicity and low hardware requirements, but it faces challenges in handling baseline drift and complex signal environments. In contrast, our "Dynamic Correction Method" dynamically adjusts thresholds and corrects baselines in real time, significantly improving the detection of low SNR and complex signals. Especially in solid-state nanopore applications, our method effectively mitigates noise and overcomes the limitations of traditional methods, enhancing signal detection accuracy and reliability. The programs listed above mainly search for outliers in the current traces, using baseline and threshold algorithms to achieve event recognition and information extraction from nanopore signals. In practice, the threshold is selected by calculating and truncating current signal changes, relying on the professional experience of the person processing the data.
Furthermore, it is even harder to determine a uniform global standard because the baseline current fluctuates and drifts over time. On this basis, emerging machine learning methods are currently being used for nanopore signal recognition and statistical analysis.30–32 Generally, a set of machine learning-based algorithms, including Hidden Markov models, fuzzy C-means, and Support Vector Machines, can be used efficiently to improve the recognition, feature extraction and cluster analysis of nanopore signals. Meanwhile, neural network algorithms based on deep learning have been employed to continuously optimize the prediction results of signal processing.33–35 However, these methods require a specifically configured operational environment and training database, which may not be user-friendly for users without software development experience.
To handle massive signal processing workloads, a configuration-free translocation event detection software package named "EasyNanopore" was previously developed in our research group.29 That method employs an adaptive thresholding approach based on low-frequency variance (utilizing the local mean and local variance) to define the commencement and conclusion of an event. The Dynamic Correction Method is particularly effective in detecting low signal-to-noise ratio (SNR) events, which are often challenging to identify using traditional static threshold methods. By dynamically adjusting to the characteristics of the signal, it can more accurately capture events that fixed-threshold methods might overlook, especially when noise or signal variations are subtle. Furthermore, the Dynamic Correction Method offers significant advantages for events that exhibit temporal attenuation or decay. Temporal attenuation refers to the gradual decrease in signal amplitude over time, which can be an important characteristic of certain events, such as the translocation of larger molecules through nanopores. In such cases, the Dynamic Correction Method can more effectively track and detect events with slowly decaying signals, ensuring that they are not missed. This capability is crucial for maintaining the accuracy and reliability of signal detection in experiments with complex signal profiles, such as those involving nanopores.
To better illustrate the strengths of the proposed method, we have compared its performance with that of several commonly used nanopore signal detection platforms. The following table summarizes key performance metrics, including sensitivity, accuracy, computational speed, and hardware requirements. This comparison highlights the advantages of the proposed method, particularly in terms of its efficiency and suitability for use with limited computational resources.
Additionally, a multi-threaded algorithm is employed to partition the file and adapt to low-end CPU configurations. The parallel computation method for file partitioning enhances the speed of event detection during the recognition process. However, in this algorithm the baseline parameters and thresholds are determined from the local mean and variance at each point. As a cumulative calculation model, it is accurate under stable conditions; however, once a deviation enters the baseline calculation, the error accumulates in the same pattern and becomes increasingly serious. Especially for solid-state nanopores, where the signal output is more diverse and complex and the baseline fluctuations are more drastic, persistent deviations tend to occur silently. This effect is especially pronounced for dense signal fragments and mixed signal fragments, and it is the principal factor contributing to the significant discrepancies and lack of reproducibility observed in signal data obtained from solid-state nanopores.
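The file-partitioning idea described above can be sketched as follows. This is an illustrative outline only, not the software's actual implementation; `detect_events` is a hypothetical stand-in for a per-chunk detector, and chunks overlap by `pad` samples so an event spanning a chunk boundary is not cut in half.

```python
# Sketch of multi-threaded file partitioning for event detection.
# detect_events is a hypothetical toy detector used for illustration.
from concurrent.futures import ThreadPoolExecutor

def detect_events(chunk, offset):
    # Toy detector: report absolute indices where the trace drops below -3.0.
    return [offset + i for i, v in enumerate(chunk) if v < -3.0]

def parallel_detect(signal, n_chunks=4, pad=100):
    size = max(1, len(signal) // n_chunks)
    with ThreadPoolExecutor() as pool:
        jobs = []
        for k in range(n_chunks):
            # Overlap adjacent chunks by `pad` samples on each side.
            start = max(0, k * size - pad)
            stop = min(len(signal), (k + 1) * size + pad)
            jobs.append(pool.submit(detect_events, signal[start:stop], start))
        # De-duplicate hits found twice in the overlapping regions.
        return sorted({i for j in jobs for i in j.result()})
```

Note the deliberate overlap-and-deduplicate step: without it, the speedup from partitioning would come at the cost of losing boundary-spanning events.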
Therefore, an improved signal processing program in our present study has been proposed based on a novel baseline and threshold correction computation model that adjusts in real time during the document recognition process. This method integrates the determination of the threshold and baseline with the current recognition area of the signal, and corrective measures are implemented in accordance with the baseline conditions in the vicinity of individual signals, thereby effectively mitigating the impact of baseline fluctuations and dense signal areas. Furthermore, the conventional baseline scanning mode has been retained, allowing the user to select it freely according to the type of signal file in question. This effective corrective measure markedly enhances the identification and precision of solid-state nanopore signals, constituting a valuable contribution to the efficient and accurate classification and deployment of nanopore signals.
We refer to this signal processing program as the Dynamic Correction Method. Among various nanopore signal detection platforms, the Dynamic Correction Method stands out with several unique advantages, especially in key aspects such as noise management, low signal-to-noise ratio (SNR) event detection, baseline drift handling, flexibility, high salt concentration adaptability, complex signal processing, and hardware requirements (Table 1). Compared to other platforms, it offers significant benefits in these critical areas.
Platform | Noise management | Low SNR event detection | Baseline drift handling | Flexibility | High salt conditions | Complex signals | Hardware requirements |
---|---|---|---|---|---|---|---|
AutoNanopore | Moderate | Moderate | Moderate | Good | Low | Moderate | Good |
NanoAnalyzer | Good | Moderate | Moderate | Low | Moderate | Good | Moderate |
Cavro Nanopore Sensing | Low | Low | Moderate | High | Good | Low | Low |
Kleiner Lab Software | Low | Moderate | Moderate | Moderate | Low | Moderate | Moderate |
EasyNanopore | Moderate | Moderate | Moderate | Good | Moderate | Moderate | Moderate |
NanoPlex | Good | Good | Moderate | Moderate | Good | Good | Moderate |
EventPro | Good | Moderate | Moderate | Good | Good | Moderate | Moderate |
Dynamic Correction Method | Moderate | Good | Good | Good | Moderate | Good | Good |
Firstly, in terms of noise management, the Dynamic Correction Method effectively suppresses noise, ensuring the reliability of signals, particularly when the signal environment is complex or the noise level is high. While other platforms (such as AutoNanopore and NanoAnalyzer) also employ noise management techniques, they often face challenges in environments with high salt concentrations or other complicating factors, which can lead to misdetection or missed events against complex signal backgrounds. In contrast, the Dynamic Correction Method dynamically adjusts the threshold and baseline in real time, making it more resilient in noisy conditions and low-SNR environments.
While EventPro and NanoPlex employ adaptive baseline options to mitigate baseline fluctuations, their reliance on global fitting or fixed time-window updates may lead to delayed baseline adaptation and misclassification of weak signals in low SNR environments. In contrast, our Dynamic Correction Method introduces event-driven baseline updates and dynamic threshold adjustments, allowing it to track rapid fluctuations more effectively and enhance the accuracy of signal detection even under severe noise conditions.
For low-SNR event detection, the Dynamic Correction Method demonstrates a clear advantage over traditional approaches that rely on fixed thresholds. By dynamically adjusting the threshold based on real-time signal variations, our method can accurately capture low-SNR events that might otherwise be overlooked. This is particularly crucial in cases where signal fluctuations are subtle, ensuring the precise identification of weak signals while minimizing false positives.
To comprehensively evaluate the performance of different methods under these conditions, we compared several common signal processing approaches. Table 2 presents the performance of NanoPlex, EasyNanopore, NanoAnalyzer, and Dynamic Correction Method in key metrics, such as baseline noise, event detection rate, and signal integrity. Other methods were not included in the comparison mainly because their performance under low SNR conditions is either similar to that of the methods included in this study, or due to resource and testing limitations. Additionally, we focused on methods optimized for solid-state pore signal characteristics, ensuring the relevance of the comparison results. This comparison highlights the advantages of the Dynamic Correction Method, particularly in terms of the event detection rate and signal integrity, in complex signal environments.
Platform | RMS noise (pA) | Peak noise (pA) | Event detection rate (%) | Signal integrity (R) |
---|---|---|---|---|
NanoPlex | 15.3 ± 1.2 | 50.2 ± 2.8 | 85.4 ± 2.1 | 0.86 ± 0.02 |
EasyNanopore | 18.7 ± 1.5 | 65.4 ± 3.0 | 78.6 ± 3.0 | 0.82 ± 0.03 |
NanoAnalyzer | 16.5 ± 1.4 | 55.3 ± 2.5 | 83.1 ± 2.7 | 0.84 ± 0.02 |
Dynamic Correction Method | 12.8 ± 1.0 | 40.5 ± 2.3 | 92.8 ± 1.5 | 0.91 ± 0.01 |
In baseline drift handling, many platforms such as AutoNanopore use fixed baseline correction methods. While effective in some cases, these methods struggle when the signal exhibits significant fluctuations or complex baseline drift. The Dynamic Correction Method, on the other hand, uses a dynamic baseline correction algorithm that adjusts the baseline in real time based on the signal's characteristics, effectively managing complex fluctuations and drift and avoiding the misjudgments caused by baseline shifts in traditional methods.
The Dynamic Correction Method also excels under high salt concentration conditions, particularly during nanoparticle detection. High salt concentrations often lead to aggregation of analytes, introducing noise and affecting experimental results. Compared to traditional methods, the Dynamic Correction Method maintains high sensitivity even under high salt conditions, minimizing the impact of noise on the experimental outcome.
In complex signal processing, the Dynamic Correction Method demonstrates strong flexibility, handling a wide variety of signal types and complex signal patterns. In comparison, some platforms, such as Kleiner Lab Software, perform well with single-type signals but may struggle with overlapping signals or complex backgrounds. The Dynamic Correction Method maintains stable performance under various complex signal conditions, ensuring accurate recognition of all target signals.
Finally, the Dynamic Correction Method has low hardware requirements. In contrast to platforms such as Cavro Nanopore Sensing, which typically require higher-end equipment, it runs efficiently on low-end devices, reducing experimental costs and technical barriers and making it suitable for a wider range of laboratory environments.
In summary, the Dynamic Correction Method outperforms existing platforms in several areas, including noise management, low-SNR event detection, baseline drift handling, complex signal processing, high salt concentration adaptability, and hardware requirements. It demonstrates unique advantages, particularly in noisy environments and complex experimental conditions, ensuring high signal recognition accuracy and stability.
The gold nanoparticles used in the experiment were obtained by the reduction reaction of sodium citrate with chloroauric acid. The polymerases used in the experiments were ordered from Sangon Biotech (Shanghai) Co. For nanopore sensing, silver chloride electrodes with bias voltages were placed on both sides of the device. Analogue current signals were captured using the Axopatch 200B patch clamp (Molecular Devices, Inc. Sunnyvale, CA), filtered with a low-pass Bessel filter with a corner frequency of 10 kHz, and then digitized with a Digidata 1550B converter at a sampling frequency of 100 kHz. To effectively suppress noise and preserve the signal, we chose a 10 kHz low-pass filter. This filter is suitable for both the polymerase and gold nanoparticle signals, effectively removing high-frequency noise while retaining key signal features. Since the nanopores are fabricated through dielectric breakdown, which can result in higher noise levels compared to other fabrication methods, the 10 kHz low-pass filter is employed to reduce baseline noise and minimize the interference of high-frequency noise, thereby improving the efficiency of signal extraction while maintaining the integrity of the signal characteristics. The translocation of analytes across the nanopore was primarily driven by an applied transmembrane voltage, which generates an electric field that induces the movement of charged particles through the pore. Data were recorded by using the PClamp software.
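The 10 kHz low-pass Bessel stage in the recording chain above can be approximated digitally for offline reprocessing. This is a sketch under stated assumptions: the filter order (4) is illustrative, since the Axopatch's analogue filter order is not specified here, and `lowpass_10khz` is a name introduced for this example.

```python
# Digital approximation of a 10 kHz low-pass Bessel filter for traces
# sampled at 100 kHz, mirroring the analogue filtering described above.
import numpy as np
from scipy.signal import bessel, lfilter

def lowpass_10khz(trace, fs=100_000, fc=10_000, order=4):
    # Design a low-pass Bessel filter (order is an assumption) and apply
    # it causally to the raw current trace.
    b, a = bessel(order, fc, fs=fs, btype="low")
    return lfilter(b, a, np.asarray(trace, dtype=float))
```

A Bessel design is the natural digital counterpart here because its near-linear phase preserves pulse shapes, which matters when event amplitude and dwell time are the extracted features.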
The bulk conductance Gbulk arises primarily from the ion concentration and ion mobility in the solution. Here σbulk is the conductivity of the solution in S m−1 (4.2 S m−1 for 1 M LiCl), r is the radius of the pore in meters (25 nm), and L is the length of the nanopore in meters (30 nm).
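The equation image for Gbulk did not survive extraction. Given the symbols defined above (σbulk, r, L), the conventional cylindrical-pore expression is offered here as a plausible reconstruction, not as the authors' exact formula:

```latex
G_{\mathrm{bulk}} = \sigma_{\mathrm{bulk}}\,\frac{\pi r^{2}}{L}
```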
The double layer conductance GDL is related to the charge distribution and interaction between the surface of the nanopore and the electrolyte solution. σDL is the conductivity of the double layer, typically dependent on surface charge density, ion type, and solution conditions; LDL is the effective thickness of the double layer, which is around 0.77 nm for a 1 M LiCl solution.
After calculating both contributions, we found that the bulk conductance is significantly larger than the double-layer conductance. The bulk conductance, calculated as 2.6 × 10−8 S, is the dominant factor in determining the total conductance of the nanopore. While the double-layer conductance does play a role, its contribution is comparatively smaller, with a calculated value of 4.16 × 10−11 S.
By considering both factors in our model, we gained a more complete picture of the overall conductance, confirming that the primary influence comes from the bulk ionic conductivity, while the effect of the double layer is minimal, particularly at high salt concentrations such as 1 M LiCl. This comprehensive approach ensures that the model used in our analysis is more robust and accurate in predicting the total conductance of the system.
These optimization strategies can improve the detection of fast translocation events, providing more complete and reliable raw data for the signal extraction algorithm, thereby enhancing the accuracy and effectiveness of the algorithm.
The Dynamic correction method comprises the following steps.
Step 1: Window Initialization. First, three concepts are clarified. Window size (W) refers to the number of sample points considered at one time when processing the signal. Sampling frequency (f) refers to the number of sample points collected per second in Hz, and the sampling frequency determines the time resolution of the signals. Time (t) is the actual time covered by the window, which can be calculated from the window size and sampling frequency. The mathematical relationship between the three parameters is expressed as:
t = W/f | (1) |
The next steps are to set the appropriate window size (W), step size (S) and buffer size (B), and scan the entire signal file by sliding the window. During each movement of the window, the data within it, together with the buffers before and after it (the contextual components in the front and rear areas), are further adjusted and processed based on cascade feedback. Setting the signal as x with length N, a sliding window function f(k) can then be defined, with k denoting the index of the window and i denoting the signal point index, as follows.
f(k) = {x[i]|(k − 1)S ≤ i < (k − 1)S + W}, for k = 1, 2,…, N/S | (2)
For each index k, the sliding window function f(k) returns a collection of W consecutive data points, namely the signal values x from position (k − 1)S to (k − 1)S + W − 1. This formula assumes that the step size S is less than the window size W, ensuring overlap between successive windows; if S equals W the windows tile the signal without overlap, and if S exceeds W some samples fall outside every window. The formula also assumes that the signal length N is an integer multiple of the step size S, which ensures that the last data point of the signal falls exactly at the end of a window. If N is not an integer multiple of S, the last window may contain fewer data points, or the signal may need to be padded or truncated accordingly. The window is then scrolled through the entire signal according to the selected step size. After each sweep, the working data within the window and the adjacent context data are ready for the next processing step.
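The windowing of eqn (2), together with the front and rear buffers of size B, can be sketched as a generator. The function name and return layout are illustrative, not the software's API:

```python
def sliding_windows(x, W, S, B=0):
    """Yield (k, window, left_buffer, right_buffer) tuples over signal x.

    W: window size, S: step size (S < W gives overlapping windows),
    B: buffer size supplying the contextual samples before and after
    each window, as described in Step 1.
    """
    N = len(x)
    k = 1
    start = 0
    while start < N:
        stop = min(start + W, N)            # last window may be shorter
        left = x[max(0, start - B):start]   # context before the window
        right = x[stop:min(N, stop + B)]    # context after the window
        yield k, x[start:stop], left, right
        start += S
        k += 1
```

Clamping at the signal edges (`max`/`min`) corresponds to the padding/truncation caveat noted above for the final window.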
Step 2: Preliminary Threshold Initialization. The mean variance within the window is the primary part of calculating the preliminary threshold. Here a first-order digital low-pass filter is used to calculate the local mean and variance at each point of the original signal with the following formulae.
![]() | (3) |
ml(i) = α × ml(i − 1) + (1 − α)Sraw(i), i = 1, 2,… | (4) |
![]() | (5) |
vl(i) = α × vl(i − 1) + (1 − α)[Sraw(i) − ml(i)]2, i = 1, 2,… | (6) |
Tus(i) = ml(i) + βs√vl(i) | (7) |
Tds(i) = ml(i) − βs√vl(i) | (8) |
Here, Sraw(i) is the i-th point of the original signal, ml(i) and vl(i) are the local mean and local variance of the original signal at the i-th point, respectively, and α represents the coefficient of the filter. Tus(i) and Tds(i) represent the start thresholds for the amplitude increase and decrease, respectively. The parameter βs is used to calculate a preliminary threshold for event detection by setting a distance between a signal point and the local mean (baseline level) equal to βs times the local standard deviation; signal points whose current fluctuations exceed this distance are filtered. With greater values of βs, the preliminary threshold for event detection is higher and the algorithm is less sensitive to events.
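The recursions of eqns (4) and (6) and the start thresholds of eqns (7) and (8) can be sketched directly. The default values of `alpha` and `beta_s` are illustrative only:

```python
import math

def start_thresholds(s_raw, alpha=0.999, beta_s=5.0):
    """First-order low-pass estimates of the local mean and variance
    (eqns (4) and (6)), and the upward/downward start thresholds
    T_us and T_ds (eqns (7) and (8))."""
    m = s_raw[0]          # initialise the local mean at the first sample
    v = 0.0               # initialise the local variance
    t_us, t_ds = [], []
    for x in s_raw:
        m = alpha * m + (1 - alpha) * x               # eqn (4)
        v = alpha * v + (1 - alpha) * (x - m) ** 2    # eqn (6)
        sd = math.sqrt(v)
        t_us.append(m + beta_s * sd)   # threshold for amplitude increase
        t_ds.append(m - beta_s * sd)   # threshold for amplitude decrease
    return t_us, t_ds
```

On a flat baseline the two thresholds collapse onto the baseline itself, which is the behaviour that makes the method sensitive to small deviations when the trace is quiet.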
Step 3: Data Filtering. Points that deviate significantly beyond the current threshold are filtered out. The filtering traverses the entire window, and the thresholds derived from Step 2 are continually updated by the correction process. If the sudden change in the current data exceeds the specified threshold, an event is considered identified and its starting point is recorded in a list.
Step 4: Event Detection Threshold Correction. The Dynamic correction method (Fig. 2a) calculates the mean and variance within a localized window and obtains a baseline (red dashed line) and an event detection threshold (blue dashed line). A user-configurable yellow pane is displayed within the signal file. Upon detection of a recordable event, the window initiates a baseline correction: the mean and variance used in the baseline calculation for that region are replaced with the mean and variance of a number of points immediately before the event's start point and immediately after its end point. The number of points used in this replacement is also adjustable. This dynamic adjustment keeps the event detection thresholds of the pane-formatted file close to the baseline, ensuring greater accuracy in the features of each recorded event. At the same time, it prevents the fluctuating current of the event itself from interfering with the threshold each time an event is recorded, so the event detection threshold remains unimpaired. For comparison, the baseline scanning method is illustrated in Fig. 2(b). As events are detected, the cumulative changes in the mean and variance of the entire window can shift the event detection threshold, which may affect signal recognition accuracy: small signals may be masked, or the characteristics of detected signals may become inaccurate. In contrast, the Dynamic Correction Method dynamically adjusts the threshold in real time, mitigating the impact of such shifts. By localizing the threshold change curve, this method better fits the local signal characteristics and enhances detection sensitivity.
Additionally, it allows for more precise extraction of event features, such as duration and amplitude, improving the accuracy and reliability of signal detection, especially in complex or low signal-to-noise ratio environments.
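The local baseline replacement of Step 4 can be sketched as follows. The function name and `n_ctx` parameter are illustrative stand-ins for the software's adjustable number of context points:

```python
def corrected_baseline(signal, ev_start, ev_end, n_ctx=50):
    # Re-estimate the baseline mean and variance for an event region
    # from n_ctx samples just before the event start and just after the
    # event end, so the event's own excursion does not drag the
    # detection threshold away from the true baseline (Step 4).
    before = signal[max(0, ev_start - n_ctx):ev_start]
    after = signal[ev_end:ev_end + n_ctx]
    ctx = before + after
    mean = sum(ctx) / len(ctx)
    var = sum((x - mean) ** 2 for x in ctx) / len(ctx)
    return mean, var
```

Because only samples outside the event contribute, a deep blockade leaves the corrected mean and variance at their baseline values, which is exactly the property that stops the error-accumulation pattern described earlier.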
![]() | ||
Fig. 2 Identification of transition events by different methods. (a) Dynamic correction method; (b) baseline scanning method. |
Step 5: Event identification. As above, the end threshold is determined from the local mean and local variance to detect the events in the current traces. Meanwhile, the data between the start and the end of each event are checked back and forth against the upward and downward thresholds to obtain optimal results. The formulae for the event end thresholds are shown below.
Tue(i) = ml(i) + βe√vl(i) | (9) |
Tde(i) = ml(i) − βe√vl(i) | (10) |
Here, Tue(i) and Tde(i) represent the thresholds with upward and downward amplitude for independent event recognition, respectively, and βe is a parameter used to compute the end threshold. It controls the distance between the end threshold and the local mean, which is βe times the local standard deviation. With greater values of βe, the end threshold is lower, and the algorithm becomes more stringent in determining the end of an event. By adjusting the values of βs and βe, it is possible to regulate the sensitivity of the algorithm to events, as well as its requirement for the duration of events. Larger values of βs reduce the number of identified events, at the risk of omitting some true events. Conversely, larger values of βe increase the duration of events, at the risk of misclassifying some noise as part of an event. In practice, βs and βe should be set according to the specific characteristics of the signal and the requirements of the application in order to achieve optimal event detection.
Step 6: Screening of Events. The detected events are subjected to screening according to preset conditions with the objective of eliminating possible false positives.
Step 7: Feature extraction. Save the current event information and calculate its features. Output the relevant information of the filtered events into the result array.
Step 8: Slide the window to subsequent data processing. Perform the detection of the translocation event in the next window.
To evaluate the improvement brought about by our algorithm, we analyzed the signal distributions obtained by the Dynamic correction method (counts: 865) and the Baseline scanning method (counts: 841). The signal amplitude and width distributions were fitted using a Gaussian function model, and the fitted curves are depicted as dashed lines in Fig. 3. The goodness of fit was assessed using the R-square values, which were found to be 0.827, 0.877, 0.905, and 0.988 for Fig. 3b–e, respectively. Our analysis revealed that the half-peak widths of the histograms of dwell time and blockage current became narrower after applying the improved algorithm, indicating that the data became more concentrated. Additionally, the fitted peaks in Fig. 3b–e were determined to be 0.135 ms, 0.161 ms, 8991.4 pA, and 9003.2 pA, respectively. The consistency of these peaks suggests that our algorithm optimizes the identification process while maintaining the accuracy of the data. Fig. 3f compares the signal data obtained using the Dynamic correction method (red) with the data obtained using the Baseline scanning method (black). The data points derived from our algorithm are concentrated in the center region, which is crucial for the classification of solid-state nanopore signals. This observation suggests that various features of the sample can be obtained from a smaller sample size. Overall, our findings demonstrate the effectiveness of our algorithm in improving the analysis of solid-state nanopore signals.
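The Gaussian fitting and R-square assessment described above can be reproduced generically. This is a sketch of the standard procedure, not the paper's analysis script; the synthetic data and function names are illustrative:

```python
# Generic Gaussian fit of an event-feature histogram with an R-square
# goodness-of-fit measure, as used for the distributions in Fig. 3.
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, a, mu, sigma):
    return a * np.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def fit_histogram(values, bins=50):
    counts, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    p0 = [counts.max(), values.mean(), values.std()]   # initial guess
    popt, _ = curve_fit(gaussian, centers, counts, p0=p0)
    residuals = counts - gaussian(centers, *popt)
    r_square = 1 - np.sum(residuals ** 2) / np.sum((counts - counts.mean()) ** 2)
    return popt, r_square
```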
We have counted the signal distributions obtained by the Baseline scanning method and Dynamic correction method, and the results are shown in Fig. 5, where panels (a) and (b) show the distribution of the blocking current versus pore time for the 10 nm gold nanoparticles, and panels (c) and (d) show the distribution of the blocking current versus pore time for the 15 nm gold nanoparticles. The most significant difference is 0.041 ms in Fig. 5b (1.028 ms and 1.069 ms for the Baseline scanning method and Dynamic correction method, respectively), while the difference in the blocking current is much smaller, with 86.45 pA and 53.70 pA for the 10 nm and 15 nm gold nanoparticles, respectively. This demonstrates that the Dynamic correction method can significantly improve the signal recognition rate while maintaining the signal characteristics, especially for smaller current signals that are missed due to baseline fluctuations.
The source code can be downloaded from https://github.com/eventdetector/Event-Detector-Code.
All data included in this article are available from the authors upon request.
This journal is © The Royal Society of Chemistry 2025