Nano Trees: nanopore signal processing and sublevel fitting using decision trees†
Abstract
As the complexity of solid-state nanopore experiments increases, analyzing the resulting electrical signals to determine biomolecular details becomes a challenge. State-of-the-art techniques for this task perform poorly when transient signal characteristics approach the bandwidth limitations of the measurement electronics. In this work, we address this challenge with Nano Trees, an algorithm for fitting piecewise-constant functions. Nano Trees leverages decision-tree machine learning algorithms to fit the noisy, piecewise-constant data characteristic of nanopore ionic current signals, producing accurate fits on transients as short as twice the rise time of the measurement system. We demonstrate the performance of our algorithm on several real and synthetic datasets. These findings underscore the generalizability and accuracy of this approach in the regime of fast molecular translocations.
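To illustrate the general idea of fitting a piecewise-constant function with a decision tree, the following is a minimal sketch of a one-dimensional regression tree applied to a synthetic current trace. It is not the Nano Trees implementation: the function names (`best_split`, `fit_tree`), the stopping parameters (`depth`, `min_leaf`, `min_gain`), and the synthetic signal (arbitrary units, Gaussian noise, no bandwidth or rise-time effects) are all assumptions made for illustration.

```python
import random

def best_split(y, lo, hi, min_leaf):
    """Find the split of y[lo:hi] that most reduces the sum of squared errors."""
    n = hi - lo
    total = sum(y[lo:hi])
    total_sq = sum(v * v for v in y[lo:hi])
    parent_sse = total_sq - total * total / n
    best_idx, best_gain = None, 0.0
    left_sum = left_sq = 0.0
    for i in range(lo, hi - 1):
        left_sum += y[i]
        left_sq += y[i] * y[i]
        n_left = i - lo + 1
        n_right = n - n_left
        if n_left < min_leaf or n_right < min_leaf:
            continue
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        child_sse = ((left_sq - left_sum ** 2 / n_left)
                     + (right_sq - right_sum ** 2 / n_right))
        gain = parent_sse - child_sse
        if gain > best_gain:
            best_idx, best_gain = i + 1, gain
    return best_idx, best_gain

def fit_tree(y, lo=0, hi=None, depth=3, min_leaf=3, min_gain=0.05):
    """Recursively split y[lo:hi]; return ordered (start, end, level) segments."""
    if hi is None:
        hi = len(y)
    idx, gain = (None, 0.0) if depth == 0 else best_split(y, lo, hi, min_leaf)
    if idx is None or gain < min_gain:
        # Leaf: the fitted level is the mean of the samples in the segment.
        return [(lo, hi, sum(y[lo:hi]) / (hi - lo))]
    return (fit_tree(y, lo, idx, depth - 1, min_leaf, min_gain)
            + fit_tree(y, idx, hi, depth - 1, min_leaf, min_gain))

# Hypothetical trace: open-pore baseline, a short blockade, baseline again.
random.seed(0)
levels = [1.0] * 100 + [0.6] * 20 + [1.0] * 100
signal = [v + random.gauss(0.0, 0.02) for v in levels]

segments = fit_tree(signal)
for start, end, level in segments:
    print(start, end, round(level, 3))
```

In this sketch, each leaf of the tree corresponds to one constant sublevel, and the `min_gain` threshold (an assumed value chosen for this noise level) keeps the tree from splitting on noise alone.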