Piecing together the puzzle: nanopore technology in detection and quantification of cancer biomarkers

Cancer is the result of a multistep process, including various genetic and epigenetic alterations, such as structural variants, transcriptional factors, telomere length, DNA methylation, histone-DNA modification, and aberrant expression of miRNAs. These changes cause gene defects in one of two ways: (1) gain in function which shows enhanced expression or activation of oncogenes, or (2) loss of function which shows repression or inactivation of tumor-suppressor genes. However, most conventional methods for screening and diagnosing cancers require highly trained experts, intensive labor, large counter space (footprint) and extensive capital costs. Consequently, current approaches for cancer detection are still considered highly novel and are not yet practically applicable for clinical usage. Nanopore-based technology has grown rapidly in recent years, which have seen the wide application of biosensing research to a number of life sciences. In this review paper, we present a comprehensive outline of various genetic and epigenetic causal factors of cancer at the molecular level, as well as the use of nanopore technology in the detection and study of those specific factors. With the ability to detect both genetic and epigenetic alterations, nanopore technology would offer a cost-efficient, labor-free and highly practical approach to diagnosing pre-cancerous stages and early-staged tumors in both clinical and laboratory settings.


What is nanopore (NP) technology?
Formal denitions of NP-based technologies typically feature devices that contain a nanometer-scale pore embedded in a thin membrane.Originating from the Coulter counter and ion channels, NP-based devices can detect various charged biomolecules that are slightly smaller than the diameter of the pore.In the NP-based analysis, a biological or a solid-state membrane separates the experimental chamber into two compartments, referred to as the cis and trans sides, to which the cathode and anode are attached, respectively.Negatively charged biomolecules, such as DNA, are then introduced into the cis side of the chamber.Under the electrophoretic force exerted by the external voltage, the biomolecule transports through the NP to the trans chamber.As the molecule moves through the NP, it interrupts the current signal, causing ionic current blockages.Physical and chemical properties of the targeted molecule can be analyzed using the amplitude and duration of current blockages through the NP (Fig. 1). 1,2th biological and solid-state NPs, which can be obtained or fabricated in numerous ways, [3][4][5][6][7][8] offer a wide range of biomolecule detection.Biological NP is secreted from different bacteria, in which the two most popular types come from a-Hemolysin and MspA porin.These biological NPs are then usually inserted into different biological substrates, such as a phospholipid bilayer, liposomes, or polymer lms.Biological membranes are structurally well-dened and easily reproducible.Biological NP is mostly used for the detection of singlestranded DNA (ssDNA), microRNA (miRNA), and disease diagnostics. 20][11] With controllable pore size and membrane thickness, solid-state NPs have been benecial for use in RNA sequencing, single-stranded and double-stranded DNA sequencing, DNAprotein complex detection, and other biomolecule detection.

Key concepts of cancer development
The development of a malignant cancerous tumor frequently results from a multistep process, rather than just a single genetic change [12]. This multistep process originates from various genetic and epigenetic modifications. Genetic modifications include structural variants, aberrant transcription factor activity, and telomere oxidative damage. Epigenetic modifications include DNA methylation, histone modifications, and aberrant expression of miRNAs. Both genetic and epigenetic alterations exert their pathological effects by causing defects in genes in one of two ways: (1) enhanced expression or activation of oncogenes (gain of function) and/or (2) repression or inactivation of tumor-suppressor genes (loss of function) [13]. Several methods have been used in research for detecting various cancer-causing factors, and different techniques are applied depending on the particular type of cancer.

Toward personalized genomic medicines for cancer studies
Using the concept of massively parallel sequencing, the development of high-throughput next-generation sequencing (NGS) devices, particularly Illumina, Complete Genomics, and Roche Applied Science 454 among many others, has revolutionized personalized genomic medicines. 5][16][17][18][19][20][21] While providing high-throughput and high-accuracy reads, NGS still requires multiple processing steps and file formats for its outputs (e.g. FASTQ, BAM, SAM and VCF) [22,23]. The high volumes of data generated by NGS make managing and storing results one of the major challenges for clinical laboratories [24]. Furthermore, NGS technologies employ several enrichment, amplification, and labeling steps, such as polymerase chain reaction (PCR) and bisulfite conversion, making them time- and cost-intensive and increasing the possibility of false positive results [25]. Due to the need for a label-free, high-throughput system, there has been growing interest over the past decades in using third- and fourth-generation sequencing, specifically NP technologies, in cancer prevention and detection.
Since its development and publication in 1996, the NP has become an emergent and powerful technology for direct and inexpensive DNA sequencing, biosensing, and detection of biological or chemical modifications on single molecules, as well as for studying the kinetics of DNA and protein folding. NP technology offers many advantages that NGS devices cannot. For example, NP has demonstrated the ability to detect CpG methylation (one of the earliest epigenetic biomarkers among cancer hallmarks) without the need for PCR amplification and bisulfite conversion [26,27]. Thus, NP technology strives to be a genomic tool that is label-free and has a high throughput, a small sample volume requirement, flexible runtime, and minimal footprint [2]. However, despite the past twenty years of significant progress in single-molecule sequencing and analysis, NP technologies have not yet been translated into even distantly comparable advances in clinical settings. Two general, synergistic goals have been pursued to increase the efficacy of single-molecule analysis using NP: to decrease the translocation time of biomolecules through the pore and to increase the base-calling accuracy. An ideal single-molecule analysis system would be highly accurate, have a high throughput, and be sensitive to both genetic and epigenetic changes of the cancer genome. 9][30][31][32] However, to the best of our knowledge, an article focusing solely on the application of NP technology in the early detection of various types of cancer biomarkers and causal factors has yet to be published. We believe this review will contribute to the further understanding of the potential and challenges of applying NP technology in cancer research. Herein, we provide a brief overview of the six main cancer-causing factors, along with methods conventionally used in detecting cancer at the molecular level. We then review NP technology with a focus on its development as a method for specific molecular detection, as well as its future potential and challenges in the clinical domain. The studies presented here are not intended to form an exhaustive list, but rather to illustrate the major achievements and challenges of applying NP technology in early cancer detection.

Structural variants
Structural variants (SVs) are one of the first recognized causal factors of cancer. A structural variant is a form of somatic DNA mutation, whereby the SV promotes the development and progression of cancer while contributing to all the important hallmarks of instability in cancer genomes [33]. The four main types of SVs are large deletions, amplifications, inversions and translocations of nucleotides within a DNA sequence. 5][36][37][38] In many cases, different SVs occur simultaneously in a specific pathway, which amplifies their genetic effects on cell instability. For example, in head and neck cancers, it was found that when deletion of CDKN2A and amplification of CCND1 happen together, there is a higher risk of recurrence, metastasis, and death than when either genetic alteration occurs alone [37,39]. 5][46] Furthermore, CDKN2A/p16 and SMAD4/DPC4 have been identified as two of the most commonly deleted tumor suppressor genes. 5][46][47] In mammalian cells with highly repetitive genomes, studies of SVs frequently use a resequencing approach, in which reads from the target genome are independently aligned to the reference genome to search for SVs [48]. In general, besides specificity and sensitivity, a method for detecting SVs is further judged by its ability to accurately predict breakpoint locations, the size of variants, and changes in copy count [33,49].
Norris et al. demonstrated the value of detecting long SVs using the Oxford MinION™ by detecting a series of well-characterized SVs, including large deletions, inversions, and translocations that inactivate the CDKN2A/p16 and SMAD4/DPC4 tumor suppressor genes in pancreatic cancer [33]. Using Oxford Nanopore barcodes, Norris et al. produced libraries for all 12 PCR amplicons in one run, yielding reads with PHRED scores of 10.9-11.50. PHRED, introduced in 1998 by Ewing and Green, was originally a base-calling program for automated sequencer traces. In later research, the term "PHRED score" has been used as a measure of quality and accuracy between consensus sequences. The higher the PHRED score, the higher the accuracy. For example, a PHRED score of 10 stands for 90% base call accuracy, and a PHRED score of 20 corresponds to 99% base call accuracy [50]. In this specific study, the reads averaged 640 bps in length with a PHRED score of 11.50. It was also found that these reads were consistent over the entire read length. The amplicons mapped with an overall percentage of 99.6% to regions of hg19, while 79% of aligned reads accurately matched to bases. Notably, the accuracy of the amplicon representation did not change with the complexity of the sequence. Additionally, the researchers tested their method with low-frequency SVs. In a 1:100 dilution, the run produced 4058 2D reads from 270 of 512 channels. The average read length was 650 bps, with a PHRED score of 10.9. Overall, the researchers showed that their method can be conducted in a timely manner. For the two sequences (CDKN2A/p16 and SMAD4/DPC4) in this study, it took 15 minutes and 33 minutes, respectively, to generate 450 reads [33]. In comparison, 2nd-generation sequencers could generate millions of reads simultaneously, but would take hours to days to complete. The experiment indicated the ability of NPs to serve as a reliable and efficient method of sequencing, allowing rapid detection of tumor-associated structural variants. The two limitations of MinION™, as noted by the researchers, were (1) a relatively high mismatch and index error rate and (2) a limited yield (on the scale of Mb or Gb) (Fig. 2).
Fig. 2 Detecting structural variants with nanopore. (A-D) Schematic of the Oxford MinION Nanopore Library Prep workflow. Oxford Nanopore barcodes were pooled into amplicons by PCR. After NEB End Repair and dA-tailing modules, hairpin and leader adapters were ligated on, each containing a motor protein (orange). Then, tether attachment allowed DNA molecules to attach to the membrane of the MinION flowcell. Within the flowcell, molecules, each with attached motor proteins, were pulled through a nanopore, producing a 2D consensus read. (E) Size comparison between an Oxford MinION and a quarter coin [33].
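The PHRED scores quoted above follow the standard logarithmic definition Q = -10 log10(P), where P is the probability that a base call is wrong. The short Python sketch below converts between score and accuracy; the function names are illustrative choices, not part of any sequencing toolkit.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Base-call accuracy implied by a PHRED score: 1 - 10^(-Q/10)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """PHRED score implied by a base-call accuracy in [0, 1)."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    for q in (10.0, 10.9, 11.5, 20.0):
        print(f"Q = {q:>4}: accuracy = {phred_to_accuracy(q):.4f}")
```

Under this definition, the scores of 10.9 and 11.5 reported for the MinION reads correspond to per-base accuracies of roughly 91.9% and 92.9%, which sit just above the 90% implied by a score of 10.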
Compared to conventional genome-based methods, such as fluorescence in situ hybridization (FISH), fiber-FISH, array comparative genomic hybridization (aCGH) and paired-end mapping (PEM), which have a read length of approximately 35-400 base pairs (bps) [40,49], NP allows much more flexible read lengths (from a few bps to kbps). However, the average PHRED score of reads generated by MinION is still relatively low compared to other sequencers (e.g. Illumina, 454, Ion Torrent, and PacBio). At the moment, Illumina is the most popular DNA sequencer on the market. Still, depending on the equipment model and sample size, sequencing using Illumina can take 3-12 days to complete. Additionally, the current market price of Illumina instruments ranges from $50 000 (MiniSeq) to over $6M (Illumina HiSeq X Five), far more than NP-based sequencers.

Transcriptional factors
The second most well-known causal factor of cancer is aberrant activity of transcription factors (TFs), which are often members of multigene families with common structural domains [12]. TFs are the main regulators of gene expression and signaling pathways in all biological systems and bind to a specific sequence of DNA to promote or inhibit gene expression. In cells, a major portion of oncogenes and tumor-suppressor genes encode TFs [13,51]. Aberrant TF activity can occur due to changes in expression, protein stability, protein-protein interactions, post-translational modifications, and numerous other mechanisms [52]. In a healthy cell, upstream transcriptional regulators tightly regulate all genes with similar functions. However, changes in TF activity lead to deregulation of genes involved in promoting cancer cell proliferation and survival, and in inducing angiogenesis and metastasis of tumors [13,51]. 9][60][61][62][63] However, most of these methods require some combination of chemical cross-linking between TFs and DNA, modification or tagging of the TF and DNA, and amplification assays. Furthermore, due to these complicated requirements, these methods lack the ability to resolve fine details of the TF-DNA complex (i.e. partial versus full binding of the TF domains to DNA) [63]. The specific mechanism of TF binding to DNA sequences is still under intensive study and is a major area of interest in molecular biology [63,64].
Squires et al. used solid-state NPs as biosensors for the characterization of DNA, RNA, and proteins. With the use of an electric field, the researchers could guide the polymers through an NP and identify individual molecules. The current-blockage patterns generated during translocation of charged molecules provide an abundance of information about local TF properties, as well as TF-DNA interactions [63]. As previously noted, the regulation of TFs has not been well investigated; hence the use of solid-state NPs could be a novel technique for describing these molecular interactions. As a proof of technique, Squires et al. showed that their NPs can distinguish between specific and nonspecific binding of TFs by analyzing the ion current of the canonical zinc-finger DNA-binding domain of early growth response 1 (zif268). Characterization of zif268 was accomplished using the distinct blockage patterns of the current within the nanopore [65]. Analyzing the data, the researchers found that there are three main types of blockages, occurring mostly in five distinct patterns rather than randomly. These patterns correlate directly with preexisting data. Hence, the NP presents great potential in characterizing DNA complexes because of its ability to detect complex structures and protein conformations, with the possibility of removing TFs as needed. Squires et al. note that their NP sensor can identify small TFs on DNA as well as distinguish between specific and nonspecific binding. This technique makes information on TF-DNA interactions accessible (Fig. 3).

Telomeres
7][68][69] When shortened to a critical length, telomeres lose their ability to protect the chromosomes [70] and restrict the proliferation of normal somatic cells [69]. This leads to chromosomal fusion and degradation [71,72]. In contrast, approximately 85% of human cancer cells can achieve an "immortality" status by maintaining and elongating telomeres via the de novo synthesis of telomeric DNA [71]. A study was conducted on 47 102 individuals from the general population, who were followed for up to 20 years to determine the relationship between telomere length and cancer. Although short telomere length is not an indication of cancer [69], it was observed that cancer patients with shorter telomeres had an increased risk of early death. This result was observed in patients with lung and esophageal cancer, malignant melanoma, and leukemia [69].
Even though it has been years since the first research, the kinetics of telomeres in cancer cells remains elusive. At present, measuring the length of telomeres and observing the kinetics of folding are still challenging, as there is no gold-standard technique [73]. In order to fully understand the role of telomeres in cancer prediction or therapy, it is essential to understand the kinetics of telomere folding and other conformational changes in response to different living and environmental conditions.
Work is currently underway to apply NP sensors to tracking telomeric DNA G-quadruplex folding and unfolding. ][76] Findings from these studies reported that even though the four G-quadruplex structures all folded from the same DNA sequence, they produced very different electrical signatures [76]. This was attributed to the overall shape and volume of each secondary structure. It was observed that the hybrid-1 and -2 forms and the basket form had diameters of 2.7 nm and 2.4 nm, respectively. Since the cis opening of the α-hemolysin pore has a diameter of 3.0 nm, these three folds can enter the large vestibule. However, the propeller fold, with a disk-shaped structure and a diameter of 4.0 nm, exceeds the diameter of the NP cis opening and was unable to enter the vestibule [77].
Another inventive solution to capture and unravel G-quadruplexes is to employ a 25-mer poly-2′-deoxyadenosine tail (d25A-tail) on the 5′ end of the telomeric DNA. Applying this method, the Burrows group reported the analysis of various folding motifs of the telomere sequence, with and without the 5′-d25A-tails [76]. Among the four loop topologies, only the basket fold was able to translocate through the NP without the addition of the homopolymer tail to the 5′ end. For the G-quadruplex to move through the NP, it needs to unravel to a single strand that can translocate through the narrow β-barrel, and the remaining G-triplex has to roll within the vestibule. This is likely a favorable process for the basket fold because of its nearly spherical shape [78]. Even though the volume of the vestibule is large enough to accommodate all four G-quadruplexes within its cavity, the narrow entrance of the vestibule prevented the propeller fold from entering the NP. However, with the addition of the 5′ tail, the propeller fold was able to circumvent the problem of entering the cavity, and yet still had a very fast translocation signature. This is attributed to the fact that the propeller fold was able to roll outside of the vestibule while an electric force was applied to the dA25-tail as it threaded through the ion channel, without the molecular interactions or steric hindrance that would have been experienced in the interior of the vestibule [76].
In the light of those previous studies, the unfolding kinetics of human i-motifs were studied for the first time using the α-hemolysin NP. Under acidic conditions, cytosine (C)-rich DNA sequences can adopt i-motif folds, since hemi-protonation of C-rich strands allows C+·C base pairs to form [79]. Ding et al. conducted experiments on the human i-motif sequence at a constant ionic strength but various pH values (5.0-7.2). Since the dimension of an i-motif (2.0 nm × 2.0 nm) is smaller than the cis opening (~3.0 nm) of the α-hemolysin pore, it can enter the pore without unfolding and be captured in the nanocavity [79]. Hence, a d25A tail was attached to the sequence in order to increase the unfolding rate of the i-motif. Upon the attachment of d25A, it was observed that at pH 5.0, the folded structure entered the α-HL pore, yielding characteristic current patterns. However, when the pH was 6.8 and 7.2 (higher than the transition pH of 6.15), the percentage of strands still folded was 4% and 2%, respectively. Furthermore, the force applied in this study was analogous to the forces exerted on genomic DNA by RNA polymerase II (5-20 pN) and DNA helicase (6-16 pN) [79]. Hence, these studies strive to show the potential of α-hemolysin as part of biosensor development, aiding our knowledge of the lifetimes of i-motifs of telomere sequences and their biologically relevant structures, which can be used as drug delivery targets for cancer treatments [80].
These findings are steps toward a better understanding of the folding and unfolding mechanisms of the telomere. ][83] Whereas NP analysis, lacking all those complications, allows a better understanding of the kinetics and mechanisms, aiding in the analysis of how oxidation, stress and other factors affect the length of telomeres, as well as the correlation between cancer development and telomere immortality (Fig. 4).
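The size argument used in this section, that a fold can enter the α-hemolysin vestibule only if it is narrower than the roughly 3.0 nm cis opening, can be written as a small Python sketch. The diameters are the ones quoted in the text. The strict less-than comparison and the names `can_enter_vestibule` and `classify_folds` are simplifying assumptions for illustration, since the real behavior also depends on shape, charge, and applied voltage.

```python
from typing import Dict, Optional

# Diameters quoted in the text (nm); the cis opening is given as about 3.0 nm.
CIS_OPENING_NM = 3.0
FOLD_DIAMETERS_NM: Dict[str, float] = {
    "hybrid": 2.7,
    "basket": 2.4,
    "propeller": 4.0,
    "i-motif": 2.0,
}


def can_enter_vestibule(diameter_nm: float, opening_nm: float = CIS_OPENING_NM) -> bool:
    """Purely steric check (an assumption): the fold must be narrower than the opening."""
    return diameter_nm < opening_nm


def classify_folds(
    diameters_nm: Optional[Dict[str, float]] = None,
    opening_nm: float = CIS_OPENING_NM,
) -> Dict[str, bool]:
    """Map each fold name to whether it passes the steric check."""
    if diameters_nm is None:
        diameters_nm = FOLD_DIAMETERS_NM
    return {name: can_enter_vestibule(d, opening_nm) for name, d in diameters_nm.items()}


if __name__ == "__main__":
    for name, enters in classify_folds().items():
        print(f"{name:>9}: {'enters vestibule' if enters else 'excluded at cis opening'}")
```

With the quoted values, only the propeller fold fails the check, which matches the observation that it had to unravel outside the pore with the help of the added tail.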

DNA methylation
Hyper- and hypomethylation of CpGs. In humans, methylation of DNA is an epigenetic modification that transfers a methyl group from S-adenosyl-methionine to cytosine residues, forming 5-methylcytosine (5-mC). In mammalian cells, methylation of CpGs can directly or indirectly repress gene expression. For example, hypermethylation of CpG islands in the promoter region can directly lead to transcriptional silencing of tumor-suppressor genes. On the other hand, methylated CpGs can indirectly interfere with transcription by preventing the binding of basal transcriptional machinery or ubiquitous TFs. This process contributes to all of the typical hallmarks of a cancer cell that originate from tumor-suppressor inactivation [84]. With aging, cell deregulation gives mutations and epigenetic alterations (i.e. aberrant methylation in DNA) the chance to accumulate, causing proliferative advantages and genomic instability. 5][96][97][98][99] Hence, detecting aberrant DNA methylation can have an important role in cancer treatment and precancerous detection [26,100]. 3][104][105][106][107][108] However, these methods have certain drawbacks. For example, although HPLC and HPCE can accurately quantify the total amount of methylated CpGs, they suffer from incomplete restriction enzyme cutting, offer a limited region of study, require substantial amounts of high molecular weight DNA, and are labor intensive. Similarly, with PCR-based methods, only the methylation status of CpG sites that are complementary to the primers can be interrogated. Thus, the measured results may not necessarily reflect the predominant methylation patterns in the sample (false positive results).
With NP analysis, current methods used in the detection of aberrant CpG methylation usually employ either a methylation-specific labeler or electro-optical tagging [26,27,109]. The first method, as proposed by Shim et al., employs an engineered methyl-CpG-binding domain protein (i.e. MBD1x or Kaiso zinc finger proteins) as a selective labeler to detect and quantify hypermethylated CpG sites in double-stranded DNA (dsDNA) [26,109]. As the DNA translocated through the NP, the presence of 5-mC·labeler complexes caused a signature current blockage, allowing the detection and coarse quantification of 5-mC sites on a single molecule [109]. Indeed, this method set an initial application in screening for the presence of hyper- and hypomethylated DNA. Moreover, Shim et al. pointed out that, with the versatile binding affinity of KZF to various methylation patterns, the assay can allow various patterns to be screened [26]. Since NP analysis requires low volumes of DNA for testing, the technique will be more applicable and practical for clinical use. Without the need for DNA replication and amplification, detecting CpG methylation using NPs requires much less labor than other conventional methods.
The second method, as mentioned before, uses an electro-optical solid-state NP to detect and quantify hypomethylation in DNA [27]. In this approach, DNA MTase enzymes were assisted by small molecular weight synthetic cofactors to catalyze a one-step enzymatic reaction. This enzyme-cofactor complex was directly conjugated onto fluorescent probes and attached to the unmethylated CpG sites. The Meller group was able to detect and differentiate between fully methylated, partially methylated and unmethylated dsDNA, using ultrasensitive electro-optical NP sensing as the tool for single-fluorophore multicolor quantification. Unlike MBPs, DNA MTase labeled only the unmethylated CpG sites of the target DNA. This allowed the direct targeting of hypomethylated CpG sites in the genome (i.e. promoter regions of oncogenes). Furthermore, this electro-optical solid-state NP showed a high potential for employing multiple DNA MTases and other epigenetic biomarkers. With the aid of those biomarkers, orthogonal labeling/sensing of 5-mC can be achieved in the future [27]. Further research must be done in order to develop a calibrated scale to count the number of unmethylated CpGs in the target sequence.
(Figure caption, continued) ... unfolding, but cannot pass through the pore constriction [65]. (E) Models of the three conformations with the additional 5′-dA25 tail unraveling through the α-hemolysin pore. Both hybrid and basket folds were able to enter the cis opening of the α-hemolysin pore, and thus unraveled inside the pore nanocavity. On the other hand, the propeller fold, because of its size, could not enter the NP. This conformation unraveled its structure outside of the pore, with the help of the additional 5′-dA25 tail [65].
Other variants of CpG methylation: mC, hmC, caC, and fC. Recent discoveries of three other variants of cytosine made the study of DNA methylation even more complex. 1][112][113] 5-hmC normally exists at a high level in self-renewing and pluripotent stem cells [110,114]. Both mC and hmC influence mammalian embryonic stem cell maintenance [115,116], angiogenesis [117], and development [118]. Thus, hmC is a promising molecular biomarker with predictive and prognostic value [119]. As for fC and caC, there is still very little research being done. Because these variants were discovered only recently, we currently lack a robust method to distinguish between these five chemical modifications of cytosine. Even distinguishing between mC and hmC is a challenge for available methods [116,120]. 2][123] For instance, the Drndic group proposed a method using a solid-state NP to discriminate between two different structures (5-mC and 5-hmC) translocating through the pore. Upon the addition of 3 kbp dsDNA, a sequence of current blockages was generated, in which the magnitude of each spike was related to the excluded volume of the biopolymer occupying the pore. From the differences in ΔImax values, Wanunu et al. were able to discriminate between 5-mC and 5-hmC. The shorter end-to-end distance of the more polar 5-hmC indicated increased flexibility of 5-hmC compared to cytosine and 5-mC. Moreover, it was shown that different proportions of 5-hmC in DNA fragments containing cytosine and 5-mC can be quantified using the ionic current signal [121]. The second device used in the detection of CpG methylation variants employed both the wild-type phi29 DNA polymerase (phi29 DNAP) and MspA in the same assay [122,123]. With this unique approach, Wescoe et al. reported direct detection of all five cytosine variants (C, mC, hmC, fC and caC). In this single-molecule tool, a phi29 DNA polymerase drew ssDNA through the pore in single-nucleotide steps and the ion current through the pore was recorded [122]. Overall, the single-pass call accuracy ranged from approximately 91.6% to 98.3% depending on neighboring nucleotides [122,123]. Since the knowledge of the five cytosine variants, especially fC and caC, is still very limited, the possibility of these variants having an impact on genome-wide demethylation or other modifications in cancer cells should not be excluded.
These studies have shown the potential of NP analysis as a robust and efficient tool for the study of DNA methylation. The technique can directly detect CpG methylation without the need for DNA amplification or complicated preparation processes. Due to its special characteristics, methylation of CpGs is usually erased during replication and amplification. Bisulfite conversion, for example, requires extensive amplification, leading to false positive results. Hence, NP analysis could be a more practical and reliable method to screen for and detect aberrant DNA methylation in cancer patients. However, in order to apply NP technology to clinical trials and testing, genome-wide mapping of CpG methylation needs to be developed with a higher base-call accuracy (Fig. 5).

Histone-DNA modications
5][126][127] Histones are dynamic regulators of gene activity. They go through several post-translational modifications, such as acetylation, methylation, phosphorylation, ubiquitylation and others. Specifically, methylation and acetylation of lysine residues on the nucleosomal core histones play an essential role in gene expression and chromatin structure regulation [128]. In normal cells, histones in DNA sequences are hypoacetylated and hypermethylated. 9][130] However, global alterations of histone modification patterns can interrupt normal gene expression, and thus the genome's structure and integrity [131]. Structurally, histones assemble into octamers, around which a 146 bps long strand of DNA wraps to create a nucleosome. Acetylation or deacetylation of the lysine residues on nucleosomes can promote or suppress DNA replication accordingly. Accumulation of histones and nucleosomes in a cell has also been shown to be an early marker of cell death [132,133]. ][136] Several studies have been conducted on the translocation or unraveling of a nucleosome and its subunit structures through an NP [136,138,139]. Generally, it was found that DNA-histone complexes require a higher applied voltage and overall longer time periods to translocate through the NP, most likely because (1) the bulky, disk-shaped nucleosome experiences a higher drag force than bare dsDNA, (2) the positively charged histone core lowers the total net charge density of the nucleosome, reducing the translocation speed in electrophoresis, and (3) the histone-DNA complex must unwind [136,140].
As mentioned earlier, epigenetic modifications are known to affect the structural integrity and stability of nucleosomes. Given this fact, it was hypothesized that methylation of CpGs on dsDNA would affect the way nucleosomes fold and/or unravel. To test this hypothesis, Langecker et al. investigated the influence of DNA methylation on the stability of unlabeled mononucleosomes [139]. Similar to the results reported in other studies, under the electrophoretic force, the nucleosomal DNA tail entered the pore and gradually unraveled under increasing voltage, which was much higher than that needed for free DNA capture [141]. This experiment was repeated on nucleosomes with and without methylated DNA sequences, showing that methylation of CpGs did not affect nucleosome assembly, stability, or unraveling trajectories. This finding suggested that histone modifications (i.e. acetylation and phosphorylation) play a much more dominant role in nucleosomal maintenance than DNA methylation. The confirmation of methylation-independent nucleosome stability pointed to other possible mechanisms by which DNA methylation alters gene expression, for example, modulating the binding of transcription activators/repressors [139].
The NP-based studies outlined herein lay the groundwork for understanding and predicting the influence of different histone core modifications on the nucleosome structure [139], of which our knowledge is still quite limited. Unlike conventional methods (i.e., single-gene chromatin immunoprecipitation (ChIP), ChIP with a DNA array (ChIP-on-chip) [142,143], HPLC, HPCE, and many others), NP devices are more versatile, because they do not rely heavily on the quality or availability of polyclonal antibodies [101]. Although the study here indicated that DNA methylation does not affect nucleosome assembly, further studies need to be done in order to confirm the role of DNA methylation in other processes (i.e. regulating the binding of transcription activators/repressors, or gene expression), as well as the relationship between acetylation and phosphorylation on nucleosome assembly and chromatin stability.

MicroRNA
MiRNAs are small endogenous biomolecules, 18-22 nucleotides in length. They play an important role in embryonic differentiation, hematopoiesis, cardiac hypertrophy and numerous cancer-related processes, including proliferation, apoptosis, differentiation, migration and metabolism. 144,145 Since a single miRNA can target up to hundreds of mRNAs, 137 aberrant expression of one miRNA may affect several transcripts and cancer-related signaling pathways. In cancer cells, because of the genetic diversity of tumors and cancer cell lines, an individual miRNA can be up-regulated in one type of cancer and down-regulated in another. 137 Overall, miRNA function depends on the targets within the specific tissue. 12 Usually, the up-regulated miRNAs function as oncogenes by down-regulating tumor-suppressor genes, while the down-regulated miRNAs function as tumor-suppressor genes by down-regulating oncogenes.
Detection of miRNAs faces several challenges, mainly due to their short length. 7][148][149] Unfortunately, these techniques suffer from DNA amplification errors, a lack of internal controls, and cross-hybridization. Also, the short sequence of miRNAs makes the design of probes and primers even more challenging. 146,148
MiRNAs have been investigated as potential molecular biomarkers, because their expression levels are associated with various diseases. 150 For instance, each year, lung cancer causes approximately 1.2 million deaths worldwide. 151 Since there is no effective screening procedure available, more than 70% of lung cancer patients are diagnosed with a 5-year survival chance of less than 15%. 151 More than 100 types of miRNAs have been identified to deregulate lung cancer progression. 150
Noticeably, high levels of miR155 and low levels of let-7a-2 have been associated with a significantly poor prognosis and shorter survival times in lung cancer patients. 152,153 Many research groups have used biological and solid-state NPs for the detection of miRNAs in different tissues. For example, a solid-state NP was used for rapid detection of probe-specific miRNAs (miRNA-122a and miRNA-153). 154 Specifically, for every 1 fmol of miRNA duplex per mL of solution, the capture rate was 1 molecule per second. In this study, the p19 protein from the Carnation Italian ringspot virus was used to enrich miRNA-122a and miRNA-153. Since miRNA concentrations were 1% relative to other cellular RNAs, an enrichment step was required to detect a specific miRNA using an NP. 154 P19 binds 21-23 bp dsRNA in a size-dependent, but sequence-independent manner. Additionally, the high-affinity, selective viral p19 protein does not bind ssRNA, tRNA or rRNA. This eliminates the possibility of false results from mismatched binding. 155 Detection of 250 molecules in 4 minutes was sufficient to determine miRNA concentration with 93% confidence. 
154 A different approach from using viral proteins for probe-specific miRNA detection is to employ an engineered probe with a programmable sequence to differentiate single-nucleotide differences among miRNA family members. 150 Wang et al. proposed a system that enabled sensitive, selective, and direct quantification of cancer-associated miRNAs in the blood. In this study, the group constructed a robust protein nanopore-based sensor that utilized an oligonucleotide probe (P155) to detect aberrant expression of miRNA-155 and let-7a-2 from lung cancer patients. 150 The generated signature electrical signals provided direct and label-free detection of the target miRNA in a fluctuating background, such as plasma RNA extract. 150 The probe (P155) has a programmable sequence and can be optimized to achieve high sensitivity and selectivity. Additionally, using chemical modifications, distinct probes can further be engineered with specific barcodes, allowing multiple miRNAs to be detected simultaneously. Furthermore, with the development of miRNA markers, customizable NP arrays for miRNA profile detection can be constructed for noninvasive screening and early diagnosis of cancer. 150 Compared to qRT-PCR assays, microarrays, colorimetry, bioluminescence, and other current methods, [146][147][148][149] NP arrays are a simpler, faster method to detect miRNAs in cancer patients. This approach avoids the complications of conventional methods, such as DNA amplification errors, a lack of internal controls, and cross-hybridization. Early detection is one of the most crucial contributors to a higher survival rate, especially for lung cancer patients (Fig. 6). 151
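The solid-state NP figures quoted earlier in this section (a capture rate of 1 molecule per second for every 1 fmol of miRNA duplex per mL, and 250 molecules detected in 4 minutes) can be turned into a short worked calculation. The sketch below is illustrative only. It assumes that the capture rate scales linearly with concentration and that capture events follow Poisson counting statistics, so that the relative uncertainty of a count N is 1/sqrt(N). Neither assumption is stated in the text, and the function and constant names are hypothetical.

```python
import math

# Calibration quoted in the text: 1 capture per second per (fmol/mL) of miRNA duplex.
CAPTURES_PER_S_PER_FMOL_PER_ML = 1.0


def estimate_concentration(n_events: int, duration_s: float,
                           calibration: float = CAPTURES_PER_S_PER_FMOL_PER_ML):
    """Return (concentration in fmol/mL, relative counting uncertainty).

    Assumes a linear rate-to-concentration calibration and Poisson-distributed
    capture events (both are assumptions made for this sketch).
    """
    if n_events <= 0 or duration_s <= 0:
        raise ValueError("n_events and duration_s must be positive")
    capture_rate = n_events / duration_s          # events per second
    concentration = capture_rate / calibration    # fmol/mL
    relative_uncertainty = 1.0 / math.sqrt(n_events)
    return concentration, relative_uncertainty


# 250 captured molecules in 4 minutes, as in the example from the text.
conc, rel = estimate_concentration(n_events=250, duration_s=4 * 60)
print(f"concentration = {conc:.2f} fmol/mL, relative uncertainty = {rel:.1%}")
```

Under these assumptions, 250 events in 240 s correspond to about 1.04 fmol/mL with a relative counting uncertainty of roughly 6%, which is comparable to the 93% confidence quoted above.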

Summary and conclusions
In this paper, we have concisely reviewed the main genetic and epigenetic causal factors of cancer, and summarized how NPs have been used in the research of each factor. Thanks to several unique features of this emerging technology, NP-based analysis offers four main benefits. Firstly, NP analysis offers long reads of genomic DNA (>10 kb). Therefore, linkages between modified cytosines may be revealed that are biologically significant and otherwise difficult to discern. For example, it was shown that histone-DNA interaction is not affected by methylation of DNA. Also, for the first time, caC, fC, and hmC were successfully distinguished from mC. Secondly, the genomic DNA is read directly as it transports through the NP. Thus, errors (false-positive results) caused by copying do not occur. Thirdly, with biological NP membranes, it is possible to study the folding-unfolding kinetics and mechanisms of biomolecules. Furthermore, the DNA fragment can be retained in the NP indefinitely, allowing rereads of a captured DNA fragment. 123 Lastly, many conventional methods are still impractical for clinical testing, because these methods require highly trained experts, intensive labor, a high capital cost, and a large footprint. NP technology has no such requirements, offering more flexibility and practicality for research labs and clinics (Table 1).
Although the concepts of NP analysis in early cancer detection are exceptionally promising, several key technological challenges must be addressed before this method can be implemented in clinical use. First and foremost, the biggest drawback of NP-based methods is high mismatch and error rates. Because the thickness of the NP membrane, especially a biological one, is relatively large compared to a nucleotide, NP sensitivity is still low at the single-nucleotide level. Furthermore, even though different DNA conformations and foldings yield distinguishing characteristic current blockades, information about the molecular structure cannot be determined by the NP membrane alone. In order to confirm the exact structure that causes a signature blockade in an NP, researchers need the aid of other techniques, such as circular dichroism (CD), FRET, and FISH, among many others. This limits the use of NP membranes as an independent, stand-alone tool for molecular studies in general, and for early cancer detection specifically. Moreover, since a single biological molecule can quickly adopt multiple, complex conformations under different environments, many research groups choose to use short/simplified sequences in their NP studies. Hence, the complexity of cancer cells has not yet been demonstrated and/or fully investigated with NP membranes.
With this review paper, we hope to give our readers an overview of the essential genetic and epigenetic modifications in cancerous tissue and the progression of cancer cells. Given the complexity of the human body, and more specifically of cancer tissues, many of the mechanisms for cancer proliferation remain unknown. NP-based membranes have shown their ability to detect the chemical and structural modifications of various biomolecules, as well as genetic and epigenetic modifications. Thus, NP technology could be the one simple solution replacing many costly, labor-intensive conventional cancer screening methods. Given the complexity of the field, there is growing opportunity for more significant research to be conducted in the next few decades.

Fig. 1 Schematic view of the NP-based experimental setup. The two chambers (cis and trans) are separated by a membrane, which is usually either biological or solid-state. A nanopore, embedded in or fabricated into the membrane, acts as the single channel connecting the two chambers. DNA is added to the cis side. Under the electrophoretic force exerted by the applied voltage, DNA strands translocate through the NP to the trans chamber, creating characteristic current blockages. 2

Fig. 3 Distinguishing between specific and non-specific binding of TF-DNA with a solid-state nanopore. Translocation event traces and proposed mechanisms for (A) specific binding, and (B) non-specific binding of TF to DNA. 63

Fig. 4 Capturing the unfolding process of the four G-quadruplex structures with a biological nanopore. (A) Schematic of the α-hemolysin NP, with a cis opening of 3.0 nm, a constriction of 1.4 nm, and a trans opening of 2.0 nm. (B) Folding structures and dimensions of G-quadruplex conformations: hybrid-1, hybrid-2, basket, and propeller. (C) A G-quadruplex fold enters and unfolds inside the nanocavity of the α-hemolysin NP, causing two distinct levels of blockade. (D) Except for the propeller fold, all other G-quadruplexes can enter the cis opening of the α-hemolysin NP without unfolding, but cannot pass through the pore constriction. 65 (E) Models of the three conformations with the additional 5′-dA25 tail unraveling through the α-hemolysin pore. Both hybrid and basket folds were able to enter the cis opening of the α-hemolysin pore, and thus unraveled inside the pore nanocavity. In contrast, the propeller fold, because of its size, could not enter the NP. This conformation unraveled outside the pore, with the help of the additional 5′-dA25 tail. 65

Fig. 5 Distinguishing variants of cytosine with biological and solid-state nanopores. (A) Chemical structures of cytosine and its variants. First row: mC (left) and fC (right). Second row: cytosine. Third row: hmC (left) and caC (right). 122 (B) Schematic of the Phi 29 DNAP-MspA complex. The MspA pore constriction is shorter and narrower than that of α-hemolysin (as shown at the top), allowing subtle structural changes to be distinguished. (C) A typical trace of DNA translocation through the Phi 29 DNAP-MspA complex. 121 (D) Detection of DNA methylation with methyl binding proteins (MBP) using a solid-state nanopore. MBPs bind to methylated CpGs on DNA, allowing the detection of, and differentiation between, unmethylated, hypermethylated and locally methylated DNAs. (E) Detection of DNA methylation with optical tagging using a solid-state nanopore. 27

Fig. 6 Detection of miR-155 using solid-state and biological α-hemolysin nanopores. (A) Schematic of miRNA detection with viral proteins for probe-specific miRNA, using a solid-state nanopore. Protein from the Carnation Italian ringspot virus was used to enrich miRNA from the background fluid. (B) Detection of probe-specific miRNA using an α-hemolysin biological nanopore. MiRNA-155 (shown in red) was attached to a DNA P155 probe (shown in green). (C) At pH 8.0 and 100 mV, translocation of the miRNA-155·P155 complex resulted in various current blockage patterns. (D) A typical current blockade with three characteristic blocking levels, representing the mechanism of miRNA-155·P155 complex dissociation and translocation through the pore (as shown on the right-hand side). 154

Table 1
Advantages and limitations of NP technology in detecting various cancer biomarkers compared to conventional methods a
* Advantages: Does not require multiple processing steps and file output formats. Low capital cost and short sequencing time. Limitations: Low sensitivity and accuracy, high mismatch rate. Thus, the PHRED score is not yet high enough for cancer detection. 33
Transcriptional factors 3 * Advantages: Label- and tether-free. Does not require chemical crosslinking or tagging. Hence, allows direct detection of, and distinction between, full versus partial, and specific versus nonspecific bindings. Limitations: Not able to predict the exact binding site of TFs on an unknown DNA sequence.
a BL = biological nanopore. SS = solid-state nanopore. (*) Label-free and does not require additional aid in order to detect biomolecules.
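Table 1 refers to the PHRED score without defining it. For reference, the standard definition is Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The short sketch below converts between the two. The function names are hypothetical, and the example error rates are illustrative rather than measured values from any NP platform.

```python
import math


def phred_from_error_probability(p_error: float) -> float:
    """Phred quality score Q = -10 * log10(P) for a base-call error probability P."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("p_error must be in the interval (0, 1]")
    return -10.0 * math.log10(p_error)


def error_probability_from_phred(q: float) -> float:
    """Inverse relation: P = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


# Illustrative error rates only (not taken from the studies reviewed here).
for p in (0.1, 0.01, 0.001):
    print(f"error probability {p:g} -> Q = {phred_from_error_probability(p):.0f}")
```

Each tenfold reduction in the error probability raises Q by 10, which is why a high mismatch rate translates directly into a low PHRED score.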