Synthetic genetic polymers: advances and applications

Synthetic genetic polymers, also known as xeno-nucleic acids (XNAs), are chemically modified or synthesized analogues of natural nucleic acids. Initially developed by synthetic chemists to better understand nucleic acids, XNAs have grown rapidly over the last two decades in both diversity and usefulness. Their tailor-made functionalities allow them to overcome perennial problems in using natural nucleic acids in technical applications. In this article, key milestones in XNA research are reviewed through representative examples, and the advantages of XNAs over natural nucleic acids are discussed. It is hoped that this article will summarize the advances in, and current understanding of, XNAs and their technical applications, serving as an entry point for those interested in the synthesis and application of XNAs. Besides the interesting results, the challenges encountered may inspire researchers to perfect the synthesis of XNAs and tailor their functionalities.


Introduction
As natural genetic polymers, nucleic acids, namely deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), have evolved over geological time to reliably store and transmit hereditary information. Nucleic acids are able to achieve this due to the specific hydrogen-bonding patterns between their nucleobases. As described by the Watson-Crick rules, adenine (A) always pairs with thymine (T) (or uracil (U) in the case of RNA), while cytosine (C) always pairs with guanine (G). 1 Incompatible hydrogen-bonding patterns, along with proofreading enzymes, ensure high fidelity in the replication and transmission of genetic information. 2 The negatively charged phosphate backbones of nucleic acids ensure that they are highly soluble in water irrespective of their sequences. Thus, long and diverse nucleic acid strands can be synthesized without the need to consider their aqueous solubility. Furthermore, the modularity of nucleic acids facilitates their synthesis, either enzymatically or chemically. 3 These properties have resulted in the ready adoption of natural nucleic acids as convenient and useful materials in various technical applications.
Inspired by the elegance of natural nucleic acids, researchers have been actively pursuing the ultimate goal of emulating natural nucleic acids with totally man-made entities: synthetic genetic polymers or xeno-nucleic acids (XNAs). The term "XNA" was first coined by Herdewijn and Marlière to describe artificial genetic polymers with the potential to emulate natural nucleic acids in information storage and propagation. 4 Synthetic genetic polymers, as the name suggests, contain unnatural components with chemical modifications to the nucleobases, the sugar moieties, the phosphodiester backbones, or a combination of these. 5 The endeavor to create XNAs has resulted in a deepened understanding of natural nucleic acids in terms of their structure, chemical and physical properties, and function. These valuable insights have in turn spun off other developments, particularly in the fields of molecular biology, gene therapy, bioassays, diagnostics, and biocatalysis. 6 By successfully emulating nucleic acids in information storage and propagation, both prerequisites for evolution and life, 7,8 these laboratory-synthesized XNAs have also helped scientists to rethink the assumption that nucleic acids and proteins are the only chemicals that power cells, fueling the possibility of the existence of xeno-organisms that utilize a completely different set of biomolecules. 9 The explosion of research activities in XNAs also implies that it is impractical to cover every aspect of this exciting field. Therefore, this article focuses on the progress of XNA research and its technical applications, starting with an overview of the various types of XNAs, their fundamental aspects, and their synthesis and evolution strategies, and ending with their technical applications, challenges, and outlook.

Qian Ma
Ms Qian Ma was born in Shannxi, China. She received her B.Sc. in Chemistry from Nanjing University in 2015. She is currently pursuing her Ph.D. at the National University of Singapore (NUS). Her research is primarily on the development of highly sensitive biosensors and bioassays for nucleic acids and proteins.

Danence Lee
Mr Danence Lee was born in Singapore. He graduated from NUS in 2010 with a B.Sc. (first class honors) in chemistry. After a 5-year teaching stint with the Ministry of Education, he is back at NUS to pursue his Ph.D. in Chemistry. His current research interest focuses mainly on the development of efficient methodologies for organic synthesis via bimetallic and cooperative catalysis.

XNAs with modified nucleobases
While essentially all living organisms on Earth utilize nucleic acids to store and transmit genetic information, researchers have long envisioned the creation of a synthetic self-replicating organism that can incorporate a third base-pairing mode (X-Y) into its genetic polymers (Fig. 1). 10 Aided by the advances in organic synthesis and bioanalytical methods, this vision has gradually become a reality. 11 One of the main reasons for creating such an organism with an expanded genetic alphabet is for the organism to ribosomally produce novel proteins that incorporate unnatural amino acids. The discovery of a third base pair capable of replication often revolves around iterating three processes: chemically synthesizing artificial nucleotides, ascertaining the specificity and efficiency of the base pairing by biochemical methods, and identifying high-potential base pairs for further chemical modification. 12 In order to qualify as the third base pair, the bases X and Y must pair exclusively with each other, both in the helical structure and during nucleic acid synthesis by polymerases.
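The exclusivity requirement can be made concrete with a small sketch. The Python below is illustrative only: it treats X and Y as placeholder symbols for an unnatural base pair, and the names PAIRING, is_exclusive, and reverse_complement are hypothetical helpers introduced here, not part of any published tool. It simply encodes a six-letter alphabet in which each base has exactly one partner.

```python
# Illustrative six-letter pairing table. X and Y are placeholder symbols
# for an unnatural base pair, not specific published nucleobases.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C", "X": "Y", "Y": "X"}


def is_exclusive(pairing):
    """Return True if pairing is mutual, i.e. each base's partner pairs back with it."""
    return all(pairing.get(partner) == base for base, partner in pairing.items())


def reverse_complement(sequence, pairing=PAIRING):
    """Return the strand that would pair with `sequence`, read 5' to 3'."""
    try:
        return "".join(pairing[base] for base in reversed(sequence))
    except KeyError as err:
        raise ValueError(f"unknown base: {err.args[0]}") from None


if __name__ == "__main__":
    print(is_exclusive(PAIRING))
    print(reverse_complement("ATXGC"))
```

A table in which X could also pair with T, for example, would fail the is_exclusive check, which loosely mirrors the mispairing problem that a candidate third base pair must avoid.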
An example of such iterations is illustrated by Benner's work in developing unnatural base pairs with hydrogen-bonding patterns similar to those of natural base pairs. Notable pioneering work by his group includes the development of an isoG-isoC base pair (Fig. 2) and its successful in vitro incorporation into natural nucleic acids for replication and transcription. 13,14 However, the potential of the isoG-isoC base pair to function as the third base pair is seriously limited by several issues concerning its specificity, the most serious of which is that isoG undergoes tautomerization in aqueous media at physiological pH. 15,16 Once enolized, isoG (enol) pairs with thymine instead of isoC, thus severely compromising the fidelity of the isoG-isoC base pair. Furthermore, isoC was found to be chemically unstable in alkaline media. 15 To resolve these issues, one of the strategies developed is to substitute thymine with a thymine analogue with reduced hydrogen-bonding ability. Such a thymine analogue still binds to adenine but no longer mispairs with isoG (enol). Methylation of the 5-position of isoC (5-Me-isoC) has also helped to improve the stability of isoC. 17 Through such successive iterations of modification, Benner's group demonstrated that the isoG-5-Me-isoC base pair can serve as the third base pair in polymerase chain reactions (PCR) with a fidelity of up to 98% (Fig. 3). 18 More recently, a new base pair between a triaza-isoG (dP) and nitropyridone (dZ) was developed by his group (Fig. 3). 19 This base pair shows enhanced polymerase recognition and improved chemical stability, enabling a fidelity of up to 99.8% during PCR. They took their research a step further by designing unnatural nucleobases that pair with natural nucleobases but not with each other. Thus, undesirable primer-primer interactions are minimized, greatly improving specificity in PCR. 20
In stark contrast to Benner's strategy, Morales et al. developed base pairs that depend on shape complementarity rather than hydrogen-bonding patterns, thereby proving that hydrogen bonding is not an absolute requirement for replication. 21 [...] 24 Well-known base pairs developed are 5SICS-MMO2, 25 5SICS-DMO, 26 5SICS-NaM, 27 and Ds-Px (Fig. 4). 28 For these base pairs, a balance between shape complementarity and size similarity to natural nucleobases is achieved. Through careful design, repeated optimization, and testing, the hydrophobic interactions and stacking between the base pairs have resulted in enhanced fidelity during PCR and transcription. For instance, the Ds-Px pair can undergo a hundred cycles of PCR with 97% fidelity. 29 Through extensive research, the genetic alphabet has been expanded beyond A, C, G, and T. Numerous nucleoside analogues that are highly capable of specific and robust base pairing have been synthesized and shown to be successful as the third base pair in PCR (Table 1). The arduous journey in search of the third base pair has furthered the understanding of biology from a chemical perspective.
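To see why seemingly small differences in fidelity matter, the sketch below compounds a per-cycle fidelity over many PCR cycles. It assumes, as a simplification not stated in the text, that a quoted percentage is the probability of retaining the unnatural base pair in one cycle and that losses in successive cycles are independent. If a figure is instead read as the overall retention after a given number of cycles, the second function recovers the implied per-cycle value. Both function names are hypothetical.

```python
def retained_fraction(per_cycle_fidelity, cycles):
    """Fraction of products still carrying the unnatural pair after `cycles`,
    assuming independent losses with the same fidelity in every cycle."""
    if not 0.0 <= per_cycle_fidelity <= 1.0:
        raise ValueError("per_cycle_fidelity must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return per_cycle_fidelity ** cycles


def implied_per_cycle_fidelity(overall_retention, cycles):
    """Per-cycle fidelity implied by an overall retention after `cycles`."""
    if not 0.0 < overall_retention <= 1.0:
        raise ValueError("overall_retention must lie in (0, 1]")
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return overall_retention ** (1.0 / cycles)


if __name__ == "__main__":
    # Compare two per-cycle fidelities over the same number of cycles.
    for fidelity in (0.98, 0.998):
        print(fidelity, retained_fraction(fidelity, 20))
    # Per-cycle value implied if 97% is the retention after 100 cycles.
    print(implied_per_cycle_fidelity(0.97, 100))
```

Under these assumptions, a fidelity of 98% per cycle leaves roughly two thirds of the products intact after 20 cycles, whereas 99.8% per cycle leaves about 96%.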
The evolution of life over geological time has perfected a highly conservative system to reliably store and transmit hereditary information through nucleic acids. Currently, the genetic information of all known living organisms is conveyed through the order of the four natural nucleobases, that is, the genetic code. In other words, the arrangement of the four nucleobases along the backbone of DNA/RNA in a specific order underlies the preservation and transmission of genetic information. The creation of extra genetic codes, 13,14,20,25-33 together with the development of XNA replicating systems, will challenge our common beliefs about the origin of life. It will ultimately expand our concept of life, since life does not need to be exclusively based on a certain group of biological entities. The development of xeno-biology could lead to the production of unprecedented lineages of unnatural biological entities for a wide variety of purposes ranging from diagnostics to therapeutics. It could also provide us with an ultimate biosafety toolbox capable of safeguarding any type of genetic interaction between synthetic and natural life forms. 34,35 On the other hand, great care must be taken with the societal aspects of xeno-biology, including concerns such as biosafety, biosecurity, intellectual property rights, and governance. 36

XNAs with modified phosphodiester backbones
Besides nucleobases, there have been endeavors to modify the phosphodiester linkages of nucleic acids. 37 Phosphodiester linkages have a huge influence on the physicochemical properties of nucleic acids. Not only do their anionic structures cause neighboring nucleotides to repel each other, creating sufficient space for enzymes to access their sequences, but their high polarity also ensures high solubility in water. Modifications to the phosphate backbones may therefore significantly perturb the properties of nucleic acids, and studying these perturbations aids the understanding of nucleic acids. Among nucleotides with modifications to the phosphodiester linkages, two broad classes, phosphorothiolate nucleotides 38 and boranophosphate nucleotides, 39 stand out (Fig. 5). The phosphorothiolate nucleotides differ from natural nucleotides in that an oxygen atom in the phosphodiester linkage is substituted by a sulfur atom. Such a substitution was found to make phosphorothiolate nucleotides less susceptible to nuclease degradation. 40 Their markedly reduced polarity, due to the increased polarizability of the sulfur atom, has also been utilized to understand the mechanisms of reactions occurring at phosphodiester linkages. 41 The boranophosphate nucleotides are similar to the phosphorothiolate nucleotides in that an oxygen atom in the phosphate backbone is substituted, in this case by a borane (BH3) group. Both modified nucleotides are capable of undergoing PCR by using an engineered Taq DNA polymerase. 42

XNAs with modified sugar moieties
Representative XNAs with modifications to the sugar moieties of nucleic acids include threose nucleic acid (TNA), hexitol nucleic acid (HNA), and locked nucleic acid (LNA) (Fig. 6). TNA consists of an unnatural four-carbon threose sugar instead of a ribose sugar. 43 The consequence of this modification is that the phosphodiester linkage is shortened by a single bond. Despite this difference, TNA is capable of pairing with complementary nucleic acid strands. TNA has also been shown to fold into tertiary structures with the desired chemical functions, driving speculation that TNA could be an RNA progenitor. 44 HNA consists of a six-membered pyranosyl ring with the phosphodiester linkages connected at the 4′ and 5′ positions of the pyranosyl ring. 45 Like TNA, HNA can form duplexes with its complementary nucleic acids. 45 Also, HNA can form a double helix with itself. 46 Recently, Pinheiro et al. sought to identify polymerases capable of generating XNAs long enough to carry significant genetic information by using a compartmentalized self-tagging strategy. 47 This strategy involves the construction of a library of mutated polymerases and the use of primers and modified nucleotides, such as those of HNA, to select the best polymerase. Remarkably, they were successful in identifying a mutant polymerase that can synthesize HNA oligonucleotides from DNA templates and a reverse transcriptase that transcribes HNA oligonucleotides back to DNA. This strategy also enables the identification of polymerases that can transcribe other XNAs with modifications to the sugar moieties, such as cyclohexenyl nucleic acid (CeNA), LNA, TNA, altritol nucleic acid, arabinose nucleic acid (ANA), and 2′-fluoro arabinose nucleic acid (FANA) (Fig. 6). This general strategy for identifying polymerases that accept a variety of sugar-modified nucleic acids has great potential in advancing the field of synthetic genetics. Besides TNA and HNA, another representative sugar-modified XNA is LNA, which contains a bridging methylene between the 2′ and 4′ positions of the ribose. The rigidity of LNA restricts the degrees of freedom of LNA strands, thus conferring remarkable hybridization stability. In addition, the fully alkylated 2′-O enhances the resistance of LNA to nuclease degradation, in contrast to RNA, which can be easily degraded by nucleases that exploit its free 2′-OH. 48
Fig. 6 Representative nucleotides with modifications to the sugar moieties.

XNAs with both modified sugar moieties and phosphodiester backbones
A radical modification to the backbones of nucleic acids is to totally replace the phosphate and sugar backbone with a peptide backbone, giving peptide nucleic acid (PNA). PNA contains repeating aminoethylglycine units joined to each other by peptide linkages (Fig. 7). 49 The backbone can be easily modified to replace the glycine moiety with lysine, 50 arginine, 51 or cysteine, 52 changes that can alter the physicochemical properties of PNA. The most outstanding feature of PNA is that it is electrically neutral. Therefore, unlike nucleic acids with negatively charged backbones, a PNA strand can form a duplex with either itself or a DNA/RNA strand without the need for any counter ions for stabilization. 53 Furthermore, the abolishment of inter-strand electrostatic repulsion also means that PNA/nucleic acid duplexes are thermodynamically more stable than double-stranded nucleic acids. These traits, together with better resistance to nuclease degradation, allow PNA to be used in numerous applications, 54,55 even though it remains refractory to PCR. 56
Another type of XNA with modifications to both the sugar moieties and phosphodiester linkages is morpholino (MO), in which riboses are replaced by morpholine moieties while the phosphodiester backbone is replaced by a phosphorodiamidate backbone (Fig. 7). 57 Similar to PNA, MO possesses an electrically neutral backbone. Despite its electrical neutrality, MO has good solubility in water due to the hydrophilic morpholine ring. 58 MO is also resistant to a wide range of nucleases, which makes MO oligomers very useful as intracellular probes. 59 More importantly, MO oligomers sterically prevent enzymes from accessing their target nucleic acids. By doing so, MO is capable of probing and regulating gene expression in vivo. For instance, an MO strand can bind to a region of a target messenger RNA (mRNA), reducing the translation of the protein encoded by the mRNA. 60 Therefore, MO allows gene and protein expression to be investigated easily in a cellular context.

XNAs in molecular biology
Because of the scope of this article, only a brief summary of the applications of XNAs in molecular biology is presented in this section. The biological importance of nucleic acids extends beyond the nucleotide sequence: the secondary and tertiary structures of nucleic acids are known to be involved in various functions. For example, the propensity of DNA to form double-stranded helices is generally believed to confer great stability and provide an avenue for proofreading during replication, both of which are important factors in long-term information storage. Other than the familiar double helix, DNA is also capable of forming a few tertiary structures through non-Watson-Crick base-pairing modes, such as the G-quadruplex and the i-motif, which are thought to be involved in gene expression. 61 The formation and interactions of G-quadruplexes have been observed by incorporating a fluorescent purine analogue, 2-aminopurine, into a region of the human telomere known to form such a structure. 62 In addition, several groups have developed PNA probes capable of binding to G-quadruplexes, presumably by binding to the loop region or by strand invasion. 63,64 However, such probes are currently limited to in vitro applications. 65
In contrast to the limited structural diversity of DNA, RNA can adopt a much greater variety of tertiary structures. This is because, while DNA is primarily concerned with preserving hereditary information, RNA is involved in many cellular processes, such as the control of transcription, catalysis in protein translation, and alternative splicing. 66 RNA molecules involved in these functions are aptly termed "functional RNAs (fRNAs)". 67 A significant proportion of fRNAs are double-stranded or contain double-stranded regions, such as siRNA and riboswitches. XNA triplex-forming probes have therefore been developed to detect double-stranded RNAs. For example, Wang et al. modified RNA with carboxamide-linked pyrene to discriminate between single- and double-stranded RNAs. 68 The emission of pyrene excimers was observed for single-stranded RNA but not for double-stranded RNA. More recently, it was observed that PNA modified with a thio-pseudoisocytosine nucleobase can form a triplex with double-stranded RNA. 69 The neutral backbone of PNA allows an RNA/RNA-PNA triplex to form almost independently of pH and salt concentration, which is an advantage over triplex-forming nucleic acid probes. 70 The unnatural nucleobase prevents the formation of a PNA-RNA duplex by sterically clashing in the Watson-Crick pairing mode and favors the Hoogsteen base-pairing geometry, and therefore triplex formation, through compatible hydrogen bonds and shape complementarity.
Like PNA, MO oligomers with sequences complementary to their target nucleic acids hybridize with them according to the Watson-Crick base-pairing mechanism. First studied by Summerton et al., 71 MO oligomers have primarily been used in antisense applications and as a knockdown tool in developmental biology due to their high specificity and ease of cytosolic delivery. Being neutral nucleic acid analogues, PNA and MO are also used for antisense and antigene applications as well as microRNA (miRNA) therapeutics, since they are resistant to nuclease and protease digestion, whereas nucleic acids are highly susceptible to nuclease degradation. 72 They are being used in applications such as correcting splicing errors in pre-mRNAs in cultured cells and in extracorporeal treatments of cells from thalassemic patients. A particularly interesting application in developmental biology is the use of PNA and MO to block the expression of any selected gene throughout the course of embryogenesis. 71,72 LNA is best suited for use in molecular biology where high specificity is required. LNA also has immense therapeutic potential for the regulation of gene expression, as it is stable in serum, can be taken up by mammalian cells, and shows low toxicity in vivo.

XNAs in the study of nucleic acid-protein interaction
Nucleic acids often interact with other biomolecules, most commonly proteins, in order to perform functions related to transcription, translation, and regulation. Classical methods of probing nucleic acid-protein interactions include immunoassays, pull-down assays, and electrophoretic mobility shift assays. To gain a deeper understanding of such interactions, it may be necessary to investigate their dynamics. Nucleic acid dynamics may be elucidated by nuclear magnetic resonance or computational modelling, 73 among others. While these approaches have proven useful, the perennial concern with in vitro and in silico methods is that they may not provide an accurate representation of nucleic acid interactions and dynamics in complex cell milieus. It is therefore often useful to obtain in vivo information to complement in vitro and in silico studies. The chemical diversity and bioorthogonality of XNAs are advantageous in this regard. MO antisense probes are well established to form stable and specific duplexes with miRNAs, thereby reducing their interactions with proteins that exert downstream effects. 74 In this manner, the function of a miRNA of interest can be evaluated in a sea of miRNA and other RNA molecules. 75 Zielinski et al. developed PNA probes functionalized with a UV-activatable crosslinking amino acid, para-benzoylphenylalanine, to covalently link the probes with RNA-binding proteins (RBPs) that interact with the target RNA (Fig. 8). 76 The PNA probes can be isolated by antisense oligonucleotides immobilized on magnetic beads, allowing rapid identification of RBPs by mass spectrometry. The key advantage of this approach is that the probes can be used in vivo, allowing weak or transient interactions to be captured. Hence, it is expected that such a technology may reveal novel RBP-RNA interactions previously obscured by sample processing in in vitro assays.
A well-known RBP-RNA interaction is that between a short interfering RNA (siRNA) and an RNA-induced silencing complex (RISC) in RNA interference (RNAi), a mechanism of gene knockdown in eukaryotes. 77 In order to better understand the mechanism by which siRNA and RISC interact, Hernández et al. developed size-expanded nucleobase RNAs. 78 Using these modified guide strands, they found that incorporating the size-expanded nucleobases from positions 2 to 11 results in poor RNAi activity, indicating that RISC likely binds very tightly to the RNA strand at these positions. This observation is in agreement with a known crystal structure of a RISC-siRNA complex. 79 Interestingly, modifications at positions 16 to 19 do not have a significant impact on gene silencing, suggesting that there is some degree of flexibility at the 3′ ends of the guide strands in RISC. Consequently, the size-expanded nucleobases can be included at these positions to improve the nuclease resistance of externally administered siRNA.

XNAs in the study of nucleic acids in vivo
Nucleic acids are highly susceptible to chemical and enzymatic degradation and have limited chemical diversity. XNAs circumvent these problems through chemical modifications, thereby providing greater stability, better sensitivity, and more desirable physicochemical properties, while retaining or even improving the specificity of nucleic acids. Consequently, a natural application of XNAs is their use as hybridization probes. Hybridization probes are short to medium length XNAs, typically between 25 and 100 bases long, that detect nucleic acid sequences complementary to the probes. 80 Many hybridization probes use fluorophores as reporters due to the relative ease of experimental handling and visualization. Well-established examples of hybridization probes are molecular beacons (MBs), 81 binary probes (BPs), 82 forced-intercalation (FIT) probes, 83 and fluorescence in situ hybridization (FISH) probes. 84 The mechanisms of action of these probes are depicted in Fig. 9. While hybridization probes made from DNA have undoubtedly proven useful for in vitro studies such as real-time PCR, the presence of nucleases and nucleic acid-binding proteins inside cells can cause the fluorophores and quenchers of these hybridization probes to separate even when they are not bound to their target sequences, thus possibly leading to false positives. 85 To overcome the shortcomings of MBs made from DNA, Wang et al. synthesized MBs using LNA. 86 The presence of a covalent bridge between the 2′-O and 4′-C of the ribose moiety in LNA restricts conformational changes in the sugar, resulting in exceptional LNA/DNA duplex stability due to increased base-stacking interactions. 87,88 In addition, LNA/DNA duplexes obey Watson-Crick base-pairing rules and show a larger decrease in melting temperature in the presence of mismatches than DNA/DNA duplexes. 89,90 Furthermore, Wang et al. found that LNA is highly resistant to nucleases and single-stranded DNA binding proteins in vivo, resulting in an exceptionally low background compared to DNA MBs inside cells. 86 In addition, the LNA MBs exhibited selectivity for single-nucleotide polymorphisms (SNPs) superior to that of their DNA counterparts. It is therefore hardly surprising that LNA MBs quickly found applications in the fields of medical genetics, 91-94 biosensors, 95-98 and microbiology, 99,100 among others. 101,102
A perennial problem with DNA MBs is that the stem portion can participate in base pairing with off-target sequences when the MBs open, resulting in false positives. This is especially so in a cellular environment where the genome is present. To remedy this, Crey-Desbiolles and colleagues replaced the nucleotides in the stem portion of MBs with β-D-2′,3′-dideoxyglucopyranosyl (6′ → 4′)-linked nucleotides, also known as homo-DNA. 103 In homo-DNA, a pyranose moiety replaces the furanose of natural DNA (Fig. 10). Homo-DNA is similar to HNA, except that the base is attached at the 1′ position of the pyranose instead of the 2′ position as in HNA. Homo-DNA forms a unique double helical structure, which explains why homo-DNA is unable to form a duplex with DNA. 104 The homo-DNA modified MBs simplify stem design and have been shown to be significantly more selective than DNA MBs. Not long after, Kim et al. developed MBs with stems made from L-DNA, the enantiomeric form of natural D-DNA (Fig. 10), and loops synthesized with 2′-O-Me modified RNA. 105 The modified stem nucleotides form duplexes only with themselves, eliminating unwanted stem-loop interactions, while the addition of a methyl group at the 2′ position of the RNA ribose abolishes RNase activity, 106 improving stability in vivo and ultimately leading to more sensitive and specific MBs.
The ability of 2′-O-Me modified RNA to resist nuclease activity was also exploited by Nilsson and co-workers in the development of MBs for monitoring rolling-circle amplification (RCA). 107 RCA is an isothermal nucleic acid amplification technique that uses a circular DNA template to generate a very long nucleic acid strand consisting of many tandem copies of the template, and is used in diagnostics and nanotechnology. 108,109 Φ29 DNA polymerase is commonly employed in RCA due to its remarkable strand-displacing ability and processivity. 110 However, the 3′ exonuclease activity of Φ29 DNA polymerase prevents the use of DNA MBs. 111 In order to reliably and easily [...] 107
BPs are an alternative to MBs in nucleotide sequence detection, especially in SNP discrimination. BPs work by exciting the donor fluorophore on one of the two probes; Förster resonance energy transfer from the donor to an acceptor fluorophore then produces fluorescence. BPs are sometimes preferred over MBs as they are reportedly more selective in differentiating single-base changes, whereas the stem region of MBs can result in off-target hybridization. 112,113 BPs are also more amenable to mixing and matching, making them more economical than MBs for the detection of a large number of samples of slightly different nucleotide sequences. However, perhaps due to the commercial dominance of MBs over BPs, only a handful of attempts at XNA-substituted BPs have been reported. [...] 116,117
FIT probes are relatively new PNA-based hybridization probes first developed by Kohler and Seitz. 83 Instead of the fluorophores being attached to the ends of the probes as with most other hybridization probes, thiazole orange (TO) molecules are incorporated into the probes. TO is a well-established fluorescent DNA intercalator 118 and is able to pair well with all four natural nucleobases. 119 The fluorescence of TO is sensitive to its local environment: only when the bases in the FIT probe flanking TO are complementary to the target can TO intercalate into the duplex, which restricts rotation around its central methine bridge and thereby results in strong fluorescence. 120 The fluorescence of TO is significantly weaker if the central methine bridge is allowed to rotate. This happens when there is a base mismatch adjacent to TO. The flexible PNA backbone allows the relatively large TO molecule to be accommodated in a stable PNA/DNA duplex. [...] 123
FISH is used to locate nucleotide sequences in formaldehyde-fixed cells. It was first developed to locate genes on a chromosome, 84 but can also detect mRNA and miRNA to determine gene expression and localization. 124 PNA is the most popular XNA substitute for constructing FISH probes. There are three reasons for its success: (i) the neutral backbone of PNA facilitates diffusion across cell membranes and also leads to an increased rate and stability of hybridization with DNA and RNA targets in fixed cells; 125 (ii) hybridization can occur at very low ionic strengths, which is advantageous because chemically denatured genomic DNA strands can re-anneal with each other at high ionic strengths, 126 preventing the binding of probes; and (iii) PNA probes are highly amenable to the attachment of a wide selection of fluorophores, 127 thus greatly simplifying multiplex imaging. [...] 133 The broad applicability of PNA FISH probes is perhaps exemplified by the fact that they can even diffuse across the highly hydrophobic and dense cell membrane of Mycobacterium sp., 134 a feat that cannot be accomplished by the intrinsically charged natural nucleic acid probes. 135 In a similar way, MO-based FISH probes have also been found to be convenient for direct delivery into zebrafish tissues. 136 However, the uncharged backbones of PNAs and MOs lower their hydrophilicity, predisposing these XNAs to aggregation and nonspecific hydrophobic interactions with proteins. 137 Therefore, it may be prudent to introduce hydrophilic groups on uncharged XNA chains, such as on the nucleobases, to improve aqueous solubility. 138
While FISH probes have indubitably proven valuable for scientists and clinicians, current protocols require fixing the cells in formaldehyde prior to probe delivery and involve extensive washing steps, as probes may sometimes fluoresce even when not bound to their targets. It is desirable to image live cells directly in order to decrease analysis time, avoid artefacts that may emerge due to cell fixation, and observe cellular dynamics. To this end, hybridization probes that can be delivered into cells with the aid of low detergent concentrations and/or chemical modifications have been developed. 139 FISH probes that exhibit low background but high fluorescence only when bound to their targets, i.e. a "turn-on" signal, have been achieved by attaching TO-derived fluorophores to the DNA backbone. Another approach is to first deliver a strongly fluorescent probe covalently linked to an azidoether-linked quencher and allow it to hybridize with one part of a target sequence. 140 Afterwards, another probe that binds to the other part of the target sequence is delivered. This probe contains a triphenylphosphine group that reduces the azidoether to an N,O-acetal through the Staudinger reaction (Fig. 11). 141 The acetal is rapidly hydrolyzed to release the quencher, resulting in rapid turn-on fluorescence. 142,143

XNAs in the study of nucleic acid-small molecule interactions
While XNAs have been extensively developed for targeting nucleic acids and proteins, there is no reason to limit the applications of XNAs to biomacromolecules. The strategy for developing high-affinity XNA-based entities such as aptamers for targeting small molecules is to expand the chemical space available to aptamers by using XNAs. Demonstrating this principle, Imaizumi and co-workers incorporated chemically modified uracils (Fig. 12) into DNA and, through systematic evolution of ligands by exponential enrichment (SELEX), discovered aptamers that bind to thalidomide and camptothecin with KD values of 1.0 µM and 40 nM, respectively, 144 which represents a vast improvement over DNA aptamers.
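To give a feel for what the two dissociation constants quoted above mean in practice, the following is a minimal sketch that assumes simple 1:1 binding with the small molecule in large excess over the aptamer, so that the fraction of aptamer bound is [L]/([L] + KD). The function name and the example ligand concentration are illustrative assumptions and are not taken from the cited work.

```python
def fraction_bound(ligand_conc_molar: float, kd_molar: float) -> float:
    """Fraction of aptamer occupied under a simple 1:1 binding model.

    Assumes the ligand is in large excess, so free ligand ~ total ligand.
    """
    if ligand_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ligand_conc_molar / (ligand_conc_molar + kd_molar)


# KD values quoted in the text (ref. 144), converted to mol/L
KD_THALIDOMIDE = 1.0e-6   # 1.0 uM
KD_CAMPTOTHECIN = 40e-9   # 40 nM

if __name__ == "__main__":
    ligand = 100e-9  # hypothetical analyte concentration of 100 nM
    for name, kd in [("thalidomide", KD_THALIDOMIDE),
                     ("camptothecin", KD_CAMPTOTHECIN)]:
        print(f"{name}: {fraction_bound(ligand, kd):.1%} of aptamer bound")
```

Under this simplified model, half of the aptamer is occupied when the ligand concentration equals KD, which is why a lower KD translates into usable signal at lower analyte concentrations.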
On the other hand, a burgeoning trend is to exploit the modularity and strong signals of oligodeoxyfluorosides (ODFs) to improve current methods for the detection of small molecules and ions. ODFs are in essence modified DNA oligonucleotides in which the bases are replaced by fluorophores (Fig. 13). 145 The overall strategy is to generate a combinatorial library of ODF tetramers that incorporate various fluorescent and non-fluorescent monomers. Screening is accomplished on the polystyrene beads on which the tetramers are synthesized, thus facilitating deconvolution. ODFs have found applications in detecting and semi-quantifying diverse classes of small molecules and ionic analytes, with detection limits commonly in the low micromolar to nanomolar range. Examples of these analytes include volatile organic molecules, 146 anionic water pollutants, 147 petroleum products 148 and metal ions. 149 Recent improvements include the development of sensor arrays that allow a small library of tetrameric ODFs to differentiate a large number of analytes 150 and ODFs that can be inexpensively printed on paper. 151 These developments place ODFs in a good position to be developed as convenient and economical sensors for widespread applications such as the monitoring of food spoilage and of the environment, especially in areas where laboratory testing is prohibitively expensive and impractical.
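The combinatorial logic behind such a tetramer library can be sketched in a few lines. The monomer labels below are hypothetical placeholders (the actual fluorescent and non-fluorescent monomers are described in the cited work); the sketch only illustrates that a set of n monomers yields n⁴ ordered tetramers.

```python
from itertools import product

# Hypothetical one-letter labels standing in for ODF monomers
# (fluorescent or non-fluorescent); not the names used in the cited work.
MONOMERS = ["Y", "E", "B", "K", "S"]


def tetramer_library(monomers):
    """Return every ordered tetramer that can be built from the given monomers."""
    return ["".join(combo) for combo in product(monomers, repeat=4)]


if __name__ == "__main__":
    library = tetramer_library(MONOMERS)
    print(f"{len(MONOMERS)} monomers give {len(library)} tetramers")
```

Because the library grows with the fourth power of the number of monomers, even a modest monomer set produces hundreds of candidate sensors, which is what makes on-bead screening attractive.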

XNAs in bioassays
The most popular XNAs employed in the construction of nucleic acid bioassays are LNA, PNA, and MO. [154] LNA probes also show good sensitivity and are commonly considered when the sensitivity of detection needs to be improved. 155 LNA probes have been used for simple and specific DNA detection of chronic myeloid leukemia and acute promyelocytic leukemia. 156 The signal generated upon hybridization can also be amplified using different approaches. 157,158
Recently, an approach based on locking the furanose ring via a methylene linkage between the 4′-C and 2′-O has been developed. 159,160
PNA has also been widely employed in the construction of nucleic acid bioassays. PNA was first investigated by Nielsen and colleagues in 1991 as a means to interact specifically with double-stranded DNA in a triple-helix fashion. 161,162 As a charge-neutral XNA, PNA has a greater binding affinity for its complementary nucleic acid owing to the absence of electrostatic repulsion. 163 The high binding affinity also allows closely related sequences to be distinguished, even at the single-base level. A representative example is described by Fan et al., in which hybridized anionic target miRNA strands are utilized to attract protonated aniline molecules, which subsequently polymerize along those strands (Fig. 14). 164 Another example was reported by Su and co-worker, 165 in which DNA/PNA hybridization is studied using a PNA-immobilized microwell plate. After hybridization with sampled DNA strands, the negative charges brought to the microplate are exploited to introduce cationic horseradish peroxidase through electrostatic interaction and thereby produce a detectable signal. 165 The current trend in the development of PNA-based probes is the modification or extension of their backbones or linkers. 162 Changes in the conformation of PNA strongly affect hybridization, detection, and regeneration. For instance, the design of a PNA backbone can lead to label-free bioassays. 166
Another charge-neutral XNA engaged in nucleic acid bioassays is MO. Several attempts have been made to apply MO in the construction of nucleic acid bioassays. 8][169] Likewise, hybridization brings a large number of negative charges into the bioassay, offering a convenient means to introduce signaling units through electrostatic interaction. For example, a cationic redox polymer, acting as a signal generator, is introduced through electrostatic interaction after hybridization for the detection of nucleic acids. 167
To effectively alleviate the problem of extensive secondary structures, which severely hinder hybridization, Zu et al. proposed a colorimetric assay for the detection of nucleic acids under extremely low salt conditions (Fig. 15). 168 Since the base-pairing of nucleic acids is highly dependent on temperature and salt concentration, secondary structures are less stable, and the target sequences more accessible, under low salt conditions at moderate temperatures. To accomplish their goal, practically salt-independent probes such as MO are engaged. It was found that a total salt concentration as low as 2.5 mM is sufficient to carry out hybridization successfully. Their assay worked effectively in detecting sequences that are likely to form secondary structures.
Based on experimental evidence gathered so far, the charge-neutral XNAs are better probes for nucleic acid bioassays. There are several distinct advantages in substituting nucleic acids with charge-neutral XNAs such as PNA and MO, including higher hybridization efficiencies, better sequence specificities, greater resistance to enzymatic degradation, and lower salt-dependence of hybridization. 170,171 In addition, the negative charges brought into the bioassays by hybridized nucleic acid strands provide an additional strategy for signal generation. However, these XNAs have not yet replaced nucleic acid probes, primarily because of their high cost.
Compared with other classes of biomolecules, proteins and peptides are inherently more challenging to detect. Widely used techniques in protein detection are antibody-based assays such as enzyme-linked immunosorbent assays and Western blotting. The production of antibodies is, however, laborious and expensive. 172 It is simply not economically viable or feasible to raise antibodies for every potential target protein in the proteomic universe. While antibodies may be irreversibly denatured when exposed to non-physiological temperature, pH, salt concentration, or solvent, aptamers can, in principle, always be renatured and regain their target binding affinity. The ability of aptamers to be denatured and renatured multiple times is another key advantage they have over antibodies. Nucleic acid aptamers have therefore been an exciting avenue for the development of protein probes since their discovery as high affinity binding reagents, first described in the 1990s. 173,174 Since then, a number of nucleic acid aptamers have been generated to bind protein biomarkers of diseases such as VEGF, 175,176 thrombin, 177 HIV related proteins, 178,179 PDGF, 180 and NF-κB. 181 Also, numerous aptamer-based assays have since been reported and reviewed. 182,183 These assays have been reported to have detection limits as low as the fg ml⁻¹ level. However, the key constraints of aptamer-based bioassays are: (i) the difficulty of developing aptamers with high affinity and specificity, owing to the limited chemical diversity of natural nucleotides, and (ii) the susceptibility of nucleic acids to enzymatic digestion in cells. XNA aptamers are therefore well positioned to circumvent these limitations, and several groups have indeed reported enhanced binding capabilities of XNA aptamers to various protein targets, as summarized in Table 2.
Interestingly, it has been demonstrated that the Ds-Px base pair can be used to generate XNA aptamers that could outcompete previously developed nucleic acid aptamers in binding to IFN-γ and VEGF. 201 Experimental evidence suggested that the availability of hydrophobic residues on the aptamer facilitates binding by interacting with the hydrophobic domains present on target proteins.
The poster child for protein-binding XNA aptamers would perhaps be the slow off-rate modified aptamers (SOMAmers). Selected by SELEX, SOMAmers are DNA aptamers containing 2′-deoxyuridine nucleotides that have modifications on the C5 position of the base. 202 These modifications decorate uracil with a variety of residues that attempt to mimic and even transcend the chemical diversity displayed by amino acids, as illustrated in Fig. 16. The first generation of SOMAmers, introduced in 2010, has been shown to be capable of binding to many human protein targets. 202 So far, SOMAmers specific to more than one thousand different human proteins have been discovered. Moreover, the SOMAscan™ proteomic assay developed based on these SOMAmers is now commercially available. The principle behind the assay is simple and elegant: use the library of characterized SOMAmers to bind proteins, pull down the protein-SOMAmer complexes, wash off non-specific interactions, and deconvolute the products with a DNA microarray that hybridizes to the coded non-binding ends of SOMAmers. 203
To better understand the mechanism of SOMAmers, Davies et al. 204 and Gelinas and co-workers 205 crystallized complexes of SOMAmers bound to their specific targets, IL-6 and PDGF-BB. 206 These structures clearly suggest that the high affinities between SOMAmers and IL-6 and PDGF are largely due to shape complementarity to the hydrophobic pockets of the proteins. These structures further corroborate the observation that modifications containing aromatic hydrophobic moieties tend to produce the best performing aptamers. 207
Since 2012, SOMAscan™ has been successfully employed by many groups to identify biomarkers of various diseases ranging from neurodegeneration to cancer. The 1129 human protein targets detectable by the SOMAscan™ assay cover a wide range of protein types involved in various key cellular processes and diseases. 208
To further demonstrate the applicability of SOMAscan™ as a proteomic technology relevant in a clinical setting, Gold et al. applied SOMAmers to identify novel biomarkers for chronic kidney disease 202 and non-small cell lung cancer. 209 They reported a measurable dynamic range of eight orders of magnitude and a median detection limit of 40 fM for the SOMAmers used in the SOMAscan™ assay. 210 The number of biomarkers identified by using SOMAscan™ is expected to grow over the coming years as more researchers adopt SOMAscan™ for future proteomic applications due to its highly competitive properties and automatable workflow. 211,212
DNA microarrays have been an indispensable tool for probing the complexity of biological systems since their inception in the 1990s. 213 The capability to decipher and individually probe specific targets in highly complex biological matrices in a simple yet massively high-throughput manner is crucial for the advancement of genomics and transcriptomics. 214,215
Unfortunately, protein microarrays have lagged behind, but not for lack of trying. The most effective technique for proteomics thus far has been mass spectrometry. Even so, complex approaches of tandem mass spectrometry, such as selected reaction monitoring, 216 sequential window acquisition of all theoretical fragment-ion spectra, 217 and other labelling strategies 218,219 have to be incorporated into mass spectrometry in order to make sense of the complex data generated. Being able to apply microarray technology to proteomics would tremendously aid the deconvolution process. XNA aptamers currently present a highly feasible transduction interface between the two. As previously mentioned, SOMAscan™ technology does this by having the non-binding ends of the SOMAmers code for specific sequences, which can be printed onto microarrays and used for deconvolution. 203 In the case of XNA microarrays, PNA tags have been used to essentially "barcode" members of split and mix peptide libraries. 220,221 For example, Diaz-Mochón et al. created PNA tagged protease substrate libraries to develop a rapid and high-throughput method for studying protease consensus sequences. 221 While XNAs are well positioned to act as a much-needed bioaffinity interface to a microarray for proteomic applications, they are also very capable of being refashioned into transducers between complex combinatorial libraries and microarrays. Following the same train of thought, the strategy could be extended to the study of other important enzymatic processes. One such example, reported by Diaz-Mochón et al., was the identification of Abl protein kinase substrate specificity. 221 It is then conceivable to use a similar strategy for better understanding the consensus sequences that signal for other post-translational modifications.
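The decoding idea shared by the coded SOMAmer ends and the PNA-tagged libraries can be illustrated with a toy lookup. This is only a sketch under stated assumptions: the tag sequences, the member names, and the function names are invented for illustration, DNA-style tags are used for simplicity, and real arrays rely on hybridization intensities rather than exact string matching.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Hypothetical tag sequences and the library members they encode.
TAG_TO_MEMBER = {
    "AAGGCT": "member_1",
    "CTTGAC": "member_2",
    "GGATCA": "member_3",
}


def build_array(tag_to_member):
    """Map each capture probe (reverse complement of a tag) to its library member."""
    return {reverse_complement(tag): member for tag, member in tag_to_member.items()}


def decode(lit_probes, array):
    """Translate the capture probes that gave a signal into library members."""
    return sorted(array[probe] for probe in lit_probes if probe in array)


if __name__ == "__main__":
    array = build_array(TAG_TO_MEMBER)
    # Suppose the spots carrying these two capture probes light up.
    print(decode(["AGCCTT", "TGATCC"], array))
```

The design choice worth noting is that the array never needs to recognise the protein or peptide itself; it only reads the nucleic acid code attached to it, which is what lets established microarray technology be reused for proteomic questions.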

XNAs in biocatalysis
XNAs have also shown great promise in biocatalysis. A good example is the creation of catalytic XNAs (XNAzymes) that ligate and cleave RNA. 222 Several interesting RNase-like XNAzymes selected through a bimolecular approach efficiently cleave RNA strands (Fig. 17). Being the most promising XNAzyme, F2R17 was examined further. RNase-like activity with reasonable regioselectivity was still observable even when F2R17 was reduced to only 39 nucleotides. Compared with the uncatalysed reaction, the truncated F2R17 showed a 10⁴-fold increase in the reaction rate. After structural fine-tuning, it was further demonstrated that the XNAzymes also exhibit ligase-like activity. Experimental evidence indicates that ligation of RNA takes place only in the presence of XNAzymes. This represents an exciting field with great potential to be developed further into new enzyme mimetics that can catalyze unconventional reactions. Limitations of biocatalytic XNAs that urgently need to be addressed are their low fidelity and the limited availability of XNA polymerases and modified nucleotides, which lead to less stringent selection processes, under-sampling of the library, and tedious comparisons between XNA and nucleic acid sequence spaces. As the fidelity of the XNA system is normally a few orders of magnitude lower than that of nucleic acid systems, it is much more challenging to identify XNA sequences encoding the best catalytic moieties.
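Why lower fidelity makes selection harder can be seen from a rough calculation. Assuming, as a simplification, that errors are independent and occur with a constant per-position probability e, the chance of copying an n-nucleotide sequence without any error is (1 − e)ⁿ. The two error rates used below are illustrative assumptions chosen to differ by a few orders of magnitude; they are not measured values from the cited work.

```python
def error_free_fraction(per_base_error: float, length: int) -> float:
    """Probability that a sequence of the given length is copied with no errors,
    assuming independent errors at a constant per-position rate."""
    if not 0.0 <= per_base_error <= 1.0 or length < 0:
        raise ValueError("need 0 <= per_base_error <= 1 and length >= 0")
    return (1.0 - per_base_error) ** length


def sequence_space(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


if __name__ == "__main__":
    length = 39  # length of the truncated F2R17 mentioned in the text
    for label, rate in [("hypothetical high-fidelity system", 1e-5),
                        ("hypothetical low-fidelity system", 1e-2)]:
        print(f"{label}: {error_free_fraction(rate, length):.3f} copied error-free")
    print(f"distinct {length}-mers: {sequence_space(length):.2e}")
```

Under these assumptions, a noticeable share of copies in the low-fidelity case carries at least one error per round, so enriched sequences are blurred across rounds while the sequence space to be sampled remains enormous.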

Conclusions and outlooks
Essentially all XNAs are currently synthesized by solid-phase chemistry that is largely similar to or, in cases where the XNAs have phosphate-sugar backbones, the same as the phosphoramidite chemistry used to chemically synthesize nucleic acids. Solid-phase synthesis of nucleic acids is highly efficient, with a stepwise yield of >99% routinely achieved. 223 However, the yield of oligonucleotide products decreases exponentially with increasing chain length. To illustrate, with a coupling efficiency of 99.5%, the yield after 100 cycles is only about 60%. As in the case of nucleic acid synthesis, the yield of XNAs also decreases exponentially with increasing chain length. Applications that involve long XNA strands can therefore be constrained by the current limitations of synthetic chemistry. In an attempt to overcome this limitation, DNA-templated chemical generation of XNAs has recently been reported, 224 but the length and fidelity leave much to be desired for the above applications. It is therefore necessary to turn to Nature's chemists, enzymes, for accurately making long XNAs. However, the mechanisms by which most polymerases implement high fidelity and specificity also prevent the incorporation of chemically modified nucleotides. Hence, every new modification must be laboriously tested with a plethora of natural and engineered polymerases. In other words, conventional biochemistry cannot keep up with the diversity easily generated by chemical synthesis. Additionally, polymerases are cornerstones of most nucleotide sequencing techniques. Current XNA sequencing techniques may give abnormal results 225 or involve converting the XNA sequence to DNA, thereby requiring additional steps. 226 More reliable and expeditious sequencing techniques would be immensely beneficial for the deconvolution of XNAs, especially in evolution experiments where large XNA libraries are continuously generated. Indeed, the main advantage of using DNA or RNA over XNAs in SELEX is the wide selection of polymerases available, and the cost of such polymerases is typically the limiting factor. An obvious solution to this limitation would be to evolve new polymerases capable of incorporating these unnatural nucleotides. While several commercially available polymerases are able to incorporate unnatural nucleotides, their processivity and fidelity are less than ideal. 227 Nevertheless, these polymerases could serve as starting templates from which newer and more efficient XNA polymerases could be evolved or engineered. In the search for improved XNA polymerases, encouraging results can be found in the work of the Holliger 47 and Romesberg groups. 228
Another way around the synthesis issue would perhaps be the use of XNA oligomers that exhibit ligase activities. It is then conceivable to develop DNAzymes or even self-catalytic XNAzymes to perform the polymerization of XNAs. 0][231] With ligase activity in hand, a bolder though not impossible development would be to create polymerases made from XNAs. The main advantages of using catalytic nucleic acids in methods requiring thermal cycling are the inherent thermal stability of nucleic acids and their capability to refold and regenerate their desired properties. In contrast, renaturation and restoration of activity after heating are not possible for most enzymes. Promising results have recently been reported by Taylor et al. 222 Additionally, XNAzymes with catalytic capabilities could provide an efficient means to amplify analytical signals in bioassays. Extending this idea to XNA nanotechnology and nanostructures, it would be interesting to engineer XNA self-assemblies that produce XNA nanostructures capable of coordinating analytes for the construction of highly sensitive bioassays. Moreover, since XNA aptamers have great potential to be developed into high affinity ligands, it is not too much of a stretch to envision developing toolkits for molecular biology and molecular therapy.
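The exponential dependence of yield on chain length noted earlier in this section follows directly from multiplying the stepwise yields. The short sketch below reproduces the quoted figure (99.5% per cycle, 100 cycles) and, under the simplifying assumption of a constant coupling efficiency with no other losses, shows how quickly the yield falls for longer strands. The function name and the additional cycle counts are illustrative choices.

```python
def full_length_yield(coupling_efficiency: float, cycles: int) -> float:
    """Fraction of chains that are full length after the given number of cycles,
    assuming the same coupling efficiency at every step and no other losses."""
    if not 0.0 <= coupling_efficiency <= 1.0 or cycles < 0:
        raise ValueError("need 0 <= coupling_efficiency <= 1 and cycles >= 0")
    return coupling_efficiency ** cycles


if __name__ == "__main__":
    for cycles in (20, 50, 100, 200):
        y = full_length_yield(0.995, cycles)
        print(f"{cycles:>3} cycles at 99.5% per step: {y:.1%} full-length product")
```

For 100 cycles this gives roughly 61%, consistent with the figure in the text; doubling the number of cycles squares the fraction, which is why purely chemical synthesis of long XNAs becomes impractical.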
For the most part in this review article, we have highlighted numerous methods that have been established for the creative and apt use of XNAs. It is, however, unfortunate that there is a clear lack of new applications for XNAs in purification techniques. While XNA aptamers have great potential for developing high affinity reagents, their use has not translated into purification technologies. There have been few attempts to develop methodologies for the use of aptamers in purification techniques, and these have met with limited success. One such example can be found in a report by Javaherian et al., in which a method for developing DNA aptamers against protein targets in cell lysate and using the raised aptamers for purification was described. 232 Their technique would be akin to attempting to raise polyclonal antibodies for protein purification, albeit at a faster and cheaper pace with the SELEX principle at its heart. As promising as this technique is, the purification process is still relatively extensive and tedious, but perhaps feasible if coupled to an automation system using XNA aptamers in a manner similar to the SomaLogic methodology. 212 To speculate further, should there eventually be a technique for the convenient synthesis of long XNA sequences, it would be plausible to develop XNA purification "gels". In 2012, Zhao et al. applied RCA to produce long DNA strands of repeating aptamer units specific to protein tyrosine kinases to create a "gel" that is capable of capturing cancer cells at a higher efficiency than single fixed aptamers or antibodies. 233
While the capture targets for their purpose were cells, it is not too much of a stretch to envision developing a gel-like material that can be packed onto a column and specifically targets small molecules or other biological macromolecules. Similarly, with enzymatic synthesis, long strands of XNAs or metal ion-incorporated XNAs could be structured using DNA origami methods to fashion a new generation of metal-organic frameworks potentially applicable as chromatographic stationary phases. 234
While Nature already provides toolkits for detecting, purifying, and manipulating biological systems, sometimes these tools need to be improved using synthetic chemistry. Nucleic acids are highly amenable to chemical modifications due to their modularity and high coupling efficiency, thereby lowering the technical barrier to adopting XNA technology. Indeed, while XNAs were first developed by researchers seeking to emulate nature, some types of XNAs, such as LNA, PNA, and MO, have already been commercialized and are used by biologists with little or no synthetic chemistry experience. On the other hand, XNAs containing man-made nucleobases were initially developed to better understand the physicochemical properties and biology of nucleic acids, but it was quickly realized that they can be tailor-made to overcome common shortcomings of nucleic acids in technical applications, such as in vivo stability and specificity. Although short to medium length XNAs are sufficient for many applications, problems in accurately and affordably synthesizing long XNAs may be the bottleneck in applications such as XNAzymes and nanotechnology. Nevertheless, there is intense ongoing research into developing more efficient polymerases for XNAs. With increasing relevance in molecular biology, proteomics, and diagnostics, among other fields, XNAs epitomize one of the key goals of synthetic biology, which is to engineer biology to overcome problems and rise up to greater challenges.
This journal is © The Royal Society of Chemistry 2016

Mr Yong Quan Tan was born in Singapore. He received his B.Sc. in Chemistry from NUS in 2015. He is currently pursuing his Ph.D. at NUS. His research is focused on the development of synthetic microcompartments for improving biosynthesis of value-added chemicals in microorganisms.
Mr Garrett Wong was born in Singapore. He received his B.Sc. (Honors) in Life Science from NUS in 2015. He is currently pursuing his Ph.D. at NUS and Imperial College London. His research is focused on the metabolic engineering of microorganisms for the production of ergot alkaloids.

Fig. 1 A hypothetical unnatural base pair (X-Y) that can function in DNA replication and transcription, allowing ribosomal incorporation of unnatural amino acids during translation.
Dr Zhiqiang Gao is an associate professor at the Department of Chemistry, NUS. He received his B.Sc. and Ph.D. in Chemistry from Wuhan University. In the following years, he worked as a postdoctoral fellow at Åbo Akademi University and the Weizmann Institute of Science. After spending three years in the United States and eight years at the Institute of Bioengineering and Nanotechnology, he joined NUS in April 2011. Research in his laboratory currently includes bioengineering, renewable energy, electrochemistry, analytical chemistry, and materials science.

Fig. 5 Representative nucleotides with modifications to the phosphodiester backbones.

Fig. 7 Chemical structures of PNA and MO.

Fig. 11 Mechanism of binary FISH probes that exhibit a rapid "turn-on" fluorescence signal (reproduced with permission from ref. 141, Copyright © 2009, American Chemical Society).

Fig. 12 Structure of a chemically modified uracil.

Fig. 14 Schematic illustration of the sensing mechanism involving PNA probes (reproduced with permission from ref. 164, Copyright © 2007, American Chemical Society).

Fig. 15 Schematic presentation of the colorimetric detection of nucleic acids using MO probes under extremely low salt conditions (reproduced with permission from ref. 168, Copyright © 2007, American Chemical Society).

Fig. 16 A partial representation of modifications at the C5-position of deoxyuridine available for the preparation of SOMAmers.

Fig. 17 Examples of XNAzymes that have RNA cleavage and ligation properties (reproduced with permission from ref. 222, Copyright © 2015, Nature Publishing Group).

Table 1
Summary of the size and shape complementary base pairs