Natalie Khamissi, Christopher Korfmann, Areeba Chaudhry and Ryan Hili*
Department of Chemistry, Centre for Research on Biomolecular Interactions, York University, 4700 Keele Street, Toronto, ON M3J 1P3, Canada. E-mail: rhili@yorku.ca; Web: www.yorku.ca/rhili
First published on 21st March 2025
A method to enable the transliteration between various XNA-containing nucleic acids and canonical DNA is described. Using ligase-catalysed oligonucleotide polymerisation (LOOPER), we show that DNA can be used as a template to generate nucleic acid polymers comprising various levels of 2′-fluoro (2′-F), 2′-fluoro-arabinonucleic acid (FANA), 2′-O-methyl (2′-OMe), and locked nucleic acid (LNA) modifications in moderate yields. The fidelity and biases of the LOOPER process were studied in detail for the 2′-F system by developing a hairpin-based sequencing method, which showed fidelities exceeding 95% along with positional and sequence dependencies within the polymerised XNA-containing anticodons. Lastly, we show the ability of LOOPER to regenerate DNA from 2′-F, FANA, 2′-OMe, and LNA in moderate yield and with fidelities over 95%. Taken together, this study demonstrates the potential of LOOPER to serve as a platform for applications where the transliteration between XNA and DNA is needed, such as the in vitro evolution of XNA-containing nucleic acid polymers.
Chemists have long explored the use of unnatural backbones to study and perturb the biology of living systems. For instance, β-peptides can form complex and predictable folding patterns, engage with protein targets, and exhibit profound resistance to proteolysis.1,2 Inspired by their potential, efforts to engineer the protein translation machinery have enabled the translation, evolution, and ultimately the study and application of β-peptides.3,4 Within the domain of genetic polymers, alternative phospho-sugar backbones have been investigated for decades5 with the ultimate goal of understanding the etiology of DNA and RNA in living systems, possibly from a structurally related progenitor.6 More recently, however, nucleic acid polymers with unnatural backbones have come to the fore as a critical advance in synthetic biology with considerable potential in various therapeutic applications. Such polymers, termed xeno-nucleic acids (XNA), are not recognised by endogenous nucleases, thus slowing or precluding their degradation in biological samples. Indeed, this property has prompted their emergence in oligonucleotides used in CRISPR guide RNA7 and various antisense oligonucleotide therapeutics.8
Functional nucleic acids are conventionally discovered through a process called Systematic Evolution of Ligands by EXponential enrichment (SELEX), which involves iterative cycles of a selection pressure against a library of nucleic acids for a desired function (e.g., binding a protein target), and amplifying the survivors. Recent advances in engineered polymerases have enabled the transcription and reverse transcription of several XNA forms,9 including those with nucleobase modifications,10 with high fidelity – a critical step to enable the evolution of functional XNAs using traditional SELEX methods. High-affinity aptamers against protein targets have been isolated from selections involving α-L-threofuranosyl nucleic acid (TNA),11–13 2′-fluoro-arabinonucleic acid (FANA),14–16 and 1′,5′-anhydrohexitol nucleic acid (HNA).17 Despite the considerable advances in polymerase engineering, new conceptual approaches to enzymatically generate XNA are needed to fully explore their potential. In particular, general methods to transcribe and reverse transcribe broad classes of XNA, chimeric XNA, or XNAs bearing nucleobase modifications, are needed to expand the utility of this promising class of biopolymers for therapeutic, diagnostic, and other biotechnology applications.
To this end, ligases such as T3, T4 and T7 DNA ligase18–20 and RNA ligase21 have been explored to enzymatically ligate XNA fragments on a DNA template. While this approach enables rapid synthesis of discrete XNA sequences, the generation and reverse transcription of XNA libraries using ligases has not been investigated. Ligase-catalysed oligonucleotide polymerisation (LOOPER) provides a platform for analogous transcription of DNA into synthetic biopolymers (Fig. 1).22–29 LOOPER involves the nucleic acid-templated ligation of very short oligonucleotide fragments, termed anticodons, in a sequence-defined manner and proceeds with high fidelity when using large combinatorial libraries of templates. Importantly, LOOPER has been shown to accommodate nucleobase-modified anticodons. Indeed, complex libraries can be generated with up to 16 different chemical modifications25 or with peptide fragments up to 8 residues in length.23 LOOPER has also been implemented in SELEX to identify high-affinity aptamers against protein targets, including human α-thrombin,28 PCSK9 and IL-6.27 Given the potential flexibility of LOOPER, we reasoned that DNA-templated XNA library synthesis and XNA-templated DNA library synthesis may be within the scope of the process, thus enabling a potentially general approach to analogous transcription and reverse transcription of XNAs that can be ported into SELEX or other applications, including memory storage (Fig. 2).30
Fig. 1 General process for ligase-catalysed oligonucleotide polymerisation (LOOPER) for the generation of modified DNA.
The ligase used during LOOPER can have a profound influence on the outcome of the process. T4 DNA ligase, which was the original enzyme used in LOOPER, most efficiently polymerises anticodons that are at least pentanucleotides in length.31 Importantly, the anticodon must be chemically modified to enable high fidelity. When polymerising anticodons comprising the ANNNN sequence, where N = A, C, G, T, the incorporation of a C8 modification at the 5′-end adenine increases fidelity from 87% to 95%, presumably through an anti to syn conformational switch along the adenine glycosidic bond, resulting in a more stringent annealing process.31 With fully degenerate pentanucleotide anticodons, NNNNN, which would be required for the proposed XNA system, fidelity decreases to 81%, which is likely too low for most applications. T3 DNA ligase, on the other hand, has been shown to accommodate anticodons as short as three nucleotides in length.27,31 Importantly, while fully degenerate unmodified anticodons such as NNNNN result in very low fidelity (67%), shorter anticodons, such as NNNN and NNN, are incorporated with much higher fidelities of 89% and 97%, respectively.31 For this reason, we pursued T3 as a potential ligase for XNA-based libraries in LOOPER.
Although no full-length product was observed when using 2′-OMe or FANA-modified trinucleotides in LOOPER, partially polymerised products were detected depending on the position of the modification on the anticodon (Fig. S1 and S2†); further optimisation may enable these classes of XNA to be polymerised more efficiently. 2′-OMe modifications at the middle position of the trinucleotide were tolerated best, incorporating up to four trinucleotides. It is important to note that LOOPER with 2′-OMe-modified trinucleotides of a discrete sequence performs much better than in library contexts. For instance, TTT trinucleotides with 2′-OMe modifications at the middle position were well tolerated, resulting in full-length product (Fig. S2a†). While the reason for this has not been fully explored, our lab has previously observed that yield typically decreases as the LOOPER anticodon library size increases.25 This may be due to the decrease in relative concentration of the cognate anticodon at each ligation site. Furthermore, certain anticodon sequences may have superior annealing and ligation kinetics with T3 DNA ligase, which creates challenges in library-based polymerisations. For FANA-modified trinucleotide libraries, the 3′-modified library was the best tolerated by T3 DNA ligase, resulting in three trinucleotide incorporations (Fig. S1b†). This is consistent with what was observed with FANA-modified TTT sequences, where the 3′-end modification exhibited the most efficient polymerisation (Fig. S1†).
We hypothesised that the lack of pairing resulted from the disparity in PCR efficiency between the template (DNA) and product (XNA) strands. Even with more permissive polymerases, it was difficult to control this bias. To overcome this issue, we chose to evaluate the fidelity using a hairpin architecture (Fig. 4). Using this approach, full-length amplicons that go into sequencing would necessarily have been read through both the template and product strands, allowing for straightforward pairing for fidelity analysis.
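The pairing logic behind the hairpin design can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this work. It assumes a hypothetical read layout (template reading frame, then a fixed-length loop, then the product reading frame, with primer regions already trimmed) and scores the product against the reverse complement of the template carried in the same read. The function names and the `frame_len`, `loop_len` and `word` parameters are illustrative assumptions.

```python
# Sketch only: the read layout (template frame + loop + product frame) is an
# assumption made for illustration, not the layout used in the study.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def split_hairpin_read(read: str, frame_len: int, loop_len: int) -> tuple[str, str]:
    """Split one hairpin read into its template and product reading frames."""
    template = read[:frame_len]
    start = frame_len + loop_len
    product = read[start:start + frame_len]
    return template, product


def fidelity(reads: list[str], frame_len: int, loop_len: int, word: int = 1) -> float:
    """Fraction of correctly copied words (word=1: nucleotide, word=3: trinucleotide)."""
    correct = 0
    total = 0
    for read in reads:
        template, product = split_hairpin_read(read, frame_len, loop_len)
        if len(product) != frame_len:
            continue  # skip truncated reads
        expected = reverse_complement(template)
        for i in range(0, frame_len, word):
            total += 1
            correct += expected[i:i + word] == product[i:i + word]
    return correct / total if total else float("nan")


if __name__ == "__main__":
    perfect = "ACGTTA" + "GGGG" + reverse_complement("ACGTTA")
    one_error = "ACGTTA" + "GGGG" + "TAACGA"  # last product base miscopied
    reads = [perfect, one_error]
    print(fidelity(reads, frame_len=6, loop_len=4, word=1))
    print(fidelity(reads, frame_len=6, loop_len=4, word=3))
```

Because each read carries its own template, no separate alignment between template and product reads is needed, which is the practical advantage of the hairpin architecture described above.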
LOOPER was performed along a hairpin template library, and the product was PCR amplified before high-throughput sequencing. An advantage of purifying the hairpin product is that the fully extended product is well resolved from the partial products during PAGE, allowing for efficient purification. The product was validated by attaching a fluorophore to the 5′-end of the primer, which allowed confirmation of the full-length product band for extraction. The correct amplicon length prior to sequencing further validated the fully extended product. All modified hairpin products gave identical PAGE profiles before purification, except for the completely modified 2′-F product. Although the polymerisation of this product by LOOPER was successful, PCR amplification using Q5 DNA polymerase did not result in any product. This could be due to the highly structured hairpin in conjunction with the densely modified strand, which makes the polymerisation difficult. Bst 3.0 polymerase, a polymerase with a larger, more accommodating active site, did not improve PCR efficiency. KOD DNA polymerase gave a faint product band that was not well resolved; however, sequencing confirmed that this band was not the product. Therefore, we were unable to determine the fidelity of the completely modified 2′-F product using the current approach. In addition, the LNA product could not be PCR amplified by Q5 DNA polymerase, so we were not able to perform sequencing on these products.
After fidelity sequencing of the hairpin products, we noticed that the sequencer had difficulties sequencing through the hairpin structure. We determined this using a control hairpin that was extended using Q5 DNA polymerase and sequenced alongside the modified products. This control gave a single nucleotide fidelity of 97.0% and a trinucleotide fidelity of 91.7%; we normalised the fidelities of the modified products to this Q5 DNA polymerase control (Table 1). Analysis of the LOOPER control (unmodified NNN) resulted in a 95.9% normalised fidelity, which is comparable to the fidelity obtained from duplex sequencing (97.0%).31 The data show that the middle-modified 2′-F trinucleotide system, while having the highest polymerisation yield, resulted in the lowest fidelity, which is consistent with our preliminary data using duplex sequencing (Table S1†). This may indicate that this anticodon set has a stronger polymerisation bias for subsets of anticodons. While the single nucleotide fidelities using anticodons with the 2′-F at either the 3′-end or the 5′-end were equivalent, at the trinucleotide level the library with the 2′-F at the 5′-end resulted in the highest fidelities. On the whole, these fidelities are consistent with those used for LOOPER-based aptamer selections,28 which highlights the utility of this method.
Anticodon (b) | Fidelity (1N) (c) | Normalised fidelity (1N) (d) | Fidelity (3N) (e) | Normalised fidelity (3N) (f) |
---|---|---|---|---|
NNN | 94.5% | 97.4% | 87.9% | 95.9% |
NNN | 95.5% | 98.5% | 86.3% | 94.0% |
NNN | 92.6% | 95.5% | 77.3% | 84.3% |
NNN | 95.5% | 98.5% | 84.3% | 92.0% |
NNN | N/A | N/A | N/A | N/A |
Q5 ctrl (g) | 97.0% | 100% | 91.7% | 100% |

(a) Fidelities were determined by Ion Torrent sequencing on an Ion GeneStudio S5 Plus. All LOOPER experiments were conducted on a template comprising a 13 codon (39 mer) reading frame. (b) Italicised letters indicate the location of the XNA nucleotide. (c) Fidelity was calculated at the single nucleotide (1N) level. (d) Fidelity at the single nucleotide (1N) level normalised to the Q5 DNA polymerase standard, which was benchmarked at 100%. (e) Fidelity at the trinucleotide (3N) codon level. (f) Fidelity at the trinucleotide (3N) level normalised to the Q5 DNA polymerase standard, which was benchmarked at 100%. (g) Q5 DNA polymerase control.
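The normalisation to the Q5 DNA polymerase control can be checked with a short Python sketch. It assumes that normalisation is a simple ratio of the raw fidelity to the control fidelity at the same level (1N or 3N), which is our reading of the table footnotes and not a stated formula. The function name `normalise` is illustrative.

```python
# Sketch only: assumes normalised fidelity = raw fidelity / Q5 control fidelity.
Q5_CONTROL_1N = 97.0  # % single nucleotide fidelity of the Q5 control
Q5_CONTROL_3N = 91.7  # % trinucleotide fidelity of the Q5 control


def normalise(raw_pct: float, control_pct: float) -> float:
    """Express a raw fidelity (%) relative to a control benchmarked at 100%."""
    return 100.0 * raw_pct / control_pct


if __name__ == "__main__":
    # Raw (1N, 3N) fidelities of the four sequenced libraries in the table.
    raw_rows = [(94.5, 87.9), (95.5, 86.3), (92.6, 77.3), (95.5, 84.3)]
    for raw_1n, raw_3n in raw_rows:
        n1 = normalise(raw_1n, Q5_CONTROL_1N)
        n3 = normalise(raw_3n, Q5_CONTROL_3N)
        print(f"1N: {raw_1n:.1f}% -> {n1:.1f}%   3N: {raw_3n:.1f}% -> {n3:.1f}%")
```

Under this assumption the computed values agree with the tabulated normalised fidelities to within about 0.1 percentage points; the small residual differences are consistent with the table having been computed from unrounded fidelities.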
We sought to explore trends in the sequencing data to better understand the strengths and shortcomings of the XNA library synthesis method (Fig. 5). We first evaluated the effect of GC content in each trinucleotide library. We observed in the unmodified DNA control that increasing GC content resulted in decreasing fidelity; this can be rationalised by higher GC content facilitating the ligation of stable misannealed anticodons (Fig. 5a). However, with the 2′-F modified libraries the effect was surprisingly reversed. Upon further analysis, we noticed that a select few trinucleotides with 0% GC content in each of the 2′-F libraries have unusually poor fidelity. For example, the average fidelity of all 0% GC-containing anticodons in the 5′-modified 2′-F library is just 77.3%. However, when the ATT anticodon is removed from the average, the 0% GC fidelity increases to 86.0%. Indeed, analysis of sublibraries in each trinucleotide library (Fig. 5b) shows that the A and T sublibraries have significantly lower fidelities at all positions within the trinucleotide anticodon. It is known that the incorporation of 2′-F modifications has considerable effects on DNA thermodynamics, including an increase in thermal melting temperature34 and changes in conformational equilibrium,35 both of which may be at play here.
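The GC-content grouping described above can be sketched in Python. The sketch enumerates all 64 trinucleotides, bins them by the number of G or C bases, and averages a per-anticodon fidelity table within each bin, optionally excluding outliers such as ATT. The averaging is a plain unweighted mean, which is an assumption; the averages reported here may be weighted differently (for example by read count). The fidelity values in the usage example are hypothetical placeholders, not data from this study.

```python
# Sketch only: unweighted per-anticodon averaging is an assumption, and the
# fidelity values used in the demo are hypothetical placeholders.
from collections import defaultdict
from itertools import product


def gc_count(seq: str) -> int:
    """Number of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq)


def anticodons_by_gc(length: int = 3) -> dict[int, list[str]]:
    """Group every possible anticodon of the given length by its GC count."""
    groups: dict[int, list[str]] = defaultdict(list)
    for bases in product("ACGT", repeat=length):
        seq = "".join(bases)
        groups[gc_count(seq)].append(seq)
    return dict(groups)


def mean_fidelity_by_gc(fidelity: dict[str, float], exclude: tuple[str, ...] = ()) -> dict[int, float]:
    """Unweighted mean fidelity per GC count, skipping excluded anticodons."""
    sums: dict[int, float] = defaultdict(float)
    counts: dict[int, int] = defaultdict(int)
    for seq, value in fidelity.items():
        if seq in exclude:
            continue
        sums[gc_count(seq)] += value
        counts[gc_count(seq)] += 1
    return {gc: sums[gc] / counts[gc] for gc in sums}


if __name__ == "__main__":
    groups = anticodons_by_gc()
    print({gc: len(seqs) for gc, seqs in sorted(groups.items())})
    demo = {"ATT": 0.2, "AAA": 0.8, "GCG": 0.9}  # hypothetical fidelities
    print(mean_fidelity_by_gc(demo))
    print(mean_fidelity_by_gc(demo, exclude=("ATT",)))
```

Only 8 of the 64 trinucleotides contain no G or C, so a single poorly performing anticodon can shift the 0% GC average substantially.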
While fidelity plays a large role in oligonucleotide polymerisations, so does polymerisation bias. Bias occurs when anticodons are incorporated into the product strand at frequencies different from those at which their cognate codons occur in the starting template. This can happen in low fidelity systems but can also happen to some degree in high fidelity systems when polymerisation yields are lower. High bias can confound analysis in downstream applications, such as in vitro selections of functional nucleic acids. Here, bias analysis has revealed several trends (Fig. 5c–f). While the unmodified DNA control library displayed little bias (as indicated by data distributed along the diagonal), the XNA-based systems exhibit some outliers (Fig. 5d–f, noted in plots). Amongst the XNA libraries, having the 2′-F modification at the 5′-end of the anticodon resulted in the lowest level of library bias, with just one significant outlier, namely ATT, which had a considerable negative bias. Across all libraries we observed that trinucleotide sequences with a 5′-G base tend to have positive biases; that is, T3 DNA ligase prefers incorporating these trinucleotide sequences over others. This positive bias has in part resulted in several 0% GC sequences having low fidelities. For example, the ATT trinucleotide sequence found in the 5′-modified 2′-F library resulted in a fidelity of only 11.0%. Upon analysis, we observed that GTT has 4-fold more incorporations across from the TAA codon in the template compared to the cognate ATT anticodon.
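One simple way to quantify the bias described above is a log2 ratio of the observed anticodon frequency in the product to the expected frequency from the template. The Python sketch below illustrates this. The log2 metric, the function names and the optional pseudocount are assumptions chosen for illustration and are not necessarily the measure plotted in Fig. 5, and the counts in the usage example are hypothetical.

```python
# Sketch only: the log2 frequency ratio is an illustrative bias metric and the
# counts in the demo are hypothetical, not data from the study.
import math


def frequencies(counts: dict[str, int], keys: list[str], pseudocount: float) -> dict[str, float]:
    """Convert raw counts into frequencies over a shared set of anticodons."""
    total = sum(counts.get(k, 0) + pseudocount for k in keys)
    return {k: (counts.get(k, 0) + pseudocount) / total for k in keys}


def log2_bias(expected: dict[str, int], observed: dict[str, int], pseudocount: float = 1.0) -> dict[str, float]:
    """log2(observed frequency / expected frequency) for each anticodon.

    expected: counts of cognate anticodons implied by the template codons.
    observed: counts of anticodons actually found in the product strand.
    A positive value means over-incorporation, a negative value under-incorporation.
    With pseudocount=0, every anticodon must have non-zero counts in both inputs.
    """
    keys = sorted(set(expected) | set(observed))
    exp_freq = frequencies(expected, keys, pseudocount)
    obs_freq = frequencies(observed, keys, pseudocount)
    return {k: math.log2(obs_freq[k] / exp_freq[k]) for k in keys}


if __name__ == "__main__":
    expected = {"GTT": 100, "ATT": 100}  # hypothetical template-derived counts
    observed = {"GTT": 150, "ATT": 50}   # hypothetical product counts
    for anticodon, bias in log2_bias(expected, observed, pseudocount=0.0).items():
        print(f"{anticodon}: {bias:+.2f}")
```

On such a scale, an unbiased library lies at zero for every anticodon, which corresponds to points falling on the diagonal of an observed-versus-expected frequency plot.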
During our fidelity analysis of XNA synthesis using LOOPER, we experienced difficulty with the reverse transcription of the completely modified 2′-F products and LNA products, and thus we first explored the reverse transcription ability of LOOPER within this context. In this method, polymerisation occurs using an unmodified trinucleotide DNA library along a defined XNA-modified template (Fig. 6). We attempted an LNA-modified template modified at every third base, preserving the modification pattern of the successful middle-modified LNA transcription product; this template could not be PCR amplified by Q5 DNA polymerase (Fig. S6†). LOOPER was able to successfully reverse transcribe the LNA-modified template into DNA (Fig. 6a, lane 6). A considerable amount of duplex remains intact under the denaturing PAGE conditions, likely due to the known increase in thermal stability of LNA-containing duplexes.36 We also tested completely modified XNA templates including 2′-F, 2′-OMe, and FANA, the latter two of which could not be directly amplified by Q5 (Fig. S6†); all were found to be within the scope of the LOOPER method. Similar to LNA, they also exhibited varying duplex:single-strand ratios on denaturing PAGE, which can be attributed to their known increases in thermal stability relative to canonical DNA.34,37,38 Unfortunately, a completely modified LNA template could not be included in this study due to difficulties in its synthesis and the lack of commercial availability of oligos with >20 LNA incorporations.
The products of the LOOPER-mediated reverse transcriptions were purified and their fidelities assessed by Illumina sequencing, and were found to be between 95% and 99% on average at the single-nucleotide level (Fig. 6b–e). Interestingly, FANA exhibited considerably lower fidelity during the reverse transcription, particularly within the middle of the reading frame. Further investigation into this is ongoing, as it is unclear whether this effect depends on the sequence or the length of the hybrid duplex. Nonetheless, these results provide a platform for the evolution of XNA-containing nucleic acid polymers.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5sc00834d
This journal is © The Royal Society of Chemistry 2025