Stefan Matysiak *a, Klaus Hellmuth a, Afaf H. El-Sagheer bc, Arun Shivalingam b, Yavuz Ariyurek d, Marco de Jong a, Martine J. Hollestelle e, Ruud Out a and Tom Brown *b
a Piculet-Biosciences BV, Galileiweg 8, 2333BD Leiden, The Netherlands. E-mail: S.Matysiak@gmx.net
b Department of Chemistry, University of Oxford, Chemistry Research Laboratory, 12 Mansfield Road, Oxford, OX1 3TA, UK. E-mail: tom.brown@chem.ox.ac.uk
c Chemistry Branch, Department of Science and Mathematics, Faculty of Petroleum and Mining Engineering, Suez University, Suez 43721, Egypt
d Leiden Genome Technology Center, Leiden University Medical Center, Leiden, The Netherlands
e Dep. Immunopathology and Blood Coagulation, Sanquin Diagnostic Services, Amsterdam, The Netherlands
First published on 7th December 2017
DNA-encoded ligands are self-assembled into bivalent complexes and chemically ligated to link their identities. To demonstrate their potential as a combinatorial screening platform for avidity interactions, the optimal bivalent aptamer design (exemplar ligands) for human alpha-thrombin is determined in a single round of selection, and the DNA scaffold is then replaced with minimal impact on the final design.
To overcome this limitation efforts have been made to chemically modify aptamers or enhance their interaction via avidity effects. For example, many non-polymerase compatible modifications of thrombin binding aptamers have been reported,4 whilst two different thrombin exosite aptamers have been linked through carbon chains, poly(dA)/poly(dT) repeats, spacer phosphoramidites or sequence optimised spacers to enhance avidity.5–11 However, these processes generally require iterative synthesis, testing and optimisation.
A modular synthetic approach, independent of the nature of the ligand or spacer, for the discovery of high avidity bivalent binders is therefore very compelling. In this regard DNA is a versatile platform with robust and simple rules for self-assembly of two- and three-dimensional objects, some of which have been used to develop DNA-encoded12 multivalent ligands.13–20 Yet, as complexity rises, the use of hybridisation to a template to encode the identity of recovered ligand pairs becomes more error-prone; linking the DNA-encoded ligands to give a direct read-out by next generation sequencing (NGS) would be more powerful, especially if the combinatorial aspect of self-assembly can be retained.
We have incorporated the above design aspects into a DNA-based scaffolding platform that allows the generation of large libraries of homo- or hetero-bivalent ligand pairs from a relatively small number of precursor molecules (Fig. 1). Access to SABLC (Self Assembled Bivalent Ligand Complex) libraries requires the synthesis of a number “N” of individual DNA oligomers, each covalently attached to a specific potential binding sequence, e.g. a small sub-library of aptamer motifs.
Mixing, self-assembly and chemical fixation of these precursor molecules (Scheme 1) in a single tube will result in N² combinations. The benefit of the SABLC principle lies not only in the relatively small number of starting molecules, but also in the fact that the ligand complexes can be derived from a diverse range of molecules which do not have to be read through by DNA polymerases during the decoding process. Therefore, the platform can be designed to allow ligands such as peptides, carbohydrates and small molecules to be incorporated. As in a classic SELEX selection step, the target molecule can be immobilized on a magnetic bead and incubated with the library. Unbound library members are washed away and the remaining bound SABLCs are eluted under stringent conditions. Their embedded identifier sequence is then decoded by first amplifying the DNA scaffold by PCR, followed by next generation sequencing (NGS).
Scheme 1 CuAAC mediated ligation between 3′-propargyl dmC and 5′-azide T across the dimerization region 2 at the end of the SABA DNA hairpin to form a biocompatible triazole (ct) linkage, followed by PCR amplification of the ligated construct (see Fig. 1).
After decoding the SABLC ligands and spacers, optional replacement of the scaffold can be carried out by synthesising analogues that include a short DNA hairpin loop, oligo dT or non-nucleosidic linkers to generate much smaller bivalent ligand molecules, e.g. for therapeutic applications (Fig. 1b). As DNA is simple to modify, visualization tags such as fluorophores or other labels can be readily attached. The SABL complex is then ready to be used in various applications including immuno-PCR.21
Prerequisites for a universal DNA scaffold are: (i) sufficient structural stability to maintain a fixed distance between the two attached ligands and (ii) the ability to denature during PCR to allow amplification and subsequent sequencing. A DNA hairpin of about 20 base pairs fits these criteria well.
A short sequence, specific for each ligand, is needed in each strand to code for the ligands and their corresponding separation. We reasoned that 7 nucleobases, which can code for 4⁷ = 16384 ligand/spacer combinations, should be suitable. The scaffolding hairpin must also include a contiguous DNA sequence in each arm to function as a primer-binding site for PCR amplification and sequencing of the identifier codes for the ligands and spacers. Finally, the scaffold should not be difficult or expensive to synthesize.
Our chosen structure is shown in Fig. 1a. The target-binding regions of a SABL complex are a combination of two identical or different ligands. The distance between them is determined by the double-helical DNA scaffold and the two identical or different spacer molecules that are situated between the ligands and the dimerization region. This partially double stranded DNA complex, which is already quite stable under physiological conditions, is further stabilized by chemical ligation. This ensures that there is no possibility of cross-exchange between DNA strands during the subsequent selection step, so that the embedded information on the specific combination of ligands and their distance relative to each other is not lost during the target-binding, handling and decoding steps. To allow enzymatic DNA amplification for decoding, the two arms need to be covalently linked by means of a biocompatible linker so that a DNA polymerase can read through the two linked DNA strands during PCR. This is achieved via a Cu(I)-catalysed azide–alkyne cycloaddition (CuAAC) reaction between an alkyne moiety at the 3′-terminus of each member of the first library (linked to a dmC residue for synthetic ease) and a 5′-azide group on each member of the second library (Scheme 1).22,23
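The coding capacity of a 7-nucleobase identifier follows from four possible bases at each of seven positions. The following is a minimal Python sketch of that count; the function names are hypothetical and are not part of the published method.

```python
from itertools import product

BASES = "ACGT"  # the four DNA nucleobases


def code_capacity(length: int) -> int:
    """Number of distinct identifier codes of the given length."""
    return len(BASES) ** length


def enumerate_codes(length: int):
    """Yield every possible identifier code of the given length, in A/C/G/T order."""
    for combo in product(BASES, repeat=length):
        yield "".join(combo)


if __name__ == "__main__":
    print(code_capacity(7))  # 4**7
```

In practice not every code would be used, since some sequences may be unsuitable for synthesis or PCR; the count above is only the theoretical upper bound.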
As a first example we used specific nucleic acid aptamers as ligands, which are inhibitors of the blood coagulation factor thrombin. An initial test library was generated from a combination of 8 sequences comprising ligands derived from aptamers TBA27 and G15D24,25 and mutated versions (Table 1). These motifs, which are known to bind to different exosites on thrombin, were either directly attached to the DNA scaffold or linked via a simple T6 or T12 spacer. Individual complementary oligonucleotide pairs were combined and clicked after hybridization to yield self-assembled bivalent aptamers (Table 1). Larger libraries could be made by simultaneous chemical ligation of all pre-hybridized ligand combinations. These complexes were analysed and purified by polyacrylamide gel electrophoresis (PAGE). DNA sequencing was then performed to confirm the sequence fidelity. The triazole linker did not lead to any nucleotide misincorporations, deletions or insertions (ESI Fig. S7†).
ID no. | Name/description | Spacer | Other details |
---|---|---|---|
1 | TBA27-c | T0 | 3′-Propyne dC |
2 | t-T0-G15D | T0 | 5′-Azido-T |
3 | t-T6-G15D | T6 | 5′-Azido-T |
4 | t-T12-G15D | T12 | 5′-Azido-T |
5 | TBA27s-c | T0 | 3′-Propyne dC |
6 | t-T12-G15Ds | T12 | 5′-Azido-T |
7 (1 + 2) | TBA27-ct-T0-G15D | T0 | CuAAC ligated |
8 (1 + 3) | TBA27-ct-T6-G15D | T6 | CuAAC ligated |
9 (1 + 4) | TBA27-ct-T12-G15D | T12 | CuAAC ligated |
10 (1 + 6) | TBA27-ct-T12-G15Ds | T12 | CuAAC ligated |
11 (5 + 2) | TBA27s-ct-T0-G15D | T0 | CuAAC ligated |
12 (5 + 3) | TBA27s-ct-T6-G15D | T6 | CuAAC ligated |
13 (5 + 4) | TBA27s-ct-T12-G15D | T12 | CuAAC ligated |
14 (5 + 6) | TBA27s-ct-T12-G15Ds | T12 | CuAAC ligated |
15 | G15D | NA | |
16 | TBA27 | NA | |
17 | G15D-T0-TBA27 | T0 | Synthesised |
18 | G15D-T6-TBA27 | T6 | Synthesised |
19 | G15D-T12-TBA27 | T12 | Synthesised |
20 | TBA27-T0-G15D | T0 | Synthesised |
21 | TBA27-T6-G15D | T6 | Synthesised |
22 | TBA27-T12-G15D | T12 | Synthesised |
23 | Reference | NA | Reference |
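The ligated entries 7–14 in Table 1 are every pairing of an alkyne arm (IDs 1 and 5) with an azide arm (IDs 2, 3, 4 and 6). As an illustrative sketch only, the pairing can be enumerated in Python; grouping the arms into two dictionaries and the helper names are our assumptions, while the arm names themselves are taken from the table.

```python
from itertools import product

# Arm names copied from Table 1; the split into two groups is our reading of it.
alkyne_arms = {1: "TBA27-c", 5: "TBA27s-c"}  # 3'-propyne dC arms
azide_arms = {
    2: "t-T0-G15D",
    3: "t-T6-G15D",
    4: "t-T12-G15D",
    6: "t-T12-G15Ds",
}  # 5'-azido-T arms


def ligated_name(alkyne: str, azide: str) -> str:
    """Concatenate arm names so that the trailing 'c' and leading 't' form 'ct'."""
    return alkyne + azide


def all_pairs():
    """Return ((alkyne_id, azide_id), ligated_name) for every arm combination."""
    return [
        ((a_id, z_id), ligated_name(a_name, z_name))
        for (a_id, a_name), (z_id, z_name) in product(
            alkyne_arms.items(), azide_arms.items()
        )
    ]


if __name__ == "__main__":
    for ids, name in all_pairs():
        print(ids, name)
```

With 2 alkyne arms and 4 azide arms this gives 2 × 4 = 8 constructs; a library with N arms on each side would give N × N in the same way.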
A model self-assembled bivalent aptamer (SABA) library was prepared as follows: oligonucleotides 7–13 (each containing at least one thrombin aptamer) were mixed in 1:1 ratios and background DNA 14 (containing two scrambled thrombin aptamers) was added (Table 1). The final mixture comprised 99.125% background DNA (scrambled aptamers) and 0.125% of each of SABA sequences 7–13. The protein target (thrombin) was then immobilized onto magnetic beads.
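The stated composition can be checked with simple arithmetic: seven SABA sequences (7–13) at 0.125% each leave the remainder as background. A small Python sketch of that check, using exact fractions to avoid rounding; the function name is hypothetical.

```python
from fractions import Fraction


def background_percent(n_active: int, percent_each: Fraction) -> Fraction:
    """Percentage of the library left for background DNA."""
    return Fraction(100) - n_active * percent_each


active_ids = range(7, 14)  # SABA 7-13, seven sequences
background = background_percent(len(active_ids), Fraction("0.125"))

if __name__ == "__main__":
    print(float(background))  # 100 - 7 * 0.125
```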
Thrombin-coupled beads, negative control beads and the SABA test library were combined and the bound DNA was amplified by PCR, followed by a second PCR to include the adapter sequences for the Illumina NGS instrument (details in the ESI section 4.vi., Fig. S12†). NGS is essential to provide sufficient sequence information to allow selection of the best binders from a single round of selection. SABA 8 and 9, with T6 and T12 spacers between the binding motifs, were most abundant (>10-fold enrichment, Fig. 2). If no spacer was used (SABA 7) or one of the binding motifs was scrambled (SABA 10–13), the enrichment was far smaller (less than 10-fold), with active deselection of the construct when both motifs were scrambled (SABA 14). This trend is in accordance with the need for both motifs to bind the two exosites and the ∼4.5 nm separation between the exosites (Fig. 3),26,27 which is comparable to the DNA hairpin loop and T6 spacer size of SABA 8 and less than the T12 spacer size of SABA 9.
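Fold enrichment compares how often a construct is read after selection with its share of the starting library. The text does not spell out the formula used, so the sketch below assumes the common definition (read fraction after selection divided by input fraction); the function name and the read counts in the example are hypothetical and are not data from this work.

```python
def fold_enrichment(
    selected_reads: int, total_selected_reads: int, input_fraction: float
) -> float:
    """Read fraction after selection divided by the fraction in the input library.

    Assumed definition; values above 1 indicate enrichment, values below 1
    indicate deselection.
    """
    if total_selected_reads <= 0:
        raise ValueError("total_selected_reads must be positive")
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    return (selected_reads / total_selected_reads) / input_fraction


if __name__ == "__main__":
    # Hypothetical counts: a construct present at 0.125% of the input
    # that accounts for 250 of 10000 reads after selection.
    print(fold_enrichment(250, 10_000, 0.00125))
```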
Fig. 3 Overlay of X-ray crystal structures of human alpha-thrombin in complex with G15D aptamer bound at exosite I (green; PDB ID 4DIH) and TBA27 aptamer bound at exosite II (blue; PDB ID 4I7Y). The distance between the aptamers is approx. 4.5 nm.
Next, activity was measured using aPTT values determined in the presence of various aptamer constructs using normal human plasma (pooled from more than 32 healthy adult donors and obtained from the Sanquin Bloodbank, Fig. 4).28 The aPTT value of the blank (buffer/H2O, below 30 s) was identical to that of the non-thrombin binding sequence 23 (ΔaPTT ∼0 s relative to control). Consistent with the NGS results, SABA 8 and 9 (ΔaPTT > 20 s) outperformed those with a T0 spacer (SABA 7), with one of the two motifs scrambled (SABA 11 and 12) or with only the monomeric motif (Table ID 15/16; all ΔaPTT ∼5 s). Interestingly, SABA 10 and 13, with only functional TBA27 and G15D motifs respectively but T12 spacers, performed better than expected (ΔaPTT ∼15 s). This is possibly due to the unstructured polyanionic T12 spacer being of sufficient size to interact with thrombin (akin to heparin interactions with thrombin), an assertion supported by the greater than blank ΔaPTT of SABA 14 (∼5 s), where both aptamers are scrambled but the T12 spacer is retained.
Fig. 4 Activated partial thromboplastin time (aPTT) values (background subtracted) for each individual or ligand composition.
To refine the bivalent aptamers, DNA scaffolds were replaced with no (T0), T6 or T12 spacers between the two aptamer motifs (ID 17, 18, 19 respectively). For ID 20–22 the order of the aptamer motifs was changed relative to ID 17–19 (TBA27 on the 5′-side of G15D) whilst the spacers stayed the same. In both cases similar aPTT values for the same linker lengths were observed, but with some differences relative to SABA 7, 8 and 9. Firstly, the T6 spaced aptamers (18 and 21) perform slightly worse than their SABA 8 counterpart, whilst T12 spaced aptamers are unaffected. This is possibly due to the absence of the hairpin scaffold reducing the separation of the ligands by ∼2 nm, which is compensated for by the additional length of the T12, but not T6, spacer. Secondly, T0 spaced aptamers (17 and 20) perform better than their monomeric motifs or SABA 7 counterpart, suggesting that the display orientation of the aptamers by the duplex in SABA 7 is important when a linker is completely omitted. In such cases, SABA refinement may require minimal duplex motifs (e.g. very stable 7 base hairpins29).
Finally, disassociation rates as a proxy for avidity were determined for monomeric (15 and 16) and bivalent aptamers (20–22) using surface plasmon resonance (Fig. 5). Unlike association rates, disassociation rates are concentration independent, making their modelling, interpretation and comparison between aptamers far simpler: the slower the decay rate and the larger its fractional contribution, the stronger the avidity effect. To ensure concentration independence, aptamer disassociation from the immobilised thrombin was normalised to the start point of decay and averaged over various aptamer concentration injections (1000 to 7.8 nM); the standard deviation was less than 0.04 normalised response units. All sequences required a bi-exponential decay to accurately fit the data, with one component of similar rate and fractional contribution (kd,1 ∼0.01–0.02 s⁻¹, fractional contribution ∼0.12) shared by all sequences, suggesting it is a non-specific interaction with the immobilised thrombin. The second component clearly demonstrates the potency of the bivalency effect, as the disassociation of monomeric motifs is at least an order of magnitude faster than that of bivalent complexes (kd,2, Fig. 5). Notably, the trends for 20–22 are consistent with the aPTT results; kd,2 of 22 (T12 spacer) is slightly slower than that of 21 (T6), which in turn is slower than that of 20 (T0), mirroring the greater activity of 22 cf. 21 and 20 in aPTT. These trends were retained upon increasing the immobilised thrombin ∼5-fold (Fig. S14†), suggesting that, of the tested designs, T12 spacing of G15D and TBA27 is optimal.
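The bi-exponential decay described above can be written as a normalised sum of two exponentials whose fractional contributions add up to one. The Python sketch below only evaluates that model and does not fit data; the parameterisation, the function names and the kd,2 value in the example are assumptions for illustration (the measured kd,2 values are reported in Fig. 5).

```python
import math


def biexp_decay(t: float, kd1: float, frac1: float, kd2: float) -> float:
    """Normalised response at time t (s) for a two-component decay.

    frac1 is the fractional contribution of component 1; component 2
    contributes (1 - frac1), so the response equals 1 at t = 0.
    """
    return frac1 * math.exp(-kd1 * t) + (1.0 - frac1) * math.exp(-kd2 * t)


def half_life(kd: float) -> float:
    """Half-life (s) of a single exponential component with rate kd (1/s)."""
    return math.log(2) / kd


if __name__ == "__main__":
    # kd1 and frac1 are taken from the ranges quoted in the text;
    # kd2 = 0.001 1/s is a hypothetical value, not a measured one.
    for t in (0.0, 60.0, 600.0):
        print(t, biexp_decay(t, kd1=0.015, frac1=0.12, kd2=0.001))
```

A tenfold slower kd,2 gives a tenfold longer half-life for that component, which is why the second component dominates the comparison between monomeric and bivalent constructs.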
In summary, we have developed a self-assembled DNA scaffold platform that is designed to allow the generation of large libraries of bivalent ligand complexes (SABLCs) from a relatively small number of starting molecules comprised of single stranded DNA sequences tagged with potential binding motifs. As a first example of the general concept we have evaluated a small self-assembled bivalent aptamer (SABA) library based on two nucleic acid aptamer sequences that bind to different exosites of thrombin. The general SABLC concept opens up the possibility of using a wide range of ligands with great chemical diversity, such as non-natural nucleotides and backbone linkages, peptides, carbohydrates or small molecules. It will also allow optimization of combinations of known monovalent ligands to improve binding affinity and enzymatic stability. From an economic standpoint it is reasonable to propose the synthesis of large SABLC libraries; oligonucleotide synthesis facilities typically generate hundreds to thousands of oligonucleotides per day. The same is true for peptide manufacturing and for other potential ligands.30 Moreover, several conjugation chemistries will allow automated large scale synthesis of monovalent ligand libraries tagged with individual nucleic acids bearing specific identifier sequences. Most importantly, the universal nature of the SABLC approach and the sensitivity of the decoding process allow the re-use of aliquots of larger scale pre-synthesized libraries of monovalent DNA tagged ligands for multiple selection experiments. Their re-use in combination with other sets of complementary arms is also possible. Such libraries can be stored for future use, thus expanding the libraries and accumulating a valuable resource over time.
Footnote
† Electronic supplementary information (ESI) available: Experimental details of oligonucleotide synthesis, PCR amplification, DNA sequencing, blood clotting assays. See DOI: 10.1039/c7ob02119d
This journal is © The Royal Society of Chemistry 2018