Design, preparation, and selection of DNA-encoded dynamic libraries† †Electronic supplementary information (ESI) available: Materials and general methods, experimental details, library selection and sequencing methods and fold enrichment calculations. See DOI: 10.1039/c5sc02467f

DNA-encoded dynamic libraries (DEDLs) are realized by dynamic DNA hybridization and a novel equilibrium-locking mechanism.


Introduction
Dynamic combinatorial chemistry (DCC) employs reversible bond formation to create dynamic systems of continuously interexchanging chemical entities. [1][2][3][4] Built on the principle of DCC, dynamic combinatorial libraries (DCLs) have emerged as efficient tools for discovering novel ligands for biological targets. [5][6][7][8] Compared with a static library, a DCL has two advantages. First, a DCL allows for spontaneous library synthesis based on the inter-conversion of compounds through reversible reactions among building blocks (BBs); the entire library can be synthesized by simply mixing the BBs without the need for spatial separation. Second, a DCL is adaptive: adding the target imposes a selection pressure that redistributes the BBs, favouring the synthesis of target-binding compounds at the expense of non-binding ones. [9][10][11][12] Moreover, after reaching a new equilibrium in the presence of the target, the library can be "frozen" by stopping the dynamic exchange (e.g. by adding an additive or changing the pH to stop reversible reactions), so that the library population change is preserved and ready for subsequent hit identification. 1,6 DCLs have shown great potential in accelerating the discovery of lead compounds in drug discovery, 5,6,13,14 such as in fragment-based [15][16][17][18] and structure-based drug design. 5,19,20 However, DCLs face a major limitation of low library diversity, mainly resulting from the lack of suitable analytical methods. Typically, chromatographic methods, such as HPLC, are used to resolve DCLs and to identify binders by comparing spectra with and without the target, 18,21-23 but HPLC does not have the capacity to resolve large libraries containing many different compounds. 16,24 Other methods, such as non-denaturing mass spectrometry, 25 NMR, 26 and spectroscopic methods (UV and fluorescence) [27][28][29] have been employed for DCLs, but the resolution and throughput of these methods are also not sufficient for large libraries.
Otto, Miller, and their respective co-workers have developed several elegant approaches capable of analyzing and selecting large DCLs (~10 K compounds); [29][30][31][32] however, in most cases, DCLs only contain 10-100 compounds. Since the probability of discovering high affinity ligands increases with the library diversity, the limitation of the library size has presented a significant obstacle for DCLs. 23 New approaches capable of resolving and analyzing large DCLs are still highly desired.
A DNA-encoded library (DEL), in which each compound is linked with a unique DNA tag, is another combinatorial library approach employing mixed compounds in library processing. [33][34][35][36][37][38][39][40][41][42] In contrast to DCLs, due to DNA's high encoding capacity, DELs can contain millions of different compounds; [43][44][45][46] library selections can be readily decoded using PCR amplification and DNA sequencing. 47,48 Therefore, introducing DNA-encoding to DCLs could be an effective strategy to address the limitation of their library size. Previously, nucleic acids have been successfully used as programmable templates or scaffolds with spatial precision to display ligand combinations interacting with various biological targets. [49][50][51][52][53][54][55][56][57][58][59][60][61][62][63][64][65][66] The Neri group developed the Encoded Self-Assembling Chemical (ESAC) library method, in which two sets of DNA-linked fragments form a static library by combinatorial duplex formation. 65,66 Hamilton and co-workers introduced dynamic exchange in DNA hybridization, so that the target can shift the equilibrium and enrich high affinity fragment combinations (Fig. 1a). 49,67 Very recently, Zhang and co-workers reported a similar system achieving target-induced enrichment of DNA duplexes. 68 These studies have nicely shown that the principle of dynamic exchange can be applied to DELs; however, a more systematic methodology for the preparation and selection of DNA-encoded dynamic libraries (DEDLs) has yet to be developed. Moreover, previous studies require modified and immobilized targets in library selection, which is not compatible with proteins that are difficult to purify or modify, such as membrane proteins. 69,70 Aiming to address these issues, here we report the detailed study of a DEDL system, including library preparation, encoding, selection, hit deconvolution, and notably, a novel "locking" strategy to freeze the equilibrium shift for hit isolation and identification.

Results and discussion
Our strategy is shown in Fig. 1b. Libraries of BBs are conjugated to different DNA strands (ligand DNA), all having a common sequence that can form dynamically exchanging duplexes with an "anchor DNA", which is conjugated with an "anchor" molecule. Upon target addition, the equilibrium shifts to form more high affinity bivalent duplexes. Next, the photo-reactive group on the anchor DNA crosslinks the two DNA strands upon irradiation, thereby stopping the dynamic exchange and locking the shifted equilibrium. The distal region on the ligand DNA encodes the BB's chemical identity, and the crosslinked duplex can be isolated for hit identification with PCR amplification and DNA sequencing (Fig. 1b). By combining the features of DELs and DCLs, our design allows for the selection of high diversity DCLs to discover synergistic fragments for "affinity maturation" of the anchor molecule. 65,66,71,72
We first verified that dynamic DNA duplex formation can be affected by the target protein. 49,68 As shown in Fig. 2a, a fluorescein (FAM) molecule and a quencher (DABCYL) were conjugated to two complementary DNA strands; a decrease in fluorescence therefore indicates DNA hybridization. The other end of the DNA was conjugated to a biotin, a desthiobiotin, or an iminobiotin molecule (Fig. 2b). These ligands are well known to bind to adjacent pockets on the tetrameric protein streptavidin (SA) with different affinities (Kd: 40 fM, 2.0 nM and 50 nM, respectively). 73 Moreover, we reasoned that, in order to establish dynamic exchange, the DNA duplex should have a melting temperature (Tm) close to the experiment temperature, and it should also be sufficiently long to ensure hybridization specificity; therefore, either 6- or 7-base DNA duplexes were chosen in our study.
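The reasoning that a 6- or 7-base duplex melts near the experiment temperature can be illustrated with a rough estimate. The following is a minimal sketch using the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C), which is not the method used in this study and ignores salt, strand concentration, and the attached ligands and dyes. The two sequences are hypothetical and are not taken from the paper.

```python
def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature (deg C) for a short oligonucleotide.

    Wallace rule: Tm = 2 * (A + T) + 4 * (G + C). It is only a coarse
    estimate for short sequences and is used here purely for illustration.
    """
    seq = seq.upper()
    if any(base not in "ACGT" for base in seq):
        raise ValueError("sequence must contain only A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 6- and 7-base sequences (assumptions, not from the paper).
for seq in ("GCATGC", "GCATGCA"):
    print(seq, wallace_tm(seq), "deg C")
```

Under this crude estimate, duplexes of this length melt at a few tens of degrees Celsius, in the same general range as the incubation temperatures used here; an experimentally measured Tm would be needed for any quantitative statement.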
As shown in Fig. 2c, for all three ligands the fluorescence decreased significantly in the presence of SA, suggesting the formation of the ternary complex (i). In contrast, in control experiments with the non-specific protein BSA (bovine serum albumin) (ii), without SA (iii), or with one ligand omitted (iv), little or no fluorescence decrease was observed, indicating that the quenching in (i) depends on specific bivalent binding to SA. Notably, ~40% quenching was observed for the weak binder iminobiotin (Fig. 2c, right panel). Furthermore, we performed similar fluorescence quenching experiments with raloxifene, an estrogen receptor (ER) modulator (Fig. 2d); 74 dimeric raloxifene ligands are able to bind to the two binding pockets on estrogen receptor dimers. 61,75,76 Similar to the biotin ligand series, a significant fluorescence decrease was observed in the presence of the specific target ER and the bivalent raloxifene duplex (Fig. 2e). In addition, an important feature of DCLs, as thermodynamically controlled systems, is that the same state of equilibrium can be reached from different starting points. 77,78 In order to verify this, we either altered the mixing order or incubated the mixture at 4 °C, 16 °C, 30 °C or 40 °C for 30 min before incubation at 30 °C for another hour (QD and FD; Fig. 3a). We observed that all experiments reached the same equilibrium based on fluorescence readings, confirming the dynamic nature of our system (Fig. 3b).
Next, we investigated whether the target has shifted the equilibrium to promote the assembly of high affinity duplexes.
As a negative control, the non-binding BSA did not shift the equilibrium (cyan columns; Fig. 4). These results have demonstrated that the target can indeed promote the assembly of high affinity binders.
In the selection of DCLs, it is oen necessary to stop the dynamic exchange and "freeze" the shied equilibrium, so that the library population change, induced by the target, can be preserved for further characterization. For example, adding NaBH 3 CN to reduce imines to stable amines is a popular method to stop the dynamic imine formation, 21,22,[79][80][81] and lowering the pH can effectively disable disulde exchange and reversible Michael addition, which optimally occur at basic pH. 16,17,24,31,78 In this study, we designed a novel photo-crosslinking strategy to stop the dynamic DNA duplex exchange. Photo-crosslinking is kinetically fast and can be imposed/ withdrawn conveniently with minimal perturbation to the system. 82 As shown in Fig. 5a, psoralen (PS), a photo-crosslinker widely used in nucleic acid crosslinking, [83][84][85][86] was conjugated to the 5 0 -end of a short 7-nt DNA bearing the anchor molecule (AD-2). AD-2 is complementary to the 5 0 -end of a 24-nt DNA having a ligand and a FAM group (LD-2). Moreover, LD-2 also contains a thymine group at the site opposite to PS, which is known to be able to improve the crosslinking efficiency. 87 Aer DNA incubation and target addition, irradiation triggers  crosslinking between AD-2 and LD-2, thereby stopping strand exchange and locking the equilibrium. The crosslinked AD-2/ LD-2 duplex can then be isolated for PCR amplication and DNA sequencing to decode the ligand synergistically binding to the target with the anchor molecule. First, we prepared fully matched, partially mismatched, and fully mismatched AD-2/LD-2 duplexes. These DNA duplexes were mixed, irradiated, and analysed by denaturing electrophoresis. The crosslinked product was only observed with the fully matched DNA duplexes (lane 1; Fig. 5b). Next, a set of desthiobiotin-labelled AD-2 and LD-2 strands was subjected to the same procedure; results show that only in the presence of SA was the crosslinking product detected (lane 1; Fig. 5c). 
Multiple bands appeared in lane 1 of Fig. 5c; mass analysis confirmed that all are crosslinked duplexes (see the ESI †). We hypothesize that the "T" shape of the crosslinked duplex may partially renature in the gel, a phenomenon that we have observed previously. 88 In all negative controls (with BSA, no protein, no irradiation and no desthiobiotin on AD-2; lanes 2-5, Fig. 5c), no or very little crosslinking was detected. The product bands were excised, extracted, and quantified. With SA, a 40% crosslinking yield was obtained.
Collectively, these results have demonstrated the specificity of PS-based interstrand DNA crosslinking and its suitability for capturing target-induced duplex formation.
Next, we mixed a background DNA (5′-NH2, 28-nt; BD-3) with a ligand DNA (5′-desthiobiotin, 28-nt; LD-3) at a 4 : 1 ratio. BD-3 and LD-3 have orthogonal primer binding sites (PBS-1 and PBS-2; Fig. 6a). Both BD-3 and LD-3 have a 7-base region complementary to a short anchor DNA (3′-desthiobiotin, 7-nt; AD-3). These DNA strands were mixed at a 4 : 1 : 1.5 ratio to form the dynamic library. After adding SA, the mixture was irradiated and the crosslinked duplexes were gel-purified for qPCR (quantitative PCR) analysis. The qPCR threshold cycle values (CT) were determined to calculate the initial copy numbers of LD-3/AD-3 and BD-3/AD-3 duplexes with their respective primers. 89,90 In order to offset possible biases from experimental factors, the library was also subjected to the same procedure (irradiation, gel purification, and qPCR) with the control protein BSA. Fold enrichments were then calculated by comparing the results from these two selections (see the ESI for the calculation method; Fig. S2-S4 †). As a result, a 12.0-fold enrichment of the high affinity LD-3/AD-3 duplex was achieved (Fig. 6b), which is comparable to typical DCL-based selections. 16,18,19,91,92 Gel analysis also directly confirmed the enrichment of the crosslinked LD-3/AD-3 duplex (Fig. S5 †). Moreover, in order to test the generality of our method, we conjugated another pair of ligands, theophylline and CBS, to the LD-3 and AD-3 DNA strands, respectively. Theophylline and CBS were found to synergistically bind the target carbonic anhydrase-II (CA-II) in an ESAC library selection. 66 After mixing with the background DNA BD-3, the resulting dynamic library was subjected to selection against the target CA-II and the negative control BSA with the same procedure. The results show that a 10.2-fold enrichment of the LD-3/AD-3 duplex was achieved (Fig. 6c).
Collectively, these results have demonstrated that the PS-based crosslinking mechanism is suitable for locking and analysing the equilibrium shift in DEDL selections.
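The comparison between the target and control selections can be sketched numerically. The following is a minimal illustration, not the calculation given in the ESI: it assumes that the template amount scales as efficiency^(-CT), with perfect doubling per cycle (efficiency = 2) by default, and that the fold enrichment is the ligand/background ratio in the target selection divided by the same ratio in the control selection. The function names and the CT values are hypothetical.

```python
def relative_copies(ct: float, efficiency: float = 2.0) -> float:
    """Relative starting template amount implied by a threshold cycle.

    Assumes amplification by a constant factor `efficiency` per cycle,
    so a lower CT means more starting template (arbitrary units).
    """
    return efficiency ** (-ct)


def fold_enrichment(ct_ligand_target: float, ct_background_target: float,
                    ct_ligand_control: float, ct_background_control: float,
                    efficiency: float = 2.0) -> float:
    """Ligand/background ratio with the target, divided by the same
    ratio with the control protein."""
    ratio_target = (relative_copies(ct_ligand_target, efficiency)
                    / relative_copies(ct_background_target, efficiency))
    ratio_control = (relative_copies(ct_ligand_control, efficiency)
                     / relative_copies(ct_background_control, efficiency))
    return ratio_target / ratio_control


# Hypothetical CT values, chosen only to illustrate the arithmetic.
example = fold_enrichment(ct_ligand_target=18.0, ct_background_target=20.0,
                          ct_ligand_control=21.0, ct_background_control=20.0)
print(f"fold enrichment: {example:.1f}")
```

Dividing by the control selection is what offsets biases shared by both selections, such as differences in primer efficiency between the two orthogonal primer sets.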
Encouraged by these results, we further prepared several model DCLs (Fig. 7a). These libraries contain a desthiobiotin-labelled ligand DNA (LD-4) and 4 background (BD-4) DNA strands, all dynamically competing for an anchor DNA (with desthiobiotin, AD-4). The BD-4 strands were also conjugated with several small molecules that are not known to bind SA, but represent typical fragment structures in a library. The desthiobiotin ligand in LD-4 is encoded by a "TTT" codon, while the BD-4 strands contain varied sequences at the encoding site ("AAG", "GCA", "ACA" and "CGC"). These DNA strands were mixed at an equal ratio to form the library and then selected against SA. After irradiation, "hit compounds" were isolated and decoded with the same procedure as that for Fig. 6, except that Sanger sequencing was used. As shown in Fig. 7b, in all cases, the "TTT" codon encoding the desthiobiotin in LD-4 has been enriched markedly by SA due to the high affinity of the LD-4/AD-4 duplex (left panels), whereas negative selections (no protein) only generated scrambled sequences at the encoding site (Fig. 7b, right panels).
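The codon-based decoding described above can be sketched as a simple tally at the encoding site. This is a minimal illustration under stated assumptions: the selection output is modelled as a list of discrete reads, whereas Sanger sequencing of a mixed population gives overlaid peaks rather than individual reads. The flanking sequences, the codon position, and the read list are hypothetical; only the five codons come from the text.

```python
from collections import Counter

# Codons named in the text; the member labels are shorthand for this sketch.
CODONS = {
    "TTT": "LD-4 (desthiobiotin)",
    "AAG": "BD-4", "GCA": "BD-4", "ACA": "BD-4", "CGC": "BD-4",
}


def codon_frequencies(reads, start, length=3):
    """Fraction of reads carrying each codon at the encoding site.

    `start` is the 0-based position of the codon within each read
    (an assumption; the real construct layout is not given here).
    Reads too short to contain the codon are skipped.
    """
    counts = Counter(read[start:start + length]
                     for read in reads if len(read) >= start + length)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical reads: flank "ACGT" + codon + flank "TGCA".
reads = (["ACGT" + "TTT" + "TGCA"] * 6
         + ["ACGT" + c + "TGCA" for c in ("AAG", "GCA", "ACA", "CGC")])
for codon, freq in sorted(codon_frequencies(reads, start=4).items()):
    print(codon, CODONS.get(codon, "unknown"), round(freq, 2))
```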
Finally, in order to mimic library diversity, we prepared a model DCL containing 1024 background (BD-5) DNA strands, a ligand DNA (with desthiobiotin, LD-5), and an anchor DNA (with desthiobiotin, AD-5) (Fig. 8). The LD-5 and all BD-5 strands were mixed at an equal ratio, realizing a 1024-fold excess of background DNA strands relative to LD-5. This library was selected against the target SA and also subjected to a "no-protein" control selection, similar to that for Fig. 6, to control for biases from the selection procedures (irradiation, gel purification, PCR, sequencing, etc.). The selection results were decoded by high throughput DNA sequencing (Illumina®). The fold enrichments of selected sequences were plotted against the sequence counts to identify "hit compounds" (Fig. 8b). Again, due to the high affinity of the LD-5/AD-5 duplex, the sequence that encodes LD-5 was distinctly enriched (19.2-fold). In addition, the expected "hit", LD-5, shows a high sequence count ratio after the target selection, while having an average count ratio in the control selection, further confirming its target specificity (Fig. 8c and S6 †). It is worth noting that the wide distribution of sequence counts in both the target and control selections indicates that sufficient sequencing depth and high library synthesis quality (even distribution of library members) 93 are both important in library selections. Although this model library only has a limited chemical diversity, these results have demonstrated our approach's suitability for the selection of large dynamic libraries.
Fig. 8 (caption, partial) (c) Plot of sequence count ratios after the control selection (no protein added) versus count ratios after the target selection (with SA). Sequence ratio = (sequence count)/(total sequence count of the library). Each dot represents the DNA sequence corresponding to a library member. The "hit" containing the desired LD-5 codon is highlighted in red. LD-5: 0.19 nM; BD-5: 200 nM (total); AD-5: 300 nM; SA: 400 nM. The fold enrichments for the low-count library members vary widely due to statistical under-sampling. See the ESI † for more details on the experimental procedure, data analysis and further discussion of the sequencing results.
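The sequence count ratio defined in the Fig. 8 caption, and a fold enrichment derived from it, can be sketched as follows. This is a minimal illustration, not the data analysis described in the ESI: it assumes that the fold enrichment of a sequence is its count ratio after the target selection divided by its count ratio after the control selection. The `min_control_count` cut-off is a hypothetical addition meant to reflect the under-sampling of low-count members, and the counts are invented.

```python
def count_ratios(counts):
    """Sequence ratio = (sequence count) / (total sequence count)."""
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def fold_enrichments(target_counts, control_counts, min_control_count=1):
    """Count ratio after target selection divided by count ratio after
    control selection, for every sequence seen in the control.

    Sequences with fewer than `min_control_count` control reads are
    skipped, because their ratios are dominated by sampling noise.
    """
    target = count_ratios(target_counts)
    control = count_ratios(control_counts)
    result = {}
    for seq, ratio in control.items():
        if control_counts[seq] < min_control_count or ratio == 0:
            continue
        result[seq] = target.get(seq, 0.0) / ratio
    return result


# Hypothetical read counts for three library members.
target_counts = {"hit": 300, "bg1": 350, "bg2": 350}
control_counts = {"hit": 100, "bg1": 450, "bg2": 450}
for seq, fe in fold_enrichments(target_counts, control_counts).items():
    print(seq, round(fe, 2))
```

Raising `min_control_count` trades coverage for robustness: fewer members are scored, but the remaining fold enrichments are less sensitive to a handful of reads.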

Conclusions
In conclusion, we have developed a DNA-encoded dynamic library (DEDL) approach for the preparation and selection of large dynamic libraries. Notably, we introduced a novel locking mechanism, which is able to take a "snapshot photo" of the library equilibrium altered by the target protein, thereby enabling the downstream hit isolation and identification. In addition, our method eliminates the requirement of target immobilization and physical washing; therefore, target-induced perturbation of the library equilibrium is better preserved, and unmodified, non-immobilized proteins can be used as targets. 69,70,90 However, the present method only encodes one fragment and thus is limited to the "affinity maturation" of known ligands (the "anchor"), 66,71,72 rendering it unsuitable for the de novo discovery of synergistic fragment combinations. 38 In contrast, nucleic acids have previously been successfully used as templates to pair DNA/PNA-linked small molecule ligands, thereby enabling the selection of synergistic fragment pairs for biological targets, 50,52,53,[57][58][59][60][61][62][63][64] and the strategy of interstrand code-transfer has also enabled dual-pharmacophore ESAC libraries. 65 These elegant studies highlight the importance of further development of dual-display DNA-encoded dynamic libraries, 8,94 which is currently being pursued in our laboratory using an alternative DNA architecture, a more efficient crosslinker, 95 and a different decoding scheme. 96 We will report the results in due course.