Edinburgh Research Explorer Non-invasive 19F NMR analysis of a protein-templated N-acylhydrazone dynamic combinatorial library.

Dynamic combinatorial chemistry (DCC) is a powerful tool to identify ligands for biological targets. We used 19F NMR as an in situ, non-invasive technique for measuring the composition of a dynamic combinatorial library (DCL) of N-acylhydrazones (NAHs). An NAH DCL, constructed from a fluoro-aromatic aldehyde and a small set of hydrazides, was targeted at ecFabH, an essential enzyme in bacterial fatty acid biosynthesis. Our NMR analysis identified a tert-butyl NAH as the best binder, which was confirmed by enzymatic assay.


Introduction
Dynamic combinatorial chemistry (DCC) is a technique derived from fragment-based screening that exploits chemically reversible reactions to generate a thermodynamically equilibrated library from a pool of building blocks [1]. The reversibility of the dynamic combinatorial library (DCL) allows perturbation of the equilibrium through the addition of a template. Since its inception, DCC has been applied to a variety of fields including self-assembly systems, dynamic polymers and host-guest chemistry. A particularly exciting application is in drug discovery, whereby a protein is used to self-select a ligand through amplification of the best-binding species at the expense of other combinations (see Fig. 1).
The constantly expanding toolbox of DCC-compatible reversible reactions [2,3] has helped facilitate the wide range of applications of the technique as a whole. However, the number of bio-compatible reactions is limited by the requirement for the reaction to proceed on a reasonable timescale under physiological conditions (temperature, solvent, pH). From this panel of suitable reactions, the chemistry of N-acylhydrazone (NAH) exchange has been well documented and successfully used to identify ligands for a number of different protein targets [4-8]. The reaction between an aldehyde and a hydrazide occurs rapidly at pH 4, with an equilibrium constant and product stability that strongly favour product formation [9]. In order to allow the reaction to proceed at a physiologically relevant pH, Bhat et al. applied early work from Jencks on oxime exchange catalysis [10]. By using aniline as a nucleophilic catalyst, they showed that their 10-member N-acylhydrazone library equilibrated within a few hours at pH 6.2 [5,10,11].

Figure 1. Formation of a DCL and analysis of the blank and protein-templated DCL product distribution by 19F NMR.
A DCC experiment typically involves three main steps. Firstly, the DCL is equilibrated under conditions promoting a thermodynamic equilibrium. Secondly, the protein template is added and the DCL is allowed to re-equilibrate to establish a new product distribution. Thirdly, and arguably most importantly, the DCL is analysed in the presence and absence of the template. Various techniques to deconvolute the library mixture have been demonstrated [12], including HPLC [5,13-15], mass spectrometry [8,16-18], 1H NMR and Saturation Transfer Difference (STD) NMR [7,19,20], dynamic deconvolution [21] and X-ray crystallography [22]. Recent reviews of this area have highlighted both experimental and theoretical approaches used to analyse DCLs [12,23].
The main difficulty lies in not perturbing the DCL during the analysis. This has previously been achieved by either chemically or kinetically freezing the exchange reaction. For example, NAH exchange is effectively stopped by increasing the pH to 9. The second hurdle is ensuring that the complete product population is analysed. This problem is exemplified in HPLC analysis of protein-templated DCLs, where the protein must be removed from the mixture prior to analysis to prevent column fouling [5]. In the absence of denaturing agents, which may perturb the finely balanced protein-templated DCL equilibrium, target binders may remain bound to the protein and are then excluded from the DCL analysis. This technique of "ligand fishing" has in fact been demonstrated in the DCC context to find competitive inhibitors of lysozyme [24].
Analysis by non-denaturing electrospray ionisation mass spectrometry (ESI-MS) has been demonstrated in a number of elegant DCC experiments [8,20,25]. However, the technique relies on the protein ionising efficiently under non-denaturing conditions and is therefore not universal. STD-NMR has also been used to identify binders from a DCL, but it requires upwards of a 20-fold molar excess of ligand. At such stoichiometry, STD experiments do not leverage the dynamic self-selection qualities central to the DCC concept. Instead, they exploit NAH formation as a facile, one-pot synthetic route to a library of compounds for binding assays. With the exception of STD-NMR, the aforementioned techniques are all destructive in the sense that a portion of the DCL is consumed in the analysis.
Our preliminary HPLC analyses of a protein-templated NAH DCL delivered striking differences in the product distribution depending on the method of protein removal (see fig. S1). To resolve this discrepancy, we set out to employ a non-destructive technique that would allow the DCL composition to be monitored in pseudo real-time.
Fluorine has become a central part of drug discovery, with approximately 25% of all marketed drugs containing at least one fluorine atom [26]. As a bioisostere of hydrogen, its inclusion can help medicinal chemists to modulate the pharmacokinetic, pharmacodynamic and physical binding properties of a compound [26]. Subsequently, 19F NMR has developed into a popular technique for screening fluorine-based fragment libraries, with the ability to screen up to 20 fragments in one experiment. The high gyromagnetic ratio and chemical shift anisotropy of the 19F nucleus allow for well-resolved signals that relay information on protein binding through signal broadening or chemical shift perturbation. The scarce biological abundance of 19F allows for background-free spectra, unaffected by protonated buffers and solvents, to be recorded in minutes on a low-field NMR spectrometer. The caveat of 19F screening is the requirement for individual, well-resolved signals. This can be achieved through intelligent library design facilitated by increasingly accurate chemical shift prediction software (e.g. MNova, TopSpin). More complex NMR experiments have been used to resolve overlapping signals, both by 2D heteronuclear correlation experiments (1H-19F COSY) and by pseudo-2D diffusion-ordered (19F DOSY) experiments, where the signals are separated on a second axis by diffusion coefficient [26-28]. The elegance of 19F NMR analysis has already been demonstrated in a number of abiotic DCC examples [29-31]. This method was also used to monitor binding of a 4-component imine library to a domain of a human β-catenin target [32]. Herein, we describe our design and analysis of a protein-templated NAH DCL to further demonstrate the additional advantages of 19F NMR in DCC for drug discovery.

Results and Discussion
We chose to design a library targeting β-ketoacyl-ACP synthase III (FabH) from E. coli, the initial condensing enzyme in bacterial fatty acid biosynthesis (FAS II). Although the active site residues and primary sequence remain highly conserved across both Gram-positive and Gram-negative bacteria, small differences in the binding pocket architecture determine the substrate specificity across bacterial species. This makes FabH a plausible target for novel narrow-spectrum antibiotics [33]. There are currently few known FabH inhibitors, and none that report good in vivo efficacy [34-36]. Zhang et al. have recently published a number of reports on the discovery of NAH inhibitors targeting FabH, where each compound was prepared individually by organic synthesis. Each molecule was composed of an A-B ring system joined by an NAH linker, and this scaffold provided the basis around which our dynamic library was designed [37-41].
The 5-membered proof-of-concept library was based around commercially available fluoro-aromatic aldehyde A as the central core, along with five commercially available aromatic hydrazides 1-5 with differing chemical and physical properties (Fig. 2). The high degree of conjugation through the system allowed us to observe sufficiently resolved 19F chemical shifts of individual NAHs despite the differences in chemistry being up to 13 bonds away [42] (see Fig. S2).
Aniline was introduced by Dawson as a nucleophilic catalyst for NAH exchange to minimise the time required to reach equilibrium [11], and has since been successfully used in many protein-templated DCC experiments [5,6]. From preliminary 1H-STD-NMR experiments we observed that aniline was binding to FabH and would potentially interfere with the DCL. Issues with aniline were also noted by Blanden et al. in their attempts to optimise hydrazone ligation for biomolecular labelling. They identified 4-amino-L-phenylalanine (4-APA) as a suitable replacement for aniline, so we investigated whether 4-APA could be substituted for aniline in a DCL [43]. We validated that 4-APA has comparable catalytic efficiency to aniline with our NAH DCL, both in the forward reaction (Fig. 3 and S4) and the reverse reaction (Fig. S5). When a library is prepared from product A5 and hydrazides 3 and 4 (see Fig. 2), the same product distribution is observed as when a library is prepared from hydrazides 3, 4 and 5 and aldehyde A (Fig. S4 and S5). Binding of 4-APA to FabH was not observed by 1H-STD-NMR.
The DCL was assembled with each of the hydrazides 1-5 and aldehyde A at a final concentration of 200 μM, buffered at pH 6.2 using 50 mM sodium phosphate buffer with 10% DMSO and 3 mM 4-APA. The DCL cocktail was spiked with 200 μM 5-fluorouracil as an internal reference and either 200 μM ecFabH or an equivalent volume of buffer. Each library was transferred to an NMR tube and a 19F spectrum was acquired every two hours over a twelve-hour period. In agreement with the 4-APA characterisation experiment (Fig. 3), the DCL reached equilibrium within 2 hours. Fig. 4a shows the blank library 8 hours after mixing. Notably, not all products are present at equal concentration, suggesting that the electronic properties of the ring substituents affect the intrinsic thermodynamic stability of each NAH. Fig. 4b shows the 19F spectrum of the library templated by the target, ecFabH. Slight signal broadening is observable for all DCL members, suggesting that a slow exchange process is occurring between bound and unbound ligand states. Most impressively, the signal broadening and upfield chemical shift perturbation of compound A4 indicate that this compound is present in its bound state in a higher proportion than the other library members [27].
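The time-course described above can be checked for equilibration with a short script. The following Python sketch is illustrative only: it assumes that each spectrum has already been integrated, that integrals are normalised to the 5-fluorouracil reference, and that a 5% tolerance is an acceptable plateau criterion. The function names, the tolerance and the example integrals are hypothetical and are not taken from the experiments reported here.

```python
from typing import Dict, Optional

Spectrum = Dict[str, float]  # signal label -> raw integral


def normalise(spectrum: Spectrum, reference: str = "5FU") -> Dict[str, float]:
    """Divide every integral by the internal-standard integral."""
    ref = spectrum[reference]
    if ref <= 0:
        raise ValueError("reference integral must be positive")
    return {k: v / ref for k, v in spectrum.items() if k != reference}


def equilibration_time(
    time_course: Dict[float, Spectrum],
    tolerance: float = 0.05,  # assumed plateau criterion (fractional)
    reference: str = "5FU",
) -> Optional[float]:
    """Earliest time point (excluding the last) from which every normalised
    integral stays within `tolerance` of its final value; None if no such
    time point exists."""
    times = sorted(time_course)
    series = [normalise(time_course[t], reference) for t in times]
    final = series[-1]
    for i, t in enumerate(times[:-1]):
        settled = all(
            abs(spec[name] - final[name]) <= tolerance * final[name]
            for spec in series[i:]
            for name in final
        )
        if settled:
            return t
    return None


# Hypothetical integrals (arbitrary units), keyed by hours after mixing.
example = {
    0: {"5FU": 1.0, "A1": 0.10, "A4": 0.05},
    2: {"5FU": 1.0, "A1": 0.50, "A4": 0.30},
    4: {"5FU": 1.0, "A1": 0.51, "A4": 0.30},
    6: {"5FU": 1.0, "A1": 0.50, "A4": 0.31},
}
print(equilibration_time(example))
```

Normalising to the internal reference before comparing time points removes scan-to-scan intensity drift, which is the reason a non-competing standard is included in the library cocktail.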
It must be noted that two opposing effects are at play when considering the signal integral. On the one hand, Le Chatelier's principle will drive the library toward a product distribution proportional to the depth of the energy well of each species, although this is not always the case [2]. Therefore, if product A4 is selected by the protein, its relative concentration and corresponding signal integral should increase. On the other hand, the signal broadening resulting from exchange between chemical shifts corresponding to bound and unbound states of the ligand may present itself as an apparent decrease in the integral of the bound ligand. Due to the breadth and low intensity of the bound state signal, we have as yet been unable to determine the chemical shift of the bound state by 19F COSY-NMR. This method is therefore not strictly quantitative, but acts as a qualitative indicator of potential binders whose properties can be further characterised by quantitative techniques.
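The comparison of templated and blank libraries discussed above can be expressed as a simple ratio of reference-normalised integrals. The Python sketch below is a hypothetical illustration and not the analysis used in this work: the function name `amplification_factors` and the example integrals are assumptions. In keeping with the caveat above, a ratio below 1 cannot be read as evidence against binding, because exchange broadening can lower the apparent integral of a bound ligand.

```python
from typing import Dict

Spectrum = Dict[str, float]  # signal label -> raw integral


def amplification_factors(
    blank: Spectrum, templated: Spectrum, reference: str = "5FU"
) -> Dict[str, float]:
    """Ratio of the reference-normalised integral in the templated library
    to that in the blank library, for every signal present in both."""

    def relative(spectrum: Spectrum) -> Dict[str, float]:
        ref = spectrum[reference]
        if ref <= 0:
            raise ValueError("reference integral must be positive")
        return {k: v / ref for k, v in spectrum.items() if k != reference}

    b, t = relative(blank), relative(templated)
    # Skip signals with a zero blank integral to avoid division by zero.
    return {name: t[name] / b[name] for name in b if name in t and b[name] > 0}


# Hypothetical integrals (arbitrary units) for two library members.
blank = {"5FU": 2.0, "A1": 1.0, "A4": 0.5}
templated = {"5FU": 2.0, "A1": 0.8, "A4": 0.75}
for name, factor in amplification_factors(blank, templated).items():
    print(f"{name}: {factor:.2f}")
```

Such ratios are best treated as a qualitative ranking to be read alongside line widths and chemical shift perturbations, not as a measure of affinity.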
Using individually synthesised DCL members, an in vitro assay validated the results from the 19F NMR DCC experiment. All 5 NAHs showed inhibitory activity at concentrations of 3 mM. Gratifyingly, compound A4 showed the strongest inhibition, causing a 50% decrease in activity at 3 mM compared to the DMSO vehicle control and the FabH inhibitor HR45 as a negative control [44,45] (Fig. S6). That such a weak inhibitor could be picked out by the NMR analysis is encouraging. It also raises the question of what effect a tight nM binder would have. Broadening of the signal such that it disappears altogether into the baseline may also be useful in identifying hits by comparison with protein-free controls. We do not seek to claim that the NAH molecules described here may be of therapeutic value; rather, we present this as a proof of concept that 19F NMR analysis can be used to interrogate an NAH DCL derived from an appropriately F-labelled building block.

Conclusions
In summary, we have demonstrated a non-invasive analysis of a protein-templated DCL by 19F NMR, using 4-APA as a biologically benign alternative to aniline for catalysing NAH exchange. The results from the DCL agreed well with preliminary inhibition data from an in vitro FabH assay. Screening of much larger compound libraries will no doubt require multiple methods to identify, then validate, hit molecules. 19F fragment screening and MS-based methods are complementary analytical tools suitable for protein-templated DCC. Development of such rapid, cost-effective and universal methods should help DCC become more widely used in hit discovery.

Experimental
Nuclear magnetic resonance (NMR) spectra were recorded at 298 K on Bruker PRO500, AVA400 or AVA500 spectrometers running at 500 MHz (1H spectra), 126 MHz (13C spectra) or 94 MHz (19F spectra). Chemical shift values (δ) are reported in parts per million (ppm) relative to tetramethylsilane (TMS = 0 ppm) and are referenced to the residual solvent peak, or to the signal of the internal standard 5-fluorouracil (5FU = -169.19 ppm) in the case of the 19F-labelled DCL. 1H NMR data are reported in the format: chemical shift, relative intensity, multiplicity (s = singlet, d = doublet, t = triplet, m = multiplet, br = broad), coupling constant (J value, Hz), and assignment. 13C NMR data are reported in the format: chemical shift and assignment.

Expression and purification of ecFabH.
The ecFabH/pET-28a construct (4 μL) was transformed into an aliquot (50 μL) of BL21(DE3) cells and set on ice for 25 minutes. The cells were heat shocked at 42 °C for 40 seconds and set back on ice for a further 2 minutes. SOC medium (100 μL) was added and the mixture was agitated at 37 °C for 1 hour. The mixture was spread on LB agar (30 μg/mL kanamycin) and incubated overnight at 37 °C. A single transformant was used to inoculate two seed cultures of sterile LB broth (2 × 250 mL, 30 μg/mL kanamycin), which were agitated overnight at 37 °C. One of the overnight seed cultures was used to sub-culture sterile LB broth (5 × 500 mL, 30 μg/mL kanamycin) to an OD600 of 0.1. The cultures were agitated at 37 °C until the OD600 reached 0.6, at which point expression was induced by addition of IPTG (final conc. 0.1 mM). Cells were harvested after a further 3 hours at 30 °C and subsequently stored at -20 °C.
N-terminal histidine-tagged ecFabH was purified at 4 °C by Ni-affinity chromatography followed by size exclusion chromatography. The BL21(DE3) cell pellet expressing ecFabH was resuspended in lysis buffer (30 mL, 20 mM Tris-HCl pH 7.6, 300 mM NaCl, 5 mM imidazole) and lysed for 15 minutes with rounds of 30 seconds of sonication followed by 30 seconds of rest. The cell lysate was clarified by centrifugation (18,000 × g, 30 minutes, 4 °C) and the cell-free extract was injected onto a HisTrap 5 mL (GE Healthcare) Ni2+-affinity chromatography column pre-equilibrated in lysis buffer. The column was washed with lysis buffer (5 CV) before the histidine-tagged protein was eluted using a gradient (0-100%) of lysis buffer to elution buffer (20 mM Tris-HCl pH 7.6, 300 mM NaCl, 400 mM imidazole) over 20 CV. Each elution fraction was analysed by SDS-PAGE, and the fractions containing His-tagged ecFabH were pooled and concentrated to a volume of less than 5 mL. ecFabH was further purified by size exclusion chromatography (HiLoad Superdex 200 16/60, GE Healthcare) with an isocratic elution of mobile phase buffer (20 mM Tris-HCl pH 7.6, 100 mM NaCl, 10% glycerol) at 1 mL/min over 120 minutes. ecFabH eluted at approximately 70 minutes, and the most concentrated fractions were pooled, flash frozen and stored at -80 °C.
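As a planning aid for the Ni-affinity step described above, the nominal imidazole concentration at any point in the elution gradient can be estimated by linear interpolation between the lysis buffer (5 mM) and the elution buffer (400 mM). The Python sketch below assumes an ideal linear gradient and ignores system dead volume and mixing delays; the function name `imidazole_mM` is hypothetical.

```python
def imidazole_mM(
    cv_elapsed: float,
    start_mM: float = 5.0,     # imidazole in lysis buffer (from the protocol)
    end_mM: float = 400.0,     # imidazole in elution buffer (from the protocol)
    gradient_cv: float = 20.0, # gradient length in column volumes
) -> float:
    """Nominal imidazole concentration after `cv_elapsed` column volumes of
    an ideal linear 0-100% gradient (assumption: no dead volume)."""
    if not 0.0 <= cv_elapsed <= gradient_cv:
        raise ValueError("cv_elapsed must lie within the gradient")
    fraction_elution_buffer = cv_elapsed / gradient_cv
    return start_mM + fraction_elution_buffer * (end_mM - start_mM)


COLUMN_VOLUME_ML = 5.0  # HisTrap 5 mL column

for cv in (0, 5, 10, 20):
    volume_ml = cv * COLUMN_VOLUME_ML
    print(f"{cv:>2} CV ({volume_ml:.0f} mL): {imidazole_mM(cv):.1f} mM imidazole")
```

Relating the fraction in which the protein elutes to a nominal imidazole concentration can help when converting the gradient to a step elution in later preparations.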

DCL conditions. The library was assembled with each of the hydrazides 1-5 and aldehyde A at a final concentration of 200 μM, buffered at pH 6.2 using 50 mM sodium phosphate with 50% D2O, 10% DMSO, 50 mM NaCl and 3 mM 4-APA. The library cocktail also contained 200 μM 5-fluorouracil as a non-competing internal reference and either 200 μM ecFabH or an equivalent volume of enzyme purification buffer. Each library was transferred to an NMR tube and a 19F spectrum (94 MHz, 512 scans, T1 = 1 second) was acquired every two hours over a twelve-hour period.
19F NMR pulse sequence. A three-member library was assembled from N-acylhydrazones A1, A3 and A5 (200 μM each) in sodium phosphate buffer (50 mM, D2O, pH 6.2), 5-fluorouracil (internal standard, 200 μM) and a total of 10% DMSO. A series of 19F NMR experiments was conducted with 512 scans and relaxation times of 1, 2, 3 or 4 seconds. The integrals of the signals corresponding to each N-acylhydrazone, relative to the internal standard, were compared at the different relaxation times to determine the relaxation time required for the DCL 19F NMR experiment. No difference in relative signal integral was apparent across the T1 intervals; therefore, a T1 of 1 second was used in the DCL experiments.
General procedure for the synthesis of N-acylhydrazones A1-A5. 2-Fluoro-5-formylbenzoic acid (50 mg, 0.30 mmol, 1.0 eq.) and the hydrazide (0.33 mmol, 1.1 eq.) were dissolved in ethanol (2 mL). A few drops of glacial acetic acid were added and the mixture was stirred overnight at room temperature. The solid formed was collected by vacuum filtration and washed in diethyl ether (5 mL) and water (5 mL

Conflicts of interest
There are no conflicts of interest to declare.