L. K. Petersen a, P. Blakskjær a, A. Chaikuad b, A. B. Christensen a, J. Dietvorst a, J. Holmkvist a, S. Knapp bc, M. Kořínek d, L. K. Larsen a, A. E. Pedersen e, S. Röhm c, F. A. Sløk a and N. J. V. Hansen *a
a Vipergen ApS, Gammel Kongevej 23A, DK-1610 Copenhagen V, Denmark. E-mail: nha@vipergen.com
b Structural Genomic Consortium, University of Oxford, Old Road Campus Research Building, Roosevelt Drive, Headington, Oxford, OX3 7DQ, UK
c Institute for Pharmaceutical Chemistry and Buchmann Institute for life sciences, Johann Wolfgang Goethe-University, Max-von-Laue-Str. 9, D-60438 Frankfurt am Main, Germany
d APIGENEX s.r.o., Poděbradská 173/5, 190 00 Prague, Czech Republic
e The Faculty of Health and Medical Sciences, Department of Immunology and Microbiology, University of Copenhagen, Blegdamsvej 3B, DK-2200 Copenhagen N, Denmark
First published on 22nd June 2016
A highly specific and potent (7 nM cellular IC50) inhibitor of p38α kinase was identified directly from a 12.6-million-membered DNA-encoded small molecule library. This was achieved using the high fidelity yoctoReactor technology (yR) for preparing the DNA-encoded library, and a homogeneous screening technique – the binder trap enrichment technology (BTE). Although structurally atypical to other kinase blockers, this inhibitor was found by X-ray crystallography to interact with the ATP binding site and provide strong distortion of the P-loop. Remarkably, it assumed an alternative binding mode as it lacks key features of known kinase inhibitors such as typical hinge binding motifs. Interestingly, the inhibitor bound assuming a canonical type-II (‘DFG-out’) binding mode by forming hinge hydrogen bonds with the backbone, showed excellent shape complementarity, and formed a number of specific polar interactions. Moreover, the crystal structure showed that, although buried in the p38α active site, the original DNA attachment point of the compound was accessible through a channel created by the distorted P-loop conformation. This study demonstrates the usability of DNA-encoded library technologies for identifying novel chemical matter with alternative binding modes to provide a good starting point for drug development.
The advent of DNA-encoded libraries signifies a new era of combinatorial libraries and high-throughput screening (HTS). The first era, in the 1990s, of combinatorial-chemistry-based libraries and HTS did not meet the high expectations of accelerating the drug discovery process.3 However, important lessons were learned, including the importance of the physical/chemical properties of the compounds, the purity of synthesized compounds, and assay fidelity. Technologies based on DNA encoding utilize a powerful means to record and store information and, in principle, to control systems at the single-molecule level. Further advantages are low protein and library consumption, a single-tube format, low requirements for instrumentation, and no requirement for pre-set assay conditions for screening. However, the most important driver for its increasing popularity is that it is tapping into the unprecedented development in DNA sequencing.4
Historically, the identification of target-binding compounds from DNA-encoded libraries has in most cases involved immobilization of the target protein on matrices such as sepharose or paramagnetic beads, either via chemical crosslinking to the functionalized surface or via immobilized streptavidin. The target is exposed to the library to allow binding to occur, either before or after the target is immobilized on the matrix. The matrix is then washed multiple times to reduce the amount of non-binders and low-affinity binders. The remaining material is finally eluted, and the DNA is PCR amplified and analysed by DNA sequencing. This approach has in many cases been successful, but it features some challenges inherent to heterogeneous assays, such as target denaturation associated with binding to a surface, background originating from binding of library molecules to the matrix, and the necessity of washing steps, which are difficult to control during the procedure. Therefore, multiple rounds of selection are normally called for in order to raise the signal above the background, necessitating the use of high amounts of library as well as rigorous data analysis routines to distinguish binders to the target from binders to the matrix.
The challenges related to immobilization of the target on a matrix have led to the development of homogeneous assays, where the target interacts with the library in solution and no washing steps are required. In some of these techniques, the attachment of nucleic acids onto the protein facilitates or stabilizes intramolecular duplex formation with the nucleic acid on the library molecules, and enables the detection. This principle has been exploited employing primer extension5,6 or DNA ligation by an internally encoded ribozyme,7 thus encoding the binding event in PCR-amplifiable nucleic acid.
In yet another approach, the binding event between the target protein and a DNA-linked small molecule compound tethers a photo-reactive complementary oligonucleotide to the target. The oligonucleotide is subsequently cross-linked, which stabilizes the DNA duplex and protects the compound-encoding DNA strand against nuclease degradation, ultimately enabling detection.8 These promising approaches were all shown to be functional in model studies. It will be interesting to follow their performance for de novo discoveries from complex DNA-encoded libraries.
Here, we utilized a recently developed homogeneous screening assay, the binder trap enrichment (BTE), which traps binding complexes consisting of a target protein and a library member in minuscule water droplets in a water-in-oil emulsion. The underlying mathematical principle is simple: many more droplets than target protein molecules are formed. If a library member is bound to the target, it will consistently end up in a droplet together with a target molecule, whereas a non-bound library member will only do so by chance. Consequently, a binder will be observed as a frequent event against a background of random, low-frequency events.
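The statistical argument can be made concrete with a minimal sketch. It assumes, purely for illustration, that target molecules distribute over the droplets independently and uniformly, so that the chance of an unbound library member sharing a droplet with at least one target molecule follows a Poisson model. The function name and the droplet and target counts are hypothetical and are not taken from this study.

```python
import math


def random_cotrap_probability(n_targets: float, n_droplets: float) -> float:
    """Probability that an unbound library molecule lands in a droplet that
    also holds at least one target molecule (Poisson assumption)."""
    mean_targets_per_droplet = n_targets / n_droplets
    # 1 - exp(-x), written with expm1 for accuracy when x is very small
    return -math.expm1(-mean_targets_per_droplet)


# Hypothetical numbers, chosen only to illustrate a large excess of droplets.
n_droplets = 1e10
n_targets = 1e7

p_random = random_cotrap_probability(n_targets, n_droplets)
p_bound = 1.0  # a member still bound to its target is co-trapped by construction

print(f"random co-trapping probability: {p_random:.2e}")
print(f"bound vs. unbound co-trapping ratio: {p_bound / p_random:.1f}")
```

Under these assumptions the random co-trapping probability is approximately the ratio of target molecules to droplets, which is why a large excess of droplets suppresses the background.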
Mitogen-activated protein kinases (MAP kinases) are crucial in transducing extracellular signals that regulate a variety of cellular responses such as proliferation, gene expression, differentiation, cell survival, and apoptosis.9 Many MAP kinase inhibitors, including p38 inhibitors, have been developed, but unlike inhibitors for proteins that regulate MAP kinase signalling, these have not been approved for clinical use.10 p38 MAP kinases are one of three families of MAP kinases, and p38α (MAPK14) is particularly involved in the regulation of pro-inflammatory cytokines11 such as TNF-α, IL-1 and IL-6, whereas the three other isoforms, p38β (MAPK11), p38γ (MAPK12) and p38δ (MAPK13), have not been the focus of drug development.12 p38 MAP kinase is activated by cellular stress factors such as inflammatory cytokines, lipopolysaccharides (LPS), ultraviolet light, and various growth factors.13 p38α has previously been used as a target for interrogation of DNA-encoded libraries; in one case, a known binding motif was incorporated in a library of triazines14 and in another case a smaller library of macrocycles was used.15 The findings were based on split-pool and DNA-templated library synthesis, respectively.
Using p38α as an interesting drug target, we here demonstrate the feasibility of using BTE to screen a DNA-encoded yoctoReactor library, which resulted in the identification of selective single-digit nanomolar inhibitors of p38α directly from a naïve library of 12.6 million compounds in a single round of selection.
In the first step of the BTE, equilibrium binding between the library members and the protein target was established. Then, the binding mixture was diluted, which disturbed the equilibrium, and the kinetics were now dominated by off-rates. Next, a water-in-oil emulsion was formed by combining the aqueous solution with an oil phase and shaking for one minute (Fig. 1, steps A–C).20,21 Thus, a successful binding event between a library member and the target protein causes consistent entrapment of both within the same droplet (Fig. 1, step C). Next, the target and library DNA were ligated inside the droplet to record the co-trapping event (Fig. 1, step D). Then, the emulsion was disrupted by organic extractions22 and the material recovered. The DNA was amplified by PCR, where only ligation products comprised two PCR priming sites and were thus amplified exponentially (Fig. 1, step E). Finally, the DNA was subjected to DNA sequencing, and the DNA codes were decoded to compounds, which were counted. Compounds observed many times are target binders (hits), whereas compounds observed only a few times are predominantly the result of random co-trapping events (background).
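The effect of the dilution step can be illustrated with a simple first-order model. The sketch below assumes that rebinding is negligible after dilution, so the fraction of a complex that survives the dissociation time decays exponentially with the off-rate. The off-rates and the dissociation time used here are hypothetical values for illustration only.

```python
import math


def fraction_bound_after_dilution(k_off_per_s: float, t_diss_s: float) -> float:
    """Fraction of a target/ligand complex still intact after t_diss_s seconds,
    assuming first-order dissociation and no rebinding."""
    return math.exp(-k_off_per_s * t_diss_s)


t_diss_s = 60.0  # hypothetical dissociation time before emulsification

for k_off in (1e-4, 1e-3, 1e-2, 1e-1):  # hypothetical off-rates in 1/s
    surviving = fraction_bound_after_dilution(k_off, t_diss_s)
    half_life_s = math.log(2) / k_off
    print(f"k_off={k_off:.0e} 1/s  half-life={half_life_s:8.1f} s  "
          f"fraction still bound={surviving:.3f}")
```

In this simplified picture, library members with slow off-rates remain bound when the emulsion is formed, whereas fast-dissociating members are largely lost to the background.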
Fig. 1 Binder trap enrichment (BTE) technique. (A) Equilibrium established between DNA-encoded library and DNA-tagged protein. (B) Dissociation phase triggered by dilution of binding mixture into high volume. (C) After tdiss, the emulsion is formed by shaking water and oil. Binders and protein target, i.e. also target DNA and library DNA, are now trapped within droplets. (D) Target DNA and library DNA are now ligated. (E) DNA ligation product is isolated and amplified by PCR. (F) PCR product is sequenced, decoded to compound, and counted.19
A mathematical threshold was applied to separate hits from background. This was conveniently done because the number of observations based on random entrapment follows a linear decay line on a logarithmic scale. Having the number of emulsion droplets in great excess over the number of protein target molecules ensured that random trapping of non-target-bound library members with a protein target molecule became very unlikely. Although random entrapment does occur, it leads only to a low number of observations.
The signal plot of the decoded and counted DNA sequencing data (Fig. 2A) shows the number of compounds (y-axis) against how many times each was observed (x-axis). When going from a low to a high number of observations, the curve has two phases: an initial linear decay phase and a later tail phase. The initial linear decay phase (compounds observed up to four times) is dominated by random co-trapping events and is thus discarded (light grey spheres). Compounds observed five or more times are likely the result of target binding (green spheres) and thus provide 236 primary hits. For simplicity, the 97 most abundant hits (observed ten or more times) were analyzed further. The actual number of observations of a library member in DNA sequencing does not necessarily correlate with potency. At least three factors control how often it is observed: 1) its initial frequency in the library, 2) its binding affinity to the target protein, and 3) the off-rate of its complex with the target protein.
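The separation of hits from background described above can be sketched as follows. The code builds the signal histogram, fits a straight line to the logarithm of the compound counts in the low-observation phase, and takes as the threshold the first observation count at which the extrapolated background falls below one compound. This criterion, the fit range, and the counts are assumptions made for illustration; they are not necessarily the threshold rule or the data used in this study.

```python
import math
from collections import Counter


def signal_histogram(obs_by_compound):
    """Map 'number of observations' -> 'number of compounds seen that often'."""
    return Counter(obs_by_compound.values())


def fit_background(hist, fit_max):
    """Least-squares line through (n, log10(compounds)) for 1 <= n <= fit_max."""
    xs = [n for n in range(1, fit_max + 1) if hist.get(n, 0) > 0]
    ys = [math.log10(hist[n]) for n in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def background_threshold(slope, intercept, max_n=1000):
    """Smallest observation count where the fitted background is < 1 compound."""
    for n in range(1, max_n + 1):
        if 10 ** (intercept + slope * n) < 1.0:
            return n
    raise ValueError("fitted background never drops below one compound")


# Synthetic counts (hypothetical): a log-linear background plus four binders.
obs = {}
for n_obs, n_compounds in {1: 5000, 2: 500, 3: 50, 4: 5}.items():
    for _ in range(n_compounds):
        obs[f"bg{len(obs)}"] = n_obs
obs.update({"hitA": 6, "hitB": 12, "hitC": 17, "hitD": 40})

hist = signal_histogram(obs)
slope, intercept = fit_background(hist, fit_max=4)
threshold = background_threshold(slope, intercept)
hits = sorted(c for c, n in obs.items() if n >= threshold)
print(f"threshold: {threshold} observations, hits: {hits}")
```

The fit range (`fit_max`) has to be chosen by inspecting the signal plot, since the tail phase must be excluded from the background fit.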
To cluster the hits into related series, a pairwise calculation of Tanimoto similarities was performed.23 This 97 × 97 symmetrical matrix was sorted using a best-neighbour method, and several series were observed in the similarity-based heat map as islands along the matrix diagonal (Fig. 2B). As indicated, series 1 (62 hits) comprised the main fraction of hits, followed by series 2–4 (7, 4, and 8 hits, respectively). The remaining 16 hits were singletons or only minor groups with three or fewer members. The hits in series 1, 2, and 4 could be generalized into the structure in Fig. 2C; the building block closest to the original DNA attachment was an amino acid carrying a lipophilic side chain, the second building block was a cyclic amino acid – aliphatic or aromatic – with a 1,4 orientation.
The third building block was one of various heteroaromatics, with pyrazoles dominating (series 1), followed by pyridines (series 2) and thiazoles (series 4). Series 3 hits showed the same distal pyrazole building blocks as series 1 but were distinguished by linkage via a diamine to a sulfonamide-containing building block closest to the DNA attachment.
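The clustering step can be illustrated with a small sketch. It computes pairwise Tanimoto similarities on fingerprints represented as sets of "on" bits and then orders the compounds with a greedy nearest-neighbour walk, so that related compounds end up adjacent along the matrix diagonal. The greedy ordering is only one possible reading of a best-neighbour sort, and the fingerprints are toy values; neither is taken from this study.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_matrix(fps):
    """Symmetric matrix of pairwise Tanimoto similarities."""
    return [[tanimoto(a, b) for b in fps] for a in fps]


def greedy_neighbour_order(sim):
    """Start at compound 0 and repeatedly append the most similar unvisited
    compound (ties broken by lowest index)."""
    if not sim:
        return []
    order = [0]
    remaining = set(range(1, len(sim)))
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: (sim[last][j], -j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Toy fingerprints (hypothetical bit sets, not real compounds).
fingerprints = {
    "cpd1": frozenset({1, 2, 3, 4}),
    "cpd2": frozenset({10, 11, 12}),
    "cpd3": frozenset({1, 2, 3, 5}),
    "cpd4": frozenset({10, 11, 13}),
    "cpd5": frozenset({20, 21}),
}

names = list(fingerprints)
matrix = similarity_matrix([fingerprints[n] for n in names])
order = greedy_neighbour_order(matrix)
print("sorted order:", [names[i] for i in order])
```

Reordering both rows and columns of the matrix by `order` places similar compounds next to each other, which is what makes series appear as islands on the diagonal of a heat map.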
To validate the hits identified by DNA sequencing, twenty-four compounds were chosen for resynthesis (off-DNA) from both series and singletons. Assaying the compounds enzymatically revealed that the majority (22 of 24) were indeed inhibitors of p38α. This validated the observed signal and demonstrated a low false-positive rate (see Table S1‡ for structures and deduced biochemical IC50 values). Fig. 2B (right column) shows the relationship between the hit series identified from DNA sequencing and enzyme inhibition: the most potent inhibitors were obtained from the major series 1, but singletons were also proven effective. The most potent inhibitor (VPC00628) showed 17 observations in the signal plot from DNA sequencing, underlining the limited correlation between number of observations and potency.
To demonstrate cell penetration and biological function, a set of inhibitors was further subjected to a cellular assay. TNF-α secretion in a human monocytic cell line (THP-1) was shown to be completely suppressed by the inhibitors at nanomolar concentrations. Three example inhibitors are shown in Fig. 2D together with dose–response curves and deduced IC50 values. The most potent (VPC00628), with a cellular IC50 of 7 nM, was selected for further investigations.
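How an IC50 value can be deduced from a dose–response curve is sketched below. The example generates a synthetic curve from a Hill model and recovers the IC50 by log-linear interpolation between the two concentrations that bracket the half-maximal response. The model, the concentrations, and the assumed IC50 of 10 nM are hypothetical; the source does not state which fitting procedure was used for the data in Fig. 2D.

```python
import math


def hill_response(conc_nM, ic50_nM, hill=1.0, top=100.0, bottom=0.0):
    """Percent of control signal remaining at a given inhibitor concentration."""
    return bottom + (top - bottom) / (1.0 + (conc_nM / ic50_nM) ** hill)


def interpolate_ic50(concs_nM, responses, half=50.0):
    """Log-linear interpolation between the two points bracketing 'half'."""
    points = sorted(zip(concs_nM, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo >= half >= r_hi and r_lo != r_hi:
            frac = (r_lo - half) / (r_lo - r_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("responses do not bracket the half-maximal value")


# Synthetic dose-response data (hypothetical inhibitor with IC50 = 10 nM).
concs_nM = [0.3, 1.0, 3.0, 10.0, 30.0, 100.0, 300.0]
responses = [hill_response(c, ic50_nM=10.0) for c in concs_nM]

estimated_ic50 = interpolate_ic50(concs_nM, responses)
print(f"estimated IC50: {estimated_ic50:.1f} nM")
```

Interpolation is used here only to keep the sketch dependency-free; a nonlinear least-squares fit of the full curve would normally be preferred for real assay data.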
Fig. 3 Crystal structure of p38α in complex with VPC00628. (A) Detailed interactions of the inhibitor within the kinase. Potential hydrogen bonds are shown as magenta dashed lines, and water molecules as cyan spheres. (B) Schematic illustration of the key interactions and structural alterations upon accommodation of the inhibitor within the kinase type-I and type-II pockets. (C) Structural comparison between the binding modes of VPC00628 and other type-II inhibitors (PDB IDs shown in brackets), revealing high similarity of the overall p38α pocket except for the P-loop conformation, which is highly distorted upon accommodation of VPC00628. (D) Chemical structures of the compared inhibitors. PDB accession code “5LAR”.
Fig. 4 TREEspot™ interaction map for VPC00628. The inhibitor was tested at a fixed concentration (2 μM) against 99 kinases.27 Red spheres: kinases inhibited to <35% of control; green spheres: >35% kinase activity remains at 2 μM.
This is interesting with respect to the ongoing focusing and filtration of classical compound collections, whereby compounds with unexpected chemical characteristics may be removed prior to HTS in order to 1) improve the expected hit rate and 2) screen fewer compounds and thereby decrease costs. In contrast, by using DNA-encoded libraries, the cost and consumption of goods are low, and vast compound collections can be screened in a single-tube format once the library is prepared. From libraries like Lib022, we have successfully identified modulators of enzymes, protein–protein interaction targets, and epigenetic targets. Due to the built-in diversity of the library, no preconceived target knowledge is required. Secondly, by exploiting a low false-positive rate and a highly reproducible selection technique in BTE, compounds with interesting properties were identified using a highly efficient approach. In the present case, with synthesis and assay of only 24 primary hits from a total library size of 12.6 million compounds, the workload to access new chemical matter was drastically reduced. X-ray crystallography confirmed an alternative binding mode of the inhibitor, and a good correlation between biochemical and cellular inhibitory data was verified.
In conclusion, it was shown that a strong and selective inhibitor was identified from a naïve combinatorial DNA-encoded small molecule library in a cost-effective manner. In the process, no optimization of compound properties was performed and a novel binding mode to the kinase was observed. Based on the present work, new molecular modifications can be envisioned. As the library members are all highly modular, optimizing compounds for further development is readily achieved.
Footnotes
† The authors declare no competing interests.
‡ Electronic supplementary information (ESI) available: Synthesis and analysis of yoctoReactor library, hit re-synthesis and assay data, and X-ray data. See DOI: 10.1039/c6md00241b
This journal is © The Royal Society of Chemistry 2016