Open Access Article
Alex G. Waterson ab, Brian D. Lehmann c, Zhenwei Lu d, John L. Sensintaffar d, Edward T. Olejniczak d, Bin Zhao d, Tyson Rietz d, William G. Payne d, Jason Phan d and Stephen W. Fesik *abd
aDepartment of Pharmacology, Vanderbilt University School of Medicine, Nashville, Tennessee, USA. E-mail: stephen.fesik@Vanderbilt.Edu
bDepartment of Chemistry, Vanderbilt University, Nashville, Tennessee, USA
cDepartment of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
dDepartment of Biochemistry, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
First published on 2nd October 2025
Heterobifunctional molecules that induce targeted degradation have emerged as powerful tools in chemical biology, target validation, and drug discovery. Despite their promise, the field is constrained by the relative paucity of ligands available for E3 ligases. Expanding the ligand repertoire for E3 ligases and other components of the ubiquitin-proteasome system could significantly broaden the scope of the targeted degradation field. In this study, we report the identification of ligands for non-essential E3 ligases that are preferentially expressed in cancer tissues relative to normal tissues. Using a protein-observed NMR-based fragment screen, an ideal technique for this purpose, we identified fragment ligands and characterized their binding modes by X-ray crystallography. These ligands represent promising starting points for further optimization toward the discovery of tumor-selective degraders that may enhance the therapeutic window when targeting proteins for which inhibition or degradation is associated with systemic toxicity.
Further expansion of the field would be enabled by the identification of ligands for additional E3 ligases that are capable of supporting PROTAC-induced degradation. This expansion could facilitate the targeting of previously inaccessible target proteins, overcome resistance mechanisms associated with popular E3 ligases and their ligands, and enable tissue-specific degradation.2,3,12–14 Given these potential advantages, the exploration of the ligandability of E3 ligases and validation of the suitability of these ligases for TPD is an active area of ongoing research.15
A particularly compelling, yet largely unrealized, opportunity in TPD is the ability to degrade a disease-associated protein selectively in affected tissues while sparing its function in healthy tissues. For example, one could envision degrading an oncogenic protein in tumor cells while sparing that protein in non-cancerous tissues, thereby widening the therapeutic window and minimizing associated toxicities. One strategy to achieve tumor-specific degradation would involve identification of E3 ligases that are highly expressed in tumors but minimally expressed in normal tissues. If ligands to such ligases could be identified, then PROTACs derived from them may induce selective degradation in cancer cells. This concept has been demonstrated by the PROTAC DT2216, which recruits VHL to degrade Bcl-xL.16 Since VHL is expressed at low levels in platelets, DT2216 spares Bcl-xL functions in these cells, mitigating the thrombocytopenia associated with Bcl-xL inhibitors.17
Here, we report a systematic analysis of E3 ligase expression patterns and identify ligases that are overexpressed in cancer cell lines relative to normal tissues at the protein level. Furthermore, we demonstrate the utility of NMR-based fragment screening for identifying ligands for these ligases and employ X-ray crystallography to determine the structural basis for fragment binding to the ligases. This study provides a framework for the application of fragment-based discovery to expand the repertoire of ligands that bind to E3 ligases.
To identify E3 ligases with restricted expression patterns in cancer, we analyzed RNA-seq gene expression data from two cohorts. We merged raw count gene expression data from 11 057 tumors spanning 20 cancer types (TCGA) and 17 382 normal samples from 30 tissue sites (GTEx, obtained from rapid autopsies) using Ensembl gene identifiers. This merged dataset was then normalized to read depth, scaled by a factor of 10 000 (i.e., to a total count of 10 000 per sample), and log transformed. Differentially expressed genes were identified using the Wilcoxon rank-sum test (Table S2).
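The merge, depth-normalization, scaling, and log-transform steps described above can be illustrated with a minimal sketch. It assumes each sample is a plain dictionary of raw counts keyed by gene identifier; the function names, the toy gene labels, and the use of `log1p` (a pseudocount of 1) are illustrative assumptions, not details taken from the analysis itself.

```python
import math


def merge_on_shared_genes(tumor_samples, normal_samples):
    """Restrict both cohorts to the gene IDs present in every sample."""
    all_samples = tumor_samples + normal_samples
    shared = set(all_samples[0])
    for sample in all_samples[1:]:
        shared &= set(sample)

    def keep(sample):
        return {gene: sample[gene] for gene in sorted(shared)}

    return [keep(s) for s in tumor_samples], [keep(s) for s in normal_samples]


def normalize_sample(raw_counts, scale=10_000):
    """Depth-normalize one sample to `scale` total counts, then log-transform.

    log1p is an assumption here; it keeps zero counts finite.
    """
    total = sum(raw_counts.values())
    if total <= 0:
        raise ValueError("sample has no mapped reads")
    return {gene: math.log1p(count / total * scale)
            for gene, count in raw_counts.items()}


# Toy data with hypothetical gene labels (not real Ensembl identifiers).
tumors = [{"GENE_A": 30, "GENE_B": 70, "GENE_T": 5}]
normals = [{"GENE_A": 10, "GENE_B": 190}]

tumors, normals = merge_on_shared_genes(tumors, normals)
tumor_norm = [normalize_sample(s) for s in tumors]
normal_norm = [normalize_sample(s) for s in normals]
```

Merging before normalizing mirrors the order given in the text, so the per-sample totals are computed over the shared gene set only.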
We analyzed the pooled data sets to identify E3 ligases that are differentially expressed in tumor versus normal tissue samples. Fig. 1 shows log differences in the expression profiles on the X-axis, demonstrating that several E3 ligases are indeed significantly enriched in tumors compared to normal tissues.
Some E3 ligases are likely essential for cellular function. The emergence of resistance to drugs acting on the non-essential ligase CRBN has been reported and shows no dependency based on gene expression levels or cancer type.19 However, inhibition of the function of an as-yet not validated ligase from the E3-ligase recruiting portion of a PROTAC could result in significant toxicity in multiple tissues. We evaluated E3 ligase essentiality from publicly available CRISPR knockout screens (Broad Institute's Achilles and Sanger Institute's SCORE projects). This was done by averaging gene effect scores (DepMap) across 1365 cell lines.20 Gene effect scores are normalized such that nonessential genes have a median score of 0 and common essential genes have a median score of −1. The Y-axis of Fig. 1 rank-orders the E3 ligases by their essentiality. Essential E3 ligases were highly expressed in cancer and enriched in ubiquitin/proteasome pathways in addition to cell cycle and DNA repair pathways (SKP2, CUL2, ANAPC11, BRCA1, DDB1, MDM2, PRPF19, RNF4, BARD1, RBX1, and MNAT1) (Fig. 1), similar to prior findings reported in the E3 atlas.4
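The averaging and rank-ordering of gene effect scores can be sketched as follows. This is a minimal illustration that assumes scores are held as per-gene lists with `None` for cell lines lacking a value; the gene labels and score values are hypothetical toy data, not DepMap output.

```python
from statistics import mean


def mean_gene_effect(scores):
    """Average one gene's effect score across cell lines, skipping missing values."""
    values = [s for s in scores if s is not None]
    if not values:
        raise ValueError("no gene effect scores available")
    return mean(values)


def rank_by_essentiality(effects_by_gene):
    """Return (gene, mean score) pairs, most essential (most negative) first."""
    means = {gene: mean_gene_effect(s) for gene, s in effects_by_gene.items()}
    return sorted(means.items(), key=lambda item: item[1])


# Hypothetical toy scores: about -1 resembles a common essential gene,
# about 0 resembles a nonessential gene.
toy_effects = {
    "LIGASE_X": [-1.1, -0.9, None, -1.0],
    "LIGASE_Y": [0.1, -0.1, 0.0, 0.0],
}
ranked = rank_by_essentiality(toy_effects)
```

Under the normalization described above, a ligase whose mean score sits near 0 would fall toward the non-essential end of the Y-axis in Fig. 1.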
We aimed to identify candidate ligases with restricted expression profiles that would enable degradation of oncogenic targets while minimizing the toxicity risk from inhibition or degradation of that target in normal tissues. An ideal tumor-specific degrader would incorporate a ligand to a ligase that is both highly expressed in tumor tissue and minimally expressed in healthy tissues, such as those in the upper right of Fig. 1. Among the ligases commonly used in the field for PROTACs are CRBN and VHL (Fig. 1, red). CRBN shows no differential expression in this analysis, lying very close to 0 on the X-axis. Thus, induction of tumor-specific degradation is not expected from CRBN-based PROTACs. VHL, while exhibiting some tumor-specific expression (positioned to the right on the X-axis), is considered essential based on CRISPR screen data (positioned near −1 on the Y-axis), and thus use of VHL-derived PROTACs may increase the chances of toxicities in normal tissues.
Our objective is to identify ligands to the E3 ligases of interest in the upper right of Fig. 1. To narrow the candidate list, we conducted a literature review to determine whether a given ligase had been previously shown to ubiquitinate and/or induce proteasomal degradation of a specific substrate. Ligases lacking such validation were excluded from consideration. We also prioritized ligases amenable to protein-observed NMR fragment screening, a technique we planned to use for the identification of hits. Thus, the availability of robust protocols for high-yield protein expression in E. coli was a final selection criterion for ligases of interest. Based on these filters, we identified two ligases for follow-up: Casitas B-lineage lymphoma c (CBL-c) and tumor necrosis factor (TNF) receptor-associated factor 4 (TRAF-4), both of which were found to be non-essential in the DepMap scoring (Fig. S1).
To demonstrate the differential expression of our candidate ligases, we plotted the distribution of mRNA expression across all normal tissues (GTEx) and all cancers (TCGA) for CBL-c and TRAF-4, alongside CRBN (Fig. 2). The similar shapes and widest portions of the density curves for CRBN in both normal and tumor tissues suggest comparable expression distributions. In contrast, both CBL-c and TRAF-4 exhibit higher overall mRNA expression in tumor samples compared to normal tissues. These results are consistent with a prior similar analysis, described in the E3 atlas.4,21 Most normal tissues do not express CBL-c, whereas a substantial proportion of cancers show detectable expression (Fig. S2). TRAF-4 is expressed at low levels across many normal tissues but shows elevated expression in various cancers (Fig. S2), suggesting a potential therapeutic window for selective targeting.
Both ligases have been previously shown to ubiquitinate substrates. The CBL family comprises well-characterized RING E3 ligases involved in modulating receptor tyrosine kinase signaling.22 Specifically, CBL-c has been shown to ubiquitinate EGFR,23 supporting its validated ligase activity. Similarly, TRAF-4, which contains an N-terminal RING finger domain,24 has been reported to ubiquitinate Smurf2,25 CHK1,26 and IRS-1.27
Based on our mRNA expression analysis, we hypothesize that both ligases could be utilized in PROTACs to selectively degrade a target protein in cancers while sparing a given target in non-transformed tissues.
Given this differential expression, a PROTAC utilizing a CBL-c-recruiting ligand could potentially degrade oncogenic targets in tumors while sparing those in normal tissues. For instance, CBL-c is minimally expressed in cardiovascular (CV) tissues and blood vessels, suggesting utility in targeted degradation of proteins associated with CV toxicity (e.g., Bcl-xL29). To confirm these findings, we validated the specificity of the CBL-c antibody and expanded our analysis to include additional CV-related tissues. As shown in Fig. 3B (see also Fig. S7–S9), siRNA-mediated knockdown of CBL-c confirmed antibody specificity, and further western blot analysis of heart and platelet samples corroborated the low expression of CBL-c in CV tissues.
Using 1H–15N SOFAST HMQC NMR, we screened our in-house custom fragment library comprising 13 824 compounds. Screening for fragment binding was first performed using pooled mixtures of 12 fragments, each at a concentration of 800 μM. Individual compounds from mixtures that showed shifts were subsequently analyzed separately to determine the specific fragment responsible for inducing the observed shifts. Several compounds were found to induce significant shifts in the CBL-c HMQC spectrum. As shown in Fig. 4C and D, different shift patterns were observed with the fragment hits. In some cases, the shift patterns matched portions of the pattern induced by the peptide fragment, but other shifts were more distinct, suggesting a binding site outside the peptide-binding location.
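The pool-then-deconvolute workflow can be expressed as simple bookkeeping. The sketch below assumes integer compound identifiers and uses two hypothetical predicate functions as stand-ins for the spectral comparison, which in practice is a judgment made on HMQC spectra rather than a function call.

```python
def make_pools(compound_ids, pool_size=12):
    """Split the library into consecutive mixtures of `pool_size` fragments."""
    return [compound_ids[i:i + pool_size]
            for i in range(0, len(compound_ids), pool_size)]


def deconvolute(pools, pool_shows_shift, compound_shows_shift):
    """Screen pools first, then re-test each member of any pool that shifts.

    Returns the individual hits and the total number of spectra acquired.
    """
    hits = []
    spectra = 0
    for pool in pools:
        spectra += 1  # one spectrum per mixture
        if pool_shows_shift(pool):
            for compound in pool:
                spectra += 1  # one spectrum per individual follow-up
                if compound_shows_shift(compound):
                    hits.append(compound)
    return hits, spectra


library = list(range(13_824))      # library size stated in the text
pools = make_pools(library)        # 1152 mixtures of 12
true_binders = {5, 6, 4000}        # hypothetical binders for illustration

hits, n_spectra = deconvolute(
    pools,
    pool_shows_shift=lambda pool: any(c in true_binders for c in pool),
    compound_shows_shift=lambda c: c in true_binders,
)
```

With 12 fragments per mixture, the 13 824-compound library reduces to 1152 primary spectra, and only mixtures that show shifts incur the additional 12 single-compound spectra.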
In total, 15 fragment hits showing strong chemical shifts were obtained. This relatively modest number of hits establishes the ligandability of the TKB domain of CBL-c but also indicates that this is a relatively challenging protein domain for small molecule binding. We acquired structurally related analogs of the chemical templates represented in the hit set by mining a combination of internal and commercial sources. These analogs were also evaluated by NMR to assess their binding to CBL-c, helping to clarify structure–activity relationships (SAR) and expanding the pool of validated binders. Fragment hits and analogs displaying strong shifts were titrated against the protein by NMR, and binding affinities were determined by analyzing the dose-responsive shift magnitudes.
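One common way to extract a dissociation constant from dose-responsive shifts is to fit a 1:1 binding isotherm that accounts for ligand depletion. The sketch below is an assumption about how such a fit could be done, not the authors' stated procedure: it scans a log-spaced grid of candidate KD values and, for each, solves the maximal shift in closed form by linear least squares. All concentrations are in μM, and the synthetic titration is illustrative only.

```python
import math


def fraction_bound(p_total, l_total, kd):
    """Fraction of protein bound for 1:1 binding, with ligand depletion."""
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / (2.0 * p_total)


def fit_kd(l_totals, shifts, p_total, kd_grid=None):
    """Grid-search KD; the maximal shift is solved analytically per candidate."""
    if kd_grid is None:
        # 1 uM to 100 mM, 200 points per decade (an arbitrary choice).
        kd_grid = [10 ** (e / 200.0) for e in range(0, 1001)]
    best = None
    for kd in kd_grid:
        f = [fraction_bound(p_total, l, kd) for l in l_totals]
        denom = sum(fi * fi for fi in f)
        if denom == 0.0:
            continue
        d_max = sum(fi * yi for fi, yi in zip(f, shifts)) / denom
        sse = sum((yi - d_max * fi) ** 2 for fi, yi in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, d_max)
    if best is None:
        raise ValueError("no usable titration points")
    return best[1], best[2]


# Synthetic, noise-free five-point titration (illustrative values only).
protein_um = 40.0
doses_um = [100.0, 200.0, 400.0, 800.0, 1600.0]
true_kd, true_dmax = 430.0, 0.12
observed = [true_dmax * fraction_bound(protein_um, l, true_kd) for l in doses_um]

kd_fit, dmax_fit = fit_kd(doses_um, observed, protein_um)
```

For fragments with KD values in the hundreds of micromolar, the highest dose rarely saturates the protein, so the fitted KD and maximal shift are correlated and the estimate is sensitive to noise in real data.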
The hit compounds clustered into three chemical series, all containing a carboxylic acid. Compound 1 exemplifies an anthranilic acid scaffold (Fig. 5). While the acid moiety was conserved across all compounds, the extension from the aniline nitrogen tolerated additional variation, with lipophilic alkyl groups and aromatic moieties generally demonstrating enhanced affinity. Compound 5 exhibited the highest affinity in this series, with a dissociation constant of 430 μM. Sub-millimolar affinities were also observed from a narrow set of quinoline acid-based hits, with some tolerance for aromatic ring substitutions at the 6-position (7a–d, Table 1).
Fig. 5 Anthranilic acid fragments that bind to CBL-c. KD values assessed by titration of induced protein shifts over at least 5 doses of compound.
Additionally, a series of indole acids demonstrated binding to CBL-c with affinities as low as 400 μM (Table 2). In this series, substitutions at multiple positions of the indole ring system retained binding to the protein, with substitution at the 6-position showing the best affinity values (e.g., 8c).
We used X-ray crystallography to determine the binding location and orientation of the hit compounds. We successfully obtained a high resolution (1.8 Å) structure of compound 1 bound to the TKB domain of CBL-c. Consistent with the NMR shift pattern observed (Fig. 4C), this compound binds at an allosteric site distinct from the EGFR peptide binding site22 (Fig. 6A). Within this allosteric pocket, 1 is anchored by interactions between the carboxylic acid group and the backbone NH of Ser76 and two bound waters, one bridging to the backbone NH of Ser 76 and the other interacting with Arg 62 (Fig. 6B). The anthranilic acid aromatic ring is stabilized by a weak π-stacking interaction with His 147 and van der Waals contacts with Pro 70. The amide NH of 1 forms an internal hydrogen bond with the carboxylic acid of the compound, orienting the thiophene moiety into a hydrophobic subpocket. This subpocket likely accommodates the side chains of other active compounds such as 5 and 6. Notably, a charged subpocket above Arg 62 (Fig. 6B and C) remains unoccupied by 1 and represents a promising avenue for the identification of higher affinity future analogs. It is plausible that the substitutions on the quinoline acids 7b–c or the indole acids 8b–e may extend into this region, offering potential templates for optimization. It is noteworthy that the sequence homology within this region of the CBL family is low, suggesting that the compounds binding to the allosteric site, and derivatives, may selectively bind to CBL-c.
Fig. 6 X-ray structure of compound 1 bound to CBL-c. (A) Overlay of the structure of 1 with the structure of phospho-EGFR peptide (PDB ID: 3VRP). 1 is shown in blue sticks, while the peptide substrate from 3VRP is shown in orange. (B) Close-up view of 1 bound to the allosteric site on CBL-c, with surface representation. (C) Binding interactions of 1 in the allosteric site. Likely hydrogen bonding interactions are shown as dashed blue lines. Bound waters are represented as red spheres.
TRAF-4 is unique among TRAF family members in that it does not interact with TNF receptors.33 Its expression profile is also distinct, as it has been reported to be overexpressed in several cancers, including breast, lung, ovary, prostate, and colon.33 This expression pattern is reflected in observed transcriptomic differences (Fig. S2B). TRAF-4 exhibits high mRNA expression in most tumor samples compared to heart, brain, skin, and several other critical normal tissues. To validate TRAF-4 expression at the protein level, we used Western blotting on protein lysates from cancer cells and tissue samples. Our results demonstrate generally high expression of TRAF-4 in representative cancer cell lines compared to samples of normal tissues (Fig. 7A, see also Fig. S10–S12). As with CBL-c, we extended our analysis to a panel of cardiovascular tissue samples and confirmed antibody specificity with siRNA-mediated knockdown of TRAF-4 (Fig. 7B, see also Fig. S13–S15). Notably, protein-level expression differences between tumor cells and normal tissues were more pronounced than those observed at the mRNA level. These findings suggest that TRAF-4 may also be viable for selective protein degradation. For example, the difference in TRAF-4 expression between lung cancer and heart tissue suggests a potential therapeutic window for TPD by minimizing the potential for cardiac toxicity.
824-membered library inducing strong shifts in the NMR spectrum. These hits produce shift patterns that overlap with those observed with the GPIbβ-derived peptide (compare Fig. 8A and B), suggesting that they bind within the narrow substrate recognition groove of TRAF-4.26
To expand upon the initially identified fragment hits, we conducted a limited SAR expansion mining internally available compounds and acquiring additional analogs from commercial sources. In total, this effort yielded two primary hit series, along with fragments lacking close structural analogs. While many of the compounds displayed relatively weak binding to TRAF-4, compounds with affinities below 500 μM were identified in both series. Representative compounds and their NMR-derived affinities are shown in Fig. 9 and 10 as well as Table 3. For compounds without titration data, binding strength was inferred from the magnitude of the chemical shift perturbations induced by compounds at 800 μM.
Fig. 9 Quinoxaline-like fragment hits that bind to TRAF-4. KD values assessed by titration of induced protein shifts over at least 5 doses of compound.
Fig. 10 Singleton fragment hits for TRAF-4. KD values assessed by titration of induced protein shifts over at least 5 doses of compound.
The thienopyrimidine-based compounds (Table 3) demonstrated the strongest binding to TRAF-4 and appear highly chemically tractable for future medicinal chemistry optimization. However, the template closely resembles known kinase inhibitors, raising concerns about potential off-target effects and selectivity. As a result, we focused our structural studies on elucidating the binding mode of the quinoxaline template.
We successfully obtained the X-ray structure of both compound 9 and a peptide fragment derived from EGFR which has also been shown to associate with TRAF-435 (residues 1198 to 1207, LRVAPQSSEF) in complex with the TRAF-C domain. As predicted from the NMR shift pattern, 9 occupies the peptide recognition groove of TRAF-4 (Fig. 11A). The compound binds in close proximity to Tyr 366 (Fig. 11B and C) and is anchored by hydrogen bonds to the backbone carbonyl and NH of Gly 433 and Gly 435, respectively. Additionally, the thiazole nitrogen likely participates in an interaction with the side chain of Asn 355. Based on the smaller shifts induced by compounds with changes to the thiazole moiety (shown in Fig. 9), we infer that both the interaction with Asn 355 and van der Waals contacts with adjacent residues contribute significantly to the binding affinity. These findings suggest that the affinity could be improved further by structure-guided modifications of this thiazole as well as extensions from the quinoxaline.
Raw count reads for 11 057 tumors spanning 20 cancer types from TCGA (GDC-PANCAN.htseq_counts.tsv) were downloaded (https://xenabrowser.net/). Details of alignment parameters can be found at https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/. Normal RNA-seq raw counts (GTEx_Analysis_2017-06-05_v8) for 17 382 tissues from rapid autopsies were obtained from the GTEx portal (https://gtexportal.org/). Raw counts from TCGA and GTEx were merged by Ensembl gene ID, normalized to read depth, scaled (10 000), and log transformed. Differentially expressed genes were identified with the Van Elteren test,36 a stratified version of the non-parametric Wilcoxon rank-sum test. All statistical tests were performed in R (version 4.4).
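To make the stratified rank-sum idea concrete, the following is a minimal Python sketch of a Van Elteren-type statistic with a normal approximation. It is not the R code used for the analysis. The choice of strata, the 1/(N+1) stratum weights, the tie correction, and the toy values are all assumptions made for illustration.

```python
import math
from collections import Counter


def midranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def van_elteren(strata):
    """Stratified Wilcoxon rank-sum test.

    `strata` is a list of (tumor_values, normal_values) pairs, one per stratum.
    Returns (z, two-sided p) from the normal approximation.
    """
    stat = expected = variance = 0.0
    for tumor, normal in strata:
        m, n = len(tumor), len(normal)
        if m == 0 or n == 0:
            continue  # a stratum with one empty group carries no information
        total = m + n
        pooled = list(tumor) + list(normal)
        rank_sum = sum(midranks(pooled)[:m])
        ties = sum(t ** 3 - t for t in Counter(pooled).values())
        var_w = m * n / 12.0 * ((total + 1) - ties / (total * (total - 1)))
        weight = 1.0 / (total + 1)
        stat += weight * rank_sum
        expected += weight * m * (total + 1) / 2.0
        variance += weight ** 2 * var_w
    if variance <= 0.0:
        raise ValueError("not enough variation to compute the test")
    z = (stat - expected) / math.sqrt(variance)
    return z, math.erfc(abs(z) / math.sqrt(2.0))


# Toy example: two hypothetical strata in which tumor values exceed normal values.
toy_strata = [([5, 6, 7, 8], [1, 2, 3, 4]), ([5, 6, 7, 8], [1, 2, 3, 4])]
z_score, p_value = van_elteren(toy_strata)
```

Ranking within each stratum before combining means that a systematic offset between strata does not by itself produce a difference between the tumor and normal groups.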
The recombinant plasmids were transformed into electrocompetent BL21-Gold (DE3) E. coli strain and the bacterial cells were cultured in either Luria-Bertani broth or M9 minimal media containing 15NH4Cl for isotopic labeling. The media was supplemented with 50 μg mL−1 kanamycin or 100 μg mL−1 ampicillin and growth was carried out with shaking at 37 °C until the optical density at 600 nm reached 0.8. Protein expression was induced by the addition of 0.5 mM IPTG and incubated for 20 h at 16 °C and 18 h at 18 °C for TRAF-4 and CBL-c, respectively. The cell pellet was harvested by centrifugation at 5000g for 15 minutes, re-suspended in lysis buffer (TRAF-4: 50 mM Na2HPO4, 10 mM KH2PO4 pH 7.4, 300 mM NaCl, 5 mM BME, and 1 mM PMSF, CBL-c: 25 mM Tris, pH 8.0, 500 mM NaCl, 5 mM BME, 10 mM imidazole, 100 μM PMSF), and lysed using the APV-2000 homogenizer (SPX flow) in 3 cycles of 400–900 bar. Cell lysate was clarified by centrifugation at 15 000g for 45 minutes, filtered, and loaded onto a HisTrap FF column (Cytiva). The column was washed with 10 column volumes of buffer A (TRAF-4: 50 mM Na2HPO4, 10 mM KH2PO4 pH 7.4, 300 mM NaCl, 20 mM imidazole, 5 mM BME, CBL-c: lysis buffer minus PMSF) and His6-tagged TRAF-4 protein was eluted from the column using a linear gradient from 0 to 100% of buffer B (TRAF-4: 50 mM Na2HPO4, 10 mM KH2PO4 pH 7.4, 300 mM NaCl, 500 mM imidazole, 5 mM BME, CBL-c: buffer A plus 500 mM imidazole) over 10 column volumes. His6-tagged CBL-c protein was eluted in 3 steps of linear gradients from 0 to 6% B in 6 column volumes (CV), washing for 3 CV, then to 50% B in 6 CV, washing for 3 CV, then to 100% B and washing for 4 CV. The fractions containing TRAF-4 (residues 292–466) and CBL-c were incubated with thrombin and TEV protease, respectively, overnight to remove the His6-tag and dialyzed against buffer A without imidazole. For CBL-c, the NaCl concentration was reduced to 300 mM in the dialysis buffer. The tag-free protein was purified from the reaction mix by reverse Ni-NTA affinity and concentrated with an Amicon Stirred Cell (MilliporeSigma). Truncated TRAF-4 and CBL-c proteins were exchanged into an optimized buffer with low ionic strength (25 mM sodium phosphate pH 7.4, 25 mM NaCl, 1 mM DTT) and (25 mM HEPES pH 7.0, 100 mM NaCl, 5 mM BME), respectively, for the NMR-based fragment screen. For X-ray crystallography, TRAF-4 (residues 292–466) was further purified by size-exclusion chromatography (HiLoad 26/600, Superdex 75 pg, Cytiva) using buffer C (20 mM Tris-HCl, pH 7.4, 300 mM NaCl, 3 mM DTT, 0.01% NaN3). CBL-c was further purified by SEC on the same column using NMR buffer and concentrated to 1 mg mL−1 for NMR and 2 mg mL−1 for crystallization. Protein concentration was quantified by Pierce 660 nm assay (ThermoFisher).
824 compounds was screened as mixtures of 12 fragments prepared in twelve 96-well plates. Each NMR sample was made of 25 μM of 15N-labeled truncated TRAF-4 or 40 μM of 15N-labeled CBL-c, 800 μM of each fragment, and 5% DMSO-d6 for spectrometer locking in 5 mm-diameter NMR tubes. Screening hits from the fragment mixtures were identified by comparing the chemical shifts of backbone resonances to a ligand-free protein spectrum and then deconvoluted by screening individual fragments to determine which component of the mixture was responsible for inducing the shift perturbations.
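Comparing backbone resonances against the ligand-free spectrum is often summarized as a combined 1H/15N chemical shift perturbation per residue. The sketch below assumes a conventional weighted Euclidean form with a 15N weighting factor of 0.2 and a 0.05 ppm hit threshold. Both values, and the residue labels and peak positions, are illustrative assumptions rather than parameters reported for this screen.

```python
import math


def combined_csp(delta_h_ppm, delta_n_ppm, n_weight=0.2):
    """Weighted combined 1H/15N shift perturbation (ppm); n_weight is assumed."""
    return math.sqrt(delta_h_ppm ** 2 + (n_weight * delta_n_ppm) ** 2)


def shifted_residues(free_peaks, bound_peaks, threshold_ppm=0.05):
    """Return residues whose combined perturbation meets the threshold.

    Each peak table maps a residue label to (1H ppm, 15N ppm).
    """
    shifted = {}
    for residue, (h_free, n_free) in free_peaks.items():
        if residue not in bound_peaks:
            continue  # peak missing or unassigned in the ligand-bound spectrum
        h_bound, n_bound = bound_peaks[residue]
        csp = combined_csp(h_bound - h_free, n_bound - n_free)
        if csp >= threshold_ppm:
            shifted[residue] = csp
    return shifted


# Hypothetical peak positions for illustration only.
free = {"res_A": (8.20, 120.0), "res_B": (7.90, 115.5)}
bound = {"res_A": (8.30, 120.5), "res_B": (7.90, 115.5)}
perturbed = shifted_residues(free, bound)
```

A pattern of perturbed residues, rather than any single peak, is what indicates a shared or distinct binding site when hits are compared.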
The ligases selected for this study were prioritized based on expression profiling using a filtering and selection strategy applied to publicly available datasets. Both CBL-c and TRAF-4 exhibit elevated protein expression levels in many cancer cell lines relative to normal tissues. This differential expression suggests that PROTACs built from optimized ligands to the proteins could achieve more potent and selective degradation of target proteins in cancerous tissues while minimizing effects in normal tissues.
To validate this concept, several follow-up activities would be required. Most prominently, higher affinity ligands are needed for incorporation into PROTACs that can subsequently be used to confirm the ability of these ligases to participate in selective TPD. The discovery of higher affinity ligands to each ligase to enable PROTAC construction would require a medicinal chemistry optimization campaign that can be guided by the X-ray crystal structures of the hit fragments bound to respective proteins. Vectors to grow into additional unused space in the binding pockets are already evident with the structures obtained. Furthermore, since the hits share common shift patterns that indicate a high probability of a shared binding location, application of features of one hit template to another would produce improved compounds whose tighter binding would be revealed by examination with NMR or other biophysical or biochemical assays as appropriate. Given the ligandability demonstrated in our work, we anticipate that ligands based on the identified fragment hits and with an affinity appropriate for use in PROTACs could be obtained for both ligases.
The crystal structure of fragment hit 1 bound to CBL-c reveals solvent-exposed vectors extending from the anthranilic acid ring, which would also be required for developing linker attachment points for a PROTAC. The allosteric binding site occupied by this fragment poses an intriguing question for the use of future optimized versions of these compounds for PROTACs. As most PROTACs used in TPD bind to their respective E3 ligase substrate recognition pockets, the impact of allosteric binding to this domain on inducing, but not inhibiting, protein ubiquitylation mediated by CBL-c remains an unanswered question.
TRAF-4 presents a different set of challenges compared to CBL-c. TRAF-4 has previously been shown to mediate protein degradation in a biological context.3 The structure of compound 9 bound to TRAF-4 also reveals solvent-exposed vectors, particularly from the quinoxaline ring system, that may serve as a viable site for PROTAC linker attachment in optimized analogs. While these compounds bind to the substrate recognition groove, the unique architecture of the TRAF-4 family, specifically the spatial separation between the TRAF-C domain and the RING domain enforced by the TRAF-N domain, raises questions about the influence of this structural arrangement on the ability to achieve efficient degradation using a PROTAC bound to the TRAF-C domain of TRAF-4.
Taken together, our findings highlight a useful application of fragment-based screening and open new avenues for expanding the scope of PROTACs and TPD strategies.
Expression data: the E3 ligase and accessory genes were compiled from several existing resources (Cell Signaling, hUbiquitome database, UbiProt Database, and DUDE v. 1.0, available at https://esbl.nhlbi.nih.gov/Databases/KSBP2/Targets/Lists/E3-ligases). Raw count reads for 11 057 tumors spanning 20 cancer types from TCGA (GDC-PANCAN.htseq_counts.tsv) were downloaded (https://xenabrowser.net). Details of alignment parameters can be found at https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline. Normal RNA-seq raw counts (GTEx_Analysis_2017-06-05_v8) for 17 382 tissues from rapid autopsies were obtained from the GTEx portal (https://gtexportal.org). Genetic dependency: gene effect scores derived from CRISPR knockout screens published by Broad's Achilles and Sanger's SCORE projects (DepMap 22Q2 Public) were downloaded from the DepMap portal (https://depmap.org). Crystallography: atomic coordinates and structure factors for CBL-c and TRAF-4 complexed with fragments or peptides can be accessed in the Protein Data Bank via the following accession codes: 9OGW (compound 1), 9OGV (compound 9), and 9OLB (EGFR peptide). The authors will release the atomic coordinates upon article publication.
This journal is © The Royal Society of Chemistry 2025