A Paal–Knorr agent for chemoproteomic profiling of targets of isoketals in cells

Natural systems produce various γ-dicarbonyl-bearing compounds that can covalently modify lysine in protein targets via the classic Paal–Knorr reaction. Among them is a unique class of lipid-derived electrophiles – isoketals that exhibit high chemical reactivity and critical biological functions. However, their target selectivity and profiles in complex proteomes remain unknown. Here we report a Paal–Knorr agent, 4-oxonon-8-ynal (herein termed ONAyne), for surveying the reactivity and selectivity of the γ-dicarbonyl warhead in biological systems. Using an unbiased open-search strategy, we demonstrated the lysine specificity of ONAyne on a proteome-wide scale and characterized six probe-derived modifications, including the initial pyrrole adduct and its oxidative products (i.e., lactam and hydroxylactam adducts), an enlactam adduct from dehydration of hydroxylactam, and two chemotypes formed in the presence of endogenous formaldehyde (i.e., fulvene and aldehyde adducts). Furthermore, combined with quantitative chemoproteomics in a competitive format, ONAyne permitted global, in situ, and site-specific profiling of targeted lysine residues of two specific isomers of isoketals, levuglandin (LG) D2 and E2. The functional analyses reveal that LG-derived adduction drives inhibition of malate dehydrogenase MDH2 and exhibits a crosstalk with two epigenetic marks on histone H2B in macrophages. Our approach should be broadly useful for target profiling of bioactive γ-dicarbonyls in diverse biological contexts.

Synthetic chemistry methods are increasingly valued for their potential to be repurposed as biocompatible methods for both chemical biology and drug discovery. The best-known examples of such repurposing include the Staudinger ligation 1 and the Huisgen-based click chemistry. 2 Moreover, bioconjugation of cysteine and lysine can be built upon facile chemical processes, 3 while chemoselective labelling of other polar residues (e.g., histidine, 4 methionine, 5 tyrosine, 6 aspartic and glutamic acids 7,8 ) requires more elaborate chemistry. Together, these methods offer a powerful means to study the structure and function of proteins, even at a proteome-wide scale.
The classical Paal-Knorr reaction was first reported as a single-step pyrrole synthesis in 1884. 9,10 The reaction involves the condensation of a γ-dicarbonyl with a primary amine under mild conditions (e.g., room temperature, mild acid) to give a pyrrole through intermediary hemiaminals, followed by rapid dehydration of highly unstable pyrrolidine adducts (Fig. S1 †).
Interestingly, we and others have recently demonstrated that the Paal-Knorr reaction can also readily take place in native biological systems. [11][12][13] More importantly, the Paal-Knorr precursor γ-dicarbonyl resides on many endogenous metabolites and bioactive natural products. 14 Of particular interest among them are isoketals 15 (IsoKs, also known as γ-ketoaldehydes), a unique class of lipid-derived electrophiles (LDEs) formed from lipid peroxidation (Fig. S2 †). 16 Lipid peroxidation has emerged as an important mechanism by which cells regulate redox signalling and inflammatory responses 17 and drive ferroptosis, 18 and this field has grown rapidly over the past few years. It has been well documented that the γ-dicarbonyl group of IsoKs can rapidly and predominantly react with lysine via the Paal-Knorr reaction to form a pyrrole adduct in vitro (Fig. 1). 15 Further, the pyrrole formed by IsoKs can be easily oxidized to yield lactam and hydroxylactam products in the presence of molecular oxygen (Fig. 1). These rapid reactions are essentially irreversible. Indeed, IsoKs react with protein approximately two orders of magnitude faster than the most-studied LDE, 4-hydroxynonenal (4-HNE), which contains an α,β-unsaturated carbonyl and generally adducts protein cysteines by Michael addition (Fig. S3 †). 15 Due to this unique adduction chemistry and rapid reactivity, IsoKs exhibit intriguing biological activities, including inhibition of nucleosome complex formation, 19 high-density lipoprotein function, 20 mitochondrial respiration and calcium homeostasis, 21 as well as activation of hepatic stellate cells. 22 Furthermore, increases in IsoK-protein adducts have been identified in many major diseases, 23 such as atherosclerosis, Alzheimer's disease and hypertension.
Despite the chemical uniqueness, biological significance, and pathophysiological relevance of IsoKs, their residue selectivity and target profiles in complex proteomes remain unknown, hampering studies of their mechanisms of action (MoAs). Pioneered by the Cravatt group, competitive ABPP (activity-based protein profiling) has been the method of choice to analyse the molecular interactions between electrophiles (e.g., LDEs, 24 oncometabolites, 25 natural products, 26,27 covalent ligands and drugs [28][29][30] ) and nucleophilic amino acids across complex proteomes. In this regard, many residue-specific chemistry methods and probes have been developed for such studies. For example, several lysine-specific probes based on activated ester warheads (e.g., sulfotetrafluorophenyl, STP; 31 N-hydroxysuccinimide, NHS 32 ) have recently been developed to analyse electrophile-lysine interactions at a proteome-wide scale in human tumour cells, providing rich resources of ligandable sites for covalent probes and potential therapeutics. Although these approaches could presumably also be leveraged to globally and site-specifically profile lysine-specific targets of IsoKs, the reaction kinetics and target preference of activated ester-based probes likely differ from those of γ-dicarbonyls, possibly resulting in misinterpretation of ABPP competition results. Ideally, a lysine profiling probe used for a competitive ABPP analysis of IsoKs should therefore possess the same, or at least a similar, warhead moiety. Furthermore, due to the lack of reactive carbonyl groups on IsoK-derived protein adducts, several recently developed carbonyl-directed ligation probes for studying LDE adductions are also not suitable for target profiling of IsoKs. 33,34 Towards this end, we sought to design a "clickable" γ-dicarbonyl probe for profiling lysine residues and, in combination with the competitive ABPP strategy, for analysing IsoK adductions in native proteomes.
Considering that the diversity of the various regio- and stereo-IsoK isomers 15 (a total of 64, Fig. S2 †) in chemical reactivity and bioactivities is likely attributable to the substitution of γ-dicarbonyls at positions 2 and 3, the "clickable" alkyne handle needs to be rationally implemented onto the 4-methyl group in order to minimize biases when competing with IsoKs in target engagement. We reasoned that 4-oxonon-8-ynal, a previously reported Paal-Knorr agent used as an intermediate for synthesizing fatty acid probes 35 or oxa-tricyclic compounds, 36 could be repurposed for the γ-dicarbonyl-directed ABPP application. With this chemical in hand (herein termed ONAyne, Fig. 2A), we first used western blotting to assess its utility in labelling proteins, which allowed visualization of dose-dependent labelling of the proteome in situ (Fig. S4 †). Next, we set out to incorporate this probe into a well-established chemoproteomic workflow for site-specific lysine profiling in situ (Fig. 2A). Specifically, intact cells were labelled with ONAyne in situ (200 μM, 2 h, 37 °C, a condition showing little cytotoxicity, Fig. S5 †), and the probe-labelled proteome was harvested and processed into tryptic peptides. The resulting probe-labelled peptides were conjugated with both light and heavy azido-UV-cleavable-biotin reagents (1 : 1) via the Cu(I)-catalyzed azide-alkyne cycloaddition reaction (CuAAC, also known as click chemistry). The biotinylated peptides were enriched with streptavidin beads and photoreleased for LC-MS/MS-based proteomics. The ONAyne-labelled peptides covalently conjugated with light and heavy tags would yield an isotopic signature. We considered only those modified peptide assignments whose MS1 data reflected a light/heavy ratio close to 1.0, thereby increasing the accuracy of these peptide identifications.
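The light/heavy filtering step can be illustrated with a minimal sketch. The text states only that the ratio should be close to 1.0; the tolerance of one log2 unit, the function name and the example intensities below are hypothetical and are not taken from the study.

```python
import math

def passes_isotopic_filter(light, heavy, max_log2_dev=1.0):
    """Return True if the light/heavy MS1 ratio is close to 1.0.

    max_log2_dev is an assumed tolerance (here a twofold window);
    the source does not state the cutoff that was actually used.
    """
    if light <= 0 or heavy <= 0:
        # A missing partner means there is no isotopic pair to evaluate.
        return False
    return abs(math.log2(light / heavy)) <= max_log2_dev

# Hypothetical MS1 intensities for three candidate peptide assignments.
peptides = [
    {"id": "pep_A", "light": 1.00e6, "heavy": 0.95e6},  # ratio ~1.05, kept
    {"id": "pep_B", "light": 4.00e6, "heavy": 0.50e6},  # ratio 8, rejected
    {"id": "pep_C", "light": 2.00e5, "heavy": 0.0},     # no heavy partner
]

kept = [p["id"] for p in peptides if passes_isotopic_filter(p["light"], p["heavy"])]
print(kept)
```

Working on the log2 scale treats a twofold excess of light and a twofold excess of heavy symmetrically, which a plain difference from 1.0 would not.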
Using this criterion, we applied a targeted database search to profile three expected probe-derived modifications (PDMs), including 13 pyrrole peptide adducts (Δ273.15), 77 lactam peptide adducts (Δ289.14), and 557 hydroxylactam peptide adducts (Δ305.14), comprising 585 lysine residues on 299 proteins (Fig. S6 and S7 †). Among them, the hydroxylactam adducts predominated, since the pyrrole formed by this probe, like that formed by IsoKs, is easily oxidized on exposure to O2. This finding is in accordance with a previous report in which the pyrrole adducts formed by the reaction between IsoK and free lysine could not be detected, whereas their oxidized forms could. 37 Regardless, all three types of adducts were found on one lysine site of EF1A1 (K387, Fig. S8 †), further confirming the intrinsic relationship among these adductions in situ.
State-of-the-art blind search offers an opportunity to explore unexpected chemotypes (i.e., modifications) derived from a chemical probe and to assess its proteome-wide residue selectivity in an unbiased manner. 38,39 We therefore used one such tool, termed pChem, 38 to re-analyse the MS data (see Methods, ESI †). Surprisingly, the pChem search identified three new and abundant PDMs (Fig. 1 and Table S1 †), which dramatically expanded the ONAyne-profiled lysinome (2305 sites versus 585 sites). Overall, these newly identified PDMs accounted for 74.6% of all identifications (Fig. 2B and Table S2 †). Among them, the PDM of Δ287.13 (Fig. 1 and S7 †) might be an enlactam product formed via dehydration of the probe-derived hydroxylactam adduct. The other two might be explained by the following plausible mechanism (Fig. 1). Endogenous formaldehyde (FA, produced in substantial quantities in biological systems) reacts with the probe-derived pyrrole adduct via nucleophilic addition to form a carbinol intermediate, followed by rapid dehydration to a fulvene (Δ285.15, Fig. S7 †) and immediate oxidation to an aldehyde (Δ301.14, Fig. S7 †). In line with this mechanism, the FA-derived PDMs were largely eliminated when the in vitro ONAyne labelling was performed in FA-less cell lysates (Fig. 2B and Table S3 †). Undoubtedly, the detailed mechanisms underlying the formation of these unexpected PDMs require further investigation, as do the reaction kinetics. Regardless, all main PDMs from ONAyne predominantly target the lysine residue with an average localization probability of 0.77, demonstrating their proteome-wide selectivity (Fig. S9 †).
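As a quick consistency check on the proposed chemotypes, the six reported mass shifts can be related to one another by simple mass bookkeeping. The sketch below starts from the reported pyrrole shift and applies standard monoisotopic masses for O, H2O and CH2O. Treating the lactam and hydroxylactam as net additions of one and two oxygen atoms is an assumption inferred from their description as oxidative products, not a statement taken from the text, and the 0.01 Da tolerance simply reflects the two-decimal rounding of the reported values.

```python
# Standard monoisotopic masses (Da).
O = 15.994915      # oxygen atom
H2O = 18.010565    # water
CH2O = 30.010565   # formaldehyde

PYRROLE = 273.15   # reported mass shift of the pyrrole adduct

# Assumed net transformations relative to the pyrrole adduct.
predicted = {
    "pyrrole": PYRROLE,
    "lactam": PYRROLE + O,                  # assumed +O
    "hydroxylactam": PYRROLE + 2 * O,       # assumed +2 O
    "enlactam": PYRROLE + 2 * O - H2O,      # dehydration of hydroxylactam
    "fulvene": PYRROLE + CH2O - H2O,        # FA addition, then dehydration
    "aldehyde": PYRROLE + CH2O - H2O + O,   # oxidation of the fulvene
}

# Mass shifts as reported in the text.
reported = {
    "pyrrole": 273.15,
    "lactam": 289.14,
    "hydroxylactam": 305.14,
    "enlactam": 287.13,
    "fulvene": 285.15,
    "aldehyde": 301.14,
}

TOLERANCE = 0.01  # Da; reported values carry only two decimals
consistent = {
    name: abs(predicted[name] - reported[name]) <= TOLERANCE
    for name in reported
}

for name in reported:
    print(f"{name:14s} predicted {predicted[name]:.4f}  reported {reported[name]:.2f}")
```

Under these assumptions every predicted value agrees with the reported shift within the rounding tolerance, which is consistent with, but does not prove, the proposed reaction scheme.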
Next, we adapted an ABPP approach to globally and site-specifically quantify the reactivity of lysine towards the γ-dicarbonyl warhead through a dose-dependent labelling strategy (Fig. 3A) that has proven successful for other lysine-specific probes (e.g., STP alkyne). 31 Specifically, MDA-MB-231 cell lysates were treated with high versus low concentrations of ONAyne (1 mM versus 0.1 mM) for 1 h. Probe-labelled proteomes were digested into tryptic peptides that were then conjugated to isotopically labelled biotin tags via CuAAC for enrichment, identification and quantification. In principle, hyperreactive lysines would saturate labelling at the low probe concentration, whereas less reactive ones would show concentration-dependent increases in labelling. For a fair comparison, the STP alkyne-based lysine profiling data were generated using the same chemoproteomic workflow. Although 77.5% (3207) of ONAyne-adducted lysine sites can also be profiled by STP alkyne-based analysis, ONAyne has a distinct target profile, with 930 lysine sites newly identified (Fig. S10 and Table S4 †). Interestingly, sequence motif analysis with pLogo 40 revealed a significant difference in consensus motifs between ONAyne- and STP alkyne-targeted lysines (Fig. S11 †).
Moreover, we quantied the ratio (R 1 mM:0.1 mM ) for a total of 2439 ONAyne-tagged lysines (on 922 proteins) and 17904 STP alkyne-tagged lysines (on 4447 proteins) across three biological replicates ( Fig. S12 and Table S5 †). Strikingly, only 26.7% (651) of quantied sites exhibited nearly dose-dependent increases (R 1 mM:0.1 mM > 5.0) in reactivity with ONAyne, an indicative of dose saturation (Fig. 3B and C). In contrast, such dose-dependent labelling events accounted for >69.1% of all quantied lysine sites in the STP alkyne-based ABPP analysis. 31 This nding is in accordance with the extremely fast kinetics of reaction between lysine and g-dicarbonyls (prone to saturation). Nonetheless, by applying 10-fold lower probe concentrations, overall 1628 (80.2%) detected lysines could be labelled in a fully concentration-dependent manner with the median R 10:1 value of 8.1 (Fig. 3B, C, S12 and Table S5 †). Next, we asked whether the dose-depending quantitation data (100 mM versus 10 mM) can be harnessed to predict functionality. By retrieving the functional information for all quantied lysines from the Uni-Prot Knowledgebase, we found that those hyper-reactive lysines could not be signicantly over-represented with annotation ( Fig. S12 †). Nonetheless, among all quantied lysines, 509 (25.1%) possess functional annotations, while merely 2.5% of the human lysinome can be annotated. Moreover, 381 (74.8%) ONAyne-labelled sites are known targets of various enzymatic post-translational modications (PTMs), such as acetylation, succinylation, methylation and so on (Fig. S13 †). In contrast, all known PTM sites accounted for only 59.6% of the annotated human lysinome. These ndings therefore highlight the intrinsic reactivity of ONAyne towards the 'hot spots' of endogenous lysine PTMs.
The aforementioned results validate ONAyne as a fit-for-purpose lysine-specific chemoproteomic probe for competitive isoTOP-ABPP application of γ-dicarbonyl target profiling. Inspired by this, we next applied ONAyne-based chemoproteomics in an in situ competitive format (Fig. 4A) to globally profile lysine sites targeted by a mixture of levuglandin (LG) D2 and E2, two specific isomers of IsoKs that can be synthesized conveniently from prostaglandin H2 (ref. 41) (Fig. S2 †). Specifically, mouse macrophage RAW264.7 cells (a well-established model cell line to study LDE-induced inflammatory effects) were treated with 2 μM LGs or vehicle (DMSO) for 2 h, followed by ONAyne labelling for an additional 2 h. The probe-labelled proteomes were processed as mentioned above. For each lysine detected in this analysis, we calculated a control/treatment ratio (R C/T ). Adduction of a lysine site by LGs would reduce its accessibility to the ONAyne probe, and thus a higher R C/T indicates increased adduction. In total, we quantified 2000 lysine sites on 834 proteins across five biological replicates. Among them, 102 (5.1%) sites exhibited decreased reactivity upon LG treatment (P < 0.05, Table S6 †) and were therefore considered potential targets of LGs. Notably, we found that different lysines on the same protein showed varying sensitivity towards LGs (e.g., LGs targeted K3 of thioredoxin but not K8, K85 and K94, Table S6 †), indicative of changes in reactivity, though we could not formally exclude the effects of changes in protein expression on the quantified competition ratios. Regardless, to the best of our knowledge, the proteome-wide identification of potential protein targets of IsoKs/LGs has not been possible until this work.
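The competition readout can be sketched for a single lysine site as below. The ratio definition (control over treatment) follows the text. The one-sample t statistic on log2 ratios is an assumption made for illustration, since this section does not say which statistical test produced the reported P values, and the intensities are hypothetical.

```python
import math
from statistics import mean, stdev

def competition_summary(control, treated):
    """Summarize control/treatment ratios (R C/T) for one lysine site.

    control and treated hold paired probe-labelling intensities, one
    pair per biological replicate. The t statistic tests whether the
    mean log2 ratio differs from 0 (an assumed choice of test) and
    requires that the replicate ratios are not all identical.
    """
    ratios = [c / t for c, t in zip(control, treated)]
    log2_ratios = [math.log2(r) for r in ratios]
    n = len(log2_ratios)
    t_stat = mean(log2_ratios) / (stdev(log2_ratios) / math.sqrt(n))
    return {"mean_ratio": mean(ratios), "t_stat": t_stat, "n": n}

# Hypothetical intensities across five replicates.
control = [1.3e6, 1.2e6, 1.4e6, 1.3e6, 1.3e6]   # vehicle (DMSO)
treated = [1.0e6, 1.0e6, 1.0e6, 1.0e6, 1.0e6]   # LG-treated

result = competition_summary(control, treated)
print(result)
```

A mean ratio above 1 with a large positive t statistic would mark the site as a candidate LG target; converting the statistic into a P value needs a t distribution, which is left out here to keep the sketch within the standard library.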
We initially evaluated MDH2 (malate dehydrogenase, mitochondrial, also known as MDHM), an important metabolic enzyme that possesses four previously uncharacterized liganded lysine sites (K157, K239, K301 and K329, Fig. 4B) that are far from the active site (Fig. S14 †). We found that LGs dramatically reduced the catalytic activity of MDH2 in RAW264.7 cells (Fig. 4C), suggesting a potentially allosteric effect. We next turned our attention to the targeted sites residing on histone proteins, which happen to be modified by functionally important acetylation, including H2BK5ac (Fig. 4B), which can regulate both stemness and epithelial-mesenchymal transition of trophoblast stem cells. 42 We therefore hypothesized that rapid adduction by LGs competes with the enzymatic formation of this epigenetic mark. Immunoblotting-based competitive ABPP confirmed that LGs dose-dependently blocked probe labelling of H2B (Fig. 4D). Further, both western blots and immunofluorescence assays revealed that LG treatment decreased the level of acetylation of H2BK5 (average R C/T = 1.3, P = 0.007) in a concentration-dependent manner (Fig. 4E and F). Likewise, a similar competitive crosstalk was observed between acetylation and LG adduction on H2BK20 (average R C/T = 1.2, P = 0.01), a site that is required for chromatin assembly 43 and/or gene regulation 44 (Fig. 4B and S15 †). Notably, these findings, together with several previous reports by us and others on histone lysine ketoamide adduction by another important LDE, 4-oxo-2-nonenal, 11,45,46 again highlight the potentially important link between lipid peroxidation and epigenetic regulation. In addition to the targets validated above, many other leads also merit functional studies, considering the diverse biological and physiological effects of LGs in macrophages.

Conclusions
In summary, we have developed a lysine-specific ABPP probe, ONAyne, that represents a unique addition to the 'arsenal' for studying LDEs. Unlike activated ester-based lysine probes, 28,29 ONAyne offers an interesting lysine-specific chemistry that yields diverse chemotypes in situ, particularly through the reaction of its pyrrole adduct with endogenous FA. Combined with a competitive ABPP strategy, ONAyne enables us to greatly expand the target spectrum of LGs in RAW264.7 cells. Projecting forward, we envision several interesting pursuits with the ONAyne probe that should further address fundamental questions about the MoAs of IsoKs. First, whether and how the regiochemistry and/or stereochemistry of IsoKs lead to distinct electrophile-protein interactions in complex proteomes. To this end, the same chemoproteomic approach described herein (Fig. 4A) offers a convenient target profiling tool for assessing and comparing the competitive lysine binding of individual IsoK isomers in cells, although we admit that this effort is not likely to be forthcoming soon, as it depends on the availability of 64 enantiomerically pure chemicals. Second, whether IsoK-derived lysine adduction is a dynamic process in cells. This question could presumably be addressed by ONAyne-based quantitative chemoproteomics using an established 'recovery' setting. 11,47,48 If so, discovering an enzymatic mechanism that can afford de-modification will be an even more technically challenging task. Finally, what are the cell-state-specific targets of IsoKs in more physiologically relevant contexts such as ferroptosis 18 and inflammatory immune activation? 49 The pursuit of the answer to this question may also offer opportunities for basic and translational research purposes.
More generally, our approach can also be applied to study many other bioactive γ-dicarbonyls, 14,50-52 such as dopamine-derived dicatecholaldehyde, natural products (e.g., Ophiobolin A, polygodial, rearranged spongian diterpenes), and reactive metabolites of furan-containing xenobiotics.

Data availability
The MS data sets have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD028270.

Author contributions
M. R. W. and J. Y. He performed most of the experiments and analyzed data. J. X. He performed the pChem search. K. K. L. performed bioinformatic analysis. J. Y. conceived the project, supervised the work, analyzed data and wrote the manuscript with input from others.

Conflicts of interest
There are no conicts to declare.