Glycosylated cyclophellitol-derived activity-based probes and inhibitors for cellulases

Cellulases and related β-1,4-glucanases are essential components of lignocellulose-degrading enzyme mixtures. The detection of β-1,4-glucanase activity typically relies on monitoring the breakdown of purified lignocellulose-derived substrates or synthetic chromogenic substrates, limiting the activities which can be detected and complicating the tracing of activity back to specific components within complex enzyme mixtures. As a tool for the rapid detection and identification of β-1,4-glucanases, a series of glycosylated cyclophellitol inhibitors mimicking β-1,4-glucan oligosaccharides have been synthesised. These compounds are highly efficient inhibitors of HiCel7B, a well-known GH7 endo-β-1,4-glucanase. An elaborated activity-based probe facilitated the direct detection and identification of β-1,4-glucanases within a complex fungal secretome without any detectable cross-reactivity with β-D-glucosidases. These probes and inhibitors add valuable new capacity to the growing toolbox of cyclophellitol-derived probes for the activity-based profiling of biomass-degrading enzymes.


Introduction
Activity-based protein profiling (ABPP) is a powerful technique used to detect and identify active enzymes using modified covalent inhibitors. Following early successes in the detection and identification of proteases, 1 this technique has been extended to glycoside hydrolases using a variety of strategies, 2,3 including labelled cyclophellitol derivatives. 4 Though often considered synonymous with cellulases, β-1,4-glucanases are enzymes which recognize the β-1,4-linked glucan chains characteristic of both the cellulosic and hemicellulosic (i.e. mixed-linkage glucans, xyloglucans, and glucomannans) fractions of plant biomass. 5 The catalytic actions of a variety of retaining β-1,4-glucanases contribute to the breakdown of lignocellulosic polysaccharides. 6,7 The efficient and specific profiling of β-1,4-glucanases is thus a valuable tool in the study of biomass-degrading organisms.
Efforts have been made to profile β-1,4-glycanases using different "warhead" chemistries. Activity-based probes (ABPs) based on a 2,4-dinitrophenyl 2-deoxy-2-fluoro-β-xylobioside/cellobioside modified with an affinity tag at the 4′ position proved effective probes for retaining β-D-glucanases and β-xylanases with particularly good specificity. 2,8,9 However, the slow hydrolysis of the enzyme-probe complex and the weak initial binding of these probes necessitated the use of high probe concentrations (~0.5-1 mM) and time-limited labelling for ABPP. Later experiments using difluoromethylphenyl glycosides and N-haloacetyl glycosylamines demonstrated a unique capacity to label inverting glycoside hydrolases, but suffered from significant non-specific labelling. 3
Cyclophellitol is an inhibitor of β-D-glucosidases originally isolated from the Phellinus mushroom. 10 This cyclitol is an isostere of a glucoside where the acetal group is replaced by an epoxide. Taking advantage of the catalytic machinery of a retaining glycoside hydrolase, this epoxide undergoes an acid-catalysed ring-opening addition to form a non-hydrolysable ester in place of the normal glycosyl-enzyme intermediate, irreversibly inactivating the enzyme. 11 ABPs built around synthetic cyclitols, having configurations which target α- and β-D-glucosidases, 12-14 β-D-glucuronidases, 15 β-D-xylosidases, 16 α- and β-D-galactosidases, 17,18 and α-L-arabinofuranosidases, 19 among others, have consistently been shown to covalently modify the catalytic nucleophiles of cognate retaining glycosidases. These cyclophellitol-derived ABPs generally bind with good specificity, high affinity, and complete irreversibility.
Recent work has shown that cyclophellitol derivatives can be glycosylated, enabling the development of inhibitors and probes which react specifically with endo-glycanases. 16 This was first demonstrated with the development of an inhibitor and a collection of ABPs for β-1,4-xylanases. Being built around a xylobiose core with an alkylated aziridine warhead, these probes proved to be potent covalent inhibitors of GH10 β-1,4-xylanases, but showed cross-reactivity with β-D-xylosidases when applied to the direct detection of β-xylanases within fungal secretomes. This cross-reactivity was traced back to the internal hydrolysis of the probe by the action of β-D-xylosidases.
Building on this understanding, here we report the development of cyclophellitol-derived ABPs designed to target β-1,4-glucanases, some of the most abundant glycoside hydrolases in nature. To detect and profile these enzymes, a collection of 4-O-substituted (carbohydrate numbering) cyclophellitols have been synthesised and tested for their ability to covalently modify HiCel7B, a well-known endo-β-1,4-glucanase. Through biochemical, structural, and mass spectrometric analyses, we have identified a potent substrate-mimicking probe architecture which shows resistance to hydrolysis by exo-glucosidase and endo-glucanase activities within a fungal secretome.

Synthesis of β-D-glucanase inhibitors and probe
Chemical synthesis details and compound analysis are reported in the ESI.†

Testing β-D-glucanase inhibitors and probes
HiCel7B was a kind gift from Martin Schülein (sadly now deceased) at Novozymes A/S (Lyngby, Denmark). The pH-activity profile of HiCel7B acting on 4-methylumbelliferyl β-D-cellobioside (4MU-GG) was measured by combining 5 µL of 1 mg mL−1 HiCel7B in 200 mM 2 : 7 : 7 succinate-phosphate-glycine (SPG) buffer prepared at various pH values with 45 µL of 0.1 mM substrate in quadruplicate. The reactions were incubated at 25 °C for 30 minutes prior to the addition of 5 µL of 1 M Na2CO3, transfer of 50 µL to a black 384-well plate and fluorescence measurement (λex = 360 nm, λem = 450 nm). Rates were determined using a 4-methylumbelliferone calibration series prepared in 0.1 M Na2CO3.
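The conversion of end-point fluorescence into a rate via a calibration series can be sketched as follows. This is only an illustration under stated assumptions, not the analysis used in the paper: the function names, the calibration readings and the straight-line calibration model are all hypothetical, and no dilution factor for the Na2CO3 addition is applied.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rate_from_fluorescence(signal, slope, intercept, minutes):
    """Convert an end-point reading into product formed per minute."""
    product_conc = (signal - intercept) / slope
    return product_conc / minutes


# Hypothetical calibration series: 4-methylumbelliferone concentration
# (µM) against fluorescence (arbitrary units). Invented numbers.
cal_conc = [0.0, 1.0, 2.0, 4.0]
cal_signal = [50.0, 1050.0, 2050.0, 4050.0]

slope, intercept = fit_line(cal_conc, cal_signal)

# A hypothetical 30 minute end-point reading of 3050 units.
rate = rate_from_fluorescence(3050.0, slope, intercept, minutes=30)
```

With real data the calibration standards would need to be read in the same plate, volume and carbonate concentration as the quenched reactions for the slope to apply.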
Intact MS of HiCel7B bound to different inhibitors was performed according to McGregor et al. 19 Briefly, enzyme was diluted to 0.1 mg mL−1 (~2.2 µM) in 20 mM sodium phosphate buffer pH 7. Compounds 1, 5, or 13 were added to a final concentration of 5 µM and incubated at 25 °C. Samples taken at 1 hour were diluted with 4 volumes of 1% formic acid, 10% acetonitrile and analysed. Additional experiments with 5 were performed using different concentrations of inhibitor and enzyme as indicated in the text.
Inhibition kinetics for compounds 1, 5, and 13 acting on HiCel7B were measured at 25 °C using 4MU-GG following a method described previously. 19 Briefly, enzyme was diluted in 50 mM sodium phosphate buffer pH 7. Substrate was dissolved in DMSO to give a 10 mM stock which was diluted with ultrapure water. Inhibitors were dissolved in and diluted with ultrapure water, with the exception of 13, which was dissolved in DMSO to give a 5 mM stock which was diluted with ultrapure water. The enzyme and substrate concentrations used in the continuous inhibition assays were 10 ng mL−1 (~220 pM) and 50 µM, respectively. The KM value for the interaction of HiCel7B with 4MU-GG under the assay conditions (corrected for inner filter effect) was measured to be 76 µM (Fig. S1, ESI†) and this was used as a correction factor to determine the KI values in Table 1 from the apparent KI determined from fitting of kapp vs.
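The substrate correction described in this paragraph can be sketched as follows. The text does not spell out the equations, so the standard forms for a covalent inhibitor competing with substrate are assumed here: a hyperbolic dependence of kapp on inhibitor concentration, and KI = KI,app / (1 + [S]/KM). The function names and the example apparent KI are hypothetical; only the 50 and 76 values for [S] and KM come from the text, and both must be in the same units.

```python
def k_app(inhibitor_conc, k_inact, ki_app):
    """Apparent inactivation rate constant at a given inhibitor
    concentration (assumed hyperbolic model)."""
    return k_inact * inhibitor_conc / (ki_app + inhibitor_conc)


def correction_factor(substrate_conc, km):
    """Factor by which a competing substrate inflates the apparent KI."""
    return 1.0 + substrate_conc / km


def true_ki(ki_app, substrate_conc, km):
    """Apparent KI corrected for substrate competition."""
    return ki_app / correction_factor(substrate_conc, km)


# Substrate concentration and KM as stated in the text (same units).
factor = correction_factor(50.0, 76.0)

# Hypothetical apparent KI of 100 (same units as [S] and KM).
corrected = true_ki(100.0, 50.0, 76.0)
```

Under these assumptions the correction lowers every apparent KI by the same factor of about 1.66, so it changes the absolute KI and ki/KI values but not the ranking of the inhibitors.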

Enzyme crystallisation, diffraction, and structure solution
Deglycosylated HiCel7B was desalted into 20 mM pH 8 Tris-HCl buffer and concentrated to 12 mg mL−1 using a 30 kDa MWCO centricon. Building on previous reports, 20,21 crystallisation conditions were re-screened using the PACT Suite and AmSO4 Suite (Qiagen) crystallisation screens. High quality tetragonal bipyramidal crystals grew consistently at 20 °C from a 2 : 1 mixture of protein solution : well solution, where well solution was 0.15 M sodium citrate, 0.8 M ammonium sulfate, 1 M lithium sulfate (Fig. S2, ESI†). Crystal soaks were performed in a solution composed of 0.1 mM ligand in mother liquor for 5 hours at 20 °C prior to transfer into mother liquor supplemented with 20% glycerol and cryo-cooling in LN2.
Crystals were diffracted at Diamond Light Source (Harwell, UK) on beamline I03 at a wavelength of 0.9762 Å and automatically processed using the Xia2 22 pipeline with 3dii. Computation was carried out using programs from the CCP4 suite 23 unless otherwise stated. All crystal structure figures were generated using PyMOL (Schrödinger). Data collection and processing statistics for all structures are given in Table S1 (ESI†).
Data for HiCel7B bound to compound 1 were collected to 1.88 Å. Data were also collected out to 1.2 Å in a higher space group (P4₂2₁2) following a soak with 13, though the structure was found to be unliganded. The structure of 1-bound HiCel7B was solved in the P4₁22 space group by molecular replacement using Phaser 24 with the known structure (PDB ID: 2A39) as the search model. Ligand 1 was built using the existing restraints for β-D-glucose (BGC) and cyclophellitol (YLL) with Coot, 25 and structures were refined by alternating rounds of manual model building and density refinement using Coot and REFMAC5, 26 respectively.

In-gel fluorescence
The pH-labelling profile of HiCel7B reacting with the probe was measured by combining 10 µL of 1 mg mL−1 HiCel7B in 200 mM SPG buffer prepared at various pH values with 10 µL of 10 µM 14 in quadruplicate. The reactions were incubated at 25 °C for 10 minutes prior to the addition of 8 µL of 4× SDS-PAGE loading dye and heating to 95 °C for 2 minutes. 10 µL of each reaction was separated on a 10% SDS-PAGE gel prior to imaging using the Cy5 laser/filter settings on a Typhoon 5 scanner (GE Healthcare). Bands were integrated using ImageQuant (GE Healthcare).
Secretome staining was performed using two aliquots of 20 µL xylan-grown Aspergillus niger secretome (day 4 samples prepared as described previously 16 ). To each was added 5 µL of 0.5 M pH 5 McIlvaine buffer. To one was then added 5 µL of 60 µM 14 and to the other was added 5 µL of 60 µM 19 (10 µM final ABP concentration). These were incubated for 30 minutes at 37 °C. The reactions were then split in two and one half (15 µL) was diluted with 5 µL of water. The other half (15 µL) of the reaction with 19 was then supplemented with 5 µL of 40 µM 14 and the other half of the reaction with 14 was supplemented with 5 µL of 40 µM 19 (10 µM final ABP concentration). The reactions were incubated for a further 30 minutes at 37 °C before being diluted with 8 µL of 4× SDS-PAGE loading dye, separated on a 4-20% SDS-PAGE gel (Bio-Rad) and imaged for fluorescence using the Cy2 and Cy5 laser/filter settings on a Typhoon 5 scanner (GE Healthcare).
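The stated final probe concentrations in the secretome staining can be checked with a simple dilution calculation (C1V1 = C2V2). This is a minimal sketch: the helper name is hypothetical, and the only assumption is that all volumes share one unit and all concentrations share another, so the result is in the same unit as the stock concentration.

```python
def final_conc(stock_conc, stock_vol, total_vol):
    """Concentration after adding stock_vol of stock to reach total_vol."""
    return stock_conc * stock_vol / total_vol


# First labelling step: 20 secretome + 5 buffer + 5 probe stock at 60.
first_step = final_conc(60.0, 5.0, 20.0 + 5.0 + 5.0)

# Second labelling step: 15 of the first reaction + 5 probe stock at 40.
second_step = final_conc(40.0, 5.0, 15.0 + 5.0)

# The first probe is itself diluted by the second addition.
first_probe_after_second = final_conc(first_step, 15.0, 15.0 + 5.0)
```

Both additions give a final concentration of 10, matching the stated value, while the probe added first drops to 7.5 during the second incubation.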

Biotin-avidin enrichment proteomic analysis
Five 1 mL aliquots of xylan-grown Aspergillus niger secretome were thawed from −80 °C, centrifuged at 12 000 × g for 15 minutes to remove particulate and combined. 0.5 mL of this was then subsampled into 9 separate Lo-Bind 2.0 mL tubes. To three tubes was added 55 µL of 1 mM compound 1 in ultrapure water, and to the rest 55 µL of ultrapure water. All samples were incubated for 1 hour at 37 °C prior to the addition of 60 µL of compound 15 in 10% DMSO to the samples treated with 1 and three of the samples not treated with 1. 10% DMSO was added to the remaining three samples. All of the samples were incubated for a further 2 hours at 37 °C. Proteins were denatured by heating to 95 °C for 5 minutes following the addition of 70 µL of 10× denaturing buffer (500 mM Na-HEPES, pH 7.5, 50 mM DTT, 5% SDS). Once cooled to RT, thiols were alkylated by the addition of 70 µL of 0.25 M iodoacetamide and incubation in the dark for 30 minutes. Samples were transferred to 5 mL Eppendorf tubes and proteins were precipitated by the addition of 3.2 mL of chilled acetone followed by incubation at −20 °C for 1 hour. Proteins were collected by centrifugation (14 000 × g for 1 minute) and the supernatant was discarded. The pellet was washed with 3 mL of cold acetone and air-dried. The pellet was then resuspended in 50 µL of 8 M urea, 10 mM HEPES, pH 7.2 and diluted with 150 µL of 0.05% SDS in phosphate-buffered saline (PBS). This was shaken overnight at 20 °C to dissolve. The samples were then diluted with a further 200 µL of 0.05% SDS in PBS and centrifuged to collect any insoluble residue. The supernatant was transferred to a 2 mL Eppendorf tube and mixed with 25 µL of Pierce Avidin Agarose beads (Thermo Fisher Scientific) which had been washed twice with PBS. Following 3 hours of mixing by inversion, beads were collected by centrifugation for 2 minutes at 2500 × g.
The supernatant was removed and the beads were washed with 500 µL of 0.5% SDS in PBS once, 500 µL of 2% SDS at 65 °C for 10 minutes once, 500 µL of 0.5% SDS in PBS again, then 500 µL of 2 M urea followed by two washes with 500 µL of ultrapure water. The beads were finally resuspended in 20 µL of on-bead digestion buffer and trypsinisation, StageTip desalting, LC-MS/MS data acquisition, and data processing were performed as described previously. 27 Peptides were identified by searching against a database of A. niger NRRL3 proteins 28 supplemented with streptavidin, avidin, yeast enolase and trypsin. The combined search results were filtered for a minimum of two unique peptides with a false-discovery rate of 4%. Label-free quantification was performed using Progenesis QI (Waters). Following chromatographic alignment, peaks were integrated and assigned. Protein abundance was estimated using the integrated intensity of non-conflicting peptides. Results of this analysis for all identified proteins can be found in Table S2 (ESI†).
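The two-unique-peptide filter and the normalisation to the yeast enolase internal standard mentioned in the Fig. 3B caption can be sketched as follows. This is a toy illustration, not the Progenesis QI workflow: the class, function names, protein names and intensities are all hypothetical, and false-discovery-rate filtering is not modelled.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    name: str
    unique_peptides: int
    intensity: float  # summed intensity of non-conflicting peptides


def filter_hits(hits, min_unique_peptides=2):
    """Keep proteins identified by at least the minimum unique peptides."""
    return [h for h in hits if h.unique_peptides >= min_unique_peptides]


def normalise_to_standard(hits, standard_name):
    """Express each protein intensity relative to the internal standard."""
    standard = next(h for h in hits if h.name == standard_name)
    return {
        h.name: h.intensity / standard.intensity
        for h in hits
        if h.name != standard_name
    }


# Invented example values for one sample.
hits = [
    ProteinHit("protein_A", 5, 8.0e6),
    ProteinHit("protein_B", 1, 3.0e6),  # fails the two-peptide filter
    ProteinHit("yeast_enolase", 7, 2.0e6),
]

kept = filter_hits(hits)
relative = normalise_to_standard(kept, "yeast_enolase")
```

Normalising each sample to the same spiked standard is what allows the intensities of the no-probe, competition and pulldown samples to be compared on one scale.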
To gain access to cellobiose-configured ABPs, 4-deoxy-4-azidothioglucoside donor 9 was synthesized. The methods are similar to a published synthesis of 4-deoxy-4-fluoro-thioglucoside donors. 30 The axial 4-OH of partially protected methyl α-D-galactopyranoside 6 was activated as a triflate and substituted by sodium azide, leading to 7. Acid-catalyzed displacement of the anomeric methoxy group afforded anomeric acetate 8. Introduction of the anomeric thiophenol yielded donor 9.
The glycosylation reaction was improved compared to that employed in the inhibitor synthesis. Application of a preactivation protocol (Tf2O/Ph2SO) circumvents the use of the relatively high temperatures and long reaction times required to activate this type of donor using NIS/TfOH. It also allows the activation of the donor to take place without the presence of the acid-labile epoxide. Disaccharide 10 was obtained in 64% yield without the use of a large excess of donor. Unreacted acceptor (2) was also recovered, indicating the stability of the epoxide functionality under these conditions. Increasing the amount of donor (9) led to diminished yield and complex mixtures. This was presumably due to the reaction of the epoxide in the product with the excess activated donor. 31 Following the synthesis of disaccharide 10, the benzoyl esters were removed with NaOMe, affording 11. Staudinger reduction of the azide followed by benzyl removal under Birch conditions afforded fully deprotected 12.
Azide-terminated triethylene glycol t-butyl ester 16 32 was deprotected using trifluoroacetic acid, and DIC/DMAP-mediated esterification with pentafluorophenol afforded activated ester 17. The amine in 12 was selectively acylated with 17, yielding probe 13 following semi-preparative HPLC purification. Cy5-labeled probe 14 was obtained after copper-catalyzed click reaction of 13 with Cy5 alkyne. Biotin-labeled probe 15 was synthesized in one step from 12 by amide bond formation with biotin-terminated spacer 18, obtained from 16 in 3 steps. BODIPY green-labeled β-glucosidase probe 19 was obtained by methods developed for the previously reported BODIPY red variant using BODIPY green alkyne. 33,34

Testing potential cellulase inhibitors and probes with HiCel7B
Humicola insolens Cel7B (HiCel7B) was chosen as a model β-D-glucanase since it is well-characterized, has good hydrolytic performance with chromogenic substrates, can be readily crystallised, and has been studied in our lab previously. 20,21,35 Compound 1 proved to be an efficient covalent inhibitor of HiCel7B, with a ki/KI of 450 M−1 s−1 (Fig. 1 and Table 1). Intact MS confirmed complete, single labelling after 60 minutes at 25 °C (Fig. 2A). These kinetics compare favourably with the reported requirement to incubate F. oxysporum EG I with 8.25 mmol of 3,4-epoxybutyl β-D-cellobioside for 3 hours at 40 °C to achieve complete inhibition. 36 The addition of another β-1,4-linked glucose residue to the non-reducing terminus to give 5 improved the performance of the inhibitor roughly 7-fold (Table 1 and Fig. S3, ESI†); however, intact MS with 2.2 µM enzyme and 5 µM inhibitor revealed minimal labelling (Fig. S4, ESI†). Treatment with 5 gave small peaks with mass differences of both ~338 Da (equivalent to addition of 1) and ~500 Da (expected). Soaking HiCel7B crystals with 0.1 mM 5 also gave an unliganded enzyme structure.
Repeating the intact MS experiment with a higher inhibitor concentration (50 µM) resulted in more overall labelling, but still a dominant mass difference of ~338 Da. Lowering the inhibitor (5 µM) and enzyme concentrations (0.5 µM) gave overall weaker signal, showing incomplete labelling, with a mass difference attributable to 5 as the dominant modification (Fig. S4, ESI†). We interpret these results as indicative of an internal hydrolysis of 5 to give primarily a mixture of cellobiose and cyclophellitol, which is unreactive, and secondarily a mixture of 1 and glucose, which gives rise to the smaller observed mass difference. The observed concentration-dependence suggests that both hydrolytic processes have a higher KM than the KI of the interaction between HiCel7B and 5. Thus, the course of inhibition of HiCel7B with 5 is enzyme- and inhibitor-concentration-dependent, being affected by the KM of the two possible hydrolytic pathways and the KI values for 5 and 1.
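The assignment of intact-MS mass differences to the two possible adducts can be sketched as follows. The two expected shifts (~338 Da for 1, ~500 Da for 5) are the values quoted in the text. The function name, the 5 Da tolerance and the example observations are hypothetical choices for illustration. The ~162 Da gap between the two shifts corresponds to one anhydro-hexose residue, consistent with loss of a single glucose unit from 5.

```python
# Approximate expected mass shifts (Da) quoted in the text.
EXPECTED_SHIFTS = {"1": 338.0, "5": 500.0}


def assign_adduct(observed_shift, tolerance=5.0):
    """Return the compound whose expected shift is nearest to the observed
    one, or None if nothing lies within the (assumed) tolerance."""
    best = min(
        EXPECTED_SHIFTS,
        key=lambda name: abs(EXPECTED_SHIFTS[name] - observed_shift),
    )
    if abs(EXPECTED_SHIFTS[best] - observed_shift) <= tolerance:
        return best
    return None


# Difference between the two adducts: one glucose residue.
residue_gap = EXPECTED_SHIFTS["5"] - EXPECTED_SHIFTS["1"]

# Hypothetical observed shifts.
assignments = [assign_adduct(s) for s in (338.4, 499.0, 420.0)]
```

A real assignment would use exact calculated masses and an instrument-appropriate tolerance rather than these rounded values.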

Paper RSC Chemical Biology
To avoid the complication of internal hydrolysis, 13 was built on a β-1,4-glucosyl cyclophellitol inhibitor core.
Probe 13 turned out to be a strong inhibitor of HiCel7B, reacting with a ki/KI of 2100 M−1 s−1, comparable to that of 5. Intact MS confirmed complete single labelling at a 5 : 2.2 probe : enzyme stoichiometric ratio (Fig. S4, ESI†), confirming efficient labelling without hydrolysis. Modifying the azide handle of 13 with Cy5 gave compound 14, which is an effective probe for in-gel fluorescence-based detection of HiCel7B. A serial dilution of HiCel7B with 14 gave significant signal for the HiCel7B band from as little as 1.6 pg of enzyme per well (Fig. S5, ESI†). Probe 14 also facilitated measurement of the pH-labelling profile for HiCel7B (Fig. S6A, ESI†). Comparison to the pH-activity profile for the hydrolysis of 4MU-GG shows significant similarity between the pH-labelling profile and pH-activity profile, particularly above pH 5 (Fig. S6B, ESI†).
The structure of the HiCel7B complex with inhibitor 1
Soaking HiCel7B crystals with 1 yielded the complex shown in Fig. 2B. This confirmed that 1 binds in the expected manner, mimicking the ⁴C₁ conformation of two glucose units of cellobiose previously observed in the −1 and −2 subsites (Fig. 2C). 37 Binding of the inhibitor had no significant impact on the structure of the active site, inducing no conformational change following addition to the catalytic nucleophile. This is in spite of the epoxide oxygen forming an extremely close (2.3 Å) contact with the general acid/base. Extending beyond the −2 subsite, in which essential hydrophobic stacking with W347 and hydrogen bonding interactions with R108, Y147, and S345 are formed, the active site broadens significantly, suggesting a lack of a specific −3 subsite, possibly accounting for the weak selectivity between 5 and 13.

Enzyme detection and identification by in-gel fluorescence and biotin-avidin enrichment
To test the ability of the probe to stain β-D-glucanases in fungal secretomes without staining β-D-glucosidases, an A. niger xylan-grown secretome was stained with 14 or 19, followed by 19 or 14, respectively (Fig. 3A). 19-stained bands were present at molecular weights of ~45 kDa, ~60 kDa, ~100 kDa, and ~130 kDa. 14-stained bands were present at ~35 kDa, ~40 kDa (faint), ~60 kDa, and ~80 kDa. Notably, the gel shows minimal overlap between the staining of the two probes and no apparent preclusion of staining of one probe by the other. This suggests that among the retaining glycoside hydrolases secreted by A. niger, there is no hydrolysis of 14 and no cross-reactivity between 14 and 19. Thus, 4-O substitution appears to reduce cross-reactivity with exo-glycosidases compared to the previously reported xylanase probes. 16 Based on the known content of this secretome, we tentatively assigned the 60 and 80 kDa 14-stained bands as CbhA and CbhB respectively, two GH7 cellulases. 39 We also assigned the ~40 kDa band as EglB, 40 a GH5 endo-β-D-glucanase, and the ~35 kDa band as XynC, an abundant GH10 xylanase likely stained due to a loose enzyme-substrate specificity comparable to other fungal GH10 xylanases. 41,42 We tentatively assign the 100 and 130 kDa 19-stained bands as GH3 enzymes, possibly BglA, BglM, and XlnD, 38 which have been detected in this secretome previously. 16
To test the specificity of our cellulase probe architecture, we used the biotinylated derivative (probe 15) and performed a biotin-avidin pulldown enrichment prior to on-bead digestion, peptide identification, and label-free quantification. Three samples were prepared: a negative control, a probe 15-treated sample, and a sample treated with probe 15 after treatment with inhibitor 1. The only proteins from A. niger detected at elevated levels in the probe 15-treated samples relative to the negative control were CbhA, CbhB, and XynC, confirming our assignment of the major bands observed by in-gel fluorescence.
Label-free quantification showed a significant drop in CbhA and CbhB signal following treatment with inhibitor 1, but revealed no significant drop in XynC signal, suggesting that XynC was minimally inhibited (Fig. 3B). Thus, the probe architecture presented here shows specificity towards known GH7 cellulases within the context of a complex fungal secretome.
Conclusions
β-1,4-Glucanases form the foundations of lignocellulose-degrading systems. Because these enzymes are produced by a wide variety of saprotrophic, symbiotic, and pathogenic microorganisms, the ability to detect small quantities of them directly offers a variety of opportunities for activity-based protein profiling. We have shown here that a cellobiose-mimicking cyclophellitol derivative is a potent inhibitor of β-1,4-glucanases. We have also shown that extension of this inhibitor from the 4′ position with glucose enhances inhibitor binding, but facilitates inhibitor hydrolysis. Fortuitously, extension with a PEG linker enhances binding without facilitating inhibitor hydrolysis. Furthermore, the addition of a detection tag to the linker gives a potent and selective activity-based probe which can be applied to the direct detection of β-1,4-glucanases within a fungal secretome.

Data deposition
Coordinates and structure factors have been deposited with the PDB, with accession codes 6OYZ (HiCel7B soaked with 1) and 6YP1 (HiCel7B soaked with 13). Results from the label-free quantitation proteomic experiment have been deposited in the PRIDE database with the accession code PXD019930.

Author contributions
GJD and HO conceived the study. CB, EP, SPS, JJ, G van der M, and JDCC performed or supervised organic synthesis. BIF performed proteomics. JR, AFJR and GPvanW prepared Aspergillus cultures and secretomes. NGSM performed structural biology, intact mass spectrometry, kinetic measurements, and in-gel fluorescence. Manuscript preparation was led by NGSM with help from other authors.

Fig. 3 (B) Label-free quantification of proteins identified from the A. niger secretome following treatment with probe 15 and biotin-avidin enrichment. The "No Probe" sample is the negative control, the "Competition" sample was pre-treated with inhibitor 1, and the "Pulldown" sample was only treated with probe 15. The observed signal intensity was normalised to a 10 fmol ml−1 trypsinised yeast enolase internal standard added to each sample prior to analysis.

Conflicts of interest
There are no conflicts to declare.