Targeting of the Leishmania mexicana cysteine protease CPB2.8ΔCTE by decorated fused benzo[b]thiophene scaffold †

A potent and highly selective anhydride-based inhibitor of Leishmania mexicana cysteine protease CPB2.8ΔCTE (IC50 = 3.7 μM) was identified. The details of the interaction of the ligand with the enzyme active site were investigated by NMR biomimetic experiments and docking studies. Results of inhibition assays, NMR and theoretical studies indicate that the ligand acts initially as a non-covalent inhibitor and later as an irreversible covalent inhibitor by chemoselective attack of the CYS 25 thiolate on an anhydride carbonyl.


Introduction
Despite progress made in both the basic knowledge of many infectious diseases and drug discovery and development, tropical infectious diseases such as leishmaniasis, malaria, trypanosomiasis, and Chagas' disease continue to cause significant morbidity and mortality, predominantly in the less developed world. 1 Leishmaniasis is one of the major tropical diseases, ranking second only to malaria in mortality rate, 1 and is caused by protozoan parasites of the genus Leishmania. 2 Depending on the tropism, the disease is characterized by three different clinical forms: visceral, cutaneous, and mucocutaneous. The visceral form is fatal in 85-90% of untreated cases. No effective vaccine is available against leishmaniasis; therefore chemotherapy is the only effective way to treat all forms of this neglected disease. 3 Current therapy relies mainly on drugs that were developed decades ago, such as pentavalent antimony agents, amphotericin B, paromomycin and pentamidine. Severe toxic effects combined with the emergence of drug-resistant parasite strains have created an urgent and continuous need for new, safe and efficacious drugs.
Cysteine proteases (CPs) play crucial roles in the biology of parasites and their inhibition is emerging as an important strategy to combat parasitic diseases. 4,5 Leishmania (L.) protozoa express high levels of several classes of CPs belonging to the papain family that are crucial to parasite metabolism, reproduction and intracellular survival. 6 In particular, L. mexicana possesses three CPs of the papain superfamily, namely CPA and CPB, both of which are cathepsin L-like, and CPC, which is cathepsin B-like. 7 Inhibitors of the CPB isoenzymes have been shown to reduce the infectivity of L. mexicana both in vitro 6 and in vivo, 7 thus providing further evidence that these CPB isoenzymes are virulence factors. To the best of our knowledge, only a few reports describe the identification of novel CPB inhibitors, [8][9][10] as the CPBs are relatively unexplored drug targets. The known inhibitors can be divided into three broad groups: natural compounds (e.g. morelloflavones), metal complexes (such as tellurium, palladium, and gold derivatives) and CPB inhibitors endowed with an electrophilic warhead. 11,12 The latter group can be further subdivided into peptidic and non-peptidic CPB inhibitors, which are labelled as α-ketoheterocycles (1), 10 thiosemicarbazones (2), 8 semicarbazones (3) 8 and nitriles (4) 8 (Fig. 1), according to the type of warhead that interacts with the cysteine thiolate of the active site. 13,14 As part of an ongoing program targeting low molecular weight heterocyclic scaffolds, the activity of a fused benzo[b]thiophene derivative 5 (Fig. 1), whose synthesis 15 and properties as an ionophore 16 we have already reported, was tested against a panel of human and parasitic CPs. The panel included the mature recombinant form of the amastigote-specific isoform CPB2.8 of L. mexicana cysteine protease, expressed without the C-terminal extension and therefore designated mature CPB2.8ΔCTE, as well as rhodesain and human cathepsin-B and -L.
The biological targets were chosen on the basis of a combined PharmMapper 17 and wwLigCSRre 3D ligand-based 18 server approach, followed by refinement through a Tanimoto similarity index search. 19 This strategy allowed us to identify the bicyclic fused dihydropyrrolo[3,2-c]isoxazol-6-one core, belonging to a series of compounds acting as cysteinyl proteinase inhibitors, 20 as the best match to the tetrahydrofuro[3,4-b]pyrrole-4,6-dione core of compound 5.
From this biological screening, compound 5 turned out to be active against mature L. mexicana cysteine protease CPB2.8ΔCTE, with a high selectivity for the parasite's enzyme with respect to the highly similar human CPs.

Results and discussion
][23] The preliminary screening of 5 at 20 μM against mature CPB2.8ΔCTE produced a remarkable inhibition (≈90%) of the target enzyme (Table 1). No inhibition (n.i.) was detected in the screening against the other parasitic CP at our disposal (i.e. rhodesain, Table 1) and, more importantly, no significant cross-reactivity was detected towards highly similar human CPs (Table 1) such as cathepsin-B (n.i. at 20 μM) and cathepsin-L (only ≈30% inhibition at 20 μM). This suggests that 5 selectively interacts with the target, most likely owing to its highly conformationally constrained structure. Therefore, compound 5 was further evaluated by progress curve analysis 24 (Fig. 2 and 3) using a continuous readout.
Compound 5 has a non-peptidic tetracyclic scaffold with two electrophilic moieties, a peripheral cyclic anhydride and a lateral ketone group, both of which are warheads suitable for nucleophilic attack by the CYS 25 thiolate of the enzyme.
Inhibition of CPs can take place through several different mechanisms, including covalent inhibition and blockage or distortion of the catalytic active site by competing non-covalent inhibitors. Progress curves for the inhibition of mature CPB2.8ΔCTE by 5, measured over a time period of 10 min (Fig. 2), indicated a non-time-dependent mechanism, suggesting a non-covalent or fast-binding covalent reversible inhibition mechanism.
However, the chemical structure of our hit compound, with two reactive warheads that could undergo covalent modification by the active site thiol of the enzyme, suggests an irreversible binding mechanism.
In contrast to these 10 min assays, the measurement of enzyme activity over a time period of 30 min showed time-dependent inhibition. Thus, these two experiments did not unequivocally prove the inhibition mechanism. In order to clarify the mode of action, dialysis assays were performed (see below).
The assays with 10 min measurement time yielded an IC50 of 3.7 μM. With the substrate concentration used (10 μM) and the Km value of 5.0 μM, the Ki value was calculated as 1.2 μM (according to the Cheng-Prusoff equation for competitive inhibitors in a classic mode). 25,26 With the progress curves obtained over a 30 min time period (Fig. 3), first-order rate constants of inhibition (kobs values) were obtained, which were fitted to the inhibitor concentrations 24 yielding a second-order rate constant of inhibition, k2nd, of 4290 ± 5 M−1 s−1 with a Ki value of 1.36 ± 0.075 μM.
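The arithmetic behind the Ki quoted for the 10 min assays can be checked with a short sketch. It applies the Cheng-Prusoff relation for a competitive inhibitor, Ki = IC50 / (1 + [S]/Km), to the values given in the text. The last step is an assumption that is not stated in the text: if the second-order rate constant is read in the usual two-step sense, k2nd = kinact/Ki, then a first-order inactivation constant can be back-calculated from the two reported values. The function and variable names are illustrative only.

```python
def cheng_prusoff_ki(ic50_um, substrate_um, km_um):
    """Ki of a competitive inhibitor from its IC50: Ki = IC50 / (1 + [S]/Km)."""
    return ic50_um / (1.0 + substrate_um / km_um)


# Values quoted in the text, all in micromolar.
ki_10min_um = cheng_prusoff_ki(ic50_um=3.7, substrate_um=10.0, km_um=5.0)
print(f"Ki from the 10 min assays: {ki_10min_um:.2f} uM")

# ASSUMPTION (not stated in the text): k2nd = k_inact / Ki, the usual
# definition for two-step irreversible inhibition.
k2nd = 4290.0          # M^-1 s^-1, reported second-order rate constant
ki_30min_m = 1.36e-6   # M, reported Ki from the 30 min progress curves
k_inact = k2nd * ki_30min_m
print(f"Implied k_inact: {k_inact:.2e} s^-1")
```

Under these inputs the Cheng-Prusoff value rounds to the 1.2 μM given above; the implied kinact is only as reliable as the assumed relation.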
To clarify whether the inhibition is irreversible, dialysis assays were performed on mixtures of enzyme and inhibitor 5 in order to check whether the enzyme activity could be recovered after this treatment. As shown in Fig. 4, dialysis did not lead to the regeneration of enzyme activity, proving the irreversible inhibition, which fits with the order of reactivity of the electrophilic moieties of our compound. Altogether, these outcomes suggested that 5 inhibits the target by two inhibition pathways, a fast reversible one and an irreversible one.
Together, selectivity and potency make 5 eligible as a new lead structure for the development of anti-leishmanial agents.[29] NMR biomimetic experiments (Fig. 5) were performed in DMSO-d6/D2O (8:2) using N-(tert-butoxycarbonyl)-cysteine methyl ester to shed light on the interaction modes of 5 with the target cysteine protease. To this end, N-(tert-butoxycarbonyl)-cysteine methyl ester was added to a DMSO-d6/D2O solution of 5 in a nearly equimolar ratio. The reaction was monitored at different times directly in the NMR tube at a probe temperature of 25 °C, observing the shift and the appearance/disappearance of selected signals.
NMR studies reveal that 5 chemoselectively reacts with the cysteine sulfhydryl group by ring-opening of the anhydride moiety (Fig. 5A). In particular, the 13C NMR spectrum (Fig. 5B) shows the disappearance of the resonance a at 167.4 ppm (attributed to the carbon of the anhydride next to the ketone group by gHMBCAD experiments), a slight shift of the carbonyl signal b from 171.3 to 171.8 ppm (b′), and the appearance of two new signals at 181.6 ppm and 171.7 ppm, related to the newly formed thioester carbonyl group (a′) and to the CO2Me of the cysteine methyl ester, respectively. A slight downfield shift of the ketone carbonyl signal was observed, from 192.6 to 193.7 ppm.
These ndings support the irreversible conversion of ligand 5 into compound 8. 1 H NMR spectra show the presence of a complex signal pattern, due to the presence of more than one product, according to the multiplicity of the reactive sites.Nevertheless, an unambiguous set of signals are consistent with the formation of 8 as the main product.In particular, arrayed experiments (Fig. 5C) clearly revealed the upeld shi of the signal related to the hydrogen of the thienopyrrole from 4.9 (c) to 4.0 ppm (c 0 ), and the downeld shi of both the cysteine CH from 4.1 (d) to 4.2 (d 0 ) ppm and of the cysteine CH 2 from 2.6-2.8(e) to 2.8-3.1 (e 0 ) ppm.The structure of 8 was unambiguously assigned on the basis of COSY, gHSQCAD and gHMBCAD experiments.
To further rationalize the experimental findings, homology modeling, molecular dynamics (MD), and non-covalent and covalent docking experiments were carried out. In order to get the 3D structure of the protein for MD and docking studies, a structural model of active mature L. mexicana CPB2.8ΔCTE was generated with the aid of YASARA Structure software. 30 Starting from the target sequence, possible templates were identified by running three PSI-BLAST iterations to extract a position-specific scoring matrix from UniRef90, and then searching the PDB for a match (i.e. hits with an E-value below the homology modeling cutoff of 0.5). The best result was the crystal structure of cruzain bound to a vinyl sulfone analog of Wrr-483 (PDB ID 4PI3) with a resolution of 1.27 Å; this template was downloaded from the PDB_REDO database, 31 since re-refinement improved the structure quality Z-score by 0.090. The sequence identity of 60.3% and the sequence similarity of 74.8% between mature CPB2.8ΔCTE and cruzain were reasonable for the generation of a qualified homology model. Then a full unrestrained simulated annealing minimization was run for the entire model, and this fully refined model was accepted as the final one (see ESI † for model validation) and used as the starting point for the subsequent studies. The homology model thus obtained showed a Cα RMSD value of 0.42 Å compared to its template structure.
The inactivation of a protease by an active-site directed irreversible inhibitor usually proceeds by the rapid formation of a non-covalent reversible enzyme-inhibitor complex (E·I). Subsequently, in a slower chemical step, a covalent bond is formed with the enzyme to generate the enzyme-inhibitor adduct (E-I). 32
We therefore conducted the study following this sequence: (i) non-covalent docking of the ligand to the mature CPB2.8ΔCTE enzyme; (ii) 40 ns of MD simulation of the best pose obtained for the ligand-CPB2.8ΔCTE complex, to accommodate the ligand; (iii) non-covalent re-docking of the complex obtained from the averaged frames of the last 3 ns of MD simulation; (iv) 40 ns of MD simulation of the covalently docked ligand, based on the best re-docked pose; (v) covalent docking of the complex obtained from the averaged frames of the last 3 ns of MD simulation, to assess the best conformation for the free moiety of the ligand; (vi) 400 ns of MD simulation of the best non-covalent docked complex, to verify the stability and the correctness of the complex (Fig. S1-3 †). To validate that the homology model provides a suitable level of docking accuracy, we successfully docked two well-known CPB2.8ΔCTE inhibitors with different Ki values (Fig. S4 and Table S1 †).
Since compound 5 exists as a racemic mixture and is a tertiary amine, considering physiological conditions (pH = 7.2), we performed the first phase on both N-protonated enantiomers, using the most stable protonated diastereomer.
The enantiomer 5-H with configuration 3aS,4R,4aR,9bR,9cR (Fig. 6A) proved to be the best ligand, with a difference in the free energy of binding (ΔGb) of 0.6 kcal mol−1 and with a pose suitable for subsequent covalent docking, unlike its enantiomer. All further studies were therefore conducted on the (3aS,4R,4aR,9bR,9cR)-5-H enantiomer, referred to simply as 5-H. Moreover, to computationally determine the best electrophilic site of 5-H for the nucleophilic attack of cysteine, considering that this reaction could be under either orbital or charge control, we fully optimized compound 5-H in water at the DFT level of theory, to retrieve the shape and localization of the lowest unoccupied molecular orbital (LUMO) and the charge distribution under the natural bond orbital (NBO) scheme. As depicted in Fig. 6B, the LUMO is centered on the C a carboxylic moiety and the highest positive charge is located on the same atom; this is fully in accord with the result obtained by NMR biomimetic experiments.
The results of the non-covalent re-docking of 5-H showed that it binds the enzyme exposing both carbonyl and carboxyl moieties to the nucleophilic cysteine 25 residue, which is, as is well known, 32 involved in the formation of the covalent bond (Fig. 7A-C). As regards the water environment, no water molecule is directly involved in the stabilization of the ligand-receptor complex; however, one water molecule establishes a hydrogen bond with the ammonium group of the ligand (Fig. 7B). The calculated ΔGb of −7.3 kcal mol−1 corresponds to a Ki of 4.4 μM, which is in good agreement with the experimental value. Hydrogen bond interactions and their energies, according to Fig. 7A, are reported in the ESI (Table S2 †). The covalent docking performed starting from the best non-covalent docked pose showed that compound 5-H is even more deeply embedded in the active site, with both carboxylic moieties engaged in hydrogen bonds with the GLN 19, CYS 25, and TRP 185 residues (Fig. 7D). These results parallel the experimental ones: the ligand acts immediately as a non-covalent inhibitor, and then the system is engaged in the formation of the covalent bond. Moreover, both the non-covalent and covalent poses of the docked ligand reveal an empty groove in the protein (Fig. 7C and D) that can be further exploited to design an improved ligand.
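The conversion between a docking free energy and an inhibition constant follows the standard relation Ki = exp(ΔG/RT). The sketch below applies it to the −7.3 kcal mol−1 quoted above, and to the 0.6 kcal mol−1 difference between the two enantiomers reported earlier in the text. The temperature of 298.15 K is an assumption (the text does not state the value used for this conversion), and whether the authors derived their enantiomer ratio in this way is likewise an assumption. Function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ki_from_dg(dg_kcal_mol, temp_k=298.15):
    """Dissociation-type constant (in M) from a binding free energy."""
    return math.exp(dg_kcal_mol / (R_KCAL * temp_k))


def affinity_ratio_from_ddg(ddg_kcal_mol, temp_k=298.15):
    """Ratio of binding constants implied by a free-energy difference."""
    return math.exp(ddg_kcal_mol / (R_KCAL * temp_k))


# ASSUMPTION: T = 298.15 K for both conversions.
ki_um = ki_from_dg(-7.3) * 1e6
ratio = affinity_ratio_from_ddg(0.6)
print(f"Ki implied by -7.3 kcal/mol: {ki_um:.1f} uM")
print(f"Affinity ratio implied by 0.6 kcal/mol: {ratio:.1f}")
```

With these inputs the first value comes out close to the 4.4 μM given in the text, and the second falls a little below 3, in line with the modest eudysmic ratio mentioned in the Conclusions.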
Finally, to investigate in depth the interactions involved in the recognition process, we performed a computational study using the quantum mechanics/molecular mechanics (QM/MM) approach, the so-called ONIOM method, 33 on the fully solvated ligand-enzyme complex, including the residues surrounding the ligand in the quantum mechanical layer of the calculation. In this case, the LUMO of the entire system is centered only on the ligand but, compared to the result obtained on the isolated ligand, it now extends to the C b carboxylic moiety (Fig. 8). However, it is evident from Fig. 8 that the HOMO−10, centered on the sulfur atom of CYS 25, is in the correct position for the subsequent nucleophilic attack on C a. Interestingly, the complete optimization of the entire ONIOM system leads to the formation of the intermediate of the classical two-step acyl substitution reaction, which is computational evidence that the system spontaneously evolves towards the covalent interaction (see video in ESI †).

Enzyme assays
The preliminary screening against CPs was performed at an inhibitor concentration of 20 μM, using an equivalent amount of DMSO as a negative control. Recombinant enzymes (i.e. rhodesain and mature L. mexicana CPB2.8ΔCTE) were expressed as previously described. 34,35 Product release from substrate hydrolysis (Cbz-Phe-Arg-AMC, 10 μM or 20 μM) was determined continuously over a period of 10 min or 30 min, and the fluorescence was measured using an Infinite 200 PRO microplate reader (Tecan, Männedorf, Switzerland) at 30 °C with a 380 nm excitation filter and a 460 nm emission filter. The higher substrate concentration of 20 μM was chosen to ensure a linear increase of the fluorescent product AMC over time in the absence of the inhibitor during 30 min. Rhodesain was incubated at 20 °C with the test compound in a 50 mM sodium acetate buffer (pH 5.5) containing 10 mM 1,4-dithiothreitol (DTT), 5 mM EDTA, and 200 mM NaCl, whereas L. mexicana CPB2.8ΔCTE was incubated under the same conditions using a 50 mM sodium acetate buffer (pH 6.5) containing 5 mM DTT and 5 mM EDTA. Cathepsins-B and -L were purchased from Calbiochem and the assays were performed as previously described. 36 Also in this case Cbz-Phe-Arg-AMC was used as the substrate (80 μM for cathepsin-B; 5 μM for cathepsin-L).

Dialysis assays
Mature L. mexicana CPB2.8ΔCTE (13 mg mL−1) was incubated with 5 (50 μM) for 5 min. The reaction mixture was then subjected to dialysis against reaction buffer (5000-fold excess) for 3 h. The residual enzyme activities were determined by adding substrate (10 μM). The enzyme was subjected to the same procedure in the absence of inhibitor in order to confirm its stability under the dialysis conditions. Positive and negative control assays, i.e. dialysis with fully irreversible (E-64) and fully reversible inhibitors, 37 were also performed.

Preparation of ligands, DFT and ONIOM calculations
The 3D structures of the ligands were built using Winmostar (5.009) software 38 and all geometries were fully optimized, in the same software, with the semiempirical PM6 (ref. 39) Hamiltonian implemented in MOPAC2012 (15.156W). 40 The optimized structure of 5-H was further fully optimized in water within the DFT framework, at the M06-2X/cc-pVTZ level of theory, using the implicit polarizable continuum model for solvation. The electronic structure of 5-H was studied by the NBO method. All the optimizations were carried out by Berny's analytic gradient method as implemented in the Gaussian 09 suite. 41
The YASARA2 optimized geometry of the best re-docked pose obtained in point iv of the sequence described above was used as the starting point for the QM/MM calculations using a two-layer ONIOM method. 33 The TAO package 42 was used to prepare the system and the parmchk module of the antechamber package 43 to retrieve non-standard force field parameters. The ligand and 17 surrounding amino acid fragments (GLN 19, GLY 20, CYS 22, GLY 23, SER 24, CYS 25, CYS 63, ASP 64, GLY 65, GLY 66, ALA 142, PHE 145, MET 146, ASN 162, HIS 163, GLY 164 and TRP 185) were included in the QM region (349 atoms). The QM part of the system was described at the M06-2X/6-31G(d,p) level of density functional theory for optimization and at the M06-2X/cc-pVTZ level for single point calculations. The MM part of the system was described using the parm96 parameters of the AMBER force field, 44 as implemented in Gaussian 09, and includes the remaining amino acids and 6338 water molecules (only water molecules extending 3 Å from the surface of the complex were retained).

Molecular dynamics simulations
The molecular dynamics simulations of the mature CPB2.8ΔCTE/ligand complexes (based on the PDBs prepared as described above) were performed with the YASARA Structure package (15.3.8). 30 A periodic simulation cell with boundaries extending 10 Å from the surface of the complex was employed. The box was filled with water, with a maximum sum of all bumps per water of 1.0 Å, and a density of 0.997 g mL−1 with explicit solvent. YASARA's pKa utility was used to assign pKa values at pH 7.2, 45 and the cell was neutralized with NaCl (0.9% by mass); under these conditions the 5-H ligand is protonated at the pyrrolidinic N-Me. Waters were deleted to readjust the solvent density to 0.997 g mL−1. The YASARA2 force field was used with long-range electrostatic potentials calculated with the Particle Mesh Ewald (PME) method, with a cutoff of 8.0 Å. 44,46,47 The ligand force field parameters were generated with the AutoSMILES utility, 48 which employs semiempirical AM1 geometry optimization and assignment of charges, followed by assignment of the AM1BCC atom and bond types with refinement using the RESP charges, and finally the assignment of general AMBER force field atom types. Optimization of the hydrogen bond network of the various enzyme-ligand complexes was obtained using the method established by Hooft et al., 49 in order to address ambiguities arising from multiple side chain conformations and protonation states that are not well resolved in the electron density. 50
A short MD was run on the solvent only. The entire system was then energy minimized using first a steepest descent minimization to remove conformational stress, followed by a simulated annealing minimization until convergence (<0.01 kcal mol−1 Å−1). The MD simulation was then initiated, using the NVT ensemble at 298 K, with integration time steps for intramolecular and intermolecular forces of 1.25 fs and 2.5 fs, respectively. The MD simulation was stopped after 40 ns and, on the averaged structure of the frames of the last 3 ns, a second cycle of energy minimization, identical to the first, was applied.
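The idea of taking an averaged structure from the final 3 ns of a trajectory can be illustrated with a minimal sketch in plain Python. It is not the YASARA procedure itself: the trajectory here is toy data, the function name is hypothetical, and a real workflow would first superpose the frames on a reference before averaging coordinates.

```python
def average_last_window(times_ns, frames, window_ns=3.0):
    """Mean per-atom coordinates over frames in the final `window_ns`.

    times_ns: snapshot times in ns, in increasing order.
    frames:   one list of (x, y, z) tuples per snapshot, same atom order.
    ASSUMPTION: frames are already superposed on a common reference.
    """
    t_end = times_ns[-1]
    selected = [f for t, f in zip(times_ns, frames) if t >= t_end - window_ns]
    n_atoms = len(selected[0])
    return [
        tuple(sum(f[i][k] for f in selected) / len(selected) for k in range(3))
        for i in range(n_atoms)
    ]


# Toy one-atom trajectory sampled every 1 ns near the end of a 40 ns run.
toy_times = [36.0, 37.0, 38.0, 39.0, 40.0]
toy_frames = [[(float(i), 0.0, 0.0)] for i in range(5)]
mean_structure = average_last_window(toy_times, toy_frames)

# Number of 2.5 fs integration steps in 40 ns (1 ns = 1,000,000 fs).
n_steps = round(40 * 1_000_000 / 2.5)
print(mean_structure, n_steps)
```

In the toy data the snapshots at 37, 38, 39 and 40 ns fall inside the window, so the averaged x coordinate is the mean of 1, 2, 3 and 4.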

Docking protocol
Macromolecules and ligands, as obtained after MD simulation and energy minimization, were prepared with Vega ZZ 51 (3.0.3.18), assigning Gasteiger charges to the protein and AM1BCC charges to the ligand. Docking was performed with AutoDock Vina (1.1.2) software. 52 Because no water molecule is directly involved in complex stabilization, water molecules were not considered in the docking process. All protein amino acid residues were kept rigid, whereas all single bonds of the ligands were treated as fully flexible. The ligand box was centered at the coordinates x = 9.32, y = −5.95 and z = −7.50 with a grid size of 15 × 20 × 18 Å, and the exhaustiveness parameter related to the Lamarckian genetic algorithm was set to 15.
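As an illustration of how the box parameters above translate into a docking run, the following sketch writes an AutoDock Vina configuration file using Vina's standard configuration keys (receptor, ligand, center_x/y/z, size_x/y/z, exhaustiveness). The file names are hypothetical placeholders, and mapping the 15 × 20 × 18 Å grid onto x, y and z in that order is an assumption.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness):
    """Return the text of an AutoDock Vina configuration file."""
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx:.2f}",
        f"center_y = {cy:.2f}",
        f"center_z = {cz:.2f}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names; box values taken from the text.
config_text = vina_config(
    receptor="cpb28_dcte.pdbqt",   # placeholder receptor file
    ligand="ligand_5H.pdbqt",      # placeholder ligand file
    center=(9.32, -5.95, -7.50),
    size=(15, 20, 18),             # ASSUMPTION: order is x, y, z in angstrom
    exhaustiveness=15,
)
print(config_text)
```

Saved to a file such as conf.txt, the text could be passed to the program as `vina --config conf.txt`.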

Conclusions
In summary, we have discovered a potent and highly selective anhydride-based inhibitor of mature L. mexicana cysteine protease CPB2.8ΔCTE, with a Ki of ca. 1.3 μM, that possesses no significant cross-reactivity towards highly similar human CPs such as cathepsin-B and cathepsin-L. The details of the reaction mechanism of the ligand with the enzyme active site amino acids were investigated by NMR biomimetic experiments and docking studies. All results indicate that the ligand acts immediately as a non-covalent inhibitor and then as an irreversible covalent inhibitor by chemoselective attack of the CYS 25 thiolate on the C a anhydride carbonyl. Moreover, docking studies suggest that compound (3aS,4R,4aR,9bR,9cR)-5-H can be considered the eutomer, with a modest eudysmic ratio of 3. Finally, inspection of the molecular surface of the 5-H-CPB2.8ΔCTE complex points to fine tuning of the ligand substituents as a way to further improve both potency and selectivity.

Fig. 4 Histogram of the residual activity of mature CPB2.8ΔCTE upon dialysis. The enzyme was incubated with 5 (50 μM) for 5 min. The reaction mixture was then subjected to dialysis against reaction buffer for 3 h. The residual enzyme activities were determined at different times by adding the substrate (10 μM).

Fig. 5 NMR biomimetic experiments. (A) Scheme for the transformation of ligand 5 into the new adduct 8. (B) Selected region of the 13C NMR spectra. (C) Stacked plot of a selected region of the 1H NMR spectra over a period of 48 h. The resonances c′ and d were assigned on the basis of COSY, gHSQCAD and gHMBCAD experiments.

Fig. 7 (A) 2D sketch of the interactions of the non-covalent docked pose of 5-H. (B) 3D representation as in A, including surrounding water molecules. (C) 3D mapped surface of CPB2.8ΔCTE with 5-H non-covalently docked; CYS 25 faces both the carbonyl and carboxyl moieties. (D) Covalent docked pose of 5-H with CYS 25 bonded to the carboxyl moiety a (according to the numbering in Fig. 6A) of the anhydride.

Fig. 8 ONIOM 3D molecular model of the 5-H-CPB2.8ΔCTE complex with HOMO−10 and LUMO. Only the QM layer is shown; non-polar hydrogens are omitted for clarity.
NMR biomimetic experiments
NMR measurements were performed on a Varian 500 (500 MHz for 1H; 125 MHz for 13C) spectrometer at 25 °C using DMSO-d6/D2O (8:2) as solvent (D2O was phosphate buffered at pH 7.2). Chemical shifts are expressed in ppm with respect to TMS. The NMR sample was prepared by dissolving compound 5 (4 mg, 0.008 mmol) in 0.6 mL of DMSO-d6/D2O (8:2), then 2 mg of N-(tert-butoxycarbonyl)-cysteine methyl ester (0.008 mmol) were added. The biomimetic experiments were carried out directly in the NMR tube by recording 1H NMR spectra every 30 min for 48 h. Afterwards the 13C NMR spectrum was recorded, and finally gCOSY, gHSQCAD, and gHMBCAD experiments were run for the determination of the structure of compound 8.
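The "nearly equimolar" sample composition can be checked from the quantities above. The sketch below computes the molar mass of N-(tert-butoxycarbonyl)-cysteine methyl ester from its molecular formula and the resulting thiol to ligand ratio and ligand concentration. The formula C9H17NO4S is supplied here and does not appear in the text, so it should be read as an assumption; the atomic masses are standard rounded values, and the names are illustrative.

```python
# Standard atomic masses (g/mol), rounded.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def molar_mass(composition):
    """Molar mass in g/mol from a dict mapping element symbol to count."""
    return sum(ATOMIC_MASS[el] * n for el, n in composition.items())


# ASSUMPTION: Boc-Cys-OMe has the molecular formula C9H17NO4S.
boc_cys_ome = {"C": 9, "H": 17, "N": 1, "O": 4, "S": 1}
mw_thiol = molar_mass(boc_cys_ome)

mmol_thiol = 2.0 / mw_thiol     # 2 mg of the cysteine derivative
mmol_ligand = 0.008             # compound 5, as quoted in the text
thiol_to_ligand = mmol_thiol / mmol_ligand
ligand_conc_mm = mmol_ligand / 0.6 * 1000.0   # mmol per 0.6 mL, in mM

print(f"MW(Boc-Cys-OMe) = {mw_thiol:.1f} g/mol")
print(f"thiol : ligand = {thiol_to_ligand:.2f}")
print(f"[5] = {ligand_conc_mm:.1f} mM")
```

Under this assumption 2 mg of the thiol corresponds to slightly more than 0.008 mmol, consistent with the nearly equimolar ratio described in the text.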

Table 1 Screening against a panel of human and parasitic CPs and antileishmanial activity of 5. IC50 value includes standard deviation from two independent measurements, each performed in duplicate. a n.i. = no inhibition. b n.d. = not determined.