Machine intelligence decrypts β-lapachone as an allosteric 5-lipoxygenase inhibitor
† Electronic supplementary information (ESI) available: Supplementary figures, data and methods. See DOI: 10.1039/c8sc02634c

Using machine learning, targets were identified for β-lapachone.

Nor-Lapachol. Nor-lapachol was synthesized by the Hooker oxidation 16 and data are consistent with those reported in the literature. [17][18][19] Lapachol (4.84 g) and then 40 mL of THF were added to a 500 mL flask. Separately, a solution of 2.4 g of anhydrous Na2CO3 in 50 mL of H2O was prepared and added to the flask, forming a dark red solution. At 60 °C, 6 mL of 30% H2O2 were added (10 drops at a time) over 1 hour, until a pinkish solution was formed. The solution was cooled (0 °C to -2 °C) and concentrated HCl was then added dropwise under stirring until a white precipitate appeared. The mixture was left in the refrigerator for 2 hours and then filtered to obtain a white solid (70% yield). This substance was then dissolved in 32 mL of THF and a solution of 1.4 g of Na2CO3 in 59 mL of H2O was added. Then 20 mL of 25% NaOH, followed by 12 g of CuSO4·5H2O in 59 mL of H2O, were added under constant stirring. The solution was left in a water bath for 1 hour and 45 minutes, then filtered on Celite® (infusorial earth) to give a deep red mother liquor. The solution was acidified with concentrated HCl to form an orange precipitate, which was filtered and washed successively with distilled H2O until complete neutralization. Nor-lapachol was obtained as an orange solid (160 mg, 0.7 mmol, 70% yield); mp 121-122 °C. 16

3-Arylamino-nor-β-lapachone.
Nor-lapachol (228 mg, 1.0 mmol) was dissolved in 25 mL of dichloromethane, followed by the addition of 2 mL of bromine. A bromo intermediate precipitated immediately as an orange solid. Dichloromethane was added and the solvent was evaporated under vacuum to remove bromine. Aniline (2 mmol) was added and the mixture was stirred overnight. The crude reaction mixture was poured into 50 mL of water. The organic phase was separated and washed with 10% HCl (3 × 50 mL), dried over sodium sulfate, filtered, and evaporated under reduced pressure to yield a solid, which was purified by column chromatography on silica gel, eluting with a hexane/ethyl acetate gradient of increasing polarity (9/1 to 7/3). 20

3-Hydroxy-nor-β-lapachone.
Nor-lapachol (228 mg, 1.0 mmol) was dissolved in 25 mL of dichloromethane, and 2 mL of bromine was added. The bromo intermediate precipitated immediately as an orange solid. The reaction mixture was transferred to a separatory funnel and extracted with sodium bisulfite (3 × 10 mL) to form the hydroxylated product.

Biophysical methods

1.3.1 Dynamic light scattering
Dynamic light scattering (Zetasizer Nano S, Malvern, UK) was used to determine the colloidal aggregation potential and kinetic solubility of the compound. Particle sizes were measured at 25 °C. Water solubility was measured as described elsewhere, with successive measurements within 60 minutes. 28 A 100 mM stock solution of β-lapachone was prepared in DMSO, followed by dilution in deionized water to obtain an analyte solution of 100 µM (0.1% DMSO). Colloidal aggregation was measured through sequential dilutions. Solubility was assessed at 25 µM after 0 and 30 minutes.
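The dilution arithmetic in this protocol can be checked with a short sketch. The function name is hypothetical, and the sketch assumes that the stock is made in neat DMSO and diluted into water in a single step, so that the DMSO volume fraction equals the dilution ratio; the protocol does not spell this out.

```python
def dmso_percent(stock_molar: float, final_molar: float) -> float:
    """Return the final DMSO content (% v/v) after a single-step dilution.

    Assumption: the stock is prepared in neat DMSO and diluted directly
    into water, so the DMSO volume fraction equals final/stock.
    """
    if stock_molar <= 0 or final_molar <= 0:
        raise ValueError("concentrations must be positive")
    if final_molar > stock_molar:
        raise ValueError("final concentration cannot exceed the stock")
    return 100.0 * final_molar / stock_molar


# 100 mM stock diluted to a 100 µM analyte solution (values from the protocol)
print(dmso_percent(100e-3, 100e-6))
```

Under these assumptions the 1:1000 dilution corresponds to roughly 0.1% DMSO, in line with the value quoted in the text.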

Intrinsic tryptophan fluorescence
5-Lipoxygenase (Cat No. ab114310, Abcam) was concentrated to 0.5 µM in buffer (50 mM Tris-HCl, pH 8.0, 5 mM CaCl2). A stock solution of β-lapachone was prepared in DMSO and added to the enzyme solution over a concentration range of 0-10 µM. The final DMSO concentration was kept under 0.1%. Fluorescence measurements were performed using a 1 cm pathlength quartz cuvette. Spectra were collected on an Edinburgh Instruments FLS920 Series Fluorescence Spectrophotometer at 25 °C, with excitation at 295 nm and emission scanned from 310 to 450 nm. All assays were carried out in triplicate.
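One simple way to read such spectra is to compare the wavelength of the emission maximum with and without ligand. The sketch below is illustrative only: the function names and the spectra are made up, and it is not the analysis pipeline used for the reported data.

```python
def emission_maximum(wavelengths_nm, intensities):
    """Return the wavelength (nm) at which the emission intensity peaks."""
    if len(wavelengths_nm) != len(intensities) or not wavelengths_nm:
        raise ValueError("need two non-empty sequences of equal length")
    peak_index = max(range(len(intensities)), key=intensities.__getitem__)
    return wavelengths_nm[peak_index]


def peak_shift_nm(wavelengths_nm, free_enzyme, with_ligand):
    """Shift of the emission maximum; negative values indicate a blue shift."""
    return (emission_maximum(wavelengths_nm, with_ligand)
            - emission_maximum(wavelengths_nm, free_enzyme))


# Hypothetical spectra on a coarse grid spanning the 310-450 nm scan window
wavelengths = [310, 330, 350, 370, 390, 410, 430, 450]
free = [5, 40, 100, 70, 35, 15, 6, 2]
bound = [5, 38, 95, 68, 33, 14, 6, 2]
print(peak_shift_nm(wavelengths, free, bound))  # 0: the maximum does not move
```

Real spectra would normally be smoothed or fitted before locating the maximum; the coarse grid here only keeps the example short.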

Biology

1.4.1 Kinase assays
The ExpresS Diversity Kinase Profile (Ref. P10) was performed at Cerep, SA (Celle l'Evescault, France) on a fee-for-service basis, and as described in Table S1. [...] 30) were assayed as described in Table S2. [...] 33) were assayed as described in Table S3.

Minimum inhibitory concentration (MIC) assays
Escherichia coli K12 and Staphylococcus aureus ATCC 25923 were grown overnight at 37 °C and re-inoculated in 24-well plates containing 2.5 mL of Luria Bertani medium (LB) to give an optical density of ~0.01 at 600 nm. Stock solutions of β-lapachone and lapachol were prepared in DMSO (1% final concentration) and added to the cell suspensions to obtain final concentrations between 5 and 210 µM. Wells containing a growth control (cell suspensions with 1% DMSO) and a sterile media control were also prepared. The plates were incubated for 18 h at 37 °C and 90 r.p.m. The concentration of compound in the first well in the series that presented no sign of visible growth was reported as the MIC. The OD600 of the cultures was also measured. All the MIC wells were serially diluted in phosphate buffered saline (PBS) and plated onto LB agar plates. Growth was evaluated after 24 h of incubation at 37 °C to assess the minimum bactericidal concentration (MBC).
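The MIC read-out rule described above can be sketched as follows. The function name, the data structure and the example concentrations are hypothetical; only the rule itself (the lowest tested concentration without visible growth) is taken from the text.

```python
def minimum_inhibitory_concentration(wells):
    """Return the lowest tested concentration with no visible growth.

    `wells` maps concentration (µM) -> True if visible growth was observed.
    Returns None when every well in the series shows growth.
    """
    for concentration in sorted(wells):
        if not wells[concentration]:
            return concentration
    return None


# Hypothetical read-out for a dilution series within the 5-210 µM range
observed = {5: True, 10: True, 25: True, 50: False, 105: False, 210: False}
print(minimum_inhibitory_concentration(observed))  # 50
```

Returning `None` rather than the top concentration keeps "no inhibition within the tested range" distinct from a measured MIC.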

Cancer cell assays
HL-60 cells were routinely cultured in RPMI medium supplemented with 10% FBS, 1% Penstrep and 1% HEPES at 5×10^5 cells/mL. 5-LO overexpression was stimulated by initially starving the cells in medium with 1% FBS for at least 2-3 passages, followed by growing the cells in 1.5% DMSO for another 3-4 days. 40 For intracellular staining of 5-LO, 10^6 cells were collected and concentrated in 50 mL PBS.
Cells were fixed with 100 mL BD fix buffer while vortexing. Cells were thoroughly washed with PBS and permeabilized using Perm Buffer, followed by a one-hour incubation with primary anti-5-LO antibody (AB376, Merck Millipore). After the required incubation time, cells were washed and incubated for another hour with goat anti-rabbit Alexa Fluor 488 antibody. Cells were then washed, resuspended in PBS and acquired on an LSRFortessa™ flow cytometer (BD Biosciences, USA) with a 488 nm laser, a 505 nm long-pass filter and a 530/30 nm band-pass filter (for FITC detection). Data were analyzed using the FlowJo software. HL-60 cells with and without DMSO stimulation were seeded at a concentration of 5×10^5 cells/mL in a 96-well plate format. Cells were treated with varying concentrations of β-lapachone for 48 hours. Cell death was analyzed using a standard Alamar Blue assay. Data are presented after normalization to the vehicle control.
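Normalization to the vehicle control can be illustrated with a minimal sketch. The function name and the signal values are invented for illustration, and the sketch assumes no blank subtraction, which the text does not specify.

```python
from statistics import mean


def percent_of_vehicle(treated_signals, vehicle_signals):
    """Express each treated-well signal as a percentage of the mean vehicle signal.

    Assumption: raw signals are used directly, without blank subtraction.
    """
    vehicle_mean = mean(vehicle_signals)
    if vehicle_mean == 0:
        raise ValueError("vehicle control mean must be non-zero")
    return [100.0 * signal / vehicle_mean for signal in treated_signals]


# Hypothetical Alamar Blue fluorescence read-outs (arbitrary units)
vehicle = [1000, 1040, 960]
treated = [500, 250]
print(percent_of_vehicle(treated, vehicle))  # [50.0, 25.0]
```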

Supplementary data and discussion

2.1 Analyses of compound databases
Natural products are enriched in substructures often found in nuisance compounds, such as "frequent hitters" 41 and the so-called "pan assay interference compounds" (PAINS), 3,[42][43][44] which may afford intractable assay readouts and attrition. Despite the structural alerts, there is robust evidence that natural products provide less promiscuous target engagement profiles than structural alert-free synthetic small molecules. 45 Indeed, it has been recently shown by Bajorath and co-workers that PAINS may also bind specifically to drug targets. 46 For example, the troublesome quinone and catechol moieties are commonly featured in ligands in complex with proteins, as surveyed in the Protein Data Bank. 46 Moreover, the original PAINS study reported that 86 out of 370 quinones presented a "clean" profile, 47 suggesting that such compounds may be used as prototypes for medicinal chemistry programs. β-Lapachone may thus afford opportunities for drug discovery.
Besides the therapeutic potential of β-lapachone, we drew inspiration from Clemons and coworkers 45 who have shown that natural products are generally better starting points for optimization than synthetic small molecules due to decreased promiscuity. Interestingly, our analysis of the Clemons et al. 45 dataset shows that only 30% of natural products pass the rapid elimination of swill (REOS) 48 and PAINS filters, whereas 53% of the more promiscuity-prone synthetic molecules are structurally "clean". From substructural and 2D pharmacophore vantage points, 8 we found overlapping chemical spaces between REOS/PAINS-complying and REOS/PAINS-violating chemical matter (Figure S1a). Moreover, the drug-likeness of approved drugs, "clean" fragments in ZINC and REOS/PAINS-hitting fragment-like natural products, including 1, is broadly distributed. Our data thus suggest that drug-likeness is a poor metric for prioritizing chemical matter (Figure S1b). Furthermore, they suggest that eliminating structurally "ugly" compounds from screening libraries with the abovementioned filtering rules cannot be justified as a general approach. Of note, we computed that only ca. 50% of approved drugs are fully compliant with these filters. Proper hit validation, rather than a priori exclusion of chemical entities from screening assays, may therefore be more appropriate in select cases to avoid missed research opportunities.

Figure S1. REOS/PAINS substructure-containing natural products (NPs) as potential leads for development. a) Principal component analysis for visualization of fragment-like REOS/PAINS-free chemical entities in ZINC15 (gray) and REOS/PAINS-containing fragment-like NPs (red) using RDKit descriptors. b) Box plots of drug-likeness calculated with DataWarrior for FDA-approved drugs (white), REOS/PAINS-free fragment-like entities in ZINC15 (gray), REOS/PAINS-free fragment-like NPs (green) and REOS/PAINS-hitting fragment-like NPs (red). Outliers were excluded. Drug-likeness is significantly different between approved drugs and fragment-like NPs (two-sided Mann-Whitney test, p < 0.0001). Approved drugs: n = 1506; REOS/PAINS-free fragment-like NPs: n = 35376; REOS/PAINS-hitting fragment-like NPs: n = 35544.
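For readers unfamiliar with the test named in the caption, the following is a minimal, standard-library sketch of a two-sided Mann-Whitney U test. It uses the normal approximation with a tie correction and no continuity correction, which is an assumption made here for brevity; the software and settings behind the reported p-value are not stated, and the example data are made up.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, approximate two-sided p-value).
    No continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        tie_size = j - i
        tie_term += tie_size ** 3 - tie_size
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


# Hypothetical drug-likeness scores for two small groups
u_stat, p = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(u_stat, p < 0.05)
```

With sample sizes in the thousands, as in Figure S1b, the normal approximation is the usual choice; for very small samples an exact test would be preferable.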

Target prediction
We carried out drug target predictions with SPiDER. Importantly, known targets (e.g. DNA topoisomerase, 49 cyclooxygenase 50 ) were predicted retrospectively, supporting the appropriateness of our approach, while other methods underperformed in this particular case (Tables S6-11).

Table S8. Confident target predictions for β-lapachone with SEA.

Kinase, ion channel and enzyme screening
With drug target predictions in hand, we then screened β-lapachone and its isomer lapachol at 150 µM, i.e. where 50% effect equates to a ligand efficiency of 0.30 (Tables S12-16). Given the potent effects observed against 5-LO, we confirmed inhibition in concentration-response curves using cell-free assays with two different detection methods: i) direct measurement of 5-LO reaction products and ii) an indirect fluorescence detection method (Cerep, France). In the latter case, β-lapachone potently inhibited 5-LO (IC50 = 2.1 µM ± 0.23 log units, n = 2; LE = 0.44; Figure S2). No interference from auto-fluorescence of β-lapachone was detected in the 5-LO assay, corroborating the data obtained with the first method.

Controls - EP1 functional agonist: PGE2 EC50 = 6.9×10
Control - Dipyridamole IC50 = 1.6×10^-6 M (nHill = 1.3).

From a cheminformatics vantage point, the result is also important because β-lapachone is, on average, only weakly related at the substructural level to known 5-LO inhibitors. The most closely related 5-LO inhibitor presents a Tanimoto index < 0.30, which supports structural dissimilarity to β-lapachone (Figure S3a) and indicates that the ortho-quinone scaffold is not exploited as a motif in 5-LO inhibitors. Conversely, an analysis of topological pharmacophores leads to the opposite conclusion: β-lapachone lies in a region populated by other 5-LO inhibitors (Figure S3b), which provides a rationale for testing β-lapachone against 5-LO. Testing of β-lapachone in human neutrophils showed selectivity for 5-LO over 12- and 15-LO (Figure S4a), albeit with lower potency than in the cell-free 5-LO inhibition assays. Our data show that β-lapachone must be converted to the hydroquinone form for potent 5-LO modulation. A possible explanation for the lower potency is insufficient conversion in the native neutrophil environment. However, supplementing neutrophils with dithiothreitol reinstates potency similar to that obtained in cell-free assays, which corroborates our hypothesis (Figure S4b). Overall, the cell-free and whole-cell data were reproducible with three different sources of β-lapachone: two different synthetic routes devised by us, and one commercial source (Bide Pharmatech Ltd).
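The ligand efficiency (LE) values quoted in this section for the 150 µM screening concentration and the 2.1 µM IC50 can be reproduced approximately with the sketch below. It assumes the common approximation LE ≈ 1.4 × pIC50 / (number of heavy atoms), where 1.4 kcal/mol approximates RT·ln(10) near physiological temperature, and 18 heavy atoms for β-lapachone (C15H14O3). The exact formula used by the authors is not stated, so both the constant and the function name are assumptions.

```python
import math


def ligand_efficiency(ic50_molar, heavy_atoms, kcal_per_log_unit=1.4):
    """Approximate ligand efficiency in kcal/mol per heavy atom.

    Assumption: LE ≈ kcal_per_log_unit * pIC50 / heavy_atoms, with 1.4 kcal/mol
    standing in for RT*ln(10) near physiological temperature.
    """
    if ic50_molar <= 0 or heavy_atoms <= 0:
        raise ValueError("IC50 and heavy atom count must be positive")
    pic50 = -math.log10(ic50_molar)
    return kcal_per_log_unit * pic50 / heavy_atoms


HEAVY_ATOMS = 18  # β-lapachone, C15H14O3: 15 carbons + 3 oxygens

print(round(ligand_efficiency(150e-6, HEAVY_ATOMS), 2))  # 0.3
print(round(ligand_efficiency(2.1e-6, HEAVY_ATOMS), 2))  # 0.44
```

Using RT·ln(10) at 25 °C (about 1.36 kcal/mol) instead of 1.4 gives slightly lower values, so small differences from the quoted numbers depend on this choice.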

β-Lapachone-inspired chemical library
To further validate binding and engagement of 5-LO by β-lapachone, we built a focused library to probe structure-activity relationships and to ascertain a non-flat bioactivity landscape, since a flat landscape could be considered a flag for unspecific binding (Figure 3a). Generally, the in situ bromination of an appropriate starting material afforded the respective key intermediates, which were subsequently functionalized with the required nucleophilic species. 20,23,26 A range of inhibition potencies was obtained in cell-free 5-LO assays for compounds 2-8 (Figure 3a), supporting the importance of the substitution pattern for bioactivity and the specific, directed interactions of 1 with 5-LO. For example, the activity of the β-lapachone-inspired entities against 5-LO appears to be sensitive to ring contraction (e.g. β-lapachone vs. 4), stereogenic centre configuration (β-lapachone vs. 2-8) and potentially to desolvation/thermodynamic penalties (β-lapachone vs. 5). Although not probed in this study, it is conceivable that (R)- and (S)-configured molecules present different binding affinities to 5-LO. Molecular docking of (R)-4 and (S)-4 indeed suggests that diverging configurations can impact molecular recognition and bioactivity against 5-LO (Figure S5).

Mechanism of 5-LO inhibition by β-lapachone
To understand the molecular mechanism of 5-LO inhibition by β-lapachone, we carried out wash-out experiments (Figure S7a) to probe the reversibility of the binding interaction. In addition, because β-lapachone presents the structural requirements to chelate active-site iron, we performed competition assays with the natural substrate arachidonic acid. Our results show that binding of β-lapachone is non-competitive, suggesting an allosteric modulation mechanism (Figure S7b). Altogether, we present evidence that β-lapachone must be reduced to its hydroquinone form (e.g. through NQO1 in cancer cells) in order to modulate 5-LO (Figure S7c). Naturally, several mechanisms of anticancer activity may come into play. β-Lapachone may itself modulate hitherto unknown drug targets, while the reactive oxygen species generated in the redox cycle may also account for the phenotypic effects. Indeed, β-lapachone binds to 5-LO competitively with phosphatidylcholine, which is known to increase the catalytic activity of 5-LO (Figure S8). The predicted binding pose suggests no interaction with tryptophan residues. Because tryptophan residues display fluorescence under certain experimental conditions, we challenged our binding model by monitoring tryptophan fluorescence, wherein a blue shift indicates a binding interaction. Using purified human 5-LO supplemented with 1 mM dithiothreitol to ensure reduction of β-lapachone to the corresponding hydroquinone, we observed no shift in fluorescence at relevant binding concentrations (Figure S9). Hence, one may conclude that tryptophan residues do not intervene in the binding of β-lapachone to 5-LO and that no denaturation of 5-LO occurs at β-lapachone concentrations as high as 10 µM.
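For reference, the textbook Michaelis-Menten rate laws below illustrate why a substrate-competition experiment can distinguish the two mechanisms. They are a generic sketch, not the authors' fitted model. The symbols are introduced here for illustration: v is the initial rate, [S] the arachidonic acid concentration, [I] the inhibitor concentration and K_i the inhibition constant.

```latex
% Competitive inhibition: the apparent K_m increases, V_max is unchanged
v = \frac{V_{\max}\,[S]}{K_m\left(1 + \frac{[I]}{K_i}\right) + [S]}

% Pure non-competitive inhibition: V_max decreases, K_m is unchanged
v = \frac{V_{\max}\,[S]}{\left(1 + \frac{[I]}{K_i}\right)\left(K_m + [S]\right)}
```

In the competitive case, raising [S] overcomes the inhibition; in the non-competitive case it does not, which is the behaviour expected for an inhibitor acting at an allosteric site.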

Anticancer assays
To assess the importance of modulation of a given target in cancer, the most commonly employed model system uses siRNA to suppress expression of the gene of interest and then assesses cell survival in the silenced population. A statistically significant difference in cell viability between the gene-silenced cells and the wild-type cells usually provides initial proof-of-concept. The method, however, suffers from caveats: i) silencing is often not very efficient, and it is common to find only 60-70% of cells not expressing the protein of interest or showing reduced expression; ii) cell viability may be deeply affected by silencing genes. Herein we used the approach of overexpressing 5-LO. While transfection still suffers from similar success rates, through this approach one does not shut down potentially critical pathways but instead exacerbates them. Thus, cells with exacerbated activity for a given protein will be more sensitive to modulators if: i) the target is important for cancer cell survival and ii) inhibition of the protein of interest is relevant for the anticancer activity of the studied molecule. A similar approach has already been successfully followed to study DYRK3 biology.

Figure S13. 1H and 13C NMR spectra of β-lapachone recorded in CDCl3 (300 K).