Anthony
Tabet‡
abc,
Thomas
Gebhart‡
d,
Guanglu
Wu
a,
Charlie
Readman
a,
Merrick
Pierson Smela
e,
Vijay K.
Rana
a,
Cole
Baker
f,
Harry
Bulstrode
b,
Polina
Anikeeva
c,
David H.
Rowitch
b and
Oren A.
Scherman
*a
aMelville Laboratory for Polymer Synthesis, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK. E-mail: oas23@cam.ac.uk
bDepartment of Paediatrics, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, UK
cDepartment of Materials Science & Engineering and Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, USA
dDepartment of Computer Science, University of Minnesota, Minneapolis, MN, USA
eDepartment of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
fDepartment of Electrical Engineering & Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
First published on 22nd June 2020
Machine learning is a valuable tool in the development of chemical technologies but its applications into supramolecular chemistry have been limited. Here, the utility of kernel-based support vector machine learning using density functional theory calculations as training data is evaluated when used to predict equilibrium binding coefficients of small molecules with cucurbit[7]uril (CB[7]). We find that utilising SVMs may confer some predictive ability. This algorithm was then used to predict the binding of drugs TAK-580 and selumetinib. The algorithm did predict strong binding for TAK-580 and poor binding for selumetinib, and these results were experimentally validated. It was discovered that the larger homologue cucurbit[8]uril (CB[8]) is partial to selumetinib, suggesting an opportunity for tunable release by introducing different concentrations of CB[7] or CB[8] into a hydrogel depot. We qualitatively demonstrated that these drugs may have utility in combination against gliomas. Finally, mass transfer simulations show CB[7] can independently tune the release of TAK-580 without affecting selumetinib. This work gives specific evidence that a machine learning approach to recognition of small molecules by macrocycles has merit and reinforces the view that machine learning may prove valuable in the development of drug delivery systems and supramolecular chemistry more broadly.
Cucurbiturils are a class of symmetric macrocycles that have applications within drug delivery, biosensing, catalysis, and energy.6,7 These macrocycles have many advantages over their non-symmetric counterparts such as cyclodextrins, including temperature stability and robustness at acidic and basic pH values,6 such as those that occur naturally in physiology. The use of cucurbiturils to change the release kinetics or pharmacokinetics of drugs has been previously reported for chemotherapies such as temozolomide.8,9 Cucurbituril acts as a competitive substrate and binds to the active ingredient. This binding can reduce the effective concentration and increase the half life of biologic and hydrophobic small molecule drugs.10 Predicting whether a molecule will bind to any cucurbituril, in particular cucurbit[7]uril (CB[7]), a priori could be a valuable tool in developing new chemical or material systems.11
In this work, we show that support vector machines (SVMs) can be used to provide utility towards predicting 1:
1 complexation of small organic molecules with CB[7]. Finding no comprehensive, compiled body of data that could be used for regression, we first created one using much of the published literature on small molecules that bind to CB[7].6 We also report the utility of this regression in predicting the binding of two new small molecule drugs that have received promising results in the clinic and verify these predictions with experimental data. Finally, we provide a qualitative example of the potential use of these predictions in developing cocktail drug therapies against a pediatric low grade glioma cell model.
With 194 molecular samples consisting of 17 experimental and structural features, the constructed data set is small sample-wise with a relatively high-dimensional feature space. Without heavily sub-setting the feature space and losing potentially integral feature-interaction information, training a model to find a subspace parameterising the underlying binding dynamics is difficult without strong inductive biases provided a priori. We instead looked to kernel methods to provide a more sample-efficient learning paradigm that can still capture the dynamics of the feature space through the lens of properly-defined sample similarity. A mathematical background for kernel methods is provided (ESI†).
Kernel featurisation provides a non-linear representation of the samples within some inner-product space. Support Vector Machines (SVMs) are a family of models that can capitalize on this expressive kernel structure by representing examples as points in this space and determining an optimal but well-behaved mapping that best describes the differences between individual points. Although originally designed for classification tasks, SVMs have a natural extension to regression. Given the mathematical framework developed (ESI†), we explored the capacity of SVMs to predict the equilibrium binding constants of published molecules.6 We performed a search over features to determine the best-performing subset of the feature space in coordination with grid search over hyperparameters within the model pipeline, namely γ, ε, C, |θ|, σ, and all permutations of addition or multiplication of each kernelised feature. The optimal hyperparameters (ESI†) were chosen based on 5-fold cross-validation with respect to mean absolute error.
The environmental data were largely incomplete due to the fact that many experiments in the literature do not report at least one and often several of the environmental parameters such as temperature or pH. For samples missing this information, we assumed temperatures of 298.15 K, and pH values of 7. We also set other values, such as salt concentration, to zero. These assumptions resulted in an environmental feature set that was sparse and largely uniform (Fig. 3, Supplementary data set, ESI†). Viewing environmental factors as a single feature vector, we also explored how the addition of environmental information affected prediction performance.
Because the available environmental data lack diversity and are unnaturally uniform across samples, their usage as an additional feature often masked the underlying predictive capacity of the structural features. This process of feature reduction resulted in an optimal model consisting of 4 features derived from DFT calculations (1), (3), (4), and (6) only (Fig. S18, ESI†). These results are intuitive: both the size and electron distribution of small molecule organics are key in determining binding to cucurbit[7]uril.6 Environmental parameters including salt concentration are known to affect the binding of some molecules.6 However, the extent of changes is less than the error of our model, so environmental parameters were not considered going forward.
Optimised orientation was a large driver of model accuracy in predicting logK (ESI†). In pursuit of better intuition regarding model performance, the equivalent SVM classifier was trained using the same process as above. The confusion matrix in Fig. S19 (ESI†) is largely diagonal, with a bias towards over-predicting samples with a low value for log
K. Also of interest was the extent to which the preprocessing methods provided separation between samples. Fig. S17 (ESI†) shows non-linear 2D projections of the combined kernels as well as the pre-kernelised and post-kernelised features for the optimised DFT orientation.16 It is evident from these plots that the featurisation process creates useful separation between high and low values of log
K.
We next sought to challenge the model and identify its limits. We first removed any duplicate molecules at different conditions and set the true logK as the average of all the reported values. For example, methyl viologen was reported 14 times at different parameters such as temperature or salt concentration, and so instead of having methyl viologen appear 14 times, it appeared once (ESI†). Interestingly, and perhaps expectedly, the optimal model in this duplicate-free data set remained the same. The duplicate-free data set was chosen for subsequent analysis and the performance, confusion matrix, and corresponding receiver operating characteristic/area under the curve (ROC-AUC) plot are reported (Fig. 1, 2 and Fig. S17, ESI†). These classification results demonstrate the nature of the model's performance. Error accumulates primarily at either extemum of the log
K distribution, but large errors are uncommon and performance in the denser parts of the training distribution is higher. Next, we removed classes of families and tested the model's ability to predict any one member of that family (Table 1, Fig. 4 and Table S2, ESI†). Given the limited size of the data set, we expect this algorithm to be useful in identifying the binding of molecules which a supramolecular chemist might expect to bind to cucurbiturils a priori. For example, molecules with extended aromaticity are generally hypothesised to have some interactions with cucurbiturils.15 This analysis shows that in order to capture binding of molecules such as imidazolium derivatives and adamantyl compounds, a data set containing these molecules is required. Imidazolium derivatives performed the best out of all the groups considered when their family was included in the training, and their error increases more than 5 times when left out, suggesting the algorithm is particularly sensitive to training on these types of molecules. Small arylamines and viologen derivatives performed better than the average data set regardless if the family was kept in or out, suggesting analysis of these kinds of molecules is robust and the physics of their binding is well-captured with the remaining data relative to the other families.
Family of molecules | Unique entries |
---|---|
Small arylamines | 4 |
Viologen derivatives | 6 |
Methylene blue derivatives | 9 |
Perfluorinated compounds | 13 |
Amino acids | 10 |
Imidazolium derivatives | 8 |
Adamantyl compounds | 12 |
We next performed classical machine learning controls.17 We first tested the performance of environmental parameters alone, which contain no chemical information about the guest. We found they had poor predictive capabilities (Fig. 3). We also tested whether we could predict the logK by counting the number of carbons in each molecule (Fig. S21, ESI†). Similarly, we found poor predictive capabilities with this approach. Both models performed worse than models which considered 3D structural data. One potential bias in the data that could be leading to the difference in the controls' performance is the slight negative relationship of molecular weights of guests (Fig. S22, ESI†) to log
K. Finally, we generated a random data set of identical dimension with the same log
K outputs and found this had poor predictive capabilities (Fig. S23, ESI†). We also randomly reassigned log
K values to different input data and found this reshuffling had, as expected, poor predictive capabilities (Fig. S24, ESI†).
We then compared whether there were similarities among the top 10 performing models based on score (of the 127 models considered when no environmental parameters are included). All the 10 best models utilised the electric field gradient and a geometry, while seven also used electrons condensed to atoms. No other parameters were used in 50% or more of the top 10 models. While we chose to highlight the results from the top performing model, the differences between the top 10 are small. This shows that the identity of the top performing model is less important than identifying which features consistently improve performance. We conclude from these results that these spatial and electronic data about each molecule improve the predictive capabilities of the algorithm. Five of these top 10 also had among the highest 10 R2 values, including the top model based on score. All of the top 10 models had among the top 25 R2 values.
Within the domain of utility, our results suggest these models may has some predictive ability towards binding constants of molecules we might suspect a priori have binding to cucurbiturils. We next used the top model to predict whether binding can occur between CB[7] and two small molecule organics recently identified as potentially promising drugs against pediatric low-grade gliomas: a type II RAF inhibitor TAK-580 (formerly MLN2480; referred to here as RAF), and a MEK inhibitor selumetinib (also called AZD6244; referred to here as MEK).18,19 Sun and colleagues recently reported RAF as a more promising therapy than type I RAF inhibitors due to its ability to bind to both fused and truncated v600.18 Banerjee and colleagues also recently reported a promising phase I clinical trial of MEK in children with low-grade gliomas.19 We performed DFT geometry optimisations on these two molecules and applied the SVM model. It was predicted that RAF would be a good guest to CB[7] with a logK of 4.61, while MEK would have very poor binding with a log
K of 1.18 (Fig. 5C). Similar values were obtained if duplicate inputs were considered (Fig. S20, ESI†). Synergistic drug cocktails have more potent responses than the sum of their individual components.20 A key challenge in developing drug cocktails is in their delivery because drugs have different therapeutic windows requiring different release kinetics.21,22 The ability to independently modulate release kinetics is an invaluable tool in the development of combination drugs. Different binding constants with macrocycles such as CB[7] is one promising approach to independently modulate these kinetics. This prediction that two promising drugs (Fig. S25, ESI†) against pediatric low grade gliomas is a potentially promising ‘hit’ in combination drug delivery.
We experimentally validated whether these predictions on the strong and poor CB[7] binding of RAF and MEK were accurate. Upon addition of CB[7] to an aqueous solution of RAF (1:
1 molar ratio), the drug's aromatic 1H NMR peaks remained sharp and well resolved. The proton signals of CB[7] split into two sets of equivalent peaks (Fig. 5A). These two observations strongly suggest that RAF and CB[7] bind favorably and statically.15 Isothermal titration calorimetry (ITC) revealed that KCB[7] = 3.5 × 106 M−1 (Fig. S29, ESI†). We also sought to identify precisely where RAF was binding with CB[7]. No information on 1H or 13C NMR peak assignments could be found on RAF from the manufacturer or in the literature, and so further characterisations were carried out utilising 1D and 2D NMR techniques (ESI,† Section S3: Binding analyses via NMR). Our results show that CB[7] binds statically at the trifluoromethyl-substituted ring in a 1
:
1 fashion (Fig. S29, ESI†). Surprised by this result, we sought to understand why CB[7] preferentially bound to the bulkier trifluoromethyl-substituted ring if there was an alternative pyrimidine with a positively charged amine.6 Deuterated hydrochloric acid solution (0.1 M) was titrated into a solution of RAF alone (Fig. S30, ESI†). The aromatic peak meta to the primary amine shifted after a reduction to pH ≤ 2. This suggests that the primary amine is, in fact, uncharged, which may be a reason why CB[7] does not bind at the pyrimidine ring. We then investigated whether RAF could bind to CB[8] (Fig. S31, ESI†). The aromatic peaks of the drug do not remain well resolved as in the case with CB[7], but rather they broaden and disappear. This suggests that RAF does interact with CB[8] with low affinity and in a highly dynamic manner. Thus, CB[8] is not a good carrier for RAF, while CB[7] is an excellent one.
We then validated whether the SVM prediction for MEK was correct. MEK was added to an aqueous solution of excess CB[7] to determine whether any interactions were occurring (Fig. S32, ESI†). In depth analysis is described in the ESI.† These data demonstrated that the drug does not bind to CB[7], confirming that the SVM predicted poor binding of MEK with CB[7]. We then screened its binding to CB[8] (Fig. S33, ESI†). The shift and retention of sharp peaks in the 1H NMR spectra suggested that the MEK inhibitor binds more strongly and statically to CB[8] than RAF. The downfield shifts of protons c, g, and h suggested that the extended imidazole ring is located near but outside the CB[8] cavity. The upfield shift of protons a and b suggested that the ethylene glycol unit is inside the CB[8] cavity. The minimal changes in protons d, e, and f were consistent with the hypothesis that the bromo-substituted ring was not inside or near the CB[8] cavity. It is well known that CB[8] can thread poly(ethylene glycol) chains.6 The thermodynamically favorable interactions between ethylene glycol repeat units and CB[8] may explain why CB[8] preferentially binds to the ethylene glycol unit of MEK. After addition of CB[8] in ratios greater than 1:
1, little change occurs in the spectra, which suggested MEK and CB[8] bind in a 1
:
1 fashion. These data show that two different drugs with different therapeutic windows bind to different CB macrocycles. MEK shows no binding with CB[7], yet RAF and CB[7] bind strongly in a 1
:
1 fashion. Conversely, MEK binds to CB[8] more statically than RAF. Combining these two drugs into one therapy could give rise to a paradigm that provides a unique opportunity to selectively tune the release or residence time of one drug independently of the other by simply tuning the concentrations of CB[7] and CB[8] in the system.
We next sought to provide a qualitative example of the potency of these drugs, and why modulating drugs to have different release kinetics is an important capability in the development of combination therapies. RAF/MEK combination therapies have been found to be efficacious against other malignancies.23,24 In this work we explore whether such a combination is potent in a pediatric glioma model. We screened for combinations of RAF and MEK against a v600e mutant and identified a synergistic effect at 102 nM concentration of both RAF and MEK together (Fig. S34, ESI†). This result suggests that by co-delivering RAF and MEK, the total drug concentration required can be reduced. Further optimisations may yield further reductions in required concentrations.
Finally, we utilised a simple mass transport model to showcase how with these binding affinities, CB[7] can be used to independently tune the release kinetics of one drug without changing the kinetics of the other (Fig. 6). We modeled a spherical, non-degradable hydrogel depot 0.375 mL in volume25 with CB[7] bound within the matrix and 100 μM loaded drug concentrations (Gtotal). Our lab has recently shown that divalent crosslinkers can form hydrogels loaded with CBs,26–28 however the gel modeled here differs from these previous reports as the CB does not participate in the non-covalent network (ESI†). This model leads to an initial value problem, which is dictated by the differential equation (full derivation in ESI†):
csurf = f(K,Gtotal,CB[7]total) |
Footnotes |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c9cp05800a |
‡ These authors contributed equally to this work. |
This journal is © the Owner Societies 2020 |