Sheng
Chen
ab,
Junjie
Xie
ab,
Renlong
Ye
b,
David Daqiang
Xu
b and
Yuedong
Yang
*a
aSchool of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
bAixplorerBio Inc., Jiaxing 314031, China. E-mail: yangyd25@mail.sysu.edu.cn
First published on 13th June 2024
Dual-target drug design has gained significant attention in the treatment of complex diseases, such as cancers and autoimmune disorders. A widely employed design strategy is combining pharmacophores to leverage the knowledge of structure–activity relationships of both targets. Unfortunately, pharmacophore combination often struggles with long and expensive trial and error, because the protein pockets of the two targets impose complex structural constraints. In this study, we propose AIxFuse, a structure-aware dual-target drug design method that learns pharmacophore fusion patterns to satisfy the dual-target structural constraints simulated by molecular docking. AIxFuse employs two self-play reinforcement learning (RL) agents to learn pharmacophore selection and fusion by comprehensive feedback including dual-target molecular docking scores. Collaboratively, the molecular docking scores are learned by active learning (AL). Through collaborative RL and AL, AIxFuse learns to generate molecules with multiple desired properties. AIxFuse is shown to outperform state-of-the-art methods in generating dual-target drugs against glycogen synthase kinase-3 beta (GSK3β) and c-Jun N-terminal kinase 3 (JNK3). When applied to another task against retinoic acid receptor-related orphan receptor γ-t (RORγt) and dihydroorotate dehydrogenase (DHODH), AIxFuse exhibits consistent performance while compared methods suffer from performance drops, leading to a 5 times higher performance in success rate. Docking studies demonstrate that AIxFuse can generate molecules concurrently satisfying the binding mode required by both targets. Further free energy perturbation calculation indicates that the generated candidates have promising binding free energies against both targets.
Towards rational design of dual-target drugs, the pharmacophore combination strategy has been widely employed.10,13–17 It includes simple “linking” of two distinct pharmacophores with a linker, “merging” of pharmacophores with shared fragments, and “fusing” of pharmacophores. The “linking” approach often results in higher molecular weight and is prone to suffering from pharmacokinetic issues.10,13 The “merging” approach is limited to targets that share some binding modes.14,15 The “fusing” approach allows tight joining of active fragments and offers the opportunity to achieve favorable physicochemical properties in the resulting molecules.16,17 Designing dual-target drugs through pharmacophore combination allows for effective integration of structure–activity relationship (SAR) knowledge for both targets.2 However, finding correct pharmacophore combination patterns is often hindered by long and expensive trial and error, mainly due to the complex structural constraints imposed by the protein pockets of the dual targets. Therefore, it is valuable to develop dual-target drug design methods that improve the efficiency and success rate.
Computational drug design methods have been developed over several decades to generate molecules with desired properties.18,19 Traditional molecular generation primarily navigated the chemical-space exploration through population-based stochastic optimization procedures, such as evolutionary algorithms (EA)20 or swarm intelligence.21 In recent years, deep learning algorithms have been widely employed in molecular generation.22,23 Early studies such as those on ChemVAE,24 CVAE25 and Heteroencoder26 generated SMILES strings as molecular representation. On the other hand, Liu et al.27 and You et al.28 made exploratory attempts to generate molecules represented by the graph, whose node-edge structure naturally matches the atom-bond structure of molecules. However, most of them focused on non/single-objective generation and thus are not suitable for dual-target drug design. There are only a handful of multi-objective molecular generation algorithms.29–33 Li et al.33 trained machine learning (ML) models as approximate empirical measurements of activity against glycogen synthase kinase-3 beta (GSK3β) and c-Jun N-terminal kinase 3 (JNK3). Following their work, related methods (e.g. RationaleRL,31 MARS29) employed the same activity predictors in their methods and assessments. The utilization of ML predictors brought up two issues. Firstly, the generalizability of the ML model cannot be guaranteed when available activity data are limited. Secondly, the ML model based on molecular fingerprint representation is not structurally interpretable. Compared with the ML-based activity predictor, molecular-simulation-based computational tools such as molecular docking and free energy perturbation (FEP) are more generalizable and structurally interpretable. Since molecular simulation is less computationally efficient, recent efforts have been made on molecular docking34,35 and FEP36,37 by active learning (AL) to improve the efficiency of virtual screening. Therefore, integrating multi-objective molecular generation with active learning on molecular-simulation-based activity estimation is a promising way to improve generalizability and structural interpretability.
However, combining molecule generation with AL presents another challenge. In the context of virtual screening, the target domain for AL is well-defined, given that the compound library is known and readily available. In contrast, when it comes to molecule generation, the target domain for AL remains elusive because the molecules within this domain haven't been generated. This dilemma is akin to training an artificial intelligence algorithm for the GO game in the absence of any historical chess records. AlphaZero38 has masterfully navigated this very conundrum by self-play with reinforcement learning (RL) and Monte Carlo tree search (MCTS). Inspiringly, searching pharmacophore combination patterns to satisfy the dual-target structural constraints can also be modeled as two self-play MCTS agents against two targets. Through self-play pharmacophore combination, target domain molecules can be generated for AL training, and the trained models can subsequently provide feedback to navigate the molecular generation.
In this study, we propose AIxFuse, a structure-aware dual-target drug design method. To our knowledge, this is the first approach that learns pharmacophore fusion patterns to satisfy the dual-target structural constraints simulated by molecular docking. In AIxFuse, pharmacophores are extracted automatically through protein–ligand interaction analysis of active compounds and target proteins. The pharmacophore-fusion chemical space is represented by multi-level molecular sub-structure trees. Exploration in this chemical space towards multiple desired properties (i.e. binding affinity, drug-likeness, and synthetic accessibility) is navigated by an actor–critic-like39 RL framework. The RL framework consists of two self-play MCTS actors for molecule generation and a dual-target docking score critic trained by AL. After iterations of collaborative RL and AL, AIxFuse learns to generate molecules with multiple desired properties. AIxFuse was first evaluated on designing a dual-inhibitor for GSK3β and JNK3, where it showed outperformance (32.3% relative improvement in success rate) compared to other state-of-the-art (SOTA) methods. We also applied AIxFuse to another task that aims to design dual inhibitors against retinoic acid receptor-related orphan receptor γ-t (RORγt) and dihydroorotate dehydrogenase (DHODH).3 Consistently, AIxFuse achieves the best success rate of 23.96%, over 5 times higher than that of other methods. Docking studies demonstrated that AIxFuse can generate molecules that concurrently satisfy the binding mode required by both targets. Furthermore, as revealed by FEP, a generated candidate exhibits better/comparable binding free energy with known inhibitors on both targets, demonstrating its dual-target activity potential. This holds promising prospects for AIxFuse to accelerate dual-target drug design in real-world drug development scenarios.
Fig. 1 The AIxFuse pipeline encompasses three core stages: (A) data processing (example shown by RORγt and DHODH), entailing the preprocessing of protein structures and active compounds for both targets; (B) automatic pharmacophore extraction (example shown by RORγt), accomplished through automatic scrutiny of protein–ligand interactions within the docking poses of active compounds to isolate core fragments that preserve key pharmacophores; and (C) collaborative learning of pharmacophore fusion and molecular docking. The actor–critic-like reinforcement learning on pharmacophore fusion and the active learning on molecular docking are run collaboratively and iteratively. It employs two self-play MCTS actors for molecule generation and a dual-target docking score critic to navigate generation. The critic is a multi-task AttentiveFP (schematic diagram depicted by Xiong et al.43) and is trained by active learning. The green arrow lines represent the training procedure. |
All methods were first assessed in terms of three general metrics for molecular generation: validity, uniqueness, and diversity. As shown in Table 1, all methods achieved a validity rate exceeding 98%, indicating their ability to generate molecules that conform to essential chemical constraints. Notably, significant differences were observed in the uniqueness metric. AIxFuse demonstrated the highest uniqueness at 89.7%, surpassing other previous methods, including REINVENT2.0 (82.6%), RationaleRL (53.3%), and MARS (24.1%). While fragment-based methods often struggle with diversity, AIxFuse obtained a score of 0.719, outperforming RationaleRL (0.656) and MARS (0.597), and matching the performance of REINVENT2.0 (0.722). This unexpected success may be attributed to the upper confidence bounds (UCB) of reward expectations in MCTS, which aims to strike a balance between sampling frequency and reward expectations.
Metrics | REINVENT2.0 | RationaleRL | MARS | AIxFuse |
---|---|---|---|---|
a Success rate is defined as the proportion of generated molecules that obtain better multiple-property values than the average of active compounds. b USR (unique success rate) denotes the proportion of non-repeating molecules that obtain better single-property values than the average of active compounds. | ||||
Validity (%) | 98.9 | 99.9 | 100 | 100 |
Uniqueness (%) | 82.6 | 53.3 | 24.1 | 89.7 |
Diversity | 0.722 | 0.656 | 0.597 | 0.719 |
Success ratea (%) | 1.75 | 11.95 | 17.83 | 23.59 |
USR QEDb (%) | 14.77 | 46.84 | 24.05 | 37.64 |
USR SA (%) | 55.45 | 16.39 | 17.02 | 55.66 |
USR dockingGSK3β (%) | 19.75 | 38.47 | 11.90 | 81.74 |
USR dockingJNK3 (%) | 16.73 | 23.94 | 6.63 | 78.20 |
FCD (GSK3β|JNK3) | 29.5|30.8 | 28.2|27.8 | 53.8|46.8 | 16.7|24.9 |
SNN (GSK3β|JNK3) | 0.35|0.41 | 0.38|0.44 | 0.36|0.45 | 0.47|0.45 |
We then checked the ability of each method to generate molecules with multiple desired properties. The success rate is defined as the proportion of generated molecules that outperform the mean values of known active compounds in multiple properties. These properties include dual-target docking scores, Quantitative Estimate of Drug-likeness (QED), and synthetic accessibility (SA). As shown in Table 1, AIxFuse achieved the highest success rate at 23.59%, outperforming the second-place method MARS (17.83%) with a relative improvement of 32.3%. This result demonstrated AIxFuse's capability to generate molecules that simultaneously satisfy multiple property constraints. To evaluate the performance of each individual property, we calculated the unique success rate (USR), which represents the proportion of non-repeating molecules satisfying a specific property constraint. The baseline model AIxFuse(w/o RLAL) was constructed by replacing RL and AL with random sampling on both trees. As shown in Fig. 2, baseline model AIxFuse(w/o RLAL) exhibited a promising USR in docking scores, but its USRs in SA and QED were not optimal. By incorporating RL and AL, AIxFuse achieved the best USR in SA (55.66%) and the second-highest USR in QED (37.64%), demonstrating significant improvements to AIxFuse(w/o RLAL). Furthermore, AIxFuse achieved the highest USR on the docking scores of GSK3β (81.74%) and JNK3 (78.20%), surpassing other methods by a significant margin.
We were interested in studying the similarity between generated molecules and active compounds. As shown in Table 1, AIxFuse consistently achieved the lowest Fréchet ChemNet Distance (FCD) and the highest Similarity to Nearest Neighbor (SNN) against active compounds of both targets. This is as expected since AIxFuse preserved key pharmacophores of active compounds to incorporate the SARs of both targets. To gain an intuitive understanding of similarity, Fig. 2E visualizes the chemical spaces of generated molecules and active compounds. The chemical space of GSK3β inhibitors closely resembled that of JNK3 inhibitors. Notably, AIxFuse explored the gaps between the main clusters of known inhibitors, while other methods didn't. Molecules generated within the chemical space gaps between the inhibitors of the two targets are likely to exhibit high similarity to both ends. Looking deeper into the molecular property similarity between generated molecules and active compounds, we plotted the property distributions in Fig. 2F–K. It can be found that the QED, SA, logP, and weight distribution of AIxFuse-generated molecules closely aligned with those of GSK3β inhibitors and JNK3 inhibitors. However, for both GSK3β and JNK3 docking scores, AIxFuse-generated molecules were distributed towards the lower end of the spectrum compared to the corresponding known inhibitors. It demonstrated that AIxFuse-generated molecules obtained better estimated binding affinities against both targets while keeping other properties similar to those of the known inhibitors.
What contributed to the outperformance of AIxFuse? We first explored the contribution of running multiple iterations of AL. As shown in Fig. 3A and B, as AL iterated, the overall MSE decreased while R2 increased. This trend demonstrated that multiple iterations of AL contribute to the accuracy of the dual-target docking score critic. We then presented the accuracy of the final dual-target docking score critic in Fig. 3C and D. The final Pearson correlation coefficients indicated strong positive correlations between the predicted docking scores and the ground truth (0.691 and 0.704 for GSK3β and JNK3, respectively). This affirmed the models' robust ranking ability, which is an essential factor for molecular generation to improve molecular properties. We then investigated the contribution of RL optimization. As is shown in Fig. 3E–H, the docking score distribution of AIxFuse(w/o RLAL) closely resembles that of known active molecules. This suggests that our automatically extracted pharmacophores can effectively preserve the essential SARs of both targets. However, AIxFuse(w/o RLAL)'s QED distribution and SA distribution are not ideal. After iterations of collaborative RL and AL, AIxFuse achieves improvements in all four properties. These phenomena underscore the synergistic effect of collaborative RL and AL: AL trained dual-target docking score critic for RL optimization, while RL provided molecules with higher quality for AL training.
Compared with other methods, AIxFuse demonstrated superior performance in various evaluation metrics, as depicted in Table 2. Specifically, AIxFuse achieved the highest validity, uniqueness, success rate, USR QED, USR SA, USR dockingRORγt, USR dockingDHODH, USR 3D SNNRORγt, and USR 3D SNNDHODH. In terms of diversity, that achieved by AIxFuse is lower than that of REINVENT2.0, comparable with that of RationaleRL, and far higher than that of MARS. When compared across benchmark tasks, AIxFuse exhibited consistent results, whereas other methods displayed fluctuations. Fig. 4A highlights that AIxFuse achieved a success rate of 23.96%, consistent with its performance in the GSK3β|JNK3 task (23.59%, Fig. 4B). Conversely, MARS failed to generate any successful molecules against RORγt|DHODH, resulting in a success rate drop (17.83% → 0%). A similar decrease in success rate occurred in RationaleRL (11.95% → 1.16%). Both MARS and RationaleRL rely on machine learning (ML) activity predictors. However, due to the limited availability of public non-active data for DHODH, training an ML predictor posed challenges (detailed in ESI Section 1†). In contrast, AIxFuse replaced ML-based activity predictors with molecular docking, which should contribute to its consistent performance.
Metrics | REINVENT2.0 | RationaleRL | MARS | AIxFuse |
---|---|---|---|---|
a 3D SNN measures the maximum 3D similarity to known active compounds, while USR 3D SNN is the proportion of non-repeating compounds that obtained higher 3D SNN than dual-target inhibitor (R)-14d.3 | ||||
Validity (%) | 100 | 99.9 | 100 | 100 |
Uniqueness (%) | 33.5 | 33.7 | 9.9 | 94.1 |
Diversity | 0.708 | 0.672 | 0.530 | 0.661 |
Success rate (%) | 4.65 | 1.16 | 0.00 | 23.96 |
USR QED (%) | 25.94 | 1.39 | 9.83 | 41.53 |
USR SA (%) | 31.39 | 33.58 | 9.77 | 82.41 |
USR dockingRORγt (%) | 21.60 | 23.56 | 1.67 | 85.04 |
USR dockingDHODH (%) | 4.94 | 16.54 | 0.02 | 72.74 |
USR 3D SNNRORγta (%) | 20.33 | 12.83 | 3.17 | 58.18 |
USR 3D SNNDHODH (%) | 22.81 | 10.71 | 6.95 | 83.87 |
We then checked the dual-target activity potential of AIxFuse-generated molecules. As shown in Fig. 4C, AIxFuse's docking score distributions on both targets consistently favored the lower (better) end of the spectrum. A majority of AIxFuse-generated molecules exhibited lower RORγt and DHODH docking scores than the average score of known RORγt inhibitors and DHODH inhibitors respectively. When referring to compounds designed by Chen et al., AIxFuse also successfully generated numerous molecules with comparable RORγt docking scores and lower DHODH docking scores. This observation indicates promising dual-target activity potential. For a more accurate estimation of protein–ligand binding free energies, we employed the Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) calculation.44Fig. 4D illustrates the computation results. When referring to the GSK_98E (active ligand in the co-crystal structure of RORγt, PDB id: 5NTP45), 17 out of 134 AIxFuse-generated molecules obtained lower binding free energies. All these 17 candidates achieved lower binding free energies than BAY2402234 (active ligand in the co-crystal structure of DHODH, PDB id: 6QU7 (ref. 46)). Additionally, we performed a t-SNE visualization of the chemical space of these generated molecules in Fig. 4E. We also displayed the structures of 5 representative molecules, showcasing their notable structural diversity. Looking into a detailed case, AF-3 (structure shown in Fig. 4F) exhibited dual-target activity potential with MM/GBSA binding free energies of −76.97 kcal mol−1 on RORγt and −65.76 kcal mol−1 on DHODH. We also displayed structures of AF-3's most similar RORγt inhibitor (ChEMBL id: CHEMBL4438289) and most similar DHODH inhibitor (BindingDB id: BDBM470379) in Fig. 4G and H, respectively. Notably, the left half structure (LHS) of CHEMBL4438289 showed a similar pattern to the LHS of BDBM470379. AF-3 successfully merged this pattern by fusing the RORγt rationale (red in Fig. 4F) and the DHODH rationale (blue in Fig. 4F). This fusion likely contributes to its lower MM/GBSA binding free energies on both targets.
We first studied the binding mode of AF-5 on both targets by molecular docking. As shown in Fig. 5B, when docking into RORγt (PDB id: 5NTP), AF-5 established hydrogen bonds and π–π stacking interactions with key residues, including HIE479, PHE377, GLU379, HIE323, and GLN 286. When docking into DHODH (PDB id: 6QU7, Fig. 5D), AF-5 exhibited hydrogen bonds and π–π stacking between AF-5 and key residues GLN47, TYR356, and PHE62. AF-5's binding modes on both targets matched previous studies on corresponding known inhibitors.3,45,46Fig. 5C and E illustrated that the docking pose of AF-5 (colored in orange) on both targets closely resembled those of known inhibitors in the co-crystal structure (RORγt inhibitor GSK-98E45 and DHODH inhibitor BAY2402234,46 both colored in cyan).
We proceeded to analyze the dual-target activity potential of AF-5 using ABFEP. AF-5 was compared with GSK-98E and BAY2402234 on their binding free energies against corresponding targets. According to its optimal protonation state at pH 7.0 ± 2.0, GSK-98E had a +1 net charge, while AF-5 was electrically neutral. The disparity in net charge would cause systematic errors. To account for this, we also calculated the binding free energy of the GSK-98E in the electrically neutral state. As depicted in Fig. 5C, AF-5 exhibited a lower RORγt binding free energy (−18.68 kcal mol−1) compared to the charged GSK-98E (−6.54 kcal mol−1) which was comparable to that of the neutral GSK-98E (−18.97 kcal mol−1). Regarding the other target, DHODH, AF-5's ABFEP binding free energy (−21.45 kcal mol−1) was significantly lower than that of BAY2402234 (−17.09 kcal mol−1). We also included additional references by supplementing the ABFEP calculation results of several other active molecules. As shown in ESI Table S6,† AF-5's RORγt binding free energy was comparable to those of known RORγt inhibitors, while its DHODH binding free energy was lower than those of known DHODH inhibitors. In summary, the in silico analysis demonstrated the potential of AF-5 as a dual-target inhibitor against RORγt and DHODH.
AIxFuse was benchmarked on two dual-target drug design tasks and compared with previous SOTA methods. In the dual-target drug design task against GSK3β and JNK3, AIxFuse exhibited 32.3% relative improvement in success rate compared with the best of other methods. In the RORγt and DHODH dual-target inhibitor design task with limited activity data, AIxFuse achieves a success rate of 23.96% while compared methods suffered from performance drops, leading to a 5 times higher performance in success rate. Further in silico binding free-energy calculation results highlighted the dual-target activity potential of AIxFuse-generated drug candidates.
As a structure-aware dual-target drug design method, AIxFuse can generate potential dual-target hit compounds for subsequent lead optimization. For example, our latest structure-based scaffold decoration method DiffDec can be used for lead optimization on AIxFuse-generated molecules. In the future, active learning on FEP36,37 can also be introduced into AIxFuse to enhance the accuracy of binding affinity estimation. As the demand for dual-target drugs becomes significant in the treatment of complex diseases, AIxFuse can provide rapid and effective starting points for various dual-target design tasks.
The second benchmark task aims to design dual-target inhibitors against retinoic acid receptor-related orphan receptor γ-t (RORγt) and dihydroorotate dehydrogenase (DHODH), named as the RORγt–DHODH benchmark. RORγt and DHODH have both emerged as attractive targets for the treatment of autoimmune diseases.51,52 Chen et al.3 advocated for the development of the RORγt|DHODH dual inhibitor as a promising strategy to enhance therapeutic efficacy, reduce toxicity, and mitigate drug resistance in the context of inflammatory bowel disease (IBD) therapy. They designed a series of compounds and identified one candidate with both in vitro and in vivo activities, along with favorable mouse pharmacokinetic profiles.
• RationaleRL:31 RationaleRL begins by extracting rationales from active compounds via MCTS to retain essential activity-contributing elements. Subsequently, it employs the pharmacophore merge strategy to combine rationales, preserving the maximum common substructure. Finally, a molecular generation model is fine-tuned to produce side chain groups.
• MARS:29 MARS selects the top 1000 fragments from the ChEMBL dataset, ensuring that each fragment contains no more than 10 heavy atoms and appears most frequently. It employs Graph Neural Networks (GNN) and Markov Chain Monte Carlo (MCMC) sampling to manipulate these fragments and generate molecules with enhanced properties.
• REINVENT2.0:32 This method generates SMILES representations of molecules using a Recurrent Neural Network (RNN). Reinforcement learning is then applied to optimize the properties of the generated molecules, facilitating the generation of multi-target molecules through the application of multiple rewards.
We employed a diverse set of metrics to assess the performance of each method. Initially, we computed general metrics such as validity, uniqueness, and diversity using Moses.53 Additionally, we scrutinized properties including drug-likeness and synthesizability of the generated molecules and dual-target activity. Quantitative Estimate of Drug-likeness (QED) and synthetic accessibility (SA) were calculated using RDKit.54 Notably, for dual-target activity estimation, no public ML-based active predictor is available for the RORγt|DHODH task. Therefore, we opted for the molecular docking score as the approximation of binding affinity in both tasks. As a molecular-simulation-based method, molecular docking offers structural interpretability and has been widely used in drug design.35,55–58 Following the approach of Chen et al.,3 Glide40 software was employed to estimate binding modes and affinities. The co-crystal structures of RORγt LBD (PDB: 5NTP45), DHODH (PDB: 6QU7 (ref. 46)), GSK3β (PDB: 6Y9S59), and JNK3 (PDB: 4WHZ60) were selected and processed for docking preparation. The generated molecules were designed to concurrently satisfy multiple property constraints, prompting us to calculate the success rate, denoting the percentage of generated molecules meeting multiple property constraints concurrently. Diverging from prior works29,31 which set the thresholds of success as fixed values, we adopted the average value of active compounds as the threshold for each property. Those active compounds were collected from ChEMBL and BindingDB (details available in ESI Section 4†). For the GSK3β|JNK3 benchmark, the thresholds for QED, SA, and dual-target docking scores were set at 0.538, 2.76, −8.12, and −8.53 according to the average value of active compounds. For the RORγt|DHODH benchmark, the corresponding thresholds were 0.430, 3.49, −9.31, and −10.8. To mitigate the contribution of repeated molecules, we also computed the “unique success rate” for each property, which accounts for the proportion of non-repeating compounds that satisfied the constraint. Fréchet ChemNet Distance (FCD)61 and Similarity to Nearest Neighbor (SNN) scores were calculated using Moses53 to assess the similarity between the generated compounds and known inhibitors. Since designing molecules with higher 3D similarity to lead compounds is desirable in drug design,62 for the RORγt|DHODH benchmark, we additionally calculated the 3D SNN of the generated molecules by ShapeTanimotoDist in RDKit,54 measuring the maximum 3D similarity to known active compounds.
To analyze PLIs, we employed the PLIP tool41,42 on the molecular docking complex structures due to the unavailability of co-crystal structures for most active compounds. For each pharmacophore, we systematically extracted its minimal connected substructure, termed core fragments, as illustrated in Fig. 1B. Our interest lies in core fragments that possess the following characteristics: ① interaction or contact with key residues, ② prevalence across multiple active compounds, and ③ appropriate size. To achieve this, we have reward fragments using scoring functions of interactions, distance, and heavy atom number, with details shown in ESI Section 2.† Furthermore, for the core fragment shared by two or more compounds, we aggregated its score so that fragments favored by medchemists rank higher. Ultimately, the top-scoring core fragments were chosen to construct trees of the two targets. The overall pharmacophore extraction and selection process is visualized in Fig. 1B and detailed in ESI Algorithm 1.†
Once two cores have been selected from two trees, the challenge at hand transforms into how to modify and fuse them. Given that the byproduct of extracting cores from active compounds is the side chain fragments, it is natural to consider using side chains from two trees to build linkers. The linker construction process unfolds through three stages. First, two anchor atoms were chosen from two core fragments as the linker growth anchors, named as Growth Anchor in Fig. 1C. Second, for each growth anchor, we extracted a side chain that grew from this anchor. Third, from each side chain, we selected an anchor atom to fuse them, denoted as Fusing Anchor in Fig. 1C. By connecting two fusing anchors from distinct trees, core fragments from distinct targets can be consequently linked together. For growth anchors that have not been selected, decorating R-groups was considered to help improve the diversity of generated molecules. Side chains growing from these anchors can be sliced at different atoms to construct R-groups. Ultimately, as illustrated in Fig. 1C, the leaf node rationale was constructed by decorating valid R-groups (R-groups that are chemically valid and do not disrupt the ring within the original side chain) on unselected growth anchors. The final molecule was generated by fusing two rationales at their fusing anchors.
The sizes of the rationale libraries for GSK3β, JNK3, RORγt, and DHODH surpassed 130k, 60k, 150k, and 13k, respectively. It can be estimated that the chemical space accessible to AIxFuse ranges between 109 and 1010 on both GSK3β|JNK3 and RORγt|DHODH benchmarks. Virtual screening within chemical spaces of such magnitude demands an overwhelming amount of computational resources, often beyond reach. This challenge mirrors that encountered by AlphaZero when evaluating the value of numerous states within the game space. Therefore, we did not assess the entire chemical space but instead employed iterative simulation and exploration by Monte Carlo tree search (MCTS). To guide the exploration, we also trained a multi-task graph neural network (GNN) to predict the docking scores against distinct targets, which is described in the next subsection.
Graff et al. demonstrated that in the realm of molecular docking active learning, message-passing neural networks (MPNN) outperformed both random forest and traditional neural network approaches.35 Nonetheless, MPNN, when applied to the processing of molecular graphs, may encounter the issue of diluted effects. This arises from its uniform treatment of all nodes regardless of their distance to a target node, leading to weakened impacts of topologically adjacent nodes and functional groups.43 To address this limitation, attention mechanisms have been introduced, offering improved neighbor aggregation in the form of the Graph Attention Network (GAT).65 More recently, the Attentive Fingerprints (AttentiveFP) framework43 was proposed as an approach capable of capturing both local atomic interactions through node information propagation and non-local effects within a molecule through a graph attention mechanism. It was tailored for molecular feature extractions and outperformed other architectures (including MPNN) across comprehensive benchmarks.
In our study, we applied the AttentiveFP framework to the domain of molecular docking active learning. Two convolutional layers were constructed to extract atomic features, and a readout layer was utilized for generating molecular embeddings. Subsequently, docking scores were predicted by feeding molecular embeddings into a multi-task fully connected layer. The network architecture can be represented as eqn (1)–(3):
(1) |
(2) |
(3) |
The attention mechanism utilized in CONTEXT, AGGREGATE, and READOUT functions is described by eqn (4), (6) and (8):
(4) |
a(0)ij = softmax(leaky_relu(W3[W4·vi, leaky_relu(W5[eij, vj])])) | (5) |
(6) |
aij(l) = softmax(leaky_relu(W7(l)[hi(l−1), hj(l−1)])) | (7) |
(8) |
(9) |
We used MSELoss to measure the mean-squared error of our docking score regression model, which can be mathematically represented as:
(10) |
The entire network was trained in a multi-target fashion, yielding two predicted docking scores for a given molecule on the two targets. During each training iteration, the dataset was randomly divided into a training dataset and a validation dataset in a 9:1 ratio. To mitigate the risk of overfitting, early stopping techniques were implemented. Specifically, we monitored each model's performance on the validation set and continued training until no improvement was observed over 20 epochs or the maximum epochs of 1000 was reached.
Our models were trained on the Pytorch71 framework using the Adam72 optimizer for gradient descent optimization. The Deep Graph Library (DGL) package73 and the DGLLifeSci74 extension were employed for the implementation of AttentiveFP. All experiments were conducted within a uniform computational environment, consisting of 4 GPUs of Quadro RTX 5000, 96 CPU cores of Intel(R) Xeon(R) Gold 6248R, and the Ubuntu 20.04.6 LTS operating system. ESI Section 5† describes the details of training and evaluation.
The INIT_GEN function in Algorithm 1 is described in ESI Section 3.† In short, AIxFuse initializes the weight of each node when building trees, and randomly samples child nodes according to their weights. The random-sampled rationales from the two trees were fused to generate molecules.
The SELF_PLAY function in Algorithm 1 is detailed in ESI Algorithm 3.† Two MCTS actors run in a self-play manner, where we run simulations of pharmacophore fusion to reward each exploration. The reward function of a given molecule m can be formulated as:
(11) |
SQED = max(min(10 × QED(m), 8) − 6, 0.5) | (12) |
SSA = max(min(5.5 − SA(m)), 0.5) | (13) |
SNN = max(−NN(m) − 6, 0.1) | (14) |
(15) |
Molecular dynamics simulations were performed by GROMACS (version 2020.6).75 The AMBER99SB-ILDN76 force field was used for proteins. The General Amber Force Field (GAFF)77 with AM1-BCC78,79 partial charges was used for ligands. The protein–ligand complex was solvated in the dodecahedral box of TIP3P80 water with the minimum distance between the solute and the box of 10 Å. The system was minimized using the steepest descent algorithm with max steps of 10000, followed by 100 ps NVT equilibration at 300 K and then 100 ps NPT equilibration at 300 K and 1 bar. Subsequently, 2 ns NPT production was performed at 300 K and 1 bar and the frames were saved every 10 ps for further analysis and MMGBSA calculations. The distance cutoff of short-range interactions was set to 10 Å. Long-range electrostatic interactions were treated with PME algorithm.81 LINCS82 algorithm was applied to constrain the hydrogen-involved bonds and all the MD simulations were performed with a timestep of 2 fs. The MM/GBSA calculations were performed with AMBER83 with a time interval of 20 ps for extracted frames via the single-trajectory scheme, where the structures of the separate protein and ligand were extracted from the trajectory of the complex. ABFEP calculations were performed following the schemes employed by Aldeghi et al.84,85 More specifically, 42 lambda windows and 31 lambda windows were used for the complex leg and solvent leg, respectively. The relative position of the ligand with respect to the protein was restrained by applying harmonic potentials to one distance, two angles, and three dihedrals as proposed by Boresch et al.86 The free energy difference for adding these restraints to the dummy ligand in solvent was calculated analytically.86 The perturbation of van der Waals interactions between the ligand and the environment was performed with soft-core potential.87 After the relaxation and equilibration of each lambda window, 5 ns NPT production runs were performed using Hamiltonian-exchange Langevin dynamics for enhanced sampling.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4sc00094c |
This journal is © The Royal Society of Chemistry 2024 |