Shen Wang a, Tong Lin bc, Tianyi Peng d, Enming Xing a, Sijie Chen a, Levent Burak Kara b and Xiaolin Cheng *ae
aCollege of Pharmacy, The Ohio State University, Columbus, OH 43210, USA. E-mail: cheng.1302@osu.edu
bMechanical Engineering Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
cMachine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
dElectrical and Computer Engineering Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
eTranslational Data Analytics Institute (TDAI), The Ohio State University, Columbus, OH 43210, USA
First published on 8th January 2025
Recent advancements in 3D structure-based molecular generative models have shown promise in expediting the hit discovery process in drug design. Despite their potential, efficiently generating a focused library of candidate molecules that exhibit both effective interactions and structural diversity at a large scale remains a significant challenge. Moreover, current studies often lack comprehensive comparisons to high-throughput virtual screening methods, resulting in insufficient evaluation of their effectiveness. In this study, we introduce Topology Molecular Type assignment (TopMT-GAN), a novel approach using Generative Adversarial Networks (GANs) for direct structure-based design. TopMT-GAN employs a two-step strategy: constructing 3D molecular topologies within a protein pocket with one GAN, followed by atom and bond type assignment with a second GAN. This integrated approach enables TopMT-GAN to efficiently generate diverse and potent ligands with precise 3D poses for specific protein pockets. When tested on five diverse protein pockets, TopMT-GAN exhibits promising and robust performance, demonstrating a potential enrichment of up to 46000 fold compared to traditional high-throughput virtual screening methods. This highlights its potential as a powerful tool in early-stage drug discovery, such as hit and lead generation.
A crucial aspect of structure-based generative modeling lies in generating 3D molecular structures that accurately capture protein–ligand interactions. Molecular structure generation has been the subject of various studies, primarily categorized into three types based on molecular structure featurization:5 cubic grid-based, Euclidean distance matrix (EDM)-based, and Cartesian coordinate-based. Besides valid chemical connectivity and appropriate conformational geometry, it is essential for generated molecules to adopt binding poses that complement the target pocket and facilitate favorable interactions between ligands and the target. However, the integration of pocket information into molecule generation remains limited. For instance, LiGANN6 maps the pocket shape to ligand shape and decodes this into SMILES strings, representing an initial attempt in structure-based ligand design with generative models. Nevertheless, it does not generate 3D molecular structures.
Most structure-based models adopt autoregressive methods to build molecules sequentially, atom by atom. Some models, such as SBDD,7 Pocket2Mol,8 GraphBP,9 SurfGen,10 ResGen,11 and PocketFlow,12 use Graph Neural Networks (GNNs) to process pocket context and predict atom placements. In contrast, DeepLigBuilder13 utilizes Monte-Carlo Tree Search (MCTS) with smina14 docking scores to guide its L-Net generation process. Diverging from autoregressive approaches, TargetDiff,15 ShapeMol16 and PMDM17 employ diffusion models to generate entire molecules conditioned on the pocket environment. Additionally, fragment-based methods such as Lingo3DMol,18 ligandED,19 and DESERT20 assemble molecules by concatenating fragments within pockets. Although promising, methods like Pocket2Drug21 and SBMolGen,22 which rely on SMILES strings coupled with subsequent conformational sampling, are not 3D molecular generative models, and we therefore exclude them from the comparison and discussion below.
Despite their promising potential, there are notable limitations that most current implementations have yet to overcome.
In addition to the limitations of scale, diversity, and efficiency, another critical issue is the lack of a standardized benchmark for comparing generative models. SBDD, Pocket2Mol, DESERT, TargetDiff, Lingo3DMol, and PMDM utilized proteins from the DUD-E25 or CrossDock26 datasets as test cases, averaging docking results across 10–100 protein targets. However, this approach fails to capture the crucial diversity of protein pockets. Some pockets are large and deep, facilitating strong binding, while others are shallow and surface-exposed, leading to weaker interactions. Using a single, averaged score over such varied targets obscures a model's specific performance on individual pocket types, impeding meaningful comparison and optimization. In addition to internal comparisons among the generative models, SurfGen and ResGen also included a comparison with molecular docking of random compound libraries, but their docking evaluations were limited to just 200 randomly selected molecules. A comprehensive approach is needed to assess the true effectiveness of generative models, particularly in the context of large-scale high-throughput virtual screening (HTVS).
To overcome these prevailing limitations in scale, diversity, and efficiency, we have developed a novel two-step 3D structure-based generative model called Topology Molecular Type assignment (TopMT-GAN). The initial phase of our approach focuses on generating valid molecular topologies that align closely with the contours of the target pockets. This shape complementarity is critical in guiding the exploration of the discovery space.27 The generation is achieved through the use of a Graph Translation Generative Adversarial Network (GAN),28 which effectively models potential molecular configurations to match the spatial characteristics of the pocket. This phase also includes an advanced search strategy and a local topology filter to ensure the validity and diverse sampling of topologies. In the second step, we employ another GAN for molecular assignment that predicts atom and bond types for each topology to generate valid molecules. This process, combined with local minimization, allows for accurate positioning of generated molecules within the target pocket, facilitating the rapid scoring of these molecules.
In addition to the novel architecture of TopMT-GAN, we have implemented a rigorous benchmarking method for evaluating 3D molecular generative models that is different from those employed in previous studies.7–22 As shown in Table S6,† existing approaches vary significantly in their evaluation methods, from generating just 100 molecules per target to using different baseline comparisons, making it difficult to assess relative performance. To address these limitations, our benchmarking task is designed to reflect the complexities of real-world drug design scenarios and address the corresponding critical challenges. This comprehensive task involves selecting a diverse set of five protein targets with distinct pocket features, including enzymes, kinases, G-protein coupled receptors (GPCRs), and nuclear receptors. We then generated a substantial number of ligands: 50000 for each of the five protein pockets. For comparison, we also conducted moderate-scale high-throughput virtual screening (HTVS) of over 1 million compounds from the Enamine HTS collection43 against these targets. Our evaluation across the diverse targets revealed that TopMT-GAN could efficiently generate tens of thousands of molecules, which exhibit robust binding scores and strong scaffold diversity. Furthermore, the results indicated that molecules generated by TopMT-GAN could achieve up to a 46000-fold enrichment compared to random HTVS, marking the first quantitative demonstration of a structure-based generative model's superior efficiency over traditional HTVS approaches. These findings showed not only the accuracy (i.e., potential binding in the pocket) but also the effectiveness (i.e., the speed and scaffold diversity) of our model, supporting its practical utility in real-world drug discovery.
TopMT-GAN works in two distinct modes to generate ligands for specific targets: scaffold-hopping and pocket-mapping. When co-crystal structures for a target pocket bound with a known ligand are available, pocket binding molecules can be generated through the scaffold-hopping strategy. In this case, the initial pocket shape input to TopMT-GAN is derived from the structure of the bound ligand. On the other hand, when no ligand data for a target pocket are available, TopMT-GAN can map the pocket shape directly by using small fragment probes. The mapped pocket shape can then serve as the basis for ligand generation, allowing for the design of molecules that are tailored to fit the target site. Results from both strategies were evaluated and compared.
To rigorously evaluate the performance of TopMT-GAN in addressing the limitations of existing structure-based generative models, we conducted a benchmark comparison with high-throughput virtual screening against a library of 1327116 molecules from the Enamine HTS library. For this purpose, we generated 50000 molecules for each protein target under investigation. Concurrently, we performed molecular docking of each compound in the library into the target pockets using AutoDock Vina.29,30 We also gathered all known actives for each protein pocket from the Binding Database31 and employed them as an additional benchmark to assess TopMT-GAN's performance. To facilitate performance comparisons, we report the hit rates for our five protein targets using the Enamine HTS collection across different Vina score thresholds (Table S7†).
The performance of structure-based generative models can be influenced by the characteristics of target pockets. To demonstrate the robustness of TopMT-GAN, we selected five protein systems that represent a range of distinct pocket characteristics. These include 3C-like protease (PDB ID 7d3i)32 and androgen receptor (PDB ID 1e3g),33 both of which feature typical deep and buried protein pockets. In contrast, allosteric sites are often shallow and situated on protein surfaces, posing a greater challenge for ligand design. To facilitate a direct comparison between these different pocket types, we focused on two kinases with analogous overall structures yet distinct pocket types: c-Src kinase, which is bound with ponatinib at its orthosteric site (PDB ID 7wf5 (ref. 34)), and checkpoint kinase 1 (CHK1) that accommodates an allosteric ligand (PDB ID 3jvs35). Additionally, we selected the glucagon-like peptide-1 (GLP-1) receptor (PDB ID 5vew)36 to further evaluate TopMT-GAN's ability to design allosteric ligands for unusually large and shallow pockets. The five selected proteins are summarized in Table S1,† which includes their PDB ID, calculated solvent accessible surface area (SASA), pocket volume, ratio of SASA/volume and number of known actives.
Fig. 2 Distributions of TopMT-GAN generated molecules for 3C-L protease. (a1 and a2) Vina docking score distributions for scaffold-hopping and pocket-mapping modes. (b1 and b2) Scatter plots of QED versus docking scores for scaffold-hopping and pocket-mapping modes. (c1–c4) NPR space distributions. (c1) For scaffold-hopping, (c2) for pocket-mapping, (c3) for Enamine HTS collection, and (c4) for random PubChem compounds. (d) RDKit generic scaffold of the original ligand from PDB 7d3i (red box) alongside representative examples of generated scaffolds. Top row: scaffold-hopping mode, bottom row: pocket-mapping mode. (e) T-map visualization of generated molecules for pocket-mapping mode. Salmon: generated molecules, light blue: DrugBank molecules, light green: known actives.
Target | Redock score (kcal mol−1) | Sampling mode | EF < redock | EF < −8 | EF < −9 | EF < −10 | EF < −11 | EF < −12 |
---|---|---|---|---|---|---|---|---|
3C-like protease PDB 7d3i | −8.3 | Scaffold-hopping | 69 | 32 | 440 | 5751 | N/A | N/A |
3C-like protease PDB 7d3i | −8.3 | Pocket-mapping | 115 | 43 | 1389 | 46… | N/A | N/A |
Androgen receptor PDB 1e3g | −11.7 | Scaffold-hopping | 770 | 9 | 121 | 1262 | 4725 | N/A |
Androgen receptor PDB 1e3g | −11.7 | Pocket-mapping | 743 | 9 | 105 | 915 | 4247 | N/A |
c-SRC kinase PDB 7wf5 | −13.2 | Scaffold-hopping | 46042 | 2 | 5 | 29 | 288 | 3300 |
c-SRC kinase PDB 7wf5 | −13.2 | Pocket-mapping | 46228 | 2 | 5 | 27 | 261 | 2883 |
CHK1 kinase PDB 3jvs | −8.1 | Scaffold-hopping | 52 | 43 | 330 | N/A | N/A | N/A |
CHK1 kinase PDB 3jvs | −8.1 | Pocket-mapping | 64 | 52 | 441 | N/A | N/A | N/A |
GLP-1 receptor PDB 5vew | −7.2 | Scaffold-hopping | 39 | 272 | 1497 | N/A | N/A | N/A |
GLP-1 receptor PDB 5vew | −7.2 | Pocket-mapping | 40 | 329 | 2580 | N/A | N/A | N/A |
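As an illustration of how an enrichment factor (EF) of this kind can be computed, the sketch below assumes that EF at a given Vina threshold is the ratio of the hit rate in the generated set to the hit rate in the screening library, where a hit is a molecule scoring strictly below the threshold. This definition, the function names, and the toy scores are assumptions made for illustration; they are not taken from the authors' code.

```python
def hit_rate(scores, threshold):
    """Fraction of docking scores strictly below `threshold` (more negative is better)."""
    if not scores:
        raise ValueError("scores must be non-empty")
    return sum(1 for s in scores if s < threshold) / len(scores)


def enrichment_factor(generated_scores, library_scores, threshold):
    """Ratio of generated-set hit rate to library hit rate (assumed EF definition).

    Returns None when the library contains no hit at this threshold,
    since the ratio is then undefined.
    """
    baseline = hit_rate(library_scores, threshold)
    if baseline == 0.0:
        return None
    return hit_rate(generated_scores, threshold) / baseline


# Toy Vina scores in kcal/mol (made up for illustration only).
generated = [-10.4, -9.6, -9.1, -8.2]
library = [-9.3, -8.0, -7.5, -7.1, -6.9, -6.4, -6.0, -5.2]

print(enrichment_factor(generated, library, -9.0))   # 0.75 / 0.125
print(enrichment_factor(generated, library, -10.0))  # library has no hit here
```

Under this definition, the attainable EF is bounded by the library size: a library of about 1.3 million compounds with a single hit has a baseline hit rate below one in a million, so even a few percent of hits in the generated set gives an EF in the tens of thousands.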
The diversity of the molecules generated by TopMT-GAN is quantified using an internal diversity metric based on Morgan fingerprints (eqn (2)). All generated sets exhibited a diversity score exceeding 0.8, indicating a broad spectrum of molecular structures (see Table 2 for details). Although TopMT-GAN is not prone to mode collapse or mere replication of the training set, we sought to further validate its ability to generate novel scaffolds. This was achieved by comparing the similarity of the generated molecules to known actives, using a metric defined in eqn (3), which is also based on Morgan fingerprints. The distribution of the maximum similarity score to known actives is depicted in Fig. S7.† The majority of generated molecules have low similarities to known actives, mostly ranging from 0 to 0.2, and the average similarity scores to known actives were 0.1 or lower. This minimal overlap with the chemical space of the known ligands emphasizes the model's ability to generate novel scaffolds and structures.
Target | Mode | Internal diversity | Averaged similarity to actives | Max. similarity to actives | # unique scaffolds |
---|---|---|---|---|---|
3C-like protease PDB 7d3i | Scaffold-hopping | 0.87 | 0.09 | 0.33 | 7574 |
3C-like protease PDB 7d3i | Pocket-mapping | 0.87 | 0.08 | 0.32 | 5961 |
Androgen receptor PDB 1e3g | Scaffold-hopping | 0.88 | 0.07 | 0.28 | 3527 |
Androgen receptor PDB 1e3g | Pocket-mapping | 0.88 | 0.07 | 0.31 | 2824 |
c-SRC kinase PDB 7wf5 | Scaffold-hopping | 0.88 | 0.08 | 0.42 | 6558 |
c-SRC kinase PDB 7wf5 | Pocket-mapping | 0.88 | 0.08 | 0.33 | 4802 |
CHK1 kinase PDB 3jvs | Scaffold-hopping | 0.88 | 0.09 | 0.38 | 9474 |
CHK1 kinase PDB 3jvs | Pocket-mapping | 0.88 | 0.09 | 0.39 | 5463 |
GLP-1 receptor PDB 5vew | Scaffold-hopping | 0.88 | 0.10 | 0.31 | 12… |
GLP-1 receptor PDB 5vew | Pocket-mapping | 0.87 | 0.10 | 0.32 | 6704 |
We also identified RDKit scaffolds within the generated sets and highlighted the most common scaffolds and their structures in Fig. 2d. Fig. 2d depicts the scaffold of the ligand bound in 3C-like protease (PDB ID 7d3i, in a red box), alongside representative generated scaffolds: those from the scaffold-hopping mode on the top row, and those from the pocket-mapping mode on the bottom row. These generated scaffolds present three hydrophobic interaction sites very similar to those of the original ligand. For both modes, each 50000-molecule set contained over 2500 unique scaffolds, as detailed in Table 2.
The spatial distribution of these molecules is illustrated in T-map37 space (Fig. 2e), with molecules color-coded as follows: salmon for generated molecules, light blue for DrugBank38 molecules, and light green for known actives. The T-map offers a tree-like visualization, where similar molecules cluster closely, often on the same branch, providing an intuitive view of the molecular diversity within the sets. The extensive branching observed in the T-map confirms TopMT-GAN's ability to explore diverse chemical spaces, encompassing the range of DrugBank compounds and known actives, and even extending beyond them. Collectively, these findings demonstrate that TopMT-GAN excels in generating molecules that are both potent and structurally diverse for orthosteric pockets.
To further elucidate these differences, we selected two kinase targets for comparison: the c-Src kinase (PDB ID 7wf5), which has a ligand bound in the orthosteric pocket, and the serine/threonine–protein kinase CHK1 (PDB ID 3jvs), which is in complex with a ligand at an allosteric site. Fig. 3a provides a detailed comparison of these two pockets. As expected, the orthosteric pocket in the c-Src kinase, located in a cleft between two domains (pink mesh), is significantly deeper and more buried than the allosteric pocket in the CHK1 kinase (cyan mesh), which is much shallower and exposed on the surface. The SASA/volume ratios for the c-Src kinase and CHK1 are 1.48 and 1.12, respectively. Moreover, the Vina redock scores for their co-crystallized ligands are −13.2 kcal mol−1 and −8.2 kcal mol−1, respectively, supporting the notion that allosteric ligands do not bind as strongly as orthosteric ones.
Fig. 3 Comparison of two kinases and their ligand binding poses. (a) Structural alignment of c-Src kinase (PDB ID 7wf5, light pink) and CHK1 kinase (PDB ID 3jvs, pale cyan) with the orthosteric pocket of c-Src kinase and the allosteric pocket of CHK1 shown in a mesh surface representation. (b1) Ligand from the crystal structure of c-Src kinase. (b2 and b3) Two ligands generated for the orthosteric pocket of c-Src kinase, with generated poses in salmon and redocked poses in cyan. Vina scores are reported as redocked scores and RMSDs are calculated between generated and redocked poses of the ligand. (c1) Ligand from the crystal structure of CHK1 kinase. (c2 and c3) Two generated ligands for the allosteric pocket of CHK1 kinase, with their corresponding redocked scores and RMSDs.
The ligands generated for c-Src kinase show similar results to those for 3C-like protease and androgen receptor, with the enrichment factor reaching an impressive 46042 and 46228 at the redock score threshold of −13.2 kcal mol−1. This demonstrates TopMT-GAN's capability in designing exceptionally potent ligands. However, due to the inherent challenges of allosteric pockets, the generation of potent allosteric ligands is notably more difficult. Nonetheless, for CHK1 allosteric ligand generation, TopMT-GAN achieved significant enrichment factors of 330 and 441 at a threshold of −9 kcal mol−1 for the scaffold-hopping and pocket-mapping modes, respectively. Moreover, the Vina score distribution of the molecules generated for allosteric sites was better than or equal to that of known actives (Fig. S4†), underscoring TopMT-GAN's efficiency in generating ligands for allosteric pockets.
To further evaluate the robustness of TopMT-GAN, we selected the GLP-1 receptor, a prominent protein in diabetes treatment. Its native ligand, GLP-1,39 comprises approximately 30 receptor-interacting amino acids,40 posing a significant challenge for the design of small molecule agonists that can produce biological effects similar to those of GLP-1. Consequently, the design of allosteric ligands has emerged as a more viable strategy. The allosteric pocket of the GLP-1 receptor, shown in Fig. S3b,† has a shallow and irregular shape, which complicates the design process. This challenge is evidenced by the weak redocking score of −7.2 kcal mol−1 for the co-crystallized ligand at the allosteric site (PDB ID 5vew).
Despite these complexities, TopMT-GAN successfully generated allosteric ligands for the GLP-1 receptor. The results were promising: compared to HTVS, TopMT-GAN achieved enrichment factors of 1497 and 2580 (at a threshold of −9 kcal mol−1) for scaffold-hopping and pocket-mapping modes, respectively. Furthermore, the Vina score distributions of our generated molecules were superior even when compared to 796 known actives (Fig. 4). Fig. 4c shows the top 10 scaffolds for pocket-mapping mode, along with their molecular structures and occurrence counts. Despite their structural diversity, most of these scaffolds notably feature two branches and an anchoring site, allowing them to fully occupy the pocket and interact extensively with the receptor, enabling binding similar to that in the crystal structure.
We next examined the drug-likeness properties of our generated molecules by computing their Quantitative Estimate of Drug-likeness (QED). QED is a dimensionless score, whose value ranges between 0 and 1, with 1 being the most drug-like. The correlation between QED and Vina docking scores is depicted in Fig. 2b and 4b. The upper left quadrants represent potential high-quality hit ligands characterized by both high QED and low docking scores. Notably, a substantial fraction of the molecules generated by TopMT-GAN outperforms the co-crystallized ligands and known actives in terms of both QED and docking scores. This indicates a superior potential of TopMT-GAN for hit and lead identification. Other molecular properties, such as logP and synthetic accessibility score (SAS), are provided in Fig. S4.† The distribution of logP resembles those of the Enamine HTS collection and the PubChem dataset. Regarding the SAS, our generated molecules tend to have higher scores, indicating more complex molecular structures, as seen in the comparison with the PubChem dataset. This is expected, as we did not set out to optimize this property in the initial version of our TopMT-GAN implementation.
Fig. 5 Selected ligand poses for 3C-like protease. (a) Ligand in the crystal structure of 3C-like protease (PDB ID 7d3i). (b1) Pose and shape of the original ligand in the crystal structure; (b2) detected pocket. (c1 and c2) Selected poses generated in the scaffold-hopping mode with generated poses in salmon and redocked poses in cyan, including Vina redocked scores and RMSD values between generated and redocked poses. (d1 and d2) Selected ligands and their poses generated in the pocket-mapping mode.
In the subsequent analysis, the generated ligands were subjected to a redocking process, with the lowest-scoring redocked poses highlighted in cyan. The comparison revealed that while some generated poses align closely with the best redocked poses, there were instances where the redocked poses showed stronger binding. It is worth noting that our model was not explicitly trained to predict the best binding mode. Additionally, docking algorithms are not guaranteed to produce the most favorable binding poses either. Nevertheless, the generated poses set a baseline for potentially favorable binding interactions, with the redocked poses either meeting or surpassing this baseline. For all five systems, an average of 87.4% of redocked poses were either equally good or better than the generated poses (Table S4†). This capability is crucial for the initial ranking and selection of promising hit candidates for further detailed analysis.
To assess the quality of our generated binding poses, we performed a molecular geometry analysis using PoseCheck,42 focusing on two key metrics: steric clashes and strain energy. The performance of TopMT-GAN on these metrics was compared with other leading structure-based generative models (Table S3†). TopMT-GAN demonstrates encouraging performance in steric clashes, averaging 27 clashes per molecule, comparable to Pocket2Mol's 15 clashes. This relatively low clash count reflects the effectiveness of our two-stage geometric optimization strategy. In the first stage, only the ligand structure is optimized to establish a reasonable initial geometry. In the second stage, the structure is refined within the binding pocket, minimizing unfavorable steric interactions.
However, our analysis revealed a limitation in strain energy for TopMT-GAN. It generated poses with an average strain energy of 1170 kcal mol−1. While this is significantly lower than PMDM (30822 kcal mol−1) and PocketFlow (6939 kcal mol−1), it remains higher than Pocket2Mol's remarkably low strain energy of 91 kcal mol−1. This discrepancy may stem from our use of AutoDock Vina for rapid pose scoring, which prioritizes computational efficiency over accurate energetic evaluation. Although Vina effectively captures protein–ligand interactions, it does not explicitly account for internal ligand strain, potentially resulting in less energetically favorable conformations. Future improvements could incorporate explicit strain energy terms into the scoring function to generate more energetically favorable conformations.
The generation speed varies depending on the size of the pocket to be explored and the size of the desired molecules. The pocket-mapping mode, which encompasses larger pocket volumes, typically results in longer sampling times. Notably, the slowest case (pocket-mapping mode for the c-Src kinase orthosteric pocket, the largest among the five pockets tested) generates molecules at a rate of 330 seconds per 100 molecules. Overall, TopMT-GAN requires between 0.25 and 1.89 days to generate 50000 promising ligands. These results underscore TopMT-GAN's exceptional potential in scaling up and generating a modestly sized, focused library of potent and diverse molecules. To the best of our knowledge, this positions TopMT-GAN as the fastest among 3D structure-based generative models to date. The model's ability to rapidly produce a large number of potential ligands highlights its efficiency in exploring chemical spaces, significantly enhancing its potential utility in drug development processes.
The generated molecules were docked to their respective crystal structure pockets using AutoDock Vina. The docking score distributions for all models are shown in Fig. S6.† Across all five targets, TopMT-GAN consistently outperformed other generative models. While other models produced molecules with good docking scores, they did not show significant improvements over the HTVS baseline. This finding underscores the importance of contextualizing docking scores within specific protein pockets rather than relying on docking scores alone. For instance, deep and buried pockets, such as kinase orthosteric sites, tend to yield high docking scores regardless of the generative method used. Thus, while generated molecules may achieve impressive absolute docking scores, the relative improvement over the HTVS baseline provides a more meaningful measure of generative model performance.
In addition to docking performance, we evaluated other model quality metrics across the generative models (Table S2†), focusing on general molecular properties such as drug-likeness (QED), synthetic accessibility (SAS), and internal diversity. Each model shows distinct strengths: Pocket2Mol achieves the highest QED (0.65 ± 0.17), while TopMT-GAN and PocketFlow show moderate QED values (0.45 ± 0.19 and 0.46 ± 0.17, respectively). For synthetic accessibility, TopMT-GAN and PMDM generated molecules with higher SAS scores (5.6 ± 0.7), indicating potentially more complex structures, whereas PocketFlow and Pocket2Mol produced molecules with lower synthetic complexity (3.4 ± 1.1 and 3.1 ± 1.3, respectively). All models demonstrated high internal diversity (0.88–0.91), suggesting effective exploration of diverse chemical spaces. These additional metrics provide valuable context for comparing generative models, though binding performance remains the primary evaluation criterion.
To address these challenges, we introduced TopMT-GAN, a novel structure-based 3D molecular generative model, for generating chemically valid 3D molecular structures within target binding sites. This model combines deep learning, in the form of generative adversarial network (GAN) models, with A-star search to predict high-affinity drug-like compounds with novel chemical structures. Comprehensive testing across five diverse protein targets demonstrated its robust performance in generating large-scale, highly diverse potent ligand sets targeting both orthosteric and allosteric pockets. Importantly, this work represents the first extensive comparison of a structure-based generative model with high-throughput virtual screening. Our results reveal TopMT-GAN's superior efficiency in exploring the relevant chemical space.
The current version of TopMT-GAN prioritizes the generation of potent and diverse ligands. This focus may result in some generated molecules that are difficult to synthesize. However, the flexible two-step framework offers significant potential to address this limitation. By integrating the synthetic accessibility score (SAS) into our generative model, through tuning or constraining the generation space, TopMT-GAN will be able to generate molecules with enhanced synthetic accessibility.
In conclusion, we developed TopMT-GAN – a highly efficient 3D generative model for ligand design based on protein target pockets, which has demonstrated superior performance in efficiently generating diverse and potent ligands with precise 3D poses within protein pockets. With continued improvement and development, this model offers a promising tool for the efficient identification of viable therapeutic candidates, holding the promise of revolutionizing the process of drug discovery by significantly shortening the path from concept to clinic.
We define the binding pocket shape using two complementary strategies: a scaffold-hopping mode that leverages known ligand structures and a probe-based pocket-mapping mode developed specifically for cases where reference ligands are unavailable. In the pocket-mapping mode, probes were selected from Enamine's curated mini-fragment library, which comprises 80 small fragments with heavy atom count ranging from 5 to 7. To ensure comprehensive pocket coverage, each fragment was docked into the pocket, retaining the top 10 docking poses. These poses were manually inspected to remove outliers positioned outside the primary binding region. The combined spatial volume occupied by the validated probe poses defines the binding pocket shape. We opted to use physical small molecule fragments as probes because accurately defining the interface volume is critical for our model's performance. Overestimating the pocket volume could lead to generated molecules with steric clashes, while underestimating it might result in missed opportunities for favorable interactions or overly constrained designs.
Spheres with a radius of 1.6 Å, resembling the van der Waals radius of carbon, are compactly arranged in a face-centered cubic (FCC) lattice to fill the pre-defined shape space. A detailed evaluation of this approach can be found in the ESI.† This explicit shape depiction constrains the generative space to relevant pockets. To account for the discrete nature of the sphere stacking and enhance the diversity in molecular generation, random orientations and minor perturbations are introduced during the stacking process. Once the spheres are stacked, a fully connected graph is constructed by connecting all the first and second nearest neighbors among the spheres. This graph effectively captures the spatial relationships within the chosen pocket.
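The sphere-stacking step can be sketched as follows. This is a minimal illustration, assuming touching spheres of radius 1.6 Å (so that the FCC lattice constant is 2√2 times the radius), a toy spherical "pocket" of 8 Å radius, and a distance cutoff equal to the second-nearest-neighbor distance. The function names and the toy pocket are hypothetical, and the random orientations and perturbations described above are omitted.

```python
import itertools
import math


def fcc_points(radius, extent):
    """FCC lattice points for touching spheres of `radius`, covering a cube of half-width `extent`."""
    a = 2.0 * radius * math.sqrt(2.0)  # lattice constant; nearest-neighbor distance is 2 * radius
    basis = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
    n = int(math.ceil(extent / a)) + 1
    points = []
    for i, j, k in itertools.product(range(-n, n + 1), repeat=3):
        for bx, by, bz in basis:
            points.append(((i + bx) * a, (j + by) * a, (k + bz) * a))
    return points, a


def fill_shape(inside, radius=1.6, extent=10.0):
    """Keep only lattice points for which the predicate `inside(point)` is true."""
    points, a = fcc_points(radius, extent)
    return [p for p in points if inside(p)], a


def neighbor_graph(points, a, tol=1e-6):
    """Edges between first neighbors (distance a / sqrt(2)) and second neighbors (distance a)."""
    cutoff = a + tol
    edges = set()
    for (u, p), (v, q) in itertools.combinations(enumerate(points), 2):
        if math.dist(p, q) <= cutoff:
            edges.add((u, v))
    return edges


# Toy pocket: all lattice points within 8 angstroms of the origin.
points, a = fill_shape(lambda p: math.dist(p, (0.0, 0.0, 0.0)) <= 8.0)
edges = neighbor_graph(points, a)
print(len(points), "spheres,", len(edges), "edges")
```

In an ideal FCC lattice each interior point has 12 first and 6 second nearest neighbors, so an interior node of this graph has degree 18; nodes near the pocket boundary have fewer.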
For the topology generation GAN, we trained the model for 5 epochs, corresponding to approximately 360000 training steps, with a batch size of 96. Both the generator and discriminator networks were optimized using the Adam optimizer with a learning rate of 0.0002. To ensure stable GAN dynamics, we employed a dynamic training strategy, selectively pausing discriminator updates when its performance became overly dominant. The model was trained using only adversarial loss; however, we monitored multiple metrics during training to identify the optimal model checkpoint. These metrics included generator loss, discriminator loss, and statistical properties of generated graphs compared to real molecular graphs, such as edge number distributions. At each checkpoint, we generated sample graphs for visual inspection to ensure the quality of the outputs. Based on these metrics and manual evaluation of the generated samples, we empirically selected the model checkpoint that produced the most suitable molecular graph topologies.
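The exact rule used to pause discriminator updates is not given here. One simple possibility, shown purely as an assumption, is to skip the discriminator step whenever its loss falls well below the generator loss. The function name and the `dominance_ratio` value below are hypothetical and chosen only for illustration.

```python
def should_update_discriminator(d_loss, g_loss, dominance_ratio=0.3):
    """Hypothetical pause rule: skip the discriminator step when it is clearly winning.

    The discriminator is treated as overly dominant when its loss drops
    below `dominance_ratio` times the generator loss.
    """
    return d_loss >= dominance_ratio * g_loss


# Toy loss trace (made-up numbers): the discriminator pulls ahead, then the
# generator recovers and discriminator updates resume.
trace = [(0.69, 0.70), (0.40, 1.10), (0.15, 2.00), (0.50, 1.20)]
schedule = [should_update_discriminator(d, g) for d, g in trace]
print(schedule)
```

The generator is updated at every step in this scheme; only the discriminator step is conditional, which gives the generator time to catch up.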
For the type assignment GAN, we extended the training duration to 10000 epochs with a batch size of 128, while maintaining the same optimizer configuration and learning rate. To improve training stability, we employed Wasserstein GAN (wGAN) loss instead of the standard adversarial loss. Throughout training, generator and discriminator losses were continuously monitored, and generated type assignments were regularly inspected to ensure diversity and prevent mode collapse. Since mode collapse cannot be effectively assessed from loss values alone, we carefully examined the diversity of generated type assignments and then empirically selected the model checkpoint that produced sufficiently diverse molecular type assignments for inference.
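For reference, the standard Wasserstein GAN objectives can be written in a few lines. The sketch below uses plain Python lists of critic scores in place of network outputs. It shows only the loss terms and leaves out the Lipschitz constraint (weight clipping or a gradient penalty), since the variant used for TopMT-GAN is not specified here.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of floats."""
    return sum(values) / len(values)


def critic_loss(real_scores, fake_scores):
    """WGAN critic loss: minimizing it raises scores on real data and lowers them on fakes."""
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_scores):
    """WGAN generator loss: minimizing it raises the critic's scores on generated samples."""
    return -mean(fake_scores)


# Toy critic outputs (made-up numbers) for a batch of real and generated samples.
real = [1.0, 3.0]
fake = [0.0, 1.0]
print(critic_loss(real, fake), generator_loss(fake))
```

Unlike the standard adversarial loss, these scores are unbounded real numbers rather than probabilities, which is why the loss values alone say little about sample diversity.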
For the training data, we applied a straightforward filtering criterion: molecules containing more than 50 heavy atoms were excluded to focus on drug-like molecules and maintain computational efficiency. Since GANs learn to generate data by matching distributions rather than memorizing specific examples, we did not use a traditional training/validation split. Instead, the evaluation focused on the quality and diversity of generated molecules, which serve as the primary metrics for assessing model performance.
The graphs generated by this GAN may have validity issues. However, most of these can be resolved with minimal adjustments to bond types. Graphs whose validity issues cannot be resolved are simply discarded.
![]() | (1) |
In addition, our primary focus is on generating a wide variety of diverse molecules on a large scale. To assess this diversity, we applied four complementary metrics: internal diversity,51 average similarity to the known actives, maximum similarity distribution to the known actives, and the count of unique scaffolds.
While properties like the quantitative estimate of drug-likeness (QED)52 and synthetic accessibility score (SAS)53 are important, we placed less emphasis on them in this initial implementation.
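Internal diversity is defined as:

$$\mathrm{IntDiv}(G) = 1 - \frac{1}{|G|^{2}} \sum_{m_1 \in G} \sum_{m_2 \in G} T(m_1, m_2) \qquad (2)$$

where $G$ is the set of generated molecules and $T(m_1, m_2)$ is the similarity between molecules $m_1$ and $m_2$.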
While molecular similarity can have different definitions, in this study, we specifically employed the Tanimoto similarity of Morgan fingerprints54 with a radius of 2. The radius of 2 refers to the consideration of the molecule's environment up to two bonds away from each atom. This also applies to the calculation of similarity to the actives. The average similarity to the actives is quantitatively defined as the average Tanimoto similarity between the generated molecules and known active compounds:
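$$\mathrm{Sim}_{\mathrm{act}}(G, A) = \frac{1}{|G|\,|A|} \sum_{g \in G} \sum_{a \in A} T(g, a) \qquad (3)$$

where $G$ is the set of generated molecules, $A$ is the set of known actives, and $T(g, a)$ is the Tanimoto similarity between a generated molecule $g$ and an active $a$.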
Using the same similarity definition, we calculated the maximum similarity between each generated molecule and the known actives and plotted their distributions. Beyond average similarity, these histograms facilitate a detailed comparison of similarity between active compounds and our generated molecules.
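These similarity-based metrics can be sketched with fingerprints represented as sets of "on" bit indices. The sketch assumes that internal diversity is one minus the mean pairwise Tanimoto similarity over all ordered pairs (self-pairs included), and that the average similarity to actives is the mean over all generated–active pairs. The toy fingerprints are made up; in practice the bit sets would come from Morgan fingerprints of radius 2.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def internal_diversity(generated):
    """One minus the mean Tanimoto similarity over all ordered pairs (assumed form)."""
    n = len(generated)
    total = sum(tanimoto(x, y) for x in generated for y in generated)
    return 1.0 - total / (n * n)


def average_similarity_to_actives(generated, actives):
    """Mean Tanimoto similarity over all generated-active pairs."""
    total = sum(tanimoto(g, a) for g in generated for a in actives)
    return total / (len(generated) * len(actives))


def max_similarity_to_actives(generated, actives):
    """For each generated molecule, its highest similarity to any known active."""
    return [max(tanimoto(g, a) for a in actives) for g in generated]


# Toy fingerprints (hypothetical bit indices).
generated = [frozenset({1, 2, 3}), frozenset({3, 4})]
actives = [frozenset({1, 2}), frozenset({4, 5, 6})]

print(internal_diversity(generated))
print(average_similarity_to_actives(generated, actives))
print(max_similarity_to_actives(generated, actives))
```

The list returned by `max_similarity_to_actives` is what would be binned into a histogram of maximum similarity, whereas the average collapses the same pairwise comparisons into a single number per generated set.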
There are many ways to define the scaffold of a molecule. Here, we use the Bemis–Murcko scaffold definition55 and remove the atom types. We count the number of unique scaffolds in each generated set of molecules as an indication of its diversity.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4sc05211k |
This journal is © The Royal Society of Chemistry 2025 |