Data-driven design and screening of novel Klebsiella pneumoniae carbapenemase-2 β-lactamase inhibitors using a generative CLM
Abstract
The rapid emergence of carbapenem-resistant Enterobacterales, particularly among ESKAPE pathogens such as Klebsiella pneumoniae, has significantly compromised the effectiveness of existing antibiotics. This resistance, usually mediated by the Klebsiella pneumoniae carbapenemase-2 (KPC-2) β-lactamase, poses a critical threat to effective antimicrobial therapy and creates an urgent need for novel inhibitors. In this study, a chemical language model (CLM) was developed to generate novel drug candidates against KPC-2 by integrating deep generative modeling with a SELFIES-based recurrent neural network. The CLM was trained on approximately 2.3 million ChEMBL compounds, achieving stable convergence and syntactic validity during generation. The generated molecules were then evaluated using RDKit and in silico ADME profiling, while Fréchet ChemNet Distance (FCD) was used to assess alignment with known drug-like chemical space. The generated compounds achieved an FCD score of 0.93 and were 100% RDKit-valid; 71% satisfied Lipinski's criteria, and only 3% were flagged as PAINS. The generated compounds were shortlisted using multiple drug-likeness filters and docked into the KPC-2 active site, and their binding stability and interaction profiles were further studied via extensive all-atom molecular dynamics simulations. Stability metrics, including RMSD, RMSF, Rg, PCA and FEL, were benchmarked against relebactam, a clinically approved inhibitor of KPC-2. Compounds 46, 72, 75 and 88 demonstrated stable binding modes and favorable interaction profiles with key active-site residues of KPC-2. These findings establish a robust and scalable computational framework for the discovery of novel KPC-2 inhibitors, demonstrate the potential of CLMs as powerful tools for accelerating antibiotic discovery in the fight against antimicrobial resistance, and provide a generalizable strategy for targeting other critical resistance determinants.
The CLM used in this study is publicly available at https://github.com/sumayya-tariq/Chemical-Language-Model-CLM-.
