Shashi Kumar
Sampangin Venkatesh
,
Arpan
Das
and
Naga Rajiv
Lakkaniga
*
Department of Chemistry and Chemical Biology, IIT(ISM) Dhanbad, Dhanbad, Jharkhand 826004, India. E-mail: nagarajiv@iitism.ac.in; Tel: +91 326 223 5550
First published on 30th October 2025
Generating libraries of congeneric series of compounds and development of structure–activity relationships (SAR) is a common practice among medicinal chemists. While various computational methods are available to guide scaffold optimization, their reliability in prediction can vary. Ligand free energy perturbation (FEP+) is a rigorous computational program that computes the relative binding free energies between two congeneric ligands against a target, thereby identifying the ligand with greater binding affinity. In this study, we evaluated the FEP+ method to predict the relative binding affinity of two ligands from a congeneric series towards the target proteins. A total of 34 ligand transformations were performed, spanning across 21 soluble proteins. Relative binding free energies were calculated and compared with the experimental value to assess the accuracy of prediction. With a mean unsigned error of 0.46 kcal mol−1 and coefficient of determination, R2 = 0.85, the results of this work suggest that the FEP+ is a reliable tool for the medicinal chemist to predict the relative free energies of binding with good statistical significance, demonstrating its utility in SAR development in drug discovery.
Despite its advantages, molecular docking often falls short in helping medicinal chemists prioritize which compounds in a congeneric series should be synthesized for improved target binding.5 Traditional docking algorithms often assume a rigid receptor conformation, failing to account for the dynamic nature of proteins and ligand-induced structural changes. Additionally, scoring functions used in docking simulations frequently oversimplify complex molecular interactions,6 leading to inaccuracies in ranking potential binders. Factors such as solvation effects are not accounted for,7 resulting in significantly higher false positives or negatives. To overcome these challenges, medicinal chemists integrate docking with molecular dynamics simulations, quantum mechanics-based calculations, and free energy perturbation methods to refine predictions and improve the accuracy of binding affinity assessments.
FEP introduced by Zwanzig in 1954,8 is a class of alchemical free energy estimation based on molecular dynamics that has gained popularity for the accurate prediction of relative binding free energy.9 FEP, based on principles of statistical mechanics, estimates the free energy difference, ΔΔGAB, between two ligands by modeling a gradual transformation along a well-defined thermodynamic path with a number of intermediate states (λ).10 The Desmond Ligand FEP+ program performs alchemical transformation between two ligands, A and B, in both the free state involving simple solvent and bound complexed states. By separately estimating the free energy changes in each state and calculating their difference, the relative binding free energy (ΔΔG) is determined.11 This approach provides a rigorous and quantitative method for comparing ligand binding affinities. The workflow incorporates the REST2 (replica exchange with solute tempering 2) algorithm necessary to sample the relevant phase space for accuracy.12 The workflow allows simulations of a chosen subsystem with replicas, in which a higher effective temperature is concentrated on the area of the ligand where the alchemical change is being carried out. The protein residues around the active site may also be included in the enhanced sampling region if necessary (Fig. 1).13
This study evaluates the performance of the ligand mutation FEP+ module of Desmond in predicting the relative binding affinities of a set of two congeneric ligands. By benchmarking against experimental data and analyzing key factors influencing accuracy, we aim to provide insights into the capabilities of FEP+. Our findings contribute to the growing body of evidence supporting FEP+ as a robust tool for computational studies that can guide the medicinal chemist in synthesizing compound libraries.14–16
Relative binding free energy calculations were conducted using the Ligand FEP+ module of the Desmond 2022 software suite. The OPLS_2005 force field was employed for accurate molecular interactions, and the REST2 algorithm was applied for enhanced sampling. The simulation system was solvated in an orthorhombic SPC water box, with a 5 Å buffer for the protein–ligand complex and a 10 Å buffer for the solute undergoing alchemical transformation.
A standard multistep equilibration protocol was followed (i) energy minimization to resolve steric clashes and optimize the system's geometry, (ii) two-phase equilibration, first under an NVT ensemble followed by an NPT ensemble, with positional restraints applied to the heavy atoms of the solute to stabilize the system. A final 240 ps NPT equilibration at room temperature, with all restraints removed. For the FEP+ calculations, the alchemical transformations were carried out over 5 ns at 300° K, utilizing 12 λ-windows to smoothly transition between ligand states. All simulations were performed on a desktop workstation powered with an NVIDIA TITAN V GPU.
Experimental relative binding free energies (ΔΔGexp) were derived from Gibbs free energy equations using equilibrium dissociation constants (KD) reported in the literature from surface plasmon resonance (SPR) measurements. These values were then compared with predicted free energy changes (ΔΔGpred) to assess computational accuracy.
The first case was based on the 2023 report by Li et al.22 on novel DCAF1 ligands, where pyrazole derivatives were identified as inhibitors through SPR assays. Two compounds, 1 and 2, with reported KD of 490 ± 90 nM and 38 ± 1.5 nM, respectively, were selected for FEP+ calculations. The predicted ΔΔG by the FEP+ protocol was −1.78 ± 0.05 kcal mol−1, while the experimental estimate was −1.5 kcal mol−1, calculated from the KD values. It is noteworthy that accurate predictions are impractical considering the differences in experimental setup, assay optimization conditions, handling errors, etc. For compound 1, the standard deviation is nearly 20% of the mean. Despite the high standard deviation, a close prediction and a qualitatively positive result is encouraging. In this first case, there are transformations at two positions of the ligand. We next chose to study two other transformations from the same publication, wherein the structures differ at only one position. Compounds 3 to 4 and 5 to 6 differ by only one chlorine atom at the para-position, but have nearly 3.5-fold and 5.8-fold differences in binding affinity, respectively. Again, the FEP+ protocol showed good accuracy with the predicted ΔΔG at −0.98 ± 0.12 and −1.62 ± 0.08 kcal mol−1, against the experimental estimate of −0.72 and −1.05 kcal mol−1, respectively.
A 2021 study by Mann et al.23 reported the structure–activity relationships of a series of compounds as USP5 inhibitors. Through a fragment library screening, the authors identified an initial hit with KD = 80 μM and a potent compound with low micromolar binding affinity using the SPR assay. Congeneric ligand series from the report were selected to assess the reliability of the FEP+ protocol. As a first case, we considered the transformation from compound 7 to compound 8. Although these two structures differ only in one position, with a fluorine replaced by a chlorine on the central phenyl ring, they show more than 20-fold difference in binding affinity. The FEP+ protocol could again predict the effect of this transformation with remarkable accuracy and a prediction error of 0.06 kcal mol−1. In the same report, we observed that replacement of fluorine by chlorine in the neighboring carbon of the central core (compounds 9 and 10) did not show any significant effect on the binding affinity towards USP5 (KD = 47 ± 9 μM and 42 ± 5 μM). Simulating this change using FEP+ protocol resulted in a ΔΔG of 0.06 ± 0.11 kcal mol−1, while the experimental ΔΔG was calculated to −0.07 kcal mol−1. Next, for the alchemical change from the compound 11 to 12, wherein the terminal benzene is replaced by a pyrrolidine, the predicted ΔΔG of 0.01 ± 0.09 kcal mol−1 was on par with the experimental ΔΔG of 0.07 kcal mol−1 obtained. The perturbation of compound 13 to 14, involving the replacement of methyl with hydrogen on the piperidine ring, resulted in the prediction error of ∼1.05 kcal mol−1, while the transformation from the azepane to piperidine (compound 15 to 14) gave an error of 0.82 kcal mol−1. Both solvent-exposed transformations produced qualitatively incorrect ΔΔG predictions and were treated as outliers, although they were retained in the dataset for completeness. While the default OPLS_2005 force field in the academic-licensed version of FEP+ generally reproduces binding energetics well, occasional deviations such as these may arise from the inherent challenges of modeling specific structural transformations.
A 2025 study from Kairos Discovery24 reported the discovery of indole derivatives targeting the CK2α subunit. The SAR initially was centered around the modifications at the C2 and C5 position of the indole ring (compounds 16 to 17). A 1.7-fold difference in the binding affinity was noted when the hydrogen on the C5 position of the indole was replaced with fluorine. The FEP+ study successfully predicted this decrease in KD with an error of about 0.22 kcal mol−1.
A new class of 1,5-benzodiazepine targeting the allosteric pocket of hepatitis C virus NS5B was identified by Tibotec Therapeutics and Johnson & Johnson25 in 2009. Replacement of the acyl group (compound 18) with sulphonyl (compound 19) conferred 19 with the ability to form intermolecular hydrogen bonding with the residues Tyr448 and Gly449. This was marked by a 19-fold increase in the binding affinity. The FEP+ ΔΔG prediction for this transformation was −1.00 ± 0.25 kcal mol−1, while the experimental ΔΔG value was estimated to be −1.75 kcal mol−1. However, the magnitude of error here is about ∼0.75 kcal mol−1, and the default protocol produced a free energy estimate that was within the ballpark estimate.
A 2025 study on the discovery of an allosteric inhibitor of DHX9, a member of the family of helicases, by Accent Therapeutic26 was utilized for the next case study. Notably, the contributing factor for the stability in the compound 21 against 20 was bromine that formed a sulfur–halogen bond within the distance of 3.3 Å with residue Cys490. This was reflected in the binding affinities of the compounds 20 and 21 (KD = 142 nM and 27 nM, respectively). The ΔΔG values for the alchemical transformation were predicted to be −0.71 ± 0.09, which closely aligns with the experimental value of −0.99 kcal mol−1.
Syros Pharmaceuticals Inc. and Paraza Pharma Inc.'s27 study from 2021 revealed the discovery of a noncovalent inhibitor of CDK7, SY-5609, that entered the clinical trials (NCT04247126). Two cases were selected for the FEP+ simulations. The first one (compound 22 to compound 23) involved two changes: (i) hydrogen to methyl sulphonyl group on the 6th position of the indole ring, and (ii) ethyl group to trifluoromethyl on the pyrimidine core. The second case (compound 22 to compound 24) also involved two changes: (i) hydrogen to a cyano group on the 5th position of the indole ring, and (ii) ethyl group to a trifluoromethyl group on the pyrimidine core. The ΔΔG predicted were −0.41 ± 0.17 and −0.42 ± 0.10 kcal mol−1 for the first and second cases, respectively, of alchemical transformations performed. The calculated experimental ΔΔG were −1.04 and −0.89 kcal mol−1 for the first and second case transformation, respectively. The values, though, are underestimates from the experimental estimate; the FEP+ protocol yielded qualitatively accurate result, but with an error <0.7 kcal mol−1.
An article from Incyte Corporation28 identified a KRAS G12D inhibitor (INCB159020) and achieved oral bioavailability using a nonhuman primate model in 2025. The replacement of chloromethylbenzene (compound 25) with a bulkier fluoronaphthyl (compound 26) on pyrroloquinoline resulted in a 14.7-fold decrease in KD. The FEP+ simulation for this transformation predicted the ΔΔG of −2.31 kcal mol−1 while the experimentally estimated ΔΔG was −1.68 kcal mol−1. It is noteworthy that the standard deviation for the KD values was not reported, for which the error of 0.63 kcal mol−1 can be accounted for.
For our next case study, we used a 2016 article from Sally-Ann Poulsen's lab29 describing fragment-based drug design using various biophysical techniques to identify inhibitors against carbonic anhydrase II (CAII) was utilized for our case study. Initially, we performed ligand mutation on a cinnamic acid derivative, a chemotype identified using SPR from a 720-fragment library binding to CAII. The transformation involved the mutation of compound 27 to 28, in which the replacement of a vinylic hydrogen with a methyl group showed a 3.4-fold increase in binding affinity. The FEP+ program predicted this with an error of about 0.26 kcal mol−1. Another chemotype, a phenylacetic acid derivative, was subjected to ligand mutation FEP+, wherein a methyl group was replaced by the bulkier phenyl moiety (compound 29 to compound 30). The SPR KD values were found to be 5150 μM and 1080 μM for the compounds 29 and 30, respectively. The estimated and FEP+ predicted ΔΔG were −0.93 and −1.47 kcal mol−1, respectively. Again, while the FEP+ program resulted in a qualitatively correct prediction, the difference of 0.54 kcal mol−1 can be attributed to the lack of standard deviations in the reported experimental data.
Next, we looked at DNA gyrase B inhibitors reported by Brvar et al.30 (2012) and Zidar et al.31 (2015). In compound 31, replacing the acetamide with a propionamide on the thiazole ring decreased the KD value from 47.4 ± 8.8 to 6.6 ± 2 μM. The FEP+ predicted ΔΔG was −1.17 kcal mol−1, which was on par with the estimated ΔΔG of −0.81 kcal mol−1. The ligand mutation from methyl propionate in compound 33 (970 ± 120) to acetic acid in compound 34 (470± 150) resulted in a 2-fold drop in the KD value. The relative binding free energy prediction with FEP+ for the transformation had an error of about 0.4 kcal mol−1.
A 2023 study by Ahmad et al.32 utilized a DNA-encoded chemical library and machine learning to identify potential inhibitors against WD40 repeat-containing protein 91 (WDR91), a host factor for viral infection. The transformation from indolin-2-one in compound 35 to 1-phenylpyrrolidin-3-ol in compound 36 was performed. It is noteworthy that the groups transforming here are larger, unlike methyl, hydroxyl, or ester, which were discussed previously. The predicted free energy change for this transformation is −1.21 kcal mol−1, while the experimentally determined value was −1.66 kcal mol−1.
A study from AbbVie33 in 2020 on allosteric inhibitors of TNFα was utilized for the next case in FEP+ transformation. Two transformations were performed using the data from this study: (i) ortho–para hydrogen on the phenyl ring of compound 37 to ortho–para methyl on the phenyl ring of compound 38, and (ii) ortho hydrogen on the phenyl ring of compound 39 to difluoromethoxy on the phenyl ring of compound 40. The FEP+ prediction errors associated with these transformations were 0.29 and 0.23 kcal mol−1, respectively. Interestingly, the third alchemical change performed involved a mutation from hydrogen on indol-2-one (compound 40) to a piperazine-based substituent (compound 41). Although the transformation in this case involved a significant increase in the sterics, the FEP+ program predicted a ΔΔG value of −7.24 kcal mol−1 against the experimentally determined value of −5.72 kcal mol−1. It is noteworthy here that FEP+ was able to qualitatively predict the change in binding affinity, within the ballpark estimate, despite a very large transformation.
Small molecule inhibitors of choline kinase were reported in 2025 by ARIAD Pharmaceuticals.34 The fragment-based drug discovery approach fetched symmetrical and asymmetrical inhibitors with low nanomolar binding affinity. Compounds 42 and 43 were reported with KD values of 382 nM and 10 nM, respectively. Differences in the binding affinity originate from the introduction of the cyano group on the ortho position of the biphenyl core. For this transformation, the experimentally calculated ΔΔG was −2.17 kcal mol−1 while the predicted ΔΔG was −1.25 kcal mol−1. The standard deviations for the KD values are not reported in the article. Another transformation was performed from the same article that showed a 4.81-fold difference in the KD value. The ligand mutation of para-hydrogen of the biphenyl ring in compound 44 was replaced with a bulkier N-methylpiperazine in compound 45. The SPR estimated ΔΔG was −0.94 kcal mol−1, and FEP+ predicted free energy change for the transformation was −1.38 kcal mol−1. The error associated with this transformation is 0.45 kcal mol−1, demonstrating the applicability of FEP+.
Ferreira de Freitas et al.35 in 2021 identified inhibitors of lysine methyltransferase NSD2 PWWP domain, thus blocking histone H3K36me2 interaction. The replacement of the methyl group in compound 46 with the cyclopropyl group in compound 47 results in a 10.6-fold decrease in KD. The FEP+ prediction and experimentally estimated ΔΔG for this transformation were −1.75 kcal mol−1 and −1.41 kcal mol−1, respectively.
A study by Schuetz et al.36 in 2018 showed that adding polar groups to hydrophobic regions of Hsp90 inhibitors slows binding and increases their residence time. In this work, the methoxy group on compound 48, with KD = 195 ± 78 nM, was replaced with a hydrogen to yield compound 49, with KD = 42 ± 0.57 nM. The ΔΔG for this transformation predicted was on par with the experimentally estimated value. The error of −0.23 kcal mol−1 demonstrates the accuracy and the ease with which the selection of substituents can be made with the FEP+ protocol.
In another study, reported by Merck in 2013,37 7-azaindole derivatives were reported as potent inhibitors of focal adhesion kinase (FAK). We studied two transformations from this work. First, we explored the transformation of compound 50 (KD = 6380 nM) to compound 51 (KD = 603 nM), which involved the substitution of the C-5 position of the azaindole hinge-binder with a cyano group. The predicted ΔΔG for this transformation was −1.68 ± 0.11 kcal mol−1, with an error of 0.29 kcal mol−1. Visualization of the protein–ligand complex indicated that the transformation is occurring in a region of the ligand that is very little solvent-exposed. As discussed later in this section, we observed that the reliability of the results is higher when the transformation occurs in a region with less solvent exposure. A second transformation that was studied from this work was from compound 51 (KD = 603 nM) to compound 52 (KD = 24 nM). Here, the cyano group was transformed to a –CF3 moiety. Although a 25-fold difference was noted in the reported experimental KDs, which accounts for a ΔΔG of −1.92 kcal mol−1, the predicted ΔΔG was −1.16 ± 0.09 kcal mol−1, which has a slightly higher error but a qualitatively positive result. Like the previous study, the standard deviation for the calculated binding affinity was not reported in this work, which could account for the error (Fig. 2).
![]() | ||
| Fig. 2 Structures of the ligands utilized for ΔΔG calculations. The alchemical transformations undergone are illustrated in blue. | ||
Next, we examined a series of piperidinone-based inhibitors targeting MDM2, a key regulator of p53 degradation, reported by Amgen in 2014.38 Compounds 53 and 54, with KD values of 0.4 nM and 0.045 nM, respectively, were chosen for study as the difference in their binding affinities was significant. The FEP+ calculations predicted a relative free energy of −1.69 ± 0.12 kcal mol−1, favoring compound 54, with an associated prediction error of 0.39 kcal mol−1. However, in this work, the authors do not mention a standard deviation. The methodology section provides no details on the number of replicates in the binding affinity study.
A 2016 study from Novartis39 applied fragment-based screening and structure-based drug design to identify non-nucleoside inhibitors of dengue virus nonstructural protein 5 polymerase. A key transformation from a methyl sulfonamide (compound 55) to quinoline sulfonamide (compound 56), involving a ligand pair with KD values of 120 nM and 7 nM, respectively, resulted in a predicted ΔΔG of −1.64 ± 0.11 kcal mol−1, which is remarkably close to the experimentally determined ΔΔG of −1.69 kcal mol−1.
Next, we explored a 2017 study by Hao et al.40 which reported a series of quinazoline-based inhibitors of p21-activated kinase 4 (PAK4) for lung cancers. The mutation of an N-methyl imidazole (compound 57) to a cyclopropyl pyrazole (compound 58) led to a 26-fold decrease in KD (from 156 nM to 6 nM). When we studied the effect of this transformation on the relative binding free energies, an error of 0.05 kcal mol−1 was observed, which again is a very remarkable prediction. It is noteworthy that in the earlier case (compounds 55 → 56), the transformation was from a methyl to a quinoline group, which causes significant changes in the sterics of the binding pocket, while in this latter case, the transformation is from N-methyl imidazole to cyclopropyl pyrazole, which are sterically similar. In both cases, the FEP+ predictions were of high accuracy.
Next, we looked at a study from the Gunda I. Georg lab, which identified allosteric inhibitors of cyclin-dependent kinase 2 (CDK2) with nanomolar affinities, reported in 2023.41 We studied the transformation from compound 59 to compound 60, which involved introducing a bromine on the C-6 position of indole and replacing the –NO2 with –CF3. In this case, while the predicted ΔΔG was −4.99 ± 0.10 kcal mol−1, the experimental value was only −3.07 kcal mol−1. This was one of the cases in which the predicted ΔΔG was not very close to the experimental value, although it is a qualitatively positive result (Table 1).
| Sl. No. | K D from SPR assay (nM) | ΔΔGExperimental kcal mol−1 | ΔΔGpredicted kcal mol−1 | PDB ID | |
|---|---|---|---|---|---|
| Ligand A | Ligand B | ||||
| a Standard deviation not reported in literature; errors reported for FEP+ are the Bennett error. | |||||
| 1 | 490 ± 90 | 38 ± 1.5 | −1.52 | −1.78 ± 0.05 | 8F8E |
| 2 | 282a | 85 ± 10 | −0.72 | −0.98 ± 0.12 | 7UFV |
| 3 | 16 900a |
2890a | −1.05 | −1.62 ± 0.08 | 7UFV |
| 4 | 210 000 ± 79 000 |
9000 ± 2000 | −1.88 | −1.82 ± 0.22 | 7MS6 |
| 5 | 47 000 ± 9000 |
42 000 ± 5000 |
−0.067 | 0.06 ± 0.11 | 7MS6 |
| 6 | 8000 ± 2000 | 9000 ± 3000 | 0.07 | 0.01 ± 0.09 | 7MS7 |
| 7 | 94 000 ± 27 000 |
34 000 ± 6000 |
−0.62 | 0.45 ± 0.26 | 7MS7 |
| 8 | 90 000 ± 21 000 |
34 000 ± 6000 |
−0.58 | 0.24 ± 0.05 | 7MS7 |
| 9 | 1160a | 690a | −0.31 | −0.52 ± 0.02 | 8C5Q |
| 10 | 15a | 0.79a | −1.75 | −1.00 ± 0.25 | 3GNW |
| 11 | 142a | 27a | −0.99 | −0.71 ± 0.09 | 9MFR |
| 12 | 0.4a | 0.07a | −1.04 | −0.41 ± 0.17 | 7RA5 |
| 13 | 0.4a | 0.09a | −0.89 | −0.42 ± 0.10 | 7RA5 |
| 14 | 22a | 1.5a | −1.60 | −2.31 ± 0.12 | 9E5D |
| 15 | 3 080 000a |
905 000a |
−0.73 | −0.99 ± 0.13 | 5FLS |
| 16 | 5 150 000a |
1 080 000a |
−0.93 | −1.47 ± 0.09 | 5FLQ |
| 17 | 47 400 ± 8800 |
6600 ± 2000 | −1.175 | −0.81 ± 0.17 | 4DUH |
| 18 | 970 ± 120 | 470 ± 150 | −0.43 | −0.03 ± 0.2 | 4ZVI |
| 19 | 97 000a |
6000a | −1.66 | −1.21 ± 0.16 | 8SHJ |
| 20 | 300 000a |
106 000a |
−0.62 | −0.91 ± 0.07 | 6X83 |
| 21 | 325 000a |
19 000a |
−1.69 | −1.95 ± 0.07 | 6X85 |
| 22 | 19 000a |
1.3a | −5.72 | −7.24 ± 0.23 | 6X86 |
| 23 | 382a | 10a | −2.17 | −1.25 ± 0.07 | 5EQY |
| 24 | 3700a | 769a | −0.94 | −1.38 ± 0.18 | 5EQY |
| 25 | 74 000a |
7000a | −1.41 | −1.75 ± 0.07 | 6UE6 |
| 26 | 195 ± 78 | 42 ± 0.57 | −0.92 | −0.69 ± 0.15 | 5LNY |
| 27 | 6380a | 603a | −1.41 | −1.68 ± 0.11 | 4GU6 |
| 28 | 603a | 24a | −1.92 | −1.16 ± 0.09 | 4GU6 |
| 29 | 0.4a | 0.045a | −1.30 | −1.69 ± 0.12 | 4ERF |
| 30 | 120a | 7a | −1.69 | −1.64 ± 0.11 | 5HMZ |
| 31 | 156a | 6a | −1.94 | −1.99 ± 0.10 | 5XVF |
| 32 | 2600a | 15a | −3.07 | −4.99 ± 0.10 | 7RWF |
| 33 | 1700 ± 120 | 44 ± 8 | −2.18 | −1.86 ± 0.24 | 4IQK |
| 34 | 280a | 50a | −1.03 | −0.91 ± 0.22 | 5DTK |
For the next case, we chose a 2015 report by Jain et al.42 that identified naphthalene-based inhibitors of Kelch-like ECH-associated protein 1 (Keap1). N-substitution of sulfonamide with the acetamide group in compound 61 yielded compound 62 with nearly a 39-fold decrease in KD (1700 ± 120 nM to 44 ± 8 nM). The crystal structure analysis of compound 61 with Keap1 (PDB ID: 4IQK) revealed the presence of the crystal waters in the active site. The two acetamide groups in compound 62 could occupy the pocket and displace the crystal waters without incurring a large desolvation penalty (4XMB). The FEP+ protocol accurately captured this effect, predicting the relative free energy value of −1.86 ± 0.24 kcal mol−1 with the error of only 0.3 kcal mol−1. This case indeed was unique as it involved calculating energetics not just from the alchemical transformation but also the effects of desolvation of the binding pocket.
For the final case study for this work, we chose a 2016 work from Lund et al.,43 in which, a library of 490 fragments was screened to identify new inhibitors of the class D β-lactamase oxacillinase-48. The SPR-based assay was utilized preliminarily to identify 77 out of 490 fragments that bind to OXA-48. Meta substituted benzoic acid compound 63 was the 1st hit identified with the KD = 251 ± 97 μM. The replacement of meta-hydrogen on the benzene core with pyridine was favorable, as noted by the decrease in KD (50.1 ± 3.6 μM) for compound 64. FEP+ successfully predicted a relative free energy change of −0.91 ± 0.22 kcal mol−1, closely aligning with the experimental estimate of −1.02 kcal mol−1.
The statistical analysis of the FEP+ predictions demonstrated good correlation with experimental binding free energies. The MUE was 0.46 kcal mol−1, indicating that, on average, the predicted values deviated only slightly from experimental measurements. The RMSE of 0.62 kcal mol−1 further supports the reliability of the predictions, as it falls within the expected range for free energy perturbation calculations.44 Additionally, the R2 was 0.85 (Fig. 3A), suggesting a good correlation between computed and experimental values. The 95% confidence interval (CI) for the prediction error was calculated to be −0.00384 ± 0.21 kcal mol−1 (Fig. 4), indicating that the model's predictions are, on average, unbiased, with errors distributed symmetrically around zero. The relatively narrow CI suggests a consistent predictive performance with minimal systematic deviation (Table 2).
![]() | ||
| Fig. 3 (A) Correlation between experimental and FEP+ predicted ΔΔG (kcal mol−1). (B) Correlation between experimental ΔΔG (kcal mol−1) and predicted ΔΔG (kcal mol−1) using molecular docking. | ||
![]() | ||
| Fig. 4 Error distribution for experimental and predicted ΔΔG values. Mu and sigma represent mean and standard deviation for the errors. | ||
| Sl. No | Statistical method | Estimated | |
|---|---|---|---|
| 1 | MUE (kcal mol−1) | 0.46 | |
| 2 | RMSE (kcal mol−1) | 0.62 | |
| 3 | R 2 | 0.85 | |
| 4 | Confidence interval of error at 95% (kcal mol−1) | Lower bound | −0.22 |
| Upper bound | 0.21 | ||
To further characterize the protocol's predictive accuracy, the dataset was divided into two categories: (i) ligand transformations occurring within the protein's binding pocket and (ii) transformations located in solvent-exposed regions. While transformations 1, 2, 3, 9, 10, 11, 12, 13, 14, 15, 17 20, 21, 22, 25, 26, 27, 28, 31 and 32 fall under the first category, transformations 4, 5, 6, 7, 8, 16, 18, 19, 23, 24, 29, 30, 33 and 34 are categorized under the second. Linear regression analysis revealed a strong correlation for the first category (R2 = 0.89), indicating that FEP+ accurately captured the energetics of buried binding-site perturbations. In contrast, the second category exhibited a lower correlation (R2 = 0.70), reflecting reduced predictive reliability for solvent-exposed modifications. This trend highlights the method's higher robustness in contexts where ligand–protein interactions are well-defined and conformational sampling is less influenced by solvent dynamics.
Further, this study also focused on the applicability based on the nature of transformations involved. Several cases we studied included the transformation of a small moiety to a significantly larger one. Transformations 22, 24, 30, and 34, for example, involve ligand mutation of a hydrogen or methyl group to a significantly larger heterocycle or substituted heterocycle. Another scenario explored was the transformation of a smaller group into another smaller group. Transformations 5, 9, 11, and 25, among others, represent this category. In both scenarios, the FEP+ protocol showed reliable predictions of relative binding energies.
The case studies also included effect of varying electronics of the scaffold. Transformations 10, 12, 13, and 27 involved ligand mutation to electron-deficient cores by introducing stronger electron-withdrawing groups, while transformations 1, 20, and 26 did the opposite by mutation to electron-donating or removal of electron-withdrawing groups. We also included studies with ligand mutation to a different aryl or heteroaryl or cycloalkyl moiety, like in transformations 14, 19, and 31. Also, we included scenarios where the scaffold is modified at a single position (transformations 2, 4, 9, 15, 21, etc.) and multiple positions (transformations 1, 12, 20, and 32). It is noteworthy that the FEP+ protocol gave reliable results in all these scenarios.
Overall, these studies indicate that Ligand FEP+ is a reliable method to predict the relative binding free energy of a pair of congeneric ligands towards a target. This is particularly helpful for medicinal chemists who make everyday decisions to modify an active compound for improvement in other properties. For example, a lead compound with excellent enzymatic and cellular activity may suffer from poor solubility, clearance, or any other pharmacokinetic property. The medicinal chemist then has to modify the active compound while retaining the target engagement. The Ligand FEP+ method is helpful in such scenarios, at least qualitatively, to predict the effect of a transformation of a ligand on the target engagement.
It is noteworthy that in this work, we used the academic license from Desmond Research for Ligand FEP+, which does not include some features, like changing the number of lambda windows. The commercial version with all features included may provide more robust results.
Also, the applicability of a computational method is highly dependent on the quality of the starting structures. In this work, all simulations were based on high-quality crystal structures, and only protein systems with fully resolved binding sites were included. Structures with missing residues or disordered loops proximal to the ligand pocket were modelled using the Swiss-Model Expasy web server during curation, as preliminary trials showed that such gaps often led to qualitatively incorrect ΔΔG predictions due to altered local dynamics.
Next, we looked at the overall free energy difference (ΔG) as a function of time. The evolution of free energy differences over time was examined to confirm whether the system had reached convergence. Fig. S3 illustrates that the computed ΔG values stabilized after approximately 4 ns, indicating that sampling was sufficient for most ligand transformations. However, minor fluctuations in the final 1 ns were observed for certain ligands, likely due to residual sampling errors.
Overall, these findings confirm that the majority of FEP+ calculations achieved sufficient convergence. It may be noted that the studies were performed using the academic license obtained from DE Shaw Research, which does not include changing the number of λ-windows. The observed variability in work distribution overlap highlights the need for adaptive strategies to improve sampling efficiency in FEP+ simulations. While REST2-enhanced sampling was applied, ligands undergoing transformation in the solvent-exposed regions exhibited poor work distribution overlap, suggesting that additional refinements may be necessary for these challenging cases.
We compared the results of our analysis with molecular docking studies performed using Autodock.48 The docking scores generated were used for comparison with the ΔΔG scores generated by the Ligand FEP+ algorithm. As illustrated in Fig. 3B and tabulated in Table 3, the molecular docking results were highly unreliable with poor statistical significance. Specifically, the R2 for the 34 transformation studies was 0.38, indicating that the tool has several limitations. In comparison, the R2 using the FEP+ method was 0.85.
| Sl. No | ΔΔGExp kcal mol−1 | ΔΔGFEP+ kcal mol−1 | ΔΔGDocking kcal mol−1 |
|---|---|---|---|
| 1 | −1.52 | −1.78 | 0.78 |
| 2 | −0.72 | −0.98 | −0.41 |
| 3 | −1.05 | −1.62 | −0.56 |
| 4 | −1.88 | −1.82 | −0.17 |
| 5 | −0.067 | 0.06 | 0.14 |
| 6 | 0.07 | 0.01 | −0.56 |
| 7 | −0.62 | 0.45 | −0.35 |
| 8 | −0.58 | 0.24 | 0.73 |
| 9 | −0.31 | −0.52 | 0.05 |
| 10 | −1.75 | −1.00 | −1.13 |
| 11 | −0.99 | −0.71 | −0.08 |
| 12 | −1.04 | −0.41 | −0.31 |
| 13 | −0.89 | −0.42 | −0.07 |
| 14 | −1.60 | −2.31 | −0.8 |
| 15 | −0.73 | −0.99 | −0.12 |
| 16 | −0.93 | −1.47 | −0.55 |
| 17 | −1.175 | −0.81 | −0.56 |
| 18 | −0.43 | −0.03 | −1.3 |
| 19 | −1.66 | −1.21 | 0.45 |
| 20 | −0.62 | −0.91 | −0.06 |
| 21 | −1.69 | −1.95 | −0.39 |
| 22 | −5.72 | −7.24 | −4.21 |
| 23 | −2.17 | −1.25 | −0.34 |
| 24 | −0.94 | −1.38 | −2.1 |
| 25 | −1.41 | −1.75 | −0.34 |
| 26 | −0.92 | −0.69 | 0.23 |
| 27 | −1.41 | −1.68 | −0.83 |
| 28 | −1.92 | −1.16 | 0.43 |
| 29 | −1.30 | −1.69 | −0.82 |
| 30 | −1.69 | −1.64 | −1.25 |
| 31 | −1.94 | −1.99 | −1.21 |
| 32 | −3.07 | −4.99 | −0.9 |
| 33 | −2.18 | −1.86 | 0.38 |
| 34 | −1.03 | −0.91 | −0.61 |
In all the transformations performed, FEP+ identified the relative ranking of ligands with decent accuracy, making it a valuable tool for the medicinal chemist for qualitative predictions in lead optimization and drug discovery. The results also highlight that work distribution overlap in FEP+ depends on the nature of ligand transformations. Ligands transitioning in the solvent-exposed regions showed relatively poor overlap, leading to increased free energy deviations. The transformations occurring entirely within the binding pocket demonstrated good overlap, suggesting better sampling and more reliable ΔΔG predictions, as indicated by R2 values.
Supplementary information is available. See DOI: https://doi.org/10.1039/d5md00748h.
| This journal is © The Royal Society of Chemistry 2026 |