Open Access Article
Ellen E.
Guest
a,
Stephen D.
Pickett
b and
Jonathan D.
Hirst
*a
aSchool of Chemistry, University of Nottingham, University Park, Nottingham, NG7 2RD, UK. E-mail: jonathan.hirst@nottingham.ac.uk; Fax: +44 (0)115 9513562; Tel: +44 (0)115 9513478
bGlaxoSmithKline R&D Pharmaceuticals, Computational Chemistry, Stevenage, UK
First published on 4th June 2021
The bromodomain-containing protein 4 (BRD4), a member of the bromodomain and extra-terminal domain (BET) family, plays a key role in several diseases, especially cancers. With increased interest in BRD4 as a therapeutic target, many X-ray crystal structures of the protein in complex with small molecule inhibitors are publicly available over the recent decade. In this study, we use this structural information to investigate the conformations of the first bromodomain (BD1) of BRD4. Structural alignment of 297 BRD4-BD1 complexes shows a high level of similarity between the structures of BRD4-BD1, regardless of the bound ligand. We employ WONKA, a tool for detailed analyses of protein binding sites, to compare the active site of over 100 of these crystal structures. The positions of key binding site residues show a high level of conformational similarity, with the exception of Trp81. A focused analysis on the highly conserved water network in the binding site of BRD4-BD1 is performed to identify the positions of these water molecules across the crystal structures. The importance of the water network is illustrated using molecular docking and absolute free energy perturbation simulations. 82% of the ligand poses were better predicted when including water molecules as part of the receptor. Our analysis provides guidance for the design of new BRD4-BD1 inhibitors and the selection of the best structure of BRD4-BD1 to use in structure-based drug design, an important approach for faster and more cost-efficient lead discovery.
Like its family members, BRD2, BRD3 and BRDT, BRD4 consists of two N-terminal BDs (BD1 and BD2) and an extra C-terminal domain (ET).8 Each BD is composed of four helices (αZ, αA, αB and αC), which are connected by the ZA loop and BC loop, creating the acetyl-lysine binding site (Fig. 1a). BRD4-BD1 recognises histone H4, which is anchored by a hydrogen bonding interaction between the carbonyl oxygen atom on the acetylated lysine and an asparagine residue Asn140 on the BC loop of the receptor (Fig. 1b).9 A second interaction is formed through a water mediated hydrogen bond between the acetyl lysine and tyrosine residue Tyr97 on the ZA loop. Additional binding site residues create a deep hydrophobic cavity, with Trp81 and Met149 also considered key residues in H4 and small molecule inhibitor binding.10 Over the last decade, there have been many small molecule inhibitors published, with some of them reaching human clinical trials.11–16 Contained within these inhibitors is a wide chemical diversity with core motifs encompassing azepines, 3,5-dimethyl isoxazoles, pyridones, triazolopyridines, tetrahydroquinolines, 4-acyl pyrroles and 2-thiazolidinones.17,18 Each of these structural classes contains a unique warhead, which competes with H4 to replicate the interactions with Asn140 and Tyr97. An additional common feature of small drug-like compounds bound to BRD4-BD1 is a lipophilic group, which can extend into the binding pocket and interact with the hydrophobic WPF shelf (Trp81-Pro82-Phe83).
![]() | ||
| Fig. 1 (a) Structure of BRD4-B1 (PDB 3UVW) with histone H4 (light blue) bound. (b) Binding site of BRD4-BD1. Key binding site residues are highlighted as sticks. Histone H4 is shown in light blue, with acetylated lysine residues shown as sticks. Water molecules at the base of the binding site are shown as red spheres. (c) The common sequence of BRD4-BD1 across the X-ray crystal structures in our data set. Secondary structures of α-helices αZ, αA, αB and αC are coloured in red, blue, green and orange, respectively. | ||
The consideration of water molecules is important for ligand binding, as they can stabilize a complex by acting as a bridge for hydrogen bonds and their displacement by a ligand can result in a decrease in binding affinity of that ligand.19 However, it is also possible for displacement to be associated with a favorable gain in entropy with the release of a well-ordered molecule into the bulk solvent, resulting in an increase in ligand binding affinity. It is therefore crucial in drug design to understand which water molecules mediate protein–ligand interactions and which can be targeted for displacement. At the base of the BRD4-BD1 binding pocket, there is a network of highly conserved water molecules, which has been found to be important in ligand binding and stabilising the protein structure.10,20,21 Zhong et al.22 investigated the structural and thermodynamic properties of the crystallographic water molecules. They found that it is energetically unfavourable to displace a water molecule with a small drug-like compound, as this would require a large amount of energy to compensate breaking the hydrogen bonding network. Furthermore, molecular dynamics simulations demonstrated a high occupancy for several of the water sites. Using these findings, Zhong et al. developed a docking based virtual screening protocol to identify novel inhibitors towards BRD4. This study highlighted the importance of the water network in structure-based drug design.
The pool of BRD4 inhibitors continues to grow, with most identified through fragment or structure-based drug design based on properties of known BRD4 inhibitors.23–27 Additionally, a recent review28 identified three novel strategies in targeting BRD4, including bivalent BRD4 inhibitors, proteolytic targeting chimeric molecules and re-purposing of kinase inhibitors. The selectivity of inhibitors is also being targeted.29–31 Initially many compounds were developed as BRD4 inhibitors. However, due to the structural similarity across the BDs of BET proteins, many of these were pan-inhibitors which could cause adverse side effects such as dizziness and nausea.32,33 Gilan et al.29 used structure-based design to discover compounds that interact specifically with either the BD1 or BD2 of BET proteins, providing new insights into improving therapeutic strategies with fewer toxic side effect. Speck-Planche et al.31 have also developed the first multi-target quantitative structure activity relationship (mt-QSAR) model, which can predict BET inhibitor potency against BRD2, BRD3 and BRD4.
As in many drug discovery campaigns, computational methods remain an important tool in finding BRD4 inhibitors.23,25,34,35 Structure based virtual screening methods, such as docking and 3D-QSAR, can facilitate high-throughput approximations of binding affinities. Molecular dynamics simulations expand on static representations of protein–ligand complexes by providing a more dynamic view and develop our understanding of structural patterns and interactions, which lead to high potency and selectivity.36,37 As a crucial goal of drug development is designing a compound that binds competitively and strongly, alchemical free energy calculations are becoming increasingly important as, when done well, they can provide accurate estimations of binding free energies and present a way to minimize the number of compounds that are made in the laboratory. Pan et al.23 discovered a BRD4-histone deactylase (HDAC) inhibitor, a promising therapeutic strategy for colorectal carcinoma, through high-throughput rigid molecular docking of fragments of known BRD4 inhibitors, embedded into the fragment-like library of ZINC.38 A flexible docking method was then implemented on the top 200 scoring fragments. Following this docking, 24 compounds were synthesised and tested based on the 10 top scoring fragments, resulting in a promising lead compound. The extensive research already conducted on BRD4 and its inhibitors also makes it an excellent test case for the development of novel computational workflows. For example, Fusani et al.39 merged active learning with the comparative binding energy (COMBINE) method and demonstrated its performance using a BRD4 dataset. Active learning was used to introduce an uncertainty estimation component to the COMBINE method, which is a powerful tool in studying the structural information of protein–ligand complexes and deriving QSAR for structurally similar series of compounds.
A fundamental component of structure based in silico methods is the use of X-ray crystal structures. It is therefore important to be able to rely on the starting conformations of proteins in order to make accurate predictions. For example, different active site conformations may lead to different binding poses, which can severely impact predicted binding affinities.40,41 While molecular dynamics simulations, combined with enhanced sampling methods, can be used to find the most stable binding pose, these methods come with a computational expense and it is still sensible to choose the starting structure with care. In a study on the T4 lysozyme L99A, Lim et al.42 demonstrated that predicted relative free energy values are sensitive to initial protein conformation, even when using enhanced sampling. In our study, we examine the X-ray crystal structures of BRD4-BD1 complexes in the PDB and perform structural clustering, to identify the variation of conformations and the best static representative structures of the receptor. To compare the binding site of multiple complexes and identify ligands which cause structural variation, we use WONKA.43,44 WONKA is a tool for ligand-based, residue-based and water-based analyses of protein–ligand structural ensembles. The advantage of using WONKA over other visualisation and analysis tools, such as PyMOL,45 is its ability to identify trends within a data set. It can identify patterns between structure and individual ligand complexes, and these observations are displayed on a web based graphical user interface. WONKA also analyses water displacements and relates which ligands displace conserved water molecules. Therefore, we are able to explore the extent of the conservation of crystallographic water molecules in the BRD4-BD1 binding pocket and highlight functional groups present in the ligands, which displace the usually highly conserved water network. Molecular docking is also used to assess the accuracy of predicted binding poses, with and without the water network present.
Although there is no specified upper limit to the number of crystal structures that WONKA can analyse, we found that a cutoff of 100 structures is preferable for the software to perform smoothly. To ensure that effects of a wide range of ligand activity and structural diversity were captured, the crystal structures studied using WONKA were chosen based on structural clustering of the co-crystallised compounds. Out of the 297 crystal structures of BRD4-BD1, there are 266 unique ligands in complex with the receptor, which have accessible experimental data. Compounds that show little or no activity (pIC50 < 5) were discounted, leaving 175 compounds to be clustered. Ligand structural clustering was performed in DataWarrior,50 using Fragp descriptors and Tanimoto similarity (T). Two structures were considered to be similar if T > 0.8. Representative compounds from each cluster, and their respective crystal structures, were chosen for analysis using WONKA.
54 and the CHARMM force field.55 Ligand parameters were generated using CGenFF.56 The complexes were solvated in a dodecahedral box with an edge distance of 3 nm, to construct an explicitly modelled solvent consisting of around 32
000 TIP3P water molecules57 and two Cl− ions to give a net neutral charge. In absolute FEP simulations, ligand binding affinity is estimated by calculating the difference between the free energy change of decoupling the ligand from solution and decoupling from the receptor. Therefore, a system was also prepared with the ligand free in solution. Upon setup, systems were minimized for 200 ps using a steepest descent algorithm. Equilibration was performed for 200 ps in the NVT ensemble with harmonic position restraints applied to the heavy atoms with a force constant of 1000 kJ mol−1 nm−2. Temperature coupling was achieved by using velocity rescaling with a stochastic term and a reference temperature of 298 K. A further equilibration was performed for 4 ns in the NPT ensemble with a Berendsen pressure and temperature coupling scheme. The decoupling of the ligands from the receptor was split into 30 lambda windows, where ligand restraints were applied and van der Waals and Coulomb interactions were gradually turned off. The relative positions of the ligands with respect to the receptor were restrained by one bond, two angles and three dihedral harmonic potentials. To account for these restraints, a correction is applied to the free energy of binding. For the free ligand in solvent, van der Waals and Coulomb interactions were decoupled over 20 lambda windows. Each lambda window consisted of 1 ns of equilibration in the NPT ensemble, followed by 2 ns of data collection. Free energy changes were evaluated with the Bennett Acceptance Ratio (BAR) method58 as implemented in GROMACS.
![]() | ||
| Fig. 2 Distribution of the resolution of 297 X-ray crystal structures of BRD4-BD1 complexes (dark grey) and the 101 crystal structures analysed using WONKA (light grey). | ||
Structural differences within the ensemble of crystal structures were measured using pairwise RMSD. The median and mean pairwise RMSDs are 0.56 Å and 0.58 ± 0.22 Å respectively. These RMSD values are small and suggest high similarity between most protein structures within the ensemble. The structural similarity between the superimposed structures can be observed in Fig. 3a. The maximum pairwise RMSD is 1.70 Å between PDB entries 5KU3 and 6V1L. The overlay of these structures (Fig. 3b) suggests the largest structural variance occurs in the tail leading to the N-terminus. There is a small amount of deviation in the ZA loop. Given the narrow distribution of RMSD values, it is sensible to group the structures into five clusters. Fig. 4 shows an overlay of the representation crystal structures of the five structural clusters. The position of the binding site residues further demonstrates the similarity between different structures of the receptor. The centroid of the whole ensemble is the crystal structure with PDB code 4BJX, which has a resolution of 1.59 Å. With such a high number of crystal structures available for BRD4, these results can aid the selection of the most representative structures to use for the computational study of BRD4-BD1.
![]() | ||
| Fig. 3 (a) Superimposed structures of 297 BRD4-BD1 X-ray crystal structures available in the PDB. (b) Comparison of structures PDB 5KU3 (blue) and 6V1L (orange), which have the highest pairwise RMSD. | ||
![]() | ||
| Fig. 4 Representative X-ray crystal structures after grouping 297 crystal structures of BRD4-BD1 into five clusters. The PDB codes for the structures are 5WA5 (purple), 4PS5 (red), 5NNE (blue), 5D0C (yellow), 5U28 (lime) and 4BJX (orange). (a) Secondary structures of the five representative structures. (b) Active site of the five structures, key binding residues are highlighted as sticks. (c) The resolutions of each X-ray crystal structure. | ||
![]() | ||
| Fig. 5 Distribution of pIC50 values for the representative 101 compounds (shaded) compared to the total data set (open). | ||
![]() | ||
| Fig. 6 The position of key binding site residues in BRD4-BD1 over 101 X-ray crystal structures. Conformations of Trp81, which show the largest deviation, are shown in blue, red and green. | ||
Visual inspection reveals that all key binding site residues, with the exception of Trp81, have a high level of conformational similarity across all of the crystal structures, regardless of the structure or activity of the bound ligand. There are three structures of Trp81 that show dissimilar conformations, highlighted in blue, red and green in Fig. 6. There are multiple factors that could play a role in the observed disorder of Trp81. It is important to recognise that these crystal structures provide information on only one static conformation. A protein is flexible in solution, and it is possible that the positions of these Trp81 residues are more ordered when sampling a different conformation. Interactions with other amino acids within the protein and with the bound ligand can also influence the position of binding site residues. The bound ligands that correlate with these deviations in Trp81 position are shown in Fig. 7. The compounds have experimental pIC50 values of 5.26, 5.89 and 5.20, which are towards the lower end of the range. As these values were measured using different biological assays, we should be cautious about directly comparing them with each other and the remaining dataset.59 However, for a potent compound, we would expect a pIC50 value upwards of 7.5, regardless of the assay conditions. A possible reason for these Trp81 positions could be that any hydrophobic interactions formed by the ligand with Trp81 do not outweigh the stability gained by polar ligand atoms forming solvent interactions. No additional binding site residue interactions are observed, in place of an interaction with Trp81. Therefore, compounds that result in this disorder of Trp81 are not desirable and do not correlate with higher binding affinity. Furthermore, it would be sensible to use structures with a ‘regular’ Trp81 position for the basis of computational studies.
![]() | ||
| Fig. 7 The three crystal structures and corresponding ligands, which contain Trp81 conformations most dissimilar from the whole PDB ensemble. | ||
![]() | ||
| Fig. 8 Crystallographic water molecules (red) in the binding pocket of BRD4-BD1 (orange). The size of the spheres indicate the extent of their conservation across 101 crystal structures. For perspective, a small molecule binder (PDB 3ZYU) is shown in blue. | ||
Using WONKA, we can easily identify the ligands which displace the water molecules in the sites where they are not present. The only crystal structure with no water molecules at site 1 or 2 is PDB 6MH1 (Fig. 9). Water molecules at sites 3 and 4 are also displaced. Divakaran et al.67 acknowledge the reorganisation of the usually conserved water network and attribute it to the fluorophenyl group of the ligand. A moderate pIC50 value of 5.77 was measured for this compound. However, increased selectivity over other BET receptors was observed and it was hypothesised that this was in part due to the displacement of the water molecules. In our ensemble of crystal structures, which were analysed using WONKA, there are no other fluorine containing groups which occupy the same region of the binding site, which perhaps explains why sites 1, 2, 3 and 4 are occupied by water molecules for the majority of remaining complexes.
![]() | ||
| Fig. 9 The structure of the compound that displaces crystallographic water molecules at sites 1–4 is shown on the left. The structure on the right displaces water molecules at sites 3 and 4. | ||
Beside the structure previously discussed, there is one other crystal structure, PDB 5I88, which does not show water molecules at sites 3 and 4. The butenyl group on the ligand displaces the water network and induces a rearrangement. The addition of a butenyl group to the compound corresponds with a reduction in activity as the butenyl containing compound has a pIC50 of 6.43, while its equivalent without the butenyl group has a pIC50 of 7.04. Crawford et al.21 suggest that, while there may be multiple parameters that contribute to the decreased potency, the position of the active site water molecules may play a role.
Site 5 is occupied by a water molecule in all crystal structures, with the exception of PDB 4GPJ and 5DLZ. The displacement of this water molecule does not correlate with high activity ligands. Furthermore, there are ten crystal structures in our data set, which do not contain a water molecule at site 6. The corresponding ligands for these crystal structures have a pIC50 range of 5.30 to 7.30. The remaining water molecules depicted in Fig. 8 are present in ≤80% of the crystal structures (see ESI† for a full list of PDB codes and the presence of active site crystallographic water molecules in each). While we have not found a correlation between the displacement of specific water molecules in the network and the activity of the co-crystallised compounds, we have demonstrated the extent of their conservation. We expect this analysis to be a useful tool in selecting the best crystal structure and number of crystallographic water molecules to retain in computational studies of BRD4-BD1.
![]() | ||
| Fig. 10 Docked poses with (light blue) and without (green) including crystallographic water molecules as part of the receptor. The crystallographic pose (orange) is also shown for comparison. | ||
The receptor used in this study was the centroid of the ensemble of 297 PDB crystal structures of BRD4-BD1. Regardless of the inclusion of crystallographic water molecules as part of the receptor, all docked compounds show a good similarity to their original crystallographic conformations. This indicates that crystal structure 4BJX is a suitable starting conformation for the in silico study of BRD4-BD1.
To achieve a more detailed view of the binding site, we used WONKA to compare the conformations of individual residues that are important for histone and small molecule binding. In this analysis the positions of Asn140, Tyr97, Met149, Trp81, Pro82 and Phe83 in 101 X-ray crystal structures were compared. With the exception of a handful of Trp81 conformations, the positions of these residues were extremely similar. This shows the size and shape of the BD1 cavity remains unchanged with different ligands bound, highlighting the importance of the chemical features needed in a potential inhibitor. A polar group at the head of the ligand is necessary to form both the interaction with Asn140 and a water mediated interaction with Tyr97. Simultaneously, a lipophilic group is needed to extend into the hydrophobic cavity of BD1 and strengthen ligand binding.
Water molecules also play an important role in BRD4 ligand binding. Therefore, we examined the conservation of crystallographic water molecules in the binding site. Analysis in WONKA showed that the majority of crystal structures contain the four or five water molecules generally considered important for ligand binding. In total, there are up to 11 water molecules within 8 Å of the bound ligands, which are largely conserved across the ensemble. While there have been previous studies on this highly conserved water network,21,22 ours is the first to consider such a high number of experimental structures. Our work demonstrates the extent of the conservation and, through molecular docking and absolute FEP calculations, highlights the importance of retaining binding site water molecules in computational studies. Through this examination of BRD4-BD1 crystal structures, we have provided a quantitative basis to facilitate the selection of structures in future computational studies.
Footnote |
| † Electronic supplementary information (ESI) available: Full list of PDB codes of X-ray crystal structures of BRD4-BD1; details of active site crystallographic water molecules in each PDB used in this study; input scripts for the molecular docking experiments and absolute free energy calculations; RMSD distributions from the docking analysis. See DOI: 10.1039/d1ob00658d |
| This journal is © The Royal Society of Chemistry 2021 |