Konstantin Röder
Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK. E-mail: kr366@cam.ac.uk
First published on 17th February 2021
The structural versatility of histone tails is one of the key elements in the organisation of chromatin, which allows for the compact storage of genomic information. However, this structural diversity also complicates experimental and computational studies. Here, the potential and free energy landscapes for the isolated and bound H4 histone tail are explored. The landscapes exhibit a set of distinct structural ensembles separated by high energy barriers, with little difference between isolated and bound tails. This consistency is a desirable feature that facilitates the formation of transient interactions, which are required for the liquid-like chromatin organisation. The existence of multiple, distinct structures on a multifunnel energy landscape is likely to be associated with multifunctionality, i.e. a set of evolved, distinct functions. In contrast to previously reported results for other disordered peptides, this type of landscape may be associated with a conformational selection based binding mechanism. Given the resemblance to other systems exhibiting multifunnel energy landscapes, the disorder in histone tails might be better described in the context of multifunctionality.
The H4 histone tail likely plays an important role in the internucleosome interaction,12 and deletion causes significant decompaction.13 Residues 16–20 (KRHRK) form the basic patch, a positively charged region, which can interact with the acidic patch in H2A and H2B, as well as DNA.14–16 Switching between the binding modes relies on ionic screening,14 but the interactions of the H4 tail are not simply electrostatic.17,18 The H4 tail likely regulates DNA-DNA interactions between nucleosomes.19,20 These observations have led to the suggestion that the H4 tail is responsible for the polymorphic nature of chromatin structure.21
The H4 tail itself exhibits a high degree of disorder in solution NMR22 and CD spectroscopy,23 even in highly condensed states.24 A small amount of α-helical content has been observed in the isolated tail, and acetylation of lysine residues in the tail increases the helical character significantly.23 Furthermore, it has been reported that the higher degree of helicity is conditional on DNA binding.25 In another combined CD and NMR study, isolated and hyperacetylated H4 tails were observed to exhibit flexible and elongated structures.26 A multiscale computational investigation, in combination with NMR-derived constraints, reported a large amount of disorder in the isolated, unmodified tail, with increased order observed after acetylation, in particular for K16Ac.27 The increased order leads to a larger persistence length (increase of 41%27), a similar increase in secondary structure, and altered structural ensembles. The folding of the histone tails in this computational study was associated with weaker internucleosome contact, and hence poorer chromatin compaction. More recently, this change has been further characterised through a direct link between acetylation of H4 K16 and the free energy surface.28 Super-resolution microscopy revealed the loss of compaction as a result of hyperacetylation in vivo.29
In contrast, other studies have reported enhanced binding30 and decreased helical content31 for H4 K16Ac. Metadynamics simulations suggested increased conformational diversity with loss of binding to the acidic patch, and the increased disorder upon acetylation may be related to loss of intramolecular hydrogen-bonding upon binding.32 Clearly, the highly dynamic structure of the H4 tail complicates analysis significantly in experimental and theoretical studies. The highly dynamic character persists in long time scale simulations.5 This behaviour differs significantly from globular proteins, and might be closer to the theoretical limits predicted by polymer theory.33 While a number of recent studies have focused on the mesoscale,34–38 the experimental and computational descriptions of the structural ensembles adopted by the H4 tail and the associated dynamics are incomplete, mainly due to the disorder.
The energy landscape of a molecular system contains all the information necessary to describe its kinetic, thermodynamic and structural properties.39 The energy landscapes associated with intrinsically disordered proteins are often described as rugged. However, previous work has shown that intrinsically disordered systems may exhibit a large number of funnels relatively similar in energy,40 while a less structured energy landscape has been observed for Aβ monomers.41
Exploration of the potential energy landscape42,43 can provide direct insight into the structural ensembles of the histone tail and into the organisation of the energy landscape. The computational framework relies on geometry optimisation, and can therefore overcome the broken ergodicity associated with high energy barriers and long time scales.44 This approach has previously been applied to other intrinsically disordered peptides,40,41 and to analyse mutational changes.45
In this contribution, the potential and free energy landscapes for the H4 histone tail are presented. In addition, the H4 histone tail was modified by removing a proton from K16 and K20, i.e. from the basic patch, to model binding to chromatin at a simplistic level. The energy landscapes for this modified system are presented alongside those for the isolated H4 tail, highlighting how chromatin binding modifies the structural ensembles and alters the energy barriers between different states. The key result of this study is that the energy landscape for the H4 histone tail exhibits a number of clearly defined states, with the associated funnels separated by large energy barriers. The hierarchy observed in the energy landscape is reminiscent of the organisation observed in multifunctional systems, and offers an interpretation of the polymorphism of the histone tail in the context of an evolved, multifunctional system. These clearly defined states are observed to be even more stable in the bound model, indicating that the corresponding structures may play important and distinct roles.
The chosen force field retains some bias,65,66 in particular with disordered states that are too compact67 and secondary structure.68,69 Furthermore, a study of the effect of solvation models33 shows that the applied model does not represent the unfolded cases to a high accuracy. The representation of the unfolded state requires accurate capture of entropic effects, but these will not affect the potential energy landscape. The discussion of the organisation of the potential energy landscape therefore remains valid within the error usually encountered in force fields, and this approach has been applied previously.40,41 More caution needs to be applied when considering the free energy landscapes, and this point will be discussed in that context below.
The sampling was initiated by finding discrete paths between pairs of structures from previous work,27 using an improved version of the quasi-continuous interpolation scheme (details have been published elsewhere55). After discrete paths were found, all databases were merged into one database for each system. Further sampling was conducted to remove unphysical barriers70,71 and kinetic traps,71 and to improve local connectivity within regions41 of the landscape. Convergence was judged by the disconnectivity graphs72,73 and convergence of the pathways between different funnels. Free energies were obtained using the superposition approach within a harmonic approximation,74 and a recursive regrouping scheme was employed to define free energy states.75
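As a rough illustration of the harmonic superposition idea, the sketch below assigns each minimum a classical harmonic free energy, F = V − kT ln n + kT Σ ln(hν/kT), and combines minima into a group free energy through a log-sum-exp. The function names, the degeneracy argument, and the toy numbers are assumptions made for illustration only; this is not the code used to generate the landscapes described here.

```python
import math

K_B = 0.0019872041        # Boltzmann constant in kcal mol^-1 K^-1
H_OVER_KB = 4.799243e-11  # Planck constant divided by Boltzmann constant, in K s


def harmonic_free_energy(potential, freqs_hz, temperature, degeneracy=1.0):
    """Classical harmonic free energy of one minimum (kcal/mol).

    potential   : potential energy of the minimum in kcal/mol
    freqs_hz    : vibrational normal-mode frequencies in Hz
    degeneracy  : hypothetical permutational weight of the minimum
    """
    kt = K_B * temperature
    # Each mode contributes kT * ln(h * nu / kT) in the classical limit.
    log_vib = sum(math.log(H_OVER_KB * nu / temperature) for nu in freqs_hz)
    return potential - kt * math.log(degeneracy) + kt * log_vib


def group_free_energy(free_energies, temperature):
    """Free energy of a group of minima: -kT ln(sum of exp(-F_i / kT))."""
    kt = K_B * temperature
    f_min = min(free_energies)  # shift for numerical stability
    total = sum(math.exp(-(f - f_min) / kt) for f in free_energies)
    return f_min - kt * math.log(total)


# Toy example with made-up minima (not data from this study).
stiff = harmonic_free_energy(0.0, [1.0e13] * 6, 310.0)
soft = harmonic_free_energy(0.5, [2.0e12] * 6, 310.0)
print("stiff minimum:", round(stiff, 3), "soft minimum:", round(soft, 3))
print("group:", round(group_free_energy([stiff, soft], 310.0), 3))
```

The example shows why a minimum with softer modes can end up lower in free energy than a minimum that is lower in potential energy, which is the kind of entropic reordering the free energy landscapes capture.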
To analyse the resulting landscape, DSSP76 was employed for the secondary structure assignments, half-sphere exposure (HSE)77 for the solvent-accessibility, and the radius of gyration was considered as another structural measure. Further calculations included the solvent-accessible surface area (SASA)78 for the entire molecule and the contribution of the basic patch, the distance between the lysyl nitrogen atoms in residues 16 and 20, and the hydrogen-bonds formed, as well as the ϕ and ψ dihedrals for all residues. Apart from the HSE calculations, CPPTRAJ79 was used for all calculations.
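Two of the simpler structural measures named above can be written down directly from their definitions. The following minimal sketch computes a mass-weighted radius of gyration and an interatomic distance from Cartesian coordinates; the helper names and the toy coordinates are hypothetical, and the sketch does not reproduce CPPTRAJ, which was the tool used for the reported values.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration for a list of (x, y, z) tuples."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal weights if none given
    total = sum(masses)
    centre = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)
    ]
    weighted_sq = sum(
        m * sum((c[k] - centre[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


def atom_distance(a, b):
    """Euclidean distance between two atoms, e.g. two lysyl nitrogen atoms."""
    return math.dist(a, b)


# Toy coordinates in angstroms (illustrative only).
toy_structure = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.5, 0.0)]
print("Rg:", round(radius_of_gyration(toy_structure), 3))
print("N-N distance:", round(atom_distance(toy_structure[0], toy_structure[1]), 3))
```

In practice the coordinates would be read from the stationary points in the database, and the distance would be evaluated between the side-chain nitrogen atoms of residues 16 and 20.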
Fig. 1 The potential energy landscapes for the isolated (left) and bound (right) H4 histone tail. Both landscapes exhibit a number of funnels, and high structural diversity. While the relative size of the funnels, their depths, and the barriers between them vary, the binding does not reduce the structural diversity, but biases towards different structural motifs. Besides the five shared sets of structural ensembles, A to E, each of the landscapes exhibits an additional distinct set, highlighted with a red star. The structural representations show residues 16 to 20 (KRHRK) explicitly, and the regions forming important intramolecular contacts are highlighted as follows: Gly7 to Leu10 in red, Lys12 to Gly14 in green, and Leu22 to Asp24 in purple. The assignments of the corresponding funnels are based on the key structural features exhibited by structures in each funnel, as discussed in the text and the ESI.† UCSF Chimera80 was used for the structural representations.
Four of the sets of structures, A, B, D and E, are based on hairpin-like structures. The hairpins differ in size, but they all exhibit some free residues at the N-terminus, and there are distinct additional contacts for each set. Only one set, C, deviates and displays a different structure, which involves helices. The conservation of structures between the landscapes is strongly related to the formation of key interactions between residues. In Fig. 2, the lowest energy structures in the five sets for the isolated tail are shown. For each structure, important regions are highlighted, hydrogen-bonds between residues are shown, and a surface map highlighting the basic patch orientation is displayed. For set B, there are two slightly different subsets. A detailed description of all the sets and some key properties are provided in the ESI.†
Fig. 2 Detailed structures for the sets A to E in the isolated H4 tail. For set B, the two subfunnels are represented in the second and third row. The first column shows a simplified cartoon representation with intramolecular hydrogen bonds between residues. The structural representations show residues 16 to 20 (KRHRK) explicitly, and the regions forming important intramolecular contacts are highlighted as follows: Gly7 to Leu10 in red, Lys12 to Gly14 in green, and Leu22 to Asp24 in purple. The structures in the second column show the solvent accessible surface, with the basic patch (residues 16 to 20) highlighted in dark blue. UCSF Chimera80 was used for the structural representations in the first column, and PyMOL87 for the surface representations in the second column.
A key observation for the potential energy landscapes is their organisation. The existence of multiple, stable funnels gives this system a multifunnel character. Such character has been associated with multifunctionality before,81 and hints at an evolved minimisation of frustration in the system.82–85 Indeed, when comparing the frustration index86 for the histone tail with the one derived from the potential energy landscape for Aβ monomers,41 we observe an order of magnitude lower frustration at low temperatures for the histone tail (see ESI†).
Set | ΔG/kcal mol−1 isolated | ΔG/kcal mol−1 bound |
---|---|---|
A | 8.8 | 1.9 |
B | 6.5 | 2.5 |
C | 10.7 | 10.1 |
D | 0.0 | 0.0 |
E | 7.9 | 8.3 |
The high energy barriers observed in the potential energy landscapes are still observed in the free energy landscapes, but there are differences. The highest transition state in the landscape of the isolated tail is around 45 kcal mol−1, while the largest barrier for the bound molecule is about 33% larger. Even the smallest barrier between sets, the free energy barrier between set C and set E in the isolated molecule, is over 20 kcal mol−1.
It has been reported that the structures observed for H4 are quite rigid, based on the formation of stable hairpins, and the resulting ψ and ϕ dihedral angles are restricted to few values, particularly for the Lys-Arg pairs in the basic patch.64,88 In fact, we find that this rigidity is even more restrained in the local funnels corresponding to each set, each one exhibiting a distinct pair of values for most residues. The residues heavily involved in structure stabilisation and binding (residues 12 to 23) appear to be particularly constrained. This result is in agreement with the differences in organisation and order reported for the C- and N-tail for the histone tail.64
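For readers less familiar with the backbone dihedrals discussed here, ϕ for residue i is the torsion defined by the atoms C(i−1), N(i), Cα(i), C(i), and ψ is defined by N(i), Cα(i), C(i), N(i+1). The sketch below evaluates a signed torsion angle from four Cartesian positions using the standard projection and atan2 construction. The function name and the toy coordinates are illustrative assumptions, and the reported values were obtained with CPPTRAJ rather than with this code.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Remove the components along the central bond, leaving in-plane vectors.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy four-atom chain (illustrative coordinates only).
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
print("cis:", round(cis, 1), "trans magnitude:", round(abs(trans), 1))
```

Applied per residue and per minimum within a funnel, such values would give the pairs of (ϕ, ψ) angles whose narrow spread is described above.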
Another property that has been described for the H4 tail is the distribution of the end-to-end distance.64 It was reported that the tail features very short and very long end-to-end distances, rather than a higher propensity for mid-ranged distances. Again, our observation of structures with very close contacts between the termini (for example set C), and of structures where strong interactions keep the termini apart (for example set D), is consistent with this report.
Fig. 3 Simplified free energy landscape for the isolated H4 tail at 310 K. The graph is coloured according to the surface area formed by the basic patch, with blue indicating large surface exposure, and red indicating a small contribution. Only the lowest 200 free energy minima are shown. This simplification was applied as the funnel for set A is very wide, and therefore the funnels formed by B to E are difficult to see clearly in full representations. A regrouping threshold75 of 2 kcal mol−1 was used.
The overall structural ensemble is very diverse, in agreement with NMR experiments,22 as long as the system is at or close to equilibrium. The helical content observed in experiment23 and other simulations27 is best matched by set C. The fact that the diversity in the structures is preserved in condensed environments24 might be a result of the high energy barriers.
Fig. 4 Free energy landscape for the bound H4 histone tail at 310 K. The graph is coloured according to the surface area formed by the basic patch, with blue indicating large surface exposure of the basic patch, and red indicating a small contribution to the surface area. A regrouping threshold75 of 2 kcal mol−1 was used.
The differences in the free energies of the lowest energy structures are fairly similar for the bound and unbound models, with D the lowest and C the highest energy set. A key difference arises in the energies of sets A and B. They are significantly lowered by deprotonation, probably because sidechain interactions are partly replaced by backbone interactions, allowing for a larger entropic contribution from the now flexible sidechains. Another key difference is observed for the barriers between states. In the bound model, barriers are increased, which we would expect for a system upon binding, as further constraints are introduced, making refolding less likely.
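To give a feel for what these free energy differences imply, the sketch below converts the tabulated ΔG values into Boltzmann weights at 310 K. It assumes that the tabulated numbers can be read as relative free energies of the lowest free energy structure in each set and that the system is fully equilibrated, which the high barriers may well prevent; the function name is hypothetical and the output is an illustration rather than a reported result.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal mol^-1 K^-1


def boltzmann_populations(delta_g, temperature=310.0):
    """Normalised Boltzmann weights for a dict of relative free energies (kcal/mol)."""
    rt = R_KCAL * temperature
    weights = {name: math.exp(-dg / rt) for name, dg in delta_g.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Relative free energies from the table above (kcal/mol).
ISOLATED = {"A": 8.8, "B": 6.5, "C": 10.7, "D": 0.0, "E": 7.9}
BOUND = {"A": 1.9, "B": 2.5, "C": 10.1, "D": 0.0, "E": 8.3}

for label, table in (("isolated", ISOLATED), ("bound", BOUND)):
    populations = boltzmann_populations(table, 310.0)
    print(label, {name: f"{p:.2e}" for name, p in populations.items()})
```

Under these assumptions set D dominates in both models, while the weights of sets A and B rise by several orders of magnitude in the bound model, in line with the lowering of their free energies upon deprotonation.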
The kinetics of the system are a result of the free energy barriers we observe. Based on the presented energy landscape we expect the states to be stable and long-lived with slow transitions between states, exceeding those reported recently for reversible motion within nucleosomes.89 However, this picture is for an isolated tail, and does not take into account that other molecules may alter the energy landscape and facilitate faster transitions. Furthermore, the force field inaccuracies, as discussed earlier, will likely lead to an overestimate of the barriers, as disordered states are not represented accurately. While such a correction will not merge all funnels, it will lower barriers between them. Unfortunately, from the data presented here, we cannot precisely describe this effect, and further work is needed to estimate the exact transition time scales between the ensembles. Similarly, further work is needed to study the impact of other molecules and crowding effects on the energy landscape.
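The link between barrier height and time scale can be made concrete with a back-of-the-envelope estimate. The sketch below uses the Eyring expression from transition state theory, k = κ (k_B T / h) exp(−ΔG‡ / RT), with a transmission coefficient of one. This is a simplifying assumption introduced here for illustration, not the kinetic analysis underlying this work, and since the barriers are likely overestimated, the resulting time scales should be read as rough upper bounds at best.

```python
import math

K_B_SI = 1.380649e-23   # Boltzmann constant in J K^-1
H_SI = 6.62607015e-34   # Planck constant in J s
R_KCAL = 0.0019872041   # gas constant in kcal mol^-1 K^-1


def eyring_rate(barrier_kcal, temperature=310.0, kappa=1.0):
    """Transition state theory rate constant in s^-1 for a free energy barrier."""
    prefactor = kappa * K_B_SI * temperature / H_SI
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))


# Barriers quoted in the text: the smallest is over 20 kcal/mol,
# the highest for the isolated tail is around 45 kcal/mol.
for barrier in (20.0, 45.0):
    rate = eyring_rate(barrier)
    print(f"{barrier:.0f} kcal/mol: k = {rate:.2e} s^-1, 1/k = {1.0 / rate:.2e} s")
```

Even the smallest quoted barrier corresponds to a waiting time on the order of seconds in this simple picture, far beyond what conventional molecular dynamics simulations sample, and the largest barriers would be effectively insurmountable without a change to the landscape.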
To achieve DNA binding, the key residues must be accessible, and not involved in strong intramolecular interactions. Set B fits this criterion perfectly, as all residues in the basic patch are accessible. The basic patch is also generally exposed in set E. For set D, only part of the basic patch is exposed. Finally, in ensembles A and C the residues are quite heavily involved in intramolecular binding. However, as the changes upon binding for A show, these residues can be somewhat more exposed if necessary. Similarly, in set C, Lys20 is somewhat exposed to the surroundings. However, even in these cases the exposure is limited. As a result, it is likely that structures of sets A and D, if they bind, would only interact through one of the groups, and consequently only with one binding partner. For set D, the arginines are involved in intramolecular structure formation. However, the distance between the lysyl groups is reduced, such that they end up in close proximity, and a single binding site contact would be possible.
Both sets B and E could potentially interact with two duplexes at once, or more strongly with one duplex. An interesting point to notice here is the distances between the lysyl groups in these two sets (see Tables S1 and S2 in the ESI†). For set B, which is the ensemble of hairpin structures that matches the structures described in previous studies, this distance is around 14 Å and effectively unchanged upon binding. In contrast, for set E, the distance is around 8.2 Å in the isolated case, which is reduced to 4.5 Å after binding. This shorter distance might allow for a stronger binding pattern with a single duplex. The changes in lysyl distance based on charges might also underlie the variations in bonding observed in ionic screening.14
While it is not known what binding partners the ensembles described are bound to, the existence of a number of these well-defined structures seems unlikely to be coincidental. Instead, the different structures present different options for binding to other molecules, and we suggest that this organisation may correspond to evolved multifunctionality.81,85 The apparent stability and robustness of the energy landscape would facilitate chromatin binding. Recently, work on the H1 N-terminal domain revealed that while the region is disordered in solution, a disorder-to-order transition increases its binding affinity.90 The increased binding affinity after the formation of an ordered structure for H1, and the increased stability of the observed structural ensembles upon binding, indicate that multifunctionality may be an underlying feature of the histone tails in general. This relation between the multiple funnels observed and intrinsic multifunctionality is consistent with the notion of evolved minimal frustration for biomolecules.81
To achieve the liquid-like organisation of chromatin,37,91 transient binding of the histone tails is necessary. Recent work92,93 showed that intrinsic liquid–liquid phase separation is a key factor in chromatin organisation. Nonspecific electrostatic contacts are thought to be one key feature involved, alongside the disordered behaviour of histone tails. As discussed above, all the structural ensembles of the H4 histone tail exhibit some exposure of the binding residues in the basic patch. The resulting contacts are likely based on electrostatic interactions and will therefore be nonspecific, though they would lead to different structures.
The behaviour of intrinsically disordered proteins and their participation in protein–protein and protein–DNA complexes has also been described through the concept of fuzziness.94,95 The concept establishes that folding and binding are not necessarily coupled in intrinsically disordered proteins, and that conformational heterogeneity can be maintained in bound complexes. Fuzziness exists in DNA complexes,95 through non-specific contacts. It has been proposed that fuzziness is related to structural multiplicity.96 The prediction of distinct structural ensembles in the case of the histone tail described here, with the potential for non-specific interactions fits the picture of the formation of fuzzy complexes well.
The energy landscapes characterised in this work suggest that the organisation might be better described in terms of well-defined, competing, meta-stable structures, rather than disorder. One question that arises from this observation is how histone tails can adopt alternative structures. Clearly, these transitions will not readily occur due to the high energy barriers observed. This stability is a desirable feature for chromatin binding as described above, and consequently an external stimulus may be needed to alter the adopted structure. Such a stimulus might be provided by other molecules or environmental conditions, but it would be controllable rather than random. Further, it could be hypothesised that various modifications, such as acetylation, will select structures by shifting the relative energies of funnels, similar to processes that are observed for mutational changes.45 From this point of view the chemical modifications provide a control mechanism for chromatin organisation. This idea is supported by the previously observed alterations of binding properties as a result of various acetylation patterns.26,27,29 Furthermore, post-translational modifications have been described as one possibility to control fuzzy complexes,97 like those encountered in chromatin assembly.
Finally, the results of this work can be seen in relation to previous studies of intrinsically disordered proteins and disordered tails. It has been proposed that the disorder in tails allows for better capture of binding partners via the ‘fly-casting’ mechanism.98 While the original work proposing this mechanism was based on a single-funnel picture, the multifunnel landscapes presented here are compatible with this mechanism, with other constraints, such as the available space, alternative contacts or environmental condition, potentially determining the funnel. This suggestion also hints at a conformational selection mechanism99 as a means to obtain an ordered bound structure.100 The existence of multifunnel landscapes for conformational selection has been proposed previously,101 and a dependence of the mechanism, i.e. induced-fit versus conformational selection, has been reported based on structural propensities in the isolated molecules.102 We may hypothesise that a multifunnel energy landscape, which supports a number of stable structural ensembles, leads to a conformational selection mechanism, while a higher disorder in the landscape organisation would likely lead to an induced fit mechanism. As most intrinsically disordered peptides are found in well-ordered structures in bound complexes,100 the potential energy landscape, which effectively removes the local vibrational entropic contributions, may allow us insight into the available structures for a given peptide. In this context, it has also been proposed that the inherent lack of order in these biomolecules does not stem from a desirability of disorder itself, but is an efficient way of reducing size and consequently energetic cost.103 Within this interpretation, the peptides still fulfil distinct functions, and multifunnel energy landscapes may be seen as a pathway to obtain such evolved small peptides.
The target functionality in the case of the H4 histone tail is binding to DNA to achieve chromatin organisation. A searching process to locate the binding sites via the ‘monkey bar’ mechanism has been proposed for similar processes,104 and the structural ensembles available for this histone tail could provide the means to facilitate this mechanism. Evidence of optimisation in disordered tails for such DNA searches was previously reported,105 with sequence patterns evolved to allow for speedy DNA searches. The existence of the organised, multifunnel landscape for such a tail provides further evidence for such optimisations.
Our key observation is the organisation of the landscapes, characterised by the existence of deep funnels, each containing a well-defined structural ensemble. While the histone tail has often been described as disordered, this organisation might be better interpreted in terms of a multifunctional system. In fact, the landscapes bear some resemblance to other multifunnel systems, such as G-quadruplexes.106 Hence it is possible that histone tails are evolved multifunctional molecules, where different ensembles fulfil distinct roles. While these roles cannot be assigned from the current calculations, the relationship between multiple funnels and multiple functions is established for other systems.81 The existence of a well-defined set of funnels supporting distinct structures is qualitatively different from the disorder characterised for structural glasses.107,108
The distinct ensembles may facilitate the formation of different bound complexes, helping to provide the necessary transience required for the liquid-like chromatin organisation. The stability of the chromatin-bound structures is a desirable feature in this context. A number of interesting further questions arise from this work. Other histone tails should be investigated to examine whether they also support multifunnel, and perhaps multifunctional, landscapes. Furthermore, modifications, such as acetylation, provide a potential mechanism for structural selection. Again, this hypothesis can be tested with further calculations. Finally, the energy landscape suggests a multifunctional structural ensemble, and further research on the potential bound complexes formed by each set of structures will be insightful.
Footnote
† Electronic supplementary information (ESI) available: Detailed descriptions of the structural ensembles for the bound and unbound tail; radii of gyration, percentage of the solvent-accessible surface area formed by the basic patch, and the distance between the lysyl groups in Lys16 and Lys20. Frustration indices. See DOI: 10.1039/d0cp05405d
This journal is © the Owner Societies 2021