Structure of the H-NS–DNA nucleoprotein complex †

Nucleoid associated proteins (NAPs) play a key role in the compaction and expression of the prokaryotic genome. Here we report the organisation of a major NAP, the protein H-NS on a double stranded DNA fragment. For this purpose we have carried out a small angle neutron scattering study in conjunction with contrast variation to obtain the contributions to the scattering (structure factors) from DNA and H-NS. The H-NS structure factor agrees with a heterogeneous, two-state binding model with sections of the DNA duplex surrounded by protein and other sections having protein bound to the major groove. In the presence of magnesium chloride, we observed a structural rearrangement through a decrease in cross-sectional diameter of the nucleoprotein complex and an increase in fraction of major groove bound H-NS. The two observed binding modes and their modulation by magnesium ions provide a structural basis for H-NS-mediated genome organisation and expression regulation.


Introduction
The prokaryotic genome is highly compacted in the nucleoid, despite the lack of scaffolding in terms of lipid membranes and/or chromatin organization. 1 Although much work has been done to elucidate the mechanisms involved in stabilising the compacted state, the structural arrangement of the condensing agents near DNA is not clear. In previous small angle neutron scattering (SANS) works, we have measured the distribution of ions, including polyamines, around DNA. [2][3][4][5][6] More recently, we have investigated the structure of the bacterial protein Hfq bound to DNA. 7 Here, we describe the structure of the complex formed by the generic nucleoid associated protein H-NS (histone-like nucleoid structuring protein, M w = 15.6 kDa) with double-stranded DNA. H-NS plays a global role in gene regulation and represses hundreds of genes, most of which are involved in the adaptation to stress, virulence and chemotaxis. 8 It exists as a dimer as well as higher oligomeric forms and binds DNA with a preference for curved or AT-rich sequences. These sequences serve as nucleation sites for oligomerization of the protein along the duplex; thus covering regions as long as 1.5 kbp in vivo. 9 H-NS binding results in an increase in bending rigidity and/or bridging of distant, like-charged segments of the DNA molecule by protein-mediated attraction. 10,11 The relative importance of bridging and stiffening depends on buffer composition and, in particular, the presence of divalent cations (Mg 2+ ). 12,13 The formation of the nucleoprotein complex has been proposed to be the structural basis for H-NS-mediated gene silencing. [14][15][16][17] Detailed structural information on the binding of H-NS to DNA is scarce. Two distinguishable H-NS binding states have been identified, depending on the interaction of specific and nonspecific DNA target sites. 18 H-NS specifically binds to the minor groove of double-stranded DNA with a short C-terminal loop. 
19 Nonspecific binding is thought to be predominantly controlled by electrostatics and is much more prone to variation in ionic strength. Here, we describe a SANS study of H-NS complexed to rod-like DNA fragments (contour length 54 nm) in solution with monovalent and divalent salts. The contributions to the scattering (structure factors) from DNA and H-NS are obtained using solvent contrast variation. The H-NS to DNA base-pair ratio was 1 : 6, so that the DNA fragments are almost fully covered with protein. Information on the arrangement of H-NS about B-form DNA is obtained by comparison of the H-NS structure factor with coarse-grained model calculations involving the radial distribution in amino acid density. Key structural features of the nucleoprotein complex are derived, including the cross-sectional radius of gyration and the extent to which H-NS penetrates the grooves. In particular, the effect of magnesium ions on the structure of the complex is explored. The structural arrangement of bound H-NS is discussed in the context of H-NS-mediated genome organisation and gene expression regulation.

Sample preparation
DNA fragments (150 bp) were obtained by micrococcal digestion of calf thymus chromatin. 20 His-tagged Escherichia coli H-NS was purified from over-expressing BL21(DE3)/pLATE 31-hns cells. The pLATE 31-hns expression vector was constructed by the ligation independent cloning method (Thermo Fisher). As the His-tag potentially modifies the properties of H-NS, a Tobacco Etch Virus (TEV) protease site ENLYFQG was inserted between the C-terminal His-tag and H-NS sequence. The oligonucleotides used for cloning were AGAAGGAGATATAACTATGAGCGAAGCACTTAAAATTCTGAAC (for) and GTGGTGGTGATGGTGATGGCCCTGGAAGTACAGGTTCTCTTGCTTGATCAGGAAATCGT (rev). Cells from post-induction cultures were resuspended in 20 mM Tris-HCl pH 7.5, 0.5 M NaCl, 10% (v/v) glycerol and a protease inhibitor (Sigma) at 277 K. The suspension was sonicated and the lysed cells were cleared by centrifugation at 15 000 g for 30 min. DNase I (40 g L$^{-1}$) and RNase A (30 g L$^{-1}$) were added to the cleared lysate at 303 K. The solution was then applied to a Ni$^{2+}$-NTA column (GE Healthcare). The resin was washed with 20 mM Tris-HCl, pH 7.8, 0.3 M NaCl, 20 mM imidazole and the protein was eluted with a gradient of imidazole (20-500 mM). TEV digestion was carried out according to the manufacturer's instructions (Thermo Fisher). After digestion, fragmented and non-digested H-NS were removed with Ni$^{2+}$-NTA magnetic beads (GE Healthcare). Several rounds of purification were performed to obtain the required amount of 50 mg H-NS. For gel electrophoresis characterisation of DNA and H-NS see ESI, † Fig. S1 and S2, respectively, online.
Two sets of samples with 3.

Small angle neutron scattering
SANS experiments were carried out using the D11 diffractometer at the Institut Laue-Langevin, Grenoble. A wavelength of 0.6 nm with a 10% spread was selected, and data were collected at sample-detector distances of 1.1 and 13.5 m. The total counting time over all detector settings was approximately 2 h per sample. Data reduction included subtraction of the background scattering and correction for sample transmission and detector pixel efficiency. The efficiencies of the detector pixels were determined using the scattering of H$_2$O. Absolute intensities were obtained with reference to pure water, and the scattering of the sample cell with solvent at the same isotopic composition was subtracted. The sample temperature was 298 K.

Contrast variation
We have used contrast variation to match or highlight the nucleic acid and protein components of the nucleoprotein complex. The scattering contrast is hence a key experimental parameter. The scattering length contrast of the nucleotides ($i = n$) and amino acids ($i = a$) with respect to the solvent (water) is given by

$$\bar{b}_i = b_i - b_s\,\bar{v}_i/\bar{v}_s \qquad (1)$$

Here, $b_i$ and $b_s$ are the scattering lengths of solute and solvent, respectively. Note that the relevant parameter is the contrast per unit volume, so that the subtracted scattering length of the solvent has to be multiplied by the ratio of the corresponding partial molar volumes $\bar{v}_i/\bar{v}_s$. In a mixture of H$_2$O and D$_2$O, the scattering length of the solvent is

$$b_s = b_{\mathrm{H_2O}} + x\,(b_{\mathrm{D_2O}} - b_{\mathrm{H_2O}}) \qquad (2)$$

where $x$ is the mole or volume fraction of D$_2$O. The contributions to the scattering from DNA and H-NS (structure factors) are obtained from the intensities by solvent contrast variation, that is by adjusting the scattering length of water $b_s$. The scattering lengths and partial molar volumes of nucleic acid and protein have been calculated using the values reported by Jacrot. 21 The scattering length contrasts follow from eqn (1) and (2) using the parameters in Table 1 and are shown in Table 2.
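The contrast relations described above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' procedure: the water scattering lengths are built from tabulated atomic values, the molar volume of water is rounded to 18 cm$^3$ mol$^{-1}$, and the solute parameters in the demonstration are hypothetical placeholders rather than the entries of Table 1. The exchange of labile hydrogens with the solvent, which makes the solute scattering length depend on $x$, is ignored here.

```python
# Scattering lengths in fm, from tabulated coherent scattering lengths
# of the atoms: H = -3.739, D = 6.671, O = 5.803.
B_H2O = 2 * -3.739 + 5.803   # about -1.68 fm
B_D2O = 2 * 6.671 + 5.803    # about 19.15 fm
V_SOLVENT = 18.0             # molar volume of water in cm^3/mol (rounded, assumption)


def solvent_scattering_length(x):
    """Scattering length of an H2O/D2O mixture with D2O fraction x."""
    return B_H2O + x * (B_D2O - B_H2O)


def contrast(b_solute, v_solute, x):
    """Contrast of a solute unit: b_i - b_s * v_i / v_s."""
    return b_solute - solvent_scattering_length(x) * v_solute / V_SOLVENT


def match_fraction(b_solute, v_solute):
    """D2O fraction at which the contrast of the solute unit vanishes."""
    b_s_match = b_solute * V_SOLVENT / v_solute
    return (b_s_match - B_H2O) / (B_D2O - B_H2O)


if __name__ == "__main__":
    # HYPOTHETICAL solute unit: b = 30 fm, partial molar volume 100 cm^3/mol.
    b_demo, v_demo = 30.0, 100.0
    x_match = match_fraction(b_demo, v_demo)
    print("match fraction:", round(x_match, 3))
    print("contrast in H2O:", round(contrast(b_demo, v_demo, 0.0), 2))
    print("contrast at match point:", contrast(b_demo, v_demo, x_match))
```

With the actual parameters of a component, `match_fraction` would give the solvent composition at which that component becomes invisible, so that the intensity is governed by the other component.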

From intensities to structure factors
In the analysis of the structure factors, the nucleotides and the amino acids are considered to be the elementary scattering units. For a solution of nucleoprotein complexes with density $\rho_f$, the coherent part of the solvent corrected scattered intensity is the sum of three partial structure factors describing the density correlations among the nucleotides and amino acids

$$I(q)/\rho_f = \bar{b}_n^2 N_n^2 S_n(q) + 2\,\bar{b}_n \bar{b}_a N_n N_a S_{na}(q) + \bar{b}_a^2 N_a^2 S_a(q)$$

The numbers of nucleotides and amino acids per complex are denoted by $N_n$ and $N_a$, respectively. Note that for our samples with an H-NS to bp ratio of 1 : 6, $N_a$ exceeds $N_n$ by more than an order of magnitude. Unless the protein is exactly matched with $\bar{b}_a = 0$, the scattering is dominated by protein. The contribution from the small ions to the scattering at small angles is negligible. Momentum transfer $q$ is defined by the wavelength $\lambda$ of the radiation and scattering angle $\theta$ according to $q = (4\pi/\lambda)\sin(\theta/2)$. The partial structure factors $S_{ij}(q)$ with $i, j = n, a$ ($S_{ii}$ is abbreviated as $S_i$) are the spatial Fourier transforms of the nucleotide and amino acid density correlation functions and are normalised to unity at $q = 0$. The nucleoprotein complex can be seen as a cylindrical object with a length $L$. Any possible ordering of protein in register with the phosphate moieties along the DNA molecule is beyond detection in scattering from solution given isotropic ensemble averaging. In the longitudinal direction (along the DNA molecule), the nucleotide and amino acid distributions are hence assumed to be uniform. In the perpendicular direction, away from the axis of the complex, the corresponding distributions are given by the radial profiles $\rho_n(r)$ and $\rho_a(r)$, respectively. In the present range of momentum transfer, the scattering is sensitive to correlations over distances of the order of the thickness of the complex and effects of finite contour length and flexibility are negligible.
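The bookkeeping behind the statement that the protein dominates the scattering can be sketched as follows. This is an illustration under stated assumptions: each term of the intensity is taken to be weighted by the product of contrast and number of units, with the partial structure factors normalised to unity at $q = 0$; the chain length of 137 residues per H-NS is inferred from the residue numbering quoted in the Discussion; and the function names are hypothetical.

```python
import math


def momentum_transfer(wavelength_nm, scattering_angle_rad):
    """q = (4*pi/lambda) * sin(theta/2), in nm^-1 for a wavelength in nm."""
    return 4.0 * math.pi / wavelength_nm * math.sin(scattering_angle_rad / 2.0)


def units_per_complex(n_bp=150, bp_per_protein=6, residues_per_protein=137):
    """Number of nucleotides and amino acids per complex.

    Defaults: 150 bp fragments and one H-NS per 6 bp (from the text);
    137 residues per H-NS is an assumption based on the residue numbering.
    """
    n_nucleotides = 2 * n_bp
    n_amino_acids = (n_bp / bp_per_protein) * residues_per_protein
    return n_nucleotides, n_amino_acids


def intensity_per_density(b_n, b_a, n_n, n_a, s_n, s_na, s_a):
    """Sum of the three partial structure factor contributions."""
    return ((b_n * n_n) ** 2 * s_n
            + 2.0 * b_n * b_a * n_n * n_a * s_na
            + (b_a * n_a) ** 2 * s_a)


if __name__ == "__main__":
    n_n, n_a = units_per_complex()
    print("nucleotides:", n_n, "amino acids:", n_a, "ratio:", round(n_a / n_n, 1))
    # Momentum transfer at a scattering angle of 2 degrees and 0.6 nm wavelength.
    print("q =", round(momentum_transfer(0.6, math.radians(2.0)), 3), "nm^-1")
```

Setting `b_a` to zero in `intensity_per_density` leaves only the DNA term, which is the situation of a protein-matched sample; setting `b_n` to zero leaves only the H-NS term.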
The partial structure factors can then be expressed as a product of a term related to the structure of an equivalent solution of complexes with vanishing cross-section and terms involving the radial profiles

$$S_{ij}(q) = a_i(q)\,a_j(q)\,S(q)$$

with the cylindrical Fourier (Hankel) transformation of the radial profile

$$a_i(q) = 2\pi \int_0^\infty \mathrm{d}r\, r\, J_0(qr)\, \rho_i(r)$$

where $J_0$ denotes the zeroth order Bessel function of the first kind. 4,5 For sufficiently high values of momentum transfer and/or diluted samples, inter-complex interference becomes negligible.
In the latter situation $S(q)$ reduces to the form factor of an infinitesimally thin rod, that is $S(q) = \pi (qL)^{-1}$, and the nucleotide and amino acid structure factors take the limiting forms $S_n = \pi a_n^2(q)/(qL)$ and $S_a = \pi a_a^2(q)/(qL)$ ($qL \gg 1$), respectively. Accordingly, the DNA and H-NS structure factors can also be obtained directly from the H-NS and DNA-matched samples, respectively, without a fitting procedure. We found that the fitted and directly measured structure factors are identical within statistical accuracy. There is hence no uncertainty associated with obtaining the structure factors from a combination of intensities from different samples. The DNA structure factor multiplied by momentum transfer, $qS_n$, is shown in panels A and B of Fig. 2 for the two buffer systems, respectively. In the plot of $qS_n$ versus $q$, a high $q$ plateau is expected for an infinitesimally thin cylinder. However, no plateau is observed. The absence of $q^{-1}$ scaling results from the finite cross section of the DNA molecule. For the sample in T-buffer with 100 mM KCl, the DNA structure factor agrees with the high $q$ limiting form of the rigid rod form factor with a radius of 0.8 nm. 22 The sample is sufficiently dilute that intermolecular interference is absent in the present range of momentum transfer. In the presence of 10 mM MgCl$_2$, the agreement between the DNA structure factor and the rod form factor is not as good. A distinct upturn in the DNA structure factor is observed at low values of momentum transfer $q < 0.8$ nm$^{-1}$ (distances $\pi/q > 4$ nm). We attribute this upturn to H-NS mediated aggregation (bridging), which is known to be triggered by divalent cations. 12,13 Besides experimental difficulties associated with the relatively weak scattering of DNA, it is unlikely that H-NS mediated bridging (and, hence, the upturn) can be eliminated by performing experiments at lower concentrations of DNA.
For larger values of momentum transfer ($q > 0.8$ nm$^{-1}$), the effect of aggregation disappears. We refrain from further interpretation of the DNA structure factor and focus on the structure of the nucleoprotein complex as revealed by the structure factor pertaining to H-NS. The protein structure factor $S_a$ is shown in panels A and B of Fig. 3 for H-NS-DNA in the absence and presence of magnesium ions, respectively. The H-NS to bp ratio follows from the normalisation of $S_a$ and agrees with the ratio of 1 : 6 set by the respective concentrations of DNA and H-NS. In the double logarithmic representation, $S_a$ shows a characteristic shoulder at higher values of momentum transfer. This shoulder becomes less prominent, but does not disappear, in the presence of MgCl$_2$. A similar shoulder, albeit at smaller $q$-values, was previously reported for another bacterial protein Hfq. 7 This feature in the protein structure factor can be attributed to shell-like ordering of protein about DNA. As in the case of the DNA structure factor, $S_a$ shows a small, but distinct low $q$ upturn in the presence of MgCl$_2$. The effect of divalent cation-induced aggregation on the structure factors disappears, however, for $q > 0.8$ nm$^{-1}$. Accordingly, the characteristic shoulder exhibited by $S_a$ at larger values of $q$ is related neither to aggregation nor to inter-complex solution structure. Information on the structure of the thus formed protein coat can be obtained by comparing coarse-grained model calculations with the low resolution experimental data. For this purpose, the protein structure factor is compared to the relevant form factor $S_a = \pi a_a^2(q)/(qL)$, with $a_a(q)$ being the Hankel transform of the radial amino acid density profile (see Materials and methods). In the model calculations, the predicted structure factors are convoluted with the instrumental resolution function.
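The Hankel transform that enters this form factor can be evaluated numerically for any radial profile. The sketch below is one possible way to do so with the Python standard library only; it is not the authors' code. The Bessel function is computed from its integral representation, the radial integral is truncated at a hypothetical cut-off `r_max`, and a normalised Gaussian profile with an arbitrary width of 2 nm serves as a test case because its transform is known in closed form. Resolution smearing is not included.

```python
import math


def bessel_j0(x, n=200):
    """J0(x) = (1/pi) * integral over t in [0, pi] of cos(x sin t), midpoint rule."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n)) / n


def hankel_transform(profile, q, r_max, n=400):
    """a(q) = 2*pi * integral over r in [0, r_max] of r * J0(q r) * profile(r)."""
    h = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += r * bessel_j0(q * r) * profile(r)
    return 2.0 * math.pi * total * h


def gaussian_profile(r, r_a=2.0):
    """Normalised Gaussian radial profile; r_a = 2 nm is an arbitrary test value."""
    return math.exp(-(r / r_a) ** 2) / (math.pi * r_a ** 2)


def gaussian_amplitude(q, r_a=2.0):
    """Closed-form Hankel transform of the Gaussian profile."""
    return math.exp(-(r_a * q) ** 2 / 4.0)


if __name__ == "__main__":
    for q in (0.2, 0.5, 1.0, 2.0):
        numeric = hankel_transform(gaussian_profile, q, r_max=12.0)
        print(q, round(numeric, 5), round(gaussian_amplitude(q), 5))
```

The form factor then follows as `math.pi * a ** 2 / (q * L)` for a complex of length `L`. Any other normalised profile can be passed as `profile`, provided `r_max` lies beyond the point where the profile has decayed to zero.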
Based on 150 bp DNA fragments and an H-NS to bp ratio of 1 : 6, the intensity of free H-NS without long range order is a factor of 25 lower than that of H-NS bound to DNA. Accordingly, free H-NS does not significantly contribute to the scattering at small angles. For the amino acid distribution in the radial direction away from the axis of the nucleoprotein complex, we have used three different models. In the first model, a Gaussian profile $\rho_a(r) = \exp(-r^2/r_a^2)/(\pi r_a^2)$ with cross-sectional radius of gyration $r_a$ [Hankel transform $a_a(q) = \exp(-r_a^2 q^2/4)$] is assumed. The corresponding model calculations are shown by the dashed curves in Fig. 3 (the radial profiles are shown in the insets). The use of a Gaussian profile results in a reasonable fit in the lower range of momentum transfer, but fails to predict the characteristic shoulder observed at higher values of $q$. The fit is not significantly affected by the (aggregation related) low $q$ upturn in the presence of MgCl$_2$. For the Gaussian model, we have optimised the value of $r_a$. The results are shown in Table 3, together with the corresponding result reported for Hfq. 7 The cross-sectional radius of gyration of the H-NS-DNA complex is significantly smaller than that for the Hfq hexamer, which can be attributed to the smaller molecular weight of the H-NS dimer. Furthermore, the cross-sectional dimension decreases by 35% with the addition of 10 mM MgCl$_2$, which indicates a major structural rearrangement of the nucleoprotein complex. A deficiency of the Gaussian model is that it ignores the depletion of protein density at the core of the complex.
In the second model, H-NS is assumed to form a cylindrical coat surrounding the duplex, with inner and outer radii $r_1$ and $r_2$, respectively. The radial amino acid distribution is, hence, constant for $r_1 < r < r_2$, and given by $\rho(r)\,\pi(r_2^2 - r_1^2) = 1$. As was previously reported, the coat model reproduces the structure factor of Hfq well. 7 For H-NS, the shell-like radial profiles and corresponding model calculations are shown by the dotted curves in Fig. 3. In the model calculations, the inner radius of the protein distribution was set to the outer radius of the DNA molecule, that is $r_1 = 1.0$ nm. The outer radius of the complex $r_2$ was optimised to reproduce the position of the shoulder in $q$-space (results are shown in Table 3). In the presence of 10 mM MgCl$_2$, the shoulder is shifted towards higher values of momentum transfer due to a decrease in outer radius. Although its position in $q$-space can be well reproduced, the coat model does not provide a good description of the intensity and shape of the shoulder. This fit cannot be improved by a decrease in the value of $r_1$ (for instance, to account for insertion of a short C-terminal loop into the minor groove, see below) and/or by a helical distribution of H-NS around the DNA molecule. The poor fit of the coat model and the reduction in cross-sectional radius of gyration with the addition of magnesium salt suggest that a fraction of H-NS is distributed at a smaller distance, closer to the axis of the complex than the DNA outer radius of 1.0 nm. The major groove of B-form DNA has a depth and width of 0.9 and 1.2 nm, respectively, and can accommodate (part of) a small protein such as H-NS. 23
In our third model, the amino acid density follows a bimodal radial distribution. Sections of the DNA molecule are surrounded by protein, as in the second model. For this coat of protein, the inner radius is set to $r_1 = 1.0$ nm and the outer radius $r_2$ is obtained from the fit of the position of the shoulder (Table 3). In other sections of the DNA molecule, H-NS is bound to the major groove. The corresponding amino acid profile is assumed to be step-like with an arbitrary width of 0.5 nm and centred at a distance $r_b$ from the axis of the complex.
We have verified that the structure factor is most sensitive to fraction F and binding distance r b of major groove bound H-NS, whereas the width of the profile is of lesser importance. Long range correlation and, hence, interference between sections of coat-forming and major groove bound H-NS are neglected. The radial profiles and corresponding model calculations with optimised values of F and r b (Table 3) are shown by the solid curves in Fig. 3. The experimental structure factors and, in particular, the effect of MgCl 2 , are well reproduced. In T-buffer with 100 mM KCl, a small fraction of H-NS is major groove bound and distributed mid-way between the central axis and outer radius of the DNA molecule. However, most of H-NS covers the region next to the duplex. With the addition of 10 mM MgCl 2 , the complex tightens as shown by a reduction in outer diameter.
Furthermore, about half of H-NS is now major groove bound and distributed close to the phosphates of the DNA molecule.
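The coat and two-state models can be sketched numerically as follows. This is an illustrative reconstruction under assumptions, not the fitting code behind Fig. 3. The Hankel transform of a uniform shell between radii $r_\mathrm{in}$ and $r_\mathrm{out}$ is $2[r_\mathrm{out}J_1(qr_\mathrm{out}) - r_\mathrm{in}J_1(qr_\mathrm{in})]/[q(r_\mathrm{out}^2 - r_\mathrm{in}^2)]$. Because interference between coat-forming and major groove bound sections is neglected, the two contributions are added here as squared amplitudes weighted by the fraction $F$; this weighting is an assumption. The inner radius of 1.0 nm, the step width of 0.5 nm, the contour length of 54 nm and an outer radius of 3.2 nm (half the 6.4 nm outer diameter quoted in the Discussion) come from the text, whereas the values of $F$ and $r_b$ in the demonstration are hypothetical and do not reproduce Table 3. Resolution smearing is omitted.

```python
import math


def bessel_j1(x, n=200):
    """J1(x) = (1/pi) * integral over t in [0, pi] of cos(t - x sin t), midpoint rule."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total / n


def shell_amplitude(q, r_in, r_out):
    """Hankel transform of a normalised uniform shell between r_in and r_out."""
    if q == 0.0:
        return 1.0
    numerator = r_out * bessel_j1(q * r_out) - r_in * bessel_j1(q * r_in)
    return 2.0 * numerator / (q * (r_out ** 2 - r_in ** 2))


def two_state_structure_factor(q, length, r1, r2, r_b, fraction, width=0.5):
    """High-q form factor for coat-forming plus major groove bound protein.

    fraction is the share F of groove bound protein; cross terms between the
    two kinds of section are neglected (assumed incoherent sum).
    """
    coat = shell_amplitude(q, r1, r2)
    groove = shell_amplitude(q, r_b - width / 2.0, r_b + width / 2.0)
    mixed = (1.0 - fraction) * coat ** 2 + fraction * groove ** 2
    return math.pi * mixed / (q * length)


if __name__ == "__main__":
    # r1, r2 and length follow the text; F = 0.1 and r_b = 0.5 nm are HYPOTHETICAL.
    for q in (0.5, 1.0, 1.5, 2.0, 3.0):
        s_coat = two_state_structure_factor(q, 54.0, 1.0, 3.2, 0.5, 0.0)
        s_mix = two_state_structure_factor(q, 54.0, 1.0, 3.2, 0.5, 0.1)
        print(q, s_coat, s_mix)
```

The shell amplitude changes sign at finite $q$, which produces the minimum and subsequent shoulder in the coat model. The narrow groove bound profile decays much more slowly with $q$ and therefore fills in the minimum, which is one way to rationalise why a groove bound fraction alters the shape of the shoulder.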

Discussion
Our most important result is the determination of the contribution to the scattering from H-NS. This corresponds to a cylindrical nucleoprotein complex with an H-NS to DNA basepair ratio of 1 : 6. We have evaluated and tested various amino acid density profiles in the radial direction away from the axis of the complex. A cross-sectional radius of gyration is obtained from a fit of the Gaussian model in the low range of momentum transfer. The Gaussian model fails however to predict the characteristic shoulder observed in the protein structure factor at higher values of momentum transfer. This shoulder is related to the cut-off in amino acid density at the outer radius of the complex. 7 A radial distribution representing a single coat of H-NS surrounding the duplex does however not provide a satisfactory fit. Fairly good agreement between coarse-grained model calculations based on the radial density profiles and experimental protein structure factors is obtained using a heterogeneous, two-state binding model with sections of the duplex surrounded by protein and other sections with protein bound to the major groove. In a buffer of monovalent salts, most of H-NS covers the region next to the duplex. With the addition of magnesium chloride, we observe a structural rearrangement through a decrease in diameter and increase in the fraction of major groove bound H-NS.
The scattering data add quantitative information to existing knowledge of the H-NS-DNA complex and, in particular, the effect of divalent cations. The 83-residue N-terminal region is connected through a flexible linker to a short C-terminal DNA binding region (residues 91-137). 26,27 Furthermore, the N-terminal region contains a primary dimerization site 1 (residues 1-46) and secondary oligomerization site 2 (residues 67-83). Head-to-head dimers are formed by interaction of sites 1. Dimeric H-NS binds selectively to the minor groove by insertion of the C-terminal loop. 19 The head-to-head dimerization through sites 1 together with tail-to-tail oligomerization of sites 2 creates a chain of linked H-NS molecules that form a superhelical scaffold. 24 The resulting nucleoprotein filament has been proposed to be critical for the gene silencing function. [14][15][16][17] It should also be noted that the linked protein coat significantly stiffens the filament, as shown by an increase in bending persistence length from around 60 to 130 nm. 10-13 A model of H-NS oligomerization and the filament are shown in Fig. 4. In a buffer of monovalent salts, we observed that most of the DNA molecules or sections thereof are surrounded by H-NS. This coat of protein covers the area next to the duplex, which agrees with selective binding of dimeric H-NS with the C-terminal loop to the minor groove and oligomerization of the N-terminal region to form a filament. The outer diameter of 6.4 ± 0.4 nm shows that the complex is relatively slender. Furthermore, the contour length of the complex is close to that of naked DNA, which shows that the superhelical protein scaffold is deformed with respect to the structure observed for mutants lacking the C-terminal DNA-binding region. 24,28 A small fraction of H-NS is bound to the major groove. The latter fraction likely has a different biological function, as indicated by the effect of divalent salts on the DNA-compaction properties of H-NS.
With the addition of magnesium chloride, H-NS-mediated bridges between distal DNA segments are formed. Bridging has been proposed to occur by interfacing sites 1 or, more recently, sites 2. 24,29 Furthermore, the persistence length of the complex decreases and takes a value of around the one pertaining to bare DNA. [10][11][12] Bridging results in compaction and, eventually, a collapse of the DNA molecule into a condensed state. 13 Here, we observe a concurrent change in the structure of the nucleoprotein complex, that is the complex tightens and the fraction of H-NS bound to the major groove increases. The decrease in persistence length indicates that the chain of linked H-NS proteins is broken by disruption of either interfacial site 1 or 2. If the chain is broken at interfacial site 1, the N-terminal region may be folded into the major groove, facilitated by the flexible linker, and stabilised by hydrophobic interaction with the stacked base pairs. A small number of broken links explains the small fraction of major groove bound H-NS even in the absence of divalent salts. However, as shown by the DNA compaction properties, major groove bound H-NS is involved in bridging rather than the formation of a filament. A major groove bound H-NS monomer can form a bridging link through site 2 with a monomer bound to another DNA segment, resulting in short range inter-DNA segment attraction (a model is shown in Fig. 4D). Alternatively, the chain of linked H-NS proteins can be broken at site 2 and the bridge can be formed through interfacing sites 1 (Fig. 4E). The latter scenario is less likely, because a larger section of the protein (site 1, residues 1-46) needs to be accommodated next to the duplex to form the bridging link at the cost of reduced stabilising hydrophobic interaction with the stacked base pairs. Both models explain major groove binding, tightening of the complex, increase in bending flexibility, and propensity for bridging. 
Unfortunately, the exact structure of the H-NS mediated bridges cannot be determined based on the present low resolution diffraction data.
The two binding states identified here may correspond to two distinct biological functions. Coat-forming H-NS may be involved in gene silencing through the formation of a filament, whereas major groove bound H-NS might have a predominant function in genome organisation through protein-mediated DNA segment interaction.