Interplay between chromophore binding and domain assembly by the B12-dependent photoreceptor protein, CarH†

Organisms across the natural world respond to their environment through the action of photoreceptor proteins. The vitamin B12-dependent photoreceptor, CarH, is a bacterial transcriptional regulator that controls the biosynthesis of carotenoids to protect against photo-oxidative stress. The binding of B12 to CarH monomers in the dark results in the formation of a homo-tetramer that complexes with DNA; B12 photochemistry results in tetramer dissociation, releasing DNA for transcription. Although the details of the response of CarH to light are beginning to emerge, the biophysical mechanism of B12-binding in the dark and how this drives domain assembly is poorly understood. Here – using a combination of molecular dynamics simulations, native ion mobility mass spectrometry and time-resolved spectroscopy – we reveal a complex picture that varies depending on the availability of B12. When B12 is in excess, its binding drives structural changes in CarH monomers that result in the formation of head-to-tail dimers. The structural changes that accompany these steps mean that they are rate-limiting. The dimers then rapidly combine to form tetramers. Strikingly, when B12 is scarcer, as is likely in nature, tetramers with native-like structures can form without a B12 complement to each monomer, with only one apparently required per head-to-tail dimer. We thus show how a bulky chromophore such as B12 shapes protein/protein interactions and in turn function, and how a protein can adapt to a sub-optimal availability of resources. This nuanced picture should help guide the engineering of B12-dependent photoreceptors as light-activated tools for biomedical applications.


Supplementary Figures
Figure S1. a) Chemical structure of AdoCbl in the base-on conformation (with the lower axial coordination site of the central cobalt occupied by the 5,6-dimethylbenzimidazole (DMB) base). b) Close-up of the B12-binding site of TtCarH showing AdoCbl in its base-off/His-on conformation. The residues highlighted in yellow are those that contact the upper axial ligand of AdoCbl, Ado (cyan). The cobalt-coordinating His (H177) replacing the DMB base in the lower axial coordination site is highlighted in green. The domain colors are the same as in Figure 1a.
Figure S2. a) Representative structures of apo-TtCarH from MD simulations chosen using a single linkage clustering method (see Experimental section). The structures shown cover 95.3% of the total conformational space, with the most populated structure shown in color and the next four most populated clusters represented in grey to illustrate the conformational sampling observed during MD simulations. b) Root mean squared deviations (RMSDs) of the simulated apo-TtCarH monomer structure for each of three simulations run in parallel after alignment to the backbone atoms of the starting, photoconverted holo-TtCarH structure (PDB: 5C8F). Black traces: RMSD of the protein backbone atoms relative to the starting structure; blue traces: RMSD of the protein backbone atoms relative to the average structure across all three simulations. These data suggest that the system reaches equilibration in less than 150 ns in each simulation.
Figure S3. a) … (i.e., with the upper axial Ado missing) and apo-TtCarH (purple, simulated). Arrows indicate displacement of the 4-helix bundle in photo-converted holo-TtCarH and apo-TtCarH relative to holo-TtCarH. b) RMSD of the protein backbone atoms of the apo-TtCarH B12-binding domain (black) and 4-helix bundle (blue) relative to the holo-TtCarH structure after alignment to the B12-binding domain. The average of the blue RMSD traces gives a displacement of the apo-TtCarH 4-helix bundle of 8.14 ± 1.33 Å relative to the holo-TtCarH structure. For comparison, the RMSD of the 4-helix bundle in the photo-converted holo-TtCarH structure relative to holo-TtCarH after alignment to the B12-binding domain is 9.7 Å, and the RMSD for the photo-converted holo-TtCarH structure relative to apo-TtCarH is 3.7 ± 1.2 Å. The 4-helix bundle displacement for both apo-TtCarH and photo-converted holo-TtCarH is therefore of a similar magnitude, but they end up in slightly different positions. c) Overlay of monomers from holo-TtCarH and apo-TtCarH, aligned to the 4-helix bundle instead of the B12-binding domain as in Figures 1b-d and S3a. Salt-bridge residues identified here (D178 & R149) and previously [1] (D201 & R176) are highlighted for each monomer in orange and purple, respectively. In combination with Figure 1d, this image serves to illustrate that, whichever domain the structures are aligned to, the residues in the apo-TtCarH monomer are no longer in a position relative to the dimer interface (approximately indicated by the dashed line) to form stabilising salt-bridges. This shifting away from the dimer interface between the holo-TtCarH and apo-TtCarH structures is highlighted by the curved arrow.
Figure S4. The collision cross section (TW CCSN2, TW ΩN2) distributions from ion mobility data of the apo-monomer species (a) and of all tetramer species (b-g) observed at charge states 23+ to 26+ for WT TtCarH:AdoCbl ratios of 1:0.15, 1:0.25, 1:0.5, 1:0.75, 1:1 and 1:2, respectively. All measurements were performed in triplicate. a) The apo-monomer distribution remains unchanged with increasing proportions of AdoCbl. The charge states ranging from 10+ to 12+ all sit within a very similar range, whereas the TW CCSN2 distribution of 13+ appears to indicate a slightly more unfolded state. b-g) Three different tetramer species are evident – AdoCbl2-TtCarH4, AdoCbl3-TtCarH4 and AdoCbl4-TtCarH4 – with relative populations that vary across the range of ratios. Each tetramer returns the same TW CCSN2 distribution within error, however, regardless of the number of AdoCbl bound. This suggests that all tetramer structures are highly similar.
Figure S5. TW CCSN2 (TW ΩN2) distributions for all charge states of each tetrameric species (AdoCbl4-TtCarH4 – red; AdoCbl3-TtCarH4 – yellow; AdoCbl2-TtCarH4 – blue) at the different WT TtCarH:AdoCbl ratios (indicated on the left). For each species at each ratio, the global TW CCSN2 is shown as a colored line with the different charge states displayed beneath in gradient shades. Data for each charge state have been corrected for their m/z peak height and peak area and were then summed together to give the global TW CCSN2. There is little change in the distribution of AdoCbl4-TtCarH4 with increasing AdoCbl content, implying that the global 3D structure remains similar throughout. There is slightly greater fluctuation between charge states for AdoCbl3-TtCarH4 and AdoCbl2-TtCarH4. This could be because these forms adopt slightly less rigid structures than AdoCbl4-TtCarH4, but the reduction in signal intensity that these sub-populations undergo with increasing AdoCbl might also contribute.
Figure S6. The relative peak area as a function of WT TtCarH:AdoCbl ratio for the low amplitude mass spectral signals between 3500–4500 m/z in Figure 2a that correspond to WT TtCarH dimer species. For the sake of simplicity, these peaks are not highlighted on the mass spectrum in Figure 2a with colored, dashed lines as for the tetramer species. Three dimer species are observed – apo-TtCarH2, AdoCbl1-TtCarH2 and AdoCbl2-TtCarH2 – the populations of which vary across the range of ratios. At a ratio of 1:0.15, there is only apo-TtCarH2 present. When compared to the equivalent plots for the tetramers (Figure 2a), these data suggest that any AdoCbl1-TtCarH2 and AdoCbl2-TtCarH2 that form at this ratio rapidly combine to form tetrameric species. This trend appears to continue. Although increasing populations of AdoCbl1-TtCarH2 and AdoCbl2-TtCarH2 are evident at higher AdoCbl concentrations, they always have a lower relative population than the equivalent tetramers (AdoCbl2-TtCarH4, AdoCbl3-TtCarH4 and AdoCbl4-TtCarH4) at the same ratios.
Figure S7. An expansion of the region of the mass spectrum that corresponds to the TtCarH monomer species. Data for a sample of apo-TtCarH (bottom) are compared to data acquired when AdoCbl is in excess (top, i.e., a WT TtCarH:AdoCbl ratio of 1:2), where additional, low amplitude signals (highlighted) are evident. The mass difference is equivalent to AdoCbl (~1550 Da) and these signals therefore correspond to AdoCbl1-TtCarH1. There is no evidence of this signal at WT TtCarH:AdoCbl ratios less than 1:2.
… Figure 3a with the dashed lines). Much like for AdoCbl (Figure S6), three dimer species are observed – apo-TtCarH2, MeCbl1-TtCarH2 and MeCbl2-TtCarH2 – and again their populations vary across the range of ratios. The lowest MeCbl concentration is also predominantly apo-TtCarH2 but, unlike for AdoCbl, small but significant populations of the other dimer species are clearly evident. Indeed, relative to apo-TtCarH2, MeCbl1-TtCarH2 and MeCbl2-TtCarH2 become far more prominent with increasing concentration of MeCbl. This is almost certainly because, unlike those binding AdoCbl, the dimers here do not combine to form tetramers and therefore their populations are not depleted in the same way. Nevertheless, all dimer species remain very much minor subpopulations in a spectrum dominated by monomer species (Figures 3a&b).
Figure S10. The relative peak area as a function of G192Q:AdoCbl ratio for the mass spectral signals in Figure 3c that correspond to monomer (a) (color matched in Figure 3c with the dashed lines) and tetramer species (b) (not highlighted in Figure 3c, for the sake of simplicity, but the low amplitude signals are visible between 5500–6500 m/z). a) Two monomer species are observed – apo-G192Q1 and AdoCbl1-G192Q1 – the populations of which vary across the range of ratios. Like for WT TtCarH (Figure S6), apo-G192Q1 is by far the most significant monomer species across all ratios. This is likely to reflect the dominant dimer species (Figures 3c&d) being more stable and that most AdoCbl1-G192Q1 that does form combines to form AdoCbl1-G192Q2 or AdoCbl2-G192Q2. Unlike WT TtCarH, however, where there is only the slightest hint of AdoCbl1-TtCarH1 when the AdoCbl is in excess (Figure S7), AdoCbl1-G192Q1 is evident even at the lowest AdoCbl concentrations. The dominant G192Q dimers are therefore not as stable as the corresponding dominant tetramer for WT TtCarH, which draws the position of equilibrium further away from the holo-monomer. b) Although published SEC data suggest G192Q does not form tetramers, [1] a small sub-population is clearly evident in Figure 3c at ratios >1:1. Unlike when AdoCbl is bound to WT TtCarH, there is only one tetramer population, AdoCbl4-G192Q4, which only appears at AdoCbl concentrations where AdoCbl2-G192Q2 becomes the predominant species (Figure 3d). The G → Q substitution is thought to sterically hinder the native association between pairs of head-to-tail dimers, which means these tetramers could be the result of non-specific association between AdoCbl2-G192Q2 dimers. This is supported by the fact that, for the WT, tetramers are observed even at low AdoCbl concentrations. The TW CCSN2 distributions for AdoCbl4-TtCarH4 and AdoCbl4-G192Q4 are compared in Figure S11b.
Figure S11. TW CCSN2 (TW ΩN2) distributions from ion mobility data of AdoCbl2-G192Q2 (a) and AdoCbl4-G192Q4 (b) for a G192Q:AdoCbl ratio of 1:1. For each species, the global TW CCSN2 is shown as a colored line with the different charge states displayed beneath in gradient shades. Data for each charge state have been corrected for their m/z peak height and peak area and were then summed together to give the global TW CCSN2. a) The 15+ and 16+ states for AdoCbl2-G192Q2 each give similar TW CCSN2. The minor 17+ state is slightly broader and hence has more conformational flexibility. b) The AdoCbl4-G192Q4 distribution follows a very similar pattern. Although AdoCbl4-G192Q4 adopts a slightly smaller global TW CCSN2 than AdoCbl4-TtCarH4 (Figure 2c), perhaps indicative of non-specific association, this marginal (1.6%) difference is within instrumental error.
… One would therefore expect some extent of quenching by AdoCbl of tryptophans within ~ 40 Å (see Table S1).
Table S1. Distances between the Cα in each of the 20 tryptophans (W) in the entire holo-TtCarH tetramer (i.e., 5 in each monomer, Figure S12a) and the Co ion of AdoCbl in monomer A. Distances were measured using the published [1] crystal structure (PDB: 5C8D) and each monomer of the tetramer is given a letter designation A-D, with A/B and C/D being the two head-to-tail dimers. There will, of course, be equivalent and corresponding distances between each W and the AdoCbl in monomers B-D. From Figure S12, we can estimate that fluorescence from W within ~ 40 Å is likely to be quenched with non-negligible efficiency by AdoCbl (highlighted in green below). This includes W from each monomer, which means the fluorescence quenching experiments conducted by stopped-flow spectroscopy (Figure S13) are highly likely to be sensitive to both AdoCbl-binding and the subsequent tetramer domain assembly steps.
… The amplitudes for the faster kinetic phase (black) are the same within error for each variant, indicating that they represent an equivalent process (i.e., binding of AdoCbl to protein monomers). The amplitudes for the slower kinetic phase (red) are consistently smaller for G192Q than for WT TtCarH. Because this phase is linearly dependent on [protein] (Figures 4b&d) for both variants, it corresponds to quenching from protein domain assembly. The smaller amplitude for G192Q therefore indicates that there is less quenching than for WT TtCarH, consistent with domain assembly predominantly stopping at the dimer rather than going on to form a significant proportion of tetramers.
… In contrast to the stopped-flow data, panels (a-c) show no difference in the fluorescence quenching between the two variants. d) Emission spectra of WT TtCarH before (black) and after (green) exposure to a 530 nm LED for 5 s. The photo-converted monomer has a larger fluorescence magnitude than the 'dark' state tetramer. Inset: long time-base fluorescence stopped-flow traces following rapid mixing of WT TtCarH (solid black line) and G192Q (solid red line) with AdoCbl. In each case, the data were fit to the sum of three exponentials (two negative phases and one positive phase) and extrapolated to longer times (dashed lines) to illustrate convergence. Although the quenching amplitudes in panels (a-c) are ostensibly the same, the data in (d) reveal this to be an artefact of the measurement. In each case the static spectra were acquired from the same protein sample with sequential additions of AdoCbl, with each TtCarH sample therefore exposed to the excitation light for several minutes. This light was also absorbed by bound AdoCbl over the same period, which thus slowly and irreversibly activated the TtCarH photoreceptor. The static fluorescence spectra for each variant following the titration were the same and dominated by the photoconverted protein.
Figure S17. Fluorescence spectra of 0.5 µM (black) and 5 µM (red) apo-TtCarH for the WT (a) and G192Q variants (b). To normalise, the 0.5 µM spectra have been multiplied by 10. For both variants, the normalised fluorescence amplitude for the 5 µM protein sample is lower than for the 0.5 µM sample, presumably owing to the inner filter effect.

Molecular Dynamics Simulations
Molecular dynamics (MD) simulations of apo-TtCarH were performed using the crystal structure [1] of photoconverted TtCarH (PDB: 5C8F) as a starting point, using the Gromacs package [4] with the Amber03 force field, [5] a solvation box extending a minimum of 10 Å around the protein and periodic boundary conditions. Three MD simulations were run in parallel after energy minimization using the following protocol: the system was initially thermalized to 300 K for 100 ps using the NVT ensemble (constant volume), and the pressure was then equilibrated for 100 ps using the NPT ensemble (constant pressure) with harmonic constraints of force constant 10 kJ mol⁻¹ Å⁻² applied to the protein; constraints and pressure couplings were then switched off and the system relaxed using NPT at 250, 280, 290 and 300 K for 1 ns each. Finally, 400 ns of NPT dynamics were run at 300 K. Representative structures (Figure S2a) were chosen using single linkage clustering on the whole protein (not including hydrogen atoms) for all three simulations together, with a cut-off of 0.17 nm (the average RMSD between any two structures being 0.35 nm). This placed 95.3% of structures in the top 5 clusters, which have the following populations: 78.7%, 8.89%, 4.56%, 2.38% and 0.833%.
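The domain-wise RMSDs quoted in Figure S3b (alignment on the B12-binding domain, RMSD evaluated on the 4-helix bundle) can be illustrated with a short sketch. The Python code below is not the analysis code used for this work. It is a minimal NumPy implementation of a least-squares (Kabsch) superposition, assuming that both structures are supplied as (N, 3) coordinate arrays with identical atom ordering. The function names and the index arrays `fit_idx` and `rmsd_idx` are hypothetical; the atoms defining each domain are not specified here and would need to be chosen by the user.

```python
import numpy as np


def superpose(mobile, reference, fit_idx):
    """Rigidly superpose `mobile` onto `reference` using only atoms in `fit_idx`.

    Both inputs are (N, 3) arrays with the same atom order (an assumption of
    this sketch). Returns the transformed copy of all `mobile` coordinates.
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    fit_mobile = mobile[fit_idx]
    fit_reference = reference[fit_idx]

    # Centre both fitting subsets on their centroids.
    centre_mobile = fit_mobile.mean(axis=0)
    centre_reference = fit_reference.mean(axis=0)

    # Kabsch algorithm: SVD of the covariance matrix of the centred subsets.
    covariance = (fit_mobile - centre_mobile).T @ (fit_reference - centre_reference)
    u, _, vt = np.linalg.svd(covariance)

    # Guard against an improper rotation (reflection).
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T

    # Apply the same rigid-body transform to every atom of the mobile structure.
    return (mobile - centre_mobile) @ rotation.T + centre_reference


def rmsd(coords_a, coords_b, rmsd_idx):
    """Root mean squared deviation over the atoms in `rmsd_idx` (no refitting)."""
    diff = np.asarray(coords_a, dtype=float)[rmsd_idx] - np.asarray(coords_b, dtype=float)[rmsd_idx]
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))
```

In use, one would call `superpose` with the indices of the domain used for alignment and then `rmsd` with the indices of the domain whose displacement is of interest, repeating this for each trajectory frame to obtain a trace like those in Figure S3b.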

Materials and Protein Production
Unless otherwise stated, all commercial reagents were obtained from Sigma-Aldrich and used without further purification. Plasmids containing the genes encoding WT and G192Q variants of TtCarH were kindly provided by S. Padmanabhan and Montserrat Elías-Arnanz. The genes had been cloned into modified pET15b (Novagen) expression vectors using NdeI and BamHI restriction enzymes, to provide an N-terminal 6xHis affinity tag as previously described. [6] The following overexpression and purification protocols were used for both WT and G192Q variants of TtCarH. The plasmids were transformed into E. coli BL21 (DE3) cells (Novagen), and a single colony was inoculated into a small volume of selective (AmpR) LB medium. The cell cultures were grown to an OD600 ~ 0.8 at 37 °C and 200 rpm, at which time they were inoculated into a larger volume of fresh LB. The fresh cell cultures were grown to an OD600 ~ 0.7 at 37 °C and 200 rpm, at which time they were cooled down to the induction temperature of 25 °C. Protein expression was then induced with 0.5 mM IPTG and the cells were left to incubate overnight at 25 °C and 200 rpm. Cells were then harvested by centrifugation at 5,000 rpm and 4 °C for 20 min, collected, rapidly frozen in liquid nitrogen, and stored at -80 °C. For purification, cells were thawed and resuspended in buffer A (50 mM sodium phosphate + 300 mM NaCl, pH 7.5) supplemented with 2.5 mM imidazole, lysozyme, DNase, MgCl2, and protease inhibitors, and lysed by cell disruption. Cell debris was removed by centrifugation at 20,000 rpm and 4 °C for 1 h. The collected supernatant was filtered through a 0.22 µm membrane and incubated with 5 mL TALON metal affinity resin (Clontech) for 2 h at 4 °C with rolling. Protein-bound resin was washed with buffer A supplemented with 5 mM (wash 1) and 10 mM imidazole (wash 2). Protein was eluted with buffer A supplemented with 150 mM imidazole following 1 h of incubation at 4 °C with rolling.
Fractions containing purified protein were further purified by size-exclusion chromatography using a Superdex200 high-performance liquid chromatography column (Cytiva), equilibrated with 50 mM phosphate buffer + 150 mM NaCl, pH 7.5. The purified sample was brought to the desired concentration using 10K molecular weight cut-off Vivaspin centrifugal filter devices (Sartorius).

Native Mass Spectrometry and Ion Mobility Mass Spectrometry CCS Measurements
Mass spectrometry experiments were performed on a modified traveling-wave ion mobility enabled Synapt G2-S (Waters), described previously. [7] The n-ESI tips were pulled in-house from borosilicate capillaries (outer diameter 1.2 mm, inner diameter 0.69 mm, length 10 cm, Science Products GmbH) using a laser-based P-2000 micropipette puller (Sutter Instrument Company). A positive voltage was applied to the solution via a platinum wire (Goodfellow Cambridge Ltd). Data were analysed using MassLynx v4.1 (Waters Corporation), OriginPro 9.1 (OriginLab Corporation), and Microsoft Excel 2010 (Microsoft). Native mass spectra were recorded on a modified Synapt G2-S employing gentle source conditions with a capillary voltage between 1.1-1.4 kV, cone voltage of 10 V and all radio frequencies set to zero. TW CCSN2 measurements were performed on the same modified mass spectrometer following the standard calibration procedure utilising the Mason-Schamp equation. [8][9][10] Measurements were made in nitrogen and spraying conditions were again kept as gentle as possible with an applied capillary voltage range of 1.1 to 1.4 kV, cone voltage maintained at 10 V and a source temperature of 333.13 K.

Mass spectrometry titration experiments
WT TtCarH and G192Q variants were buffer exchanged into 250 mM ammonium acetate, pH 6.8, using Slide-A-Lyzer Dialysis Cassettes (ThermoFisher). The concentration of this new stock solution was then checked by absorption at 280 nm using a DS-11 Spectrophotometer (Denovix). The AdoCbl and MeCbl were diluted to the desired concentration in ultrapure water obtained from a Milli-Q Advantage A10 ultrapure water filtration system (Merck Millipore, Darmstadt, Germany). Data were acquired for a range of TtCarH:B12 ratios, from excess protein to excess B12. Mixtures at each ratio were made up under red light to avoid photoactivation, vortexed and inserted into the nESI tip, which was then covered and kept in the dark during acquisition. The final monomeric concentration for the mass spectrometry was 12 µM for the apo-protein, with an overall concentration of 10 µM for the titration mixtures. Fresh titration mix was prepared for all experimental MS runs and all experiments were performed in triplicate.
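As an aid to reading the native mass spectra, the spacing expected between species that differ by one bound AdoCbl can be estimated from the charge state alone. The sketch below assumes positive-mode ions that carry their charge as z protons, so that m/z = M/z + m(proton) and the shift on adding a ligand of mass ΔM at fixed z is ΔM/z. It uses the approximate AdoCbl mass of ~1550 Da quoted in the Figure S7 caption; the function names are illustrative only.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in Da
ADOCBL_MASS_DA = 1550.0    # approximate value quoted in the Figure S7 caption


def mz(neutral_mass_da, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral species."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return neutral_mass_da / charge + PROTON_MASS_DA


def ligand_mz_shift(ligand_mass_da, charge):
    """Shift in m/z on binding one ligand when the charge state is unchanged."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return ligand_mass_da / charge


if __name__ == "__main__":
    # Tetramer charge states reported in Figure S4.
    for z in (23, 24, 25, 26):
        shift = ligand_mz_shift(ADOCBL_MASS_DA, z)
        print(f"{z}+: one AdoCbl shifts the peak by {shift:.1f} m/z units")
```

Under these assumptions, tetramers that differ by a single AdoCbl are separated by roughly 60 to 67 m/z units across the 23+ to 26+ charge states, with a smaller spacing at higher charge.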

Fluorescence stopped-flow measurements
Fluorescence stopped-flow measurements were carried out in a SX20 stopped-flow spectrometer (Applied Photophysics), with Pro-Data SX and Pro-Data viewer operating software. All samples were prepared in 150 mM NaCl, 50 mM phosphate, pH 7.5. Protein samples at each working concentration (see Figures 4 and 5) were prepared under white light, whereas the cobalamin samples were prepared under red light and stored in black vials due to their photosensitivity. Stopped-flow measurements were performed in a 20 µL quartz cell at ~ 25 °C, using an excitation wavelength of 280 nm, 2 mm monochromator slit width (entrance and exit), a 320 nm long-pass filter, and 400 V in the fluorescence channel. All measurements were performed under red light to prevent unwanted photo-activation.
After loading into the drive syringes, all samples were thermally equilibrated with the water bath for ~ 10 minutes. Fluorescence emission data were acquired following rapid mixing of protein and B12, with 1000 data points acquired over the acquisition period. The first three shots per syringe fill were discarded to account for the dead volume, and the kinetic data acquisition was repeated for each of the following 5 shots. Each repeat trace was fitted from 0.1 ms to 30 s to an appropriate sum of exponentials using the fitting tool in OriginPro 2020 (OriginLab Corporation). Each fit yielded decay constants that were converted to pseudo first-order rate constants; these were then averaged, and standard deviations and standard errors calculated. The concentration dependence of each rate constant was assessed by plotting the averaged values as a function of both cobalamin and TtCarH concentrations. Second-order rate constants were derived from the gradients of linear fits (Figures 4, 5 and S18, and Table S3).
Table S3. Second-order rate constants (slope) and dissociation coefficients (y-intercept) for both B12 binding to WT and G192Q TtCarH variants and for their domain assembly. Fitting was conducted in two ways. First, for the [AdoCbl]-dependences, the variation of the pseudo first-order rate constant was fit linearly at each concentration of the two TtCarH variants. Similarly, for the [protein]-dependences, data were fit for each [AdoCbl]. For these fits, the parameters were averaged and errors calculated (highlighted yellow; c.f., Figure S18). Second, concatenated fits were performed (c.f., Figures 4&5), where a single linear fit was made for the [AdoCbl]-dependence to all of the data from every [protein] (separately for WT and G192Q). The same was done for the [protein]-dependences for all [AdoCbl]. This was conducted either including (highlighted orange) or excluding (highlighted grey) the non-pseudo first-order data.
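The conversion of fitted decay constants into averaged pseudo first-order rate constants, and the linear fits used to obtain second-order rate constants, can be sketched as follows. This is a minimal illustration rather than the OriginPro workflow used here. It assumes that each exponential fit returns a time constant τ in seconds, so that k_obs = 1/τ, and that the observed rate constant depends linearly on concentration (slope = second-order rate constant, y-intercept = dissociation coefficient), as in Table S3. The function names are hypothetical and any numbers passed to them in testing would be synthetic, not data from this work.

```python
import numpy as np


def summarise_shots(time_constants_s):
    """Convert per-shot time constants (s) to rate constants and summarise them.

    Assumes k_obs = 1 / tau for each repeat shot. Returns the mean rate
    constant, the sample standard deviation and the standard error of the mean.
    """
    k_obs = 1.0 / np.asarray(time_constants_s, dtype=float)
    mean = float(k_obs.mean())
    std_dev = float(k_obs.std(ddof=1))           # sample standard deviation
    std_err = float(std_dev / np.sqrt(k_obs.size))
    return mean, std_dev, std_err


def second_order_fit(concentration_uM, k_obs_per_s):
    """Straight-line fit of k_obs against concentration.

    Returns (slope, intercept): the slope is the second-order rate constant
    (µM^-1 s^-1 for concentrations in µM) and the intercept (s^-1) is the
    dissociation coefficient.
    """
    slope, intercept = np.polyfit(
        np.asarray(concentration_uM, dtype=float),
        np.asarray(k_obs_per_s, dtype=float),
        1,
    )
    return float(slope), float(intercept)
```

For the concatenated fits described for Table S3, the same `second_order_fit` call would simply be given the pooled concentrations and rate constants from every protein (or AdoCbl) concentration, with or without the points that do not satisfy pseudo first-order conditions.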

Static fluorescence measurements
Static fluorescence measurements were carried out in a FLS920 spectrofluorometer (Edinburgh Instruments), with the F980 spectrometer operating software. Samples were diluted to ~ 0.1 OD at the wavelength of interest with the same buffer used in the stopped-flow measurements. Quartz cuvettes with a fully masked standard exterior size, 10 mm emission aperture window pathlength and 2 mm sample chamber were used. For the photoconversion experiment, a holo-TtCarH sample was illuminated using a high-power light-emitting diode (LED; Thorlabs) with λmax = 530 nm.