The circularly permuted globin domain of androglobin exhibits atypical heme stabilization and nitric oxide interaction

In the decade since the discovery of androglobin, a multi-domain hemoglobin of metazoans associated with ciliogenesis and spermatogenesis, there has been little advance in the knowledge of the biochemical and structural properties of this unusual member of the hemoglobin superfamily. Using a method for aligning remote homologues, coupled with molecular modelling and molecular dynamics, we have identified a novel structural alignment to other hemoglobins. This has led to the first stable recombinant expression and characterization of the circularly permuted globin domain. Exceptional for eukaryotic globins is that a tyrosine takes the place of the highly conserved phenylalanine in the CD1 position, a critical point in stabilizing the heme. A disulfide bond, similar to that found in neuroglobin, forms a closed loop around the heme pocket, taking the place of androglobin's missing CD loop and further supporting the heme pocket structure. Highly unusual in the globin superfamily is that the heme iron binds nitric oxide as a five-coordinate complex similar to other heme proteins that have nitric oxide storage functions. With rapid autoxidation and high nitrite reductase activity, the globin appears to be more tailored toward nitric oxide homeostasis or buffering. The use of our multi-template profile alignment method to yield the first biochemical characterisation of the circularly permuted globin domain of androglobin expands our knowledge of the fundamental functioning of this elusive protein and provides a pathway to better define the link between the biochemical traits of androglobin with proposed physiological functions.


Introduction
The hemoglobin (Hb) of the erythrocyte is one of the most studied proteins in science. However, a decade after the discovery of androglobin (Adgb), a multi-domain Hb first identified in the testis of metazoans, there is still very little known about this novel member of the Hb superfamily. Adgb is unusual due to its long length of 1667 amino acids (human), with a centrally positioned globin domain. 1 This compares to the 140-190 amino acids of other human globins such as the erythrocyte Hb, 2 myoglobin (Mb), neuroglobin (Ngb) 3 and cytoglobin (Cygb) 4 and is significantly larger than other multi-domain globins such as flavohemoglobin. 5 Initial sequence alignment analysis predicted the heme-binding globin domain of Adgb (Adgb-GD) to consist of an eight alpha-helical structure (helices termed A to H) with a 3-on-3 alpha-helical fold that encloses the heme moiety and is typical of most non-truncated globin architectures. 1 However, highly unusual in the globin family is that Adgb-GD is circularly permuted with a calmodulin binding domain situated between the H and A helices. 1 The N-terminal region of Adgb is reported to contain a calpain-7-like region and the C-terminal region is reported to contain sequences for a coiled-coil region, a nuclear localization signal (NLS) and an endoplasmic reticulum (ER) membrane retention signal. 1 Calpain C2-like domains flanking the globin domain are predicted from studies of calpain homologues. 6
The potential interaction of Adgb with nitric oxide (NO) and calcium may underlie its relevance to spermatogenesis and ciliogenesis, as well as to potential diseases. A knockdown study showed enhanced apoptosis and inhibited proliferation in glioma cell lines, related to changes in the levels of several proteins involved in cell proliferation, survival or apoptosis, including STAT3, cleaved caspase-3 and Bcl-2. 7 Additionally, STXBP5 antisense proficient GR pancreatic cancer cell lines overexpress Adgb through ADGB promoter methylation, leading to drug resistance and inhibition of cell apoptosis. Recently, mRNA-Seq data from mammalian tissue have shown that Adgb is expressed in the female reproductive tract, lungs, and brain. In each case, Adgb is specifically associated with cell types forming motile cilia, with expression linked to transcription factors involved in ciliogenesis. 8 An absence of Adgb leads to defective sperm head and flagellum formation during spermatogenesis. 9
There remains a lack of data concerning the structure, properties or functions of this protein. This lack of knowledge is due to the difficulty in generating such a large full-length hemoglobin by recombinant techniques and the instabilities of the heme-binding, circularly permuted globin domain. 10,12,13 Our unique identification of the globin domain yielded a structure by homology modelling that was stable for an initial 500 ns of molecular dynamics simulations, giving support for the potential expression of this protein despite initial negative expression results. 10 Thus, based on our new alternative alignment, we have expressed the circularly permuted globin domain as a stable recombinant protein. An intramolecular disulfide bond linking the N and C terminal sections of the heme-binding globin domain appears to stabilize the 'CD loop' heme pocket region and likely the whole globular domain.
With a stable form of the protein expressed, we have characterized the protein using optical, electron paramagnetic resonance (EPR), stopped-flow and femtosecond laser flash photolysis spectroscopy to show that Adgb binds NO as a five-coordinate heme iron. 15,16 Furthermore, the protein exhibits high nitrite reductase activity, which is influenced by the redox state of the disulfide bond, and exhibits a high autoxidation rate similar to Ngb. Based on our findings we propose that the globin domain may serve as an NO storage or NO synthesis protein under hypoxic conditions. 19,20

Results and discussion
The alignment of the androglobin helices to known templates
Fig. 1 (left) shows the scores for various alternative alignments of the six key helices of Adgb-GD to their counterparts in alpha-Hb, beta-Hb, Cygb, Ngb, and Mb. Helix A (Adgb residues Val939-Glu951) is the first helix in alpha-Hb, beta-Hb, Cygb, Ngb and Mb, but in the naturally circularly permuted Adgb it follows helix H (Adgb residues Phe866-Ser877) and the IQ domain (residues Val891-Thr933). The main peak lies at 0 for alpha-Hb, beta-Hb, Cygb, Ngb and Mb, which corresponds to the alignment in Fig. 2; a small alternative peak lies at +4 for each globin except Mb, which corresponds to shifting the Adgb helix A four residues to the right. For the alpha-Hb-Adgb-GD alignment, there were 17 155 pairwise alignments (because there were 235 alpha-Hb sequences and 73 Adgb-GD sequences). When the Adgb sequences were moved between −25 and +25, the 0 alignment obtained 10 664 votes and the +4 alignment obtained 3664 votes; when the alpha-Hb sequences were moved between −25 and +25, the 0 alignment obtained 10 001 votes and the +4 alignment obtained 774 votes. In this case, the mean of 10 332 for alignment 0 is considerably higher than the next nearest mean of 2219 (alignment +4). A similar predominant score was obtained for all HA alignments, and so when the scores for all five alignments were multiplied together, Fig. 1 (right), there is an overwhelming preference for alignment 0, despite the low percentage identity for HA of 15.3%, 15.2%, 16.6%, 21.1%, and 15.1% for the Adgb-GD alignments to alpha-Hb, beta-Hb, Cygb, Ngb, and Mb respectively. The results for the other five helices are similar, with percentage identities ranging between 7.8% (HH of the Adgb-GD-Ngb alignment) and 26% (HF of the Adgb-Mb alignment), Table S1.† Based on these percentages, Adgb-GD is marginally more similar to Mb (mean helical percentage identity 17.8%) and less similar to beta-Hb (mean helical percentage identity 14.0%). The full alignment is given in Fig. 2 and the mean percentage identity for the pairwise alignments for each helix-helix profile alignment between androglobin and the other hemoglobins is given in Table S1.† The threefold Modeller structural alignment of both Adgb models (complex 1 and complex 2, see below) with the AlphaFold 2 model shows agreement in the sequence alignment except in two places, namely 828 PVPFHDKEL 834 (where helix F of the AlphaFold 2 model fills the space normally occupied by the heme) and the distal half of helix H, which terminates prematurely to accommodate the IQ domain ( 872 DLWLLN 877 is not helical).
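The voting scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes that the votes from the two shift directions are averaged per offset, that scaling to the 0 to 1 range means dividing by the largest count, and that the consensus is the product of the scaled votes across templates. The helper names and the counts for the second template are hypothetical; only the alpha-Hb helix A counts are taken from the text.

```python
from math import prod


def mean_votes(votes_adgb_shifted, votes_template_shifted):
    """Average the vote counts from the two shift directions, per offset."""
    offsets = votes_adgb_shifted.keys() & votes_template_shifted.keys()
    return {o: (votes_adgb_shifted[o] + votes_template_shifted[o]) / 2 for o in offsets}


def scale(votes):
    """Scale votes to the 0..1 range by dividing by the largest count (assumed scaling)."""
    top = max(votes.values())
    return {o: v / top for o, v in votes.items()}


def consensus(per_template):
    """Multiply the scaled votes for each offset across all templates."""
    offsets = set.intersection(*(set(v) for v in per_template.values()))
    return {o: prod(scaled[o] for scaled in per_template.values()) for o in offsets}


# Helix A versus alpha-Hb: vote counts quoted in the text (offsets 0 and +4 only).
alpha_hb = mean_votes({0: 10664, 4: 3664}, {0: 10001, 4: 774})

# Hypothetical second template, included only to show the multiplication step.
toy_template = {0: 9000.0, 4: 1500.0}

scores = consensus({"alpha-Hb": scale(alpha_hb), "toy": scale(toy_template)})
best_offset = max(scores, key=scores.get)
print(alpha_hb, scores, best_offset)
```

Multiplying the scaled votes penalises any offset that is weak in even one template, which is why a minor peak such as +4 is suppressed in the consensus.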
The list of Adgb sequences, greatly expanded since the original sequence analysis in 2012, together with the helix alignment analysis above, permits a re-assessment of the overall sequence identification of the globin domain of Adgb. From the previous assignment of the globin domain structure, 1 helices C to H are the first to be expressed on the N terminal side (His761 onwards). This is followed by C terminal helices A and B from Asp935 onwards. These sections are interconnected by a 33-amino acid section incorporating an IQ calmodulin binding domain (Fig. S1A †). This assignment took into account several key amino acids that are highly conserved throughout the hemoglobin superfamily. These comprised: (i) the proximal His connecting the heme iron to the protein, situated on the F helix (F8); (ii) the E7 distal heme iron ligand in Adgb, which is a Gln instead of the more common His residue; 1 (iii) the CD1 Phe residue playing a role in anchorage and binding of the heme group within the heme pocket. 21,22 While our sequence alignment agrees with most of the original alignment, the identities of the C helix region and the CD1 residue differ. With the Phe770 amino acid assignment as the CD1 residue in human Adgb, it becomes clear that in other species ∼22% of such alignments place a Cys in the CD1 position (Table S2 †). Indeed, the previous alignment of Adgb globin domains shows three (9% of sequences) as Cys residues at the CD1 position, 1 but this was interpreted as mismatched alignments. The adjusted sequence alignment, shown in Fig. S1B and C, † was used for the recombinant expression.
A Cys residue in the CD1 position is likely to have a profound impact on the stabilization of heme binding, as the substitution of the CD1 Phe by Cys removes an important contact with heme, leaving a gap at the surface of the heme pocket which could result in instability. Mutations in human hemoglobin of CD1 Phe almost invariably result in instabilities in the protein, resulting in Heinz body formation, cyanosis and severe hemolytic anemia. A natural variation with a Cys residue in the CD1 position was found in the beta Hb chain of a Caucasian male infant (Hb Little Venice, β42[CD1] Phe → Cys). 23 At 2 years of age, the infant showed severe chronic hemolytic anaemia, positive Heinz body formation, and haptoglobin depletion and required a monthly regular transfusion regime used more commonly for the cases of severe forms of thalassemia. 23
We conducted a reassessment of the C to D helix sequence assignment, disregarding the prerequisite of a Phe as the CD1 component of the structure, based on the globin domain helix alignments (Fig. 1 and 2). Our assessment places a different sequence as the C to D helix section, thus placing a Tyr residue (Tyr976) in the CD1 position of the human Adgb globin domain (Fig. S1B and S2 †). Although essentially unique in eukaryotes, the presence of a Tyr in the CD1 position of prokaryotic Hbs is common and does not significantly affect heme binding, but can affect oxygen binding affinity. 24 Sequence alignment comparison shows that the frequency of Tyr residues is ∼82% with the majority of the remaining sequences being Phe residues (Table S2 †). Therefore, we propose the assignment of CD1 to Tyr976 instead of Phe770, resulting in a circular permutation where helices D to H are expressed on the N-terminal side, followed by helices A to C following the IQ calmodulin binding domain sequence.
Fig. 1 The alignment of the androglobin heme-binding globin domain helices. The left-hand column gives the individual alignments for helices A-B and helices E-H, one per row (denoted HA, HB, etc.). The alpha-Hb (red), beta-Hb (green), Cygb (blue), Mb (yellow), and Ngb (cyan) alignments to Adgb are denoted by the bars representing the number of votes (scaled between 0 and 1). The right-hand column gives the consensus alignment where the votes for the individual alignments are multiplied together. The results point overwhelmingly to the 0 alignment given in Fig. 2. An alignment of +1 would correspond to the movement of the Adgb helix one residue to the right in Fig. 2; an alignment of, say, −3 would correspond to the movement of the Adgb helix three residues to the left in Fig. 2.
The previous sequence alignment did not allow a stable form of the globin domain to be generated recombinantly, 10 likely because an incomplete globin domain was expressed. However, this new approach to sequence alignment resulted in expression of a stable globin domain (vide infra).

The androglobin model structures
Adgb-GD complexes 1 and 2, shown in Fig. 3, align well with the five Hb structural templates. The root mean square deviations (RMSDs) over the common globin domain to Mb are 1.3 Å for complex 1, 1.1 Å for complex 2 and 2.3 Å for the AlphaFold 2 model, over the alpha helices, as determined using SSM. 25 The three models differ in the orientation of the IQ (calmodulin-binding) domain, which is not included in the RMSD calculations because it is absent in Mb and the other traditional Hbs. The IQ domain is connected by a very flexible loop, and this flexibility is probably important for binding calmodulin. When superposed onto the full Adgb AlphaFold structure, all three structures present the IQ domain in an accessible orientation. The AlphaFold 2 and Modeller structures differ in two other aspects. Firstly, helix H is truncated in the AlphaFold 2 model. Secondly, helix E moves slightly into the space occupied by the heme; this structural topology may bear some resemblance to purified Adgb before the heme group is added back (see ESI †). The globin domain contains four cysteine residues, none of which form a disulfide bond in the AlphaFold 2 model. Adgb-GD complexes 1 and 2 were investigated through molecular dynamics (MD) simulations to assess stability over the time course. Three 500 ns MD replicas were produced for each model in complex with the heme (Fig. 4). The emerging scenario suggests the overall higher stability of complex 1, as indicated by the root mean square fluctuation (RMSF) analysis (Fig. 4A and D) and the RMSD of the heme (Fig. 4B and D). In both structures, the IQ domain and the loop connecting it to the H helix were the most dynamic parts (ESI Videos 1 and 2 †); however, in complex 2 the whole structure underwent high thermal fluctuations. This affected the stability of the heme: it was less steady in complex 2 (Fig. 4B). In both systems, the distal Gln792 remained in the proximity of the heme, while the CD Tyr976 was more flexible during the simulations, especially in complex 2. Models are accessible via https://zenodo.org/records/10509366.

Optical properties, heme iron spin state and the intramolecular disulfide bond in the androglobin globin domain
SDS PAGE and western blot analysis show expression and purification of a soluble Adgb-GD protein at ∼26 kDa (Fig. S3 †), close to the predicted 27 kDa weight based on the sequence as shown in Fig. S1C.† This includes the His-tag and linker sequences and the alternative C-terminal C helix identified (vide supra). Reverse phase HPLC analysis of Adgb-GD shows the heme content used for extinction coefficient calculations as directly compared to a known concentration of Mb (Fig. S4 †). Size exclusion analysis shows that in a phosphate buffer the protein is primarily monomeric with some dimer (Fig. S5 †). In 150 mM NaCl buffer, this changed to mostly tetramer, likely as a result of the decrease of charge repulsion on the protein surface. All subsequent data were collected in the absence of salt unless otherwise stated. The predicted model of this domain (see ESI videos †) shows a large positive surface electrostatic potential for much of the protein. This may explain why SDS facilitates purification, i.e. it prevents aggregation with endogenous E. coli proteins through blockage of charged sites.
The optical properties of Adgb-GD, as expressed, are shown in Fig. 5 and the calculated extinction coefficient and peak wavelengths are in Table S3.† The ferric protein has bands in the visible region at 534 and 566 nm, suggesting a hexacoordinate state of the heme iron like that observed for Cygb and Ngb. 3,4 However, a small peak at ∼630 nm suggests that the protein also has some pentacoordinate-like properties. The coordination is pH dependent as shown in Fig. S6.† There is a single transition with pKa 8.6 that, from the optical spectra, is typical of a transition from penta- to hexa-coordination, with the loss of a 630 nm peak (typical of the water-bound pentacoordinate state) and the evolution of peaks at 566/534 nm. 26 Mb and Hb have similar pKa values (8.93 and 8.0 for Mb and Hb, respectively). 26 However, the change in coordination is opposite to that of Adgb, with Adgb showing hexa-coordination at more acidic pH values and penta-coordination at alkaline pH. Thus the change in coordination cannot be due to the H2O/OH− ligands observed in Mb and Hb and is more indicative of the coordination change observed with Cygb (pK 8.2) with His-Fe(III)-X (X = His for Cygb and Gln for Adgb) at acidic pH and His-Fe(III)-H2O at alkaline pH. 27 Reduction to deoxy ferrous iron shows two prominent peaks at 531 and 560 nm, indicative of a hexacoordinate heme iron configuration. The CO-bound spectrum is typical for most globins, but the NO-bound spectrum exhibits an unexpectedly hypsochromically (blue) shifted Soret peak at 395 nm (Fig. 5). Typical wavelength maxima for hexacoordinate ferrous-NO bound protein are observed at ∼420-430 nm in other NO-bound globins such as Mb and Cygb. 28,29 This suggests that the NO is bound in an unusual form in Adgb-GD. Optical changes were not observed with addition of NO (20 μM) to ferric Adgb-GD, indicating that NO affinity for the ferric heme iron is low.
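A single pH-dependent transition of this kind is commonly described by the Henderson-Hasselbalch relation. The sketch below is an illustration only, assuming a one-proton transition (Hill coefficient of 1) and the pKa of 8.6 quoted above; the function name is hypothetical and no fitting to the data in Fig. S6 is implied.

```python
def alkaline_fraction(ph, pka=8.6):
    """Fraction of protein in the alkaline form for an assumed single-proton transition.

    Henderson-Hasselbalch: fraction = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Population of the alkaline form across the pH range discussed in the text.
for ph in (6.0, 7.0, 8.0, 8.6, 10.0):
    print(f"pH {ph:4.1f}: alkaline form fraction = {alkaline_fraction(ph):.3f}")
```

Under this model the two forms are equally populated at pH 8.6, and the acidic form dominates at pH 6 to 7, the range used for the EPR samples.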
The EPR spectra of ferric heme Adgb-GD at a slightly acidic pH (Fig. 6A and B, pH 6) recorded at 10 K show a mixture of high spin (S = 5/2, HS) and low spin (S = 1/2, LS) signals corresponding to penta- and hexa-coordinated heme iron, respectively. The HS signal with the perpendicular g_x = g_y = 5.95 and the parallel g_z = 2.00 components is typical of other globins in a pentacoordinate conformation. The g = 2.95 and g = 2.26 EPR signals are the g_x and g_y components of a LS signal, and the third g_z component was not observed as it was too broad, which is typical for many globins, particularly for the signal-to-noise level. The LS signal with these g-values has been classified as a low spin form in ferric Hb, with one of the axial ligands likely to be a histidine nitrogen and the other not identified. 30 At pH 7, the HS signal at g = 5.95 is smaller than at pH 6, which is concurrent with a noticeable increase of the LS form (as per both g = 2.95 and g = 2.26 signals). This is in agreement with the total ferric heme concentration being conserved in the pH 6-pH 7 range. The intensity of the HS peak partially returns at pH 8, with a small decrease of the LS form as compared to pH 7. However, this was likely not due to an acid-alkaline transition between the HS and LS ferric heme forms as the pKa for such a transition is clearly reported in the liquid phase at 8.4 (Fig. S6 †). The intensity changes are likely an artifact of freezing, similar to that observed with the EPR spectra of Cygb. 27 This effect appears largely negated with the disulfide reduced protein, where there was no effect of pH on either the HS or the LS signals (Fig. S7 †). The reduced cysteine protein shows a change in the line shape of the perpendicular components area, at g ∼ 6. Not only does the g = 6 signal become much wider, but new effective g-values also become apparent at g ∼ 6.5 (or possibly lower/higher). A small amount of 'rhombic iron' can be observed as a g = 4.3 signal in both oxidized and reduced proteins as a typical small component in native and recombinant proteins. 31,33,34 With our reassignment of the C helix sequence, the CD section of Adgb-GD possesses a predicted disulfide bond between residues Cys787 and Cys978 (Fig. S1 †). Absent in the original sequence alignment, this disulfide bond could stabilize this crucial juncture of the heme pocket, now predicted as the N and C terminal sections of the domain. With four cysteine residues in the globin domain sequence, we determined how many of those are surface exposed and free to bind dithiodipyridine. As shown in Fig. 6C, there were two (1.9 ± 0.3) free sulfhydryls per heme detected with tris(2-carboxyethyl)phosphine (TCEP) reduced Adgb-GD, meaning that two of the four cysteines are surface exposed. In the protein as expressed in E. coli (without reduction by TCEP), no free cysteines were observed. Thus, as expressed, the globin domain possesses a single disulfide bond. From the predicted positions of the cysteines in complexes 1 and 2, only two are both surface exposed and close enough to form an intramolecular disulfide, namely Cys787 and Cys978 in the "CD loop" region of the protein; however, both this pair of (reduced) cysteine residues and the Cys894/Cys970 pair moved sufficiently close to form a disulfide bond in the MD simulation of the AlphaFold 2 structure after 1.4 μs (Fig. 4C), even though the sulfur atoms were originally 11.6 Å and 17.4 Å apart, respectively (Fig. 4D). Further studies will be required to confirm the position and the micro-environmental conditions favoring the formation of the disulfide bond within Adgb-GD.

Ferrous androglobin globin domain binds nitric oxide in a primarily pentacoordinate form
In order to further establish that the heme group is fully incorporated into a heme pocket, as in other heme proteins, and not merely adventitiously bound to the surface, we have studied the ligand binding properties of Adgb-GD. By using stopped-flow and femtosecond laser spectroscopy, we have shown that the heme behaves as expected for a fully incorporated heme.
Adgb-GD binds NO in the ferrous form to generate an optical spectrum as observed in Fig. 5, suggesting a five-coordinate NO heme iron. This is supported by EPR at 10 K (Fig. 6D). This spectrum showed a characteristic three-line hyperfine signal, essentially identical to that reported for cyt c′ from Shewanella frigidimarina and Alcaligenes xylosoxidans. 16 This is again consistent with five-coordinate NO binding to the ferrous heme. 16,35 The optical changes and kinetics of NO binding to deoxyferrous Adgb-GD, followed using stopped-flow spectroscopy, are shown in Fig. S8.† Increases at 390 nm are concurrent with decreases at 426 nm with a number of isosbestic points (e.g. 407 nm) consistent with a simple Fe²⁺ + NO → Fe²⁺-NO binding mechanism. However, the time course follows a double exponential function (Fig. S8C, E and F †), suggesting either a heterogeneous population or an intermediate that is distinct from the NO-bound or deoxyferrous species. A heterogeneous population is supported by a global fit to a 3-component parallel or serial mechanism showing essentially identical calculated component spectra: the starting deoxyferrous species, the end NO-bound species and a spectrum that is intermediate between the deoxyferrous and ferrous-NO species with clear isosbestic points (Fig. S8D †). This eliminates the possibility that the slow phase results from a displacement of a bound ligand by NO, as optically distinct intermediates would be expected. This suggests that a conformational change from a closed to an open form of the protein may be required for NO binding.
The rate constant for NO binding is NO concentration independent for both phases (Fig. S8E and F †), suggesting that there are two populations of the heme pocket, both having an occluded (closed) heme where a non-binding ligand sterically hinders the approach of NO. The rate limits observed are thus the rate constants of the conformational change in each population for the closed to open transition. The oxidation state of the cysteines makes no difference to the rate of NO binding, for either the fast or the slow phase. Therefore, any effect of the cysteine oxidation state on heme pocket dynamics does not hinder the entrance, binding and dissociation of NO to the heme iron. An estimate of the maximum rate of binding may be made from Fig. S8E and F, † at concentrations of NO less than 12 μM. The affinity of NO binding is measured to be ∼1 × 10⁻¹⁰ to 1 × 10⁻¹¹ M (Table S4 †) with k_on values ≥ 10⁷ and 10⁶ M⁻¹ s⁻¹ based on the lowest concentration of NO used and k_off calculated from ligand exchange with oxygen (1.6 × 10⁻⁴ s⁻¹, Fig. S9 †). This affinity is consistent with other globins such as Mb, Hb and Ngb where k_on is ∼10⁷ to 10⁸ M⁻¹ s⁻¹ and k_off is 10⁻³ to 10⁻⁵ s⁻¹. 36,37 CO binding exhibits a similar kinetic pattern to NO binding, albeit at slower rates (Fig. S10 †). At the concentrations used the CO binding was CO concentration independent, like that of NO. However, unlike NO binding, the reduction of the cysteines did show an effect on CO binding, with a decrease in the kinetics of CO binding of ∼3.5 fold (Fig. S10E and F †). The affinity of the CO binding is ∼10⁻⁷ to 10⁻⁸ M (Table S4 †) with fast and slow k_on values ≥ 3 × 10⁶ and 3 × 10⁵ M⁻¹ s⁻¹ respectively, based on the lowest concentration of CO used and k_off on ligand exchange with NO (0.31 s⁻¹, Fig. S11 †). This affinity is ∼6000 fold lower than NO binding, consistent with the 'sliding scale rule' developed by Olson and coworkers 38 and is consistent with other globins such as Mb and Ngb where k_on is ∼10⁷ to 10⁸ M⁻¹ s⁻¹ and k_off is 10⁻³ to 10⁻⁵ s⁻¹. 38,39 The five-coordinate NO-bound species found in this work is similar to those reported for cyt c′, with the NO on the proximal side of the heme. The kinetics of NO binding to deoxyferrous Adgb-GD (Fig. S8 †) is consistent with the NO binding mechanism proposed for cyt c′, that is, a single optical transition comprising two kinetic phases. 16 NO can only bind to the distal iron location of cyt c′ following a conformational change involving an occlusion in the heme pocket (by a Phe residue in the case of cyt c′). These two conformations of cyt c′ are optically indistinguishable. Subsequent to proximal histidine displacement, NO binding to the proximal side of the heme and NO dissociation from the distal side are rapid and not observed as separate events, but only as a single observed spectrokinetic event. This model may also be applied to the observations of NO binding to Adgb-GD. However, there is insufficient evidence to assign whether NO binds to the proximal or distal side, although a simple hypothesis is that it binds to the distal side. Furthermore, cyt c′ shows a mixture of penta- and hexacoordinate NO-bound species as observed by a split Soret peak at 395/415 nm. 40 As a weak band is consistently present in NO binding and nitrite reductase (NiR) experiments at ∼420-430 nm, it is likely that a small fraction of NO-bound Adgb-GD remains as a hexacoordinated species.
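The affinities quoted above for NO follow from the ratio of the dissociation and association rate constants, K_D = k_off / k_on. The sketch below reproduces that arithmetic with the values given in the text; it assumes a simple one-step binding equilibrium, and because the k_on values are lower limits the resulting K_D values should be read as upper limits. The function and variable names are hypothetical, and only the fast CO phase is included for the comparison with NO.

```python
def dissociation_constant(k_off, k_on):
    """Equilibrium dissociation constant K_D = k_off / k_on, in M (one-step binding assumed)."""
    return k_off / k_on


# NO: k_off from ligand exchange with oxygen, k_on lower limits for the two phases.
K_OFF_NO = 1.6e-4                                  # s^-1
kd_no_fast = dissociation_constant(K_OFF_NO, 1e7)  # fast phase, M
kd_no_slow = dissociation_constant(K_OFF_NO, 1e6)  # slow phase, M

# CO, fast phase only: k_off from ligand exchange with NO.
K_OFF_CO = 0.31                                    # s^-1
kd_co_fast = dissociation_constant(K_OFF_CO, 3e6)  # M

# How much weaker CO binding is than NO binding for the fast phase.
co_vs_no = kd_co_fast / kd_no_fast

print(f"K_D(NO) fast: {kd_no_fast:.1e} M, slow: {kd_no_slow:.1e} M")
print(f"K_D(CO) fast: {kd_co_fast:.1e} M, CO/NO ratio: {co_vs_no:.0f}")
```

The NO values fall in the 10⁻¹⁰ to 10⁻¹¹ M range, and the fast-phase CO/NO ratio is of the order of the ∼6000-fold difference stated in the text.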
In addition to cyt c′, Hb has also been reported to bind NO in a pentacoordinate form, but only in a T state at low NO to Hb ratios and only with the alpha chain. 33,41,42 The transient absorption spectra observed following dissociation of the Adgb-GD : NO complex resulting from a short (femtosecond) light pulse are shown in Fig. 7A. The spectra are characterized by a broad bleaching around 390 nm due to the disappearance of the 5-coordinate NO-bound state and a relatively strong induced absorption centered at 427 nm assigned to the 4-coordinate NO-dissociated state. 43,44 This supports the EPR data (Fig. 6D) showing that the NO-bound state of the ferrous Adgb is pentacoordinate. Apart from small relaxation signals with a time constant of ∼1.5 ps, corresponding to a blue shift of the induced absorption band (Fig. 7A inset) and assigned to vibrational cooling, the spectral evolution is characterized by a decay (associated spectra in the inset of Fig. 7A) dominated by a 5.3 ps phase (1.9 × 10¹¹ s⁻¹) and a minor (∼14%) 20 ps phase (5.0 × 10¹⁰ s⁻¹) (Fig. 7B). The remaining spectrum after these phases corresponds to only ∼1% of the photo-dissociated NO, meaning that NO rebinding is almost completely geminate. This implies that dissociated NO stays within the confines of the heme pocket and only minor quantities of NO escape the heme pocket. Rebinding of NO to the heme iron from bulk solution outside the heme pocket would be expected to occur on the μs to ms timescale, as observed with Mb and Cygb for NO or other gases such as CO. 29,45,46,48,49 In Adgb, however, a slower, 20 ps phase of NO rebinding is also present. This finding suggests a relaxation process competing with initial NO rebinding, allowing NO to explore a larger conformational space (rototranslational freedom) and indicating a less constrained heme pocket. Such multiphasic NO rebinding dynamics, reflecting a more accessible heme pocket, are also observed in other globins, which all form 6-coordinate NO complexes (Table S5 †).
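The rate constants quoted in parentheses for the two rebinding phases are the reciprocals of the time constants. The sketch below shows that conversion and a biexponential model of the fraction of NO remaining dissociated after the pulse. It is an illustration under stated assumptions: the ∼14% and ∼1% figures are treated as fractions of the total photo-dissociated NO, the fast phase is assigned the remainder, and the ∼1.5 ps cooling signal is ignored. All names are hypothetical.

```python
import math

TAU_FAST = 5.3e-12   # s, dominant rebinding phase
TAU_SLOW = 20e-12    # s, minor rebinding phase
A_SLOW = 0.14        # assumed fraction of dissociated NO rebinding in the slow phase
A_ESCAPE = 0.01      # assumed fraction that does not rebind geminately
A_FAST = 1.0 - A_SLOW - A_ESCAPE  # remainder assigned to the fast phase


def rate_from_time_constant(tau):
    """First-order rate constant (s^-1) from an exponential time constant (s)."""
    return 1.0 / tau


def fraction_dissociated(t):
    """Fraction of NO still dissociated at time t (s) after the pulse, biexponential model."""
    return (A_FAST * math.exp(-t / TAU_FAST)
            + A_SLOW * math.exp(-t / TAU_SLOW)
            + A_ESCAPE)


k_fast = rate_from_time_constant(TAU_FAST)
k_slow = rate_from_time_constant(TAU_SLOW)
print(f"k_fast = {k_fast:.2e} s^-1, k_slow = {k_slow:.2e} s^-1")
for t_ps in (0, 5, 20, 100, 1000):
    print(f"t = {t_ps:5d} ps: fraction dissociated = {fraction_dissociated(t_ps * 1e-12):.3f}")
```

In this model the dissociated fraction falls to the ∼1% plateau within a few hundred picoseconds, which is what "almost completely geminate" rebinding means in practice.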

Androglobin nitrite reductase activity and effect of the cysteine redox state on behaviour
The nitrite reductase (NiR) reaction 51,52 was assessed for Adgb-GD under anaerobic conditions (Fig. 8). The reaction follows a mechanism in which NO₂⁻ slowly reacts with ferrous heme to generate NO and ferric heme. 53,54 The ferric heme is rapidly reduced by excess dithionite in solution, resulting in the formation of a ferrous-NO complex. The optical changes following this reaction are essentially identical to those of NO binding (Fig. 8A and B; compare with Fig. S8A and B †) with two-phase kinetics (Fig. 8C). The concentration dependence of the fast kinetics of the nitrite reaction (Fig. 8E) exhibits a high error due to the small amplitude of the optical changes observed in the global fit (Fig. 8D) but appears to follow a hyperbolic concentration dependence with an apparent K_D of 2.91 ± 0.57 mM NO₂⁻ and a maximum rate of 1.54 s⁻¹ ± 0.01 s⁻¹ (Fig. 8E, C). The slower rate representing the formation of the deoxyferrous-NO bound species also follows a hyperbolic curve as a function of nitrite concentration with an apparent K_D of 8.42 ± 0.46 mM NO₂⁻ and a maximum rate of 2.54 × 10⁻¹ s⁻¹ ± 6.1 × 10⁻⁴ s⁻¹ (Fig. 8F, C). Unlike NO binding, the reduction of the disulfide bond has a significant effect on the rate of NiR activity (Fig. 8E and F), with the TCEP-reduced free sulfhydryl protein (B) exhibiting significantly decreased kinetics, both for the fast kinetics, with an apparent K_D of 9.13 ± 1.38 mM NO₂⁻ and a maximum rate of 3.23 × 10⁻¹ s⁻¹ ± 0.02 s⁻¹, and for the slower NO binding kinetics, with an apparent K_D of 3.26 ± 1.12 mM NO₂⁻ and a maximum rate of (3.67 ± 0.40) × 10⁻² s⁻¹. This effect of sulfhydryl reduction on the NiR activity may arise from structural changes in the heme pocket, affecting exogenous ligand affinity, as observed for Ngb, 34 or affecting the endogenous distal ligand off-rate, as observed for Ngb 54 and Cygb. 45
The oxy form, which can only be observed transiently by stopped-flow spectroscopy, forms with an association rate constant of ∼0.8 × 10⁶ M⁻¹ s⁻¹. This compares to 14, 50, 60 and 250 × 10⁶ M⁻¹ s⁻¹ for Mb, Hb α, Hb β and Ngb, respectively. 55,56 Autoxidation rates for globins vary greatly, with Mb and Ngb reported at 0.055 and 5.4 h⁻¹. 55 The measured autoxidation rate for Adgb-GD is ≥ 2.6 s⁻¹ (9360 h⁻¹), being limited by the oxidation of dithionite. This value is three orders of magnitude higher than the corresponding value for Ngb (Fig. S12 †). This remarkable autoxidation rate strongly suggests that O2 binding is not a physiological function of the protein except for the possibility of oxygen sensing. Similarly, NO dioxygenase activity is unlikely, as this requires a semi-stable ferrous-O2 complex for the NO to react with. The oxidation state of the iron in vivo remains unknown, like that of Ngb and Cygb. 58,59 Consequently the existence of the ferrous form of the heme iron in vivo, at least in the testes, cannot be completely discounted.
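The comparison of autoxidation rates above involves a change of units from s⁻¹ to h⁻¹. The short sketch below reproduces that arithmetic using only values quoted in the text; the function and variable names are hypothetical, and the Adgb-GD figure is a lower limit, so the ratios are lower limits too. The half-life assumes simple first-order decay of the oxy form.

```python
import math

SECONDS_PER_HOUR = 3600.0


def per_second_to_per_hour(k_per_s):
    """Convert a first-order rate constant from s^-1 to h^-1."""
    return k_per_s * SECONDS_PER_HOUR


K_AUTOX_ADGB_PER_S = 2.6   # s^-1, lower limit reported for Adgb-GD
K_AUTOX_NGB_PER_H = 5.4    # h^-1, Ngb
K_AUTOX_MB_PER_H = 0.055   # h^-1, Mb

k_adgb_per_h = per_second_to_per_hour(K_AUTOX_ADGB_PER_S)
fold_vs_ngb = k_adgb_per_h / K_AUTOX_NGB_PER_H
fold_vs_mb = k_adgb_per_h / K_AUTOX_MB_PER_H
orders_vs_ngb = math.log10(fold_vs_ngb)

# Half-life of the oxy form, assuming first-order decay.
half_life_s = math.log(2) / K_AUTOX_ADGB_PER_S

print(f"Adgb-GD: {k_adgb_per_h:.0f} h^-1")
print(f"vs Ngb: {fold_vs_ngb:.0f}-fold (10^{orders_vs_ngb:.1f}), vs Mb: {fold_vs_mb:.0f}-fold")
print(f"oxy half-life: {half_life_s:.2f} s")
```

A sub-second half-life for the oxy form illustrates why it can only be observed transiently.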
Recent studies have shown that many globins appear to have properties relating to NO homeostasis, although the exact nature of this biochemistry in vivo is still under debate. Our results in Fig. 8 confirm that Adgb-GD reacts with nitrite to generate NO under hypoxic/anoxic conditions and subsequently binds NO as a pentacoordinate species. The NiR rate constants of other globins typically depend linearly on nitrite concentration. 29 Assuming that the rate of NO formation relates to the rate of ferrous-NO generation (Fig. 8F), the initial slope of ∼25 M⁻¹ s⁻¹ at low nitrite concentrations is higher than that for Ngb, Mb or Hb, and similar to that reported for globin X 50 and Cygb (Table S6†). 29,60 NO has important roles in the testes and hence NO homeostasis is also important. Four nitric oxide synthases (NOS) are present in the testes: endothelial NOS (eNOS), inducible NOS (iNOS), neuronal NOS (nNOS) and testis-specific NOS (TnNOS). 17 NO has been proposed to play a unique role in modulating germ cell viability and development, 20 with high NO concentrations exhibiting a deleterious role in the mobility of spermatozoa 61 and thus acting on some aspects of male infertility. 17 NO also has vital roles in steroidogenesis, gametogenesis and the regulation of germ-cell apoptosis, with excess NO linked to induction of germ cell apoptosis and oxidative damage. 62 Many globins have proven or proposed functions in NO regulation, either in primary or secondary roles. Thus, there is a distinct possibility that Adgb could also play a role in NO homeostasis and hence germ cell viability. Our data showing unusual NO binding and high NiR activity provide a proof-of-principle that Adgb could be involved in this process, but further studies are required to verify this.
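The hyperbolic nitrite dependence reported for Fig. 8E and F, and its relation to an initial slope at low nitrite, can be illustrated numerically. The sketch below assumes a simple saturation form, k_obs = k_max·[NO₂⁻]/(K_D + [NO₂⁻]), as one function consistent with the "hyperbolic" description; the source does not give the fitted equation, and the function names and dictionary layout are hypothetical. The parameter values are the apparent K_D and maximum rates quoted in the text.

```python
def k_obs(nitrite_mM: float, k_max: float, K_D_mM: float) -> float:
    """Observed rate (s^-1) for an assumed hyperbolic nitrite dependence."""
    return k_max * nitrite_mM / (K_D_mM + nitrite_mM)


def initial_slope(k_max: float, K_D_mM: float) -> float:
    """Limiting slope at low nitrite, k_max / K_D, in M^-1 s^-1."""
    return k_max / (K_D_mM * 1e-3)  # convert K_D from mM to M


# (k_max in s^-1, apparent K_D in mM), as quoted in the text.
FITS = {
    ("disulfide intact", "fast"): (1.54, 2.91),
    ("disulfide intact", "slow"): (0.254, 8.42),
    ("TCEP reduced", "fast"): (0.323, 9.13),
    ("TCEP reduced", "slow"): (0.0367, 3.26),
}

for (state, phase), (k_max, K_D_mM) in FITS.items():
    slope = initial_slope(k_max, K_D_mM)
    rate_at_5mM = k_obs(5.0, k_max, K_D_mM)
    print(f"{state:16s} {phase:4s} slope = {slope:6.1f} M^-1 s^-1, "
          f"k_obs(5 mM) = {rate_at_5mM:.3f} s^-1")
```

Under this assumed model, the limiting slope for the slow phase with the disulfide intact comes out at about 30 M⁻¹ s⁻¹, the same order as the ∼25 M⁻¹ s⁻¹ quoted above. A difference of this size is to be expected if the published value was read from the measured points at low nitrite rather than from the fitted limit.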
The presence of a calpain-like motif on the N-terminal side of the globin domain and a calmodulin binding domain directly interrupting the globin domain strongly suggests a role for calcium in the functional mechanism of this protein. The links between NO and calcium are well established. 63 Calcium channel blockers decrease intracellular calcium levels and increase the vasodilator efficacy of NO in smooth muscle. 64 Conversely, an increase in NO by vascular endothelial cells in the liver enhances calcium signaling in surrounding hepatocytes. 65 In the testes, calcium channel blockers used to relieve hypertension cause reversible male infertility in mice. 66 Hence it is not implausible that calcium and NO binding are linked to Adgb function.
Recently, the establishment of a role for Adgb in ciliogenesis has heightened the importance of Adgb. Overexpression studies show an Adgb-dependent increase in ciliated cells. 8 The expression of Adgb is in turn linked to Forkhead Box J1 (FOXJ1), a transcription factor involved in ciliogenesis, as overexpression of FOXJ1 directly led to increased Adgb mRNA levels through binding to the ADGB promoter. 8 Furthermore, Adgb has been shown to directly interact with the cytoskeletal protein Sept10. 9 Adgb knockout in mice leads to mislocalisation of Sept10 in sperm, resulting in malformations of the flagella and head. The mechanism of Adgb action appears to be via proteolytic cleavage of Sept10, with the proteolytic activity controlled via calmodulin-dependent binding to the IQ motif of the circularly permuted globin domain of Adgb. 9

Conclusions
In summary, the helical alignment for Adgb-GD from our method, designed to work in the twilight zone, yielded an alternate helix alignment around the C and D helical regions. This alignment is consistent with that obtained from the AlphaFold-2 model. The identification of a Tyr in the CD1 position (Tyr977 in the human Adgb) is, to our knowledge, unique for eukaryotic globins, but is common in prokaryotes.
The validation of our proposed helix alignment lies, at least in part, in the agreement with the AlphaFold-2 model and the generation of a stable recombinant form of the protein using the alternative globin domain sequence. This is supported by heme insertion and tight heme binding to generate optical and EPR spectra typical of pure authentic heme proteins. The femtosecond laser flash data indicate a true heme pocket from which ligands (NO) cannot escape, and the kinetics measured by stopped-flow spectroscopy conform in general to those of known heme proteins. The presence of an intramolecular disulfide bond goes some way to explain the stability of the recombinant protein, given that the N- and C-terminal regions of this circularly permuted globin domain are in the critical area of the heme-binding pocket.
The biochemical evaluation presented here illustrates the characteristics of the globin domain of Adgb, showing that it can potentially participate in NO sensing or regulation. Our results, together with the known calcium-linked structural aspects of the calpain and calmodulin binding properties of Adgb, lay the foundation for further investigation into the functional role of Adgb, given its potential medical significance.

Data availability
Molecular simulation models have been deposited as videos and are accessible via https://zenodo.org/records/10509366. Additional experimental details and data are provided in the ESI,† including pairwise alignments for each helix-helix profile alignment, globin domain sequence alignments, frequency analysis of amino acid CD1 assignment, a comparison of globin domain optical wavelength maxima, extinction coefficients, ligand binding constants, geminate recombination binding constant and nitrite reductase activity with other globins, SDS-PAGE and immunoblot analysis of globin domain purity, heme content analysis, oligomer analysis, ferric heme acid-alkaline transition, cysteine reduced EPR data, NO and CO binding spectra and kinetic data, O₂-NO and NO-CO ligand exchange data and autoxidation data.

Fig. 2 Androglobin heme-binding globin domain alignment. The alignment of the traditional globins was generated by structural alignment; the alignment of Adgb was determined using the in-house multi-template approach. The alignment denoted as Adgb_2012, previously reported by Hoogewijs et al., 1 is shown where it differs from our alternative proposed alignment. The six alignment zones coincide (bar helices C and D) with globin helices denoted as 'Helix A', 'Helix B' etc. as determined by inspection from the X-ray crystal structures. '/' denotes the chain break in the discontiguous Adgb sequence, 'XXX.' denotes the extra residues used in the Adgb construct (this work) while 'xxx.' denotes the extra residues in the globin domain of Adgb reported by Hoogewijs et al., 2012 (ref. 1), which here are identified by us as part of the preceding calpain C2-like domain (unpublished work). Key amino acids are denoted by labelled arrows, 'a' for the conserved CD1 loop aromatic residue, 'b' for the proximal histidine and 'c' for the distal histidine/glutamine; the disulfide bond is denoted by '#'.

Fig. 3 Structures of the heme-binding globin domain of androglobin. (A) The complex 1 structure determined using Modeller, (B) the complex 2 structure determined using Modeller and (C) the AlphaFold 2 structure. The structure is coloured blue for the N-terminus (helices D-H), magenta for the IQ domain insertion and cyan for the C-terminus (helices A-C). (D) Stereo view of the predicted heme environment from complex 1 (blue) and complex 2 (purple).

Fig. 4 Molecular dynamics simulations of Adgb-GD complex 1, complex 2, and the AlphaFold 2 model.(A) RMSF comparison between complexes 1 (red) and 2 (black); (B) heme RMSD within complexes 1 and 2 during the MD replicas; (C) distances between pairs of cysteine residues during the MD simulation of the AlphaFold 2 model (C787-C978 in black and C894-C970 in blue); (D) RMSF plotted on complex 1, complex 2, and the AlphaFold 2 model (represented as a ribbon; heme is represented as a green stick); red ribbon colour indicates high structural flexibility; the four cysteine residues of the domain and their distance are shown on the AlphaFold 2 model.

Fig. 5 Optical characteristics of the androglobin globin domain. Protein was expressed based on the revised amino acid assignment shown in Fig. 2 and S2.† Ferric (black line), dithionite-reduced deoxyferrous (blue line), ferrous-CO bound (red line) and ferrous-NO bound (green line).

Fig. 6 Androglobin heme iron ligation and cysteine oxidation states. (A) EPR spectra of the 80 mM ferric androglobin globin domain showing both high spin and low spin Fe³⁺ EPR signals at various pH values. The spectra were recorded at 10 K, 3.16 mW microwave power and 5 G modulation amplitude. (B) Expanded view of (A) highlighting the low spin signals. (C) Measurement of the surface exposed disulfide number and oxidation state. Protein (6.4 mM) was titrated with 4,4′-dithiodipyridine and the optical changes followed at 324 nm. The fractional saturation of dithiodipyridine binding to free sulfhydryl as a function of dithiodipyridine concentration for TCEP reduced (○) and purified forms (no disulfide reduction, ●) shows that one surface-exposed disulfide bond is present in the globin domain when expressed. Inset shows the optical changes of titration of TCEP reduced Adgb-GD with dithiodipyridine. (D) An EPR spectrum of ferrous NO bound Adgb-GD, exhibiting three lines, separated by 16 G, around g = 2.011, typical of other EPR spectra of five coordinate NO binding such as cyt c′. The spectrum was recorded at 10 K, 3.16 mW microwave power and 3 G modulation amplitude.

Fig. 7 Ultrafast photo-dissociation and rebinding of NO from the ferrous Adgb-GD five coordinate NO complex.(A) Transient absorption spectra after different delay times upon excitation at 570 nm.(Inset) Decay associated spectra corresponding to the NO geminate rebinding phases obtained from a global analysis and (B) dual-timescale kinetics and fits at selected wavelengths.

Fig. 8 Nitrite reductase activity of Adgb-GD and effect of the disulfide redox state on reductase activity. (A) Optical spectra of deoxyferrous Adgb-GD (5 mM) with sodium nitrite (5 mM). (B) Difference spectra with initial ferrous protein set to zero. (C) Time course of optical changes fitted to a double exponential function (k₁ = 9.25 × 10⁻¹ s⁻¹ and k₂ = 1.02 × 10⁻¹ s⁻¹). (D) Global fit of initial deoxyferrous protein (blue), intermediate (red) and final ferrous-NO bound spectra (black). (E and F) The dependence of the rate constants for Adgb nitrite reductase activity on nitrite concentration. The observed rate constants of the fast (E) and slow (F) phases of the reaction in the presence (●) or absence (by reduction using TCEP, ○) of the CD disulfide bond.