Combining experimental and theoretical methods to learn about the reactivity of gas-processing metalloenzymes

After enzymes were ﬁ rst discovered in the late XIX century, and for the ﬁ rst seventy years of enzymology, kinetic experiments were the only source of information about enzyme mechanisms. Over the following ﬁ fty years, these studies were taken over by approaches that give information at the molecular level, such as crystallography, spectroscopy and theoretical chemistry (as emphasized by the Nobel Prize in Chemistry awarded last year to M. Karplus, M. Levitt and A. Warshel). In this review, we thoroughly discuss the interplay between the information obtained from theoretical and experimental methods, by focussing on enzymes that process small molecules such as H 2 or CO 2 (hydrogenases, CO-dehydrogenase and carbonic anhydrase), and that are therefore relevant in the context of energy and environment. We argue that combining theoretical chemistry (DFT, MD, QM/MM) and detailed investigations that make use of modern kinetic methods, such as protein ﬁ lm voltammetry, is an innovative way of learning about individual steps and/or complex reactions that are part of the catalytic cycles. We illustrate this with recent results from our labs and others, including studies of gas transport along substrate channels, long range proton transfer, and mechanisms of catalysis, inhibition or inactivation. review how a great deal of information can be obtained using an interdisciplinary approach thatcombinesstate-of-theartkineticsandcomputationalchemistry.Thisdi ﬀ ersfrom — andcomplements — themoretraditionalstrategies thatconsistintrying to see the catalytic intermediates using methods that rely on the interaction between light and matter, such as X-ray di ﬀ raction and spectroscopic techniques.


Introduction
Chemists are fascinated by the catalytic power of enzymes, which accelerate reactions by many orders of magnitude. Since they were discovered, more than a century ago, the amount of information that has been acquired about their working principles has been phenomenal. Thanks to the contributions of many physical chemists, great progress has been made regarding the use of both classical and quantum mechanics to describe the mechanism at a molecular level. However, depending on the intrinsic complexity of the catalytic system, the level of understanding that theoretical chemists can achieve varies greatly.
Regarding enzymes that have either no cofactors or organic cofactors, and where the chemical transformation of the substrate occurs at the protein surface, substrate binding is essentially a matter of docking (rather than a complicated, intramolecular, multi-step diffusive process) and the main features of the mechanism can be inferred from X-ray data and site-directed mutagenesis experiments that identify the crucial amino acids. In these cases, theoretical chemists can focus on detailed yet important aspects of function, such as the role of protein motions in determining the turnover rate. Complications may arise in the case of "oppy" enzymes where a large conformational change on the micro-second time scale might partly determine the turnover rate.
The situation is very different in the case of many other enzymes (including some of those discussed here) that use an inorganic cofactor to transform a small substrate. This is for several reasons: (1) X-ray investigations oen give an ambiguous picture of the structure of the active site, and/or, as occurs with hydrogenases, cannot detect the substrate because it is not sufficiently electron-dense; (2) the reactivity of complex inorganic active sites is sometimes difficult to predict, in part because it is largely tuned by the surrounding protein matrix, so that the catalytic mechanism is far from being straightforward (the exact mode of substrate binding, the sequence of events that take place at the active site during catalysis are oen unknown); (3) theoretical methods have not yet been tuned to achieve the same accuracy as with organic cofactors, so that the results of calculations must be considered with caution; (4) these enzymes oen house several cofactors and the catalytic mechanism involves a number of steps which are very different in nature (long range intramolecular substrate and product diffusion, long range proton and electron transfers and active-site chemistry per se) which occur on sites of the protein that are very far apart from one another. Any of these steps may, under certain conditions, limit the overall rate of the reaction and may therefore determine the enzyme's global catalytic properties. Oen it cannot even be ascertained that active site chemistry limits the rate of turnover and regarding three out of the four enzymes discussed here, the calculation of turnover rates using theoretical methods still appears to be out of reach. A combination of theoretical chemistry and experimental methods can nonetheless be very useful to understand many different aspects of the mechanism, as discussed herein. Fig. 1 summarizes the different approaches, both experimental and theoretical, that can be used independently or in combination to nd out how such complex catalysts work. The outcome of experiments and calculations are the observables listed in the central column of Fig. 1 and organized in three groups: thermodynamic, structural and kinetic properties. The mechanism itself is not an observable, which is the main reason why theoretical calculations are essential. Of course, experimental observables derive from the structure and reactivity of the enzyme, but in such a complex way that it is generally not possible to deduce the mechanism from the values of the observables. In that respect, confronting theoretical results to experimental observations can help uncover the molecular details of the catalytic mechanism of an enzyme. This process is sketched in Fig. 1 and illustrated in the last section of this paper.
The feedback process shown on top of Fig. 1, which is the key to understand the mechanism, is necessarily bootstrapped by experimental observations. We have classied the latter into three main approaches (structural, spectroscopic, kinetic), which can be used to probe three kinds of enzyme samples (at equilibrium, frozen, or turning over under catalytic conditions). Our goal here was not to list all existing techniques, but to show how they relate to each other. Any experiment, indicated in red in the right part of Fig. 1, is at the intersection between two or several domains: for example a redox titration consists in using a spectroscopic technique to monitor the redox state of a sample under equilibrium conditions. Experimental observables are very complex functions of the structures and kinetic properties of intermediates of the catalytic cycle. They can be interpreted to give structural or mechanistic information (e.g. "this IR spectrum shows that there are probably 3 CO ligands", or "the pH dependence of this rate constant shows that the corresponding reaction involves a protonation"), but they do not usually give a complete description of the catalytic mechanism.
As illustrated in Section 4, direct electrochemistry has proved important in kinetic investigations of metalloenzymes, 1 and we briey introduce the technique here. Enzyme molecules are adsorbed or covalently attached 2,3 as a submonolayer onto an electrode; the electrode potential is set to a value that forces the oxidation or the reduction of the enzyme, and the continuous catalytic transformation of substrate results in a ow of electrons across the electrode. This catalytic current is proportional to the turnover frequency times the electroactive coverage of enzyme participating in the reaction. If the electroactive coverage is constant, the current is proportional to turnover rate. That the current can be sampled at sub-second intervals is a strong advantage compared to traditional solution assays. Most signicantly, using an electrode adds a control parameter (the electrode potential) to traditional enzyme kinetic measurements performed in solution. By changing the electrode potential, using steps or sweeps, it is possible to observe how the enzyme responds to changes in electrochemical driving force. Provided that kinetic models are used to quantitatively interpret the data, information can be gained about the properties of the enzyme's redox centers and the kinetics of intramolecular electron transfer, 4,5 or the (in)activation of the enzyme that oen occurs under conditions of extreme potential. [6][7][8] The concentrations of substrate, product or inhibitors can also be changed while the activity is being recorded, making it easy to determine Michaelis and inhibition constants but also, and most importantly, rates of the reaction with inhibitors. 9,10 The technique has obvious limitations: not all enzymes can be directly wired to electrodes and some artefacts sometimes arise from the protein/electrode interaction. We have discussed in a previous review some of the artifacts that may occur in PFV experiments. 1 Apart from that, the main pitfalls of the technique are the same as those described in all enzyme kinetics textbooks: observing an agreement between a kinetic model and experimental data does not imply that the model is correct (or unique), and ingenious approaches have to be used to learn about the rates of individual steps in the catalytic cycle, or the molecular mechanisms of the chemical transformations that are at stake, based on a global measurement of turnover rate.
Regarding mechanistic investigations, it is important to realize that key intermediates are intrinsically short-lived, and consequently difficult to accumulate, detect and characterize experimentally. This implies that experimental results oen need to be complemented by theoretical studies. The growing role of quantum chemical methods in the investigation of metalloenzymes is well testied by the 2013 Nobel Prize in Chemistry, which was awarded to Martin Karplus, Michael Levitt and Arieh Warshel for the development of multiscale computational models of complex chemical systems, i.e. the development of methods, based on classical and quantum mechanical theory, which can be used to study large chemical systems and their reactivity.
The computational methods used to study the molecular properties of metalloenzymes can be classied into two families (le part of Fig. 1). The rst includes methods grounded in classical physics, such as Molecular Mechanics (MM) and Molecular Dynamics (MD). MM methods are used to calculate potential energies, whereas the goal of MD calculations is to describe the evolution of the structure of the protein, using Newton equations, based on the known energies of interaction between different atoms. MM and MD calculations allow to investigate the "physical" properties of the system, such as the dynamics of proteins in solution, as well as the diffusion of substrates and inhibitors into enzymes, but such approaches cannot be used to investigate properties that explicitly depend on electrons, such as reaction pathways and most spectroscopic features. The second family of computational tools includes Quantum Mechanical (QM) methods, which allow to calculate reaction energies and spectroscopic properties. QM methods are now routinely used to investigate large molecular systems, such as the active site of enzymes. Among all available QM methods, those based on the Density Functional Theory (DFT) are extremely popular due to their favorable trade-off between accuracy and computational costs.
In the context of bioinorganic chemistry, theoretical methods are useful for learning about active site geometries, for interpreting spectroscopic properties, and for elucidating reaction mechanisms (based on the energies of minima and saddle points along putative reaction pathways). The calculated observables are the same as those determined from experiments, but the approach usually takes a different route. In most theoretical calculations, especially QM, one needs to rst postulate a structure or mechanism and then compute the observables. That the calculated observables match the measured ones suggests that the postulated structures are correct. Or the fact that calculated observables do not match experimental ones demonstrates that the mechanistic hypotheses can be ruled out. The comparison with experimental results is also fundamental to ensure that the system is described with sufficient accuracy with the approximations used (for instance, in quantum chemical calculations one has to choose the level of theory, the basis set, the cluster size, etc.). Comparison between theory and experiments can be made on different levels, from a qualitative point of view ("this intermediate is much too high in energy, so it is very unlikely that catalysis proceeds this way") to semi-quantitative ("theory predicts that this species should be easier to oxidize than this one, in agreement with the experiments") or quantitative (comparing the calculated and measured values for IR frequencies or rate constants).
Recently, advances in both experimental and theoretical methods have favored the dialogue between the "wet lab" and in silico approaches, and this interaction can now provide answers to open issues in the eld of enzyme-catalyzed fuel production. In the present paper, we aim at showing how the combination of computational and experimental methodologies in enzymological studies can be fundamental for favoring the crossfertilization of ideas, which is a prerequisite for any future change of paradigm in energy production and supply.
In fact, since most technological processes currently rely directly or indirectly on fossil fuels, which are non-renewable (in non-geological time scales) and consumed at an everincreasing rate, one challenge facing current world economy is related to the availability and cost of energy. In addition, the burning of fossil fuels is continuously increasing CO 2 concentration in the atmosphere, causing environmental problems. Therefore, the development and exploitation of alternative and renewable fuel sources and energy carriers, as well as advances in CO 2 processing technologies, have very high priority.
The production of solar fuels is one of the best answers to such energy and environmental crisis and certainly one of the grand challenges of this century. Storing sunlight in the form of energy-rich chemical bonds offers the prospect of using existing or only slightly modied technologies that currently run on fossil fuels, such as e.g. car engines. Biology provides much inspiration for the development of such catalysts. Over millions of years, Nature has evolved highly efficient metal-clusters bound to proteins, for the purpose of converting small, inert molecules such as CO 2 , N 2 and even water, with the help of sunlight, into highly energetic molecules (fuels) such as CO, methanol, ammonia or H 2 . We believe that a deep understanding of these fundamental biological reactions will provide the key for a successful translation into articial processes. For this to happen, it will be vital to take advantage of the synergistic strengths of combined experimental and computational approaches.
Here is the structure of the paper and the scope of each section. In the second section, we introduce and describe the structures of the four enzymes that we shall discuss throughout the paper. In the third section we discuss how observables can be either measured in experiments or calculated, and at which accuracy; we shall also illustrate the drawbacks and pitfalls of several approaches. In the last section, we critically discuss selected literature in this eld. We identify certain discrepancies between experimental and theoretical results, and gaps in the existing knowledge that will clearly be of interest in the future. We emphasize cases where combining experiments and theory provided much more insights than using the two approaches independently. Theoreticians should be able to start from educated guesses based on the experimentalists' results, while experimentalists should be able to perform the experiments that help discriminate between different hypotheses. This synergy is illustrated with several examples taken from our work and the work of others, focussing on four different metalloenzymes, three oxidoreductases ([NiFe] and [FeFe]-hydrogenases, carbon monoxide dehydrogenase) and one non-redox enzyme (carbonic anhydrase), all of which catalyse reactions of importance in the context of renewable energy and environmental-friendly processes.
2 Background information about the four enzymes discussed in this paper

Hydrogenases
Hydrogenases 11,12 are enzymes that catalyse the reversible oxidation of H 2 into protons and electrons according to: They are divided into two classes based on the metal content of their active site. The so-called "[NiFe]-hydrogenases" house a dinuclear [NiFe] active site, in which the Ni is coordinated by 4 cysteines (two of which bridge the metal ions), and the Fe is coordinated by two CO and one CN À ligand ( Fig. 2A). MD and DFT calculations suggest that H 2 binds to the Ni ion. 13,14 The active site is buried inside the protein matrix, and connected to the solvent via a hydrophobic tunnel that guides the transport of substrate, a network of protonatable amino acids that transfer protons to/from the active site, and a chain of three iron-sulfur clusters to mediate electron transfer to/from the redox partner. These clusters are referred to as "proximal", "medial" and "distal" according to their distance from the active site.
[FeFe]-hydrogenases oxidize or produce H 2 at an active site, the so-called H cluster, that is composed of a standard [4Fe4S] cluster covalently attached by a cysteine residue to a [Fe 2 (CO) 3 (CN) 2 (dtma)] subsite (dtma ¼ dithiomethylamine) 15,16 (Fig. 2B). The iron atoms of this [FeFe] subsite are named proximal (Fe p ) or distal (Fe d ) according to the distance to the [4Fe4S] cluster. In the catalytic mechanism, the [FeFe] subsite cycles between at least two redox states, referred to as Hox and Hred, which can be formally described as Fe(II)Fe(I) and Fe(I)Fe(I), respectively. Dihydrogen or protons (depending on the direction of the reaction) bind on the distal Fe. The enzyme from Chlamydomonas reinhardtii (Cr) has no cofactor other than the H cluster. The enzymes from Clostridium pasterianum (Cp) and Clostridium acetobutylicum (Ca) bind 4 additional FeS clusters, which act as electron relays. The enzyme from Desulfovibrio desulfuricans (Dd) houses the H cluster and two [4Fe4S] clusters.

ACS/CODH
Acetyl-CoA synthase/CO-dehydrogenase (ACS/CODH) is a bifunctional enzyme that plays a crucial role in anaerobic bacteria such as acetogenic organisms, which rely on the Wood-Ljungdahl pathway of carbon xation. 17 It is estimated that z10 11 tons of acetate per year are produced globally from CO 2 through this pathway. 18 ACS/CODH catalyses the synthesis of acetyl-CoA from CO 2 , CoA, and a methyl group donated from the corrinoid-iron-sulfur protein (CoFeSP). This complex reaction occurs in two steps, that take place in different subunits: the two-electron reduction of CO 2 to CO according to reaction (2) is catalysed in the b subunit, at the C cluster, a [NiFe 4 S 4 ] active site (Fig. 2C).
It is proposed that CO 2 binds the C cluster in the so-called C red2 redox state, with the C atom of CO 2 bound to Ni(0), and one O atom to a Fe(II) atom of the cluster. CO and water release leaves the cluster in the C red1 state (Ni(II)Fe(II)). Electrons are transferred via the B and D clusters to the external electron acceptor. Some aspects of this mechanism are still under debate. For instance, a revised mechanism has been recently suggested where CO 2 is inserted into a Ni(II)-hydride bond. 19 A second active site, a [Ni 2 Fe 4 S 4 ] cluster in the a subunit (the A cluster), catalyses the incorporation of the CO in a methyl group to give acetyl-CoA.
The ACS (a) and CODH (b) subunits of the bifunctional enzyme are associated in a dimer of dimers (a 2 b 2 ). The C and A clusters are 70Å apart from one another and a 138Å long cavity runs along the entire length of the enzyme, connecting all A clusters and C clusters, from the sites where CO is produced to the sites where it is consumed.

Carbonic anhydrase II (CA II)
This enzyme is a small protein (29 kDa) which catalyses CO 2 hydration and HCO À 3 dehydration: 20,21 It is involved in many biological processes, such as maintaining the correct acidity of blood in mammals. It is also important in photosynthesis since the substrate of RubisCO, the enzyme involved in the rst major step of carbon xation, is CO 2 rather than its hydrated forms. The active site of CA II is a Zn 2+ centre coordinated by three His nitrogens and one water molecule (Fig. 2D).

A general introduction to computational methods: calculations of structures and spectroscopic properties
Two strategies can be followed for the denition of QM models of metalloproteins. In the cluster approach, only the active site and some neighbouring atoms are taken into account, and the rest of the protein environment is only implicitly modelled. In the QM/MM approach, the active site is described using quantum chemistry, whereas all other atoms of the protein are modelled using a molecular mechanics formalism. Both approaches have advantages and disadvantages, which have been extensively discussed in recent reviews. [22][23][24][25][26] The cluster approach is generally well suited for modelling metalloenzymes, since the chemical steps of the catalytic mechanism usually involve only the metal ions and nearby residues. [27][28][29][30] However, the selection of the atoms included in the model is oen far from trivial. In addition, the modelling of the peripheral atoms (i.e. those at the boundary of the QM model) can be problematic. When a cluster model is used, the presence of the protein matrix that surrounds the active site is generally modelled by soaking the QM portion in a continuum dielectric. This is particularly important for metal-containing active sites, which oen are not electrically neutral. In fact, an unbalanced charge distribution in the active site can result in unrealistic electron transfers within the model cluster. As a continuum dielectric, several solvation models like the conductor-like screening model (COSMO) [31][32][33][34] and the polarizable continuum model (PCM) have been developed. [35][36][37][38][39][40] When the architecture and stereoelectronic features of the protein matrix are expected to affect the structural properties of the active site, as well as the regiochemistry of substrates or inhibitors binding, modelling the protein environment in an explicit manner can be very important. The development of QM/ MM models, which has allowed the investigation of whole proteins, was pioneered by Warshel and Levitt. 41 These methods have become increasingly popular in the last twenty years.
The structures of organic molecules calculated with DFT, which is the only affordable level of theory when dealing with large systems, can be very reliable, with errors on bond distances and angles that are generally lower than 2 pm and a few degrees. Regarding coordination compounds, strong metal ligand bonds (such as those involving CO and CN À ligands) are generally predicted with excellent accuracy, whereas the prediction of weaker metal-ligand bonds can be more problematic. Very weak interactions like hydrogen bonds can also be challenging.
DFT calculations have been useful also for the elucidation of structural properties of proteins. The so-called quantum renement approach is a crystallographic renement procedure in which a molecular mechanics force eld, which is generally used to supplement the X-ray diffraction data, is replaced with more accurate DFT calculations; 42 it has been used to clarify the chemical structure of cofactors or the protonation state of aminoacids. As an example, the nature of the dithiolate ligand in the active site of [FeFe]-hydrogenases ( Fig. 2B) was initially controversial, since it was suggested that it may contain C, N, or O as the bridgehead atom. To shed light on this issue, Ryde and collaborators 43 carried out quantum renement calculations taking into account different models of the dithiolate ligand, nding that structures with a N bridgehead atom provide the best t to the raw crystallographic data, in agreement with previous proposals. [44][45][46] These results were conrmed recently when it became possible to change the nature of the bridging dithiolate ligand: 16 the enzyme is active only if the bridging ligand bears a nitrogen atom.
It is also important to keep in mind that metalloproteins oen contain metal ions with unpaired electrons, which must be described using spin polarized methods, where electrons with different spin are treated with a different potential. In addition, in some enzymes, such as those containing [4Fe4S] clusters, the metal atoms can interact, generating antiferromagnetic coupling between electrons localized on different atoms. Spin-coupled systems are intrinsically difficult to describe using DFT because their ground state wavefunctions generally correspond to linear combinations of multiple determinants. However, approximate methods have been shown to produce reliable results: in the broken symmetry (BS) approach developed by Noodleman and coworkers 47,48 the opposite spins are localized to give a mono-determinant representation of the spin exchange interactions within the molecule.
The prediction of vibrational frequencies, and consequently of IR spectra, is closely related to the accuracy in the calculation of equilibrium geometries. In general, harmonic frequencies computed using DFT, when scaled using ad hoc empirical correction factors, agree very well with experimental data and can allow to distinguish among different plausible chemical structures that might correspond to the species under investigation. As an example, the combination of data obtained from infrared (IR) spectroscopy with the corresponding computed spectra has been one of the most effective approaches used to characterize hydrogenases. In fact, the peculiar presence of CO and CN À ligands in the active site of these enzymes has allowed to monitor the shis of their vibrational modes and to correlate them with the molecular structure of different redox and protonation states of the enzyme. [49][50][51] The calculation of other spectroscopic properties, such as UV-Vis, CD and EPR, is more challenging and high-level ab initio methods, such as CCSD(T) and CASSCF, are oen required to obtain reliable results. However, as these methods are computationally very expensive, theoretical chemists make extensive use of DFT to compute spectroscopic properties of bioinorganic systems 24,52,53 and the performance and reliability of this method has recently been discussed. 54 In general, computed spectroscopic properties obtained using DFT are not always accurate, and sometimes even qualitative results can be incorrect. For this reason, DFT derived properties must be carefully checked and tuned using experimental data as reference. DFT calculations of Mössbauer isomer shis for the 57 Fe nucleus have generally produced encouraging results. 55 In contrast, the computation of EPR parameters is more problematic. Indeed, g-shi values are oen underestimated when using standard functionals, and some metal ions, such as Cu(II), can be particularly challenging. The accurate prediction of hyperne coupling constants can also be difficult, with results that can be strongly dependent on the nature and oxidation state of the metal ion under investigation. Nevertheless, DFT calculations of g values and hyperne coupling constants have oen well complemented data obtained from EPR spectroscopy, as documented by their role in the characterization of structural features of paramagnetic [NiFe]-hydrogenase forms. 12 Since only electronic ground states can be rigorously computed using DFT calculations, the investigation of excited states and their properties can be carried out only indirectly. In this context, DFT has beneted from the development of timedependent linear response theory within the ab initio methods. Time-dependent density functional theory (TDDFT) is now routinely applied to compute the electronic spectra of bioinorganic systems, even though the quality of the results is very dependent on the molecular system under investigation and on the choice of the exchange-correlation functional. Multi-congurational approaches, such as CASPT2 and MRCI, can give more accurate results, but these methods are still computationally very expensive.

Calculating and measuring thermodynamic parameters
3.2.1 Energy and free energy proles (intermediates and Michaelis complexes). QM calculations can give quantitative information about the thermodynamics and the kinetics of a reaction pathway, through the computational characterization of the structure of reactants, products, intermediate species and the corresponding transition states, as well as their energy differences. While the computation of the structures of reactants, intermediate species and products is relatively straightforward, because they correspond to energy minima on the potential energy surface, the computation of transition states (TSs) in a reaction pathway (i.e. saddle points on the potential energy surface) requires deep chemical intuition, because they cannot be deduced unambiguously just from the specication of reactants and products. 52 Standard reaction energies of organic molecules, such as additions and substitutions, when computed with DFT methods, are generally within 2-3 kcal mol À1 of the corresponding experimental values. The level of accuracy slightly decreases when considering bioinorganic systems containing transition metals, but the trade-off between accuracy and computational costs remains extremely good, allowing to cautiously discuss and compare computed reaction energies. As an example, an average accuracy of about AE5 kcal mol À1 can be expected in the computation of metal-ligand dissociation energies. 52 However, it is important to remark that an error of 1.4 kcal mol À1 in binding energies corresponds to an order of magnitude difference in dissociation constant (K d ) at room temperature; the same problem arises in attempts to deduce rates from activation energies. Also due to the approximations necessarily introduced to model large biological molecules, the discrimination among alternative reaction pathways only on the basis of energy differences between intermediates and transition states can be problematic. In fact, for some difficult cases, such as Cu 2 O 2 or Fe(IV)-oxo containing systems, even a qualitative analysis might lead to wrong conclusions. 24 In addition, to describe the energy prole of a reaction, standard free energy differences (DG 0 ) should be computed, whereas QM calculations provide directly only the electronic energy differences (DE 0 ). The comparison of DE 0 values is sufficient to discriminate among different reaction pathways when the energy corrections that should be computed and added to DE 0 to obtain the corresponding DG 0 values can be assumed to be similar for the different reaction pathways under investigation. Experimental observables are free energies, but their computation is oen affected by large errors. First, computed energies should be corrected with the vibrational zero-point energy (ZPE) contribution, which is crucial if the aim is to compute deprotonation energies 56 or to evaluate proton-transfer energies and barriers, proton tunneling and kinetic isotope effects. 57,58 Second, entropic contributions should be taken into account, and calculated from the roto-translation partition function of the system, at a given T and P. However, only approximated partition functions can be computed for molecules containing a large number of atoms. Of course, entropic corrections are crucial for the description of associative and/or dissociative elementary reaction steps; their values are in the range of +10 kcal mol À1 for an associative reaction step, when considering standard state concentrations.
Regarding experiments, among the various quantities which are directly related to free energy variations, only association/ dissociation constants and reduction potentials can be easily measured (if we exclude equilibrium constants between substrate and product and reaction energies, which give no information about the catalyst). This is described hereaer.
3.2.2 Experimental dissociation constants. Many experimental methods make it possible to measure either equilibrium dissociation constants between enzyme and ligands (hence free energies of binding) or apparent dissociation constants for the reaction The main issues in interpreting these results are that not all parameters in units of concentration are true dissociation constants (related to a free energy of binding), and that the different parts of the system that contribute to the apparent affinity of the enzyme for a ligand are difficult to resolve.
If one is interested in the catalytic transformation of a substrate S into a product P, the change in steady state turnover rate against substrate concentration can oen be understood from a very simple scheme: An experimental parameter that is easily measured is the Michaelis constant, K m , dened from the change in turnover frequency (v) against substrate concentration: The Michaelis constant is greater than the true dissociation constant K d ¼ k out /k in unless the transformation of the enzymesubstrate complex is slow compared to substrate release 59 and True dissociation constants are more easily obtained from inhibition experiments. If the inhibition by a certain ligand is reversible, then the turnover rate reaches a non-zero, steady-state value in the presence of substrate and inhibitor, and the inhibitor binding constant is deduced by looking at how the steady-state turnover rate v changes with inhibitor concentration [I]: The apparent dissociation constants K app i can also be deduced from the ratio of experimentally determined binding/ dissociation rate constants. It may depend on substrate concentration. For example, if the substrate and the inhibitor compete for binding to the same active site, 60 then the apparent K i measured by changing [I] at a constant [S] is If the inhibitor reversibly binds to form a dead-end complex, as occurs with CO binding to hydrogenase for example, then inhibitor binding is at equilibrium in the steady-state, 59 and the measured K i is a true thermodynamic parameter. H 2 inhibits proton reduction in both [NiFe]-and [FeFe]-hydrogenases (the former more strongly than the latter). However, the enzyme-H 2 complex is not a dead-end (it is a catalytic intermediate of H 2 evolution) and therefore the inhibition constant is not a true dissociation constant; it is actually greater than K d (ref. 61).
If the inhibitor binds irreversibly on the experimental time scale, then inhibition is complete (provided the concentration of inhibitor is greater than the concentration of enzyme) and the rate of inhibition can be measured, 9 but the rate constant of dissociation and the dissociation constant (K i ) cannot.
3.2.3 Reduction potentials. Reduction potentials are very important properties of redox cofactors, because according to Marcus theory, they are one of the three parameters that determine the kinetics of electron transfer (ET) between distant centers. The other two are the reorganisation energy, which is difficult to measure (it is deduced from the dependence of the rate of ET on either DG or T, all things being equal), and the intercenter coupling, which cannot be independently measured. Note however that when both redox centers are paramagnetic, the intercenter coupling is related to the magnitude of their exchange interaction, which can be deduced from the simulation of the EPR spectrum. 62 The reduction potential of an active site is also one of the parameters (but by no means the only parameter) that determines the "catalytic bias", that is whether the enzyme is a better catalysts of the reaction in the oxidative or reductive direction. 5,63 Reduction potentials can be determined in experiments termed redox titrations, where the system is poised under equilibrium conditions, stepwise reduced or oxidized; the "solution" potential is measured using a platinum electrode and the redox state is monitored using a spectroscopic technique. This is conceptually very simple if the system has a single redox center. If the protein or enzyme houses several redox centers that interact (meaning that the reduction potential of one center is affected by the redox state of the nearby centers), it is important to distinguish between microscopic reduction potentials (that can only be measured if the centers have distinct spectral properties) and macroscopic potentials (that are measured if the centers are indistinguishable in a particular experiment). 1,64 Depending on the spectroscopic method used to monitor the redox state of the sample and the spectral properties of the redox cofactors, a large amount of biological material may be required to carry out a complete redox titration. The implementation of the measurement is oen tricky. (1) A cocktail of redox mediators has to be present in solution to increase the rate at which the equilibrium is reached; its composition and concentration must be chosen carefully. (2) An artifact may arise from the fact that the redox equilibrium may unexpectedly shi when the sample is frozen to be examined by e.g. EPR (for an effect of temperature on the thermodynamics of intramolecular ET, see e.g. ref. 65). Changes in apparent pH can also occur on freezing aqueous buffer solutions. 66 (3) Enzymes like hydrogenases cannot be equilibrated at low potential because they turnover protons, which cannot be removed from the solution. (4) Last, and maybe most importantly, it is rarely checked that the redox process is fully reversible (for an example where it is unexpectedly irreversible, see ref. 67). Overall, the error on E 0 is most oen larger than AE10 mV, and there are many sources of artifacts that can result in the value being uncertain.
Dynamic electrochemical methods, where the system is not at equilibrium, can also be used to measure reduction potentials. 68 The information can sometimes be simply obtained from the result of a voltammetric experiment, where the electrode potential is repeatedly swept up and down to trigger the oxidation and reduction of the center, which is detected as an oxidation or reduction current. If the system has several redox centers, voltammetry measures macroscopic reduction potentials. If the redox reaction is a pure electron transfer or if it is coupled to fast reversible reactions (such as (de)protonation or ligand binding and release), then the thermodynamic information is easily obtained from experiments carried out in the low scan rate limit, where the system remains close to equilibrium. The rate of interfacial electron transfer and/or the rates of the coupled reactions can be deduced from experiments carried out at fast scan rates. 69 If the coupled reaction is irreversible, then the reduction potential can only be measured if the electrode potential is swept so quickly as to outrun the coupled reaction, 70 but there is no guarantee that this regime can be reached in experiments.
If the coupled reaction is the reversible or irreversible catalytic transformation of a molecule in solution, then the electrochemical response we are considering is a catalytic current, which is proportional to turnover frequency. If we consider the situation where electron transfer between the electrode and the enzyme is direct, the mid-point potential of the catalytic wave is somehow related to the reduction potential of the enzyme's active site, but it is equal to the reduction potential of the active site only in very rare situations. In most cases, the wave potential (the "catalytic potential") is a global parameter that is affected by the thermodynamics 71 and kinetics 72 of substrate binding, the kinetics and thermodynamics of intramolecular electron transfer along the redox chain that wires the active site to the electrode, 4,5 the kinetics of electrode/enzyme electron transfer 73 etc. It is now clear that catalytic potentials are parameters that may strongly depart from the reduction potential of the active site. An analogy in this respect is the Michaelis constant, which has the unit of a dissociation constant, but is not a thermodynamic parameter (cf. eqn (7b)). 59 The comparison of experimental and calculated reduction potentials may help understand how the environment tunes the redox properties of a metal center. Calculating potentials may also discriminate between several plausible mechanisms. The reduction potential is directly proportional to the free energy change associated to the redox process: where DE el is the adiabatic electron affinity of the system at the potential energy minimum of the oxidized state, DG solv is the difference in solvation free energies of the oxidized and reduced forms, and E zpe and RT ln(q) are the enthalpic and entropic contributions for the optimized structure, calculated within the harmonic oscillator/rigid rotor approximation. Due to the difference in charge between reactants and products, reduction potentials are generally strongly affected by the environment. Regarding coordination compounds, the differences in the solvation free energies of the reduced and oxidized species are usually computed using implicit solvation models, such as PCM, COSMO and COSMO-RS, 24,74 and their reduction potentials can oen be accurately computed using DFT methods (although complications arise in some class of compounds, see as an example some Cu complexes). Such calculations are more problematic in the case of metalloenzymes, because the environment of the redox centre cannot be satisfactorily described using an implicit solvation model. Therefore, the intermolecular interactions between the active site and the environment must be described with QM/MM methods where the effect of the inhomogeneous dielectric environment is treated at an atomistic level. In addition, an adequate sampling of the congurations associated with the environmental degrees of freedom can be crucial, in particular when the active site is exible or the surrounding residues adopt different conformations. In such case the harmonic approximation, which is usually assumed for calculation of vibrational entropy, is no longer justied. Adequate sampling can be achieved, for instance, with QM-and QM/MM-based molecular dynamics simulations by sampling the vertical electron affinity DE v el , 75-79 where hi O denotes the thermal average for the potential energy surface of the oxidized state. Note that the expression above is a rigorous result of classical statistical mechanics and does include all enthalpic and entropic effects (corrections for nuclear quantum effects can be added). The thermal average needs to be computed using enhanced sampling schemes such as free energy perturbation or thermodynamic integration, which are computationally expensive. However, when the uctuations of the DE v el are gaussian, 78,80 it is sufficient to carry out two MD simulations (one in the reduced state and one in the oxidized state) and take the average of the two: 77 errors. An error of 100 mV may not be acceptable considering that the biological redox scale is very narrow (most relevant reduction potentials range from À400 mV to +500 mV), and yet error of 100 mV corresponds to about 2.3 kcal mol À1 , which is well within the present accuracy of DFT methods. Therefore, results obtained from computing electron affinities, ionization energies and reduction potentials are oen more useful in a relative or qualitative manner, to distinguish among different species or reaction paths, than for the prediction of absolute values. In other words, such calculations are most useful if one aims at understanding changes in the reduction potential of a cofactor in response to point mutations or other modications of the environment, or differences in the reduction potential of the same cofactor in different proteins. 81,82 In these cases, since the QM system containing the redox active co-factor is the same and changes in reduction potential are due to different interactions with the environment only, the DFT errors are expected to cancel. Indeed, one can assume that the reduction potential differences are mostly due to the protein so that a QM calculation is no longer necessary and the reduction potential can, to rst approximation, be calculated entirely with classical force elds 81,82 or continuum electrostatics methods. 83 A recent example is the calculation of the relative reduction potentials of ten identical c-type heme cofactors bound to the deca-heme protein MtrF, 82 as reviewed elsewhere. 84 In this study classical MD simulation was employed to compute the reduction potential using thermodynamic integration. The range of potentials computed was in relatively good agreement with experiment, even though the computed potentials were microscopic reduction potentials (all other hemes remaining oxidized), whereas in experiments (protein lm voltammetry) macroscopic reduction potentials are measured (the system goes from being fully oxidized to fully reduced as the electrode potential is swept down). The effect of the oxidation state of a neighbouring cofactor on the reduction potential can be signicant, in the order of 10 to 95 meV, 85-87 but it remains typically below the statistical error caused by the nite length of the MD trajectories.

Acidity constants.
Protein folding and stability, as well as many biological protein functions such as proton and electron transfer processes, ligand binding, and proteinprotein association, are controlled by the ionization state of protein side chains. The pK a s of such acidic or basic side chain (Asp, Glu, Lys, Arg, His) are strongly affected by the protein environment, so that they can be signicantly different in the protein with respect to the value of the amino acid in solution. 88 This is particularly true for ionisable groups buried in a hydrophobic pocket. An example is given by the pK a value measured for a Lys residue inserted in the hydrophobic core of staphylococcal nuclease by site-directed mutagenesis, 89-91 which is 4.3 units lower than the pK a of Lys in water: this residue is deprotonated in the protein.
Several experimental methods, such as equilibrium denaturation measurements at different pH and potentiometric titrations have been applied to evaluate pK a s of ionisable residues in proteins. Accurate values of pK a s can be measured using multidimensional and multinuclear NMR spectroscopy, by monitoring the pH dependence of 13 C, 1 H and 15 N chemical shis and corresponding coupling constants of relevant atoms (Cg for Asp; Cd for Glu; Cd, Cd2, N32 and Nd for His, etc.) previously assigned to specic residues. [92][93][94][95] Theoretical predictions of pK a s are very useful even when experimental values are available, since they can provide a better understanding of the molecular determinants of ionization. Many different methods and levels of theory have been proposed for the calculations of pK a s. 96 However, in spite of the signicant progress since the rst work of Tanford and Kirkwood based on the Poisson-Boltzmann equation, 97 calculation of pK a s remains challenging because of the difficulties in capturing quantitatively the effects of the strong and position dependent short-range electrostatic interactions, and the nonspecic long range interactions between charged sites and with the solvent. [98][99][100] The heterogeneous response of the protein to a change in charge, which depends on the dielectric environment and the local exibility, is another difficult issue. [101][102][103] As recently reviewed, among the various methods proposed for pK a calculations, none performs signicantly better than others. 96 The most fundamental approach for describing electrostatics, as well as all other physical interactions, are quantum mechanical (QM) methods which solve the Schrödinger equation at some level of approximation. This approach can be successfully applied to small molecular systems such as single amino acids or small peptides. [104][105][106][107][108] In this case full QM geometry optimizations and vibrational frequencies calculations are carried out for the species included in a thermodynamic cycle such as that in Fig. 3.
The Gibbs free energy of reaction in solution (DG s ) is obtained as the sum of the Gibbs free energy of reaction in vacuum (DG g ) and the difference in solvation free energies (DDG solv ) where DG g + DDG solv are calculated as: pK a values can then be calculated from DG s using the equation: The main source of errors in this approach seems to arise from modelling solvation. In particular, widely used dielectric continuum models (DCM) are frequently the worse approximation for systems where short range solute-solvent interactions are important. The explicit inclusion of a few solvent molecules in close proximity to the solute in addition of using a DCM can be a way to overcome this issue, without making the calculations computationally too expensive. 109 In this respect we note that more elaborate DFT based molecular dynamics schemes have been developed for calculations of pK a values, where both the solute and a large number of solvent molecules are treated at the DFT level. 108 In addition, the accuracy of the calculated pK a s is also signicantly improved by using thermodynamic cycles that maximize systematic error cancellations. 110 The QM level of theory used in the pK a calculations is also important, as it should be feasible at reasonable computational costs for relatively large-sized systems. For macromolecular systems like proteins, using a QM method for the entire system is clearly prohibitive due to the computational cost. Most importantly, the use of QM methods is undesirable since electrostatic interactions dominate at large distances, and must be included in the calculation. An approach to overcome these issues is the QM/MM method introduced in Section 3. In this context, ad hoc computational methods for the calculation of pK a s have been recently proposed by Li and Jensen and coworkers: 111,112 one method is based on a QM representation of the ionisable residues and their immediate environment combined with a continuum description of bulk solvation with the linear Poisson-Boltzmann equation; alternatively, the QM region is surrounded by fragments described by static potentials predetermined using ab initio QM.
Several methods utilizing Molecular Dynamics (MD) and Monte Carlo (MC) simulations have recently been proposed at various levels of approximation. We recall that MD simulations are used to sample all possible conformations of a protein by calculating a long trajectory based on deterministic rules (Newton mechanics) whereas MC simulations consist in randomly generating a large number of conformations, which are accepted or rejected according to their Boltzmann probability. Recent promising models combine (i) atomistic simulations of the protein, performed using MC or MD with a xed or exible protein backbone, (ii) an implicit description of the solvent using a Poisson-Boltzmann model (PB), and (iii) a MC sampling of conformations and ionization states of the protein.
In these PB based approaches, the protein is dened as a region with a low dielectric constant embedded in a solvent with a high dielectric constant. The value of the dielectric constant of the protein is crucial for the correct prediction of pK a s. In this respect, different values have been used, from 4 to 80, 113-116 as the appropriate value depends on the distribution of polar and charged residues within the protein and on local protein exibility. [117][118][119] One of the most commonly used methods for incorporating conformational exibility into pK a calculations is the so-called Multi-Conformation Continuum Electrostatics (MCCE) method developed by Alexov and Gunner. 114,119 In the MCCE the protein side chain exibility is considered by generating several conformations for each residues which are relaxed using a force eld with Lennard-Jones and torsion energies. The resulting conformers, which represent all degrees of freedom including appropriate acid/base ionization states and side chain positions, are then subjected to Monte Carlo sampling to generate the Boltzmann distribution of conformers. A state featuring one conformer for each residue is a microstate. The energy expression to determine the acceptance for a microstate x (DG x ) is given by: where M is the total number of conformers, d x,i is 1 if conformer i is present in the microstate and 0 otherwise, m i is 1 for bases, À1 for acids and 0 for neutral conformers, k B T is 0.59 kcal mol À1 at 298 K, pK sol,i is the reference value of pK a for the group involved in the ionization equilibrium, DG p is a sum of pairwise terms independent from the other conformers of the microstate, and DG CE and DG LJ are pairwise electrostatic and Lennard-Jones energy terms which depend on the conformers selected in the microstate. Monte Carlo simulations are carried out for 15 different values of pH. The pK a of each ionizable group is then calculated from the occupancy of the ionized form in the Boltzmann distribution using the Henderson-Hesselbalch equation: in which m is equal to À1 for an acid and 1 for a base and n is the Hill coefficient reecting the degree of cooperativity between different sites. Equilibrium ionization states in proteins have also been investigated by the protein dipole Langevin technique, 103,120 and by MD based approaches using either constant-pH MD or free energy perturbation techniques. [121][122][123][124] 3.3 Kinetic parameters 3.3.1 General comments. The ability of an enzyme to catalyze a certain reaction is most easily quantied by the Michaelis parameters, K m and k cat (or v max , see eqn (7a)), obtained by tting the dependence of steady-state turnover rate on reactants concentration. The Michaelis parameters are "global" parameters, which depend on all steps in the mechanism and usually tell us very little about the mechanism and the rates of particular steps in the catalytic cycle (such as intramolecular electron or proton transfers) unless it is clearly established that one particular step fully limits turnover (about the concept of rate limiting step, i.e. the step which, if perturbed, causes the largest change in overall velocity, see the discussions of pitfalls in ref. 125 and 126). An example discussed in Section 4 is carbonic anhydrase, where proton transfer is the rate limiting step in turnover, but there are also examples where intramolecular electron transfer (ET) is rate limiting. 65 To specically learn about individual steps, the experimental method consists in triggering the cycle and monitoring the evolution of the concentration of reaction intermediates by appropriate techniques, most oen spectroscopic techniques. A kinetic model is then needed to deduce the rate constants. 87 Another general approach consists in examining how the steady-state kinetic parameters are altered when the substrate or the system is modied, for example by changing the concentrations, temperature, substrate/solvent deuteration, or using site-directed mutagenesis. In that sense, the amino-acid sequence can be considered as one of the experimental parameters which can be varied to see an effect on rates. 127 In contrast, regarding complex metalloenzymes, the calculations of rates necessarily focus on one particular step, not the entire cycle. Since reaction rates are macroscopic averages over a very large number of reactive events from the reactant basin to the product basin (and vice versa), following different trajectories, it is necessary to compute a large number of trajectories (dynamics approach) or to use statistical theories based on ensemble distributions. In particular, using MD, the reaction rates can be computed by averaging a statistically representative number of trajectories, obtained using different initial conditions, that take reactants to products. Many approaches exist to carry out such averaging procedure in practice at a reasonable computational cost: "umbrella sampling" is one such method, where the potential energy surface is biased to force the trajectories computed by molecular dynamics simulations to reach the transition state region. The most used statistical approach is grounded in the transition state theory (TST) of Eyring, according to which where DG 0 ‡ is the standard free energy of activation, directly evaluated from the barrier height, i.e., from the energy difference existing between the transition state and the preceding intermediate. However, due to the present accuracy of theoretical methods and to the approximations used to compute free energies, the comparison between computed and experimental reaction and activation energies is oen only semi-quantitative. Eqn (19) includes a prefactor, k, that accounts for barrier recrossing, nuclear tunneling and dynamical effects. 128 In general, the enzyme kinetics is the result of a large number of elementary steps, most of them reversible, each occurring with a given rate constant. This includes not only the chemical reaction steps at the active site but also the transport processes of substrates/products. For instance, proton transfer from the solvent to buried active sites occurs via a chain of proton exchanges between water and/or protonatable amino acid residues (see Section 3.3.2). Similarly, binding of small ligands to buried active sites can be described as a series of diffusive jumps between protein cavities connecting the solvent with the protein active site (see Section 3.3.3). The time evolution of these kinetic chains can be obtained by solving master equations (a set of differential equations governing the time evolution of all possible states of the system) or by using kinetic Monte Carlo methods. The latter use as input the elementary rate constants obtained for each step, for instance, from TST.
3.3.2 Proton transfer (PT) rate constants. Direct information about the rate of a PT step in enzymes can be obtained when this step is rate limiting during turnover. This is expected when the catalytic constant k cat is strongly modied either upon deuterating the substrate or when the reaction is studied in D 2 O (kinetic isotope effect, KIE). In this case, k cat can be equated to the PT rate constant. In these circumstances, the activation free energy of the PT step can be deduced from the temperature dependence of k cat . When the DpK a of the proton transfer can be altered by modifying some ionisable groups or their environment, the variation of the PT rate constant as a function of DpK a provides strong constraints for the interpretation of the PT mechanism (as exemplied with carbonic anhydrase, see Section 4.4.3).
The rate of elementary proton transfer steps taking place in enzymes is generally calculated with the expression given by transition state theory (eqn (19)).
To evaluate the activation free-energy DG 0 ‡ , a suitable reaction coordinate is chosen so as to follow the reaction progress, like the difference between the donor-proton and acceptorproton bond distances. Classical MD simulations with extensive umbrella sampling are then carried out to obtain the freeenergy prole along the reaction coordinate (called PMF, potential of mean force). DG 0 ‡ can be obtained from the difference of the PMF at the maximum (transition state) and minimum (reactant state), see e.g. ref. 129 for details. The PMF also provides the standard free-energy change DG 0 , which is proportional to the DpK a of the reaction. The parameters used in semiempirical methods must be determined through a careful calibration based on a set of experimental data. Nuclear quantum mechanical effects due to tunnelling and zero-point energies may be signicant in biological proton transfers. Various methods have been proposed to evaluate their contributions, especially in studies devoted to the interpretation of the KIE. 130 In enzymes, proton exchanges between the active site and the solvent take place through proton transfer chains made of protonatable groups, like water molecules and/or ionisable residues. The time evolution of these chains can be simulated by using various methods like the center of excess charge, the Langevin equation, or kinetic models leading to a master equation, as described in Section 3.3.3.

Rates of ligand binding and release.
Here we focus on methods for measuring and calculating the rates of binding of small ligands. In this context, the most intensively studied is CO diffusion in myoglobin, for which a wealth of experimental diffusion and binding rate constants are available for WT and mutant proteins, 131 but there has been considerable progress recently regarding intramolecular transport in hydrogenases and CO dehydrogenase.
In attempts to distinguish between the partition of the ligand between the solvent and the protein and the actual binding on the active site, it is useful to consider a two-step binding model, with the diffusion of the substrate towards a "geminate" (G) position near the active site (with a forward bimolecular constant k 1 and a rst order rate of release k À1 , dissociation constant K 1 ¼ k À1 /k 1 ) and the chemical binding/ release on the active site (with rst order rate constants k 2 and k À2 , equilibrium constant, The observed bimolecular rate of ligand binding and rst order rate of ligand release are related to the four rate constants above by These equations are obtained by assuming (i) the steady state for G, d[G]/dt ¼ 0 and (ii) that K 2 is small (k À2 ( k 2 ).
In metalloenzymes that transform small molecules like CO, CO 2 , and H 2 , putative substrate tunnels are most easily identi-ed as hydrophobic cavities in (static) X-ray structures. Xenon can be used as a probe in crystallographic studies, because it is supposed to prefer hydrophobic environments, like H 2 or O 2 ; it is of a similar size to O 2 but it is more electron-rich, thereby facilitating its detection with X-rays. We note that Xe-binding cavities may not reveal CO 2 diffusion paths because they may be too small to be used for CO 2 transport. Testing the diffusion pathways predicted from crystallographic studies usually consists in using site-directed mutagenesis to try to alter the main routes (most commonly, by increasing the bulk of the side chains that point in the channels) and examine the effect on the rates of ligand binding (see e.g. ref. 132 for a review).
One method for probing the rate of intramolecular diffusion in enzymes may consist in measuring the rate of substrate or ligand binding in experiments where the enzyme-ligand complex has a clear UV-vis signature: the ve-coordinate hemes of cytochrome c oxidase and myoglobin lend themselves to this sort of investigations. 133,134 Regarding hydrogenases, a particular method for looking at H 2 diffusion rates is based on analysing the progress of the isotope exchange reaction, whereby D 2 is irreversibly transformed into H 2 using protons from the solvent, in two steps that are catalyzed at the [NiFe] active site: Both steps are irreversible, because the solvent H 2 O provides a very large excess of H + over D + . The reaction can be monitored by using mass spectrometry to follow the change in concentration of D 2 , HD and H 2 , see e.g. Fig. 4. HD is an intermediate along the reaction pathway from D 2 to H 2 , and because the egress of HD competes with its transformation into H 2 , the slower intramolecular transport, the less HD dissociates from the enzyme's active site and the less it can be detected in the solvent. Modelling the change in HD concentration against time returns the ratio of rate of HD dissociation over H + /D + exchange at the active site. 135 Under certain conditions, 63 the data can also be used to directly measure the rate of dissociation, k H2 out . Alternatively, the information about the kinetics of ligand binding may be deduced from turnover-rate measurements: it is indeed possible to determine the rate of binding or release of a competitive inhibitor ("competitive" means that it targets the active site) by monitoring the change in turnover rate upon exposure to the inhibitor. The electrochemical measurement of the rate of binding and release of CO in hydrogenase is illustrated in Fig. 5: the H 2 -oxidation activity is measured as a current, with the enzyme adsorbed onto an electrode immersed and rotated in a solution continuously ushed with H 2 , and   View Article Online small aliquots of a solution saturated with CO are repeatedly injected in the cell. 9,135 The concentration of CO instantly increases aer each injection (the mixing time is about 0.1 s) and then decreases exponentially as CO is ushed away by the stream of H 2 . The activity decreases aer the addition of CO, and it is eventually fully recovered. We derived in ref. 60 the analytical equation that can be used to t the electrochemical data recorded aer a single injection of CO (as in Fig. 5) to measure k out and the apparent value of k in . The value of k CO out is independent of substrate concentration, but since H 2 competes with CO, the "true" value of k in is obtained from its apparent value using: An alternative strategy for characterizing the kinetics of inhibition by CO (or O 2 ) consists in tting the exponential relaxation of the catalytic current that follows a step in inhibitor concentration 136 (rather than a burst, as in Fig. 5A). This can be achieved by injecting an aliquot of solution saturated with CO and simultaneously changing the composition of the gas phase above the cell solution. In that case however, it is important to realize that the time constant s of the relaxation is not 1/k CO in,app [CO], but it is: Unless the experiment consists in monitoring the spectroscopic signature of the active site, the rates of diffusion in either direction (k CO 1 , k CO À1 ) and the rates of ligand binding and dissociation at the active site (k CO 2 , k CO À2 ) cannot be measured independently, and the meaning of the binding/release rate constants must be discussed in relation to eqn (21).
The rate of binding (k CO in ) equates the rate of diffusion towards the active site only on condition that the binding at the active site is fast In this case, the measured rate of ligand released (k out ) is the rate of diffusion out multiplied by the dissociation constant, In other words, the dissociation from the active site acts as a pre-equilibrium for the release of the ligand, as discussed in ESI of ref. 10.
Atomistic simulations, in particular MD, have most oen been used in this context independently of experimental investigations. They can give important qualitative information on intramolecular transport, such as the most likely diffusion paths within the protein and the location of key residues that guide, block or gate ligand diffusion. The simulations that have been carried out were either based on long equilibrium molecular dynamics [137][138][139][140] or on the use of enhanced sampling methods. [141][142][143][144][145][146][147][148] With computational capabilities steadily increasing in recent years, it has become possible to compute not only qualitative diffusion paths, but also energetic properties such as activation barriers 146,147 and estimates of global free energy surfaces. 144,145,148 Nevertheless, rate constants for the diffusion of gas molecules have only rarely been computed. Free energy surfaces could in principle be used to obtain approximate diffusion rates using e.g. TST for each single transition. However, as pointed out in ref. 131, there are two issues. First, the transition of small ligands between protein cavities may be strongly affected by dynamical effects leading e.g. to frequent barrier recrossings. This effect is neglected in standard TST and calculation of respective correction factors for each transition would be cumbersome. Second, for the construction of free energy surfaces collective variables need to be chosen, typically the cartesian position of the gas molecule. While this is an intuitive and suitable choice for fast transitions, it may be a poor choice for slow transitions through narrow passages where gas diffusion is coupled to ("gated by") side chain motions of amino acid residues. In this case the reaction coordinate for the diffusive transition is likely to be more complicated, involving in addition to the cartesian position of the gas molecule some suitable coordinates describing the motion of the side chain(s) in question.
Considering the above issues, it is preferable to compute diffusion rates directly without prior calculation of equilibrium free energy proles. Indeed, for relatively small proteins like myoglobin, it has been possible to obtain estimates for diffusion rates by brute force MD simulations. In the work of ref. 138, a relatively large number of trajectories of length 90 ns were generated and the rate constants estimated by counting the number of successful transitions between solvent and active site. Similarly, in ref. 140, rates for CO migration between Xe-binding sites in myoglobin were estimated from equilibrium MD simulations. The results obtained for diffusion were combined with QM calculations for CO binding, to propose a detailed kinetic model that was in reasonable agreement with available experimental data.
Brute force MD simulations are sometimes insufficient to obtain a statistically signicant number of successful transitions of gas molecules from the solvent to the enzyme active site. This can be the case for large gas-processing enzymes with active sites buried deep inside the protein, far away from the solvent. The large number of possible but unproductive pathways reduces the probability for successful entry in the active site. Other difficult cases are enzymes with very narrow passages for gas diffusion such as the [NiFe]-hydrogenase mutants studied in ref. 10. Some of us have recently developed a master equation approach with rate constants estimated from equilibrium and non-equilibrium MD simulation, that addresses the sampling problem in these systems. [149][150][151][152] The method allows us to compute diffusion rates of small ligands, even when these are very slow. Most importantly, the approach yields phenomenological diffusion rate constants, that can be directly compared to experimental rate constants. In the following we describe this computational method in more detail.
In a rst step, one runs one or several long equilibrium MD trajectories of the protein and the surrounding aqueous View Article Online solution containing 10-100 gas molecules, in the following referred to as ligand ("L"). Small diatomic or triatomic molecules penetrate the protein typically on the pico-to nanosecond time scale and quickly explore the accessible cavities and tunnels inside the protein. In a second step, the equilibrium probability distribution of the gas molecules inside the protein is obtained by dening a grid and counting the number of times a molecule visits a given elementary volume. The probability distribution is then clustered ("coarse grained") in a way such that the cluster positions coincide as closely as possible with the maxima of the probability distribution (see e.g. the spheres in Fig. 7C). These clusters are then identied as coarse "states" in a kinetic model that describes ligand diffusion as a sequence of hops between these states with rate constants k ij , where j is the initial state or cluster and i the nal state. The surrounding solvent is considered as a single cluster with rate constants for transitions to protein clusters dened similarly. In a third step the transition rates k ij are calculated simply by counting the number of transitions observed in the long equilibrium MD runs. For important transitions that are insufficiently sampled, enhanced sampling methods (such as e.g. non-equilibrium pulling) are used to obtain k ij . In the fourth step the transition rates k ij are inserted in a master equation, which is a set of coupled rst order differential equations for the population of each cluster as a function of time, p i (t), with solution where K is the rate matrix with elements [K] ij ¼ k ij , The master equation (eqn (27)) is solved for given initial conditions (e.g. by setting the gas population inside the protein to zero at time equal zero as is the case in experimental measurements) to obtain the time dependent population of the states as a function of time. For calculating the rate of diffusion to the active site, the quantity of interest is the ligand population in the geminate state, p G (t). In the h and last step p G (t) is t to the phenomenological rate law for reversible diffusion of L to the enzyme active site (rst reaction step in eqn (20)), which takes the form: Eqn (29) relates the populations obtained with atomistic MD simulation techniques to the phenomenological rate constants for pure diffusion from the solvent to the active site and vice versa, k 1 and k À1 , respectively. Within the coarse master equation scheme described above it is straightforward to include the chemical binding step, 152 For this, we dene an additional "bound" state B, in which the substrate is chemically attached to the enzyme (denoted as EL in eqn (5) above) and the corresponding population p B . The rate constant for transition from state G to B, k 2 , and for the reverse transition, k À2 , is estimated, for instance, using quantum chemical methods as described above. The dimension of the rate matrix in eqn (27) is then increased by one to include the entries for k 2 and k À2 and eqn (27) is solved for p B . A t of p B to the phenomenological rate law for reversible ligand attachment, takes the same form as eqn (29), and provides a route for calculating the phenomenological rate constants for diffusion to the active site and chemical binding, k in and for chemical unbinding and diffusion out of the protein, k out . Alternatively, the value of k in can be calculated using the steady state formulae eqn (21) (note that in eqn (32), the steadystate assumption for G is not made). In Section 4 we will discuss applications of this methodology to substrate and inhibitor diffusion in hydrogenase and ACS/ CODH, and compare the rate constants computed this way with experimental measurements.   (Fig. 6) and have attracted great attention due to the potential application of these enzymes in biotechnological energy-conversion processes. 153,154 To put the results below into context, it is important to remember that [NiFe]-hydrogenases are converted under oxidative (aerobic or anaerobic) conditions into a mixture of inactive states, two of which are referred to by the name or their EPR signatures: NiA and NiB. 156 The enzymes recover H 2oxidation activity upon reduction, NiB more quickly than NiA. 157 According to X-ray investigations, an oxygenic ligand bridges the Ni and the Fe in the inactive states. 45  cluster, which has been suggested to play a crucial role in the protection of the active site against oxidative inactivation. The [4Fe3S] cluster is linked to the protein by an unusual sixcysteine binding motif. Four of the six cysteine residues bind the cluster in the classical way, whereas one of the supernumerary cysteine residues replaces an inorganic sulde in the cubane core, and the other is terminally coordinated to one of the Fe atoms. While classical [4Fe4S] clusters are involved in one-electron transfer reactions, the proximal [4Fe3S] cluster found in some O 2 -tolerant [NiFe]-hydrogenases can attain three redox states within a redox potential span of only 150 mV (and therefore be involved in two-electron transfer reactions), although it is unclear if it is a condition for O 2 tolerance. 163 The superoxidized state is stabilized by a structural reorganization arising from deprotonation of a backbone-nitrogen atom and concomitant nitrogen coordination to one of the iron atoms. X-ray diffraction results also suggest that in the enzyme from E. coli a Glu residue is coordinated to Fe2 in the superoxidized species (Fig. 6), whereas in the membrane-bound hydrogenase from R. eutropha a dioxygen-derived oxo or hydroxo ligand replaces the Glu sidechain. 164 Different spectroscopic techniques (X-ray, EPR, Resonance Raman and Mössbauer) have been complemented by quantumchemical calculations, with the aim of disclosing structural and electronic properties of the unusual [4Fe3S] cluster. As an example, broken-symmetry DFT calculations complemented Mössbauer measurements, indicating that the superoxidized [4Fe3S] 5+ cluster can be described as a mixed-valence Fe 2.5+ /Fe 2.5+ and a diferric pair. The reduced [4Fe3S] 3+ has an electronic pattern consistent with a mixed-valence and a diferrous pair, while the [4Fe3S] 4+ state can be described as formed by two mixed-valence pairs. Even though these studies agree about the ferric character of the "special" Fe ion (Fe2 in Fig. 6), the spin coupling scheme of the four Fe atoms remains debated. DFT calculations have also been used to study some aspects of the energetics of the interconversion between the three accessible redox states of the [4Fe3S] cluster. 155,165 Many mutations of amino acids near the proximal or medial cluster increase the O 2 -sensitivity of otherwise O 2 -tolerant [NiFe]-hydrogenases, 163,164,166 and it is still unknown how the properties of the electron transfer chain make O 2 -resistant [NiFe]-hydrogenases form, upon oxidation, only a NiB state that reactivates very quickly. Certain single point mutations in D. fructosovorans [NiFe]-hydrogenase also strongly affect the rates of anaerobic formation and reactivation of the NiB state, for reasons that still need to be claried. Fourty years aer the NiA and NiB inactive states were discovered, we still need to elucidate their structures and mechanisms of formation, not forgetting that different mechanisms may operate under oxidizing aerobic and anaerobic conditions, and lead to the same inactive states. 63 In this respect, it is remarkable that the actual reaction of [NiFe]-hydrogenases with O 2 has not yet been studied computationally; this is certainly a subject for further studies.

Intramolecular diffusion in [NiFe]-hydrogenase.
The existence of a gas channel in [NiFe]-hydrogenase was recognized when a 2.54Å resolution structure of the enzyme revealed the presence of hydrophobic cavities connecting the molecular surface to the active site. A crystallographic analysis of xenon binding, together with molecular dynamics simulations of xenon and H 2 diffusion in the enzyme, suggested that these cavities were functional. 13 Comparison of amino acid sequences showed that a bottleneck at the end of this channel, near the active site, is shaped by two conserved residues, Val74 and Leu122 (D. fructosovorans numbering) 167 (Fig. 7B), and several subsequent studies suggested that the side chains of these amino acids could inuence H 2 and/or O 2 access to the active site. [168][169][170] The suggestion that bulky side chains at these positions may render certain [NiFe]-hydrogenases O 2 -resistant by preventing O 2 access, which eventually proved wrong, 10 was the initial motivation for a series of studies aimed at determining the effects of amino-acid substitutions in the channel on the functional properties of the enzyme: rates of CO binding, CO release and O 2 binding, Michaelis constant for H 2 , and catalytic "bias" (dened as the ratio of the maximal rates of H 2 oxidation and production 63 ). Some results are shown in Fig. 8, each data point corresponding to one particular mutant of the [NiFe]hydrogenase from D. fructosovorans. The mutations of amino acids in the channel change the rates of CO binding by up to a factor of 1000, but most mutations have no signicant effect on the dissociation constant for CO (Fig. 8A shows that k out is proportional to k in in this series of mutants, except for the V74Q, E and N substitutions). The comparison of the rates of reaction with CO and O 2 in this series of mutants (Fig. 8B) shows that CO This led to the conclusion that CO and O 2 diffuse within the enzyme at the same rate, but O 2 reacts slowly at the active site, suggesting that the rate of inhibition by CO is mainly determined by diffusion towards the active site: Molecular dynamics simulations of gas diffusion in [NiFe]hydrogenases gave important clues about the molecular mechanism of inhibitor transport in some of the wild type and mutant enzymes studied experimentally. [149][150][151] Before we describe the main ndings of the MD simulations, we would like to comment rst on the accuracy that one can expect from the molecular models that were used in these simulations.
A good test to assess the force eld used to describe the interactions between ligands and proteins is the calculation of diffusion constants in various solvents. The force eld models used typically reproduce the lowest non-vanishing multipole moment of the ligands in the gas phase and contain Lennard-Jones interactions sites. 149,150,152 The solvent is described with the same force eld as that used for the protein. Diffusion constants computed for H 2 , O 2 , CO and CO 2 are summarized in Fig. 9 (data taken from ref. 149, 150 and 152). The experimental values are very well reproduced, albeit not perfectly, with a mean relative unsigned error (MRUE) of 15% for water and 21% for hydrocarbons, where the average was taken over the four gases. For O 2 , additional calculations were carried out for aprotic dipolar solvents (DMSO, acetone, acetonitrile) resulting in a MRUE of 16%. While there is certainly room for further The clusters are depicted as spheres together with three typical "pathways" to the active site observed by following the trajectories (pathways 1, 2, and 3, colored in red, blue, and yellow, respectively). Cluster E in white is the cluster that gas molecules temporarily occupy before binding; cluster G in gray is the state in which a gas molecule occupies the active site cavity but is not yet chemically bound to Ni. The labels a, b, etc., denote the approximate positions of the Xe-peaks reported in ref. 13.  Energy & Environmental Science Review improvements, the results show that the performance of these simple and computationally efficient force eld models is fair. Regarding intramolecular transport in hydrogenase, the advantage of studying CO over e.g. O 2 is that CO chemical attachment to the [NiFe] active site is fast (k CO 2 [ k CO À1 ). Therefore, the bimolecular CO binding rate is a good proxy of the diffusion rate (eqn (33)), which allows for a direct comparison between simulated rates for gas diffusion and experimentally determined rates. The diffusion rates of CO in [NiFe]-hydrogenase and three mutant enzymes have been computed using the methodology described in Section 2. The results are summarized in Fig. 8C and D (data taken from ref. 150 and 151). The simulated rate constants for diffusion in the active site, k 1 , are very close to the experimental binding rate constants k in . They range from z10 4 s À1 mM À1 for the WT enzyme down to z10 s À1 mM À1 for the V74M mutant (Fig. 8D). The very good agreement obtained in this specic case, with deviations of no more than a factor of 3 in the diffusion rates, can be considered somewhat fortuitous given the imperfections of the force eld and the statistical errors due to limited sampling. However, a good order of magnitude estimate for the diffusion rate can be generally expected by such simulations. Panel D shows the experimental value of k CO out against the calculated value of k À1 , which can be interpreted using eqn (26). The observation that the data points fall reasonably well on a line of slope 1 in a loglog plot shows that the measured value of k out is indeed proportional to the calculated value of k CO À1 . We deduce K CO 2 z 10 À2 , consistent with the approximation made to derive eqn (25). Overall, regarding the kinetics of CO binding and release, the agreement between the model and the data can be taken as an indication that the assumptions underlying the model for gas diffusion developed in Section 2 are sound.
We now discuss the measurements and values of dissociation constants. 149,150 According to both experimental results and computations, CO and O 2 diffuse about equally fast to the active site of [NiFe]-hydrogenase. Every 100 microseconds a CO or O 2 molecule reaches the active site at a gas concentration of the surrounding solution of 1 mM (corresponding to a gas pressure of about 1 atm.). Conversely, it takes only about 100 ns for a gas molecule to diffuse from the active site to the solution. Interestingly, the same time scales have been reported for CO diffusion in myoglobin and for CO 2 diffusion in ACS/CODH. To rst approximation K 1 can be estimated by the ratio of volume per gas molecule in the active site cavity and in solution (since, as explained in ref. 150, the values of K 1 are mainly a consequence of the loss of translational entropy as the ligand moves from the solution to the active site cavity). For more quantitative estimates and to understand differences between ligands, MD simulations must be used to account for specic interactions with the solvent/protein. The calculated equilibrium constant for pure ligand diffusion in [NiFe]-hydrogenase obtained from MD simulations is K 1 ¼ k À1 /k 1 z 10 3 mM, and this value is very similar for H 2 , CO and O 2 .
All experimental mutations studies have focused on the V74 L122 motif and indeed it was unequivocally shown that this motif is one of the bottlenecks for gas transport. 10,151 However, some of the observed effects were difficult to rationalize. For instance, there was an absence of correlation between the diffusion rate and the "width" of bottleneck shaped by the 74-122 motif. For example, diffusion in the V74M L122A mutant is slowed by a factor of 42 relative to the WT enzyme, even though the gas channel diameter is not signicantly reduced. 135 Another puzzling observation is that the L122M V74M double mutation is less effective than the V74M single mutation even though the gas channel diameter is similar to the one for the single mutant according to the crystal structure. Simulations have shown that diffusion is in fact controlled by two rather than one motif, one between residues 74 and 476 and the other between residues 74 and 122. 151 The existence of two control points in different locations explains why the reduction in the experimental diffusion rate does not simply correlate with the width of the main gas channel measured between L122 and V74. The simulations also helped us understand how inhibitors can access the active site in certain mutants, despite the fact that the access route is blocked according to the crystal structure. 151 Considering one of the most effective mutants (V74M), we found that CO molecules reach the active site due to strong thermal uctuations of the width of the gas channel dened by M74 and L122 and through transitions that are gated by the microsecond dihedral motions of the side chain of a strictly conserved arginine (R476). These ndings suggest that attempts to further decrease inhibitor diffusion could focus on making the main gas channel, in particular the two above mentioned motifs, more rigid.

Substrate transport in ACS/CODH
Gas diffusion is also a key aspect of the reactivity of bifunctional ACS/CODH (Fig. 2C), where structural features allow for the effective transport of substrate and product molecules. In this enzyme, CO 2 diffuses from the solvent to the C cluster located deep inside the protein interior, approximately 30Å from the protein surface. The reaction product CO is then transported from the C cluster to the catalytic A cluster of ACS, that is about 70Å away. Early experimental studies demonstrated that this transport occurs without CO being released to the solvent. 177,178 More recent xenon-binding studies and calculations of cavities in the static structure [179][180][181] showed that the clusters are interconnected by a channel which extends throughout the entire length of the enzyme complex (138Å). Mutations of putative channel residues resulted in decreased acetyl-CoA production rates, providing convincing evidence that CO molecules use this tunnel. 182 On a rst view, it is puzzling that directional transport of this ligand is possible considering the thermal uctuations of the protein and the high mobility of the small CO ligand. Why doesn't CO merely take the same route out of the protein as the one CO 2 takes to reach the C cluster from the solvent? Aerall, CO is smaller than CO 2 and the path that CO 2 takes to diffuse from the solvent to the C cluster should also be accessible for diffusion of CO from the C cluster to the solvent. Molecular dynamics simulations answered this question. 152 It was shown that the hydrogen bonding network in the active site pocket accommodating the C cluster changes drastically with oxidation state. Aer formation of CO from CO 2 , the hydrogen bond network becomes stronger, preventing CO from taking the CO 2 access pathway. Hence, the change in the hydrogen bond network leads to obstruction of the CO 2 channel and enables the directional ow of CO from the C cluster, where it is produced, to the A cluster of ACS/CODH, where it is utilized.
Another puzzling question is how CO 2 diffuses from the solvent to the C cluster of ACS/CODH. Neither Xe-binding studies nor cavity calculations have given indications for a pathway connecting the C cluster and the protein surface. 181 Volbeda et al. hypothesized that CO 2 could enter the enzyme via the A cluster and travel "backward" through the long CO channel. 183 However, the mutations of channel residues do not affect CODH enzymatic activity, 182 and CO 2 transport against CO ux in the channel would require an elaborate mechanism in order to avoid unproductive molecular collisions. 183 Tan et al. suggested that CO 2 might enter the C cluster through a channel connecting the two C clusters, as shown in cavity calculation, or via a hydrophobic channel near the CODH dimer interface. 184 Finally, Doukov et al. proposed that the CO 2 diffusion path is dynamically formed by the thermal motion of the protein. 181 Recent MD simulations conrmed that CO 2 diffusion into the C cluster is facilitated by a dynamical gas channel that extends orthogonal to the static channel where Xe binds. 152 The cavities of this dynamic tunnel that are close to the active site are temporarily created by protein uctuations, and as such not apparent in available crystal structures.
With regard to binding kinetics, the experimental information is scarce. Kumar et al. determined a rate constant of 2.6 Â 10 4 s À1 mM À1 for the CO driven conversion of C red1 into C red2 at 300 K (calculated from the data measured at 5 C), 185 which provides a lower limit for the rate of diffusion of CO to the active site. One can expect that the rate is lower for CO 2 due to its larger size. The MD simulations that revealed the dynamic access channel (see above) predicted rates of k 1 ¼ 4800 s À1 mM(CO 2 ) À1 for the diffusion of CO 2 from the solvent to the C cluster and k À1 ¼ 1.5 10 7 s À1 for escape from the C cluster to the solvent. 152 Interestingly, these rates are on the same order of magnitude as those reported for CO and O 2 diffusion in [NiFe]hydrogenase (see above). Combining the MD simulations with the DFT calculations for CO 2 binding to the C red2 state of ACS/CODH, binding rates of k in ¼ 4.4 s À1 mM À1 have been estimated, similar in magnitude to the experimental turnover rate of the enzyme, k cat ¼ 1.3 s À1 . 152 It is interesting to note that the rates for diffusion (k 1 ) and for chemical attachment (k 2 ) are very similar, but k in is 3-4 orders of magnitude smaller than both k 1 and k 2 . This can be easily understood when considering the steady-state expression, eqn (21). Since diffusion out of the protein is much faster than chemical attachment (k À1 [ k 2 ), k in is given by k 2 divided by the equilibrium constant K 1 ¼ 10 3 -10 4 mM.

Transformations of the H cluster of [FeFe]-hydrogenase
Several combined electrochemical and DFT studies have been carried out with the aim of characterizing the reactivity of [FeFe]-hydrogenases (Fig. 10), even if the quantitative comparison of rate and binding constants obtained using PFV and chronoamperometry experiments with the corresponding computed data can be problematic, due to the limited accuracy of present DFT methods (see Sections 1 and 2).

Inhibition by formaldehyde.
Recently, Armstrong and collaborators observed that formaldehyde reversibly inhibits [FeFe]-hydrogenase by targeting the reduced H cluster. DFT calculations were carried out with the aim of characterizing the species formed when the enzyme reacts with the H cluster in the Hox state, or its one-or two-electron reduced forms Hox À1 and Hox À2 , respectively 187 (Fig. 10). Two possible reaction mechanisms were evaluated: (a) nucleophilic attack of a Fe d hydride species at the carbonyl group of HCOH and (b) Schiff base chemistry involving the bridgehead N atom of the dithiolate chelating ligand. Considering the hydridic reaction with HCHO at the Hox À1 redox level, the formation of methanol bound to Fe d via the oxygen atom is strongly exothermic (DE ¼ À28 kcal mol À1 , the resulting species is labelled "Hox À1 (f)" in Fig. 10). The Hox À1 state of the H cluster is therefore thermodynamically competent to bind HCHO. Further addition of an electron slightly increases the exothermicity of the reaction (DE ¼ À34 kcal mol À1 ). DFT calculations also suggested that the reaction of HCOH with the bridgehead N atom of the dithiolate chelating ligand could yield aminol intermediates (Hox(e), Hox À1 (e) and Hox À2 (e) in Fig. 10, reaction energies DE ¼ À18, À17, and À22 kcal mol À1 , respectively), which are expected to decompose in a protic environment to yield dehydrated imine species.
This investigation illustrated how the combined used of PFE and DFT allowed to evaluate plausible reaction pathways for the reactivity of [FeFe]-hydrogenases with HCOH, but also highlighted intrinsic limitations in these approaches. In particular, even if both the formation of a strongly bound methanol molecule and a Schiff base modication of the H cluster are consistent with the enzyme inhibition observed when H 2 production is monitored at very negative potentials, these scenarios are necessarily incomplete since they do not account for the observed reversibility of the inhibition process, leading the authors to conclude that the protein environment around the H cluster may play an important role in destabilizing or hindering the formation of the predicted products. 4.3.2 Inhibition by exogenous CO. The role of the protein surrounding the H cluster was clearly highlighted in a recent combined experimental and theoretical study of the reaction of extrinsic CO with the H cluster of [FeFe]-hydrogenase. 186 CO behaves as a mere competitive inhibitor when the enzyme is inhibited under very oxidizing conditions (leading to Hox(a) in Fig. 10); in other conditions, the reaction with CO is partly irreversible, 136,186 as illustrated in Fig. 11. These experiments are the same as those described to study CO binding to [NiFe]-hydrogenase in Fig. 5, but here the observation that the activity is not completely recovered aer CO is ushed away reveals an irreversible process. The data could be accurately analyzed in ref. 186 using a model that assumes that the inactive enzyme-CO complex can either dissociate or be transformed irreversibly into an inactive form.
The rate of CO binding depends on electrode potential in a sigmoidal manner, with a mid-point potential that appears to match the value expected for the Hox/Hred transition. The change in rate of irreversible transformation of the enzyme-CO complex and the change in k in occur at the same potential, which suggested that the Hred state is irreversibly degraded aer it binds CO. DFT was used to carry out geometry optimization of the partially oxidized and one-electron reduced forms of the H cluster bound to exogenous CO (such enzyme forms are termed Hox-CO and Hred-CO in the following). The experimental free energy of formation of Hox-CO, deduced from the ratio of k out over k in , is reasonably reproduced by calculation (see also ref. 189), although it must be noted that the AE2 kcal mol À1 uncertainty in the calculated value corresponds to a large difference in terms of K d , a factor of 800. The calculations showed that in Hox-CO, the H cluster is stable, while in the case of Hred-CO, the Fe p -S(Cys) bond that covalently attaches the diiron cluster to the enzyme is cleaved (leading to Hox À1 (a) in Fig. 10). This behaviour can be qualitatively rationalized simply by using electron count rules: in Hred, the iron atoms already have 18 valence electrons, a conguration which is particularly stable; upon coordination of an additional CO ligand, the weakest bond (the Fe p -S(Cys) bond according to DFT) has to be cleaved if the 18-electron rule is still to be fullled. The resulting [Fe 2 (m-SR) 2 (CO) 4 (CN) 2 ] 2À complex is a stable species, which explains why the reaction of Hred with CO is partly irreversible. Following bond rupture, the fate of the diiron subcluster should depend on the surrounding protein matrix, and Baffert and coworkers considered as unlikely that the diiron site is released from the protein because the H cluster is deeply buried and shielded from the solvent. 186 The reverse reaction, that is, transfer of [Fe 2 (m-SR) 2 (CO) 4 (CN) 2 ] 2À (a 2Fe(I) precursor of the diiron subsite) from the solvent to the active site pocket of an apo form of [FeFe]hydrogenase, has recently been observed by Happe and collaborators. 16,190 This implies that the organometallic complex is able to autonomously integrate into the protein core, and to covalently bind the [4Fe4S] subsite with concomitant release of one carbonyl ligand. However, it is believed that insertion of the 2Fe subcluster occurs through a cationically charged channel that collapses following incorporation. 191 4.3.3 Inhibition by dioxygen. As mentioned above, a topic of increasing relevance in the hydrogenases eld concerns the reactivity of these enzymes towards molecular oxygen. From the results of electrochemical measurements with the [FeFe]hydrogenases from C. acetobutylicum, Baffert and coworkers proposed that the aerobic inactivation of the enzyme occurs as a result of initial, slow and reversible formation of an O 2 adduct, followed by an irreversible transformation; when the reaction is monitored by following the change in catalytic current caused by a pulse of O 2 , the kinetic scheme that can be used to analyse the data and measure the rate constants of the three reactions is the same as that considered above for CO binding (eqn (34)). 10,192 That O 2 initially targets the distal Fe of the 2Fe subsite is clear from the observation that the competitive inhibitor CO binds on this atom in the crystal 193 and protects the enzyme from O 2 inactivation. 192,194 This is consistent with the theoretical investigation by Stiebritz and Reiher, who used DFT to examine the regioselectivity of O 2 binding. 195,196 A subsequent study by Hong and Pachter, based on both MD simulations and DFT calculations, corroborated such picture. 197 Blumberger and coworkers investigated the kinetics of the initial O 2 binding step using DFT calculations: 198 by parametrizing a range-separated density functional using high-level ab initio data as a benchmark, they could compute an activation free energy barrier of 13 kcal mol À1 for O 2 attachment to Fe d , and a binding free energy of À5 to À7 kcal mol À1 . The rate of O 2 binding could then be calculated from eqn (21) above. Converting the computed free energies into k 2 using TST and adopting values for k 1 and k À1 from MD simulations for [NiFe]-hydrogenase, they obtained values for k in of 3.6 s À1 mM À1 and 1.2 s À1 mM À1 for Cp and Dd enzymes, respectively, in fair agreement with the experimental values (2.5 s À1 mM À1 and 40 s À1 mM À1 , respectively 10 ). The reason the kinetics of O 2 binding and release is different in the three homologous [FeFe]-hydrogenases for which such data have been published 10,136,192 remains to be claried.
In contrast with the above experimental and theoretical evidence that O 2 targets the distal Fe on the 2Fe subcluster, X-ray absorption measurements indicated that the main structural consequence of the exposure to O 2 is oxidative damage of the [4Fe4S] subcluster. 194 Armstrong and coworkers concluded that the destruction of the 4Fe subcluster follows up O 2 binding at the catalytic site of [FeFe]-hydrogenase and proposed two   198 This working hypothesis should now be tested by characterizing the kinetics of inhibition of these mutants.
Taken as a whole, the above results raise hopes that the dialogue between theory and experiments in this challenging case of protein engineering will provide even more fruitful outcomes in the near future.
4.3.4 Flexibility of the H cluster. In a very recent contribution of ours, 8 electrochemistry was combined with site directed mutagenesis and quantum and classical calculations to dissect the steps leading to oxidative inactivation of [FeFe]hydrogenases and learn about the binding of H 2 to the active site of [FeFe]-hydrogenase. By discussing in details hereaer the path that led us to the proposed mechanism, we intend to illustrate the potential synergy in combining computational and experimental approaches.
The key issue in this study was to rationalize the occurrence of different intermediates formed when [FeFe]-hydrogenase are oxidized in the presence of H 2 . Spectroscopy is difficult to use in this context, since turnover prevents equilibrium from being reached under these conditions, but a redox titration of the enzyme from C. reinhardtii followed by FTIR showed that full oxidation in the absence of H 2 destroys the H cluster, 44 and PFV experiments demonstrated that if H 2 is present, the enzyme inactivates reversibly (at least partly reversibly) at high potential. 202 Previous experiments on bio-inspired model complexes 203 suggested that oxidation of Hox prior to H 2 binding could trigger coordination of the pendant amine in the H cluster to the distal iron atom (Fe d ); formation of such bond would inactivate the enzyme by preventing H 2 binding to Fe d . We considered an intramolecular reaction of this kind as a rst hypothesis for the mechanism of reversible oxidative inactivation, but we had to rule it out based on the results of DFT calculations. Indeed, a small model of the H cluster in overoxidized state did show barrierless formation of Fed-N bond along geometry optimization, but no such bond is formed when relevant portions of the protein are also included in the model. 204 Chronoamperometry experiments can be analysed in a qualitative manner, to observe that the enzyme activates or inactivates, but we have also developed methods for precisely measuring the rates of the transformations in experiments where the electrode potential is repeatedly stepped up and down to trigger (in)activation. 6,205,206 Analysing experiments such as those in Fig. 12C, we demonstrated that [FeFe]-hydrogenase undergoes both reversible and irreversible inactivation at high potential. The activity loss, evidenced by a decrease in H 2 oxidation current, is clearly bi-exponential, which we interpret as an evidence that the active species (Hox) reversibly converts into two inactive species. Moreover, the dependence on pH of the two rate constants of reactivation of the inactive states indicates that the formation of each inactive species corresponds to a oneelectron oxidation of the active site that is coupled to the loss of one proton. This is remarkable because the H cluster has only one acidic proton, on the pendant amine, which should be tightly bound (indeed, DFT calculations suggest that deprotonation leads to the cleavage of one of the C-S bonds within DTMA). Our observation is therefore inconsistent with the former hypothesis that inactivation results from the intramolecular binding of the nitrogen atom of dtma.
In search of the origin of the two protons released upon oxidation, we were tempted to consider that inactivation could result from the binding of a water molecule; indeed, Fe d -OH 2 bond formation would make water signicantly acid. An oxygen atom bound to Fe d is present in the models of the crystal structure of Clostridium pasteurianum [FeFe]-hydrogenase, 201,207 but it was difficult to imagine that there could be two isomers of this water complex, whereas the electrochemical data clearly reveal the formation of two distinct, inactive species.
Another ligand whose acidity increases upon metal binding is H 2 . With this idea in mind, we examined the dependence of the reversible inactivation rate constants on H 2 partial pressure; the experimental results in Fig. 12D showed that the two rate constants of inactivation are proportional to H 2 concentration, meaning that regarding each of the two inactive species, H 2 binding is actually the rst step of the inactivation reaction. This led us to look for three distinct modes of H 2 binding to the H cluster, two of which would be non-productive. Since the only vacant coordination site is on Fe d , we hypothesized that alternative coordination sites may be created aer the movement of the intrinsic CO ligand that is bound to Fe d to an axial position, and/or the movement of the bridging CO to a terminal position on Fe d (Fig. 10, Hox(b), (c) and (d) isomers); we considered as unlikely that the CN À ligand on Fe d would move, because it is bound to a conserved lysine residue by a hydrogen bond. 208 Inspection of the protein crystal structure indicates that the interconversion among the three possible conformers of the H cluster may be impeded by the presence of the bulky side chain of a conserved phenylalanine residue (F234 in Cr hydrogenase) shown in Fig. 12A. However, molecular dynamics calculations showed that thermal uctuations of the structure are sufficiently large to allow the movements of iron-bound carbonyl ligands and the isomerisation of the active site (Fig. 12B). We used DFT to describe the intermediates involved in the inactivation process. DFT conrmed (through the comparison of reaction energies) that H 2 binding can occur not only on the "normal" binding site (Fig. 12E), but also on two minor (i.e. higher in energy) vacant coordination sites (Fig. 12F & G). The calculations suggest that the H-H bond is cleaved in all cases (Fig. 10), but binding on the abnormal sites leads to species that are essentially inactive, because no base is sufficiently close to the coordinated H 2 molecule to quickly accept the proton that is produced upon heterolytic cleavage.
This mechanism is supported by other experimental ndings, such as the effect of replacing phenylalanine with tyrosine (which prevents isomerisation and slows down reversible inactivation), and the fact that the two inactive states are protected against O 2 attack (the coordination sphere of Fe d is complete when H 2 binds to abnormal positions). The electrochemical data also show that the two H 2 -bound oxidized forms are not destroyed at high potential, unlike the fully oxidized H 2 -free H cluster (Hox +1 (b) in Fig. 10); this is consistent with the previous observation of Lubitz and coworkers; 44 we therefore hypothesised that the oxidative, irreversible inactivation arises from the attack of the distal Fe d by a nucleophilic molecule (e.g. water) which competes with H 2 binding.
Overall, this is a case where all experiments and calculations converge on the conclusion that the H cluster is more exible, and its chemistry more versatile, than had been anticipated based on the crystal structure. This may be relevant in the case of other inorganic active sites.

Long range proton transfer (PT)
The catalytic cycles of the redox enzymes that we discuss here involve transfers of protons and electrons over long distances, between the active site and the solvent or the redox partner. The crystal structures of hydrogenases and CODH immediately give the information about the electron transfer pathways, which is a chain of FeS clusters. However, experimental information about the kinetics of elementary ET steps along this chain is scarce, 127,212 and calculations of ET rates virtually non-existent (this contrasts with the situation where hemes mediate long range electron transfer 84,[213][214][215][216][217][218]. In contrast, the path taken by protons cannot always be deduced from the X-ray structure in a straightforward manner, and calculations by different authors sometimes give different results (in the case of [NiFe]-hydrogenase, the PT pathway is still elusive). Measuring the rate of PT in an enzyme is straightforward only if PT is the rate limiting step, as occurs with carbonic anhydrase. Using site-directed mutagenesis to identify a PT pathway may prove particularly challenging, as illustrated below.

[FeFe]-hydrogenases.
Based on the initial structure of the enzyme from C. pasteurianum, Peters and coll. suggested that Cys299 could act as a proton donor for the formation of dihydrogen, and identied a putative PT pathway connecting the protein surface to C299, involving two Glu residues, a Ser residue, and a water molecule, 207 as shown in Fig. 13A. This pathway has been widely supported by calculations 219-222 and site-directed mutagenesis studies. 223,224 Hong and collaborators 219 used a combination of DFT and QM/MM molecular dynamics simulations to study plausible PT pathways from the enzyme surface to the H cluster. Although free energies were not computed and therefore this study provided only qualitative information, the results are consistent with experimental evidences, and suggest a mechanism in which protons move from E282 to E279 via S319 and from E279 to C299 via water612. Ginovska-Pangovska et al. 220 carried out a series of classical MD simulations in the wild type enzyme, as well as in a series of mutants, starting from the assumption that a well-dened and stable hydrogen bonding network is fundamental for efficient PT. Their results also support the pathway shown in Fig. 13A and suggest the existence of a persistent hydrogen bonded core (residues C299 to S319), with less persistent hydrogen bonds at the ends of the pathway for both H 2 release and H 2 uptake. Long et al. 221 combined classical MD simulations, free energy perturbation and QM/MM calculations to quantitatively investigate the kinetics and thermodynamics of the PT pathway described by Hong and collaborators. 219 It turned out that the side chains of E279 and E282 could adopt two different conformations, depending on their protonation state, and are well suited to play the role of proton shuttles. In particular, a proton from bulk water can enter the protein through E282, and then be transferred to C299 via pathways that involve E279 and S319. The importance of S319 and C299 was supported by running calculations with in silico mutants: according to the analysis of the QM/MM MD simulation trajectories, the S319A and C299S mutations prevent PT during the simulation time. 219 The effect of single substitutions (C299S, E279D and E282D) has also been assessed in silico by examining the disruption in the hydrogen bonding network. 220 The C299S mutant is indeed inactive in H 2 evolution according to three independent investigations. [223][224][225] The enzyme retains activity only when C299 is replaced by aspartic acid. 225 It has been observed that the C299A, C299S, E279D, E279L and S319A mutants have no H 2 evolution activity, but 5 to 30% residual H 2 oxidation activity, whereas the E282D and E282L mutants have 5-30% residual activity in both directions. 223 The authors rule out the relevance of a second putative PT pathway starting from C299, passing through several modelled water molecules and S298, and ending at the nonconserved K571 residue at the enzyme surface by showing that S298 is not critical for activity (the S298A mutation has no effect).

[NiFe]-hydrogenase.
The situation is far less consensual in the case of [NiFe]-hydrogenases; this example illustrates the limitations of both the experimental and theoretical methods for studying the kinetics of PT in complex enzymes, and the difficulty in combining the information in that case. Many distinct PT pathways have been proposed based on the examination of the X-ray structures of the [NiFe] enzymes from Desulfovibrio species, 45,211,226 E. coli 227 and Hydrogenovibrio marinus. 153 Fig. 13B only shows those that have been selected in computational studies. [209][210][211] According to these results, PT to/ from the active site occurs either between the sulfur atom of a cystein ligand to the Ni and E25 of the large subunit, or between the Ni and R476 of the large subunit (D. fructosovorans numbering). Both amino acids are fully conserved. In this section, we indicate by (L) or (S) the location of the amino acids in the large or small subunit of the enzyme.
On the basis of calculations, and assuming E25(L) as the starting point, complete pathways have been proposed (Fig. 13B), using structures of enzymes from D. fructosovorans, 209 D. gigas 210 and D. vulgaris. 228 In particular, Baptista and collaborators 210

Review
Energy & Environmental Science the pK a values of protonatable groups within the protein, whereas the distance-based network analysis was used to nd likely pathways for the proton transport. A PT pathway was proposed between the active site and the surface that mainly involves glutamate and histidine residues: E18(L), H20(L), H13(S), E16(S), Y44(S), E46(S), E57(S), E73(S), and some water molecules (purple in Fig. 13). Fdez Galván et al. 209 carried out a QM/MM study of the [NiFe]-hydrogenase from D. fructosovorans, computing reaction and activation energies for plausible pathways. In this case, the calculations were carried out not only on the crystallographic structure, but also considering several structures of the protein obtained from MD simulations. Higher level quantum chemical (DFT) corrections were also made to some of the calculated energy proles. The pathway characterized by the most favorable energy prole involves PT via E25(L), E16(S), and E46(S) (green in Fig. 13), and corresponds approximatively to the pathway proposed by Teixeira et al. A second pathway (blue in Fig. 13), which involves E25(L), H549(L), and E53(L), was characterized by a less favorable reaction energy prole. Notably, Fdez Galván et al. underlined that the results obtained in their work, as well as in the study by Teixeira et al., 210 could not be considered conclusive because only a limited set of possible pathways was examined. In addition, only "static" pathways were considered, not considering possible alternatives forms produced by medium-or large-scale movements of the protein. In a later work, Summer and Voth 228 studied PT in D. vulgaris Miyazaki F [NiFe]-hydrogenase using multi-state empirical valence bond (MS-EVB) reactive MD simulations, coupled to an enhanced path sampling methodology. MS-EVB, which is a molecular-mechanics approach that dynamically allows chemical bonds to break and form during MD simulations, was coupled with metadynamics, which can be used to nd complex, nonlinear minimum free-energy pathways. In contrast with the previous computational studies, this methodology allowed to nd unbiased PT pathways, i.e. without making a priori assumptions. Each simulation was initialized with a hydronium near residue E34(L), which is the assumed initial site in the PT chain, and three PT pathways were found. The preferred pathway, as deduced considering the frequency with which this pathway was found in all active site geometries and oxidation states under consideration, is in agreement with previous proposals, and involves H13(S), E16(S), T18(S), H36(L), E46(S), E57(S), and E75(S). Notably, the residues E16(S), T18(S) and E75(S) (E16(S), T18(S), and E73(S) in D. gigas) are conserved in the [NiFe]-hydrogenases from all Desulfovibrio species. A completely different pathway starting with R476 of the large subunit has been proposed on the basis of the examination of the structure of D. desulfuricans hydrogenase 211 and recently supported using calculations with the structure of the [NiFe]-enzyme from D. vulgaris. 229 The observation that the sequences of the large subunits of almost all membrane-bound [NiFe]-hydrogenases show a highly conserved histidine-rich region, prompted Kovács and collaborators 229 to carry out a computational and experimental study on the [NiFe]-hydrogenase from T. roseopersicina. Only two of these conserved histidines are present in the cytoplasmic hydrogenase (H104 and H110, in T. roseopersicina, H124 and H130 in D. vulgaris). Since the structure of the enzyme from T. roseopersicina has not yet been determined, and considering that a homology model could not be used to propose possible proton-hopping mechanisms due to the fact that the positions of structural water molecules could not be predicted, the authors analyzed the X-ray structure of the [NiFe]-hydrogenase from D. vulgaris Miyazaki F. The protonation state of the aminoacids at pH 7.4 and the preferred orientation of the structural water molecules were predicted minimizing the total free energy of the system. Based on the analysis of networks of hydrogen bonds, it was concluded that, among the conserved His residues, only H104 plays an important role in the enzyme function, suggesting that this residue could be part of an alternative PT route involving R487, H104 and D103.
The above results obtained in the computational investigations highlight peculiar problems connected to the prediction of PT pathways. First of all, it should be noted that to properly model PT in proteins one should not only take into account proton migration between different sites (which is a reactive event), but also consider the dynamics of the protein, and the possible involvement of solvent molecules in the PT chain. The most rigorous approach to study such process in an unbiased way would imply to use QM methods to model both the reactive and dynamical behavior of the system, which is clearly prohibitive. Therefore, in a more realistic way, PT pathways have to be studied using complex ad hoc computational schemes, either by postulating a priori possible pathways, or without any bias but using a more qualitative level, which necessarily includes some empirical parameters. In this context, the different computational approaches discussed above have been very helpful for the suggestion and evaluation of plausible PT pathways, even if the comparison of results obtained from different methods is challenging.
It is not easier to discriminate between the three main putative pathways in [NiFe]-hydrogenase using site-directed mutagenesis.
The hypothesis that the rst PT relay is E25(L) was supported by site-directed studies, showing that replacing E25 with a nonprotonatable glutamine abolishes PT in the enzyme from D. fructosovorans 230 and the hydrogen-sensor hydrogenase from Ralstonia eutropha. 231 That the active site is functional in the two E25(L)Q mutants was conrmed by the observation that they retain the ability to convert ortho and para dihydrogen. 232 In contrast, the relevance of the rightmost pathway in Fig. 13B is supported by the characterization of site-directed mutants of the enzyme from T. roseopersicina: the replacement of E14 (E25(L) in D. fructosovorans) with a glutamine results in only a two-fold decrease of the H 2 oxidation/production rates, 229 whereas the D103L, H104A and H104F have little activity (D. fructosovorans numbering R476(L), D123(L), H115(L)). The function of the arginine shown in Fig. 13B cannot be tested by site-directed mutagenesis: the attempts to produce the R476K and R476L mutants of D. fructosovorans [NiFe]-hydrogenase and R487I of T. roseopersicina hydrogenase failed, the bacteria did not produce a mature form of the enzyme (unpublished results of ours and ref. 229). Since the amino acids involved in the two pathways are present in both hydrogenases (T. roseopersicina and D. fructosovorans) it is unclear how a single pathway can be functional in each enzyme.
Overall, testing the putative PT pathways in hydrogenases and other complex metalloenzymes appears to be very difficult for a number of reasons.
(1) In contrast to the case of carbonic anhydrase discussed below, there is no indication that PT limits the rate of H 2 oxidation, H 2 production or isotope exchange in WT [NiFe]hydrogenase. This implies that a mutation that decreases slightly or increases the rate of PT may have no apparent effect. There is no experimental method that measures the rate of single PT events in hydrogenases. Any quantitative comparison between the results of computational and experimental characterization of site-directed mutants is therefore impossible.
(2) Unlike electron transfer pathways, the putative PT pathways may be highly ramied. Even the E25(L)-E57(S) pathway depicted in Fig. 13B involves many parallel routes. If they were functional, this would imply that several branches of a ramied pathway have to be blocked in a single mutant in order to observe an effect.
(3) Another problem is very general regarding studies based on site-directed mutagenesis: it is not always possible to make sure that substituting an amino acid has no side effects. There are examples in the literature where a mutation intended to interrupt a PT pathway does not have the expected effect because a water molecule is stabilized in the mutant and substitutes for the missing side chain. 233 It may also be that structural rearrangements remote from the site of the mutation disrupt the hydrogen bond network or create new PT pathways. Regarding the works cited in this section, none of the mutants have been crystallized to make sure that such effects are not the cause of the observed phenotypes. The (unpublished) observation of ours that the E46(S)Q mutant of D. fructosovorans [NiFe]-hydrogenase has approximatively 50% of the WT H 2 oxidation/production rates may suggest that E46(S) is not essential, but only if we can rule out the above mentioned artifacts. In contrast, we have observed that the E16(S)Q and E16(S)V mutations in D. fructosovorans [NiFe]-hydrogenase severely affected both H 2 production/oxidation and isotopeexchange activity (unpublished), but it is not unambiguous evidence that PT is impaired in these mutants.
Worse, a mutation design to assess a PT pathway may also affect steps others than proton transfer. The T18(S) amino acid shown in Fig. 13B is next to a cysteine ligand of an electron transfer cluster (C19(S)) and its backbone shapes the substrate gas channel. Replacing T18(S) may affect the activity in a way that is mistakenly interpreted as revealing the disruption of a PT pathway.
(4) Last, the mutation of a side chain putatively involved in PT sometimes prevents protein folding. We have not been able to replace H549(L), which is a direct ligand of a putative Mg ion (turquoise in Fig. 13B) and appears to have a structural role. We also failed to discriminate between the two pathways starting with E25(L) by examining the effect of replacing the E57(S) and H549(L) residues, because the mutants we constructed could not be produced (E57Q, H549R, H549Q, H549V in D. fructosovorans [NiFe]-hydrogenase; unpublished results).
Overall, regarding PT in hydrogenases, we must acknowledge that rather limited results have been achieved since it became possible (in the late 1990's) to use site directed mutagenesis to test the numerous putative pathways detected in the crystal structures. This is because there is no direct measurement of the rate of PT in hydrogenase, and no strong conclusion about the effect of a mutation can be reached if the mutant of interest is not fully characterized using crystallography, spectroscopy, kinetic methods, etc.
4.4.3 PT in carbonic anhydrase. Carbonic anhydrase is one of the rare enzymes in which the chemical step of catalysis is so fast that intramolecular PT is rate limiting, which allows this transfer to be studied in detail.
The mammalian enzyme carbonic anhydrase II (CAII) catalyses CO 2 hydration into HCO À 3 and the reverse reaction, which are involved in various physiological processes: The active site is a Zn centre coordinated by three His nitrogen atoms and one OH À ligand. The catalytic cycle includes a chemical step: (36) and the transfer of the extra proton to a buffer base B through His64: ZnH 2 O + His64 # ZnOH À + His64, H + (37a) His64, H + + B # His64 + BH + (37b) The maximum turnover rate of CAII is 10 6 s À1 for the hydration reaction and 5 Â 10 5 s À1 for the inverse reaction, which makes this reaction one of the fastest catalyzed reactions. Both rate constants were shown to decrease four-fold when the reactions took place in D 2 O and it was soon demonstrated that intramolecular PT (reaction (37a)) is rate limiting. The activation free energy deduced from the temperature dependence of the catalytic constant of the hydration reaction is DG ‡ ¼ 9.0 kcal mol À1 at 25 C. Carbonic anhydrase is one of the rare enzymes in which the chemical step of catalysis is so fast that intramolecular PT is rate limiting, which allows this transfer to be studied in detail. Several mutated forms of the enzyme were prepared to elucidate the various factors which determine the PT rate. Replacing His64 by alanine decreased the rates 20-fold, but they were restored when proton donors like imidazole and pyridine were added to the solution. In another series of mutants, the variation of the PT rate as a function of DpK a ¼ pK a (ZnH 2 O) À pK a (His64) could be studied. Similar studies were carried out on an isoenzyme, CAIII, were residue 64 is a lysine and proton transfers are two orders of magnitude slower. The crystal structure of CAII revealed that Zn and His64 are 7Å apart and that they are connected by a network of hydrogen-bonded water molecules. Moreover, the orientation of His64 with respect to the active site can easily change from inward to outward, Fig. 2D. This rich set of data has motivated a number of theoretical studies. To specify the role of His64, the PMF was rst calculated with and without His64. 234 The calculated values of pK a (ZnH 2 O), pK a (His64) and DG ‡ were in good agreement with the data only when His64 was present in the inward orientation. The effect of the His64 to Ala mutation and the rescue by imidazole were also studied. Warshel's group was able to reproduce the variation of the PT rate as a function of DpK a in CAIII by using a simplied PT chain made of His64, a water molecule and ZnOH À (ref. 235). These studies underscore the importance of thermodynamic factors like the pK a s of His64, of water molecules and more importantly of ZnH 2 O, which must be close to 7 to ensure catalysis in both directions.

Conclusion
In this review we have shown how experimental and computational information can be combined to obtain mechanistic insight into enzyme catalysis that would not be possible to achieve by experimental or computational work alone. We have illustrated this point by discussing a few selected examples, focussing on enzymes that are relevant in the context of renewable fuel production: hydrogenases and carbon-monoxide dehydrogenase.
For reasons that we have discussed in the introduction, the catalytic mechanisms of enzymes that use inorganic active sites, such as those discussed in this review, is oen very difficult to study. Theoretical methods have been invaluable for predicting and understanding the molecular structure of intermediates by calculating their spectroscopic signatures, but the interplay between experiments and theory should also be useful for learning about the reactivity of these intermediates, and, about the kinetics of their chemical transformations. In particular, with regard to enzyme kinetics, the synergy arises because experimental methods typically report phenomenological rate constants characterising the overall process, whereas computational methods can help disentangle them to a set of rate constants of well dened elementary reaction steps. The ability to devise well dened model system, with innite spatial resolution, is probably the greatest advantage of computational methods. Finding ways to connect computations on molecular models to actual experimental observations is arguably the greatest challenge.
Regarding the relative contributions of experimentalists and theoreticians in the elucidation of enzyme mechanisms, the question of which of the two plays the most important role is awed, because the question assumes a two-step strategy where an initial proposal is simply followed by conrmation or refutation. In this review, we have attempted to describe another strategy where experimentalist and theoreticians work hand in hand and combine their expertise to obtain an answer more quickly and, hopefully, spread it over fewer papers.