Computational modelling of oxygenation processes in enzymes and biomimetic model complexes

With computational resources becoming more eﬃcient and more powerful and at the same time cheaper, computational methods have become more and more popular for studies on biochemical and biomimetic systems. Although large eﬀorts from the scientific community have gone into exploring the possibilities of computational methods for studies on large biochemical systems, such studies are not without pitfalls and often cannot be routinely done but require expert execution. In this review we summarize and highlight advances in computational methodology and its application to enzymatic and biomimetic model complexes. In particular, we emphasize on topical and state-of-the-art methodologies that are able to either reproduce experimental findings, e.g. , spectroscopic parameters and rate constants, accurately or make predictions of short-lived intermediates and fast reaction processes in nature. Moreover, we give examples of processes where certain computational methods dramatically fail.


Introduction
Nature has a large arsenal of efficient catalysts that are able to perform regioselective as well as stereospecific substrate conversions. These chemical systems, i.e. enzymes, work at ambient temperature and pressure and have specific functions such as biodegradation, bioconversion or biosynthesis of compounds. Enzymatic reactions influence the general health and well-being of the organism and as such are important. Often the enzymes contain a transition metal active site, where the catalysis takes place. As iron is abundant relative to other metals on planet Earth, many enzymes utilize iron in their active site, 1,2 but there are also numerous examples of copper-, 3 vanadium-, 4 molybdenum-, 5 and other transition metal-containing enzymes as well as non-transition metal containing enzymes. In addition, enzymes sometimes use clusters of two or more transition metal atoms as catalytic centres, as is, for instance, the case in the diiron enzyme ribonucleotide reductase 6 or the multi-metal cluster in photosystem II. 7 The mononuclear iron-containing enzymes can be split into heme 1 and nonheme 2 categories of which we show the active site structure of two typical examples in Fig. 1: namely for the heme enzyme cytochrome P450 and the nonheme iron enzyme taurine/a-ketoglutarate dioxygenase. 8,9 The crystal structure coordinates of the former were taken from the human enzyme cytochrome P450 2A13, which is a liver enzyme involved in the detoxification of endogenous compounds. The heme group is linked to the protein via a direct bond of the metal with the thiolate group of Cys 439 , the axial ligand. It has been proposed that this axial ligand fine-tunes the electronic properties of the oxidant and gives it the functional properties to act as a monooxygenase. 10 In contrast, other heme enzymes, such as peroxidases, typically have a histidine group as the axial ligand, 11 while catalases use a tyrosinate group. 12 As usual, key intermediates of the catalytic cycles of enzymes are short-lived, and consequently, difficult to detect and characterize. This means that experimental work in the field often needs to be complemented by theoretical studies. Computational modelling has been instrumental in assisting experiment into establishing the fundamental properties and changes upon ligand substitution in heme enzymes and their catalytic features. 13 For instance, more than 10 years before the first characterization of the active species of P450 enzymes, theory proposed a high-valent iron(IV)oxo heme cation radical (Compound I) that was found to react with substrates via radical mechanisms with low barriers. 14 Sam P. de Visser Sam de Visser graduated with a BSc in Analytical Chemistry from the College of Zeeland (the Netherlands) in 1990. He then moved to the University of Amsterdam (the Netherlands), w h e r eh eo b t a i n e da nM S ci n Physical Chemistry (1993) and a PhD in Organic Mass Spectrometry (1997) with Prof. Nico Nibbering. During his PhD studies he got in touch with computational modelling and moved into this field and specialized further during two postdoctoral positions at King's College London and the Hebrew University of Jerusalem. Since 2004 he has been a lecturer in Biophysics at the University of Manchester (United Kingdom) and is based at the Manchester Institute of Biotechnology. His research interests are in the field of biomimetic and metalloenzyme reaction mechanisms using QM-cluster as well as QM/MM methodologies. He has experience with heme enzymes (cytochromes P450, peroxidases, catalases) as well as nonheme iron dioxygenases. In addition, his group works on a range of transition metal containing biomimetic models in combination with experimental studies, where the calculations focus on reaction kinetics and spectroscopic characterization.

Matthew G. Quesne
Matthew Quesne completed his undergraduate studies from the University of Lancaster (United Kingdom) with specialization in Biochemistry. He then moved to the University of Manchester, w h e r eh es t a r t e dh i sP h Ds t u d i e s under the supervision of Dr Sam de Visser in September 2009. He has experience with computational modelling using QM-cluster and QM/MM techniques. His research interests include the understanding of biological reaction mechanisms of nonheme iron hydroxylases and halogenases. In addition, he has worked on synthetic model complexes of these nonheme iron systems.

Bodo Martin
Bodo Martin obtained a diploma in Chemistry from the Friedrich-Alexander University in Erlangen (Germany) in 1999, and a PhD in Organic and Computational Chemistry from the same University in 2004. In 2002 he joined the workgroup of Peter Comba at the University of Heidelberg (Germany). He is interested in modelling transition metal complexes using ab initio and DFT methodologies as well as classical calculations of geometries, reaction mechanisms, and spectroscopic parameters. He is also involved in method development of novel classical force fields and property models for inorganic compounds.

Peter Comba
Peter Comba obtained a diploma in Chemistry and Chemical Education from the ETH Zürich (Switzerland) and a PhD from the Université de Neuchâtel (Switzerland). After a Research Fellowship at the Australian National University, a postdoc at the University of Lausanne and a junior group leader position at the University of Basel (Switzerland), he joined the University of Heidelberg (Germany) in 1992. He is interested in fundamental properties of transition metal coordination chemistry, involving ligand design and synthesis, preparative coordination compounds, spectroscopy as well as theoretical and computational inorganic chemistry, with projects in bioinorganic and medicinal inorganic chemistry, molecular magnetism and molecular catalysis.
Only a few years ago, Rittle and Green characterized Compound I using electron paramagnetic resonance (EPR), absorption and Mössbauer spectroscopy. 15 As a comparison, we also show in Fig. 1(b) the crystal structure of a nonheme iron enzyme, where the metal is bound via a characteristic facial triad represented by two histidine and an aspartate amino acid. This enzyme is involved in the biodegradation of taurine and functions as a sulphur source. Also here, computational modelling assisted in establishing differences in the properties between heme and nonheme iron enzymes and the origin of the efficiency of hydrogen-atom abstraction. 16 However, axial and equatorial ligand effects in substrate oxidation by heme and nonheme iron complexes have been extensively reviewed previously and will not be covered here. 17 It is clear from these examples that theory can assist experimental studies related to biocatalysis and biotransformations and give insight into reaction mechanisms and pathways. Unfortunately, these studies are not routinely done, but may involve caveats and pitfalls. To fully appreciate and understand the computational decisions on method and model selection, we review here computational approaches with relevance to transition metal catalysis. We will start with a brief overview of the basic methods in the field and their general accuracy and reproducibility and then give some recent examples on how theory guided and assisted experimental work.

Quantum mechanics and density functional theory
According to quantum mechanics (QM), all properties of a molecular system can be obtained from the wave function (C), which is the solution of the Schrödinger equation, 18 eqn (1), and expresses the total energy of the molecule (E) as an eigenfunction of the Hamilton operator (H).
The Hamilton operator includes the kinetic and potential energy contributions of all the electrons and nuclei within a molecule as described in eqn (2) and (3). 19 Here, h is Planck's constant divided by 2p, m e is the electron mass, e is the elementary charge, e 0 represents the permittivity in vacuum, and r Ij are the distances between nuclei and electrons, whereas the distances between two electrons and two nuclei are described by r ij and R IJ , respectively. The Hamiltonian in eqn (2) is split into four components that represent the kinetic energy of the electrons, the electron-nucleus attractions, the electron-electron repulsions and the nucleus-nucleus repulsions, respectively. Within the Born-Oppenheimer approximation the nucleus-nucleus interactions are constant and the electronic energy E is calculated for a rigid molecular geometry.
The solutions to the Schrödinger equation clearly depend on the number of electrons (N e ) and the number of nuclei (N p )i n the chemical system and will increase dramatically in complexity as the size of the system expands. Because of its complexity, the Schrödinger equation can only be solved exactly for the hydrogen atom and the H 2 + ion, while for any larger chemical system approximations need to be used. This has led to the development of a large range of computational chemistry methods that all have different features and accuracies. For accurate results, thorough benchmarking and calibration is required, especially for systems that contain transition metals whereby the consistency and reproducibility of methods is tested by comparison to higher-level methods or against experimental data, as shown in several examples below.
The most basic solution to the Schrödinger equation comes from Hartree-Fock (HF) theory, in which it is assumed that each electron moves in the average field of all the other electrons and the one-electron wave functions are expanded into a set of known (hydrogen-like) orbitals: the basis set. The solution of the Schrödinger equation is not the end of the calculation as shown schematically in Fig. 2. Thus, after one has created a starting geometry and selected a method, i.e. a Hamilton operator description, and basis set, or wave function model, one can solve the Schrödinger equation and obtain the energy associated with the starting structure. Subsequently, the geometry is modified and a new energy is calculated. Thereafter, the energy is minimized with respect to all geometric coordinates in the

Ulf Ryde
Ulf Ryde received his PhD in Biochemistry from Lund University (Sweden) under the supervision of Prof. G. Pettersson in 1991. He then moved into the field of Theoretical Chemistry at the same university, where he became a docent in 1996 and full professor in 2004. During 2001-2007 he had a senior research position from the Swedish research council. He studies the structure and function of proteins, in particular, metalloproteins, such as blue copper proteins, heme enzymes, vitamin B 12 enzymes, hydrogenases, and multi-copper oxidases. He has developed QM/MM methods for an accurate treatment of environmental effects, e.g. using accurate MM force fields with multipole expansions and anisotropic polarisation, and combinations of QM/MM with free-energy methods or experimental approaches, such as X-ray crystallography, NMR, and EXAFS. He also studies and develops methods to calculate ligand-binding affinities, especially combinations of MM and QM methods.
chemical system and when the first derivatives of the energy with respect to the degrees of freedom reach a certain threshold value, the structure is considered converged, i.e. an optimized geometry is found. A frequency calculation, vide supra, will then establish whether a structure is a local minimum or a higherorder saddle point.
The HF method gives reasonable geometries but its energies are not good enough to reproduce chemical reaction mechanisms and experimental barrier heights. Improved results are generally obtained with Møller-Plesset second-order perturbation theory (MP2) 20 that gives much more accurate energies and chemical structures as compared to HF. 20 Even better results are obtained with coupled-cluster calculations using singles, doubles and perturbatively treated triples (CCSD(T)), 21 which is currently considered as the gold-standard method in QM. 22 Unfortunately, MP2 and especially CCSD(T) are computationally demanding and even with current resources they can only be applied to chemical systems with less than about 50 and 20 atoms, respectively. 23 For biosystems and biomimetic complexes with often well over 100 atoms, computationally cheaper methods are needed.
As an alternative to the wave function methods, DFT was developed by Kohn and co-workers in the 1960s. 24 They showed that there is a one-to-one correspondence between the wave function and the electron density, r(r). The latter is a functional of the three Cartesian coordinates (r), whereas the wave function is a function of the coordinates of every electron in the chemical system; therefore, this was a major simplification. In DFT, the electronic energy (E el ) is calculated from a functional of the electron density, which unfortunately is not known. However, a typical expression is given by the sum of the integrals in eqn (4). The first term contains an integral over the Laplacian over the atomic orbitals and represents the kinetic energy of the electrons. The second and third terms involve the electronnucleus attractions and the electron-electron repulsions, respectively. The final exchange-correlation term (E xc )h a st h el a r g e s t uncertainty as no exact equations are known for this component; it is normally split into individual exchange and correlation components, E x and E c , where the exchange energy results from ferromagnetic interactions of two electrons with the same spin (a or b) in different orbitals, while the correlation energy is determined by the electron pairing energies of two electrons in the same molecular orbital.
There are many available exchange and correlation energy descriptions and two widely used ones are the Slater exchange (E Slater x ), eqn (5), 25 and the Vosko, Wilks and Nusair correlation energy (E VWN c ), eqn (6). 26 The Slater exchange equation includes a scale factor (a ex ) that has a value of 2 3 for a uniform electron gas. The local density approximation (LDA) uses E xc = E Slater which generally gives a good improvement over HF theory, but is not good enough for most chemical systems.
In the 1990s DFT methodology made a major leap due to the development of functionals that depend on the gradient of the electron density, e.g., E Becke x , 27 and hybrid density functional methods. 28 Becke discovered that calculated energies had a systematic error and showed that the results could be improved by using a fraction of the exchange energy from a HF calculation, E HF x . He combined it with different exchange and correlation functionals using three fitted parameters (A, B,a n dC). An example of such a hybrid DFT method is B3LYP, given in eqn (7), 28,29 which has become immensely popular due to its low computational cost and relatively high accuracy. The three parameters A, B,a n dC were optimized against experimentally determined electron affinities, ionization potentials and proton affinities for a test set of molecules, technically making it a semi-empirical method.
In recent years it has been found that the B3LYP method has weaknesses, especially for transition metal containing systems, and alternative DFT methods have been suggested. Unfortunately, no universal and consistent method has been developed yet, so that a range of different methods are in current use. For example, it has been shown that a modified hybrid DFT method with 15% E HF x (rather than the standard 20%), termed B3LYP*, gives better energetics for some transition metal complexes, 30 but as will be discussed below for other properties higher amounts of E HF x may be needed. In addition, neither HF nor standard DFT methods can describe dispersion interactions. Recent work by the Grimme group has shown that a simple empirical dispersion correction significantly improves calculated energies: B3LYP-D. 31 Typically, the results of DFT calculations show only limited dependence on the functionals used. However, for transition metal complexes with close-lying spin states the ordering and spin-state energy splitting often strongly depend on the functional chosen, and particularly on the amount of E HF x .I ti so f paramount importance, therefore, to calibrate DFT calculations on transition metal complexes to either experimental data or to do test calculations with alternative methods.

Geometries, frequencies and free energies
The solution of the Schrödinger equation is not the end of the calculation as shown schematically in Fig. 2. Thus, after one has created a starting geometry and selected a method and basis set, the QM method gives the energy associated with the starting structure. Thereafter, the energy can be minimized with respect to the geometric coordinates of the chemical system using the first derivatives of the energy with respect to the coordinates. A frequency calculation can establish whether a structure is a local minimum or a higher-order saddle point.
A QM calculation gives the electronic energy of the studied system. In order to convert these energies into values that can be compared to experiments, one needs to add vibrational, entropic and often also solvent effects. The former two can be obtained from a frequency calculation on the optimized geometry, which establishes the second derivatives of the energy with respect to the coordinates, i.e. the degrees of freedom. These derivatives are used to calculate the vibrational energy levels of a molecule and give the zero-point energy (ZPE) (see Fig. 3). ZPE is the lowest vibrational energy level (n 0 ) in which the system will reside at 0 K temperature. Fig. 3 shows the potential energy profile along one degree of freedom (e.g., a reaction coordinate) between two local minima S1 and S2 with electronic energies E S1 and E S2 ,r e s p e ctively. These two local minima are separated from each other by a transition state (TS). Higher energy vibrational levels of the local minimum S1 are given as n 1 , n 2 and n 3 in Fig. 3. A Boltzmann distribution at a specific temperature will determine the degree of occupation of each individual vibrational energy level.
In addition, using the information of the geometry and the vibrational energy levels the entropy can be calculated from the individual partition functions (Q) for the translational, rotational and vibrational entropy components. Generally, the entropy of a molecule is calculated from the partial derivative of the free energy (G) to the temperature (T) at constant pressure (p), eqn (8). In this equation R is the gas constant and N is the total number of particles.
The partition function is the product of all molecular variables and includes the translational (Q tr ), rotational (Q rot )a n dv i b r ational (Q vib ) contributions of the molecule as well as those arising from internal rotations (Q int ) and electronic spin (Q es ), eqn (9). For a molecule obeying the harmonic oscillator, rigid-rotor and Ideal Gas Law conditions, equations have been derived for these partition function components, eqn (10)-(12). 32 In these equations k B is Boltzmann's constant, h is Planck's constant, m is the molecular mass, T is the temperature and s is the symmetry number. The moments of inertia A, B and C in eqn (11) are dependent on the Cartesian coordinates of the molecule with respect to its centre of mass.
Quantum chemical software packages usually include these entropy derivations in order to approximate the free energy of a molecule. The obtained free energies of activation can then be compared with experimentally determined values or rate constants. We did that on several occasions and found that free energies of activation agree to within 3 kcal mol À1 from experiment using B3LYP with a triple-z basis set and solvent corrections. 33 Fig. 3 Potential energy profile between two local minima S1 and S2 each with their own vibrational energy levels (n) and ZPE. To further test the method dependence on the calculated free energy of activation we did a set of test calculations as shown in Fig. 4 5 ] 2+ with dimethylsulfoxide and measured a free energy of activation of 10.5 kcal mol À1 at 298 K. To find out how well computations can reproduce this, we studied the catalytic mechanism with the B3LYP, BP86, PBE1PBE, M06 and B3LYP-D DFT methods as well as with MP2. Fig. 4 displays the optimized geometries of the sulfoxidation transition states and the corresponding free energies of activation. In this particular example the DFT methods are within a couple of kcal mol À1 from the experimental value and B3LYP and BP86 are closest. Interestingly, the computationally expensive MP2 method performs very poorly and gives a much higher barrier of 39.0 kcal mol À1 . Clearly, computational time in this example does not correlate with higher accuracy.

Environmental (solvent) effects
Calculated energies using computational chemistry methods refer to gas-phase data. In order to compare computational results with experimental data, the solvent needs to be implemented into the calculations. This can be done by simply adding explicit solvent molecules to the chemical model and reoptimizing the structures. However, calculations with a large number of atoms require more resources and are not always practical, especially since solvent molecules can exist in many possible conformations. 35 Therefore, a more common approach is to add solvent corrections to the energetics using an implicit solvent model. 36 This can either be done at the single point level, whereby the energy is recalculated with a solvent model on the gas-phase optimized geometry, or by reoptimizing the structure using a solvent model.
Most QM software packages include implicit solvent models like the polarizable continuum model (PCM), the conductorlike screening method (COSMO) or the density solvation model (SMD). 37 In these calculations, the molecule is placed in a cavity that is surrounded by the dielectric continuum, characterized by a dielectric constant, and a reaction fluid builds up at the cavity surface, described by a number of surface charges, perturbing the wave function. The cavity is normally defined by a set of atomic radii. The electrostatic part of the solvation energy is supplemented by non-polar terms, which typically include dispersion, repulsion and cavitation terms.
A common approach to investigate enzyme reactions is to cut out a small model of the active site from the protein (50-200 atoms) and study it with QM methods, called the QM-cluster approach. 38 In order to ensure that the geometry of the model does not change too much from that in the protein, one or several atoms are fixed at the positions where the protein is truncated to form the QM model. Moreover, a continuum-solvation method is commonly employed to model the surrounding protein in an approximate manner. A dielectric constant of B4 is typically used, 39 a quite arbitrary choice, but fortunately it has been shown that as the size of the QM system is increased to 150-200 atoms, the results obtained with different values of the dielectric constant typically converge. 38a,40 Quantum mechanics/molecular mechanics (QM/MM) Another approach to include the effect of the surrounding into QM calculations is to use the quantum mechanics/molecular mechanics (QM/MM) approach. 41 The philosophy of the QM/MM technique is to divide the chemical system, such as an enzyme, into two or more regions (shells), Fig. 5: the inner core, which is calculated with accurate QM or DFT methods, and the outer region, which contains the remaining protein and solvent, is calculated with computationally cheap methods, like molecular mechanics (MM). 42 The inner core is the region of interest and contains the basic features of the enzyme active site and the substrate, but is held in position by interactions with the rest of the protein. These chemical constraints ensure that the active site remains in a conformation that is relevant for the enzyme. Ideally, this would combine the accuracy of QM methods with the speed of MM methods.
In practice, the QM/MM approach is implemented by an energy function of the type E tot (r 1 , r 2 )=E QM (r 1 )+E MM (r 2 )+E QM/MM (r 1 , r 2 ) (13) where r 1 and r 2 are the coordinates of the QM and MM systems, respectively, and E QM and E MM are the QM and MM energy functions. E QM/MM is the energy of the QM/MM interface and can be implemented in many different ways, carefully avoiding double-counting of interaction terms. A simple way to implement a QM/MM method without any need to modify the QM or MM software packages is via the following equation: In eqn (14) one essentially does three calculations, namely a QM calculation on the QM region, and MM calculations on both the total system and the isolated QM region. A complication arises when the QM and MM systems are connected by covalent bonds (so-called junctions). 41 Then, the QM system needs to be capped in some way to fulfil the valences of the dangling bond. This can be done either by adding extra atoms (the link-atom approach), usually hydrogen atoms, although halogen-like pseudoatoms have also been tried, or by using localised orbitals that place frozen hybrid orbitals at the boundary. With a careful calibration, both approaches can give results of a similar quality, 43 and therefore, the hydrogen link-atom approach is most common, because it is simple and can be used without any modifications of the QM software. However, it should be kept in mind that the junction problem is one of the largest sources of errors for the QM/MM approach, the influence of which can be reduced by moving the junctions away from the site of interest (making the QM system larger). Another important issue is how to treat the interactions between the QM and MM systems. 41 Van der Waals interactions are normally properly treated by MM, e.g. by the simple energy function in eqn (14). However, electrostatic interactions are more complicated to describe. The simplest solution is to treat these at the MM level (as in eqn (14)), giving what is called mechanical embedding. It requires a MM representation of the QM system, typically a point-charge model obtained by a fit to the QM electrostatic potential. However, this ignores the polarisation of the QM system by the MM system and vice versa. Therefore, it is more common that a point-charge model of the MM system is included in the QM calculations and the charges of the QM system are zeroed in into the MM calculations, which is called electrostatic (or electronic) embedding. Then, the QM system is polarised by the MM system, whereas the MM system is still not polarised. It has been argued that this may lead to an overpolarisation of the QM system 44 and it has been much discussed which charges (especially around junctions) to include in the point-charge model. 41 It should also be noted that mechanical embedding reduces the link-atom problem, because the full MM calculation (and therefore the electrostatic interactions) is performed without any link atoms. With electrostatic embedding, the link-atom problem is severe, because charges are positioned close to the artificial link-atoms. Corrections to this have been suggested, 45 but it has been shown that they often do more harm than good. 44 In fact, a carefully designed molecularembedding method typically gives better results than electronic embedding. However, the ideal procedure would be to use polarised embedding, i.e. employing a polarisable MM method and including both point charges and polarisabilities (and possibly also higher-order multipoles) in the QM calculations. Only a few applications have been performed along those lines so far. 41,46 QM/MM codes are available in most common QM and MM software packages (e.g. Gaussian, ADF, Amber, CHARMM, Schrödinger), as well as in some independent interfaces (e.g. ChemShell and COMQUM). 47 A promising approach is to combine QM/MM methods with experimental data (X-ray reflections, NMR, EXAFS) to obtain an ideal compromise between theory and experiments. 48 The advantage of QM/MM methods compared to QM-cluster calculations is that the surroundings are explicitly accounted for, avoiding a bias caused by the selection of the modelled system or any need of fixing atoms at the periphery of the QM model. The disadvantage of QM/MM is that the modelled system is so large that it is hard to ensure that all structures along a reaction pathway reside in the same local minimum (a change in the hydrogen-bond pattern of a single water molecule far from the active site can change the total energy by B7 kcal mol À1 , although it is probably not relevant for the reaction). Many solutions have been suggested to such problems, e.g. by keeping the surroundings fixed at the starting (crystal) structure, by running reactions back and forth until the energy stabilises, or by calculating reactions for many different structures, taken from snapshots from molecular dynamics simulations. 41 However, the preferred solution is to calculate free energies, rather than pure energies (from minimised structures). 41,45a,49 This can be done at the MM and semi-empirical QM/MM level with free-energy perturbation methods. Unfortunately, it is a challenge for high-level (including DFT-type) QM/MM methods, owing to the cost of these calculations and differences in the MM and QM potential energy surfaces.
Finally, it should be mentioned that both QM and QM/MM methods have problems with convergence of the calculations with respect to the size of the QM system. Convergence studies of QM-cluster calculations have shown that reaction energies can change by 3-14 kcal mol À1 as the QM system is enlarged from 27-83 to 161-220 atoms. 40 Other studies have shown that a difference of 14 kcal mol À1 can remain even for systems of B400 atoms and that the energy may differ by 11 kcal mol À1 between QM systems of 300 and 1600 atoms. 40c The convergence is not improved by geometry optimisations. 50 However, QM/MM methods also have similar problems, due to the limited accuracy of the MM methods and the link-atom problem: electrostatic-embedded QM/MM calculations gave a mean absolute deviation similar to QM-cluster calculations over 40 sizes of the QM system (40-446 atoms), of 4 and 5 kcal mol À1 , respectively. 44 Only a carefully designed mechanical-embedding QM/MM method could bring this difference down to 1.7 kcal mol À1 . Therefore, we currently recommend the use of QM/MM geometry optimisations with a quite small QM system, followed by a single-point QM calculation with a very big QM system, involving all groups within B4.5 Å of the minimal QM system and all buried charged groups in the protein, and moving all junctions three residues away from the minimal QM system, typically 600-1000 atoms. 50,51 The QM/MM methodology starts off from experimentally determined crystal structure coordinates available from the internet. 52 Often these pdb files refer to a resting state structure of the enzyme rather than its catalytically active species. Therefore, the structure often needs to be converted to a later structure in the catalytic cycle. In addition, the pdb files typically do not contain hydrogen atoms, so that they need to be added, as well as a number of explicit solvent molecules. Generally, this is done in the set-up where the user carefully checks protonation states of residues, salt and disulphide bridges and ions. Histidine groups in proteins can exist in three possible protonation states: deprotonated, singly protonated (with two different tautomers) or doubly protonated. Therefore, all His residues need to be manually and visually investigated for likely hydrogen bonding donor and acceptor groups, as well as their solvent accessibility.
Counterions can be added to the surface to balance the overall charge of the system. After this initial set up the system is equilibrated, and heated to room temperature with molecular dynamics (MD) simulations in a stepwise protocol. 41,53 Examples In this section we will give examples of several computational studies that either supported experiment or had predictive value and guided experimental work in the field. In addition, we discuss bioinorganic examples where computation was calibrated against experimental rate constants, crystal structures and spectroscopic parameters. Thus, from the ground state electronic wave function, it is, in principle, possible to calculate many molecular properties besides geometries, by calculating derivatives of the energy (such as e.g. the electric dipole moment as a first derivative of the external field). For properties involving electronic transitions, however, at least a reasonable approximation to the excited state electronic wave functions is needed, which often requires the application of more involved computational methods, as detailed below. These examples highlight the challenges that computational work needs to overcome. We will start, however, with a section on the enzymatic reaction mechanism, and in particular, focus on topics that are hard to study experimentally. In the second part of this section, we will discuss several biomimetic studies that were aimed at establishing a suitable method and technology to address the chemical systems.
Part 1: enzymatic models Catalytic mechanism of cysteine dioxygenase. Cysteine dioxygenase (CDO) is a vital enzyme for human health that catalyses the metabolism of cysteine, and thereby, detoxifies the body. 54 In contrast to most nonheme iron dioxygenases that contain a facial 2-His/1-Asp/Glu active site, in CDO there is a 3-His metal coordination. As most intermediates in the catalytic cycle are short-lived, little is known about its catalytic mechanism after dioxygen binding.
Computational studies were performed using both QM-cluster model and QM/MM approaches, 53 Fig. 6. All methods gave an open-shell singlet spin state as the ground state with orbital occupation p* xy 2 p* xz 2 p* yz 1 p* OO 1 that represents a superoxo radical antiferromagnetically coupled to a metal-oxo p*-type unpaired electron. With most QM/MM methods the nearest excited spin state is the triplet state, which is 3.0-6.0 kcal mol À1 higher in energy, and the quintet spin state is more than 10 kcal mol À1 above the ground state. These studies are well reproduced with QM-cluster models 55 that give the same spin-state ordering and minor differences in relative energies.
In addition, the QM-cluster calculations also located a septet spin state at 5.2 kcal mol À1 above the singlet spin ground state. Clearly there are a number of close-lying spin-state structures within 5 kcal mol À1 from the electronic ground state, which may result in a Boltzmann distribution and a mixture of low-lying geometries and spin-states at room temperature. Indeed for a biomimetic nonheme iron(III)-superoxo complex 12 different close-lying spin-state structures were calculated, 58 which highlights the complications in the description of iron(III)-superoxo complexes. Nevertheless, all computational studies on CDO agree that the iron(III)-superoxo has an open-shell singlet configuration. Addition of O 2 À to the substrate-bound resting state of the CDO enzyme led to a stable structure that was characterized spectroscopically and identified as the most likely septet spin state of a ferromagnetically coupled iron(III), S = 5 2 , coupled to an S = 1 2 radical. 59 Currently, it is not clear whether this intermediate is a catalytic cycle intermediate and why it has a significant lifetime for spectroscopic detection. It may very well be that the other spin states are more reactive and do not let the iron(III)-superoxo accumulate over time.
Subsequently, the catalytic mechanism leading to cysteine sulfinic acid products was studied with QM-cluster and QM/MM methods. The lowest energy mechanism starts off from an openshell singlet iron(III)-superoxo (A)w i t ht h ef o r m a t i o no fa nS -O bond via a transition state TS A to give the ring-structure B.Pastthe transition state, the system will cross over to the triplet spin state 3 B, which is slightly lower in energy than 1 B. The ring-structure has a weakened O-O bond and splits into an iron(IV)-oxo state with a bound sulfoxide. Similar to biomimetic nonheme iron(IV)oxo complexes with hexacoordination the triplet spin state is lower in energy than the quintet spin state. 60 The final stages of the mechanism include a sulfoxide isomerization to form C 0 followed by oxygen-atom transfer to give cysteine sulfinic acid products D.
An alternative mechanism was recently suggested, based on a crystal structure that appeared to give a stable persulfenate intermediate. 61 Therefore, it was also tested whether the distal oxygen atom of the iron(III)-superoxo could attack the sulphur atom of cysteinate to form a persulfenate structure via transition state TS 0 A . However, this pathway was found to be substantially higher in energy with barriers >30 kcal mol À1 as well as highly endothermic. Consequently, the persulfenate structure is a highenergy structure that cannot take part in the catalytic mechanism. The fact that it was observed experimentally implicates that it arises from a dead-end side reaction.
Thus, theory has predicted a most likely mechanism for CDO enzymes and predicted a multistate reactivity pattern on several close-lying spin state surfaces and a rate-determining O-S bond formation step.
Bioengineering of S-mandelate synthase into R-mandelate synthase. Further combined experimental and computational studies on nonheme iron dioxygenases were focused on S-p-hydroxymandelate synthase. 62 This is an a-ketoacid-utilizing enzyme that converts p-hydroxyphenylpyruvate on an iron(III)-superoxo centre into p-hydroxyphenylacetate and an iron(IV)-oxo species. The latter abstracts a hydrogen atom from p-hydroxyphenylacetate and the subsequent rebound gives S-mandelate products enantioselectively. QM-cluster calculations on its catalytic mechanism leading to R-a n dS-mandelate products were performed and the reaction was shown to proceed with a common initial hydrogenatom abstraction leading to a radical intermediate in either the pro-R or pro-S configuration. We then calculated the transition states for radical rebound to form R-mandelate and S-mandelate products: TS reb,R and TS reb,S . These structurally different transition states were then re-inserted into the crystal structure of the enzyme and the active site was investigated. An analysis of these structures led to suggestions of putative active-site mutations that could reverse the enantioselectivity of the enzymatic reaction.
When these predictions were subsequently tested experimentally they showed a full enantioselectivity reversal from dominant S-mandelate products to R-mandelate products in the mutant. Thus, theory guided experimental work and enabled the biosynthesis of the first artificial nonheme iron enzyme whereby an enantioselectivity reversal was obtained.
Comparison of families of mononuclear Mo enzymes. Molybdenum is the most common transition metal in sea water. 63 Therefore, it is not surprising that it is used by biosystems in a range of different enzymes, such as nitrogenases and oxygen-transfer enzymes. In the latter, Mo is coordinated to an unusual dithiolene ligand, called molybdopterin (MPT). However, the nature of the other Mo-ligands varies extensively, and three families have been identified. In the dimethylsulfoxide reductase (DMSOR) family, the metal coordinates in its oxidised Mo(VI) state to two MPT ligands, one oxo group, and a protein-derived ligand (Ser, Cys, or SeCys; Fig. 7). In the sulfite oxidase (SO) family, Mo(VI) coordinates to one MPT, two oxo groups, and a Cys ligand. In the xanthine oxidase (XO) family finally, Mo(VI) coordinates instead to one MPT, one oxo group, onehydroxide,andonesulfidogroup.Itisinterestingtounderstand why these enzymes have such different active sites. Such a question is quite hard to address with experimental methods, but with QM-cluster calculations, much insight can be gained. 64 From the description above, it can be derived that the active sites of the three families can be represented by Mo VI O(MPT)XY, where X and Y are varying ligands, being MPT and Ser/Cys/ SeCys for the DMSOR family, O 2À and Cys for the SO family, and S 2À and OH À for the XO family. Thus, the X ligand has a double negative charge and the Y ligand a single negative charge. With QM-cluster calculations, the intrinsic properties and reactivities of the three enzyme families can be studied, as well as intermediate sites with other combinations of X and Y ligands. For example, the driving force of a general oxo-transfer reaction can be studied: where Z is a typical substrate. Z = SO 3 2À gives an exothermic reaction for all ligands, with the SO ligands in the middle. On the other hand, Z = S(CH 3 ) 2 gives endothermic energies for all ligands, illustrating that the native reaction goes in the opposite direction (from (CH 3 ) 2 SO). Moreover, the DMSOR ligands give the lowest reaction energy, probably to conserve energy. In fact, the X = MPT ligand reduces the reaction energy by B25 kcal mol À1 . For Z = neutral xanthine, X = O 2À or S 2À (including the native ligands of XO) gives endothermic reactions. 65 This agrees with the consensus view that the actual substrate of the enzyme is deprotonated anionic xanthine, for which all ligands give exothermic reactions. Again, the native XO ligands give energies closest to zero to conserve energy.
Second, the reoxidation of the active site was studied, i.e.

Mo IV (MPT)XY + H 2 O -Mo VI O(MPT)XY + 2H + +2 e À (16)
In principle, it can be divided into two one-electron transfer and two proton-transfer steps, thus giving two redox potentials and two acidity constants (which depend on the order of the reactions), but the results of the net reaction are the easiest to interpret. The ligand sphere of DMSOR (and all other combinations with X = MPT) gives a positive redox potential for the net reaction in eqn (16), indicating that the reduced state is most stable, which is the correct direction for the re-reduction of the active site after the reduction of DMSOR. For the other sets of ligands, the redox potential is negative, and indicates that the oxidised state is instead most stable, again in accordance with the direction of the reactions (both SO and XO oxidise their substrates). Thus, the ligands seem to have been selected to make the re-reduction or re-oxidation of the active sites thermodynamically favourable. Finally, the actual reaction mechanisms (and therefore the activation barriers) for the typical reactions of each of the three families were studied. It turns out that the active sites of all three families can catalyse the DMSOR reaction with reasonable barriers. In fact, the native DMSOR ligands give the highest barrier, owing to the less favourable reaction energy discussed above (Fig. 8). Likewise, all three active sites can catalyse the SO reaction with similar barriers, although the native ligands give the lowest activation barrier (by 3-5 kcal mol À1 ). However, for the more complicated XO reaction, only the native model was active. In particular, the S 2À ligand seems to be essential as a hydride acceptor.
Thus, we can conclude that the ligands seem to have been selected to give exothermic reactions (but typically by a minimal amount of energy), both for the main reaction and for the regeneration of the active site, as well as reasonable activation barriers. This illustrates how QM-cluster calculations can give insight into the evolutionary design of active sites of different enzyme families.
Reaction mechanism of DMSO reductase. DFT methods often experience problems in estimating the energy difference of chemical reactions where the transition metal changes oxidation state. 66 This is nicely illustrated by a recent study of the reaction mechanism of the enzyme DMSOR. 67 As mentioned above, the active site of the enzyme contains a Mo ion bound to two molybdopterin cofactors and a serine residue from the protein. The (CH 3 ) 2 SO substrate binds to this site with Mo in the +IV oxidation state (Fig. 9). Then, dimethylsulfide dissociates, leaving a Mo(VI) ion with an oxy group. The latter dissociation is the rate-limiting step.  Different dispersion corrected DFT methods give varying results for the activation barrier of this reaction (and also for the reaction energy), ranging from 5.5 to 33 kcal mol À1 (Fig. 10). 67 The barrier height follows quite closely the amount of exact exchange in the functional, e.g. 5.5-7.9, 11, 18, 21, and 33 kcal mol À1 for pure functionals, TPSSH-D3, B3LYP-D3, PBE0-D3, and BHLYP-D3 with 0, 10, 20, 25, and 50% exact exchange, respectively. In order to determine which of these results is most accurate, local CCSD(T) calculations were performed with extrapolations to complete basis sets and to canonical methods. These gave a barrier of 16 kcal mol À1 , i.e. closest to B3LYP-D3. However, other states in the reaction were more accurately predicted by the pure functionals. Thus, there is not a single DFT method that gives good results for all states. Instead, it is better to use the LCCSD(T) energies and add corrections for solvation, entropy, and thermal effects from the DFT calculations. In fact, the time seems to be ripe to start to use LCCSD(T) (which currently can be used for complexes with up to 70-90 atoms) calculations also for metalcontaining systems, provided that the multiconfigurational character is not too large.
With such an approach, a barrier of 15 kcal mol À1 was obtained 67 in perfect agreement with the experimental estimate 68 (which of course is fortuitous). There are several previous theoretical studies of the same enzyme, which all have reported similar barriers, 8.8-19 kcal mol À1 . 69 This is unexpected, because none of them included dispersion effects, which are pronounced for this barrier, 17 kcal mol À1 , and they employ both pure and hybrid density functionals. However, looking closer at the earlier studies, it can be seen that they used either a too small basis set (that gives a too low barrier by B17 kcal mol À1 ) or the pure BP86 functional (that gives an B10 kcal mol À1 too low barrier), in both cases compensating for the missing dispersion. Thus, the previous studies obtained the correct answer for the wrong reason.
The take-home message is that the seemingly simple Mo oxy-transfer reactions are quite complicated to treat with QM methods. Large basis sets must be used (at least triple-z quality), dispersion and solvation effects need to be included (also the non-polar solvation terms), and the DFT methods should ideally  be calibrated against more accurate coupled-cluster or multiconfigurational methods. Key intermediates of multicopper oxidases. An example of a case for which QM calculations have solved a problem that later has been confirmed by experiments comes from the multicopper oxidases (MCOs). This group of enzymes contain at least two copper sites: a mono-nuclear type 1 (T1) centre, with a Cu ion coordinated to at least one cysteine and two histidine residues, and a tri-nuclear centre, consisting of a type 2 copper ion coupled to a dinuclear type 3 copper centre in a triangular fashion (the T23 site). 70 All three Cu ions in the T23 site coordinate to two or three histidine residues and, in the oxidised state there are also two solvent-derived ligands: one binding to the type 2 ion and one bridging between the type 3 ions.
The MCOs catalyse oxidation of various substrates at the T1 site, coupled to the reduction of O 2 to H 2 O at the T23 site. By spectroscopic methods, two intermediates in the catalytic cycle have been characterized, the peroxy and native intermediates. 70 The former arises when O 2 binds to the fully reduced (Cu I 3 ) state, leading to an immediate reduction of O 2 to the peroxide level. However, it was originally not known how O 2 bound to the T23 cluster -the experiments could not decide whether it binds inside the cluster, coordinating to all three Cu ions, or on its periphery, coordinating only to two of the ions. 71 In the native intermediate, all three copper ions are fully oxidised (Cu II 3 ), the O-O bond is cleaved, and both O atoms are fully reduced. However, again it could not be decided whether it contained a central O 2À ion coordinated to all three Cu ions or three OH À ions, bridging each pair of Cu ions. 72 At this point, Rulíšek et al. 73 performed a QM/MM study of the MCO reaction. They could obtain geometries of both the suggested structures of the peroxy and native intermediates, as can be seen in Fig. 11a. The structures were similar to what was suggested by experiments, although the peroxide ion bound in a diagonal manner inside the T23 site, and no other structural interpretations of these states could be found even if many other starting configurations were tried. Already from Fig. 11, it can be seen that the peripheral binding of both intermediates seems to be more strained than the central ones. Moreover, approximate energy considerations (the complexes do not contain the same number of atoms, so the energies are not directly comparable, but by removing or adding atoms, an approximate comparison can be made) indicate that the central binding is favoured by 11-34 and 28-34 kcal mol À1 .T h u s , the QM/MM calculations indicated that the central structures (first and third structures in Fig. 11a) were the correct interpretation of these intermediates. The only problem was that the preferred peroxy-intermediate structure had an incorrect ground state (ferromagnetically, rather than antiferromagnetically coupled).  This problem was later solved by advanced calculations using multiconfigurational perturbation theory (CASPT2), which showed that all intermediates are antiferromagnetically coupled. 74 These calculations also gave further support to the central structures by giving better excitation energies for low-lying excited states. Additional support for these structures were also obtained by calculating EPR tensors. 75 EXAFS data for the Cu-Cu distances in the two intermediates were also available, 67,68 but both structures agreed equally well. However, the EXAFS data contain much more information than only Cu-Cu distances. A combined QM/MM and EXAFS refinement gave structures that are an ideal compromise between QM and EXAFS data and showed that the central structure fits the EXAFS data appreciably better than that with peripheral binding of HO 2 À (Fig. 11b). 76 Subsequently, spectroscopic and inorganic modelling studies have confirmed that the central binding is the correct mode of both the peroxy and native intermediates of the MCOs. 70, 77 Altogether, these studies show the predictive power of the QM calculations and the strength of combining various spectroscopic and theoretical investigations.

Part 2: biomimetic systems
Predictive studies of spectroscopic properties. Calculating spectroscopic properties of molecules is a challenging task and often requires computationally expensive methods, such as CCSD(T), in combination with a large basis set. As these methodologies are not always accessible for large chemical systems of well over 50 atoms, such as biomimetic model complexes, users often resort to DFT modelling. A typical strategy applies conventional single-reference (hybrid) DFT with a polarized triple-z quality basis set for the calculation of EPR g and A values, as well as d-d transitions and spin distributions. It should be noted, however, that at this level of theory one would not expect to obtain highly accurate results and even qualitatively they can be incorrect, e.g., when there is the unphysical preference for higher spin states or unrealistic energies for spin transitions. 78 Unfortunately, rigorous electronic property calculations with correlated wave function treatments such as CCSD(T) 21 in the singlereference domain or CASSCF 79 (and subsequent perturbations, such as CASPT2 80 or NEVPT2 81 ) in the multi-reference domain require large (better than triple-z quality) basis sets and are, despite recent developments, 78,82 often prohibitively expensive in terms of computer time. For this reason, a good strategy is to try to correlate the DFT-derived properties to experimental data where at all possible. This approach allows explaining and to some extent predicting, spectroscopic properties, but only for a particular problem, where the applied methods have been previously validated.
Prediction of the d-d electronic transitions of a nonheme iron(IV)-oxo complex. The prediction of the electronic d-d transitions of [trans-Fe(O)(NCCH 3 )TMC] 2+ , TMC = 1,4,8,11-tetramethyl-1,4,8,11-tetraazacyclotetradecane, was studied with highlevel ab initio (SORCI 83 ) methods, time-dependent DFT (TD-DFT 84 ) and a ligand field method based on DFT, LF-DFT. 85 The use of a LF-DFT method is possible in this particular case because of the localized character of the 3d electrons, which gives a mainly ionic interaction between the metal and its ligands: the ligands donate electrons to the partially filled 3d and virtual 4s and 4p valenceshell of orbitals on the metal. The bonding molecular orbitals (MOs) are, therefore, ligand-centred, whereas the antibonding orbitals are metal-centred. The resulting d n configuration on the central metal then leads to the electronic d-d transitions for emission and absorption. 85a The concept of the LF-DFT method 86 is based on the validity of constructing a sub-Hamiltonian for the d-d block on the metal taking into account that the d-d transitions are energetically well separated from the charge transfer transitions. 87 In the case of [trans-Fe(O)(NCCH 3 )TMC] 2+ , the structure was first carefully optimized with BP86 85a and then studied with LF-DFT, DSCF and time-dependent DFT (TD-DFT). The DSCF method is similar to a normal DFT calculation, but one or more valence electrons are placed into higher-lying Kohn-Sham orbitals. As a comparison, the optimized geometries of [trans-Fe(O)(NCCH 3 )TMC] 2+ at the BP86 and B3LYP levels of theory are shown in Fig. 12 and the MO diagram is given in Fig. 13. 85a,88 As can be seen from Table 1, both LF-DFT and TD-DFT produce acceptable results, whereas SORCI gives an incorrect ground state. For the ground state to 3 A 1 transition, the TD-DFT value is closer to experiment than the LF-DFT value, but on the other hand the TD-DFT calculation fails to predict the ground state to 3 A 2 transition. 89 As the size of the complex is already challenging for a SORCI treatment, not all transitions could be obtained; on the other hand,  some transitions, such as the 1 A 1 and 1 A spin-flip transitions, cannot be explained by the LF-DFT method.
Prediction of the d-d electronic transitions of dinuclear Cu II bispidine complexes. Copper(I) complexes of tetra-or pentadentate mono-or dinuclear bispidine ligands (bispidine = methyl-2,4bis(2-pyridin-yl)-3,7-diazabicyclo-[3.3.1]-nonane-9-diol-1,5-dicarboxylate) are known to take part in oxygen activation reactions. For instance, in their reaction with catechols to quinones they have been shown to form unusually stable end-on [{(bispidine)Cu} 2 (O 2 )] 2+ complexes 90 (see Fig. 14). For this dinuclear complex, the metal centres are either antiferromagnetically or ferromagnetically coupled to an open-shell singlet or triplet spin state, respectively. In the present case, the two spin state surfaces for the oxidation of catechol to quinone by [{(bispidine)Cu} 2 (O 2 )] 2+ are close to being degenerate, 90 especially on the reactant side of the reaction mechanism. Particularly important for the description of the free energies along the reaction mechanism are both solvent and dispersion effects due to small energy differences between the available spin states, whereby the omission of one of these effects can easily change the spin state ordering. Also note that the treatment of solvent effects may vary depending on the solvent model and program package used (for instance, the COSMO 91 vs. PCM 47b,92 models). Similar situations occur in the description of the dispersion treatment through either the Grimme-D3 model 93 or the use of long-range dispersion functionals. 94 For the treatment of dispersion, no single ''best'' method has emerged to date; such calculations should therefore be conducted with at least two separate program packages or set-ups/approaches, and the relative energetics should match to have some confidence in the results. 90 The d-d spectrum for [{(bispidine)Cu} 2 (O 2 )] 2+ was determined experimentally and subsequent TD-DFT computations led to the spectrum shown in Fig. 15. As expected the spectra match only qualitatively, and among the three mentioned methods for the prediction of d-d transitions, i.e. the LF-DFT, SORCI and TD-DFT methods, probably the latter is the least accurate. TD-DFT also has fundamental problems. 95 In the case of the dinuclear complex above, however, the semi-quantitative picture still allows one to assign the experimentally observed transitions to the computed ones. a Ground state configuration. b Spin-change within the ground state configuration. c Spin-change within the ground state configuration with significant contributions from excited state configurations. d Substantial multiconfigurational character. e Two-electron transition. f Not found in the energy range below the ligand-to-metal charge-transfer excitations. g Calculated using the reported S = 1 DFT geometry optimized structure 89 showing negligible deviation from the D 4h symmetry.
The broken-symmetry approach, eqn (18), can be used for a DFT-based calculation of the exchange coupling constants. 96 In this equation, E HS and E BS represent the energies of the highspin (HS) and low-spin broken-symmetry (BS) states, whereas hS 2 i HS and hS 2 i BS are their spin-expectation values, respectively. The broken symmetry state refers to a symmetry lowering due to a spin-flip of an otherwise identical electronic configuration.
These coupling constants can be calculated using most standard computational chemistry software packages including Jaguar, 47a Gaussian, 47b and ORCA. 97 The latter is especially simple to use as it directly outputs the broken symmetry-derived coupling constants on request. Accurate predictions of J values depend strongly on the method and basis set used, as follows from calculations on the bisphenolate-bridged dicopper(II) complex, shown in Table 2. Thus, the hybrid DFT methods B3LYP, 28,29 B3P86, 28,57 and B3PW91 28,98 reproduce the experimental value within reasonable accuracy, which contrasts the results obtained from pure DFT methods that due to their poor description of the broken-symmetry state do not give accurate results. The basis set should be of triple-z quality on the metal and its first coordination sphere, but can be much smaller for the atoms not involved in the electronic coupling, which should speed up the calculation. 99 For the prediction of coupling constants, it is important to do an initial geometry optimization of the structures with DFT, rather than a single-point calculation on an available X-ray structure. This procedure avoids possible errors with the DFT method -as, strictly speaking, properties derived from the Kohn-Sham (KS) wave function are only valid at stationary points on the PES. On the other hand, correlating properties derived from DFT-relaxed geometries to experimentally determined ones also may have problems as the environment (e.g. solvent) used in the DFT calculation may be imperfectly modelled. Nonetheless, one can show that J coupling constants for a number of oligonuclear transition metal complexes containing Cu, Mn, Fe, V, Ni, and Cr can be obtained with similar accuracy from DFT-optimized structures. 99 Prediction of EPR g and A tensors of Cu(II) hexapeptides. The prediction of EPR g and A tensors with DFT methods is not straightforward, as we will exemplify with results on mononuclear copper(II) with an 18-membered azacrown-6-macrocycle. 100 Fig. 16 shows the B3LYP/6-31G* optimized structure for a cyclic copper(II) pseudo-hexapeptide overlaid with the X-ray structure of the metalfree ligand. The overlay suggests that the ligand is well preorganized and that the modest method and basis set used are able to well-predict the structure of the complex. On the other hand, the accurate prediction of g and A tensors is much more complicated and the corresponding spin Hamiltonian (H s ) has the form as given in eqn (19), 101 and is essential for accurate predictions.
It depends on the external magnetic field (B), the electron and nucleus spin-operators for the total spin of the system (S and I), the g-tensor, the hyperfine coupling constants (A) and copper ( 63,65 Cu) and nitrogen ( 14,15 N) isotopes. DFT calculations using a number of different functionals and basis sets have been tested for three complexes with a In cm À1 . b J exp = À298 cm À1 . c Initial guess obtained with Jaguar. 47a d Basis set: TZVP for Cu II and the donor atoms, 6-31G* for the remaining atoms. Fig. 16 Overlay of the X-ray structure of hexapeptide H3L1 with the DFT optimized geometry of its copper(II) complex (Gaussian 03, B3LYP/ 6-31G*); hydrogen atoms and co-ligands are omitted for clarity.
different cyclic peptide ligands with the program packages ORCA 2.6 97 and Mag-ReSpect. 102 The results are given in graphical form in Fig. 17 for the g-tensor (part a) and the A-tensor (part b) components.Usually,thex and y components of the g-tensor are obtained accurately with a hybrid DFT method with double to triple-z quality basis set, but difficulties in the correct description of the g z component arise in this case. The calculated g z value dependspartlyonthechosenbasis set and, for instance, improves in quality upon changing the basis set from a split-valence basis set (SVP) to a much larger Wachters-type basis set 103 for copper and the first ligand sphere. In order to obtain values that are close to experiment, though, the amount of exact exchange in the DFT method needs to be modified (often arbitrarily, to a value of 38 or 40%, or alternatively through the use of a functional such as BHLYP which contains 50% exact exchange). Even with these modifications, all components of the g-tensor cannot be accurately reproduced with ''standard'' DFT methods. The MAG-ReSpect program gives values very close to experiment, employing the spin Hamiltonian as given in eqn (19), but unfortunately is computationally very demanding. 100 A similar situation occurs for calculations of the A-tensor, but as can be expected for the hyperfine coupling, accurate predictions critically depend on a very good basis set (even for the A x and A y components). Also here, improvement of the results is obtained by changing the total amount of exact exchange in the DFT method, whereby values beyond the 20% E HF in B3LYP are needed, although in this case the A z values are overestimated by a factor of 2. Only with spin-orbit coupling (SOC) treatments one can expect to obtain accurate predictions for the A-tensor. Spin distribution and stability of a Cu(II) bispidine complex. For the well-characterized bispidine copper(II) complex, [Cu(L 1 )(Cl)] + ,L 1 = methyl-2,4-bis(2-pyridinyl)-3,7-diazabicyclo-[3.3.1]-nonane-9-diol-1,5-dicarboxylate, Fig. 18, an attempt was made to find a DFT methodology to predict structures that correctly predict the energetically favoured isomer, EPR g-and A-tensors and d-d transitions. As expected, there is not one single density functional method that performs best when all experimentally accessible observables are taken as a reference. Instead, for each property to be reproduced, a functional/ method/basis set combination has to be carefully chosen and benchmarked. As we have demonstrated above, several DFT methods are suitable for the calculation of d-d transitions and EPR parameters, but in this section we will focus on the spin density distribution. We calculated spin densities for the complex given in Fig. 18 and its structural isomer (where Cl is bonded perpendicular to the pyridine plane). The spin density on copper can vary strongly, is dependent on the DFT method chosen, 103,104 and in our example it ranges from 0.57 using pure DFT functional to 0.93 for HF. As a matter of fact, the spin density on copper is linearly dependent on the amount of exact exchange admixture to the (hybrid) functional. The estimate for the experimental value 105 for the spin densities is reproduced quite accurately with the SORCI method, but for the DFT method the amount of exact exchange which would best fit the reference values would be 61%. A similar linear trend is observed for the spin density on the ligands, although for the chlorine ligand a value of 38% exact exchange would give the most accurate reproduction of the reference values.
To summarize, there does not appear to be a straightforward method to calculate electronic properties such as d-d transitions, EPR parameters or spin distributions from the existing DFT functionals without using more specialized methods such as LF-DFT, Mag-ReSpect or SORCI for EPR values and d-d transitions, or to carefully tune the functional to reproduce a given property. The size of many model systems prohibits the use of a more computationally demanding method, and one has to resort to trial and error for picking the right combination of DFT method and basis set to compensate effects such as too covalent metal-to-ligand interactions in pure DFT methods. Still, the choice of an established functional can often be advantageous, because the weaknesses and strengths of a well-known functional such as B3LYP or BP86 are relatively well understood, especially for geometry optimizations, which often give the right trends for a given property, if not the experimentally observed values. If accurate predictions of experimental observables are needed, though, no single DFT method will give satisfactory results without careful tuning to experimental and/or high level ab initio reference data.
Prediction of Mössbauer parameters. In a combined experimental and computational effort, the electronic properties and reactivity of a high-valent iron(IV)-imido complex were investigated. 106 Multiply bonded iron-nitrido and imido complexes have relevance to biological systems, including the enzyme nitrogenase that catalytically reduces nitrogen molecules to ammonia. 107 Our studies focused on iron(IV)-N-tosyl-corrolazine, [Fe IV (Cz + )NTs], which was investigated by several spectroscopic methods, namely UV-Vis, Mössbauer and EPR spectroscopic methods, alongside theory. The optimized geometries of the lowest lying doublet and quartet spin states calculated with UB3LYP and a double-z basis set are given in Fig. 19. In analogy to cytochrome P450 Compound I, 13,14 [Fe IV (Cz + )NTs] has closelying doublet and quartet spin states with three unpaired electrons: two in p* FeN orbitals (p* xz , p* yz ) and a third in a corrolazine p* orbital (a 00 ). These three unpaired electrons are either antiferromagnetically or ferromagnetically coupled to the overall doublet or quartet spin states, respectively. We find a small energy splitting between the doublet and quartet spin states in favour of the low-spin state. This is in agreement with EPR studies that characterized it as an S = 1 2 spin state. In addition, we calculated the Mössbauer and EPR parameters of 2 [Fe IV (Cz + )NTs] in Orca, 97 which we report at the  Fig. 19 alongside the experimentally determined values. Spectroscopic parameters were calculated at the B3LYP 28,29 level of theory on the optimized geometries reported in Fig. 19 in combination with the CP(PPP) basis set on Fe coupled to a TZVP basis set on the rest of the atoms. 108 The quadrupole splitting (DE Q ) was calculated from the electric field gradients V i (i = x, y, and z), the elementary charge of a proton-electron (e), the nuclear quadrupole moment Q( 57 Fe) of 0.16 barn, and the asymmetry parameter of the nuclear quadrupole tensor (Z) using eqn (20) and (21).
The isomer shift d was calculated from the spin density at the iron nucleus r 0 (Fe) using fit-parameters as implemented in Orca, 97 whereas the magnetic hyperfine parameters A i (i = x, y, and z) were obtained using the scalar relativistic zero-order regular approximation (ZORA) at the B3LYP level of theory. These calculated variables were then used to fit the experimental Mös s b a u e rs p e c t r u m ,t a k i n gi n t oa c c o u n tt h a tt h eE u l e ra n g l e s had to be rotated due to differences in the principal axis between the experimental and computational studies. Nevertheless, as follows from Fig. 19, the computational methods reproduced the experimentally determined Mössbauer and EPR parameters excellently. Calculated values of d and DE Q are within 0.10 mm s À1 from experiment, although it should be noted that this type of agreement cannot be expected for all transition metal complexes at this level of theory. Therefore, precaution should be used to predict experimental Mössbauer and EPR spectroscopic parameters from DFT calculations.

Conclusions
Since the formulation of the Schrödinger equation and the development of computational quantum chemistry almost a century ago, theory has come a long way. Major advances in the field have been made, in particular in the area of methods development. Especially, computational resources have become more efficient and powerful enabling computational studies of relatively large (bio)chemical systems. Some of these techniques are highly accurate and are able to reproduce experimentally measured variables well. However, for challenging chemical systems such as biomimetic transition metal complexes as well as biochemical systems, some major challenges remain. Currently, there is not a single universal method that is applicable to transition metal chemistry. Therefore, considerable testing and benchmarking alongside experiment is still needed.
However, it appears that theory has established a strong foothold in biochemistry and biomimetic chemistry and more and more computational studies are done alongside experiment. As such theory can make predictions and guide experiment in the field and address reaction mechanisms and spectroscopic parameters of short-lived intermediates.