Judith Herzfeld
Brandeis University, Waltham, Massachusetts, USA. E-mail: herzfeld@brandeis.edu
First published on 13th March 2024
As each has grown more sophisticated, theory and experiment have become highly specialized endeavors conducted by separate research groups. One result has been a weakening of the coupling between them, and occasional hostility. Examples are given and suggestions are offered for strengthening the traditional synergy between theory and experiment.
Ideally, theory generates predictions that inspire experimental tests, and experimental observations call for theoretical interpretation or generalization that generates new hypotheses. Where distance or energy scales are difficult to access, the cycle may start with a theoretical advance that suggests where and how to look for new phenomena. Where phenomena are relatively easy to observe, the cycle may start with novel experimental observations that suggest challenges to prevailing principles. In either case, incremental progress then follows the canonical cycle. Both theory and experiment lose meaning unmoored from each other. Experimental observations without theory are isolated snapshots, and theory without experimental data is speculative. Ideally, through close and frequent interaction, theory and experiment each guide and discipline the other.
During the scientific revolution, the empiricism of the scientific method distinguished the new discipline of science from the old disciplines of philosophy and technology. Individual practitioners joined observation and reason, obtaining and interpreting their own data. The successful scientist was both an experimentalist and a theoretician. There was specialization, but it was by focus on different phenomena (e.g., astronomy vs. electricity vs. optics), rather than between experiment and theory. If theory and experiment didn’t agree, the issue had to be in one's own observations or one's own reasoning, and being intimately familiar with both could help to resolve the matter.
The situation is very different today: with experimental and theoretical methods sufficiently sophisticated that it can be demanding to master and implement even one, principal investigators generally identify as either experimentalists or theoreticians and train protégés in one or the other. Moreover, the fragmentation can be expected to grow with the rise of big data and scientists who specialize in organizing it.1 There are a variety of vulnerabilities in this specialization, and the example in the next section illustrates some of them with a case that went spectacularly sideways. This is followed by sections considering some roots of the vulnerabilities and some practical suggestions for mitigating them.
I was introduced to the debate over the surface charge of water at a 2012 Gordon Research Conference at which a theoretician and an experimentalist reprised a dispute that was, by then, at least three years old.3,4 In a heated session, the theoretician argued that the experimental observations reflected contamination and the experimentalist argued that the theoretical calculations were naïve. The experimentalist defended his results by describing ever more stringent purification procedures and the theoretician defended his results by citing interpretations of other experimental data. So how did such smart scientists, each at the top of their own game, get into such a muddle?
Table 1 groups representative reports of the surface charge of water, from 1861 to the present, by approach and result. (Restricted to the gas–water interface, the table omits studies on electrophoresis of oil droplets and streaming currents on solid surfaces.) As shown, there is qualitative disagreement within theory and within experiment, as well as between the two.
When the only ions present are the self-ions of water, there is actually broad experimental agreement about surface charge. The earliest measurements were of negative charges on droplets torn from the water surface by mechanical forces.6,26 This “waterfall effect” has even been reproduced in American bathrooms where splashing from sink height produces less charge than splashing from shower height and more charge than the swirl of a toilet flush.27 In more controlled experiments, electrophoresis of gas bubbles consistently indicates that the charge is negative on the bubble side of the slip plane in the sheared water surface8 and that the isoelectric point, arrived at by adding minimal amounts of acid to the system, occurs at a bulk pH ∼3.5.7
The same picture is revealed by monitoring proton transfer reactions on- vs. in-water droplets prepared at different bulk pHs. For the protonation of trimethylamine, the equivalence point for the on-water reaction occurs for droplets with a bulk pH of 3.8, vs. 9.8 for the in-water reaction. In particular, the surface of a neutral droplet is no better able to protonate trimethylamine than either the interior or the surface of a droplet prepared at bulk pH 12.10 Other on-water protonation studies show that the change in the surface as the bulk is titrated through pH 3–4 is very dramatic. Although n-hexanoic acid is an extremely weak base, it is protonated on the surface of droplets with bulk pH as high as 3.28 Similarly, isoprene, another weak base, is protonated and polymerizes on the surface of droplets with bulk pH as high as 4.29 On the other hand, whereas deprotonation of hexanoic acid occurs in the bulk with an equivalence point of 4.8, it occurs on the surface of droplets with an equivalence point of 2.8.30
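One crude way to gauge the size of this surface effect is to read the on-water midpoint against the familiar Henderson–Hasselbalch relation that governs the in-water reaction; this is a back-of-the-envelope sketch under that assumption, not an analysis from the cited studies.

```latex
% Fraction of base B protonated at a given pH (half-protonation at pH = pKa):
f_{\mathrm{BH^{+}}} \;=\; \frac{1}{1 + 10^{\,\mathrm{pH} - \mathrm{p}K_{\mathrm{a}}}},
\qquad
f_{\mathrm{BH^{+}}} = \tfrac{1}{2} \;\Longleftrightarrow\; \mathrm{pH} = \mathrm{p}K_{\mathrm{a}} .
% Reading the on-water midpoint (bulk pH 3.8) against the in-water one (9.8)
% with the same relation suggests an effective proton availability at the
% surface roughly 10^{9.8-3.8} = 10^{6}-fold lower than in the bulk,
% consistent with a hydronium-depleted surface.
```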
Surface-selective spectroscopy tells a comparable, though ill-recognized, story. While UV second harmonic generation spectra show no detectable change in surface hydroxide signals from bulk pH 7 to 13,15 vibrational sum-frequency spectroscopy has covered a wider pH range.11–14,16 Consistent with the UV results, the vibrational spectrum from the surface of neutral water is very similar to that from the surface of aqueous base. However, the features of the vibrational spectra are sensitive to titration of the bulk through pH 3–4, such that the spectrum from the surface of aqueous acid becomes markedly different from that from the surface of neutral water or aqueous base. Unfortunately, except for the distinctive signals of dangling OH groups, there are no assignments of specific vibrational features, as these are likely shifted for semi-dehydrated surface species vs. their bulk counterparts.
So, what about the contrary experimental results? What they have in common is salinity. Already in 1892, Lenard reported that, in simulating the “waterfall effect” by splashing water against hard surfaces, the charge on ejected droplets became less negative, and eventually reversed sign, as salt was added to the water.6 This is consistent with subsequent studies establishing that small droplets ejected from seawater are positively charged.20,21,31–33 The effect of salt is also evident in proton transfer reactions on- vs. in-water droplets: with the addition of less than 1 mM NaCl or LiCl in the preparation of neutral water droplets, the surface becomes abruptly capable of protonating trimethylamine (declining a bit at much higher salt concentrations, but not to the total absence of protonation seen without salt).10 In addition, vibrational SFG spectra at OH stretch frequencies show changes in surface structure on addition of sodium halides that are similar to those seen on acidification.13
In other words, the presence of salt alters the partitioning of the self-ions of water between the bulk and the surface. This should not be surprising because the preference between the two environments depends on conditions in both. A particularly salient feature of neat bulk water is its three-dimensional hydrogen bonding network. This feature is reflected in the remarkable heat capacity of bulk water, which, at about 3k per atom, is what is expected for a network solid rather than for a liquid of small molecules (in which contributions from rotational and translational modes are expected to dominate over those from vibrational modes). As is well-known, this hydrogen bonding network is disturbed by salt, as water molecules are rearranged into solvation shells around the solute ions.
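For concreteness, the Dulong–Petit-style estimate behind that statement works out as follows (using the standard gas constant and the tabulated heat capacity of liquid water at 25 °C):

```latex
% Three atoms per H2O molecule, each contributing ~3 k_B as in a network solid:
C \;\approx\; 3\,k_{\mathrm{B}} \times 3 \;=\; 9\,k_{\mathrm{B}}\ \text{per molecule}
\;\Longrightarrow\; 9R \approx 74.8\ \mathrm{J\,mol^{-1}\,K^{-1}},
% close to the measured C_p of liquid water, ~75.3 J mol^{-1} K^{-1} at 25 °C,
% and far above the roughly 3R (~25 J mol^{-1} K^{-1}) expected from translation
% and rotation alone for a liquid of small, unassociated molecules.
```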
It follows that a theory that does not do a good job of describing the H-bonding of water and its self-ions will predict the correct surface charge of water only by accident. For first principles methods, this will depend on the choice of basis set. For density functional theory, it will also depend on the choice of functional. And for molecular mechanics, it will depend on the choice of force field. In addition, spurious results may arise from simulations with periodicities that are too tight or sampling that is too brief. (Avoiding this source of artifacts is particularly difficult for ab initio methods, still challenging for density functional methods, and least difficult for molecular mechanics.) All things considered, off-the-shelf simulation packages should be expected to be inadequate for the task unless shown otherwise. In fact, correctly predicting the observed surface charge of water should have been considered a test of the theory, a view that would have conformed to the traditional relationship between theory and experiment.
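As one illustration of the sampling caveat above, the sketch below is a minimal block-averaging check written in plain NumPy; the autocorrelated series is synthetic and stands in for any property extracted along a trajectory, and all numbers are arbitrary. If the estimated uncertainty keeps growing as the blocks are made longer, the simulation has not yet sampled enough independent configurations.

```python
import numpy as np

def block_average(series, n_blocks):
    """Mean and standard error estimated from non-overlapping blocks."""
    usable = len(series) - len(series) % n_blocks
    blocks = series[:usable].reshape(n_blocks, -1).mean(axis=1)
    return blocks.mean(), blocks.std(ddof=1) / np.sqrt(n_blocks)

# Synthetic, strongly autocorrelated "observable" standing in for a property
# sampled along an MD trajectory (purely illustrative data).
rng = np.random.default_rng(0)
x = np.zeros(50_000)
for i in range(1, len(x)):
    x[i] = 0.999 * x[i - 1] + rng.normal()

# Fewer, longer blocks expose the correlated error that many short blocks hide;
# an error estimate that has not plateaued signals insufficient sampling.
for n_blocks in (1000, 100, 10):
    mean, err = block_average(x, n_blocks)
    print(f"{n_blocks:5d} blocks: mean = {mean:+.3f} +/- {err:.3f}")
```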
As it happened, it wasn’t until this century that theoretical simulations were able to begin addressing the surface charge of bulk water and it is fair to say that much of what drove repeated and diverse experimental probes thereafter was a response to repeated theoretical predictions by diverse conventional methods that the surface of neat water was positively charged. Ultimately, only a radically new approach to molecular modeling produced results consistent with experiment, uniquely predicting a robust preference of hydroxide for the surface and of hydronium for the bulk.17 The LEWIS force field achieved this by modeling valence electron pairs as semi-classical particles with mobility independent of kernels. This intrinsically affords molecules not just flexibility, but also anisotropic polarizability and reactivity. Furthermore, with strictly pairwise interactions of the electron pairs with each other and with kernels, the model has an efficiency that allows large (1000 molecules) and long (500 ps) simulations of water.17 In this construct, an excellent description of hydrogen bonding is achieved by training the particle pair potentials on the structures and energies of water monomers and dimers in all their common protonation states,34 resulting in a force field that does an excellent job of describing acid–base dynamics in bulk water.35 At the surface, the predicted propensities of the self-ions are consistent not only with the experimentally observed negative surface charge of neutral water, but also with the observed need to increase the bulk hydronium ion concentration and decrease the bulk hydroxide concentration by several orders of magnitude in order to reorganize the surface.
Other large and long simulations use classical molecular dynamics in which electrons are represented only implicitly. Employing rigid models of water and various models of the self-ions, a scheme that is not expected to adequately represent the flexibility and lability of the hydrogen bonding network, this approach predicted that hydronium prefers the surface and hydroxide weakly avoids it.14,22,23
While ab initio simulations are only practical for systems that are too small for periodic boundary conditions to reasonably describe bulk properties, they are suitable for water clusters and find that an excess proton prefers the surface.22,36 However, clusters have no bulk region and LEWIS also predicts that the energy of a protonated 21-mer is lowest with the excess proton located on the tightly curved surface of the distorted H-bond network.37
Kohn–Sham density functional theory has afforded sizeable simulations either directly, through on-the-fly calculation of forces in FPMD,18,19 or indirectly, by informing barrier heights between sets of MS-EVB basis states.25 In both cases, a limitation has been in the weaknesses of the chosen functional, BLYP, which is relatively practicable, but is known to have problems with hydrogen bonds38–41 (even though it, like LEWIS, does predict the correct order of the self-diffusion constants of the water self-ions vs. each other and water42). The most recent MS-EVB studies predicted that hydronium is weakly attracted to the surface while hydroxide is repelled.25 Meanwhile, the FPMD simulations are the only ones besides LEWIS to find hydroxide attracted to the surface, although only weakly so19 and with hydronium indifferent between the surface and the bulk.18 It seems likely that the disagreement between FPMD and MS-EVB results owes to the artificial restriction in MS-EVB to a set of basis states and that the divergence of both from the LEWIS results owes to the limitations of the BLYP functional. LEWIS and BLYP DFT also disagree on the dominant H-bonding of hydroxide in bulk water, with LEWIS predicting traditional coordination43 and BLYP predicting hypercoordination.44,45 The latter can be expected to penalize transfer to the surface.
The twists and turns in the long saga of the surface charge of neat water, which are relatively clear in retrospect, exemplify some real-time issues in modern science more generally. Two of these are discussed in the following two sections.
For example, Langmuir adsorption theory was used to analyze UV intensities on the surface of water.15 However, that theory assumes that adsorption events occur independently at a uniform set of sites in the surface from a uniform set of sites in the bulk. In contrast, the surface adsorption of hydroxide is a cooperative process that involves separation of charge and large multiplicities of states in both the surface and the bulk due to the flexibility of H-bonding. Misapplication of the theory was manifested by theoretical curves with shapes that could not be fit to the experimental data.
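For reference, the isotherm at the heart of such fits has the familiar hyperbolic form below; its derivation assumes a fixed population of equivalent, independent, singly occupied surface sites in equilibrium with a homogeneous bulk, which is exactly what cooperative, charge-separating adsorption violates.

```latex
% Langmuir isotherm: fractional coverage of equivalent, independent sites
\theta \;=\; \frac{K\,a}{1 + K\,a},
% where a is the bulk activity (or concentration) of the adsorbing species
% and K is the adsorption equilibrium constant; cooperativity, charge
% separation, and heterogeneous H-bonded environments all fall outside
% these assumptions.
```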
As another example, the simulations of slabs of water described above employed a variety of approaches, each with its own set of underlying assumptions. Although these were generally acknowledged, they were not always associated with commensurate skepticism about the results, especially in interpretations of vibrational spectra.14
More generally, it is tempting to take at face value the software packages that have proliferated for both processing and interpreting data. Choices, not only among packages, but also among the input options afforded the user, may be made by custom without much consideration of their impact. It is a tall order to be familiar with the details of these complex packages, but it can make a big difference.
In the case of data processing, it is important to understand, for example, how and why NMR spectra depend on the details of the conversion of data from the time domain to the frequency domain, especially for multi-dimensional experiments with non-linear sampling. Similarly, it is important to understand how the translation of electron diffraction data into macromolecular structures may be biased by the force field that is chosen to make the process of fitting the electron densities more manageable by constraining the structures. This is especially so for studies of higher energy structures that occur in cryogenically trapped functional intermediates.
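A minimal sketch of the NMR point, in plain NumPy with a synthetic free-induction decay (the frequencies, decay constant, and dwell time are arbitrary and not drawn from any particular experiment): the same time-domain data yield visibly different spectra depending on the apodization and zero filling chosen during processing.

```python
import numpy as np

# Schematic free-induction decay: two closely spaced damped cosines
# (hypothetical frequencies and decay rate, for illustration only).
dt = 1e-4                                   # dwell time, s
t = np.arange(4096) * dt
fid = (np.cos(2 * np.pi * 400 * t)
       + 0.5 * np.cos(2 * np.pi * 410 * t)) * np.exp(-t / 0.05)

def to_spectrum(fid, window=None, zero_fill=1):
    """Fourier transform a FID with optional apodization and zero filling."""
    data = fid if window is None else fid * window
    n = len(fid) * zero_fill
    spectrum = np.abs(np.fft.rfft(data, n=n))
    freqs = np.fft.rfftfreq(n, d=dt)
    return freqs, spectrum

# Identical data, two processing choices: an exponential window improves
# signal-to-noise at the cost of resolution; zero filling interpolates the
# lineshape. Which peaks appear "resolved" depends on these choices.
f_raw, s_raw = to_spectrum(fid)
f_apod, s_apod = to_spectrum(fid, window=np.exp(-t / 0.02), zero_fill=4)
print("raw peak near", f_raw[np.argmax(s_raw)], "Hz")
print("apodized peak near", f_apod[np.argmax(s_apod)], "Hz")
```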
In the case of molecular interpretation, valid use of theoretical packages often requires expertise well beyond what an experimental group can be expected to cultivate or sustain in-house. In particular, for molecular dynamics, it is necessary not only to choose a suitable model for the system of interest (about which more below), but also to adopt adequate system dimensions and boundaries, an appropriate ensemble with effective controls, a suitable step size, a sufficiently long trajectory, and a satisfactory sampling scheme.
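To make that list concrete, a hedged sketch of such a setup is shown below, using the open-source OpenMM package with a hypothetical pre-built periodic water-slab structure ('water_slab.pdb'); the force field, box treatment, thermostat, step size, and run length are illustrative placeholders, and the rigid TIP3P water model is precisely the kind of default whose adequacy for a given question (here, a reactive, self-ionizing interface) must be judged rather than assumed.

```python
import sys
from openmm import LangevinMiddleIntegrator
from openmm.app import ForceField, HBonds, PDBFile, PME, Simulation, StateDataReporter
from openmm.unit import femtoseconds, kelvin, nanometer, picosecond

# Hypothetical input: a pre-equilibrated periodic water slab with box vectors.
pdb = PDBFile('water_slab.pdb')

# Model choice: rigid, non-polarizable, non-reactive TIP3P water (a common
# default, not necessarily an adequate one for interfacial acid-base chemistry).
forcefield = ForceField('tip3p.xml')
system = forcefield.createSystem(pdb.topology,
                                 nonbondedMethod=PME,       # periodic boundaries
                                 nonbondedCutoff=1.0 * nanometer,
                                 constraints=HBonds)

# Ensemble control and step size: Langevin thermostat at 300 K, 2 fs steps.
integrator = LangevinMiddleIntegrator(300 * kelvin, 1.0 / picosecond,
                                      2.0 * femtoseconds)

simulation = Simulation(pdb.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.minimizeEnergy()

# Trajectory length and sampling: 1 ns here, monitored every 10 ps;
# whether that is long enough has to be checked, not assumed.
simulation.reporters.append(StateDataReporter(sys.stdout, 5000,
                                              step=True, temperature=True))
simulation.step(500_000)
```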
In the computational chemistry landscape, there are three general types of lampposts: the first principles ones based on the description of electrons by wave mechanics; the molecular mechanics ones that treat electrons implicitly, modeling only their effects; and the fledgling sub-atomistic force fields that treat the valence electrons as semi-classical particles. Across the board, these involve various clever approaches to avoiding the prohibitive computational cost of ab initio calculations. However, whereas ab initio theory has a known path to greater accuracy (i.e., larger basis sets and more configurations), the other methods don’t.
—In the first principles group, density functional theory has explored scores of functionals, finding that different ones are more reliable for different applications.
—Also in the first principles group, semi-empirical methods provide great gains in efficiency by obviating evaluation of demanding integrals. However, this entails careful parameterization for specific applications.
—Parameterization is a growing headache in the molecular mechanics camp, where not only are the traditional parameters required for each type of atom, bond, bond angle, and dihedral angle, but additional parameters are increasingly required for patches that add polarizability and reactivity, features that were not contemplated in the original conception of molecular mechanics.
—The sub-atomistic force fields need far fewer parameters because, based only on valence electrons and kernels34,46–48 (or all electrons and nuclei49,50), polarizability and reactivity are built in with no need for distinction between different types of bonding. However, these force fields are still in early stages of development.
In the process of parameterization, one has to be concerned about the choice of functional forms and the content of the training set. Machine learning can help avoid restrictive functional forms. However, to the extent that this approach relies on very large training sets populated by properties calculated from density functional theory, the results will reproduce the idiosyncrasies of the functional used in those calculations.
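As a toy illustration of both concerns (a sketch only: the "reference" energies are generated from a Morse curve rather than from any electronic-structure calculation, and all numbers are arbitrary), fitting a restrictive functional form to a training set that covers only part of the relevant range produces a potential that behaves well where it was trained and poorly where it was not, while any quirks of the reference data are inherited by the fit.

```python
import numpy as np
from scipy.optimize import curve_fit

def lennard_jones(r, epsilon, sigma):
    """A restrictive two-parameter functional form."""
    return 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)

def morse(r, de, a, re):
    """Stand-in for 'reference' energies (in place of any ab initio data)."""
    return de * (1 - np.exp(-a * (r - re))) ** 2 - de

# Hypothetical training set: reference energies sampled only near the minimum.
r_train = np.linspace(0.9, 1.4, 30)
e_train = morse(r_train, de=1.0, a=3.0, re=1.0)
(eps_fit, sig_fit), _ = curve_fit(lennard_jones, r_train, e_train, p0=[1.0, 0.9])

# Inside the training window the fit looks fine; outside it (steep repulsive
# wall, long range) the restricted form and the limited training set both show.
for r in (0.85, 1.0, 1.2, 1.8, 2.5):
    print(f"r = {r:4.2f}  reference = {morse(r, 1.0, 3.0, 1.0):+7.3f}"
          f"  fitted LJ = {lennard_jones(r, eps_fit, sig_fit):+7.3f}")
```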
Clearly the central requirement of synergistic relationships is regular, open, constructive communication, with thoughtful and detailed reading and listening. This is non-trivial, and conducive formats for conferences, schools, and journal articles become ever more critical. Among other things, communication needs to include attention to the ills mentioned above. Too often, presentations make only brief mention of the methods used, while emphasizing the results obtained. However, even conventional methods require a thoughtful appraisal of strengths and weaknesses for a particular application.
In fact, theoretical calculations operate in multiple modes. The first is validation, i.e., to show that the theory reproduces experimental results at a relevant level. Only once appropriately validated can theory offer explanations and make potentially reliable predictions. Where no experimental data are available for relevant validation, theory can offer no more than suggestions as to what might be expected. In the above example of cooperativity in hemoglobin, the model was validated on abundant sigmoidal data and then used to (accurately) predict biphasic saturation under intermediate conditions. In the above example of the bacteriorhodopsin photocycle, there was no opportunity for direct validation of the theory and the results were presented as suggestions of possible trajectories. In the above example of the surface charge of water, there was abundant experimental data (the left side of Table 1) and the theoretical work (the right side of Table 1) should have been regarded as attempts at validation of different models in the regime of extensive intermolecular H-bonding. While most of the water models clearly bumped up against the Peter Principle, LEWIS was validated for the surface charge of water (as well as for the acid–base behavior of bulk water more generally37) and was therefore able to offer explanations for experimental observations, in particular that there is a net gain (loss) of H-bonds when hydroxide (hydronium) ions displace water molecules from the surface of neat water.
All things considered, it would be most helpful to readers for theorists to be very clear in their papers about distinctions between validation, explanation, prediction and suggestion. In this context, it behooves theorists to give readers an idea of what they think their approach gets right and what it sidesteps. As long as some progress has been made, whether in testing or applying theory, thoughtful caveats should be seen as a strength of a paper, rather than a deficiency.
When existing approaches have reached their limits, there are two possibilities. One is to apply patches that address the perceived gap (such as polarizability and reactivity in atomistic force fields). However, at some point thought also needs to be given to starting from scratch (as in the pursuit of sub-atomistic force fields) to address applications that were not foreseen in previous methods design. Innovation has always been the answer to the proverbial young scientist's nightmare that, if an experiment or calculation is interesting, it has either already been done or can't be done. In fact, science keeps progressing because methodological advances make studies possible that were not before. New experimental methods may reach new length/time scales or disentangle signals that are of different types or from different parts of a sample. New theoretical methods may rethink which assumptions are made or avoided, how data are organized, or the method and means of analysis. Stepping out in a new direction can seem daunting, and may be discouraged by grant reviewers as too blue-sky, but it is also important for the long-term health of the community that it be wary of falling into the “sunk cost fallacy”.