E. J. Baerends
Vrije Universiteit, Amsterdam, The Netherlands. E-mail: e.j.baerends@vu.nl
First published on 16th May 2022
Many references exist in the density functional theory (DFT) literature to the chemical potential of the electrons in an atom or a molecule. The origin of this notion has been the identification of the Lagrange multiplier μ = ∂E/∂N in the Euler–Lagrange variational equation for the ground state density as the chemical potential of the electrons. We first discuss why the Lagrange multiplier in this case is an arbitrary constant and therefore cannot be a physical characteristic of an atom or molecule. The switching of the energy derivative (“chemical potential”) from −I to −A when the electron number crosses the integer, called integer discontinuity or derivative discontinuity, is not physical but only occurs when the nonphysical noninteger electron systems and the corresponding energy and derivative ∂E/∂N are chosen in a specific discontinuous way. The question is discussed whether in fact the thermodynamical concept of a chemical potential can be defined for the electrons in such few-electron systems as atoms and molecules. The conclusion is that such systems lack important characteristics of thermodynamic systems and do not afford the definition of a chemical potential. They also cannot be considered as analogues of the open systems of thermodynamics that can exchange particles with an environment (a particle bath or other members of a Gibbsian ensemble). Thermodynamical (statistical mechanical) concepts like chemical potential, open systems, grand canonical ensemble etc. are not applicable to a few electron system like an atom or molecule. A number of topics in DFT are critically reviewed in light of these findings: jumps in the Kohn–Sham potential when crossing an integer number of electrons, the band gap problem, the deviation-from-straight-lines error, and the role of ensembles in DFT.
(1) |
(2) |
The restricted domain of densities on which Ev[ρ] is defined is called in optimization theory the domain of feasible densities, or the feasible domain for short. Constrained derivatives, which only consider infinitesimal variations of the variable (i.e. the function in functional analysis) over the feasible domain, are difficult to handle. The crux of the Lagrange multiplier method is that it allows one to use full derivatives. This requires that the full derivative is defined, including its components. But we have just noted that the crucial component ∂Ev/∂N is not defined. We will discuss this problem in Section II, cf. ref. 5. From this discussion it emerges that the Lagrange multiplier in (1) is an arbitrary constant. It does not have physical meaning and cannot be interpreted as the chemical potential of the electrons in an atom or molecule. The DFT literature is nevertheless replete with references to this “chemical potential”. Next we will argue, in Section III that in fact the meaning of a chemical potential for the few electrons in an atom or molecule is problematic. Against the background of a summary of the well known statistical mechanical underpinning of thermodynamics in Appendix A, it is demonstrated that the thermodynamic origin of the concept of chemical potential is not compatible with a few-electron quantum mechanical system that does not obey the characteristic properties of a macroscopic thermodynamic system.
In a well-known paper, Perdew et al.7 (PPLB) highlighted the paradox that arises when μ of eqn (1) is considered a chemical potential. As a solution they propose to extend the domain of densities on which Ev[ρ] is defined by introducing an ensemble of two states with different electron numbers. It has been argued (see ref. 5 and Section II) that this does not solve the problems with the identification of μ in (1) as a chemical potential.
We note in passing that a criticism similar to the one raised here against the quantity ∂Ev[ρ]/∂N can be levelled against the derivative ∂Ev[ρ]/∂ni, where ni is the occupation number of orbital ϕi in the Kohn–Sham approach of DFT. In KS DFT the equality ∂Ev[ρ]/∂ni = εi is usually assumed and denoted Janak's theorem.8 Again the derivative is defined as
(3) |
The structure of this article is as follows. In Section II the arbitrariness of the Lagrange multiplier μ = ∂Ev/∂N is discussed. This raises the question of the validity of the concept of chemical potential for the electrons in an atom or molecule. In the following section (Section III), it is argued that indeed electrons in an atom or molecule do not have the properties that would allow one to treat them as a thermodynamic system to which the laws of statistics (arising from the exceedingly large numbers of particles that feature in thermodynamic systems) and thermodynamic concepts such as chemical potential and temperature would be applicable. Section IV deals with the conditions and conclusions for the step behavior of functions like μ(N̄) and Ē(N̄) which follow from the PPLB Ansatz for the grand canonical ensemble-like probability distribution of neutral atom and positive and negative ion. The findings in Sections II–IV have a bearing on several topics that feature frequently in DFT. These are touched upon in Section V: steps in the KS potential in Section V A, the band gap problem of solid state physics in Section V B, the issue of atoms as open systems with a fluctuating electron number in Section V C, the straight-lines condition in Section V D and the use of ensembles in DFT in Section V E. Section VI makes summarizing remarks.
In Appendix A a brief review is given of the statistical mechanical underpinning of thermodynamics. Although unabashedly unoriginal, we need this exposition to establish the salient features of statistical mechanics which prevent the treatment of few-electron quantum mechanical systems (atoms and molecules) as thermodynamic systems. It can be skipped by anyone familiar with statistical mechanics and thermodynamics. Appendix B discusses how cases should be understood where properties like chemical potential and temperature are attributed to (particles in) small subsystems of macroscopic thermodynamic systems.
Ref. 5 dealt with the elucidation of the derivatives (2) and (3) and the consequences. It was concerned with the T = 0 situation exclusively. The present paper replaces and corrects statements in ref. 5 referring to the finite temperature situation and its statistical mechanical treatment.
(4) |
(5) |
The fact that μ = (∂Ev/∂N)σ(r) is an arbitrary constant is not in any way problematic. However, Parr et al.2 have stated that this is “the chemical potential” (of the electrons in a molecule). They stipulate, without further proof or derivation, that this is a physical quantity, and is characteristic of the molecule. That conflicts with the arbitrariness we noted above. One can also see that it contradicts the gauge invariance property of the external potential, which is carefully taken into account in the HK theory. Breaking Ev up into the HK functional F[ρ] and the external potential dependent part ∫v(r)ρ(r)dr, one finds for v(r) from (1) the well-known expression
(6) |
Another problem with the notion of μ being a characteristic physical quantity of a molecule (or atom) with the meaning of the chemical potential of the electrons was brought forward by Perdew et al. in ref. 7 (PPLB). These authors describe the following paradox or anomaly. Suppose the flow of electron density would be governed by such a chemical potential, and take two different neutral atoms at noninteracting distance, with different chemical potentials. Then a small density transfer of magnitude δN to the atom with lowest μ would lower the energy. This will continue till the chemical potentials (which will change upon density change) will equalize. So the energy will minimize at net negative charge (possibly even noninteger) on the atom with initially lowest chemical potential and net positive charge on the other atom. This is in contradiction with physical reality where the ground state for each pair of noninteracting atoms has neutral atoms, since no electron affinity A is larger than an ionization energy I. It is also a quantum mechanical reality that the ground state wave function for two noninteracting atoms would be a product of two atomic ground states with integer numbers of electrons.
The anomalous result signalled by PPLB arises from an assumption which is maybe not inherent in the concept of a chemical potential for the electrons, but is almost automatically linked with it: that the electron distribution can be considered as an electron “fluid”, in fact consisting of very many “particles” each having a tiny fraction of an electron charge, whose behavior is analogous to the behavior of the very many particles in thermodynamic systems: the flow is towards a region (or a phase) with lowest chemical potential. But electrons are not like that, they cannot fracture into a myriad of smaller particles but can only jump as a complete electron. The anomaly should lead to the conclusion that this conceptual framework does not correspond to the reality.
PPLB propose a different solution of the anomaly along the following lines. They define the energy for noninteger N, Ev[ρN+ω], by making a specific choice for ρN+ω and Ev[ρN+ω], namely a linear interpolation between the (physical) ground state densities and energies of the N-electron system and the (N + 1)-electron (viz. the (N − 1)-electron) system. This is done by density and energy extension into the noninteger N domain through a quantum mechanical density matrix (also called ensemble), for instance for N̄ between N and (N + 1) (distinguishing the noninteger N by an overline),
(7) |
(8) |
Fig. 1 The straight-line energy behavior as a function of noninteger electron number according to the definition of the energy on the noninteger N domains by the ensemble Ansatz of ref. 7 (PPLB).
This solves the paradox in the sense that a small density change δN will now have energy increase proportional to I at one atom and energy lowering proportional to A at the other atom. In fact, the correct situation has been restored that either the electron will go over in its entirety or not at all, depending on the magnitudes of I and A. However, this continuation of the density into the noninteger domain is not in keeping with the fact that the partial derivative (∂Ev/∂N̄)σ(r) has to be taken with constant shape function σ(r). It should be stressed that ∂Ev/∂N̄ is a partial derivative, meaning that it has to be taken while the shape of the density σ(r) = ρ(r)/N is constant,5
(9) |
The most straightforward extension of the density into the nonphysical fractional electron domain while keeping the density shape constant would be to choose ρN+ω ≡ (N + ω)σ = (N̄/N)ρN and to define the corresponding energy as Ev[ρN+ω] = (N̄/N)Ev[ρN]. The derivative with respect to N̄ at constant σ(r) is simple and continuous at the integer N point. Lieb14 mentioned the possibility Ev[ρ] = Ev[ρ/] for the extension of the definition of Ev[ρ] to the noninteger N domain. However, for integer N̄ this does not revert to the standard value Ev[ρN]. PPLB do not keep the shape σ(r) of the density constant in the neighborhood of the integer N density, but make a break exactly at that point. That is what the derivative discontinuity reflects.
The PPLB straight-line energies for noninteger electron number could be called just a possible definition, since we have seen the energy for noninteger N is not a physical quantity and can be defined in any way we like. It is nevertheless important that these straight-line energies are not determined by the physics of some real (existing) system. They do not represent “the exact DFT energy for noninteger N”, a point to which we return below. It is one of the possible choices for the continuation of Ev[ρ] into the unphysical domain of noninteger densities. Given the arbitrariness of this choice, one cannot expect that any physics can be derived from it, neither from the discontinuous PPLB choice of the derivative nor from any continuous choice.
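The straight-line definition discussed above can be made concrete with a small numerical sketch. It assumes the linear interpolation E(N + ω) = (1 − ω)E0(N) + ωE0(N + 1) described in the text; the function names and the three integer-N energies are hypothetical illustrations, not data for any real atom.

```python
# Sketch of the PPLB straight-line energy for noninteger electron number.
# ASSUMPTION: the integer-N energies below are made-up numbers (hartree).

def pplb_energy(n_bar, energies):
    """Linear interpolation between the ground state energies at the two
    integers bracketing n_bar. `energies` maps integer N -> E0(N)."""
    n = int(n_bar // 1)        # lower bracketing integer
    omega = n_bar - n          # fractional part, 0 <= omega < 1
    if omega == 0.0:
        return energies[n]
    return (1.0 - omega) * energies[n] + omega * energies[n + 1]


def one_sided_slopes(n, energies):
    """Slopes of the two straight segments meeting at integer n."""
    left = energies[n] - energies[n - 1]     # equals -I(n)
    right = energies[n + 1] - energies[n]    # equals -A(n)
    return left, right


# Hypothetical E0(N-1), E0(N), E0(N+1) with N = 2, so I = 0.5 and A = 0.05.
energies = {1: -1.0, 2: -1.5, 3: -1.55}
left, right = one_sided_slopes(2, energies)
print("slope below N:", left, " slope above N:", right)
print("E(2.5) =", pplb_energy(2.5, energies))
```

The jump between the two printed slopes is a property of the chosen interpolation: a different continuation of Ev[ρ] into the noninteger domain would give different slopes while leaving the three integer-N energies untouched.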
If a small electron density increase at an N-electron atom is required to have the shape of the (N + 1) ground state density ρ⁰N+1, and if we wish to describe the total (N + ω)-electron density with a single set of Kohn–Sham orbitals, the highest energy Kohn–Sham orbital (the one with ω electrons) must have orbital energy −A. This is necessary because the asymptotic behavior of the ρ⁰N+1 density is known to be exponential as exp(−2√(2A) r) (in atomic units). At the same time the asymptotics is determined by the slowest decaying KS orbital density, which is governed by its orbital energy as exp(−2√(−2εL) r) (this orbital with occupation ω is the former LUMO, hence the subscript L). So we must have εL(N + ω) = −A. Now it is known that the exact Kohn–Sham orbital energy of the LUMO of the N-electron system, εL(N), is usually (for closed shell molecules) considerably lower than −A,9 which can be understood from the physical nature of the KS potential15 (see Section V B). The implication of the prescription εL(N + ω) = −A then is that the KS potential for any finite density increase δN having shape ρ⁰N+1, however small, must shift up by a constant over the molecular region (so as not to disturb the shapes of the fully occupied orbitals making up the ρ⁰N density) of magnitude Δ = −A − εL(N). This jump raises all orbital levels so that the LUMO level (with now ω electrons) becomes εL(N + ω) = −A, see ref. 7 and 16. Note that the constant should not extend to infinity, since the KS potential must always go to zero asymptotically in order to give the orbital energies absolute meaning (not dependent on an arbitrary gauge choice). The radius R beyond which the constant should no longer be effective16 and the potential has returned to the asymptotic −1/r behavior can be estimated.17
When one introduces this jumping behavior of the KS potential, the fundamental band gap I − A is obviously restored if one takes the LUMO level after the jump has occurred (but the HOMO level before the jump): εL(N) + Δ − εH(N) = I − A. We have noted that this jumping behavior of the KS potential is not a physical phenomenon; it is only required if the density extension beyond the integer N is prescribed to have the ρ⁰N+1 shape. It has nevertheless been considered to provide an explanation for the band gap problem. We will return to this issue in Section V B.
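The gap identity quoted above follows in one line once the shift Δ is substituted. The sketch below additionally assumes the standard exact Kohn–Sham result εH(N) = −I, which is not stated in this passage.

```latex
\begin{align}
\Delta &= -A - \varepsilon_L(N), \\
\varepsilon_L(N) + \Delta - \varepsilon_H(N)
  &= \varepsilon_L(N) + \bigl(-A - \varepsilon_L(N)\bigr) - (-I)
   = I - A .
\end{align}
```

The unshifted KS gap εL(N) − εH(N) is smaller than I − A by exactly Δ, which is the quantity the jump is introduced to supply.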
We have been concerned here with ground states, i.e. solutions of the Schrödinger equation. The HK theorems have revealed that an alternative procedure to obtain the ground state energy would be the solution of the Euler–Lagrange eqn (1), if the functional Ev[ρ] would be known. But this does not change the quantum mechanical reality that the ground state (any energy eigenstate) can be fully known by solving the Schrödinger equation. There are no other variables, like chemical potential or temperature, that could also affect the eigenstates. These are not a kind of “hidden variables” that also have to be known in order to fully characterize an eigenstate.
It would therefore appear that statistical mechanics has little relevance for an understanding of properties of the ground state, and of a mixture of ground states. Statistical mechanics is just concerned with the distribution in a macroscopic system of particles like atoms and molecules over the known eigenstates. It makes us understand how this distribution can be described with thermodynamic quantities like temperature and chemical potential. Using the so-called energy representation, one pictures the particles in a gas of say electrons and molecules (possibly ionized) as being in energy eigenstates most of the time (except for the instants where they change their state, e.g. by collisions, so that equilibrium can be achieved and maintained). These states do not themselves depend on the temperature or chemical potential. Only the distribution over the states is tied to these macroscopic variables.
Nevertheless, in DFT a connection of ground state solutions (energies, densities) with thermodynamics has been pursued. The rationale seems to be that the T → 0 limit of a thermodynamic treatment should substantiate the concept of a chemical potential and the associated straight-line behavior of Fig. 1, together with the notion of a derivative discontinuity of the energy.7,18,19 We will consider these notions in detail in the next two sections. However, that will not change the point of view expounded in the present section, and does not have relevance for the consequences that are listed in Section V.
(10) |
(11) |
In the same way the average energy Ē can be obtained as a function of μ. Ē is defined at any T as
(12) |
(13) |
The GC probability distribution depends on the temperature T and chemical potential μ of the electrons in the atom. In order to fix these quantities the customary device of contact with a usually unspecified “reservoir” is invoked that would be able to endow the electrons with these properties, and can modulate them at will, apparently without causing any essential disturbance of the (properties of the) atom. The latter requirement is important if one wants to deduce any free-atom property from this device.
However, the few electrons in an atom do not constitute a thermodynamic system in the usual meaning of the term. If chemical potential and temperature are not properties of the system, the device of contact with a reservoir in order to bring these properties to desired values cannot be invoked. Since the notion of atoms/molecules as thermodynamic systems of electrons seems to be widespread in the DFT community, we feel it is important to dispel it. We will use arguments from elementary statistical mechanics, and refer to Appendix A for a brief exposition of the statistical mechanical underpinnings of thermodynamics. More detail can be found in the many excellent textbooks on the subject.21–28
The derivation of the equation for the probability distribution over the members of an ensemble, like eqn (10), proceeds in statistical mechanics in two steps (see Appendix A).
In the first step it is crucial that proper statistics can be done, which requires large numbers, both of particles in the system and, in the case of the grand canonical ensemble, also of systems in the ensemble. These ensemble members represent “microstates” of the target thermodynamic system, which as a macroscopic system will traverse in the course of time very many microstates compatible with the few thermodynamic variables defining its state, such as temperature T, volume V and number of particles N (or chemical potential μ in the GC case). The large numbers give rise to statistics when, to mimic the time behavior of the real system, a distribution of these microstates over a huge ensemble of “macroscopically identical” systems would be constructed. This is the statistical mechanical device which, assuming equal a priori probabilities for the microstates, leads to the distribution (A22). As always in statistical mechanics, the constraints on total number and total energy are introduced with the Lagrange multipliers α and β that feature in eqn (A22). However, such microstates of “macroscopically identical” systems, that are traversed in the course of time, do not exist in the case of an atom as “thermodynamic system”. The “laws of large numbers” that are the basis of statistical mechanics require an enormously large number of particles N in the thermodynamic system that is the target of the statistical mechanical derivations, and for the grand canonical ensemble an enormously large number of systems in the Gibbsian ensemble. But here we have very few systems, with very low particle numbers. These particle numbers are very different from the average N̄, whereas an overwhelmingly large number of the members of a proper ensemble should have particle numbers very close to N̄. So the derivation of the probability distribution (A22) cannot be carried through for an atom.
In the second step the Lagrange multipliers α and β in (A22) should be shown to have the usual physical meanings in terms of the chemical potential μ and temperature T of the thermodynamic system that the grand canonical ensemble is to represent. That allows one to obtain the probability distribution in terms of μ and T, as in eqn (A23) (cf. (10)–(13)). To make this identification, the First Law of Thermodynamics (cf. (A7)) is invoked, see e.g. Pathria and Beale28 Ch. 4.3 or Hill,23 Ch. 3. So the system must be a thermodynamic system to which the First Law (including its ingredients such as entropy, temperature, chemical potential) is applicable. However, the few electrons in the target system, a free atom, do not constitute such a thermodynamic system to which the First Law can be applied. In particular, the properties of temperature and chemical potential (of the electrons) do not exist.
As for the chemical potential, we have noted in Section II the problems that exist with the definition μ = ∂E/∂N on account of the absence of a definition of the energy for infinitesimal increase of the particle number. The arbitrariness of the Lagrange multiplier μ in the Euler–Lagrange eqn (1) shows that the HK theorems do not provide a basis for the definition of a chemical potential for the electrons in an atom. More generally, there is a well-known small-number problem with defining the chemical potential if the particle number is not extremely large. We recall that the chemical potential is defined in thermodynamics as the partial derivatives in eqn (A9). Using F = E − TS and G = E + pV− TS one also has
(14) |
Turning then to the temperature, it is clear one cannot measure the temperature of the electrons in an atom, and neither can one establish equilibrium by establishing equality of temperature in parts of the system. It is an elementary law of thermodynamics that one should be able to measure the temperature of the experimental system. The importance of the existence and measurability of temperature as the foremost characteristic of a thermodynamic system has eventually led to its codification into the Zeroth Law of Thermodynamics, cf. Reif.26 Temperature is not an ensemble property; it should be a measurable physical property of the single system for which a representative ensemble is invoked to make deductions about its equilibrium properties. For the concept and measurement of temperature it is relevant that the thermodynamic limit for the number of particles in one system can be reached, lest the temperature remain undefined at the required level of precision. In the present case the particles are the electrons of the atom. For them the thermodynamic limit of ca. 10²³ never comes into play. We emphasize that we are not dealing with the prototypical statistical mechanical case of a gas where the particles are atoms, in which case we have very many particles in a macroscopic volume V and the temperature T is related to the occupation of the translational energy states (the kinetic energy). At higher T the occupation of electronically excited states starts to play a role. The present case is very different. We would need a common temperature that can be deduced for the electrons in energy eigenstates of an atom and its ions. We conclude that temperature is not a property of the few electrons in an atom, let alone that it could be (made) equal in all the ions in their stationary states, as implied by the probability distributions (10) and (13).
So the electrons in an atom do not constitute a thermodynamic system. Then equations (10)–(13) cannot be derived as the probability distribution over the members of a GC ensemble of thermodynamic systems. Now in spite of the fact that an atom is evidently not a thermodynamic system, it has been stated that nevertheless the electrons can derive thermodynamic properties like μ and T from contact with a reservoir. Then obviously the reservoir is assigned a different role from the usual one. In statistical mechanics a genuine macroscopic thermodynamic system which does have properties like μ and T is sometimes (thought to be) brought into contact with another huge thermodynamic system (the reservoir) which can exchange only heat with it (to establish the temperature) or both heat and particles (to establish both temperature and chemical potential). In that way μ and T of the thermodynamic system can be brought to desired values. What happens if the system is such that μ and T do not exist for the isolated system? Can these properties be conferred to it in some magical way by the contact (what kind of contact?) with the reservoir? That is not the case. There are special cases where some properties can be derived from a partition function of a small subsystem (not itself a thermodynamic system) of a large system.24,30 This may happen if a large thermodynamic system contains many small subsystems, for instance adsorption sites on a surface in equilibrium with a gas of adsorbing particles (see e.g. Hill,24 Ch. 7). Independence of the subsystems then may lead to simplification. We discuss some examples of this special case in Appendix B. The conclusion remains that the electrons in an atom do not constitute a thermodynamic system and as such do not afford the definition of thermodynamic quantities like temperature or chemical potential for such a system.
So we reject the applicability of the GC eqn (10)–(13) to atoms. We will nevertheless consider the question whether these equations, if accepted, would justify the picture of Fig. 1 and the related assumption that the chemical potential of the electrons exists and is ∂Ē/∂N̄. In the next Section (IV) we will investigate this in detail and will conclude negatively. Readers who are convinced at this point that atoms and molecules are not bona fide thermodynamic systems, and therefore eqn (10)–(13) cannot provide a justification of the existence of a chemical potential for the electrons in an atom, as an atomic property, can skip these details and move to Section V where a number of concepts in DFT are discussed in the light of the findings of Section II.
The underlying “ensemble” for eqn (10)–(13) is not a proper GC ensemble with very many members that represents at a given moment the time behavior of the thermodynamic system it represents. We can consider it as a man-made collection of ions with only very few members. μ and T in eqn (10) and (13) can be considered as just parameters with which one can regulate the fractions p(Ni,Ej(Ni)) of the various ions in this collection of (a few) ions. The averages N̄ and Ē are then functions of the parameters μ and T. For particular extreme choices of the parameters, like T → 0 or μ = ∞, one obtains particular results for the pi and the averages. These are reviewed in this section.
As a first example of the parameter tuning, ref. 20 mentions the extreme choice μ = −∞ for which all terms with Ni > 0 lead to negligible contributions for any finite T. Then only Ni = 0 survives and N̄ = 0, corresponding to the fully ionized atom. The collection of ions then has effectively one member. Another extreme parameter choice would be μ = +∞, which makes the term i with maximum number of particles Ni = Z + 1 overwhelmingly larger than any other term, hence N̄ = Z + 1, corresponding to the anion. These cases where the “ensemble” has only one member constitute extreme deviations from the statistical mechanical ensembles. By varying μ one can vary N̄ between these extremes of 0 and (Z + 1).
The parameter T in the expression (10) for the fractions of ions in the collection occurs in the denominator of the arguments of exponential functions. This causes the T → 0 limit to have the extreme effect of blowing up the arguments. As a result just one term in the sum (11) for N̄ is overwhelmingly larger than all other ones, namely the one with the largest (positive or least negative) argument. N̄ will be equal to the Ni of the dominating term. So in the limit T → 0, N̄ will exhibit step behavior as a function of μ, making a jump when μ crosses a boundary to an interval with another dominating term. Fig. 2 gives a picture of the energies of the various ions, both the ground states E0(Z′), as well as the possibly included excited states Ej(Z′), with the maximum included excited state energy for Z′ denoted Emax(Z′). (Here and in the sequel we denote with Z′ any of the integer values from Z + 1 to 0.) The largest exponent of the terms for Z′ particles is the one involving the ground state energy, since always Z′μ − E0(Z′) > Z′μ − Ej(Z′) for j > 0. The threshold where N̄ switches from Z′ − 1 to Z′ is for the μ making this largest term for Z′ larger than the largest of the (Z′ − 1) terms:
Z′μ − E0(Z′) > (Z′ − 1)μ − E0(Z′ − 1)  ⇒  μ > −(E0(Z′ − 1) − E0(Z′)) = −I(Z′)   (15)
μ > − I(Z′ + 1) | (16) |
At fixed T the average N̄ only depends on the μ parameter. The blue dashed curve in Fig. 3 depicts the steps in the function N̄(μ) at small T, which are well known.18,19 In the limit T → 0 the straight-line steps are approached. It is easy to deduce from eqn (11) that the width of the jumps at the μ = −I(Z′) ordinates of Fig. 3 is of the order kT: for μ = −I(Z′) + δ, N̄ goes to Z′ at δ ≫ kT and to Z′ − 1 at δ ≪ −kT. At exactly μ = −I(Z′) the exponential terms for (Z′ − 1) and Z′ contribute equally and one has N̄ = Z′ − (1/2) (see small black dots in Fig. 3). This equality is not mathematically exact, since other exponential terms in (11) will still contribute, even if by exceedingly small amounts. On the intervals −I(Z′) < μ < −I(Z′ + 1), N̄ becomes constant in the limit T → 0. For the midpoints μ = −[I(Z′) + I(Z′ + 1)]/2 one can analytically derive that the N̄ values are practically independent of T, as is also intuitively obvious (see heavy black dots in Fig. 3). For the point μ = −(I + A)/2 one has N̄ = Z. This has led to the suggestion that the chemical potential of the electrons in the Z-electron (neutral) atom would be −(I + A)/2, as noted above.
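The step behavior of N̄(μ) described above can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: the fractions are taken to be proportional to exp[(Nμ − E0(N))/kT], the form implied by the exponents in eqn (15); only ground states are included; and the energies E0(N) and the function names are hypothetical, chosen so that I(1) = 2.0, I(2) = 0.5 and I(3) = 0.05.

```python
import math

def ion_fractions(mu, kT, ground_energies):
    """Fractions p(N) proportional to exp[(N*mu - E0(N))/kT].
    ASSUMPTION: ground states only; ground_energies[N] holds E0(N)."""
    exponents = [(n * mu - e) / kT for n, e in enumerate(ground_energies)]
    shift = max(exponents)                  # subtract the maximum to avoid overflow
    weights = [math.exp(x - shift) for x in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def average_number(mu, kT, ground_energies):
    """Average particle number: sum over N of N * p(N)."""
    fractions = ion_fractions(mu, kT, ground_energies)
    return sum(n * p for n, p in enumerate(fractions))

# Hypothetical E0(N) for N = 0..3 (arbitrary units).
E0 = [0.0, -2.0, -2.5, -2.55]
kT = 0.001

n_at_threshold = average_number(-0.5, kT, E0)          # mu = -I(2)
n_at_midpoint = average_number(-0.275, kT, E0)         # mu = -[I(2) + I(3)]/2
n_above = average_number(-0.5 + 10 * kT, kT, E0)       # delta = +10 kT
n_below = average_number(-0.5 - 10 * kT, kT, E0)       # delta = -10 kT
print(n_below, n_at_threshold, n_above, n_at_midpoint)
```

With these numbers the average is close to 1.5 at the threshold and close to 2 at the midpoint, and it moves from about 1 to about 2 within a window of a few kT around μ = −I(2), in line with the width estimate given in the text.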
The function Ē(μ) (12) has similar staircase behavior as the function N̄(μ), with steps at the μ = −I(Z′) values that cause a switch of the dominant exponential term in (12), see Fig. 4. There are some differences of detail. For instance at the midpoints μ = −(1/2)(I(Z′) + I(Z′ + 1)) the value of Ē is not practically independent of T (no heavy black dots) but the limiting E0(Z′) value is approached from above when T → 0. In the limit T → 0 the steps become sharp, similar to the situation for N̄(μ).
From the figures and data for (μ) and Ē(μ) one may deduce the behavior of Ē as a function of , Ē(). If (μ) is invertible so that μ() is defined, Ē() can be obtained. As long as T is still finite the situation is as sketched with the blue curves, so μ() is defined. It can be derived that Ē() becomes a straight line with slope −I(Z′) on the interval Z′ − 1 < < Z′ (μ in the neighborhood of the −I(Z′) point). Considering next the −I(Z′) < μ < −I(Z′ + 1) interval, it is clear one has ≈ Z′ and at the same time Ē ≈ E0(Z′). The limit T → 0, where the steps become sharp, has to be carefully considered. The whole −I(Z′) < μ < −I(Z′ + 1) interval leads to the single point = Z′, and the single value Ē(Z′) ≈ E0(Z′). At the point μ = −I(Z′) is in the range (Z′ − 1,Z′), but undetermined, and also Ē becomes undetermined on the range (E0(Z′),E0(Z′ − 1)). This is depicted in Fig. 5 by the dashed lines on the (Z′ − 1,Z′) interval. It should be noted that the lines disappear for T → 0. Machine calculations cannot capture this behavior because of the limited precision, but they exhibit breakdown first in the neighborhood of the midpoints = Z′ − 1/2, which is understandable in view of the (μ) and Ē(μ) curves being steepest there (both derivatives,∂Ē/∂μ and ∂/∂μ are going to ∞ there and their ratio will be undetermined.). In Fig. 5 we indicate the growing regions of indeterminacy of Ē(). At small but finite T the function Ē() persists at ≈ Z′, going smoothly from E0(Z′) + δ to E0(Z′) − δ and the derivative ∂Ē/∂ smoothly changing from −I(Z′) to −I(Z′ + 1), see inset of Fig. 5. However, in the limit T → 0 the whole Ē() “curve” collapses to just the points (Z′,E0(Z′)), which is indicated with arrows on the dashed lines. It is not surprising that eqn (10) leads to this collapse at T → 0: In that limit the dominance of just one term in (10) becomes absolute, so (depending on the μ interval) only one ion ( = Z′, Ē = E0(Z′)) has fraction 1.0. 
The jumps from one dominating term to the next become discontinuous.
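The staircase behavior described above can be illustrated numerically. The following is only a minimal sketch: it assumes that the weights of eqn (10) have the usual grand-canonical-like form exp(−(E0(Z′) − μZ′)/kT), and the energies in `E0` are hypothetical values in arbitrary units (chosen convex in Z′), not data for any real atom. The function names `weights`, `n_bar` and `e_bar` are likewise illustrative.

```python
import math

# Hypothetical ground state energies E0(Z') for 0..3 electrons (arbitrary units).
# They are convex in Z', so that I(1) = 10 > I(2) = 4 > I(3) = 1.
E0 = {0: 0.0, 1: -10.0, 2: -14.0, 3: -15.0}


def weights(mu, kT):
    """Normalized weights, assumed proportional to exp(-(E0(Z') - mu*Z')/kT)."""
    exponents = {z: -(e - mu * z) / kT for z, e in E0.items()}
    shift = max(exponents.values())  # subtract the maximum to avoid overflow
    raw = {z: math.exp(x - shift) for z, x in exponents.items()}
    norm = sum(raw.values())
    return {z: r / norm for z, r in raw.items()}


def n_bar(mu, kT):
    """Average electron number for given mu and kT."""
    return sum(z * w for z, w in weights(mu, kT).items())


def e_bar(mu, kT):
    """Average energy for given mu and kT."""
    return sum(E0[z] * w for z, w in weights(mu, kT).items())


if __name__ == "__main__":
    # Steps are expected at mu = -I(Z'), i.e. at -10, -4 and -1 for these energies.
    for kT in (1.0, 0.2, 0.05):
        row = [f"{n_bar(mu, kT):.3f}" for mu in (-10.0, -7.0, -4.0, -2.5, -1.0)]
        print(f"kT = {kT}: N_bar at mu = -10, -7, -4, -2.5, -1 ->", row)
```

With these assumed numbers, lowering kT sharpens the steps at μ = −I(Z′), while at a midpoint such as μ = −7 the average number stays at N̄ ≈ 1 and Ē approaches E0(1) from above.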
For finite T the Ē(N̄) picture of Fig. 5 has some resemblance to Fig. 1. However, the latter is for T = 0 (or rather is temperature-less) while Fig. 5 becomes very dissimilar to Fig. 1 in the limit T → 0, reducing to just single points. The straight lines are then nonexistent.
The “ensemble” on which eqn (10)–(12) are based7,18,19 is not a proper Gibbsian ensemble. Such an ensemble has very many members, which all have the same μ and almost all a particle number N that is close to the ensemble average N̄. It represents the time behavior of a target thermodynamic system. Here there is not a bona fide target thermodynamic system, with very many particles that allow definition of the chemical potential μ and temperature T. The “ensemble” here has few members, with particle numbers rather different from the average and with so few particles that T and μ are not defined. So the target thermodynamic system that the Gibbsian ensemble aims to represent is actually nonexistent in this case. The application of thermodynamic relations therefore has to be viewed with reservation. This has nevertheless been undertaken18,19 and is briefly discussed here. The denominator of eqn (10) is treated as the grand canonical partition function ZGC, although it does not qualify as such because eqn (10) cannot be derived for an atom (the Lagrange multipliers α and β of Appendix A cannot be written in terms of μ and T since these do not exist for the electrons in an atom). The usual thermodynamic relations for the internal energy E and the Helmholtz free energy F have nevertheless been applied to such small electronic systems,
$$E = TS - pV + \mu N, \qquad F = E - TS = -pV + \mu N = \mu N - kT\ln Z_{GC} \tag{17}$$
(18) |
(19) |
As stated earlier, the present (pseudo-)thermodynamic discussion based on the “ensemble” eqn (10)–(13) cannot be expected to shed light on the (temperatureless, quantum mechanical) Fig. 1 and its implications. The present finding that at T = 0 there is nothing but the E0(Z′) energies is wholly satisfactory. It does not support the concept of atoms with noninteger number of electrons and an energy that is not an eigenvalue of the Hamiltonian but somewhere in between. The opinion that this is a meaningful concept appears to have settled in the DFT community and has given rise to the frequent reference to fractional electron systems, with apparently the feeling that the statistical mechanical theory of grand canonical ensembles would condone such a concept.
For completeness we mention that there are also true, physical, steps in the KS potential for integer electron systems. “Physical” then means: required in the potential of a noninteracting particle system in order to endow it with a density equal to the one of the interacting electron system. A very well known step is the one occurring between two atoms A and B (or larger fragments), each with a single valence electron, so that they can form a covalent bond. At large distance the single electron level at atom B at the lower energy (−IB) has to move up to the higher level at −IA in order to form a doubly occupied orbital with 50–50 mixing, so that the atoms will each get the correct amount of one electronic charge density. The KS potential therefore must form a plateau over the region of atom B of height IB − IA. This leads to a step of this height in between the atoms, as was recognized by Almbladh and von Barth31 and Perdew.18 This qualitative argument is confirmed by an analysis of the exact KS potential. It can be shown that the so-called response part of the KS potential generates the plateau mentioned above, with exactly the height IB − IA.32 This behavior is directly related to the conditional amplitude, i.e. it is a direct consequence of the (strong) left-right correlation in a (weak) covalent bond. It has been studied for model systems by Maitra et al.,33 also for the TDDFT case,34,35 and for the case of strongly correlated systems by Giarrusso et al.36 The response part of the KS potential also has step behavior when going in an atom from one shell to the next.37–39
However, the PPLB picture neither explains the band gap problem nor solves it. What has caught the attention is that in the PPLB picture the KS potential for their density (1 − ω)ρ0(N) + ωρ0(N + 1) has, for any ω > 0, however small, to jump up from the one for the ρ0(N) shape by a constant Δ = −A − εL(N) over the molecular (or crystal) region (but not in the asymptotic limit), see end of Section II. This jump is required in order to move εL up to −A so that the ω electron density in the LUMO orbital will have the proper asymptotic decay (that of the ρ0(N + 1) charge density). Of course −A − εL(N) is just the band gap “deficit”. The deficit of the KS orbital energy gap, or band gap, εL(N) − εH(N), i.e. the difference between it and the fundamental gap I − A, and the jump of the potential to which it is equal, are always called derivative discontinuity (DD) in solid state physics. (Then of course the DD (i.e. Δ) is a different quantity from the discontinuity in the derivative of the energy, which is I − A.)
But there is a big if: if εL(N) is below −A the band gap problem exists and the jump of the KS potential has to occur if one wants to build up the ρN+ω density by an admixture of some (N + 1)-electron density ρ0(N + 1) to the N-electron density: ρN+ω = (1 − ω)ρ0(N) + ωρ0(N + 1). But PPLB do not predict that εL(N) is not equal to −A and do not give an estimate of the magnitude of the discrepancy and the necessary potential step. In order to understand the band gap problem one has to understand why the KS potential leads to a LUMO level that is below −A. That understanding does not follow from ref. 7 or ref. 16. It should follow from an understanding of the physics of the KS electrons, i.e. from the nature of the KS potential and the one-electron energies that follow from it. It is indeed perfectly understandable why the exact KS potential leads to a LUMO level that is below −A, see the arguments in ref. 15, notably its Fig. 3 and 4. In short, the exact KS potential incorporates the attractive potential of the full exchange-correlation hole also for the virtual orbitals, which is not the case in the Hartree–Fock model, which does have ε^HF_L(N) = −A^HF (which is ≈ −A, although in poor frozen orbital approximation). Actually, understanding the origin of the difference −A − εL(N) from the nature of the KS potential leads to a correction that can easily be calculated for solids (extended systems). It can be proven43 that the correction −A − εL(N) is in a macroscopic solid equal to the expectation value for the LUMO orbital (the state at the bottom of the conduction band) of the response part of the KS potential. The latter can be reasonably well approximated by the expression of ref. 39 (GLLB), explaining the success of Kuisma et al.44 and others45–48 with this correction.
PPLB have straight-line energies that have −A as derivative at the N + ω side. So A occurs in their picture. This does not explain why εL(N) < −A, and neither does it give a strategy for the calculation of the discrepancy. One would still have to calculate or approximate A in order to know the discrepancy (the DD), i.e. to know −A − εL(N). So one has to calculate the fundamental gap (I − A) in order to obtain the magnitude of the potential step of PPLB. There is not an independent way of establishing the derivative discontinuity DD and from there obtain the correction −A − εL(N).
Such a calculation of A (and I) is actually quite feasible. It has been pointed out by Görling and coworkers49,50 that it is possible to calculate the total energy differences for ionization from a periodic crystal (I) or addition of an electron to a crystal (A) from series of total energy calculations with standard band structure codes. The proposed procedure has been illustrated and applied by Tran et al.51 It has also been used to confirm52 that for those approximations (the LDA, GGA and meta-GGA functionals) for which the Slater relation ∂E^appr/∂ni = εi holds, the orbital energy gap ε^appr_L(N) − ε^appr_H(N) is equal to the total energy based fundamental gap I^appr − A^appr (which is not the case for exact KS, for which indeed Slater's relation (often called Janak's theorem) does not hold). The equality ε^appr_L(N) − ε^appr_H(N) = I^appr − A^appr does not solve the “band gap problem” since I^appr − A^appr, and therefore ε^appr_L(N) − ε^appr_H(N), is wrong (very different from the exact I − A) for the LDA and GGA functionals. The error is due to the error of these approximations for the total energy of delocalized ion states, see ref. 9 for detailed discussion.
The terminology “open system” and “fluctuating particle (electron) number” has made its way into the density functional literature, but then not regarding thermodynamic systems, but mostly referring to the electrons in an atom. The atom is not a thermodynamic system, and the fluctuation must be of a very different type than the phenomenon treated in statistical mechanics. Usually interaction with an “environment” is held responsible for the fluctuation. The environment is typically just the other atoms in a molecule, or a solid surface to which the atom may be bound. We wish to stress that in the ground state of such a system (or any energy eigenstate) we are not dealing with any fluctuation phenomenon. The electron density surrounding (the nucleus of) an atom is stationary in the ground state or an excited state. This also holds when the atom is only very weakly interacting with the rest of the system, be it the remainder of the molecule from which it dissociates, or the solid surface from which it detaches. The electrons in such an atom are not like the particles of a thermodynamical system for which the phenomenon of (energy or particle) fluctuation is well studied, cf. ref. 23 Ch. 3, or ref. 28 Ch. 3.6 and 4.5. These remarks pertain to the stationary states, the eigenstates of the Hamiltonian. At elevated temperatures we need to consider a Boltzmann distribution over the states. This does not alter this statement on the lack of fluctuation in an energy eigenstate.
The prototypical example is H2+, but other well known examples are He2+, [H2O–H2O]+ etc. (for simplicity we take identical fragments as example). The poor behavior of the LDA and GGA functionals ((semi-)local DFAs in general) in such cases was well known from the treatment of ionization from equivalent sites in a molecule,53 e.g. 1s core holes in homonuclear diatomics like He2, N2 or C2H4 and subvalence ligand levels in TM complexes like Cr(CO)6. It has for instance been highlighted in 1982 by Noodleman et al.53 for N2+ and Hen+, in 1997 by Bally and Sastry54 for H2+ and He2+, in 1999 by Sodupe et al.55 for [H2O–H2O]+ and in 2008 by Cohen, Mori-Sánchez and Yang56 for H2+. The root cause of the problem is that the local approximation is applied in situations where non-locality is essential. For a system with a noninteger electron number on separated (noninteracting) fragments (for instance two (1/2) electron charges on individual H's for long distance H2+), the local approximation causes the functional to be effectively evaluated for each fragment, i.e. for a noninteger electron number. But for such systems the HK functional is not even defined. Of course if the local functional would yield for each (1/2) electron density just half of the required H atom energy, the total energy would still be correct. But it has been clear53,55 that the local approximations, which all have a basic LDA exchange ingredient of ρ(r)^(4/3) in the xc energy density, do for that reason not exhibit the right scaling behavior for the correct total energy to result. If we extend the example to n noninteracting fragments of N electrons each, with a surplus 1 electron that will be distributed over the n sites, it is clear that a local functional will have to deal with n fractional electron charges of N + 1/n each. 
It has been observed57,58 that a local functional yields the right energy for any n if the scaling of the local energy density would be perfectly linear between N and N + 1 electrons on a fragment. This is not a proof that the behavior of the straight-lines picture of Fig. 1 is correct physics. It is simply making the local approximation work in this special case, where in fact the local approximation is not warranted, being applied to a case where nonlocal effects are vital (because the fragment systems are entangled), see ref. 5 for further discussion. Applying an exchange-correlation functional to a noninteger electron system means that the functional is applied outside the domain of densities on which the HK functional has been defined.
Let us consider a small increase δN = 1/n of the electron number on a fragment over the integer number N (e.g. when the number n of noninteracting fragments in the example above is large). A local functional will derive the energy of a fragment from the local (N + 1/n) electron number and the energy increase according to the straight-lines picture would be
(20) |
(21) |
$$\partial E^{DFA}/\partial n_L = \varepsilon_L = -A \tag{22}$$
(23) |
In the first place an equi-ensemble of the ground state and an excited state has been introduced as a means of obtaining the excitation energy by Theophilou.65 This has been extended by Gross, Oliveira and Kohn to ensembles with more excited states.66–68 This ensemble approach for excitation energies is currently receiving considerable interest.69–72 We emphasize that this type of “ensemble” is something very different from statistical mechanical ensembles, like the “canonical ensemble” and “grand canonical ensemble”. There is no connection with thermodynamics; in this application the density matrix and ensemble are just elements of the edifice of quantum mechanics. (The names “density matrix” and “ensemble” have of course originated from the link with statistical mechanics,73–75 but the concepts are now independent of this context.) There is no theoretical problem with the ensemble approach to excitation energies. It is interesting to observe that Levy has used this ensemble formulation for excitation energies to investigate the relation between excitation energies and Kohn–Sham orbital energies.76 The first excitation energy, for instance, is not equal to the KS orbital energy difference between HOMO and LUMO, εL − εH, for an N-electron system. The LUMO must be raised by a constant
(24) |
A second occurrence of ensembles in DFT is in the case of non-pure-state v-representable ground state densities. Levy77 and Lieb14 proved that in case of degenerate ground states some ground state densities are only ensemble v-representable. This has acquired some practical importance when it was discovered that there are cases where the density of a nondegenerate ground state wavefunction can only be represented by an ensemble density of a degenerate KS ground state.78–80 This appears to be connected to strong (nondynamical) correlation. The strong mixing in that case of a few electron configurations in the wavefunction then leads in the KS system to the description of the density by an ensemble of KS states representing the mixing electron configurations. In that case the KS states (determinants) are degenerate, the HOMO being degenerate. The KS ensemble is then an example of Levy and Lieb's ground state ensemble, but now for the KS noninteracting electron system.
The practical relevance of this type of ensemble has been demonstrated in a series of papers by Filatov (see review ref. 81) who developed the spin-restricted ensemble-referenced Kohn–Sham methods (REKS) precisely for the cases where the density is no longer pure-state v_s-representable in the Kohn–Sham system but is only ensemble v_s-representable. This applies to many cases including diradicaloids and excited states (conical intersections).82–84
It is unfortunate that the terminology “ensemble DFT” is now gaining traction, comprising on the one hand the PPLB approach with its derivative discontinuity and on the other hand the ensemble approaches for excitation energies and for non-pure-state representable Kohn–Sham cases. We stress that these are very different theoretical constructs.
The heart of the problem with ∂E/∂N is that no physical meaning can be given to systems with a fractional number of electrons, such as (N + ω). The linear energy picture of Fig. 1 must likewise be denied physical meaning, other than that N̄ is the average electron number of two states with different electron numbers, which constitute an ensemble with mixture parameter ω, and ĒPPLB is the average energy. Such mixtures have also been considered with probabilities patterned after those of the grand canonical ensemble of statistical mechanics,7,20 see Section III. In that case the behavior of Fig. 5 is obtained, with collapse of the Ē(N̄) curve to just the points (Z′,E0(Z′)). Neither the straight-lines picture of Fig. 1 nor the dashed lines of Fig. 5 represent physical behavior of an atom or molecule.
We have been discussing the Euler–Lagrange eqn (1) and other issues which pertain to the theory (DFT, i.e. quantum mechanics) of electrons in atoms and molecules, where temperature does not play a role. Elevated temperature effects can of course be described with statistical mechanics. For instance, at (very) high temperature a macroscopic gas of H atoms may exhibit ionization, meaning that an equilibrium is established in the gas between H atoms, free electrons, and H+ ions, see ref. 24, 26, 27 and Appendix B. In this case we are dealing with thermodynamic systems, in principle macroscopic with a defined pressure for the gases of each type of particle (atoms, ions, electrons). These are traditional systems for the application of thermodynamics and statistical mechanics. Ignoring the population of electronically excited states (which however will be important at such high temperatures that ionization becomes measurable), one could describe the fraction of H+ ions by way of fractional occupation of the 1s orbital. This does not mean of course that any H atom/ion would exist in the gas with a fractional number of electrons. Just the average number of electrons on an H becomes fractional.
Fractional occupations are also well known as the Fermi-Dirac distribution of electrons over single particle states in free-electron models of an electron gas at elevated temperatures, cf. eqn (A19). This is again just a way of describing the distribution of such a system over ground state and excited states at nonzero T, see ref. 85, Ch. 2. An electron gas in a potential with a defined μ (so no band gap) is a model for metals. The generalization of DFT to include temperature dependence for such a system was proposed long ago by Mermin.86 For a gas of electrons moving in an external potential at T ≠ 0 he established the one-to-one correspondence of the external potential and the density. Mermin uses the fact that his system of electrons has a defined temperature T and chemical potential μ, signalling that we are dealing with a thermodynamic system. Not only DFT, also other electronic structure theories may be extended to incorporate temperature effects in extended electronic systems where T and μ are defined quantities. The pioneering work in 1963 of Mermin on Hartree–Fock for the electrons at finite T87 should be mentioned as well as the recent upsurge of interest, for instance the work by Hirata and coworkers on one-dimensional solids at finite temperature, with both Hartree–Fock approximation and various correlated methods.88,89 See also recent work by Harsha et al.90,91 and White and Chan.92 Finite temperature effects in extended systems with defined T and μ are of course well known in many-body (perturbation) theories.93–96 This does not imply that thermodynamic properties would exist for a small finite-electron system like an atom or molecule.
Finally we have noted that “ensemble DFT” is not a good common denominator for on the one hand for instance the T-GOK ensemble approach to excitation energy calculations,65–72,76 and/or the occurrence of ensembles to describe densities of degenerate ground states,14,77,78,80 as employed in the REKS method,81,84 and on the other hand the use of ensembles of ground states of different electron number.7 While there is no objection to the former, we have warned against the pitfalls that open up when unwarranted conclusions are drawn from behavior of the latter.
One important characteristic of a thermodynamic system is that it has to be macroscopic for the following reasons. For some concepts or derivations it is necessary that the thermodynamic limit can be reached, meaning that the particle number can be increased to, say, Avogadro's number (ca. 10²³), keeping the density N/V constant. It is also necessary that the temperature can be measured and that equilibrium exists in the sense that the temperature will be the same in different parts of the system. Statements about the existence and measurement of a physical attribute called “temperature” as well as its transitivity and its role in defining equilibrium, feature in the literature as the Zeroth Law of Thermodynamics.97 The basis of the statistical mechanical derivation of the properties of a thermodynamic system is the realisation that the system, for which the macroscopic state is described with a few macroscopic variables (e.g. N, V, T), will in the course of time traverse an exceedingly large number of microstates which are all compatible with the macroscopic state but which differ in the states of the large number of particles comprising the system. When the movements of the particles are described classically this is simple: the position and momentum coordinates for all particles define a point (q1…qN,p1…pN) ≡ (q, p) in phase space which travels along some path in phase space due to the constant changes in position and in momentum (e.g. due to collisions). When the particles are described quantum mechanically the assumption of constantly changing microstates is a bit more subtle. It would not be compatible with this fundamental viewpoint of statistical mechanics to assume that the total macroscopic system can be in an energy eigenstate and therefore be stationary, not subject to change. 
Detailed arguments why this cannot be the case can be found in the cited literature, notably Tolman21 who stresses that the principle of detailed balance also applies to a macroscopic system in quantum mechanics, and e.g. Hill23 and Landau and Lifshitz.27 Landau and Lifshitz27 summarize this in the statement that it is impossible for a macroscopic system to be in an energy eigenstate due to the unavoidable disturbance by interaction with the outside world and the internal disturbances by density fluctuation and other perturbations. So it is generally accepted that also when quantum mechanics is applied the same assumption holds that the system traverses in the course of time the microstates compatible with the thermodynamic state of the system.
Since the calculation of the time-dependent behavior of the system is out of the question, at least before computer simulations came around, a so-called representative ensemble is formed. The ensemble consists of very many mental copies of the system, each presenting the system in a particular microstate. Then the basic postulate of statistical mechanics asserts that there is no a priori bias in the probability that the system be in some microstate: all microstates (or all points in phase space) compatible with the macroscopic state variables, are equally probable. The impossible task of calculating a desired property as a time-average of the system is now replaced by an ensemble average. Given the equal probabilities for all microstates, this amounts basically, for a wanted property, to finding the probability that a microstate has a certain value for the desired property. Then an average over all the microstates can be taken. If the treatment is quantum mechanical, the notion that macroscopic systems cannot be in a stationary state, does not preclude the use of quantum mechanical states – either energy eigenstates or some set of other states compatible with the thermodynamic variables – as the microstates of the systems in the ensemble, if only their number is large enough and representative of the thermodynamic system.
The simplest example is the case of an assembly of N independent classical particles within a (macroscopic) volume V with total energy E or within a narrow energy range (E − ΔE, E + ΔE). The independent particles will have individual energies {εi}. If there are ni particles having energy εi the constraints of fixed particle number N and energy E can be written
(A1) |
(A2) |
$$S = k\ln\Omega \tag{A3}$$
The well known results of statistical mechanics assert that, as a consequence of the huge number of particles and very large Ω({ni}) of thermodynamic systems, and the fact that it is the logarithm of Ω that enters the entropy, all terms Ω({ni}) other than the maximum one make a negligible contribution to the entropy k ln Ω.
When the total number of particles N is very large, and the occupations {ni} as well, so that the Stirling approximations ln N! = N ln N − N and ln ni! = ni ln ni − ni can be used, ln Ω({ni}) reduces to
(A4) |
(A5) |
(A6) |
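The reliance of this derivation on very large N and {ni} can be made concrete by checking the Stirling approximation numerically. The sketch below is only an illustration: it compares ln N! (evaluated through the log-gamma function) with N ln N − N, and the function names are hypothetical helpers introduced here.

```python
import math


def ln_factorial_exact(n):
    """ln(n!) evaluated via the log-gamma function, ln Gamma(n + 1)."""
    return math.lgamma(n + 1)


def ln_factorial_stirling(n):
    """Leading-order Stirling approximation, n ln n - n."""
    return n * math.log(n) - n


def relative_error(n):
    """Relative deviation of the Stirling form from ln(n!)."""
    exact = ln_factorial_exact(n)
    return abs(exact - ln_factorial_stirling(n)) / exact


if __name__ == "__main__":
    for n in (10, 100, 10**4, 10**6):
        print(f"N = {n:>8}: relative error = {relative_error(n):.2e}")
```

For particle numbers of the order of those of the electrons in an atom the relative error is far from negligible, whereas it becomes very small as N grows.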
Thermodynamics has provided a framework for macroscopic systems of particles, with the introduction of quantities such as the (internal) energy (E), temperature (T), entropy (S), volume (V), chemical potential (μ) and derived quantities such as Helmholtz free energy F = E − TS, enthalpy H = E + pV and Gibbs free energy G = H − TS. A fundamental relation is (cf. the first law of thermodynamics)
$$dE = T\,dS - p\,dV + \mu\,dN \tag{A7}$$
Now from eqn (A7) several relations follow, for instance
(A8) |
and
(A9) |
(A10) |
(A11) |
(A12) |
To determine α we use the expression of eqn (A6); then (A9), together of course with S = k ln Ω (A3), gives the result
(A13) |
In connection with this thermodynamic interpretation of α and β a short remark on the concept of equilibrium is in order.21,25,28 Let us consider two systems A1 and A2 with macrostates (N1,V1,E1) and (N2,V2,E2) respectively. The corresponding numbers of microstates are Ω1(N1,V1,E1) and Ω2(N2,V2,E2) and the total energy is Etotal = E1 + E2. If we bring these systems into contact, so that energy can be exchanged but not particles (so they are separated by a heat conducting wall through which particles cannot pass), then at any time t the total system will have a number of microstates that depends on the energies at that moment
$$\Omega_{total} = \Omega_1(E_1)\,\Omega_2(E_2) = \Omega_1(E_1)\,\Omega_2(E_{total} - E_1) \tag{A14}$$
(A15) |
(A16) |
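The maximization of Ωtotal over the energy partition can be sketched numerically. The example below is an illustration under a stated assumption, not part of the derivation: it takes ln Ωi = (3Ni/2) ln Ei (up to a constant), the large-N form for a classical ideal gas, and locates the most probable E1 on a grid. The particle numbers and the total energy are arbitrary illustrative values.

```python
import math


def ln_omega(energy, n_particles):
    """Assumed ideal-gas form, ln Omega = (3N/2) ln E + const (constant dropped)."""
    return 1.5 * n_particles * math.log(energy)


def beta(energy, n_particles):
    """Derivative d ln Omega / dE for the assumed form."""
    return 1.5 * n_particles / energy


def most_probable_split(n1, n2, e_total, grid=40000):
    """Grid search for the E1 that maximizes ln Omega1(E1) + ln Omega2(E_total - E1)."""
    best_e1, best_val = None, -math.inf
    for i in range(1, grid):
        e1 = e_total * i / grid
        val = ln_omega(e1, n1) + ln_omega(e_total - e1, n2)
        if val > best_val:
            best_e1, best_val = e1, val
    return best_e1


if __name__ == "__main__":
    n1, n2, e_total = 100, 300, 4.0
    e1 = most_probable_split(n1, n2, e_total)
    print("most probable E1:", e1)
    print("beta1, beta2:", beta(e1, n1), beta(e_total - e1, n2))
```

At the maximum the two derivatives ∂ln Ω1/∂E1 and ∂ln Ω2/∂E2 coincide, which is the statistical counterpart of equal temperatures in the two systems.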
At this point we stress that the derivation of the occupation number distribution (A12) hinges on two conditions: (a) the statistical mechanical derivation requires primarily that huge numbers of particles are involved; (b) the introduction of physical meaning (temperature, chemical potential) for the constants in the statistical expressions requires that the target system to which the statistics is applied is a bona fide thermodynamic system, i.e. the system is macroscopic (in the order of 10²³ particles) and in equilibrium, with a uniform temperature and chemical potential.
So it is essential that we are dealing with a very large total particle number, but other aspects of our example above are not important. For instance, for the more relevant case (even classically) of indistinguishable particles, as in the ideal gas of noninteracting particles, Ω({ni}) has to be divided by N!,
(A17) |
(A18) |
(A19) |
The earlier discussion has used the microcanonical ensemble. Sometimes, mostly for calculational expedience, it is easier to use another type of ensemble, the canonical and grand canonical ensembles being best known. In these cases the constraints on the total energy of the thermodynamic system (canonical ensemble) or on both the number of particles and the total energy (grand canonical ensemble) are no longer maintained. This does not prohibit the applicability of the results to the thermodynamic system, even if that still has fixed particle number and energy. If the average energy, or both the average energy (Ē) and particle number (N̄), are equal to the corresponding quantities in the thermodynamic system, the results are applicable since the deviations from the average have virtually no weight.
In connection with the main text our interest is in the grand canonical ensemble.23–28 We note that in that case only the total number of particles in the whole ensemble of members and the total energy of the ensemble are fixed; in terms of the averages per system these are just the number of ensemble members times N̄ and times Ē, respectively. Let ni,s denote the number of systems that have at any time t Ni particles and energy Es. So we have the relations
(A20) |
(A21) |
(A22) |
This ensemble is a mental construct that serves to obtain the time-average of quantities by averaging over the members of the ensemble. In order to establish that it is representative of a thermodynamic system, with number of particles N equal to the ensemble average N̄ and energy E equal to the ensemble average Ē, we have to give the Lagrange multipliers α and β of eqn (A22) a physical meaning. The bridge from statistical mechanical quantities to thermodynamic ones is again the First Law (A7). Expressing α and β in terms of thermodynamic properties of the target system is more involved than in the simple case of the microcanonical ensemble above, see e.g. ref. 28, Ch. 4.3. But if the system which we represent with a grand canonical ensemble is a bona fide equilibrium thermodynamic system, with a temperature T and with a chemical potential μ for the particles, to which the First Law applies, one finds again the meanings α = −μ/kT (A13) and β = 1/kT (A11). As for any thermodynamic system, the number of particles N must be very large in order to have well defined μ and T. We thus arrive at the well known expression for the distribution of the members of the grand canonical ensemble that is representative of a (μ, V, T) thermodynamic system,
(A23) |
Summarizing, eqn (A23) is valid for an ensemble with very many members and for a target thermodynamic system in equilibrium at a temperature T with very many particles at chemical potential μ. The ensembles discussed in the main text, with a few members which consist of the electrons in an atom or molecule in specific energy eigenstates, fall far short of the requirements to qualify as grand canonical ensembles: the number of ensemble members should be very large to enable the statistical derivation of (A22) and the numbers of particles (electrons) in each system should be very large in order for them to constitute a thermodynamic system in equilibrium and provide (A23).
Let us stress that the GC ensemble is a (μ, V, T) ensemble: the members of the ensemble should be characterized by a (common) chemical potential μ and temperature T, which is thought to be effected by embedding in a huge reservoir (which may or may not be formed by all the other members). The summation in the grand partition function extends in principle over all particle numbers, and the associated energy levels, and the constancy of μ and T might seem a bit problematic at the very low particle numbers. However, this is a moot point, the probability distribution (A22) peaks extremely at the particle number and energy of the actual thermodynamic system and the few terms at low particle number have essentially zero contribution.
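How sharply the particle-number distribution peaks can be illustrated with a simple case. The sketch below assumes a classical ideal gas, for which the grand canonical particle-number distribution is a Poisson distribution; this is an illustrative special case and not the general expression (A22). The function name and the ±1% window are hypothetical choices made for the example.

```python
import math


def poisson_mass_near_mean(mean, rel_window):
    """Total probability of particle numbers within +/- rel_window of the mean,
    for a Poisson distribution (assumed ideal-gas grand canonical case)."""
    lo = math.ceil(mean * (1.0 - rel_window))
    hi = math.floor(mean * (1.0 + rel_window))
    log_mean = math.log(mean)
    # Each term is exp(k ln(mean) - mean - ln k!), evaluated in log form for stability.
    return sum(
        math.exp(k * log_mean - mean - math.lgamma(k + 1))
        for k in range(lo, hi + 1)
    )


if __name__ == "__main__":
    for mean in (10, 10**4, 10**6):
        mass = poisson_mass_near_mean(mean, 0.01)
        print(f"mean N = {mean:>8}: probability within 1% of the mean = {mass:.4f}")
```

For an average of ten particles only a small part of the probability lies within 1% of the mean, while for a million particles essentially all of it does; for macroscopic particle numbers the peaking is far more extreme still.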
The role of the reservoir to establish temperature and chemical potential is unproblematic when the system itself is a thermodynamic system in equilibrium (both within itself and with the reservoir) because then the chemical potential and temperature of the particles in the system are unambiguously defined. But what about a system that is so small that it does not qualify as thermodynamic system, and μ and T cannot be defined for the isolated small system? That this is possible is the underlying assumption when one imagines an atom to be “in contact” with a reservoir, from which the electrons in the atom are supposed to derive a chemical potential and temperature.
This is a subtle issue. In what sense such thermodynamic attributes can be associated with small (non-thermodynamic) systems may be elucidated with two examples that are discussed below.
(B1) |
(B2) |
(B3) |
(B4) |
(B5) |
Considerable simplification in the treatment may now be achieved due to the independence of the M sites. The partition function for a site with N particles is defined as the sum of the Boltzmann factors exp(−Ej(N)/kT) over the single-site energy levels Ej(N). It is possible to write the grand partition function for the lattice gas of adsorbed particles (in which a summation over all particle numbers from 0 to mM is carried out) as a simple power of single-site “grand partition functions” (see Hill,24 Ch. 7.2):
$$Z_{GC}(\mu, T, M) = \xi(\mu, T)^{M} \tag{B6}$$
(B7) |
(B8) |
(B9) |
So this leaves the impression that the determination of e.g. an average particle number can be achieved with a small system “grand partition function”. But it is important to realize that the derivation proceeded from the total thermodynamic system, and that this system is needed to give meaning to the chemical potential μ and temperature T that feature in the single site “grand partition function”. These are collective properties of the macroscopic thermodynamic system of a gas of particles in equilibrium with an array of many adsorption sites. They cannot be determined in an independent way for the small system of maximum m particles bound to a single site, they are simply not attributes of such a system. And again, the energies Ej(N) of the single site systems (solutions to the Schrödinger equation) are input to the thermodynamic treatment, they do not depend on or are in any way affected by the μ and T that feature in this application of statistical thermodynamics.
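The factorization (B6) for independent sites can be verified by brute force for a small number of sites. The following sketch makes several assumptions for illustration only: the single-site partition functions in `Q_SITE` are hypothetical numbers for N = 0, 1, 2 particles on a site, the single-site function is taken as ξ = Σ_N q(N) exp(μN/kT), and μ and kT are simply supplied as parameters, in line with the point that they are properties of the total system and not of a single site.

```python
import itertools
import math

# Hypothetical single-site partition functions q(N) for N = 0..m particles (m = 2).
Q_SITE = [1.0, 2.0, 0.5]


def xi(mu, kT):
    """Single-site 'grand partition function', assumed form sum_N q(N) exp(mu N / kT)."""
    lam = math.exp(mu / kT)
    return sum(q * lam**n for n, q in enumerate(Q_SITE))


def z_gc_brute_force(mu, kT, n_sites):
    """Grand partition function summed explicitly over all occupations of all sites."""
    lam = math.exp(mu / kT)
    total = 0.0
    for occupations in itertools.product(range(len(Q_SITE)), repeat=n_sites):
        term = 1.0
        for n in occupations:
            term *= Q_SITE[n] * lam**n
        total += term
    return total


def mean_occupancy_per_site(mu, kT):
    """Average particle number on one site, sum_N N q(N) exp(mu N / kT) / xi."""
    lam = math.exp(mu / kT)
    return sum(n * q * lam**n for n, q in enumerate(Q_SITE)) / xi(mu, kT)


if __name__ == "__main__":
    mu, kT, n_sites = -0.3, 0.5, 4
    print("brute force :", z_gc_brute_force(mu, kT, n_sites))
    print("xi ** M     :", xi(mu, kT) ** n_sites)
    print("mean occupancy per site:", mean_occupancy_per_site(mu, kT))
```

The agreement of the two numbers reflects only the independence of the sites; the values of μ and T entering ξ still have to be provided by the macroscopic system.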
This journal is © the Owner Societies 2022