Open Access Article
Leszek J.
Frasinski
Department of Physics, Imperial College London, London SW7 2AZ, UK. E-mail: l.j.frasinski@imperial.ac.uk
First published on 26th July 2022
Cumulant mapping employs a statistical reconstruction of the whole by sampling its parts. The theory developed in this work formalises and extends ad hoc methods of ‘multi-fold’ or ‘multi-dimensional’ covariance mapping. Explicit formulae have been derived for the expected values of up to the 6th cumulant and the variance has been calculated for up to the 4th cumulant. A method of extending these formulae to higher cumulants has been described. The formulae take into account reduced fragment detection efficiency and a background from uncorrelated events. The number of samples needed for suppressing the statistical noise to a required level can be estimated using Matlab code included in Supplemental Material. The theory can be used to assess the experimental feasibility of studying molecular fragmentations induced by femtosecond or X-ray free-electron lasers. It is also relevant for extending the conventional mass spectrometry of biomolecules to multiple dimensions.
From its invention, covariance mapping has been used mostly in studies of ionization and fragmentation of small molecules, with some notable exceptions, such as X-ray scattering or brain studies.4,5 Since covariance mapping requires extensive data processing, two-dimensional maps have been most practical. With continued progress in computational power, an extension of covariance mapping to higher dimensions is a timely proposition.
Two recent developments have motivated this work. One is a successful application of covariance mapping to two-dimensional mass spectrometry of large biomolecules,6,7 with good prospects for extending this technique to higher dimensions. The second development is the emergence of X-ray free-electron lasers (XFELs), which are powerful research tools for studying atomic and molecular dynamics on the femtosecond and attosecond timescales.8 The unprecedented intensity of X-rays in the XFEL pulses induces a large number of fragmentation events, which leaves covariance mapping as the only practical method for correlating the fragments. Moreover, recent XFEL upgrades to high repetition rates, including fast data acquisition,8 make the extension of covariance mapping to higher dimensions feasible.
Firstly, each of the Z, Y, X,… random variables is usually measured at just one point on a fluctuating spectrum of mass, energy, or other quantity characterising the fragments, and normally it is not obvious in advance into which parts of the spectra the fragments may fall. Therefore it is necessary to calculate the
cumulant for each possible n-tuple of spectral points and display the reconstructed parent objects in an n-dimensional map, as shown at the bottom of Fig. 1. Moreover, the spectra of the Z, Y, X,… variables could be more than 1-dimensional, for example, they could be sourced from a position-sensitive timing detector, which effectively resolves fragments in 3D momentum space,9 and in principle would require mapping in 3n-dimensional space, unless the fragmentation kinematics can be used to reduce the dimensionality.
Secondly, in most experiments the fragments are detected with quantum efficiency η < 100%. This means that some of the fragments are undetected, as indicated by empty circles in Fig. 1. (The colours mark different detection patterns to enable the reader to trace them from the sample to the bins.) Therefore, the number of tuples (Zc, Yc, Xc,…) of only collectively-correlated detected fragments (black filled circles) is smaller than the number of all detected fragments Z, Y, X,… (all filled circles).
And thirdly, there may be a Poissonian background of fragments (magenta circles) completely unrelated to the sample of interest. This background is characterised by parameter ζ, which is the ratio of the mean number of the background fragments to the mean number of the sample fragments. Hence, ζ = 0 means no background, ζ = 1 means as much background as the sample signal, etc.
Taking into account reduced detection efficiencies and the presence of background fragments relaxes the idealized requirements of identical parent objects and a single fragmentation pattern, which makes cumulant mapping applicable to studies of mixtures and multiplicity of fragmentation channels. The aim of this work is to estimate the cumulant value and noise when the fragments are detected under the non-ideal conditions of η < 100% and ζ > 0.
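The statistical model just described can be sketched numerically. The following Python snippet is an illustrative sketch (it is not the paper's Matlab ESI code, and all parameter values are arbitrary): it draws Poisson-distributed parent numbers, detects each fragment with efficiency η by binomial thinning, and adds an uncorrelated Poissonian background at level ζ.

```python
import numpy as np

rng = np.random.default_rng(0)

lam = 2.0      # mean number of parent objects per sample
eta = 0.6      # fragment detection efficiency (eta < 100%)
zeta = 0.5     # background-to-signal ratio (zeta > 0)
N = 200_000    # number of repeated samples

# The number of parent objects in each sample is Poisson distributed.
S = rng.poisson(lam, N)

# Each fragment is detected independently with probability eta
# (binomial thinning); an uncorrelated Poissonian background with
# mean eta*zeta*lam is added on top of the detected fragments.
Z = rng.binomial(S, eta) + rng.poisson(eta * zeta * lam, N)

# The mean detected count approaches eta*(1 + zeta)*lam = 1.8 here.
print(Z.mean())
```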
On one hand, a random variable can be measured yielding sample values, which I denote giving it an index, e.g. Zi. On the other hand, the general properties of the random variables, such as probability distributions, moments, etc. are mathematical abstractions that cannot be known with absolute certainty. Nevertheless, we can infer these abstract properties by repetitive sampling of the random variable. I assume that in an experiment the conditions stay constant and that we draw N samples of Z, which are indexed by i = 1, 2, 3,…, N.
The parameters describing the properties of a random variable are denoted with Greek letters, such as λ. To infer a parameter value we can use the samples to construct an estimator denoted by a hat, e.g. λ̂. For example, to estimate the first moment of variable Z, we use the sample average indicated by an overline:
| Z̄ = (Z1 + Z2 + … + ZN)/N. | (1) |
Since repeating the experiment gives us a new set of samples, the sample averages and the estimators are also random variables. However, calculating their expected values (or variance, or higher-order moments) fixes them to the theoretical limit when N → ∞. Angular brackets are used to denote the expected values:
| 〈Z̄〉 = 〈Z〉. | (2) |
The symbol κ̂n stands for an estimator of collective correlations among n fragments. When n = 1 the problem is degenerate and the best we can do is to estimate the mean number of only one kind of a fragment, Z, using the sample average according to eqn (1).
Calculating the expected value of this estimator leads to the well known formula for covariance:
| z0 = Z − 〈Z〉, y0 = Y − 〈Y〉, | (3) |
| cov(Z, Y) = 〈z0y0〉 = 〈ZY〉 − 〈Z〉〈Y〉. | (4) |
This method extends naturally to three fragments:
| κ3 = 〈z0y0x0〉. | (5) |
One may expect that this method of extending the covariance formula works for four fragments, i.e. that κ4 = 〈z0y0x0w0〉. This expectation fails, however, because 〈z0y0x0w0〉 does not vanish when the fragments are correlated only in pairs, i.e. when there is no collective correlation among all four fragments.
We are looking for a measure of collective correlations, κn, with the following properties:
• κn ≠ 0 only if all arguments are collectively correlated;
• κn has units of the product of all arguments;
• κn is invariant under interchange of any two arguments.
It turns out that these properties uniquely determine the formula. For example, the reader is invited to check that the following formula for the 4th cumulant has the desired properties:
| κ4 = 〈z0y0x0w0〉 − 〈z0y0〉〈x0w0〉 − 〈z0x0〉〈y0w0〉 − 〈z0w0〉〈y0x0〉; |
the check makes use of the identity
| 〈z0〉 = 〈Z − 〈Z〉〉 = 〈Z〉 − 〈Z〉 = 0. |
The formula can be simplified by writing
| κ4 = 〈z0y0x0w0〉 − ∑pairs〈z0y0〉〈x0w0〉, | (6) |
where ∑pairs denotes a sum over all ways of pairing the four variables.
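The pairing formula of eqn (6) can be evaluated directly from samples. The sketch below (illustrative Python with hypothetical names; the paper's own code is the Matlab ESI) estimates the 4th cumulant and checks two of its properties: it recovers λ for fully correlated Poisson variables, and it vanishes when the variables are correlated only in pairs.

```python
import numpy as np

def kappa4(Z, Y, X, W):
    # <z0 y0 x0 w0> minus the sum over the three ways of pairing
    # the four mean-centered variables, cf. eqn (6).
    z, y, x, w = (a - a.mean() for a in (Z, Y, X, W))
    m = lambda *v: np.mean(np.prod(v, axis=0))
    return (m(z, y, x, w) - m(z, y) * m(x, w)
            - m(z, x) * m(y, w) - m(z, w) * m(y, x))

rng = np.random.default_rng(1)
N = 500_000
S = rng.poisson(1.0, N)          # common parent: collective correlation
k_all = kappa4(S, S, S, S)       # ≈ λ = 1 for a Poisson parent

A, B = rng.poisson(1.0, N), rng.poisson(1.0, N)
k_pairs = kappa4(A, A, B, B)     # ≈ 0: only pairwise correlations present
print(k_all, k_pairs)
```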
Similarly, the 5th and 6th cumulants subtract all partial correlations of lower orders:
| κ5 = 〈z0y0x0w0v0〉 − ∑〈z0y0〉〈x0w0v0〉, | (7) |
| κ6 = 〈z0y0x0w0v0u0〉 − ∑〈z0y0〉〈x0w0v0u0〉 − ∑〈z0y0x0〉〈w0v0u0〉 + 2∑〈z0y0〉〈x0w0〉〈v0u0〉, | (8) |
where each sum runs over all distinct ways of splitting the variables into the indicated groups.
The parameter that measures the collective correlations of n random variables is known as the n-variate joint cumulant of the first order14 (chapters 3, 12 and 13). In this work I shall shorten this name and call it simply the nth cumulant.
Cumulants are useful in seemingly disjoint areas of science, such as light–matter interactions,15 quantum theory of multi-electron correlations,16 bond breaking of diatomic molecules,17 neural network theory,18 financial data analysis,19 and gravitational interaction of dark matter.20 In physics multivariate cumulants are also known as the Ursell functions.21
Let us suppose that we repetitively gather a sample of objects in a random manner, so the number of objects S in the sample follows the Poisson distribution:
| P(S) = λ^S e^(−λ)/S!, | (9) |
where λ = 〈S〉 is the mean number of objects in a sample.
To understand why cumulants are useful in this task, we first consider ideal conditions: there is only one way of fragmenting the parent, we detect every fragment, and there is no background of fragments from other processes. Such a fragmentation process can be written as
| S → (Z, Y, X,…), |
| Z = Y = X = … = S |
| z0 = y0 = x0 = … = s0, |
| 〈z0y0x0…〉 = 〈s0n〉 = 〈(S − 〈S〉)n〉 = μn. |
| κ1 = 〈S〉 = λ, | (10a) |
| κ2 = μ2 = λ, | (10b) |
| κ3 = μ3 = λ, | (10c) |
| κ4 = μ4 − 3μ2² = λ. | (10d) |
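Eqn (10) can be verified numerically: the first four cumulants of Poisson samples all estimate the same value λ. A minimal Python sketch (illustrative only; the centering uses the known mean for simplicity):

```python
import numpy as np

rng = np.random.default_rng(2)
lam = 1.5
s0 = rng.poisson(lam, 1_000_000) - lam   # s0 = S - <S>, centered samples

mu2 = np.mean(s0**2)   # 2nd central moment
mu3 = np.mean(s0**3)   # 3rd central moment
mu4 = np.mean(s0**4)   # 4th central moment

# kappa_1 .. kappa_4 of a Poisson variable all equal lam = 1.5:
print(np.mean(s0) + lam, mu2, mu3, mu4 - 3 * mu2**2)
```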
When η < 100% or ζ > 0, the random variables S, Z, Y,… are only partially correlated. To find the formula for the cumulant in these conditions, we need to separate the correlated and uncorrelated parts of these variables.
| Z ∼ Pois(λ)*Binom(ηZ) → Pois(ηZλ). |
| Z ∼ Pois(ηZλ) + Pois(ηZζZλ) → Pois(ηZ(1 + ζZ)λ). |
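Both relations can be checked by simulation. Since a Poisson distribution has its variance equal to its mean, the sketch below (illustrative Python, arbitrary parameters) confirms that binomial thinning followed by adding an independent Poissonian background again yields a Poissonian count:

```python
import numpy as np

rng = np.random.default_rng(3)
lam, eta, zeta = 2.0, 0.5, 0.3
N = 1_000_000

# Binomial sampling of a Poisson variable is again Poissonian,
# with the mean scaled by the detection efficiency eta ...
detected = rng.binomial(rng.poisson(lam, N), eta)
# ... and adding an independent Poissonian background keeps it Poissonian.
Z = detected + rng.poisson(eta * zeta * lam, N)

# For a Poisson distribution the mean equals the variance:
print(Z.mean(), Z.var())   # both ≈ eta*(1 + zeta)*lam = 1.3
```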
Since the binomial sampling of each of the Z, Y, X,… fragments is independent, their joint detection efficiency is a product of the individual efficiencies. Therefore the probability distribution of the detected fragments Z correlated with all the other detected fragments is given by
| Zc ∼ Pois(θnλ), |
| θn = ηZηYηX…. |
| Zc = Yc = Xc = … = Sc ∼ Pois(θnλ). | (11) |
Since the number of detected fragments is the sum of the correlated and uncorrelated parts:
| Z = Zc + Zu, Y = Yc + Yu, X = Xc + Xu,… | (12) |
Further calculations are significantly simplified when mean-centered variables are introduced for the correlated and uncorrelated parts:
| s = Sc − 〈Sc〉, z = Zu − 〈Zu〉, y = Yu − 〈Yu〉, x = Xu − 〈Xu〉,… | (13) |
| 〈z2〉 = (ηZ(1 + ζZ) − θn)λ, 〈y2〉 = (ηY(1 + ζY) − θn)λ,… | (14) |
One useful implication of eqn (13) is that the expected values of the mean-centered variables vanish:
| 〈s〉 = 〈z〉 = 〈y〉 = 〈x〉 = … = 0. | (15) |
The second useful property is that they can be regarded as independent, which is exactly true if only one or two detectable fragments are produced. For three or more fragments it may happen that one fragment that is not detected relegates the other fragments to the background in spite of their correlation. It can be shown that these residual correlations do not affect the expected values of cumulants, nor the variance of the first and second cumulant.
| κ1 = 〈Zc〉 = 〈Z〉 − 〈Zu〉 = ηZλ. | (16) |
Defining the first cumulant in this way is not only consistent with the higher cumulants but also gives it the meaning of a signal that is separate from the background. As illustrated in Fig. 2, when in spectral analysis the sample fragments form a peak riding on a broad background given by 〈Zu〉 = ηZζZλ, then 〈Zc〉 = ηZλ is just the peak height.
The higher cumulants can be calculated using the mean-centered variables, z0 = s + z,
y0 = s + y, etc. For example,
| κ2 = 〈z0y0〉 = 〈(s + z)(s + y)〉 = 〈s2〉 = θ2λ. | (17) |
We notice that the formulae for all higher cumulants can be derived in a similar manner. When expanding 〈z0y0x0…〉, most of the terms vanish because of eqn (15), and we are left only with polynomials of moments of s. These polynomials are the same as in eqn (10) except now the correlated part follows Pois(θnλ) rather than Pois(λ). Therefore, we obtain a general result:
| κn = θnλ. | (18) |
![]() | (19) |
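The general result κn = θnλ can be spot-checked for n = 2: when each fragment of a fully correlated pair is detected independently, the measured covariance equals ηZηYλ. A Python sketch (illustrative, with arbitrary efficiencies):

```python
import numpy as np

rng = np.random.default_rng(4)
lam, eta_z, eta_y = 1.0, 0.7, 0.5
N = 1_000_000

S = rng.poisson(lam, N)        # fully correlated pair: Z = Y = S ideally
Z = rng.binomial(S, eta_z)     # each fragment detected independently
Y = rng.binomial(S, eta_y)

# Sample estimate of the 2nd cumulant (covariance):
k2 = np.mean((Z - Z.mean()) * (Y - Y.mean()))
print(k2)   # ≈ theta2 * lam = eta_z * eta_y * lam = 0.35
```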
These effects can be quantified by calculating the variance of the cumulant estimator, var(κ̂n), and finding the noise-to-signal ratio:
| N/S = √var(κ̂n)/κn = σn/(κn√N), | (20) |
where σn2 = N var(κ̂n) does not depend on N. Therefore, once we know σn we can estimate the number of samples, N, needed for the required noise-to-signal ratio and assess the experimental feasibility.
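Assuming the noise falls as 1/√N, i.e. N/S = σn/(κn√N), the required number of samples follows by rearrangement. A one-line Python helper (a sketch; σn and κn must come from the variance formulae or from the ESI Matlab code):

```python
def samples_needed(sigma_n, kappa_n, target_ns):
    # Invert N/S = sigma_n / (kappa_n * sqrt(N)) for N.
    return (sigma_n / (kappa_n * target_ns)) ** 2

# e.g. sigma_n/kappa_n = 30 and a target N/S of 10%:
print(samples_needed(30.0, 1.0, 0.1))   # ≈ 90000 samples
```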
The equations in this section give the cumulant estimators κ̂n constructed from the samples according to eqn (1), the cumulant values κn = 〈κ̂n〉, and the cumulant variances var(κ̂n). The auxiliary quantities are the central moments of the correlated parts, 〈sn〉, and of the uncorrelated parts, 〈z2〉, 〈y2〉, 〈x2〉, and 〈w2〉 (see eqn (13)).
With increasing counting rate, i.e. increasing λ, the noise of the first cumulant approaches zero as 1/√λ, which reflects the well-known fact that a high counting rate is always advantageous in collecting 1D spectra.
The N/S of the second cumulant (covariance) approaches a constant value with an increasing counting rate. This tells us that for covariance mapping there is little advantage in increasing λ beyond 1 or 2, unless we want to accommodate weak and strong features on the same map. (In fact a high counting rate exacerbates map distortions due to fluctuations in experimental conditions, which induce common-mode fragment correlations.22,23)
Cumulants higher than the second one have N/S increasing at high counting rates due to the higher powers of λ present in the expressions for σn. Therefore, for n ≥ 3 there is an optimal counting rate at low values of λ, as the minima of the green and orange curves show in Fig. 3.
| Fig. 4 Cumulant noise as in Fig. 3 but resolved for η. When detection efficiency is reduced to about 50%, there is only a modest increase in the noise. | ||
As expected, the noise of the 2nd and higher cumulants substantially increases when the detection efficiency is very low. However, when the detection efficiency is reduced only moderately, to around 50%, the increase in the noise is also moderate, even for the 4th cumulant. Since 50% detection efficiency is within the reach of modern particle detectors, it makes cumulant mapping a feasible proposition.
| Fig. 5 Cumulant noise as in Fig. 3 but at a reduced η and resolved for ζ. The noise increases with increasing background, especially for the higher cumulants and higher counting rates. | ||
For simplicity, we assume the same background level, ζ, for each kind of fragment, which makes the noise of the higher cumulants grow faster with increasing ζ than that of the lower ones, because of the higher powers of ζ present in the expressions for σn. In some experiments this may be an over-pessimistic assumption because some of the Z, Y, X, or W fragments may experience little or no background at all. The code provided† accepts ζ and η tailored to each kind of fragment.
The optimum counting rate is broadly the same as for no background shown in Fig. 4. With an increasing background level, however, the optima for the higher cumulants shift to even lower counting rates.
| Fig. 6 Number of samples needed for a fixed noise-to-signal ratio. In realistic experimental conditions about 10⁵ samples are needed to build a clear 4th cumulant map. | ||
Such sampling rates are now routinely available from femtosecond lasers and becoming available from XFELs.8 For example, the LCLS-II XFEL will be operating at up to 1 MHz repetition rate, enabling researchers to probe over 10⁹ samples in a single experimental run. In principle, such a large number of samples makes it possible to build clear cumulant maps of even higher order than the 4th one. In practice, however, the data acquisition speed is likely to be the limiting factor, since every single-sample spectrum needs to be recorded.
Cumulant mapping can significantly enhance conventional mass spectrometry, whose main application is in identifying large biomolecules. The conventional approach is to obtain a high-quality mass spectrum of fragments and search for a match in a large database of molecular spectra. Therefore, the development of commercial spectrometers is driven towards high mass resolution at the expense of the repetition rate, which is normally below 1 Hz.
Recently covariance mapping has been successfully applied to analyse mass spectra obtained from a commercial spectrometer.6,7 Rather than relying on the high mass resolution, the technique resolves the spectra in the second dimension and partially reconstructs the parent objects on a 2D map. Such a reconstruction allows some parent identifications that would be impossible using only a 1D spectrum of any quality. Clearly, this technique can utilise higher-cumulant mapping to perform a more complete parent reconstruction. Since many samples are needed to build cumulant maps, the spectrometer should operate at high repetition rates, for example, by employing the time-of-flight technique.
The volume of a multi-dimensional cumulant map can be very large and cumbersome to explore. In the simplest approach, cross-sections and projections of the map can be used to visualise the reconstructed molecules.10 If the locations of the reconstructed molecules are to be found, computational methods of artificial intelligence can be used to discover and identify them.
Cumulant mapping spectrometry can be combined with laser-induced fragmentation. On one hand, such combination allows us to elucidate the dynamics of molecular ionization and fragmentation,9 on the other hand, it makes it possible to tune the fragmentation to specific bonds in large biomolecules.24
Studies of molecular fragmentations induced by femtosecond or X-ray lasers are obvious areas for applying cumulant mapping. The technique can be used to extend the conventional mass spectrometry to multiple dimensions and substantially enhance its selectivity. Extension to particles other than electrons or ions should be straightforward. In particular, photons in a wide spectral range from near infrared to hard gamma rays are the promising candidates.
Since cumulant mapping relies on the Poisson distribution of the samples, in principle, it is applicable to any repetitive Poissonian process. For example, the neuronal spike trains closely follow Poisson point processes.25,26 It could be speculated that cumulants represent the high-level brain functions emerging from correlations in low-level neuronal activity.
An estimate of the cumulant variance is needed to estimate the number of samples that have to be collected to measure the cumulant with a given noise-to-signal ratio. Since only an estimate is needed, I will use approximations in deriving some of the formulae.
| 〈A + B〉 = 〈A〉 + 〈B〉. | (21) |
| 〈AB〉 = 〈A〉〈B〉 + cov(A, B). | (22) |
Their variances and the covariance can be expressed in terms of expected values:
| var(A) = 〈(A − 〈A〉)2〉 = 〈A2〉 − 〈A〉2, | (23) |
| cov(A, B) = 〈(A − 〈A〉)(B − 〈B〉)〉 = 〈AB〉 − 〈A〉〈B〉. | (24) |
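These identities are easy to confirm numerically; a quick Python check with two correlated variables (illustrative only, using population-normalised variance and covariance):

```python
import numpy as np

rng = np.random.default_rng(6)
C = rng.normal(size=100_000)          # shared component creates correlation
A = C + rng.normal(size=100_000)
B = 2 * C + rng.normal(size=100_000)

# var(A) = <A^2> - <A>^2 and cov(A, B) = <AB> - <A><B>, eqns (23) and (24):
print(np.allclose(A.var(), np.mean(A**2) - A.mean()**2))
print(np.allclose(np.mean(A * B) - A.mean() * B.mean(),
                  np.cov(A, B, bias=True)[0, 1]))
```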
An expected value of a sample average is equal to the expected value of the whole population:
| 〈Ā〉 = 〈A〉, | (25) |
and the variance and covariance of sample averages are reduced N-fold:
| var(Ā) = var(A)/N, | (26) |
| cov(Ā, B̄) = cov(A, B)/N. | (27) |
Since covariance is a linear function, the covariance of a random variable A with the sum of several other random variables X1, X2, X3,… is the sum of the covariances:
| cov(A, X1 + X2 + X3 + …) = cov(A, X1) + cov(A, X2) + cov(A, X3) + … | (28) |
The variance of a sum of random variables contains all their pairwise covariances:
| var(X1 + X2 + …) = ∑k var(Xk) + 2∑k<l cov(Xk, Xl). | (29) |
The variance and covariance of products of random functions can be calculated using the delta method. If F(X) and G(X) are functions of a random vector X = [X1, X2, X3,…], then the Taylor expansion of F and G gives14 (eqn (10.12) and (10.13))
| var(F) ≈ ∑i,j (∂F/∂Xi)(∂F/∂Xj) cov(Xi, Xj), | (30) |
| cov(F, G) ≈ ∑i,j (∂F/∂Xi)(∂G/∂Xj) cov(Xi, Xj), | (31) |
where the derivatives are evaluated at the expected values, X = ξ.
To calculate the variance of the product of two random variables, I make the following choice for eqn (30):
| F = AB, X = [A, B], ξ = [〈A〉, 〈B〉], |
| var(AB) ≈ 〈A〉² var(B) + 〈B〉² var(A) + 2〈A〉〈B〉cov(A, B), | (32) |
| var(A²) ≈ 4〈A〉² var(A). | (33) |
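The accuracy of the delta-method product formula can be gauged with a quick Monte Carlo experiment (a Python sketch; the distributions and means are arbitrary choices). Note that the standard delta-method result carries the factor 〈A〉〈B〉 in the covariance term:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 1_000_000

# Two correlated counts built from a shared Poisson component:
C = rng.poisson(40.0, N)
A = C + rng.poisson(60.0, N)    # <A> = 100, var(A) = 100
B = C + rng.poisson(20.0, N)    # <B> = 60,  var(B) = 60, cov(A,B) = 40

# Delta-method prediction for var(AB):
delta = (A.mean()**2 * B.var() + B.mean()**2 * A.var()
         + 2 * A.mean() * B.mean() * np.cov(A, B, bias=True)[0, 1])
ratio = (A * B).var() / delta
print(ratio)   # ≈ 1: the approximation is good when the means are large
```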
In a similar way, if I choose the following for eqn (31):
| F = AB, G = CD, X = [A, B, C, D], ξ = [〈A〉, 〈B〉, 〈C〉, 〈D〉], |
then
| cov(AB, CD) ≈ 〈A〉〈C〉cov(B, D) + 〈A〉〈D〉cov(B, C) + 〈B〉〈C〉cov(A, D) + 〈B〉〈D〉cov(A, C). | (34) |
When estimating experimental cumulants we use sample averages, X̄, to center the random variables. In theoretical estimations the expected values, 〈X〉, can be used instead, which significantly simplifies the calculations. These two versions of the cumulant estimators are asymptotically equal when the number of samples tends to infinity:
![]() | (35) |
Another approximation I use, also affecting only the variance of the third and higher cumulants, is the neglect of the residual correlations that are discussed following eqn (15).
We calculate the variance terms one by one:
Note a useful implication of eqn (15): a term vanishes if it contains a factor linear in a mean-centered variable, because the expected value of such a factor is zero. For this reason all the covariance terms are zero, for example
Gathering all the variance terms, we obtain the formula for var(κ̂2) given in Section 5.2.
Expanding the variance of the sum we find that the covariance terms vanish again and we need to calculate only variances:
Gathering all the variance terms, we obtain the formula for var(κ̂3) given in Section 5.3.
To calculate the variance of this estimator we apply eqn (29) and evaluate the variance of the first two terms:
To calculate the variance of the next three terms we can use eqn (32) and (33) for the variance of a product of two variables:
When we proceed to calculate the variances of the last line of κ̂4, we notice that all terms in eqn (32) vanish. Therefore all the remaining variance terms are zero.
To calculate the covariance terms of var(κ̂4), we refer to eqn (27) and note that the covariance vanishes unless both A and B contain the same subset of the z, y, x, w variables. Therefore, the only non-zero covariance terms are:
Gathering all the variance and covariance terms gives us the formula for var(κ̂4) given in Section 5.4.
Footnote |
| † Electronic supplementary information (ESI) available: Matlab code that calculates the values and variances of the first four cumulants. See DOI: https://doi.org/10.1039/d2cp02365b |
| This journal is © the Owner Societies 2022 |