Pierre-Emmanuel
Peyneau
*a,
Léonard
Seydoux
b and
Mickaël
Tharaud
b
aUniv Gustave Eiffel, GERS-LEE, F-44344 Bouguenais, France. E-mail: pierre-emmanuel.peyneau@univ-eiffel.fr
bUniversité Paris Cité, Institut de physique du globe de Paris, UMR CNRS 7154, F-75005 Paris, France
First published on 9th October 2025
Single particle inductively coupled plasma-mass spectrometry (sp-ICP-MS) produces time series—time scans—that require processing to extract meaningful information regarding the nanoparticles that are analysed. In this work, we present a stochastic algorithm to generate such time scans. We also introduce an open-source implementation of it,
, a Python library designed to generate synthetic, yet realistic sp-ICP-MS time scans for a single mass-to-charge ratio. We argue that our library is an efficient and reliable testbed on which future studies can build to assess existing or new data processing strategies.
So far, research on sp-ICP-MS has been reported with quadrupoles, sector field and time-of-flight (ToF) mass spectrometers.5,6 While sector field instruments present the highest sensitivities,7 time-of-flight the lowest7 and quadrupoles are in between,6 time-of-flight mass analysers are the only one allowing for the simultaneous measurement of multiple metallic or metalloid elements within a single particle, providing information about the nanoparticle elemental composition.8
Regardless of the mass spectrometer used, if a particle entering the plasma contains a sufficiently high mass of one of the isotopes monitored, a fraction of the ion cloud originating from that very particle will give rise to a detectable fast-transient, high-amplitude signal, hereafter named spike, typically lasting a few hundred microseconds.4,9,10 Continuously introducing a dispersion containing a sufficiently diluted number concentration of detectable nanoparticles produces a time scan—number of counts per dwell time as a function of time—with a proportional number of spikes emerging from a background signal.
One has to extract individual spikes from the raw signal to get meaningful information, such as the number concentration and the mass of the monitored element associated with each spike. Spike detection is thus the most crucial step of the data processing stage. When sp-ICP-MS first appeared, milliseconds dwell time were the norm and experimental data could be treated with spreadsheet editors.11 With the emergence and the advent of microsecond dwell time instruments, data processing became far more resource-intensive and instrument manufacturers started to propose dedicated software modules, while some open-source data processing solutions were developed by academic research groups.12–14
Evaluating the performance of these different data processing tools is challenging.15 Commercial software packages are widely used because they are robust and validated by the instrument manufacturers who design them. However, the implementation details of the underlying algorithms are typically not available. These solutions also reflect the design choices of the manufacturer, which may not always align with every researcher specific needs. In parallel, open-source alternatives—with SPCal14 serving as a notable example in the context of sp-ICP-MS—offer distinct advantages from a scientific perspective. Their source code is openly available and documented, enabling independent reproduction of results from raw datasets without requiring commercial software. Moreover, when appropriately licensed, open-source code can be adapted, making it possible to implement alternative data processing algorithms.
Open-source is thus a prerequisite for transparent and reproducible testing of various data processing methods, but it is not sufficient per se. A strong common testbed is also needed to evaluate and rank different methods. This is perhaps one of the biggest shortcomings in sp-ICP-MS, due to the profusion of possible reference materials. Good-quality, certified nanomaterials do exist (e.g., NIST RM 8012 or NIST RM 8013),16 but not for all elements. Spherical gold nanoparticles, probably the most commonly used reference nanomaterials, come with an uncertainty in particle number concentration and size. Transmission or scanning electron microscopy can be performed to circumvent this issue, but artefacts can still occur over the course of a sp-ICP-MS analysis (adsorption on the walls of the vials, losses related to incomplete consumption introduction systems overwhelmingly employed).17 Overcoming these issues to obtain high-quality experimental reference data is possible but difficult to standardize. In addition, it is time-consuming, requires sophisticated and costly equipment, and must be implemented by an experienced staff.18
An alternative approach, employed in many fields of science—gravitational wave detection, optics, computer vision, genomics, chemistry, etc.—is to generate synthetic data for benchmark purposes.19–26 Indeed, synthetic but very reliable sp-ICP-MS datasets, i.e., close or even virtually indistinguishable from real ones, would allow to test different data processing methods, on reference data that could be easily generated with precision and reproducibility.
To date, such an approach had never been pursued in the realm of sp-ICP-MS. The purpose of this study is to fill that gap. To do so, we propose and describe here an algorithm developed to generate realistic single-channel (i.e., mono-elemental) sp-ICP-MS time scans and we provide an open-source Python implementation, named
. We then present some examples illustrating the synthetic data that can be generated with the code and compare them with experimental data obtained with either quadrupole or time-of-flight mass analysers.
The time scan signal produced by a mass spectrometer operated in single particle mode depends on the production, transfer and detection of monoatomic ions generated in the plasma of the instrument. The algorithm relies on a Bernoulli distribution34 to model the transport of analyte ions up to the detector: a given analyte ion produced in the plasma has a probability ptrans to reach the detector, and a probability 1 − ptrans not to reach it. Eventually, the response of the detector to an incoming ion is taken into account. This simple approach enables the generation of time scans that capture both particle-induced events and background fluctuations over time.
Obviously, the ultimate purpose of the algorithm is to assign a value (number of counts) to each reading of the time scan, the total number of readings nreadings being equal to the duration of the single particle analysis divided by the dwell time τdw. The precise method used to generate these nreadings values is described below.
(fdissτdw),![]() | (1) |
Each analyte ion produced in the plasma has a probability ptrans to go through the interface, the ions optics, the mass analyser and to reach the detector. As a consequence, among a given number of analyte ions nions, the number reaching the detector is thus a random variable that follows a binomial distribution with parameters nions and ptrans,34
![]() | (2) |
(
(fdissτdw), ptrans), following eqn (2), with nions being drawn in a Poisson distribution of parameter fdissτdw (eqn (1)). This value is the background component of the synthetic time scan for the dwell time under consideration.
Finally, as can be shown by a straightforward calculation,
(
(fdissτdw), ptrans) is equivalent to
(fdissτdwptrans). Drawing directly into this distribution is therefore an alternative option to simulate the background component.
Assuming that the size, shape and composition of every particle entering the plasma of the instrument is known, the number of atoms of the analyte contained in each individual nanoparticle can be calculated. Since it is assumed in this work that each nanoparticle is entirely vaporised and the resulting atoms ionized once introduced in the plasma, it equals the number of analyte ions, nions|NP. For each nanoparticle, this grouping of analyte ions—ion cloud—evolves over the course of its transport within the instrument. In particular, its spatial extent increases. Modelling the transport of an ion cloud is out of the scope of this paper; beside, it is unnecessary. Instead, we will only retain a few key components.
A particle event detectable in an experimental time scan can be characterized by its start time, its duration and its shape. It is already known that the starts of the particle events can be described by a homogeneous Poisson process of intensity fpart,30 the flux of nanoparticles carrying the analyte under interest: in practice, it entails that the time interval between the starts of two successive particle events follows an exponential distribution with the very same parameter as the underlying homogeneous Poisson process.30
As for the duration and the shape, these are two sides of the same coin. Indeed, from a microscopic point of view, the constituents of an ion cloud are transported, filtered and finally detected. Duration and shape of a particle event are just by-products of the number of detected ions and the time it takes for each of them to reach the detector. This time distribution is the core information needed. In theory, it could be computed from first principles thanks to a precise modelling of ion transport in the instrument. However, we did not take this bottom-up approach. We opt instead for a simpler, heuristic alternative, relying on the so-called macrotransport paradigm: in essence, it states that an advection–dispersion equation with constant coefficients is often a good first-order macroscopic approximation for mass or heat transport modelling when several transport phenomena are intertwined.38 From a mathematical viewpoint, an advection–dispersion equation is completely analogous to an advection–diffusion equation. The latter equation being the macroscopic limit for the probability density of Brownian particles experiencing a constant and uniform drift, the same microscopic stochastic model holds for advecto-dispersive transport.† In other words, if we take seriously the fact that the advection-dispersion transport model decently represents ion transport in a mass spectrometer at a macroscopic scale, and we certainly do, it implies that individual ions can be modelled as Brownian particles whose second moment of displacement is proportional to a coefficient of effective diffusivity.
Computing this coefficient would require to rely on first principles and to take into account all the involved transport processes. It is thus out of reach within the effective modelling framework we set ourselves. And yet, the simple knowledge that ion transport can be adequately modelled as a Brownian motion in a uniform advection field (an assumption already implicitly made in the literature with the mention of spatially Gaussian distributed ion clouds40,41) has far-reaching consequences. Indeed, it is known since the very beginning of the 20th century and the pioneering work of Louis Bachelier that the first-passage time distribution of a Brownian motion with a constant drift to reach a given position (the detector in this case) follows an inverse Gaussian distribution.42,43 Beside, this distribution has already been used in the literature—albeit with less justification than here—to simulate the ICP-MS signal.41,44 Consequently, we decided to model the time distribution of particle events with an inverse Gaussian distribution.45 Its pertinence will be checked in Sec. 4, by comparing some generated time scans with experimental ones.
The inverse Gaussian distribution, hereafter denoted
, depends on two non-negative parameters, its mean and its shape, respectively denoted µ and λ. These two parameters control the allure of the distribution. As they cannot be computed from first principles with our approach and yet exert a strong influence on the shape and the duration of synthetic particle events (see Sec. 3), they have to be chosen with some care. According to the literature, with a quadrupole ICP-MS, a particle event typically lasts a few hundred microseconds and up to a few milliseconds when the collision-reaction cell of a triple quadrupole ICP-MS is employed.4,9,10,46 Moreover, the corresponding spikes often have a bell-shaped allure when the mass analyser is a simple quadrupole, whereas it is quite common to observe much more asymmetric spikes with a time-of-flight instrument. Shape and duration of sp-ICP-MS spikes can thus be quite diverse but fortunately, the inverse Gaussian distribution is flexible enough and Sec. 4 will show that it can accommodate these different allures of particle events.
Finally, the synthetic particle event component of the time scan is calculated as follows:
(1) Particle event starts correspond to the discontinuity points of the realization of a one-dimensional homogeneous Poisson process of intensity fpart, denoted
(fpart).
(2) The number of detected ions for each nanoparticle is drawn from a binomial distribution with parameters ptrans and the number of analyte atoms in the nanoparticle,
(nions|NP, ptrans), nions|NP being the number of analyte ions produced by the nanoparticle under consideration.
(3) Relative to the start of a given particle event, the durations taken by all detected ions produced by a single nanoparticle to reach the detector follow independent and identically distributed inverse Gaussian distributions with user-defined parameters µ and λ, i.e.
(µ, λ) repeated
(nions|NP, ptrans) times for each nanoparticle.
(4) The number of detected ions over a given dwell time is finally calculated (binning operation).
, the random function modelling the single-ion response produced by the detector, is finally applied on every single ion reaching the detector to get the full synthetic time scan. Implementation details pertaining to the detecting stage are unnecessary for the description of the algorithm and are postponed to Sec. 3.
Additionally, we stress that the focus is on what happens between the plasma and the detector, i.e. what would be measured with a total consumption introduction system48,49 (100% nebulization efficiency ηneb) for metal or metal oxide nanoparticles. In other words, we left aside the introduction part of the spectrometric analysis. Since the argon ICP is a highly efficient ionization source, we also assumed that all analyte atoms are ionized during the ion production stage, although this is an oversimplification according to Saha's equation under standard ICP-MS conditions.47 Finally, the probability ptrans of ion transmission up to the detector may depend on the physical speciation of the analyte (dissolved or nanoparticulate form),50 a possibility that was not taken into consideration here.
Making minor modifications to the algorithm would be all it takes to lift these restrictions. For instance, accounting for ηneb < 100% is straightforward, as nanoparticles are generally transported to the plasma similarly, regardless of their size.51 In this case, applying a mere multiplication factor (ηneb) to the flux of nanoparticles pumped from the sample is all it takes to model the nebulization step. In the same vein, a less than 100% ionization yield could also be considered by modifying the values of fdiss and fpart. Two distinct probabilities of transmission ptrans could also be used for the analyte ions stemming from the dissolved phase of the sample and those belonging to ions clouds produced from nanoparticles, in order to account for possible space-charge effects. The open-source nature of the code presented below that implements the algorithm means that users will be able to account for these factors in the future if they are deemed relevant.
, an open-source, cross-platform implementation of the algorithm presented in the previous section, has been written in Python. It relies on a limited number of mainstream dependencies (mostly NumPy and SciPy libraries54,55). As for the algorithm, the code consists of two generative parts, background and particle events. Their addition, followed by the application of the single-ion response
random function, provides the complete synthetic time scan.
Due to the inherently stochastic nature of the algorithm,
makes an intensive use of random numbers. Two distinct random number generators are used, one for the particle events and one for the background, so that parameter changes affecting one component do not affect the other. To ensure computational repeatability, seeds used by random number generators have to be provided by the user.
Several parameters have to be provided to
for a synthetic time scan to be generated. The main ones are listed and described in Table 1. Typical values are included as well.
. Typical values are given when available; they can vary depending on the instrument, analyte and experimental setup
| Name | Symbol | Description | Typical values | |
|---|---|---|---|---|
| MS parameters | Dwell time | τ dw | Amount of time the instrument spends on one cycle of data collection | 50 µs to 3 ms (typ. 100 µs) |
| Number of time scan readings | n readings | Total number of dwell time intervals in the synthetic time scan | Typically 105–106 | |
| Probability of ion transmission | p trans | Probability that an analyte ion reaches the detector after being generated in the plasma | 10−6 to 9 × 10−4 (ref. 7, 11, 40, 52 and 53) | |
| Fluxes | Analyte ion flux (dissolved) | f diss | Number of analyte ions generated per unit of time from dissolved species in the plasma | 108–1014 ions per s (ppt–ppm range) |
| Analyte nanoparticle flux | f part | Flux of nanoparticles entering the plasma per unit of time | ≃ 103 NP s−1 | |
| Size distribution | Particle size distribution |
|
Probability distribution used to draw nanoparticle diameters (spherical shape assumed) | Truncated normal, lognormal or Dirac |
| Average diameter | 〈d〉 | Average of the nanoparticle diameter distribution | 10–100 nm | |
Standard deviation of diameter (applicable if is truncated normal or lognormal) |
σ(d) | Standard deviation of the nanoparticle diameter distribution | Depends on the sample and the PSD | |
| Material | Molar mass | M i | Molar mass of the monitored isotope | 23–238 g mol−1 (from 23Na to 238U) |
| Abundance | A i | Natural isotope abundance of the monitored isotope | 0–100% | |
| Mass fraction | x | Mass fraction of the monitored element in the nanoparticles | 0–100% | |
| Mass density | ρ | Mass density of the nanoparticles | 4–20 g cm−3 | |
| Event shape | Particle event shape distribution |
|
Time dispersion distribution of analyte ions produced by one nanoparticle relative to the start of the corresponding particle event | Inverse Gaussian distribution IG(µ, λ) |
| Mean parameter | µ | Mean transport time of an ion from the plasma to the detector (relative to particle event start) | 5–15 τdw | |
| Shape parameter | λ | Controls the dispersion of the ion transit time | 10–200 τdw | |
| Detector | Detector response model |
|
Model of detector response to a single ion: ideal (identity function) or non-ideal (stochastic response) | Identity or lognormal distribution |
Mean of the logarithm of the detector response (applicable if is lognormal) |
m | Controls the mean of the number of counts produced when one ion hits the detector | −0.11045 | |
Standard deviation of the logarithm of the detector response (applicable if is lognormal) |
s | Controls the dispersion of the number of counts produced when one ion hits the detector | 0.47 |
(fdissτdw). Each of these integers is the number of analyte ions of dissolved origin produced in the plasma during a single dwell time. Each ion having a probability ptrans to reach the detector, readings of the background are independently drawn from
(
(fdissτdw), ptrans). This random number equals the number of analyte ions of dissolved origin ndissi having hit the detector between i τdw and (i + 1) τdw, for every integer i comprised between 0 and nreadings − 1. Background generation is schematized in the upper part of Fig. 1.
![]() | ||
Fig. 1 Scheme of the algorithm: background and particle event components of the synthetic time scan are separately generated. The resulting signal finally go through the random function, yielding the full synthetic signal. The background component is calculated independently for each dwell time (draw from ( (fdissτdw), ptrans)), whereas the starts of the particle events are calculated globally (realization of a Poisson process denoted with intensity fpart), first without considering the time discretization of the sp-ICP-MS time scan. For the latter component, inverse Gaussian time dispersion is applied to all the ions reaching the detector at some point (their number is drawn in (nions|NP, ptrans)) and the number of ions detected over a succession of dwell times is calculated. Details regarding the algorithm, its parameters and its implementation are provided in Sec. 2, Table 1 and Sec. 3, respectively. | ||
(fpartτdw), until the duration set for the time scan has elapsed. The first value obtained by drawing in
(fpartτdw) is the start time associated with the first incoming particle event. The second value obtained drawing this exponential distribution is added to the first start time to get the start time of the second particle event, the third one is added to the second start time to get the start time of the third particle event, and so on and so forth. This recursive procedure yields the random start times associated with each particle event.
We made the decision to model the number of analyte atoms in a given nanoparticle as a random variable. This choice is due to the fact that even so-called monodisperse nanoparticles always display a certain level of uncertainty in their size.56 Moreover, nanoparticles in environmental matrices can display widely varying sizes.57–59 In
, it is assumed that
, the diameter distribution of the nanoparticles—assumed to be spherical—follows either a truncated normal distribution (to get non-negative values), a lognormal one or a Dirac distribution (constant diameter). Both truncated normal and lognormal are two-parameter distributions that are completely characterized by their mean 〈d〉 and their standard deviation σ(d). Because of the assumed spherical shape of the nanoparticles, the number of analyte ions stemming from one nanoparticle, nions|NP, scales as the cube of the diameter d, with a prefactor that depends of several properties of the material making up the nanoparticles—molar mass Mi and natural isotope abundance Ai of the monitored isotope, mass fraction x of the element and nanoparticle mass density ρ:
![]() | (3) |
Each analyte ion present in the nanoparticle has a probability ptrans to reach the detector. It entails that for a given nanoparticle, the number of analyte ions reaching the detector is drawn in
(nions|NP, ptrans). For each ion that makes it, we draw in a probability distribution denoted
. For the reasons explained in Sec. 2.2,
is taken as the inverse Gaussian distribution, with a user-defined set of parameters (µ, λ),
(µ, λ), and it models the duration it takes for the ion to reach the detector, relative to the start time of the particle event to which it belongs. For each nanoparticle,
(µ, λ) is independently sampled
(nions|NP, ptrans) times to get the duration it takes for each concerned analyte ion to reach the detector. This duration is then added to the start time associated to the corresponding nanoparticle.
This procedure is repeated for each simulated particle event. The number of analyte ions stemming from a nanoparticle nparti reaching the detector in the temporal range [iτdw, (i + 1)τdw] is then computed, for every integer i comprised between 0 and nreadings − 1.
Particle event generation is schematized in the lower part of Fig. 1.
function in
. For a so-called ideal detector (i.e., when one ion reaching the detector yields one single count),
is the identity function and it has no effect of the signal obtained by adding the background and the particle event components: the number of counts equals the number of ions having reached the detector during the dwell time under consideration.
Alternatively, a non-ideal detector, typically employed in time-of-flight instruments, can be considered. ICP-MS detectors are generally based on electron multipliers followed by either time-to-digital or analog-to-digital converters.60 The latter kind of conversion system increases the dynamic range of the instrument, but it is sensitive to the gain fluctuations of the electron multiplier. Indeed, when one ion strikes an electron multiplier, it produces a cascade of electrons, but their exact number varies from a particle striking event to another. Analog-to-digital systems record these fluctuations, whereas time-to-digital converters eliminate them.61 Consequently, the random nature of the gain of an electron multiplier contributes to the measured signal when analog-to-digital conversion is employed, hence the ‘non-ideal’ adjective used to qualify such detectors. When analog-to-digital conversion is active, the signal measured over a given dwell time is equal to the sum—on all the ion striking events—of the single-ion response produced by the detector. This single-ion response can either be measured or modelled, for instance with a lognormal or a gamma probability distribution.34 Accuracy of the lognormal modelling was demonstrated in ref. 14; this two-parameter probability distribution, entirely characterized by the mean m and the standard deviation s of its logarithm, is thus implemented in
. The two parameters of the lognormal distribution have to be provided by the user. Empirical values can be found in the literature14,60,62,63 and typical values for m and s are provided in Table 1. For a non-ideal detector,
is thus a lognormal distribution with parameters m and s.
The number of analyte ions ni reaching the detector between iτdw and (i + 1)τdw, be they of dissolved or particulate origin, is known for every integer i comprised between 0 and nreadings − 1. Indeed, for a given dwell time interval, it is simply the sum of the numbers of analyte ions of dissolved (see Sec. 3.1.1) and particulate (see Sec. 3.1.2) origins, ni = nparti + ndissi. For each value of i, the
(possibly) stochastic function is independently drawn ni times and the resulting values are summed to get the reading ri of the simulated time scan,
. The detecting stage is schematized in the rightmost part of Fig. 1.
Finally, the whole synthetic time scan is simply the sequence r0, r1, r2, …, rnreadings−1.
are shown in Fig. 2, for different background levels, particle event shapes, as well as various incoming fluxes of nanoparticles fpart and nanoparticle size distributions (monodisperse vs. polydisperse). Note that the same seed has been used for the random number generator controlling the occurrence of particle events, hence the synchronization of the particle events in the four subfigures of Fig. 2 sharing the same fpart parameter.
As observed experimentally for sufficiently diluted dispersions of nanoparticles, the synthetic time scans displayed in Fig. 2 are made of a sequence of spikes emerging from a fluctuating background. Indeed, parameters have been deliberately selected to remain in the very dilute limit where sp-ICP-MS experiments are conducted in practice,31 so that particle event coincidence is very unlikely. The time interval between particle events is exponentially distributed by construction of the algorithm (see Sec. 2), to match what is experimentally observed.30 As the flux fpart of nanoparticles entering the plasma increases, more and more particle events are crammed into the same time interval and overlapping can occur, with detrimental effects on data processing, although all the information regarding the nanoparticle number concentration and size is not completely lost if the background level remains low.31,32
Beside fpart, the other parameters having the most important influence on the allure of
outputs, especially in the very dilute limit favoured to run experiments, are fdiss, ptrans, the response
of the detector, µ and λ. The effects of these parameters on the synthetic time scans produced by
are discussed in the two following subsections.
.37,64
When the detector is non-ideal, each analyte ion reaching it produces a single-ion response modelled with a lognormal distribution (see Sec. 3.1.3). Previous studies have shown that a standard deviation s = 0.47 of the logarithm of the random variable is well suited to model the detector response of TOFWERK and Nu Instruments time-of-flight mass spectrometers currently on the market.14 The mean value m of the logarithm of the random variable was taken equal to m = −0.5s2, for the mean value of the lognormal distribution itself, exp(m + s2/2), to be equal to one. This choice of parameters entails that on average, the mean number of counts associated with one detected ion is equal to 1. But of course, because of the stochastic nature of the response of a non-ideal detector, there is some variability around this average response, and the number of counts during a dwell time is the realization of a compound Poisson distribution.60 The bottom part of Fig. 3 illustrates the impact of detector non-ideality: the low level background signal is no longer Poisson distributed‡ and the normal approximation of the high level background distribution is less accurate than it was in the ideal case. Such discrepancies have already been observed with time-of-flight mass spectrometers and are related to the presence of analogue-to-digital converters equipping the detectors of these instruments.60
, in particular the mean µ and shape λ parameters of the inverse Gaussian distribution modelling the time dispersion of analyte ions originating from a single nanoparticle. Particle events simulated with
being inherently stochastic, variability can be quite important between two particle events, even with the same input parameters. To somehow smooth this variability, we resorted to Monte Carlo simulations to compute the average shape of a particle event for different sets of realistic values for µ and λ. As expected given the properties of the inverse Gaussian distribution, these simulations illustrate that the asymmetry of the synthetic particle events depends on the ratio µ/λ: asymmetry is barely noticeable when µ/λ < 0.02, becomes clearly visible when µ and λ are of the same order of magnitude and is unmistakable when µ/λ ≫ 1.
In addition, both parameters control the duration of an average synthetic particle event. For a given value of the mean parameter µ, the duration of a synthetic particle event is a decreasing function of the shape parameter λ. Reciprocally, for a given value of the shape parameter λ, the duration of a synthetic particle event is an increasing function of the mean parameter µ. Besides, Fig. 4 shows that the duration of an average synthetic particle event also depends on the total number of counts associated with the particle event. No background signal has been considered in Fig. 4, but in the presence of some background signal, the duration of a synthetic particle event would be positively correlated with its cumulative number of counts, provided that all other factors remain equal.
Finally, the number of particle events may be less than the number of nanoparticles numerically injected in
. This is because a necessary condition for a nanoparticle to give rise to a particle event is that at least one of its constitutive atoms makes it up to the detector, once ionized and after transmission through the instrument. According to eqn (2), the probability than at least one ion among nions reaches the detector is equal to 1 − (1 − ptrans)nions: this expression is an increasing function of ptrans and of nions: as expected, the more efficient the transmission and the higher the mass of the analyte in the nanoparticle, the better the chance of getting a particle event.
, particularly with regard to the transport of the analyte ions of particulate origin between the plasma and the detector. Of all the parameters used in
, the particle event shape distribution is the one that has received the least attention in the literature so far. In particular, experimentally measured and
simulated particle events will be compared to assess the soundness of the inverse Gaussian transport model.
000 times. The monitored isotope was 197Au.
For experiments performed with the Vitesse, as an analog to natural nanoparticles, a BHVO-2 (Hawaiian Volcano Observatory Basalt) geochemical reference material, purchased from the United States Geological Survey, was drilled using a microsampling device (MicroMill Sampling System, New Wave Research). The particles created were collected with a clean brush and then dispersed in 50 mL of ultrapure water. This suspension was then left to settle overnight, and the top 5 mL was collected in order to avoid introducing coarse particles in the Vitesse. Prior to analysis, the suspension was sonicated for 90 s and, after reviewing initial measurements, freshly diluted using ultrapure water to avoid NP coincidences. Silica, calcium and iron are major constituents of BHVO-2, hence the focus on 28Si, 40Ca and 56Fe, three isotopes with good yet distinct sensitivities on the Vitesse.65
Time scans generated were visually inspected, and individual spikes were manually selected based on the absence of apparent coincidences and saturation. This selection criterion maximizes the likelihood that each spike corresponds to a single particle event and ensures a relatively high signal-to-noise ratio, reducing the chances of interference from background, overlapping particle signals and excessive spike intensity.
, with an ideal detector, this distribution is unaltered, whereas it is transformed into a compound Poisson distribution when the detector is non-ideal, as explained in Sec. 3.1.3 and displayed in the bottom part of Fig. 3. As a consequence, the background of a synthetic sp-ICP-MS signal can be either Poisson distributed, as observed with quadrupole mass spectrometers37,64 or compound Poisson distributed, as observed with time-of-flight mass spectrometers60 depending on the single-ion response—identity or lognormal—selected in
to model the detecting stage.
Once synchronized, all the experimental signals are added and the resulting aggregated spike is first normalized and then fitted with the probability density function (PDF) of an inverse Gaussian distribution in order to find the optimal mean and shape parameters. The shape parameter of the inverse Gaussian distribution is sensitive to the choice of the reference spike. To avoid as much as possible arbitrariness in the choice of this reference, we repeat the following procedure: each experimental spike is sequentially chosen as the reference spike, the previous algorithm is applied and we choose the reference spike that minimizes the mean-squared error between the fitted PDF and the aggregated–normalized spike.
As shown in Fig. 5, the quadrupole 197Au aggregated particle event is almost bell-shaped, whereas time-of-flight aggregated particle events can range from slightly (28Si) to very asymmetric (40Ca and 56Fe). For a given isotope and a given instrument, Fig. 5 demonstrates that the aggregated spike built from all the individual experimental spikes can be very well adjusted by an inverse Gaussian distribution. The optimal mean and shape parameters used to draw the fitting curves, as well as their associated one standard deviation uncertainties, are gathered in Table 2. Lastly, we checked that the inverse Gaussian distribution remains an adequate transport model when the nanoparticles display a significant dispersion in size (see SI).
![]() | ||
| Fig. 5 From top to bottom: aggregated and normalized particle events for 56Fe, 40Ca, 28Si (Vitesse sp-ICP-ToF-MS measurements for these three isotopes) and 197Au (Agilent 8900 sp-ICP-QQQ-MS measurements). Experimental data are represented with black circles. The grey lines are the numerical adjustments to the inverse Gaussian distribution. Experimental spikes are provided in the SI and fitting parameters employed are given in Table 2. | ||
| 28Si | 40Ca | 56Fe | 197Au | |
|---|---|---|---|---|
| µ/τdw | 9.21 ± 0.13 | 7.19 ± 0.07 | 11.9 ± 0.1 | 5.56 ± 0.67 |
| λ/τdw | 69.4 ± 3.7 | 19.7 ± 1.0 | 42.0 ± 1.4 | 180 ± 68.3 |
The single set of inverse Gaussian parameters found that way is then used to assess whether it can adequately model each individual experimental spike that were extracted. Since each particle event—be it real or synthetic—is a realization of an underlying stochastic process, which here is assumed to be unique for a given time scan, for each experimental spike, we computed 1000 synthetic particle event realizations based on the previously inferred inverse Gaussian parameters with a number of counts equal to the number of counts of each experimental spike. Mean values with their standard deviations can thus be calculated for the synthetic particle events and compared with the experimental spikes (see SI). Adjustments appear to be of good quality for numerous experimental spikes. Some discrepancies are nevertheless visible (see Fig. S4) for certain individual particle events. They may be related to the simplicity of the time dispersion model (inverse Gaussian distribution), to a possible dependence of the allure of the spikes on the location where the nanoparticle vaporizes, etc. The parsimony principle that we retained, i.e., the existence of a single set of inverse Gaussian parameters able to model each individual particle event, may also be too restrictive.
Beside the inverse Gaussian distribution, the same procedure has been applied with another distribution, the skew normal distribution.67 Because of the asymmetry we just mentioned, it turns out that in general, a simple normal distribution is not well suited to model sp-ICP-MS particle events. A skew normal distribution is much better adapted, and visually, it comes quite close to the achievements of the inverse Gaussian distribution (cf. SI). Still, according to the mean-squared error, an optimized inverse Gaussian distribution outperforms its skew normal counterpart for the adjustment of the aggregated-normalized spike. Regarding the individual particle events, we found that the inverse Gaussian inferred from the fit of the aggregated–normalized spike provides the best adjustment with respect to the mean-squared error. For instance, in the case of the 56Fe experimental spikes, 81 out of 100 are better adjusted by the inverse Gaussian distribution. The lower statistical performance displayed by the skew-normal distribution and the theoretical arguments put forward in Sec. 2.2 validate that the inverse Gaussian distribution is the PDF of choice for the simulation of sp-ICP-MS particle events.
Ion transport within the instrument has been modelled in an effective way. We did not attempt to deal with plasma physics, ion thermal diffusion, ion advection by auxiliary gases or trajectories of charged particles in electrostatic or magnetic fields. All these phenomena are strongly dependent on the instrument and tuning conditions. Nevertheless, the simplified approach we have taken, relying on a Bernoulli distribution to model ion transport within a mass spectrometer, gives satisfactory results.
The algorithm has been implemented in Python and open-sourced on GitHub. This implementation, named
, includes a few additional choices regarding the shape, the size distribution of the nanoparticles, and the behaviour of the detecting stage (time-to-digital or analogue-to-digital conversion can be modelled). All these choices can be modified easily, even by a rather unexperienced programmer, thanks to the open-source nature of the code and its simplicity:
does not rely on any exotic external libraries and has no dependencies, so the source code is very flexible and easy to tweak. The only drawback that may slow the adoption of
is that it is a command-line only software, without any graphical user interface. However, it was designed to be easy to use and a use case is described in the GitHub repository of
.
We have been able to show that the synthetic time scans generated with
are consistent with experimental single channel time-resolved sp-ICP-MS signals measured with quadrupole or time-of-flight mass analysers. The background component is either Poisson distributed (as classically observed with quadrupole mass spectrometers) or compound Poisson distributed (as with time-of-flight instruments), depending on the single-ion response selected by the user. As for the particle events generated by
, the inverse Gaussian distribution used to model the time dispersion of analyte ions included in a single nanoparticle is compatible with spikes extracted from experimental time scans, measured on synthetic dispersions or dispersions produced from a reference material.
We feel that
opens up some new avenues for research. First and foremost, this code can be used as a testbed to assess existing or new data processing strategies. Moreover, thanks to the realistic time scans generated by
, one could also envision to use this code to train machine learning or deep learning algorithms. New capabilities could be added to the software in the future, as the possibility to simulate signals produced by the coupling between a laser ablation system and a mass spectrometer.
A shorter term perspective would be the simulation of multi-channel single particle ICP-MS time scans. This would allow to generate synthetic time-resolved signals mimicking those measured with a time-of-flight mass spectrometer when multi-elemental nanoparticles are analysed. We intend to pursue this interesting line of research in the near future.
Python program implementing the algorithm presented in this article is publicly available at https://github.com/peyneau/spGen. The specific version of the
library used for this article is the release JAAS-2025: https://github.com/peyneau/spGen/releases/tag/JAAS-2025.
The synthetic and the experimental data supporting this article have been included as part of the supplementary information (SI). Supplementary information is available. See DOI: https://doi.org/10.1039/d5ja00232j.
Footnotes |
† If v denotes the constant and uniform drift velocity, D the coefficient of effective diffusivity and η a Gaussian white noise with zero mean and unity standard deviation, then the stochastic equation describes the trajectory of a Brownian particle drifting at velocity v. One can show that the evolution of the underlying probability density p(x, t) obeys the Smoluchowski equation, ∂tp = −∂x(vp) + D∂xxp, which is in this case the advection–dispersion equation with constant parameters.39 |
| ‡ If it were, the parameter of the Poisson distribution would be equal to the average number of counts during a dwell time. When this value is used to compute the histogram that is expected if the readings were Poisson distributed, one gets the values represented by the black crosses that are markedly different from the empirical histogram (in purple) calculated from the synthetic time scan with a low level of background signal and a non-ideal detector, as shown in the bottom right part of Fig. 3. The non-Poisson character of the background signal when the detector is non-ideal is rigorously demonstrated in the SI. |
| This journal is © The Royal Society of Chemistry 2026 |