Anton Vladyka*(a), Christoph J. Sahle(b) and Johannes Niskanen(a)
(a) University of Turku, Department of Physics and Astronomy, 20014 Turun yliopisto, Finland. E-mail: anton.vladyka@utu.fi; johannes.niskanen@utu.fi
(b) European Synchrotron Radiation Source, 71 Avenue des Martyrs, 38000 Grenoble, France. E-mail: christoph.sahle@esrf.fr
First published on 15th February 2023
We report a statistical analysis of Ge K-edge X-ray emission spectra simulated for amorphous GeO2 at elevated pressures. We find that machine learning approaches can reliably predict the statistical moments of the Kβ′′ and Kβ2 peaks in the spectrum from the Coulomb matrix descriptor with a training set of ∼10⁴ samples. Spectral-significance-guided dimensionality reduction techniques allow us to construct an approximate inverse mapping from spectral moments to pseudo-Coulomb matrices. When applying this to the moments of the ensemble-mean spectrum, we obtain distances from the active site that closely match those of the ensemble mean and which moreover reproduce the pressure-induced coordination change in amorphous GeO2. With this approach utilizing emulator-based component analysis, we are able to filter out the artificially complete structural information available from simulated snapshots, and quantitatively analyse structural changes that can be inferred from the changes in the Kβ emission spectrum alone.
The pressure-dependent evolution of the germanium coordination by oxygen in glassy GeO2 has been a long-standing subject of study.21–24 Besides applications of amorphous GeO2 in technical glasses, the increased sensitivity of a-GeO2 to pressure compared to amorphous SiO2 motivates the study of structural changes similar to those expected to occur in the pressurized analogue glass a-SiO2 but at greatly reduced absolute pressures. Detailed knowledge of the compaction mechanisms in these simple glasses will have direct consequences for our understanding of geological, geochemical, and geophysical processes involving more complex silicate glasses and melts.
X-ray emission spectroscopy (XES) of GeO2 is an inviting case for the development of spectroscopic analysis for soft and amorphous condensed matter. First, large spectroscopic changes with changing local structure are known to exist.24 Second, simulations are known to reproduce the observed ensemble-mean effects well.25 Third, XES is local-occupied-orbital derived and a few orbital-bonding neighbor atoms are expected to be decisive for the spectrum outcome. This would result in a minimal set of structural parameters needed to predict XES. Last, owing to the chemical simplicity and simple bonding topology due to non-molecular structure, this system has promise to be reproduced by ML with the limited number of data points that the condensed phase allows. Namely, for such systems the electronic simulation needs to account for multi-electron effects in numerous interacting atoms, typically on the level of density functional theory. As a consequence, the number of individual structural data points for spectroscopy can be expected to be ∼10⁴ in an extensive contemporary simulation.
In this work, we focus on Ge Kβ XES calculations of amorphous GeO2 at elevated pressures. Our previous work on the water molecule indicated that predicting spectral features is easier than predicting structural features.14 In the condensed phase, where the structural features to be predicted are more numerous, the task is arguably even more complicated. As a solution to this dilemma, we build a procedure on spectrum prediction for structures, dimensionality reduction and iterative optimization algorithms. This approach is possible because the evaluation of an ML model requires far fewer computational resources than the corresponding quantum mechanical calculation does. We predict statistical moments of XES lines from a Coulomb matrix16 that describes the local atomic structure around the site of characteristic X-ray emission. Next, we study obtainable structural information for the occurring spectral changes in the pressure progression of the XES by emulator-based component analysis (ECA).15 Last, we investigate an approximate solution to the spectrum-to-structure inverse problem by first transforming it into an optimization task in the dimension-reduced ECA space, followed by expansion to the full multi-dimensional Coulomb matrix. A dedicated evaluation data set allows for assessment of performance of the approach in each of the aforementioned tasks.
We study data of amorphous GeO2 from statistical spectral simulations over a range of 11 pressure values from 0 GPa to 120 GPa. These XES spectra, simulated using the OCEAN code26,27 (version 2.5.2), are based on real-space configurations from ab initio molecular dynamics simulations reported earlier by Du et al.28 We used the Quantum ESPRESSO program package (version 5.0)29,30 for sampling the ground state wave functions and electron densities at the gamma point with a plane wave cutoff of 100 Ry (see Ref. 25 for more details). Transition matrix elements are then calculated using the Haydock recursion method31 as implemented in the OCEAN code using a Lorentzian width of 1.0 eV for the continued fraction. At each pressure point, Ge Kβ XES spectra of 18 structurally uncorrelated AIMD simulation snapshots containing 72 GeO2 formula units were calculated for each Ge atom. For 5 pressure points only 17 out of 18 snapshots yielded spectra in a finite time frame due to convergence issues, resulting in 13896 XES spectra. The spectra of individual Ge sites were aligned and normalized for each pressure to yield a constant Kβ5 line peak position and intensity in its ensemble average spectrum.
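The size of the data set follows directly from the numbers quoted above. The short check below is only a bookkeeping sketch; the variable names are illustrative and not taken from the original work.

```python
# Bookkeeping check of the number of simulated spectra (names are illustrative).
n_pressures = 11          # pressure points between 0 and 120 GPa
n_snapshots = 18          # AIMD snapshots per pressure point
n_ge_per_snapshot = 72    # 72 GeO2 formula units -> 72 Ge sites per snapshot
n_short_pressures = 5     # pressure points where only 17 snapshots converged

full = n_pressures * n_snapshots * n_ge_per_snapshot
missing = n_short_pressures * 1 * n_ge_per_snapshot   # one snapshot lost at each
n_spectra = full - missing

print(n_spectra)  # 13896
```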
Even though extensive from a statistical simulation viewpoint, the available dataset is still rather limited for sophisticated ML algorithms. In this case, using numerical descriptors allows structural and spectral information to be condensed into a few parameters, which improves ML performance. We apply descriptors to both the spectrum and the atomic structure of the system (see below).
![]() | (1) |
![]() | (2) |
![]() | (3) |
The definition of the Coulomb matrix implies that it can be inverted to a distance matrix containing interatomic distances by
![]() | (4) |
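As an illustration of this inversion, the sketch below assumes the commonly used Coulomb matrix definition (diagonal elements 0.5 Z_i^2.4, off-diagonal elements Z_i Z_j / R_ij); the exact form used in this work may differ. The function names and the three-atom toy geometry are hypothetical.

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Coulomb matrix under the ASSUMED standard definition:
    C_ii = 0.5 * Z_i**2.4, C_ij = Z_i * Z_j / |R_i - R_j| for i != j."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)            # placeholder, avoids division by zero
    C = np.outer(Z, Z) / dist
    np.fill_diagonal(C, 0.5 * Z**2.4)      # overwrite the diagonal
    return C

def distances_from_coulomb(Z, C):
    """Invert the off-diagonal elements: R_ij = Z_i * Z_j / C_ij."""
    Z = np.asarray(Z, dtype=float)
    D = np.outer(Z, Z) / np.asarray(C, dtype=float)
    np.fill_diagonal(D, 0.0)               # self-distances are zero
    return D

# Toy geometry (hypothetical): one Ge with two O neighbours at 1.74 Angstrom
Z = [32, 8, 8]
R = [[0.0, 0.0, 0.0], [1.74, 0.0, 0.0], [0.0, 1.74, 0.0]]

C = coulomb_matrix(Z, R)
D = distances_from_coulomb(Z, C)
print(D.round(3))
```

The first row of `D` then holds the distances from the first atom, which in this toy example plays the role of the active site.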
To check the performance of the Coulomb matrix descriptor against a descriptor in the spirit of the many-body tensor representation,32 we used snapshot-wise evaluated radial distribution functions (RDF) from the active site. Although similar predictive power was obtained via the RDF, its performance in the later steps of the analysis (spectral coverage of decomposition) was inferior to that of the Coulomb matrix.
![]() | (5) |
The ECA method requires an emulator capable of predicting spectral moments for new structural data points. As an emulator, a trained multilayer perceptron (MLP) with 2 hidden layers and 64 neurons in each layer was used. We dedicated 80% of the data to training and 20% to evaluation of the prediction. In total, all configurations of MLPs with 2–3 hidden layers and 64 or 128 neurons were evaluated on the training dataset (∼11000 spectra) using mean squared error as a training metric.
For comparison, we used partial least squares fitting based on singular value decomposition (PLSSVD)33 as applied to the X-ray spectroscopic problem in ref. 15. The PLSSVD algorithm relies on projections of spectral and structural feature vectors on latent variables between which a linear fit is made. The method results in an approximation of the data up to rank k
![]() | (6) |
Fig. 3a–c present structures and spectra of three individual snapshots at pressures of 0, 30, and 120 GPa, respectively. Fig. 3d–k in turn show the prediction and training performance of the chosen MLP for these descriptors. In the figure, a perfect match between known and predicted data lies on the diagonal dashed line. Furthermore, the positions of the three illustrated spectra of Fig. 3a–c, as well as the mean moment values for each pressure point against moment values of the known mean spectrum, are indicated by crosses.
The spectra and their statistical moments show a clear trend as a function of pressure. Moreover, the predictions reach Pearson correlation coefficients above 0.94. The pressure-induced progression in the spectra is transferred into spectral moments, for which the ML task proved to be easier than predicting spectra as vectors of channel-wise-listed intensity values (see Fig. S2, ESI†). Analogously with simple intensity prediction, spectral moments of an ensemble-averaged spectrum can be estimated by the mean of predicted moments to a good accuracy (see crosses in Fig. 3d–k). However, this is an approximate finding instead of a mathematical equality.
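A toy calculation can show why the mean of the moments need not equal the moments of the mean spectrum. The sketch below assumes intensity-weighted mean and standard deviation as the moments (the definitions used in this work may differ) and uses two deliberately exaggerated Gaussian lines of unequal area, so the discrepancy is far larger than in the actual data.

```python
import numpy as np

energy = np.linspace(-10.0, 10.0, 2001)   # arbitrary energy grid (toy units)

def gaussian(center, width, area):
    """Gaussian line with a given integrated intensity on the grid."""
    g = np.exp(-0.5 * ((energy - center) / width) ** 2)
    return area * g / g.sum()

def first_two_moments(intensity):
    """ASSUMED moment definitions: intensity-weighted mean and std."""
    w = intensity / intensity.sum()
    mean = (w * energy).sum()
    std = np.sqrt((w * (energy - mean) ** 2).sum())
    return mean, std

# Two toy "site" spectra with different positions and intensities
spectra = np.array([gaussian(-1.0, 0.5, 1.0), gaussian(2.0, 0.5, 3.0)])

moments_each = np.array([first_two_moments(s) for s in spectra])
mean_of_moments = moments_each.mean(axis=0)                      # average of per-site moments
moments_of_mean = np.array(first_two_moments(spectra.mean(axis=0)))  # moments of averaged spectrum

print(mean_of_moments, moments_of_mean)
```

In this example the plain average of the two line positions is 0.5, whereas the mean spectrum is dominated by the stronger line and has its first moment at 1.25; its width also includes the spread between the two line positions.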
For the evaluation data set, some 77% of generalized covered spectral variance (R² score) can be explained by only a single ECA component ṽ1 (83% with two components {ṽ1,ṽ2}). These components represent individually standardized elements of a Coulomb matrix unrolled to 153-dimensional vectors (for {ṽ1,ṽ2} rolled back to the standardized Coulomb matrix differences, see Fig. S3, ESI†). For PLSSVD, corresponding spectral variance coverages were 73% and 77% for one and two components, respectively. The added contribution of the second component indicates a rapid drop of improvement in higher ranks.
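The idea of covered spectral variance for a single component can be sketched as follows. This is a simplified illustration under stated assumptions, not the ECA implementation of ref. 15: the emulator is a hypothetical analytic function that depends on one hidden direction `u`, and the covered variance is taken as a summed multi-output R² score between the known outputs and the emulator prediction for data projected onto a candidate vector.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 6
X = rng.standard_normal((n, d))    # toy standardized structural descriptors

u = np.zeros(d)
u[0] = 1.0                         # hidden direction the toy spectra depend on

def emulator(P):
    """Hypothetical nonlinear emulator: two 'moments' from one coordinate."""
    s = P @ u
    return np.column_stack([np.tanh(s), s ** 2])

Y = emulator(X)                    # 'known' spectral moments of the toy data

def covered_variance(v, X, Y):
    """ASSUMED score: R^2 of emulator output for data projected onto v."""
    v = v / np.linalg.norm(v)
    X_proj = np.outer(X @ v, v)    # rank-one projection t * v for each point
    resid = Y - emulator(X_proj)
    total = Y - Y.mean(axis=0)
    return 1.0 - (resid ** 2).sum() / (total ** 2).sum()

w = np.zeros(d)
w[1] = 1.0                         # a spectrally irrelevant direction

rho_relevant = covered_variance(u, X, Y)
rho_irrelevant = covered_variance(w, X, Y)
print(rho_relevant, rho_irrelevant)
```

A vector along the spectrally relevant direction covers all of the variance in this toy case, whereas a vector along an irrelevant degree of freedom covers none, which is why an optimization of this score would drive the weight of such degrees of freedom towards zero.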
Before entering the inverse problem, it is instructive to analyse the decomposition of first rank, i.e. along the path t1ṽ1. Since the emulator provides the nonlinear response to the input vector, ECA is able to mimic the behavior of the moments more closely than PLSSVD, which is linear by definition (for the spectral moments along t1 see Fig. S4, ESI†). For this reason, dimensionality reduction by ECA will be better adjusted to the spectral response; even with inaccuracies in prediction by the emulator, higher covered spectral variance is still obtained than from PLSSVD. For a majority of the atoms, both PLS and ECA trajectories follow the pressure-wise ensemble mean interatomic distances R0i from the active Ge site along the path (Fig. S5, ESI†). However, for atoms Ge3, Ge4, O3, O4 and O7 the ECA trajectories show a different behavior, which indicates that the role of these atoms in deciding the spectral outcome is low compared to other atoms.
The interpretation of spectra would ideally lead to structures constructed from the spectroscopic information. However, even with a moderate number of degrees of freedom (here 153) this problem is tedious. In line with findings in ref. 14, training an emulator to directly predict the Coulomb matrix from the spectral moments was not successful with the model selection grid, data and descriptors used here (a mean Pearson correlation coefficient of 0.33 was obtained). Therefore we looked at approaches that rely on spectrum prediction by an emulator, which in general performs better. However, an emulator-based approach of iteratively fitting the parameters to yield the 8 desired spectral moments also proved to be an unstable high-dimensional problem that we were unable to solve. Instead, fitting a few ECA component scores for matching spectral moments is a much simpler task that could be solved.
We searched for the coordinates t in the standardized dimension-reduced space by minimization of the least-squares error
![]() | (7) |
Fig. 4 shows coordinates t = (t1,t2) for each point from the evaluation data set from fitting of eqn (7). Deduced coordinates for the set of moments of each mean-ensemble spectrum (blue line) are in good agreement with projections of known mean points (black line) for a given pressure ensemble on the same subspace. It appears, though, that this reconstruction of the scores ti misses the second component, possibly because the component is already insignificant and the emulator is known to be imperfect. Knowledge of scores ti allows construction of an approximate Coulomb matrix as a linear combination up to rank k. The absolute p is obtained after inverse standardization, as are C and R.
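The fit of scores in the reduced space can be sketched with a toy model. Everything below is an assumption made for illustration: the emulator is a hypothetical analytic function rather than a trained MLP, the basis vectors are random orthonormal vectors rather than ECA components, and a brute-force grid search stands in for the iterative optimizer used in this work.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, n_moments = 10, 2, 8        # toy sizes; the real descriptor has 153 elements

# Orthonormal toy basis vectors (rows), standing in for ECA components
V = np.linalg.qr(rng.standard_normal((d, k)))[0].T      # shape (k, d)
W = 0.5 * rng.standard_normal((d, n_moments))           # toy emulator weights

def emulator(P):
    """Hypothetical emulator: standardized descriptor -> 8 spectral moments."""
    return np.tanh(P @ W)

# Target moments generated from known scores, so the fit can be checked
t_true = np.array([1.5, -0.5])
m_target = emulator(t_true @ V)

# Least-squares error between emulated and target moments on a 2D grid of scores
grid = np.linspace(-3.0, 3.0, 121)
T1, T2 = np.meshgrid(grid, grid, indexing="ij")
T = np.column_stack([T1.ravel(), T2.ravel()])
loss = ((emulator(T @ V) - m_target) ** 2).sum(axis=1)
t_fit = T[np.argmin(loss)]

# Expansion back to the full (standardized) descriptor as a linear combination
p_fit = t_fit @ V
print(t_fit)
```

Only `k` scores are fitted instead of all descriptor elements, which is what makes the search tractable; the absolute descriptor would then follow from undoing the standardization of `p_fit`.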
Even though the mean interatomic distances are not necessarily obtainable from mean Coulomb matrix elements, and even though this matrix is not necessarily obtainable from the spectral moments of the ensemble-mean spectrum (which closely match the ensemble mean of the spectral moments), we find both to be the case. Fig. 5 depicts the reconstructed atomic distances from the spectral moments with one-dimensional and two-dimensional ECA space, indicating rapid convergence. Moreover, the reconstruction is at least qualitatively correct as seen from comparison with the known values for the evaluation data set, the most notable discrepancy being the 5th closest O atom at low pressures. This behavior can be understood in terms of reduced sensitivity of the spectra to these atomic distances; the first ECA component does not capture the drastic relative change in the parameter value (Fig. S5, ESI†), and even the second component does not fix this shortcoming. Likewise, for the overall match on the data set, the vector ṽ1 results in underestimation of the O6 distance at large t1 (high pressures), which leads to the line crossings in Fig. 5a. However, the pressure-induced coordination change from 4-coordinated Ge to 6-coordinated Ge21 is clearly discernible around the pressure of 10 GPa by the increase of the Ge–O separation for the first four oxygen atoms and the concomitant decrease of the Ge–O distance for the fifth and sixth nearest oxygen neighbors. We note that while the first row of the constructed Coulomb matrix represents ensemble-averaged distances, the structure constructed from the mean Coulomb matrix is nonsensical.
Instead of more direct approaches, approximate solution of the inverse problem by reconstruction of the first ECA components proved to be a feasible task to solve by optimization. It is natural to select these parameters so that they explain most spectral variance. As a result, a converging expansion is obtained in which less and less relevant degrees of freedom are added, and finally the irrelevant ones are identified and filtered out. Imperfection of the emulator and incompleteness of the basis are likely reasons for the crossings of lines in Fig. 5a.
Structural analysis of the AIMD trajectory results in a complete analysis of structural changes across the data set. However, this information does not indicate what can be concluded based on the XES alone, as the sensitivity of core-level spectra to structural parameters may vary greatly.15,34 A parameter without an effect on a spectrum certainly cannot be expected to be reconstructed from it, and thus spectral insensitivity to a structural degree of freedom presents a danger of misinterpretation. The design of ECA means that a spectrally irrelevant structural degree of freedom obtains zero projection in the basis vector and is, in principle, omitted in subsequent analyses. Therefore, effects shown by ECA, and analysis based on it, can be considered to be inferred from a spectrum and its change. This reasoning is supported by Fig. 5, where the magnitudes of change from 0 GPa to 120 GPa in the known distance curves mostly exceed those of the predicted ones. For oxygens O3 and O4, whose total change is negligible (< 0.05 Å), the predicted end-to-end difference exceeds that of the known data. Depending on details of an analysis, other, rather minor, violations of the tendency can be found in the data.
For the 17 atoms and 11 pressures, the mean absolute deviation from the known ensemble-mean distances for 2-component decomposition was 0.091 Å for ECA and notably 0.051 Å for PLSSVD, with which we also carried out the analysis (see Fig. S7–S9, ESI†). We interpret the better performance of PLSSVD to be due to more emphasis placed on structural variance in the method, whereas ECA focuses strictly on spectral significance. Thus PLS is allowed to know more from the simulated structural parameter space than the spectra alone would allow. However, the method undoubtedly benefited from the choice of descriptors guided by the ML studies, which made the data suitable for a linear model. In addition, imperfections in the ECA results stem from the imperfection of the emulator.
Since the studied XES involves transitions of electrons from the occupied valence to localized deep core levels, the associated transition matrix elements become naturally limited to the immediate neighbourhood of the active atomic site. The occupied valence orbitals, in turn, can be expected to participate in chemical bonding, and thus to render these transitions sensitive to e.g. the coordination number of the active site. It is an interesting yet open question to which degree the findings presented here generalize to other systems, and specifically to those posed by XES of high-pressure science. Assuming that the GeO2 studied here is not exceptional, these spectra are capable of delivering far more structural information than it may at first seem.
Decomposition of the structural sensitivity of spectra reduces the number of free parameters to be solved in the inversion problem to only a few that have been chosen a priori for their spectral significance. The basis vectors of such a decomposition span a subspace of degrees of freedom with the most spectral response, and therefore reconstruction via this subspace will show structural effects with true inference from the change of spectra. This prevents spectrally irrelevant structural information, available in a simulation work, from affecting the analysis. Partial least squares fitting such as PLSSVD offers a usable and much lighter alternative where machine learning is not feasible, but the method is not as strict in spectrum-only inference.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2cp05420e
This journal is © the Owner Societies 2023