Open Access Article
Bikramaditya
Mandal
a,
Dmitri
Babikov
b,
Phillip C.
Stancil
c,
Robert C.
Forrey
d,
Roman V.
Krems
e and
Naduvalath
Balakrishnan
*a
aDepartment of Chemistry and Biochemistry, University of Nevada, Las Vegas, Nevada 89154, USA. E-mail: naduvala@unlv.nevada.edu
bDepartment of Chemistry, Marquette University, Milwaukee, WI 53233, USA
cDepartment of Physics and Astronomy and the Center for Simulational Physics, University of Georgia, Athens, GA 30602, USA
dDepartment of Physics, Penn State University, Berks Campus, Reading, PA 19610, USA
eDepartment of Chemistry, University of British Columbia, Vancouver, BC V6T 1Z1, Canada
First published on 21st October 2025
Water (H2O) is one of the most abundant molecules in the universe and is found in a wide variety of astrophysical environments. Rotational transitions in H2O + H2O collisions are important for modeling environments rich in water molecules but they are computationally intractable using quantum mechanical methods. Here, we present a machine learning (ML) tool using an ensemble of neural networks (NNs) to predict cross sections to construct a database of rate coefficients for rotationally inelastic transitions in collisions of complex molecules such as water. The proposed methodology utilizes data computed with a mixed quantum-classical theory (MQCT). We illustrate that efficient ML models using NNs can be built to accurately interpolate in the space of 12 quantum numbers for rotational transitions in two asymmetric top molecules, spanning both initial and final states. We examine various architectures of data corresponding to each collision energy, symmetry of water molecules, and excitation/de-excitation rotational transitions, and optimize the training/validation data sets. Using only about 10% of the computed data for training, the NNs predict cross sections of state-to-state rotational transitions in H2O + H2O collisions with an average relative root mean squared error of 0.409. Thermally averaged cross sections, computed using the predicted state-to-state cross sections (∼90%) and the data used for training and validation (∼10%), were compared against those obtained entirely from MQCT calculations. The agreement is found to be excellent with an average percent deviation of about ∼13.5%. The methodology is robust, and thus applicable to other complex molecular systems.
Atmospheres of icy planets and their moons, such as Jovian moons, are known to have an anisotropic distribution of water vapor, affecting the properties of the observed water line. While most of such atmosphere is collision-less, the sub-solar point supports intense sublimation and photoinduced desorption, which results in a distribution that is not in local thermodynamic equilibrium (non-LTE) and driven by molecular collisions, such as excitation and de-excitation of H2O. Interpreting these and many other observations requires numerical modeling and relies on the knowledge of precise excitation and quenching schemes for ortho- and para-H2O. Large uncertainties of rate coefficients for these transitions can affect the predictions of astrophysical models by orders of magnitude.9,10 To characterize emission as a function of coma radius, modeling with radiation transfer codes, such as RADEX,11 LIME,12 or MOLPOP,13 is necessary, which in turn requires collision rate coefficients as an input. To understand and correlate with the observed rotational spectra from ALMA or JWST missions, one requires state-to-state collisional rate coefficients for the rotational excitation and quenching processes. These rate coefficients are difficult to calculate for complex collision systems such as H2O + H2O and HDO + H2O. Astrophysical models also require inelastic scattering rate coefficients for a range of other complex collision systems, including CH3OH + CO, H2O + HCN, H2CO + CO, and H2O + CH3OH.14
Databases such as BASECOL15 and LAMDA16 have been developed to simplify the process of obtaining rate coefficients for different molecular systems. The rate coefficients can be computed by a quantum mechanical treatment of the collision problem as implemented in a few codes available to the scientific community, such as MOLSCAT,17,18 HIBRIDON,19 and TwoBC20 or using a mixed quantum/classical theory (MQCT),21 when quantum calculations are not practical.
The inelastic collisions of H2O with H2O are almost impossible to study using fully quantum methods because the water molecule has a dense spectrum of quantum states.22–24 However, significant progress has been made recently by Mandal et al. to study rotationally inelastic collisions of two water molecules using MQCT.25–27 MQCT has proven its ability to produce accurate results with computational efficiency for inelastic collisions of several molecular systems.21,23,25–39 In this approach, the relative translational motion of collision partners is treated classically using the mean-field trajectories method, while rotations and vibrations (i.e., internal degrees of freedom of the colliding molecules) are treated quantum mechanically.21 In the current implementation of the MQCT method for H2O + H2O collisions, the H2O molecules are treated as rigid rotors.21
Using the MQCT methodology, a database of thermally averaged cross sections (TACSs) (averaged over a thermal population of rotational levels of the partner H2O molecule) was first published by Mandal and Babikov26 followed by a database of thermal rate coefficients.27 This database of TACSs and the rate coefficients contained 231 transitions in para-H2O and 210 transitions in ortho-H2O (both treated as the target molecule) in the temperature range of 5 ≤ T ≤ 1000 K. In a subsequent study, a database of both rotational temperature (Trot) and kinetic temperature (Tkin) dependent rate coefficients was built to model non-LTE environments using RADEX for H2O + H2O collisions.25
While MQCT can be applied to collisions of two water molecules, the computational complexity remains challenging. As elaborated by Mandal and Babikov,26 the computation involves evaluation of matrix elements of the interaction potential in a basis of rotational wave functions of the water molecules that are required for the simulation of mixed quantum/classical trajectories. In the prior work, the computation of these matrices alone required about ∼2.7 M CPU hours in the HPC facility Raj at Marquette University (AMD Rome 2 GHz processors, memory 512 GB). Additionally, the total cost of the scattering calculations (trajectory simulations) for six collision energies was about 5.25 M CPU hours using the same HPC facility. Altogether, the cost of the MQCT calculations of rate coefficients for H2O + H2O collisions was nearly 8 million CPU hours. More importantly, several months of human work were needed to manage the ongoing simulations and carry out numerous post-processing analyses to convert state-to-state cross sections to the rate coefficients to be deposited into the databases. While significant speedup in the computation of the relevant coupling matrices has been achieved recently, the trajectory simulations remain computationally demanding.
The challenge of such massive computational tasks is two-fold. First, computing the TACSs for 231 transitions in para- and 210 transitions in ortho-H2O required generating over a million cross sections for individual state-to-state transitions at each collision energy. Independent calculations for a total of 3268 initial states combining both the target and quencher molecule needed to be completed to build the database. Secondly, each of these simulations for individual initial states and collision energies required computation of four large interaction potential matrices, each containing about 1.35 million coupling elements. With these two factors, building an extended database for such a complex system using direct calculations is often not practical. Machine learning may instead prove useful for constructing such a complex database.
Machine learning (ML) has found widespread applications in recent years in many areas of physics and chemistry, including condensed matter physics,40–42 nuclear physics,43–45 astronomy,46–48 particle physics,49–51 quantum many-body physics,52–54 cosmology55–57 and fitting of multi-dimensional potential energy surfaces (PESs) from electronic structure calculations. Permutationally invariant polynomials (PIPs) combined with neural networks (NNs) by Bowman, Guo, and others58–69 and Gaussian process regression (GPR) by Krems and coworkers70–75 have been widely adopted for building PESs of complex molecular systems. ML has also been used to predict rate coefficients for inelastic collisions of diatomic molecules for astrophysical modeling from a smaller set of available data and improve the accuracy of approximate quantum scattering calculations.47,75–80 Quantum machine learning is also being actively explored.81–87
To our knowledge, ML has so far not been applied to develop a complex database of rate coefficients for collisions involving two triatomic molecules. In this work, our goal is to reduce the computational effort needed to build databases for complex colliding partners, such as H2O + H2O, by implementing and incorporating ML algorithms into this process so that more such databases can be produced, and made available to the modeling community. For this purpose, we make use of previously computed cross sections for individual state-to-state transitions for collisions of two water molecules as a benchmark, and for training machine learning models. The goal is to use the available data to explore if it is feasible to construct a reliable model for efficient interpolation in the large space of quantum states of two triatomic molecules.
This paper is organized as follows: Section 2 briefly discusses the theory to compute thermally averaged rate coefficients for H2O + H2O collisions, data pre-processing and architecture of the machine learning models employed here. In Section 3, we discuss results obtained from the ML models using NNs and compare them against the MQCT data not used for training the ML models. A summary of our findings is given in Section 4.
, is explained in detail by Mandal et al.25 Only relevant equations to compute the TACSs are provided below.
Thermal rate coefficients for state-to-state transitions at a given kinetic temperature Tkin are computed by averaging the corresponding cross sections over a Maxwell–Boltzmann distribution of relative velocities for all relevant collision energies, Ec, as follows:
![]() | (1) |
is the average collision velocity, μ is the reduced mass of the collision complex, and the subscripts n1n2 and
indicate the initial and final states, respectively. Each n is a composite index that represents a full set of quantum numbers for one molecule. For example, for water molecules, n denotes jkAkC, where j is the rotational quantum number and kA and kC are the projections of j along the axis of the largest and smallest moment of inertia, IA and IC, respectively. Furthermore, for para-H2O (nuclear spins of two H atoms are anti-parallel), kA + kC is even and for ortho-H2O (nuclear spins of two H atoms are parallel), kA + kC is odd. Since our focus is on the target H2O molecule, we compute the rate coefficients for water molecules by summing over all final states and averaging over all initial states of its collision partner (quencher):![]() | (2) |
![]() | (3) |
![]() | (4) |
The computation of the state-to-state rate coefficients in eqn (1) reaches practical limits for complex systems, like H2O + H2O, due to the enormous numbers of individual state-to-state transitions
. In the work reported by Mandal and Babikov,26 231 para–para and 210 ortho–ortho transitions were computed for the target H2O, considering a maximum value of j1 = 7. This required a rotational basis set with 38 states each for the para- and ortho-isomers of the quencher H2O molecule for which a maximum value of j2 = 10 was adopted. This led to over a million individual state-to-state transitions
, considering all the initial states of the molecular system as elaborated in the Introduction section. Ideally, these calculations need to be done on a grid of collision energy dense enough to perform the integral over the collision energy Ec, as shown in eqn (1). The human effort needed to manually check millions of individual transitions and implement them in eqn (1) is very labor intensive.
To tackle this challenge, an alternative approach was introduced by exchanging the order of integration in eqn (1) with the summation in eqn (2) as follows:
![]() | (5) |
of the target H2O molecule is introduced:![]() | (6) |
The TACSs are computed as follows: first, all the individual state-to-state transition cross sections are summed over the final states of the quencher H2O molecule, and then the resulting sums are averaged over the initial states of the quencher H2O molecule for a given value of rotational temperature Trot and collision energy Ec. Since the number of rotational transitions between the states of a target molecule is relatively small, it is much easier to check the behavior of all TACSs before they are integrated over the collision energy in eqn (5). As previously mentioned, in the work of Mandal and Babikov,26 the number of rotational transitions in para- and ortho-water considering only de-excitation processes was 231 and 210, respectively, and the number of collision energies was six, making them easier to manually check and ensure proper behavior.
The computed TACSs for these six collision energies can then be used for analytical fits to compute kinetic temperature dependent rate coefficients as described in detail by Mandal et al.25 However, in this work, the computed TACSs are the main goal; therefore, the details of computing rate coefficients using analytical expressions are not discussed here. These 231 transitions in para- H2O and 210 transitions in ortho-H2O for the target molecule are used in this work as a benchmark of the ML predictions.
As a result of the exponential decay, the cross sections vary by several orders of magnitude as the energy difference ΔE increases. Therefore, we start by testing whether the very small cross sections are necessary when the individual cross sections are converted to the TACS. The small-magnitude cross sections are expected to add unnecessary noise to the data, leading to increased complexity of the NNs. Fig. 2 displays a comparison between TACSs for state-to-state transitions with all individual cross sections included compared to the case in which cross sections smaller than 0.01 Å2 were omitted. The TACSs including all individual cross sections are plotted along the horizontal axis, while the TACSs computed without σ < 0.01 Å2 are plotted along the vertical axis. The black dashed line is the perfect agreement while blue circles and red crosses represent the TACSs for para and ortho-H2O targets, respectively. It is clear that omitting σ < 0.01 Å2 in eqn (5) has no appreciable effect on the accuracy of the TACSs. Therefore, we eliminate individual state-to-state cross sections with a magnitude of <0.01 Å2 to build our machine learning models.
For training and validation, we take slices from different ranges of ΔE to make a subset of the entire data as follows:
![]() | (7) |
The datasets used for training, validation, and testing are displayed in Fig. 3 as a function of ΔE. The red circles represent the subset of data used for training and validation, while the blue crosses indicate the remaining data used for testing. We note that the different energy slices chosen for training and validation may not always include the smallest and the largest ΔE for a given initial state. To avoid extrapolation in ΔE, these lowest or highest energy transitions are also added to the training data for these initial states. Thus, there is a subset of points in the training dataset that lies outside the region specified by eqn (7), as shown in Fig. 3. About 80% of the data points selected in this way are used for training, while the remaining 20% are used for validation.
![]() | ||
| Fig. 3 A visual representation of the data used for training and validation as well as testing. The red circles denote the data for training and validation and the blue crosses denote the data for testing as given by eqn (7). | ||
In general, we aim to design the training and validation dataset that consists of cross section values for which the energy difference of the transition lies within the entire range so that there are no predictions to be made outside the range of the training data. All the cross sections for the remaining transitions are used as the test data set. In this work, multiple slices are made through the whole data set for ML models to optimize the computational efficiency and accuracy of NNs, as discussed later.
Due to the differing slopes of the excitation (ΔE < 0) and quenching (ΔE > 0) wings, a single neural network was unable to adequately capture the behavior of both. Therefore, separate NNs were constructed for these two regimes. Also, each collision energy and para- and ortho-H2O symmetries were treated separately to build distinct NNs. For each of the six collision energies, we trained four NNs based on combinations of excitation and quenching transitions for both para and ortho symmetries, resulting in a total of 24 ML models trained and validated on their respective datasets.
Each of our dataset has thirteen features as input parameters for the NNs: rotational quantum numbers of the initial and final states of the first water molecule
, the second water molecule
, and the energy difference between the initial and final states of the molecular system, ΔE. The NNs are designed to interpolate over all these input features. Since the input features are composed of different data types (integers for rotational quantum numbers of the initial and final states while float for the energy difference) with a large variation in the magnitude of the input data, they needed to be scaled for the NNs to work optimally. This is done by using the “StandardScaler” function from the “sklearn” package to have zero-mean and unit-variance as
, where u and s are the mean and the standard deviation of the features, respectively. This standardization of the input data is done so that none of the features get higher weights just because of their magnitude being larger than the values of other features. Note that this scaling is applied only to the input features as listed previously, and not the cross section, i.e., output. The same transformation is applied uniformly to all three data sets: training, validation and testing.
In our data analysis, we found that the dependent feature, i.e., cross sections for individual state-to-state rotational transitions, vary by several orders of magnitude. We found that the NNs do not perform well for data that vary over several orders of magnitude. To resolve this issue, we used the logarithm of the cross section in our ML modeling [y = log10(σ)] as the target. The predictions are then converted back to cross sections as σ = 10y.
![]() | ||
| Fig. 4 Architecture of the ML models composed of four hidden layers with each having 128 neurons, thirteen features for the input layer, and one output neuron for the logarithm of the cross sections. | ||
The rectified linear unit (ReLU) [f(x) = max(0,x)] is used as the activation function in all hidden layers. We used the Adam optimizer to train the NNs with a learning rate of 0.0001.88 The details of the NNs including the number of parameters for each layer are provided in Table 1.
| Layer name | No. of neurons | No. of parameters | Activation |
|---|---|---|---|
| Input | 13 | 13 | |
| Hidden layer 1: dense | 128 | 1792 | ReLU |
| Hidden layer 2: dense | 128 | 16 512 |
ReLU |
| Hidden layer 3: dense | 128 | 16 512 |
ReLU |
| Hidden layer 4: dense | 128 | 16 512 |
ReLU |
| Output | 1 | 129 |
Our ML models were built using a TensorFlow with a batch size of 32 and a maximum of 300 epochs.89 An early stopping mechanism with a patience of 30 using the validation dataset was adopted to prevent the NNs from overfitting. Additionally, we used ridge regularization (i.e., L2 regularization) with a penalty of 0.01 to the weights of the kernels for each hidden layer. This discourages large weights and reduces the complexity of our NNs. A summary of the NN hyperparameters is provided in Table 2.
| Hyperparameters | Type/value | Additional details |
|---|---|---|
| Batch size | 32 | |
| Optimizer | Adam88 | Learning rate = 0.0001 |
| Kernel regularization | Ridge (L2) | Regularization weight = 0.01 |
| Maximum no. of epochs | 300 | |
| Early stopping | Implemented | Patience = 30 |
The mean-squared error function,
, was used as the loss function to quantitatively analyze the performance of our ML models applied to the training data. Here, n is the total number of samples used, and the variables yiMQCT and yipredicted represent the logarithm of the actual cross sections from MQCT calculations and the cross sections predicted by the NNs, respectively. We also monitored the
for the interpretation of the predicted data.
![]() | (8) |
The sets of training/validation and test data from Dataset 2 are displayed in Fig. S4. The comparison with the MQCT data for individual state-to-state cross sections became slightly worse for both para and ortho-H2O targets as shown in Fig. S5 and S6, respectively. The predicted TACS also displays larger discrepancies with the MQCT TACS as shown in Fig. S7 of the SI.
![]() | (9) |
This is the training/validation set that we label as the best performing and the most optimized for reliable predictions. We adopt the data structure shown by eqn (9) for collision energies of 133, 200, and 267 cm−1. Because the density of individual cross sections near the elastic peak decreases rapidly with collision energy, for Ec = 400 cm−1 and higher, we replace ΔE of < 10 cm−1 by ΔE of < 15 cm−1 in eqn (9) to have adequate sampling near ΔE = 0 cm−1. The resulting training/validation and test data corresponding to individual state-to-state rotational transitions are displayed in Fig. 5 for the para-H2O molecule at the highest and lowest collision energies. For other collision energies and ortho-H2O molecules, the dataset is very similar.
A summary of the sizes of the training, validation and test datasets from eqn (9) is given in Table 3 that includes both exchange symmetries of para and ortho-H2O molecules and both excitation and quenching transitions. The size of the training data is about ∼10% of the entire dataset after removing the small-magnitude cross sections and limiting the range of energy gaps to ΔE of ≤ 300 cm−1. The size of the training, validation and test datasets remains almost the same for the lower three collision energies for both para and ortho symmetries and for both excitation and quenching transitions. The same applies to the higher three collision energies, but the size of the training, validation and test datasets is slightly different as explained before.
| E c (cm−1) | H2O symmetry | ΔE sign | Training data size | Validation data size | Test data size | Relative RMSE |
|---|---|---|---|---|---|---|
| 133 | para | (+ve) | 34 422 |
7772 | 273 223 |
0.430 |
| 200 | para | (+ve) | 34 424 |
7772 | 273 232 |
0.379 |
| 267 | para | (+ve) | 34 445 |
7778 | 273 374 |
0.421 |
| 400 | para | (+ve) | 39 165 |
8957 | 267 713 |
0.402 |
| 533 | para | (+ve) | 39 162 |
8957 | 267 783 |
0.404 |
| 708 | para | (+ve) | 39 199 |
8966 | 267 967 |
0.397 |
| 133 | ortho | (+ve) | 33 161 |
7494 | 259 189 |
0.429 |
| 200 | ortho | (+ve) | 33 164 |
7495 | 259 241 |
0.384 |
| 267 | ortho | (+ve) | 33 186 |
7501 | 259 329 |
0.389 |
| 400 | ortho | (+ve) | 37 526 |
8586 | 254 107 |
0.387 |
| 533 | ortho | (+ve) | 37 541 |
8590 | 254 221 |
0.380 |
| 708 | ortho | (+ve) | 37 552 |
8592 | 254 307 |
0.367 |
| 133 | para | (−ve) | 36 196 |
8213 | 291 562 |
0.438 |
| 200 | para | (−ve) | 36 137 |
8199 | 291 201 |
0.442 |
| 267 | para | (−ve) | 36 161 |
8205 | 291 181 |
0.462 |
| 400 | para | (−ve) | 40 837 |
9374 | 285 384 |
0.414 |
| 533 | para | (−ve) | 40 864 |
9380 | 285 566 |
0.431 |
| 708 | para | (−ve) | 40 891 |
9387 | 285 943 |
0.404 |
| 133 | ortho | (−ve) | 35 544 |
8089 | 280 565 |
0.444 |
| 200 | ortho | (−ve) | 35 525 |
8084 | 280 295 |
0.430 |
| 267 | ortho | (−ve) | 35 528 |
8085 | 280 334 |
0.416 |
| 400 | ortho | (−ve) | 39 844 |
9163 | 275 045 |
0.424 |
| 533 | ortho | (−ve) | 39 859 |
9167 | 275 191 |
0.409 |
| 708 | ortho | (−ve) | 39 880 |
9172 | 275 370 |
0.381 |
The predicted cross sections for the individual state-to-state rotational transitions of both para and ortho-H2O molecules are shown in Fig. 6 and 7, respectively. The horizontal axis represents the MQCT cross sections while the NN predictions are plotted along the vertical axis. The black dashed line would be the perfect agreement while the red dots represent the comparison. It is seen that the predicted cross sections agree reasonably well with the MQCT cross sections and are within the range of acceptable accuracy for both para and ortho symmetries of the H2O molecule. While the smaller cross sections (corresponding to larger ΔE) exhibit higher discrepancies (relative to their magnitudes), their overall contribution to TACSs is less significant.
![]() | ||
| Fig. 6 A comparison of ML predicted state-to-state cross sections against the actual MQCT results for the para-H2O molecule. | ||
![]() | ||
| Fig. 7 A comparison of ML predicted state-to-state cross sections against the actual MQCT results for the ortho-H2O molecule. | ||
To further quantify the errors in the NN predictions for each collision energy, specific symmetry of the H2O molecule (para or ortho), and excitation/quenching transitions (ΔE < 0 or ΔE > 0), we report in the last column of Table 3 the relative RMSE or RRMSE, defined as:
![]() | (10) |
We have also examined whether a similar level of accuracy can be reached with a fewer number of hidden layers and number of neurons in each layer to reduce the complexity of the NNs since the training dataset in this case is relatively small compared to dataset 1. We found that reducing the number of hidden layers does not significantly make the prediction worse with respect to the RMSE values nor meaningfully improve the computational efficiency. Therefore, we retained four hidden layers with 128 neurons each.
Finally, we composed the TACSs following eqn (6) using the predicted individual state-to-state transitions for the test data combined with original training and validation data and compared against the actual MQCT TACS. Fig. 8 displays this comparison. The blue circles and red crosses correspond to the para and ortho symmetries of the H2O molecule, respectively. These TACSs are computed for a rotational temperature Trot = 300 K. The agreement is excellent between the actual MQCT TACS and the NN predictions. The agreement does not improve significantly when more data are included by extending the range of the ΔE as shown in Fig. S3 of the SI using dataset 1.
It should be noted that the original MQCT state-to-state cross sections were divided into a training and validation set (about 10%) and a test set (about 90%). In the comparison provided in Fig. 8, the test set was replaced by the cross sections produced by NN predictions. To demonstrate the important contributions of the NN predicted data toward the overall TACSs, we computed the TACSs by using only the training and validation data. The resulting TACSs are compared against those obtained by including the NN predictions and the actual MQCT TACS in Fig. 9 for ortho-H2O. Similar results for para-H2O are provided in Fig. S8 of the SI. The TACSs using all of the original MQCT cross sections are plotted along the horizontal axis, while the red crosses and blue circles correspond to the TACS computed with and without incorporating the NN predictions into the training and validation data. The dashed black curve would represent the perfect agreement. These red crosses are the same as shown in Fig. 8. It can be seen that the blue circles deviate significantly from the perfect agreement represented by the black dashed diagonal line. Thus, the contribution of the NN-predicted cross sections is significant accounting for nearly 90% of the original MQCT cross sections not used for training and validation. Therefore, the approach presented here can be applied to other complex molecular systems to substantially reduce the complexity involved in rate coefficient calculations.
![]() | ||
| Fig. 9 Thermally averaged cross sections computed with (red crosses) and without (blue circles) incorporating the ML predicted cross sections into the training and validation data for ortho-H2O molecules plotted against the original MQCT TACS. The red crosses are the same as in Fig. 8. | ||
The TACSs are one of the main ingredients for astrophysical models and serve as an important input to the numerical codes of radiative transfer modeling, such as RADEX, MOLPOP or LIME. To confirm that the accuracy of the NNs implemented here is sufficient for these models, we computed the percentage deviation of the TACSs as predicted by our NNs and the original MQCT TACSs. The computed percentage deviation is then averaged over all transitions for target H2O molecules over 231 para-H2O and 210 ortho-H2O transitions. The resulting data, presented in Table 4, illustrate that the agreement is excellent at the level of TACS despite larger deviations at the state-to-state level. Because TACSs are the main quantity that is relevant for modeling energy transfer in astrophysical environments, the level of accuracy attained in our ML models is adequate for astrophysical models. The average percentage deviation is about ∼13% and ∼14%, while the largest error is about ∼15% and ∼17% for para and ortho-H2O targets, respectively. This is very encouraging since our goal is to reduce the requirements of the computational resources to explicitly compute the relevant TACSs. This is achieved in our proposed workflow without losing significant accuracy but at a very low computational cost.
| E c (cm−1) | Average % difference | |
|---|---|---|
| para-H2O | ortho-H2O | |
| 133 | 11.87 | 10.00 |
| 200 | 13.80 | 13.73 |
| 267 | 12.50 | 14.18 |
| 400 | 13.06 | 14.72 |
| 533 | 13.30 | 14.28 |
| 708 | 15.00 | 16.99 |
To illustrate the accuracy of TACSs derived from the NNs, a plot of percent deviation for all 441 transitions of the target H2O molecule considering both ortho- and para-H2O symmetries for all six collision energies is displayed in Fig. 10 as a function of the magnitude of the actual MQCT TACSs. It is seen that for larger values of TACSs, the percentage deviation remains in the range of ∼10–20% considering all collision energies and both symmetries of the target H2O molecule. The agreement is better for lower values of collision energies and decreases slightly at higher collision energies. We also notice an increase in the percentage deviation for a lower magnitude of the TACSs for both para and ortho-H2O transitions, which is expected.
Besides the size of the training and validation data, there are several other approximations that one can make to reduce the computational cost of the proposed ML approach. One can eliminate small-magnitude cross sections to reduce noise, but these cross sections are hard to identify without scattering calculations. Therefore, we do not consider this in our efficiency estimation. Another approximation is made based on the magnitude of the range of |ΔE|. This can be explored prior to large scale MQCT calculations in order to achieve the most efficient workflow proposed in our methodology.
On average, the size of the test data is ∼7.5 times larger than the sum of the sizes of the training and validation data. It was found from previous studies by Mandal et al.32 that the cost of MQCT calculations, within the adiabatic trajectory approximation (AT-MQCT) methodology, increases as N3 (typical of coupled-channel calculations), where N is the number of rotational states in the basis. Note that a larger rotational basis is needed for higher |ΔE| transitions. Thus, excluding higher |ΔE| transitions, as discussed in the various architectures of datasets, can drastically reduce the computation cost.
The efficiency of the ML approach should consider both the cost of constructing the relevant transition matrices and the actual cost of propagating the trajectories. However, these matrices now also contain significantly reduced numbers of transitions compared to the original MQCT calculations. Computation of each of these matrices in the original MQCT calculations required about 676
000 CPU hours and a total of four matrices were computed. However, it is difficult to estimate the cost savings in computing these matrices with a smaller number of transitions. Additionally, a newer version of MQCT is expected to significantly speedup these calculations making them computationally less demanding compared to the MQCT trajectory simulations. Thus, we do not include it in our estimation of computational efficiency.
As reported earlier,26 the cost of the MQCT trajectory calculations was about 5
250
000 million CPU hours. Considering that the test data for dataset 3 are about 7 times larger than the training and the validation data and that computationally demanding high |ΔE| transitions (|ΔE| > 300 cm−1) were excluded from the dataset, we estimate the cost for the trajectory calculations to be around 100
000 CPU hours for producing the relevant data for training and validation. Overall, we expect about a factor of 50 savings in the computational cost from our approach. The actual CPU time for training the NNs is insignificant (about ∼5.5 CPU hours) once the training and validation data are in place. This is a remarkable gain in efficiency. Using the methodology presented here, an expanded database of much needed rotational transitions in water molecules can be computed at a reasonable computational cost.
Prior applications of GP and NNs for predicting state-to-state cross sections/rate coefficients for atom–diatom scattering involved interpolation in spaces of quantum states defined by only four quantum numbers (v, j and v′, j′) of the diatomic molecules.77 Similarly, in the application of GP models for diatom–diatom scattering, at most 8 quantum numbers (vi, ji,
of two diatomic molecules) are considered in the work of Jasinski et al.47 though recent studies of Mihalik et al.79 and Wang et al.80 considered only changes in ro-vibrational levels of SiO in collisions of SiO and the ground state para-H2. Here, the ML approach is shown to yield reliable results for interpolation in the space of 12 quantum numbers for H2O + H2O collisions. Our approach of using an ensemble of NNs for a range of collision energies paves a pathway for efficient computation of rate coefficients for state-to-state rotational transitions in collisions of two asymmetric top molecules.
For practical purposes, an estimated speed up factor of ∼50 is expected using our proposed pipeline exploiting the physics behind the energy transfer process (exponential decay of rate coefficients with energy gap) and utilizing deep learning algorithms. The cross section data for rotationally inelastic scattering of two H2O molecules computed using MQCT shows a double exponential feature as a function of the energy difference between the initial and final states. NNs constructed and demonstrated in this work appear to capture this feature during training and yield models that successfully predict cross sections for state-to-state rotational transitions in H2O + H2O collisions from which accurate thermally averaged cross sections are derived. Our tested NNs achieved an excellent accuracy level for a higher magnitude of thermally averaged cross sections. In the future, we hope to use this methodology to compute additional rotational transitions in water to extend the existing database.
The methodology presented here is robust and general and can be implemented for other systems of complex colliding partners, such as HCN + H2O and HDO + H2O, and collisions of atoms and diatoms with other polycyclic aromatic hydrocarbons. By utilizing the machine learning models using neural networks as proposed in this work, these computationally demanding scattering calculations are expected to become significantly more affordable. This proposed workflow is expected to open a new avenue in the near future to populate databases such as BASECOL for astrophysical modeling.
| This journal is © the Owner Societies 2025 |