Pei
Liu
,
Wei
Li
* and
Shuhua
Li
*
Key Laboratory of Mesoscopic Chemistry of Ministry of Education, New Cornerstone Science Laboratory, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, Jiangsu 210023, People's Republic of China. E-mail: shuhua@nju.edu.cn; wli@nju.edu.cn
First published on 11th February 2025
Pure phosphoric acid exhibits high proton conductivity and is widely used in modern industry. However, its proton transport mechanism remains less understood compared to that of water, which presents a significant challenge for advancing technologies like phosphoric acid fuel cells. In this study, we utilize machine learning potentials and molecular dynamics (MD) simulations to investigate the proton diffusion mechanisms in liquid phosphoric acid systems. The neural network potentials we developed demonstrate quantum chemical accuracy and stability across a range of temperatures. Our simulations reveal continuous proton hopping between phosphoric acid anions. Moreover, the radial distribution functions and diffusion coefficients obtained from ring polymer MD—a variant of path-integral MD—exhibit improved alignment with experimental values compared to classical MD results, as ring polymer MD inherently accounts for nuclear quantum effects on proton behavior. Additionally, we employed neural networks combined with the charge equilibration method to predict the charge distribution in liquid phosphoric acid, examining the proton transport mechanism through vibrational spectra analysis.
To gain deeper insights into the properties and behavior of liquid phosphoric acid systems, molecular dynamics (MD) simulations are a commonly employed approach. Classical force fields (FFs), such as GROMOS15 and COMPASS,16 are typically used to explore the structural and dynamical properties of liquid phosphoric acid systems.17–19 These classical FFs can reproduce experimental radial distribution functions (RDFs) and densities,17 For example, Zhu et al. utilized classical FFs to investigate the microscopic structure and hydrogen bonding characteristics of pristine and phosphoric acid doped polybenzimidazoles.18 However, these classical FFs have precision limitations and are incapable of describing chemical reactions, making them inadequate for accurately modeling proton transfer processes in liquid phosphoric acid systems. In contrast, ab initio MD (AIMD) simulations are capable of describing chemical reactions with higher precision, providing detailed results such as vibrational spectra,20 PT chains,6 and intermediate scattering functions21 of phosphoric acid. Nonetheless, the computational cost of AIMD scales significantly with system size, often limiting simulations to small systems and time scales of only a few ps.
To overcome these limitations and bridge the gap between efficiency and accuracy, various machine learning (ML) algorithms have been developed to construct FFs with ab initio accuracy by learning from reference ab initio data.22–27 ML potential (MLP)-based MD simulations combine the efficiency of classical FF methods with the precision of AIMD.25,26,28–32 In these methods, the local atomic environment of each atom serves as the input, and the total energy of a large system is calculated by summing the atomic energy contributions, which depend on the local environment. Equivariant neural networks (NNs), such as NequIP,33 Allegro,34 and MACE,35 are employed to construct equivariant interatomic potentials. These potentials rely on relative interatomic position vectors and higher-order geometric tensors, rather than just scalar distances or angles. This enables the network to capture more complex geometric relationships and exhibit both rotational and translational equivariance. Compared to conventional neural network potentials (NNPs), this results in superior predictive accuracy and significantly higher sample efficiency. Jinnouchi utilized ML potentials to conduct MD simulations of liquid phosphoric acid to investigate its proton conduction properties.36 However, the obtained RDFs and diffusion coefficients showed discrepancies compared to experimental values. We attribute these discrepancies to the neglect of nuclear quantum effects (NQEs) associated with protons in the simulations, which have been shown in previous studies to significantly impact PT.37–39
In this work, we combined path-integral MD (PIMD) with NNPs, based on hybrid density functional theory (DFT), to construct a dataset for liquid phosphoric acid at multiple temperatures. This dataset was then used to train NNPs for liquid phosphoric acid. We conducted classical Born–Oppenheimer MD, abbreviated as MD, simulations and ring polymer MD (RPMD)40 simulations utilizing our NNP. The results indicate that the impact of NQEs is clearly observed due to the substantial presence of protons. These NQEs significantly influence the structure and atomic arrangement of liquid phosphoric acid, thereby affecting the RDFs. Proton diffusion coefficients in the RPMD simulations were higher than those in the MD simulations, as NQEs lower the energy barriers for PT in liquid phosphoric acid. Additionally, the simulated vibrational spectra revealed overlapping P–O peaks and a broad O–H peak, reflecting the presence of hydrogen-bonding networks in liquid phosphoric acid, which facilitates long-range PT.
To obtain equilibrium lattice structures, we employed the SCC-DFTB44 semi-empirical method to optimize the three initial configurations and performed NVT MD simulations at 298.15 K, 333.15 K, and 350.15 K for 200 ps each using the CP2K software.45 In the MD simulations, the time step was set to 1 fs for efficient structural relaxation. We randomly sampled 200 structures from the last 100 ps of each trajectory to create an initial dataset, which was then used to train preliminary NNs for subsequent active learning iterations.
Fig. 1 illustrates the periodic structure of liquid phosphoric acid in the dataset. All calculations in this study were performed at the Gamma point. Single-point calculations were carried out using the CP2K software with the mixed Gaussian and plane wave (GPW) method at the PBE0-D3(BJ)/TZVP-MOLOPT-GTH level.48–50 The auxiliary density matrix method (ADMM)51 was applied with admm-dzp as the auxiliary basis set. The Coulomb truncation of the hybrid functional was set to 6.0 Å, while the plane wave cutoff was established at 550 Rydberg, with convergence tests for the cutoff energy shown in Fig. S1 and Table S1 of the ESI.† Atomic charges were calculated using the Hirshfeld-I method.52
The NN model used in this work is the MACE equivariant graph NN (GNN). In the MACE equivariant GNN, the interatomic graph comprises atomic nodes and edges connecting these atoms.35,53 An atom j near atom i is included in the graph if it is within the truncation radius rij, which is a hyperparameter of the NN. The features of node i are represented by , where t denotes the layer number, k indexes this series of features, which also corresponds to the number of chemical channels. L specifies the feature set of node i as well as the transformation of
:
![]() | (1) |
The network outputs the atomic energy EA for each atom A based on the local environment descriptors of the atoms. The total energy E is the sum of the atomic energies in the system:
![]() | (2) |
The atomic force FA on atom A is the negative gradient of the total energy with respect to the position of atom A.
The NNP developed for liquid phosphoric acid is termed PBE0-NN. Additionally, we combined the MACE model with the charge equilibration (Qeq) method to develop a NN for predicting the charge distribution within liquid phosphoric acid systems, termed Qeq-NN. In Qeq-NN, atomic charges are determined by minimizing a simplified energy expression:
![]() | (3) |
![]() | (4) |
![]() | (5) |
![]() | (6) |
The hyperparameters are provided in Table 1:
Model | Cutoff radius r/Å | Chemical channels k | Max L |
---|---|---|---|
Qeq-NN | 4.5 | 16 | 1 |
PBE0-NN | 4.5 | 128 | 1 |
Following several rounds of active learning, the final dataset for liquid phosphoric acid comprised 4947 energies, 4511
664 atomic forces, and 1503
888 atomic charges. This dataset was randomly split into a training set and a validation set with an 88% to 12% ratio and was used to train the MACE equivariant GNNs, PBE0-NN and Qeq-NN. The test set, consisting of 600 frames of periodic liquid phosphoric acid systems, was randomly sampled from all six trajectories, which included both MD and RPMD ones at three different temperatures.
We evaluated the predictive accuracy of the two NNs. For the test set, the correlation coefficient (R2) between the atomic forces predicted by PBE0-NN and the reference values was 99.84% (see Fig. S3, ESI†), demonstrating excellent predictive performance. The mean absolute errors (MAEs) for energies and atomic forces were 0.238 meV per atom and 35.13 meV Å−1, respectively, closely aligned with the errors observed on the training and validation sets (see Table S2, ESI†). This consistency suggests that the NNP has achieved quantum chemical accuracy. Additionally, the MAE of atomic partial charge predictions in the periodic liquid phosphoric acid system, obtained using the Qeq-NN model, was as low as 0.003 e. This indicates that we can efficiently and accurately predict the distribution of partial charges throughout the trajectory, enabling precise analysis in subsequent trajectory studies. To assess the computational speed improvement of NNP MD simulations using PBE0-NN compared to AIMD, we tested systems of three different sizes (see in Fig. S4 and Table S4, ESI†). The results showed that our NNP achieved a performance efficiency improvement of at least 2400-fold over AIMD on CPU. Given that the NNP is deployed to run on GPUs, the computational efficiency in actual simulations is even more substantial. The computational resources for generating the dataset and training the ML model are compared in Section S3 of the ESI.†
![]() | (7) |
![]() | (8) |
The diffusion of molecules and atoms in liquid phosphoric acid is crucial for understanding its transport properties. Given the strong PT capabilities of phosphoric acid under standard conditions, investigating the proton diffusion coefficient provides valuable insights into the mechanisms that facilitate efficient PT in liquid phosphoric acid. In condensed-phase systems, the diffusion coefficient for a specific atomic species can be derived from the slope of the mean square displacement (MSD) of the corresponding atoms:
![]() | (9) |
![]() | (10) |
The vibrational density of states (VDOS) is a tool used in MD simulations to analyze the dynamic properties of a system. The VDOS can be calculated from the autocorrelation function of atomic velocities, vi(t), obtained from the MD trajectories:
![]() | (11) |
![]() | (12) |
![]() | (13) |
The RPMD-simulated dynamical properties of liquid phosphoric acid are calculated by first averaging the coordinates, velocities, and dipole moments of different beads. These averaged values are then substituted into the corresponding formulas to obtain the desired dynamical properties.
![]() | ||
Fig. 2 Radial distribution function of the phosphoric acid system at 333.15 K, obtained from PBE0-NN MD and RPMD simulations, compared with the experimental data.43 The variable X represents oxygen (O) and phosphorus (P) atoms. |
Peaka | Position/Å | Average CNsb | Assignment |
---|---|---|---|
a Peak labels as shown in Fig. 2, where X represents O and P atoms. b Values presented before and after the slash correspond to MD and RPMD simulations, respectively. Experimental values of pure phosphoric acid from ref. 43 are provided in parentheses. n.m.: not meaningful. | |||
gX–X (1) | 1.57 (1.54) | 1.10/1.19 (1.16) | P–O bonds |
gX–X (2) | 2.55 (2.51) | 3.80/3.84 (3.82) | O–O intramolecular |
gH–X (1) | 1.00 (0.98) | 0.95/0.95 (0.95) | H–O bonds |
gH–X (2) | 1.53 (1.54) | 0.73/0.76 (0.75) | H–O through hydrogen bonds |
gH–X (3) | 2.20 (2.20) | n.m. | H–P intramolecular |
gH–X (4) | 2.80 (2.80) | n.m. | H–O intramolecular |
gH–H (1) | 1.90–2.60 | n.m. | H–H across hydrogen bond |
gH–H (2) | 2.60–4.10 | n.m. | H–H intramolecular and intermolecular |
Fig. 2 illustrates that the PBE0-NN MD results align closely with experimental RDFs, particularly in terms of peak positions for gX–X, gH–X, and gH–H in the phosphoric acid system. However, there are some discrepancies in the shape and intensity of these peaks. The first peak in gX–X, located at 1.57 Å in the PBE0-NN MD simulations, corresponds to the P–O bond within the phosphoric acid molecule and shows a minor deviation of 0.03 Å from the experimental value. In gas-phase phosphoric acid, P–O bonds are classified into single bonds (1.60 Å) and double bonds (1.46 Å).20 In the condensed-phase phosphoric acid system, however, only a single peak appears in the RDFs due to the influence of a strong hydrogen-bonding network. The second peak in gX–X, at 2.55 Å, represents O–O intramolecular correlations, with a 0.04 Å deviation from the experimental value. Overall, the deviations between PBE0-NN MD-simulated and experimental gX–X peak positions are small. Additionally, there is little difference between RPMD and MD results for gX–X, as NQEs are less significant for heavier atoms like O and P.
The gH–X peaks from both MD and RPMD simulations are similarly positioned and closely match experimental values. The first peak at 1.00 Å in gH–X corresponds to the covalent O–H bond, with a deviation of only 0.02 Å from the experimental value. RPMD simulations provide a more accurate depiction of peak intensity and width compared to MD. The second peak in gH–X, located at 1.53 Å, represents the intermolecular O⋯H–O hydrogen bond, deviating by 0.01 Å from the experimental value. The overlap of these two peaks in gH–X suggests frequent proton transfers along strong O⋯H–O hydrogen bonds in liquid phosphoric acid. However, RPMD might slightly overestimate this overlap compared to experimental data. The third and fourth peaks in gH–X reflect non-bonded intramolecular H–P and H–O interactions, respectively, with RPMD providing a significantly better representation of these interactions than MD.
The first peak in gH–H, spanning 1.90 to 2.60 Å, captures H–H correlations across adjacent hydrogen bonds. The MD simulations, however, tend to overestimate the intensity of this peak, indicating that the hydrogen-bond network in phosphoric acid appears more structured in simulations than in experimental measurements. Nonetheless, the gH–H peaks from both MD and RPMD simulations qualitatively agree with experimental data.
As presented in Table 2, the average coordination numbers calculated (CNs) from RPMD and MD simulations show no significant differences, indicating that PBE0-NN accurately predicts the structure and local atomic environment of phosphoric acid. Furthermore, the gH–X RDFs in Fig. 2 reveal that RPMD, which accounts for NQE, produces RDFs with peak intensities and shapes that are more consistent with experimental values. The overlap of the O–H bond and O⋯H–O hydrogen bond peaks suggests a lower energy barrier for proton transfer between oxygen atoms, potentially indicating increased proton diffusion and hopping in RPMD simulations. This aspect will be explored in the following section.
According to previous studies,21 proton transfer predominantly occurs through the hydrogen bond network rather than a molecular vehicle, as illustrated in Fig. 3. To investigate this, we tracked the motion of a single proton. Fig. 4 depicts its bonding to various oxygen atoms over time, excluding transfers to the same oxygen atom to provide a clearer view of directional PT within the hydrogen bond network. The transport distance of this proton is approximately 2 Å, contributing significantly to the high conductivity observed in liquid phosphoric acid.
Fig. 5 compares the proton diffusion coefficients at different temperatures from both PBE0-NN MD simulations and experimental data, with specific values listed in Table 3. The diffusion coefficients from the PBE0-NN MD simulations show a clear upward trend with increasing temperature. In RPMD simulations, the diffusion coefficients exhibit an exponential increase with temperature, aligning well with the experimental results.
![]() | ||
Fig. 5 Comparison of proton diffusion coefficients in the phosphoric acid system obtained from PBE0-NN MD and RPMD simulations at various temperatures, alongside experimental values of pure phosphoric acid from ref. 63. |
Temperature/K | D/10−10 m2 s−1 | |
---|---|---|
H | P | |
a Values presented before and after the slash correspond to MD and RPMD simulations, respectively. Experimental values of pure phosphoric acid from ref. 63 are provided in parentheses. | ||
298.15 | 0.16/0.21 (0.40) | 0.04/0.06 (0.09) |
333.15 | 0.38/0.66 (1.27) | 0.06/0.22 (0.21) |
350.15 | 0.55/1.20 (2.02) | 0.11/0.39 (0.57) |
Table 3 also reveals that the diffusion coefficient of P is much smaller than that of H. This supports the conclusion that PT in phosphoric acid primarily follows a hopping mechanism, consistent with the mechanism shown in Fig. 3. The differences between simulated and experimental diffusion coefficients are within an order of magnitude. At temperatures of 298.15 K, 333.15 K, and 350.15 K, experimental diffusion coefficients for hydrogen in phosphoric acid were determined to be 0.40 × 10−10 m2 s−1, 1.27 × 10−10 m2 s−1, and 2.02 × 10−10 m2 s−1, respectively. Corresponding MD values were 0.16 × 10−10 m2 s−1, 0.38 × 10−10 m2 s−1, and 0.55 × 10−10 m2 s−1, while RPMD values were 0.21 × 10−10 m2 s−1, 0.66 × 10−10 m2 s−1, and 1.20 × 10−10 m2 s−1. Similar to previous simulations of liquid water systems, RPMD results show enhanced hydrogen diffusion coefficients compared to MD, highlighting the significant impact of NQEs on phosphoric acid dynamics. Both RPMD and MD simulations predict lower diffusion coefficients than the experimental values at corresponding temperatures. As shown in Fig. 2, the first H–H peak intensities predicted by both MD and RPMD simulaions are higher than experimental observations, which suggests that this discrepancy does not arise from the neglect of NQEs in the MD simulations. Similar discrepancy have been observed in previous PBE-based AIMD simulations of liquid phosphoric acid systems.20 This discrepancy may stem from the inherent limitations of the DFT method, as several studies have pointed out that DFT often struggles to accurately predict hydrogen bonding and tends to underperform relative to more advanced methods like MP2.64,65 This effect could result in protons becoming “trapped” in the simulations, leading to slower diffusion than in experiments, where additional bond-breaking and reforming events enhance proton mobility.
Fig. 2 also shows that the RDFs for hydrogen bonds and covalent bonds exhibit greater overlap in RPMD than in MD. This overlap lowers the energy barrier for proton hopping between oxygen atoms, facilitating proton diffusion. In contrast, MD predicts sharper, more intense RDF peaks for covalent bonds, hydrogen bonds, and H–O and H–P correlations, reflecting a denser phosphoric acid configuration that restricts proton mobility. By incorporating NQE, RPMD provides a more accurate representation of these interactions, which is crucial for capturing proton dynamics in phosphoric acid.
![]() | ||
Fig. 6 Comparison of the VDOS predicted by PBE0-NN MD (a) and RPMD (b) simulations with experimentally observed vibrational frequencies of pure phosphoric acid from ref. 66. Green arrows indicate vibrational frequencies observed in the experimental Raman spectra of liquid phosphoric acid. |
![]() | ||
Fig. 7 Comparison of the IR spectra of pure phosphoric acid predicted by PBE0-NN MD and RPMD, and CVFF MD simulations with experimental data of pure phosphoric acid from ref. 67. |
Fig. 7 illustrates the IR spectra for phosphoric acid at 298.15 K, as predicted by PBE0-NN and CVFF MD simulations, compared with experimental attenuated total reflectance IR spectra.67 In the 400–1400 cm−1 region, the classical CVFF predicts two absorption peaks: one at 460 cm−1 with greater intensity than the 915 cm−1 peak. However, these peaks differ significantly from experimental data in terms of frequency, shape, and intensity. In contrast, PBE0-NN simulations, both MD and RPMD, produce an absorption peak at 490 cm−1 that closely matches the experimental peak in both frequency and shape. The experimental peak at 915 cm−1, associated with P–O bond vibrations, is more intense than in PBE0-NN simulations, with the RPMD-predicted peak exhibiting a red shift of about 30 cm−1, bringing it closer to the experimental frequency (Table 4).
Peaks | Experiment | PBE0-NN RPMD | PBE0-NN MD | CVFF MD |
---|---|---|---|---|
1 | 470 | 490 | 490 | 460 |
2 | 915 | 885 | 1030 | 915 |
3 | 2780 | 2800 | 3270 | 3060/3750 |
At high frequencies, the PBE0-NN simulations produce a broad absorption band from 2000 to 3800 cm−1, consistent with the experimentally observed broad peak. In contrast, absorption peaks from classical FF are much narrower, showing two distinct peaks that fail to capture the complex hydrogen-bonding network in phosphates. The PBE0-NN method weakens the O–H bond force constant and enhances dipole moment changes during vibration, leading to broader, stronger absorption features. Specifically, PBE0-NN MD simulations reveal a prominent absorption peak at 3270 cm−1, which is blue-shifted by approximately 490 cm−1 from the experimental value. In comparison, PBE0-NN RPMD, which accounts for NQEs of light atoms such as hydrogen, predicts O–H stretching peaks in the range of 2000 to 3500 cm−1. The shape and intensity of this RPMD-predicted peak align well with experimental data, with only a 20 cm−1 blue shift at the peak's maximum.
These findings underscore the critical role of NQEs in influencing proton vibrational properties in liquid phosphoric acid. Classical MD, which neglects NQEs, underestimates the impact of strong hydrogen bonding on O–H vibrations in liquid phosphoric acid. In contrast, RPMD effectively captures these effects by modeling protons as a series of “beads” forming a ring polymer. Overall, these results highlight the strong predictive performance of the PBE0-NN method for the vibrational spectra of phosphoric acid and emphasize the indispensable role of NQEs in facilitating PT within liquid phosphoric acid systems.
Comparing proton diffusion coefficients obtained from RPMD and MD at various temperatures confirmed that NQEs significantly enhance PT. Additionally, RPMD reproduces the vibrational density of states and IR spectra with greater accuracy in both shape and frequency. Our results show that RPMD not only provides more accurate vibrational energy spacing but also captures the effects of strong hydrogen bonding and proton tunneling on the transport and vibrational properties of phosphoric acid. These findings offer valuable insights and a robust foundation for further exploration of NQEs in the conductive properties of phosphoric acid and related materials. Building on this work, future research could explore the integration of NNP and PIMD simulations with low-scaling electron correlation methods.68 This would enable the construction of datasets at the MP2 or CCSD(T) level, thus achieving more accurate NNP-based MD simulations. Additionally, we aim to develop NNs capable of predicting system pressure, which will enable pressure-controlled simulations. This approach will allow us to simulate the viscosity of phosphoric acid and phosphate systems across a range of temperatures and concentrations, thus providing a more comprehensive theoretical framework for understanding their properties.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4cp04195j |
This journal is © the Owner Societies 2025 |