Simultaneous learning of static and dynamic charges

Philipp Stärk; Henrik Stooß; Marcel F. Langer; Egor Rumiantsev; Alexander Schlaich; Michele Ceriotti; Philip Loche

doi:10.1039/D6CP00911E

View PDF Version

Open Access Article

This Open Access Article is licensed under a
Creative Commons Attribution 3.0 Unported Licence

DOI: 10.1039/D6CP00911E (Paper) Phys. Chem. Chem. Phys., 2026, Advance Article

Simultaneous learning of static and dynamic charges

Philipp Stärk^ab, Henrik Stooß^c, Marcel F. Langer^d, Egor Rumiantsev^d, Alexander Schlaich^c, Michele Ceriotti^d and Philip Loche^defg
^aStuttgart Center for Simulation Science (SC SimTech), University of Stuttgart, 70569 Stuttgart, Germany
^bInstitute for Computational Physics, University of Stuttgart, 70569 Stuttgart, Germany
^cInstitute for Physics of Functional Materials, Hamburg University of Technology, 21073 Hamburg, Germany. E-mail: alexander.schlaich@tuhh.de
^dLaboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland. E-mail: michele.ceriotti@epfl.ch
^eDepartment of Physics, Technical University of Munich, Garching, Germany. E-mail: philip.loche@tum.de
^fAtomistic Modelling Center, Munich Data Science Institute, Technical University of Munich, Germany
^gMunich Center for Machine Learning (MCML), Munich, Germany

Received 12th March 2026 , Accepted 1st June 2026

First published on 18th June 2026

Abstract

Long-range interactions and electric response are essential for accurate modeling of condensed-phase systems, but capturing them efficiently remains a challenge for atomistic machine learning. Traditionally, these two phenomena can be represented by static charges that underlie Coulomb interactions between atoms, and dynamic charges such as atomic polar tensors—aka Born effective charges—describing the response to an external electric field. We critically compare different approaches to learn both types of charges within a single model architecture, taking bulk water and water clusters as paradigmatic examples: (1) learning them independently; (2) coupling static and dynamic charges based on their physical relationship with a single global coupling constant to account for dielectric screening; (3) coupled learning with a local, environment-dependent screening factor. In the coupled case, correcting for dielectric screening is essential, yet the common assumption of homogeneous, isotropic screening breaks down in heterogeneous systems such as water clusters. A learned, environment-dependent screening restores high accuracy for the dynamic charges. However, the accuracy gain over independent dynamic predictions is negligible, while the computational cost increases compared to using separate models for static and dynamic charges. This suggests that, despite the formal connection between the two charge types, modeling them independently is the more practical choice for both condensed-phase and isolated cluster systems.

1 Introduction

Electric fields influence the structure, dynamics, and reactivity of molecular and condensed-phase systems. Processes such as catalysis, charge transport, or characterization techniques like infrared (IR) spectroscopy depend sensitively on how atoms respond to internal and external electric fields.^1–3 First-principles electronic-structure methods capture these responses with high accuracy, but their computational cost limits their use for large, heterogeneous systems or long simulation times. Machine-learned interatomic potentials (MLIPs) have begun to address this challenge by providing near-quantum-chemical accuracy for the structure and dynamics of molecules and condensed phases at drastically reduced cost.⁴ Modern architectures—often graph neural networks with equivariant message-passing—effectively model complex many-atom correlations, albeit within a finite interaction range and model capacity.^5–8 However, most MLIPs remain restricted to field-free simulations: they do not natively incorporate the effects of external fields or predict electrical response properties, limiting their applicability in technologically relevant scenarios.

Several strategies have been developed to endow MLIPs with electric-field awareness. One class of methods learns dipoles or local dipole contributions,^9–14 but these approaches face fundamental challenges: for periodic systems, dipoles are defined only modulo a polarization quantum,^12,15,16 which requires care during training, as the loss function must account for this multi-valuedness. Other approaches bypass these issues by learning scalar effective charges^17,18 or tensorial quantities such as Born effective charges (BECs), also called atomic polar tensors (APTs).^19–21 These models couple external fields to learned charges, generating auxiliary field-dependent forces and thus avoid the conceptual ambiguities of polarization-based methods. BEC-based models perform well in capturing the linear electric response. However, these models typically neglect an important physical ingredient: the external electric-field response of atoms is intimately linked to the electrostatic interaction between atoms. Effective charges used to model long-range Coulomb forces reflect internally screened interactions, whereas BECs describe the unscreened response of the electron density to an external field. Crucially, the distinction between internal (screened) and external (unscreened) responses becomes pronounced in heterogeneous systems such as clusters, interfaces, or molecular mixtures, where assuming a single homogeneous screening value may not be sufficient.

Besides electric-field response, long-range electrostatics have already been incorporated in MLIPs using fixed charges,^22,23 ML-fitted surrogate charges such as Hirshfeld charges,^24,25 dipole-matching schemes,²⁶ or global charge-equilibration networks.^17,18,27,28 Recently, differentiable Ewald summation frameworks^29,30 have enabled end-to-end charge learning, removing ambiguities in charge assignment.^31–33 A few recent studies attempt to combine long-range electrostatics with electric-field response, for example by deriving BECs from end-to-end learned internal charges.^34,35 These approaches generally assume homogeneous isotropic screening, which can provide decent qualitative behaviour in bulk systems but may fail in heterogeneous environments, limiting their applicability for general-purpose ML potentials.

This breakdown raises a central open question: should static (effective) and dynamic (BEC) charges be learned together, constrained by their physical relationship, or treated as independent quantities? Here, we address this question by systematically evaluating both coupled and uncoupled strategies in long-range ML architectures, including cases with environment-dependent screening, to clarify where and why homogeneous screening assumptions break down and how accuracy can be restored. We critically compare different models in terms of accuracy, interpretability, and computational cost, using high-quality finite-field reference data for bulk water as a prototypical polar liquid with well-characterized experimental reference data including IR spectra³⁶ and water clusters. Water clusters are of direct scientific relevance as precursors of atmospheric ice-nucleating particles³⁷ and as models for aerosol droplets in the context of airborne pathogen transmission.³⁸

2 Theory and models

2.1 Static and dynamic charges

A key concept for describing the response of atomistic systems to externally applied fields is the BEC (or APT) of an atom, which is defined via^19,20


	(1)

where

^α is the α-component of the system polarization density, r^β_i the β-component of the position of atom i, V is the system volume, F^β_i the force on atom i and E^α an externally applied field. The symbols α, β ∈ {x, y, z} specify Cartesian components of the tensors. The right-hand side of eqn (1) motivates the use of these charges for describing electrostatic interactions with external fields, as it represents the corresponding linear-response coefficient—the change in force induced by an applied external field.

We consider for the moment non-periodic systems, where we define partial charges q^IR_i that reproduce the polarization P = [scr P, script letter P] V,


	(2)

This allows us to relate BECs Z_i to static charges q^IR_i through³⁹


	(3)

where the first term mimics the “static charges” q^IR_i, which are sometimes also referred to as the “IR charge” (as it reproduces the system's dipole and thus yields the infrared spectrum, as we will discuss below). The second term in eqn (3) describes the charge flux: the change in the dipole moment due to a collective alteration of the charge distribution as the i-th atom is displaced by an infinitesimal amount. Because Z_i encodes these charge redistribution effects, BECs are also called “dynamic charges”.

2.2 Long-ranged machine learned potentials via learned static charges

As briefly explained in the introduction, a wide range of methods have been proposed for assigning partial “static” charges to nuclei from the continuous electron density. Popular schemes include Mulliken charges,⁴⁰ Hirshfeld charges³³ and the aforementioned IR charges.³⁹ Each of these methods corresponds to a different definition of the partial charges, and there is no unequivocally “best” choice. Given this ambiguity, it is perhaps not too surprising that many different schemes have been used with similar success to capture long-ranged electrostatic interactions in machine-learned interatomic potentials, including the explicit learning of atomic partial charges,^24,25 the learning of the position of Wannier centers⁴¹ and charge equilibration schemes that allow for self-consistent redistribution of atomic charges.²⁷ Atomic charges can also be treated simply as fitting parameters to reproduce quantum mechanical energies and forces. This approach is common in classical forcefields.⁴² It is also used implicitly by schemes such as LODE, where long-range descriptors have fitting coefficients that correspond to charge multipoles.⁴³ Recently, the latent Ewald summation framework has popularized this strategy for end-to-end charge learning.²⁹ Instead of targeting a specific charge partitioning scheme, we focus here on the concept of learned pseudo charges q_i. These are not fitted to any particular definition of atomic charges. Learning static charges indirectly avoids the need to choose an arbitrary partitioning scheme. The model is thus free to infer per-atom quantities q_i(ξ_i) that minimize errors in predicted energies, forces, and, where applicable, Born effective charges (BECs). The model uses descriptors ξ_i of the local environment of each atom i within a finite cutoff [see Fig. 1(A) and (B)]. These descriptors are used to predict both the short-range part of the potential U^sr_i and the atomic pseudo-charges q_i. The pseudo-charges are then used to compute the electrostatic potential and thus ultimately the long-range part of the MLIP. One can go even further and use more general functional forms that do not allow for a transparent interpretation of q_i as atomic static charges. This approach, briefly introduced in the following section, has been shown to further improve the accuracy of energies and forces.⁴⁴


	Fig. 1 Schematic illustration of the two different model architectures. (A) Shows the architecture of an “uncoupled model”, where BECs Z and static charges q are treated independent of one another. (B) Shows a coupled architecture where the learned pseudo charges q_i are related to the BECs through the polarization P, with either a global γ^θ_i = const. or local screening parameters γ^θ_i. The final output of each model is the total force per atom F_i and the electrostatic enthalpy H (see eqn (11)), which reduces to the internal energy H = U in the case of no external fields, E = 0.

2.3 Uncoupled static/dynamic charge prediction

In the present work, the long-ranged contribution to the potential is constructed using learned pseudo-charges q_i introduced above. Without PBC, the electrostatic potential


	(4)

is computed through an automatically differentiable (AD) Coulomb solver. Periodic image contributions are handled via an AD implementation of the Ewald summation.³⁰ To integrate the interactions associated with the Coulomb potential into an energy-decomposable MLIP, we express the long-range energy as


	(5)

Beyond this physical definition of U^lr, we also consider a more flexible long-range formulation introduced by us in ref. 44. Here, the electrostatic potential is processed by a learnable function f_θ, yielding


	(6)

This modification, referred to as “long-range representations with equivariant messages” (LOREM) has been shown to significantly reduce errors across multiple prediction tasks.⁴⁴ We note that in this work we restrict the architectures to learned scalar pseudo charges only, as opposed to the more general equivariant tensorial pseudo charges proposed in ref. 44. When the learnable function is set to be V({q_j,r_j}_j)q_i, the model reduces to eqn (5) and retains interpretability of the pseudo charges. We refer to this latter version as the physical long-range architecture.

The simplest extension of long-ranged MLIPs to also predict BECs is to treat them as independent outputs without connection to the static pseudo charges. An architecture capable of this is shown in Fig. 1A. Here, the pseudo charges q_i(ξ_i) are predicted solely for computing U^lr, while the BECs constitute a second, independent prediction target derived from the same local environment representation ξ_i. This design allows the model to learn static and dynamic charge responses separately, without enforcing any explicit relation between them.

Because we employ an equivariant descriptor ξ_i, predicting Z_i becomes straightforward. To obtain the Born effective charge tensor Z^αβ_i from the local atomic environment, we map the equivariant spherical-harmonic features of each atom through a small neural module. The descriptor (restricted to features up to two angular channels) is first processed by several dense layers applied to the spherical channels. An equivariant tensor-coupling step based on Clebsch–Gordan coefficients mixes the angular components to form all symmetry-allowed rank-2 contributions, which are subsequently linearly recombined into a 3 × 3 matrix, which is fitted to BEC labels derived from DFT (see Methods Section 5.1).⁴⁵ This construction ensures that the predicted tensors transform correctly under rotations. Such an approach is essentially the extension of previous works^19,46 to long-ranged architectures.

2.4 Coupled static/dynamic charge prediction

A second possible strategy is to attempt to approximate the infrared charges q^IR_i via the learned pseudo charges q_i(ξ_i) to construct Z_i via eqn (1) and (3), as has been proposed in recent works.^34,35,47 However, when q_i(ξ_i) are learned through minimization of energy and force errors using eqn (4) and (5), these quantities already include screening effects from the instantaneously responding electronic background. For homogeneous bulk systems³⁴ or isolated molecules in vacuum,³⁵ the screening can be approximated as homogeneous and isotropic. This leads to a simple proportionality between learned and infrared charges


q^IR_i = γq_i(ξ_i).	(7)

Here, the coupling parameter γ can be interpreted in terms of the high-frequency dielectric permittivity, .^34,48,49 The underlying assumption of the latter relation is that learned pseudo charges correspond to physically interpretable partial charges and has been shown to allow for a posteriori prediction of Z_i in a large variety of systems, if one assumes ε_∞ between 1 (isolated molecules) and 1.83 (bulk water).^34,35 Here, we will refrain from interpreting the screening parameter as directly related to ε_∞. Instead, we simply treat γ as a learnable parameter that scales the learned pseudo charges to infrared charges. This scaling is determined by requiring that the resulting charges reproduce BEC targets obtained from DFT calculations. However, as we will show below, the assumptions of homogeneous and isotropic screening break down in inhomogeneous environments such as interfaces, where the dielectric response is well known to become both anisotropic and spatially varying.^50–53 A simple way to resolve the homogeneity assumption is to introduce a local screening factor,


q^IR_i = γ_i(ξ_i)q_i(ξ_i),	(8)

where the scalar coefficient γ_i is predicted from the local environment ξ_i, which follows smoothly from eqn (7). This local formulation captures spatial inhomogeneities, but it still enforces isotropic screening. Promoting γ_i to a tensorial, environment-dependent quantity would be equivalent to directly predicting the full dynamic charge Z_i. Consequently, uncoupled learning of Z_i captures both anisotropy and inhomogeneity, whereas scalar local screening only accounts for the latter.

Using the proposed coupling relations eqn (7) and (8) to predict Z_i via eqn (3) is the basis for the second family of MLIPs that we study, the coupled models. These models, shown in Fig. 1B, use the predicted q_i(ξ_i) to construct the BECs. We explore two variants of these coupled models. (i) In the coupled, global approach, the MLIP is first trained only on energies and forces. Afterwards, a single global screening parameter γ (see eqn (7)) is fitted a posteriori to minimize the BEC error on the validation set. We intend this strategy to closely follow the strategy of ref. 34 and 35. (ii) In the coupled, local approach, we drop the assumption of a uniform screening factor. Instead, the model predicts a local, environment-dependent γ_i (see eqn (8)), learned jointly with energies, forces, and BEC labels. This allows us to test whether relaxing the homogeneity assumption improves BEC prediction accuracy, especially in inhomogeneous systems.

For all coupled architectures an issue arises when incorporating periodic boundary conditions. A naive definition of the polarization density via eqn (2) and (7) can result in ill-defined values, due to the periodic boundaries. This problem can be mitigated in molecular systems by learning molecular dipole contributions,⁵⁴ but this approach is incompatible with the atom-centered, point-charge picture adopted by the models we consider in the present work. Zhong et al.³⁴ solved this by utilizing an artificial complex phase. We instead calculate Z_i from the learned pseudo charges via their positions by


	(9)

where the index PBC accounts for periodic boundary conditions when computing the distances. We explain the motivation behind eqn (9) in detail in Section SI of the SI. Eqn (9) directly follows from recognizing that for neutral systems P is translationally invariant and can thus always be formulated with respect to particle distances, which removes complications due to ambiguous positions in systems with toroidal boundary conditions. For computational efficiency, it is necessary to reformulate eqn (9), which greatly reduces the computational cost of inference and training, following a previously reported approach for computing heat fluxes.^55,56 This optimized implementation is presented in detail in Section SII of the SI.

3 Results and discussion

Fig. 2 compares parity plots of the diagonal BEC components Z^αα for bulk water (panel A) and water clusters (panel B) using the uncoupled model and two coupled variants. Unless noted otherwise, all models in this section are the physical models and were trained on a dataset combining both bulk and cluster configurations (LOREM-model results are reported in Fig. S2 of the SI). See Section 5.2 for full training details.


	Fig. 2 Scatter plot of diagonal components of the BECs Z^αα comparing DFT values with model predictions for bulk water (A) and water clusters (B).

Overall, the uncoupled and the coupled model with local screening (γ_i) both reproduce the BEC labels accurately for the bulk and for the clusters. The global-screening model, however, shows a clear deterioration in performance on both subsets. This stems from the global model's homogeneous-screening assumption: as expected from our discussion above, a single γ cannot simultaneously capture the different effective dielectric responses present in homogeneous bulk and inhomogeneous cluster environments. For this global-γ model—where γ was obtained by an a posteriori fit to the BEC labels of the validation-set—we find γ = 1.974. While one could reduce the error further by fitting distinct screening factors for bulk and cluster subsets, this strategy requires identifying and splitting the dataset into several classes based on their structure. As dataset size and diversity grow, such manual classification becomes increasingly impractical.

Furthermore, the local screening varies strongly in clusters. Fig. 3 shows the distributions of local screening values γ_i predicted by the coupled local-γ_i model as a function of the number of molecules N_mol per cluster. For small clusters the predicted γ_i values are tightly distributed, whereas the distributions broaden as cluster size increases. Notably, we observe pronounced differences for dangling OH groups, highlighted by the color-coded structures in Fig. 3A. This trend is also reflected in the bimodal distribution of hydrogen-atom γ_i values in cluster structures, which we attribute to the distinction between hydrogens engaged in hydrogen bonding and those exposed at the cluster surface.


	Fig. 3 (A) Snapshot of water clusters with 6 and 20 molecules. Atoms are colored according to their local screening values γ_i. An interactive view for the whole validation set is provided online https://doi.org/10.24435/materialscloud:fs-8h. (B) Distribution of the local screening values from the coupled γ_i model as a function of water cluster sizes. Gray dashed line shows the average value of γ_i for bulk structures.

In Fig. 3B, the dashed gray line indicates the mean screening value for bulk structures, [small gamma, Greek, macron] _i = 1.22 ± 0.09, which the γ_i values approach for larger clusters, as expected. Together, these findings provide a physically interpretable explanation for the coupling between learned pseudo charges, infrared charges, and BECs. Accurate modeling in heterogeneous environments requires this coupling to depend on the local atomic environment. By contrast, a single global screening parameter fails to capture the variability present in inhomogeneous structures.

In order to estimate the general fidelity of the different MLIP architectures, we compare the different models in terms of their mean absolute errors (MAEs) on validation sets for energies, forces and BECs (Fig. 4). Discussing first the physical models, we find differences in energy and force errors to be minor, with the uncoupled model performing overall the best. As already observed in Fig. 2, the coupled global γ model performs significantly worse on BEC predictions, while the uncoupled and coupled local γ_i models achieve similar accuracies for predicting BECs, underscoring the need for environment dependent coupling between learned pseudo charges and BECs for these systems. Importantly, modeling the coupling increases the computational cost by a constant prefactor relative to the uncoupled model, while the asymptotic scaling with system size remains unchanged (see Fig. S1 in the SI).


	Fig. 4 Mean absolute errors (MAE) on the validation sets for the energy U (A), forces F (B) and the BEC diagonal elements Z (C). Different colors depict different models. Light color bars show physical models while desaturated bars show the LOREM models.

So far, all models were trained utilizing the physically motivated Coulomb interaction of eqn (5). To investigate if a more expressive but less physically interpretable model utilizing eqn (6) may perform even better, we also trained models with the LOREM formulation. As can be seen in Fig. 4, this modification improves performance on all targets with a particularly strong improvement in bulk potential energy errors. The parity plot (similar to Fig. 2) for the LOREM models are presented in the SI (Fig. S2), showing qualitative agreement with the physical models and mirroring the need for an environment dependent coupling parameter to achieve good BEC accuracy on cluster and bulk structures. However, as shown in Fig. S3, the local screening values γ_i predicted by the LOREM models are no longer physically interpretable due to the learnable mapping f_θ.

Finally, we assess the behavior of the different physical models in predicting infrared spectra from molecular dynamics simulations. Computational details are provided in method Section 5.3. Fig. 5 shows the imaginary part of the complex susceptibility for periodic bulk water at T = 300 K and for the cage and book configurations of water hexamer clusters at T = 10 K. The spectra are organized into two columns: the left column (panels A, C, E) shows the OH-bending mode around ∼50 THz, while the right column (panels B, D, F) shows the OH-stretching mode around ∼100 THz. We note that nuclear quantum effects are neglected, so comparison with experiment is qualitative, although error cancellation in GGA functionals yields reasonable agreement for bulk water. To contextualize the sensitivity of spectra to the predicted BECs, we also provide results for an analysis based on constant scalar charges. To this end, we calculate the average diagonal component of the BECs per element and assign this charge to all O and H atoms for the analysis of the entire trajectory. The results of this procedure are given by the purple lines in Fig. 5.


	Fig. 5 Imaginary part of the susceptibility spectrum of periodic bulk water at T = 300 K (A + B), cage (C + D) and book (E + F) configuration of water hexamer clusters at T = 10 K. We show results for the three model architectures discussed in this work and for reference the result of a simple constant scalar charge analysis (purple line). The black dashed line shows experimental bulk reference spectrum taken from ref. 36.

Focusing first on a comparison of models in the bending mode in panels A, C, E, a consistent hierarchy emerges: the fixed charge model (based on atom-type averaged charges), global screening (γ), local screening (γ_i), and uncoupled models estimate the bending mode amplitude in decreasing order. In bulk water, where we also provide an experimental reference, this ordering is particularly evident. The global γ model strongly overestimates the bending intensity (A) and underestimates the stretching band (B), which is additionally shifted to higher frequencies. The fixed-charge model shows similar but slightly more pronounced deviations from the experimental reference.

In contrast, both the γ_i and uncoupled models reproduce intensities and peak positions much more accurately, in closer agreement with the experiment. Discussing in more detail the results for the water hexamers, we find similar but slightly less pronounced trends. As the spectral features of these isomers have been discussed extensively elsewhere,^36,57,58 we focus here on differences between models. Overall, all models reproduce the main spectral features, indicating consistent nuclear dynamics. Differences arise mainly in intensities and subtle frequency shifts. The bending-mode intensity again increases with decreasing model flexibility (C, E), following the same hierarchy as discussed above, while differences in the stretching region (D, F) are more subtle. The LOREM models give spectra supporting the same conclusion, as shown in Fig. S4 in the SI. In summary, these results are consistent with the model errors discussed above, in that the global screening model shows the largest deviations in predicted infrared spectra, including for the experimentally accessible bulk water case. This reinforces the conclusion that a more local treatment of the coupling between latent and dynamic charges, as realized in the γ_i model, or the complete decoupling in the uncoupled model, more accurately captures the charge redistribution underlying IR activity, particularly for the OH bending/stretching part of the spectrum, where the dynamic part of BEC contributions are expected to be pronounced. At the same time, the overall spectral shape is already reasonably well reproduced even with the simplified fixed-charge analysis, indicating that IR spectra are primarily governed by the nuclear dynamics and the symmetry of vibrational modes, while still reflecting the improved physical consistency of more flexible charge-coupling schemes.

4 Conclusions

Machine-learned interatomic potentials increasingly attempt to incorporate long-range electrostatics and external-field coupling by introducing atom-centered static charges and, in some cases, by exploiting a physically motivated relationship between static and dynamic charges. However, an explicit assessment of whether this physical motivation enables reliable prediction of dynamical charges without explicitly training on them is lacking, and there are no systematic studies evaluating whether coupled architectures with an explicit dynamic-charge target outperform approaches that treat static and dynamic charges separately.

In this work, we systematically examine this question within models that incorporate explicit electrostatic interactions. Using water clusters and bulk water as controlled test systems—where high-quality reference data are available and long-range polarization effects are essential—we demonstrate that even in this comparatively simple setting, static charges learned as coefficients of a Coulomb term are only correlated with Born effective charges (BECs). As we show, the relationship is not captured well with a single factor across heterogeneous environments, breaking down when fitting on systems such as clusters of varying size. We show that restoring quantitative agreement within coupled models for heterogeneous datasets requires employing spatially varying screening coefficients. While this improves accuracy, it removes the anticipated simplicity of the physically constrained approach and does not provide a clear computational advantage over directly learning BECs as an independent target. In contrast, treating dynamic charges as a separate, well-defined observable yields higher accuracy, improved infrared intensities, and lower complexity.

Even though it is tempting to avoid the explicit calculation of reference BECs, especially given that observables such as IR spectra are comparatively insensitive to moderate errors in the predicted dynamical charges, our results show that measurable and systematic differences nevertheless emerge, in particular for bulk water where experimental reference data are available. In this case, models incorporating local screening or decoupled charge responses reproduce the relative intensities and peak positions more accurately than approaches based on global screening or fixed charges. Our results indicate that whenever heterogeneous datasets are considered, quantitative accuracy is strongly improved by explicit training on BECs. For these reasons, we recommend (i) using electrostatic potentials to capture long-range interactions without attributing physical meaning to static charges, which are inherently model-dependent quantities, and (ii) learning dynamic charges explicitly and independently from well-defined ab initio BEC data.

5 Methods

5.1 Dataset construction

The bulk water structures are taken from ref. 59. The water cluster structures are composed from three sources: (1) the water cluster subset of Hobza's benchmark energy and geometry data base (BEGDB),⁶⁰ (2) the WATER27 component of the GMTKN24 and GMTKN30 benchmark suites⁶¹ and (3) dimer and trimer structures from ref. 62. These sets contain very distinct geometrical structures from 1 to 20 water molecules. To extend the dataset size we performed molecular dynamics simulations for each cluster size at a temperature of 400 K for 500 ps with a 0.5 fs timestep using the universal PET-MAD model.⁶³ We combined the initial structures with the MD runs and selected 2000 structures using farthest point sampling using the PET-MAD descriptor.⁶⁴

For consistent labeling of the structures, we recalculated energies, forces and BECs for all structures with DFT at the revPBE-D3 level using the CP2K package.⁶⁵ These calculations employed the revised Perdew–Burke–Ernzerhof (PBE) exchange–correlation functional, a DZVP-MOLOPT atom-centered basis set, and Goedecker–Teter–Hutter (GTH) pseudopotentials.

To compute BECs via finite electric fields we start from the right-hand-side of eqn (1), indicating that BECs can be obtained as the derivative of atomic forces with respect to externally applied fields. This relation, strictly only true in the infinitesimal limit, can be approximated via central finite differences, giving us


	(10)

where E^β is the magnitude of the externally applied field in Cartesian direction β and F^α_i is the α Cartesian component of the force on atom i. This reduces the calculation of the per-atom BECs to six plus one single-point calculations in DFT, where reusing the initial wave-function guess further significantly reduces the computational cost. We show in Fig. S5 of the SI that the estimated values for the BECs from this scheme are stable across a very wide range of externally applied field strengths. To apply homogeneous external fields we use the implementation by Souza et al. and Umari and Pasquarello^66,67 (see also ref. 68). For production calculations, finite field strengths of 0.026 V Å⁻¹ were applied for the central finite-difference scheme in eqn (10).

5.2 Training procedure

We construct the model to output the enthalpy H, such that the total force on atom i is given by F_i = −∇_{r_i}H via


	(11)

where P_ref is the dipole of a reference configuration which serves to obtain a consistent definition in periodic settings.¹⁵ The internal energy per atom i is simply given by the sum of long and short-ranged contributions U^model_i = U^sr_i + U^lr_i. The force given by the gradient of eqn (11) with respect to r_i is thus given by:


	(12)

neglecting second order derivatives of Z_i with respect to r_i. Provided P is known for a reference configuration (e.g. via an ab initio calculation for the starting structure), the enthalpy H is well-defined through P_ref during a simulation run. This procedure is in principle similar to what has been proposed in ref. 11, even though here the ambiguity in the definition of H is made explicit. We note the absence of P_ref for the force in eqn (12), hence forces are always well-defined, even when P is ambiguous.

Finally, training of all models is performed by constructing a typical force and energy loss function via an L₂ norm, i.e.


	(13)

where α_F, α_U, α_BEC are hyper-parameters of the learning procedure. For some experiments, we set α_BEC of the coupled global γ model to zero in order to reproduce the work in ref. 34, where BECs are fitted a posteriori from learned pseudo charges. To encourage charge-neutral predictions, we include an explicit neutrality penalty in the loss function for all models. This penalty term suppresses artifacts associated with the uniform background charge inherent to Ewald-sum-based electrostatics⁶⁹ and is defined as


	(14)

with ε = 1 × 10⁻¹² a small number to ensure numerical stability and α_neutral again a hyper-parameter. Unless otherwise noted, all loss weights were set to 1.

For the short-range features, we follow the architecture described in ref. 44. All models use a cutoff radius of r_c = 5.0 Å and spherical harmonics up to degree 6, with eight spherical channels. The radial dependence is represented using 32 radial basis functions, and the chemical embedding uses eight channels. We employ a cosine cutoff to ensure smooth behavior at the cutoff distance, and perform the radial expansion using basic Bernstein basis functions.⁷⁰ The network uses the SiLU activation function. No message passing is applied, and for each atom a single floating-point value is predicted, representing its pseudo charge.

Training is performed using the Adam optimizer with an initial learning rate of 6 × 10⁻⁵ and a batch size of one structure. Optimization proceeds until the loss no longer improves significantly. For validation, we use a random 80 [thin space (1/6-em)] :20 train–validation split.

For the training of the coupled models with global screening parameter γ, we first train the model purely on energy and force labels. Utilizing the pseudo charge predictions of this model (through eqn (3) for non-periodic structures and eqn (9) for periodic structures) we determine a single γ value which minimizes the absolute error on BEC predictions on the entire validation set, including clusters and bulk structures simultaneously. This value is then used for all subsequent evaluations and simulations of the model.

5.3 Molecular dynamics and IR spectra

All simulations were carried out using the atomic simulation environment (ASE).⁷¹ A time step of 0.5 fs was used in conjunction with a Bussi–Donadio–Parinello thermostat⁷² to simulate bulk water and water clusters at constant temperature and constant volume. For simulations of clusters, we removed the center of mass velocity and rotation. All results for clusters shown in the main work are obtained as the average over 95 independent simulations of 50 ps per system and model. For the bulk systems, twenty simulations of length 100 ps are used. For comparison the SI shows results for the LOREM models obtained from five independent simulations for each system and model. The polarization time derivative follows from the chain rule¹⁹


	(15)

with v_i the particle's velocity. The frequency-dependent complex electric susceptibility is obtained from the polarization–polarization time correlation function³⁶


	(16)

where P(t) is calculated from eqn (15) via numerical integration. Using the Wiener–Khinchin theorem, we compute the dissipative imaginary part χ″ from the polarization spectrum as


	(17)

where

(ω) is the Fourier transform of the polarization time series and where L_t is the length in time of the polarization P(t).

For the reference analysis method utilizing fixed charges, we use the trajectory obtained from the uncoupled model and take the average [Z with combining macron] _αα of the diagonal BEC components over 50 ps per atomic species. The dipole IR spectra are then calculated via eqn (17) via .

Author contributions

P. L., M. C. and A. S. designed the study. P. S. implemented the method, trained the models, performed simulations and analysis. P. L. curated the datasets, created the visualizations, and assisted with implementation and simulations. M. F. L. and E. R. contributed significantly to the design of the models, with E. R. additionally supporting model training and figure design. H. S. performed the density functional theory calculations. M. C. and A. S. supervised the project. All authors contributed to the writing of the manuscript and analysis of the results.

Conflicts of interest

There are no conflicts to declare.

Data availability

Supplementary information (SI) is available, showing additional details on the implementation of the coupled model, computational scaling analysis, results for the LOREM models and an analysis of the numerical stability for the finite difference scheme used to obtain BEC labels from DFT. See DOI: https://doi.org/10.1039/d6cp00911e.

The atomic structures used to train the models are available from https://doi.org/10.24435/materialscloud:fs-8h. In the same entry we also provide a chemiscope visualization for the validation set of the water clusters for the local gamma models containing the structures, screening values and learned pseudo charges.

Code availability: the trained models, training scripts, CP2K input files to generate the dataset, and the code to reproduce this study is available on GitHub at https://github.com/pstaerk/si_charge_learning_bec.

Acknowledgements

We thank Michelangelo Domina, Philipp Schienbein, Bingqing Cheng, Venkat Kapil and Matthias Kellner for the insightful discussions. The work of A. S. and P. S. was funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC 2075/1 – 390740016 and further supported by EXC 3120/1 – 533771286. We acknowledge the support by the Stuttgart Center for Simulation Science (SimTech). H. S. was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under SFB 1333/2 – 358283783. M. F. L. acknowledges funding from the German Research Foundation (DFG) under project number 544947822. M. C. and P. L. acknowledge funding from the NCCR MARVEL, funded by the Swiss National Science Foundation (SNSF, grant number 182892) and from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement no 101001890-FIAMMA).

Notes and references

Y. Pan, X. Wang, W. Zhang, L. Tang, Z. Mu, C. Liu, B. Tian, M. Fei, Y. Sun, H. Su, L. Gao, P. Wang, X. Duan, J. Ma and M. Ding, Nat. Commun., 2022, 13, 3063 CrossRef CAS PubMed.
S. Ciampi, N. Darwish, H. M. Aitken, I. Díez-Pérez and M. L. Coote, Chem. Soc. Rev., 2018, 47, 5146–5164 RSC.
N. Salles, L. Martin-Samos, S. de Gironcoli, L. Giacomazzi, M. Valant, A. Hemeryck, P. Blaise, B. Sklenard and N. Richard, Nat. Commun., 2020, 11, 3330 CrossRef.
O. T. Unke, S. Chmiela, H. E. Sauceda, M. Gastegger, I. Poltavsky, K. T. Schütt, A. Tkatchenko and K.-R. Müller, Chem. Rev., 2021, 121, 10142–10186 CrossRef PubMed.
K. Schütt, P.-J. Kindermans, H. E. Sauceda Felix, S. Chmiela, A. Tkatchenko and K.-R. Müller, Adv. Neural Inf. Process. Syst., 2017, 30, 992–1002 Search PubMed.
N. Thomas, T. Smidt, S. Kearnes, L. Yang, L. Li, K. Kohlhoff and P. Riley, arXiv, 2018, preprint, arXiv:1802.08219 DOI:10.48550/arXiv.1802.08219.
S. Batzner, A. Musaelian, L. Sun, M. Geiger, J. P. Mailoa, M. Kornbluth, N. Molinari, T. E. Smidt and B. Kozinsky, Nat. Commun., 2022, 13, 2453 CrossRef PubMed.
I. Batatia, D. P. Kovacs, G. Simm, C. Ortner and G. Csányi, Adv. Neural Inf. Process. Syst., 2022, 35, 11423–11436 Search PubMed.
Z. Li and S. Scandolo, J. Chem. Phys., 2026, 164, 044107 Search PubMed.
V. Kapil, D. P. Kovács, G. Csányi and A. Michaelides, Faraday Discuss., 2024, 249, 50–68 RSC.
S. Falletta, A. Cepellotti, A. Johansson, C. W. Tan, A. Musaelian, C. J. Owen and B. Kozinsky, Nat. Commun., 2025, 16, 4031 CrossRef CAS.
E. Stocco, C. Carbogno and M. Rossi, Mater, 2025, 11, 304 CAS.
M. Veit, D. M. Wilkins, Y. Yang, R. A. DiStasio, Jr. and M. Ceriotti, J. Chem. Phys., 2020, 153, 024113 CrossRef CAS PubMed.
B. A. A. Martin, A. M. Ganose, V. Kapil, T. Li and K. T. Butler, General Learning of the Electric Response of Inorganic Materials, arXiv, 2025, preprint, arXiv:2508.17870 DOI:10.48550/arXiv.2508.17870.
R. Resta and D. Vanderbilt, Physics of Ferroelectrics, Springer Berlin Heidelberg, Berlin, Heidelberg, 2007, vol. 105, pp. 31–68 Search PubMed.
N. A. Spaldin, J. Solid State Chem., 2012, 195, 2–10 CrossRef CAS.
T. Dufils, L. Knijff, Y. Shao and C. Zhang, J. Chem. Theory Comput., 2023, 19, 5199–5209 CrossRef CAS PubMed.
J. Li, L. Knijff, Z.-Y. Zhang, L. Andersson and C. Zhang, J. Chem. Theory Comput., 2025, 21, 1382–1395 CrossRef CAS PubMed.
P. Schienbein, J. Chem. Theory Comput., 2023, 19, 705–712 Search PubMed.
K. Joll, P. Schienbein, K. M. Rosso and J. Blumberger, Nat. Commun., 2024, 15, 8192 CrossRef CAS PubMed.
N. Bergmann, N. Bonnet, N. Marzari, K. Reuter and N. G. Hörmann, Phys. Rev. Lett., 2025, 135, 146201 CrossRef CAS.
A. P. Bartók, M. C. Payne, R. Kondor and G. Csányi, Phys. Rev. Lett., 2010, 104, 136403 CrossRef PubMed.
Z. Deng, C. Chen, X.-G. Li and S. P. Ong, npj Comput. Mater., 2019, 5, 1–8 CAS.
N. Artrith, T. Morawietz and J. Behler, Phys. Rev. B:Condens. Matter Mater. Phys., 2011, 83, 153101 CrossRef.
T. Morawietz, V. Sharma and J. Behler, J. Chem. Phys., 2012, 136, 064103 CrossRef PubMed.
K. Yao, J. E. Herr, D. W. Toth, R. Mckintyre and J. Parkhill, Chem. Sci., 2018, 9, 2261–2269 RSC.
T. W. Ko, J. A. Finkler, S. Goedecker and J. Behler, Nat. Commun., 2021, 12, 398 CrossRef CAS PubMed.
R. Gao, C. Yam, J. Mao, S. Chen, G. Chen and Z. Hu, Nat. Commun., 2025, 16, 10484 CrossRef CAS.
B. Cheng, npj Comput. Mater., 2025, 11, 1–8 Search PubMed.
P. Loche, K. K. Huguenin-Dumittan, M. Honarmand, Q. Xu, E. Rumiantsev, W. B. How, M. F. Langer and M. Ceriotti, J. Chem. Phys., 2025, 162, 142501 CrossRef CAS.
K. C. Gross, P. G. Seybold and C. M. Hadad, Int. J. Quantum Chem., 2002, 90, 445–458 CrossRef CAS.
A. E. Reed, R. B. Weinstock and F. Weinhold, J. Chem. Phys., 1985, 83, 735–746 CrossRef CAS.
F. L. Hirshfeld, Theor. Chim. Acta, 1977, 44, 129–138 CrossRef CAS.
P. Zhong, D. Kim, D. S. King and B. Cheng, npj Comput. Mater., 2025, 11, 384 CrossRef CAS.
D. Kim, X. Wang, P. Zhong, D. S. King, T. J. Inizan and B. Cheng, J. Chem. Theory Comput., 2025, 21, 12709–12724 CrossRef PubMed.
S. Carlson, F. N. Brünig, P. Loche, D. J. Bonthuis and R. R. Netz, J. Phys. Chem. A, 2020, 124, 5599–5605 CrossRef CAS.
D. A. Knopf, P. A. Alpert and B. Wang, ACS Earth Space Chem., 2018, 2, 168–202 CrossRef CAS.
R. R. Netz, J. Phys. Chem. B, 2020, 124, 7093–7101 CrossRef CAS PubMed.
A. Milani and C. Castiglioni, J. Mol. Struct. THEOCHEM, 2010, 955, 158–164 CrossRef CAS.
R. S. Mulliken, J. Chem. Phys., 1955, 23, 1833–1840 CrossRef CAS.
L. Zhang, H. Wang, M. C. Muniz, A. Z. Panagiotopoulos, R. Car and W. E, J. Chem. Phys., 2022, 156, 124107 CrossRef CAS PubMed.
L.-P. Wang, T. J. Martinez and V. S. Pande, J. Phys. Chem. Lett., 2014, 5, 1885–1891 CrossRef CAS PubMed.
K. K. Huguenin-Dumittan, P. Loche, N. Haoran and M. Ceriotti, J. Phys. Chem. Lett., 2023, 9612–9618 CrossRef CAS PubMed.
E. Rumiantsev, M. F. Langer, T.-E. Sodjargal, M. Ceriotti and P. Loche, Learning Long-Range Representations with Equivariant Messages, Transactions on Machine Learning Research, 2026 Search PubMed.
O. T. Unke and H. Maennel, E3x: E(3)-Equivariant Deep Learning Made Easy, 2024 Search PubMed.
B. Schmiedmayer and G. Kresse, J. Chem. Phys., 2024, 161, 084703 CrossRef CAS.
B. Schmiedmayer, A. Rittsteuer, T. Hilpert and G. Kresse, Scalar Machine Learning of Tensorial Quantities - Born Effective Charges from Monopole Models, arXiv, 2026, preprint, arXiv:2602.04773 DOI:10.48550/arXiv.2602.04773.
I. V. Leontyev and A. A. Stuchebrukhov, J. Chem. Phys., 2009, 130, 085102 CrossRef CAS PubMed.
I. V. Leontyev and A. A. Stuchebrukhov, J. Chem. Theory Comput., 2010, 6, 3153–3161 CrossRef CAS.
A. A. Kornyshev, J. Electroanal. Chem. Interfacial Electrochem., 1986, 204, 79–84 CrossRef CAS.
H. A. Stern and S. E. Feller, J. Chem. Phys., 2003, 118, 3401–3412 CrossRef CAS.
V. Ballenegger and J.-P. Hansen, J. Chem. Phys., 2005, 122, 114711 CrossRef CAS PubMed.
P. Stärk, H. Stooß, P. Loche, D. J. Bonthuis, R. R. Netz and A. Schlaich, Chem. Phys. Rev., 2026, 7, 011319 CrossRef.
T. Bereau, D. Andrienko and O. A. von Lilienfeld, J. Chem. Theory Comput., 2015, 11, 3225–3233 CrossRef CAS.
M. F. Langer, J. T. Frank and F. Knoop, J. Chem. Phys., 2023, 159, 174105 CrossRef CAS PubMed.
M. F. Langer, F. Knoop, C. Carbogno, M. Scheffler and M. Rupp, Phys. Rev. B, 2023, 108(10), L100302 CrossRef CAS.
C. Pérez, M. T. Muckle, D. P. Zaleski, N. A. Seifert, B. Temelso, G. C. Shields, Z. Kisiel and B. H. Pate, Science, 2012, 336, 897–901 CrossRef.
Y. Wang and J. M. Bowman, J. Phys. Chem. Lett., 2013, 4, 1104–1108 CrossRef CAS PubMed.
B. Cheng, E. A. Engel, J. Behler, C. Dellago and M. Ceriotti, Proc. Natl. Acad. Sci. U. S. A., 2019, 116, 1110–1115 CrossRef CAS PubMed.
B. Temelso, K. A. Archer and G. C. Shields, J. Phys. Chem. A, 2011, 115, 12034–12046 CrossRef CAS PubMed.
V. S. Bryantsev, M. S. Diallo, A. C. T. van Duin and W. A. I. Goddard, J. Chem. Theory Comput., 2009, 5, 1016–1026 CrossRef CAS PubMed.
T. T. Nguyen, E. Székely, G. Imbalzano, J. Behler, G. Csányi, M. Ceriotti, A. W. Götz and F. Paesani, J. Chem. Phys., 2018, 148, 241725 CrossRef.
A. Mazitov, F. Bigi, M. Kellner, P. Pegolo, D. Tisi, G. Fraux, S. Pozdnyakov, P. Loche and M. Ceriotti, Nat. Commun., 2025, 16, 10653 CrossRef CAS PubMed.
A. Mazitov, S. Chorna, G. Fraux, M. Bercx, G. Pizzi, S. De and M. Ceriotti, Sci. Data, 2025, 12, 1857 Search PubMed.
T. D. Kühne, M. Iannuzzi, M. Del Ben, V. V. Rybkin, P. Seewald, F. Stein, T. Laino, R. Z. Khaliullin, O. Schütt, F. Schiffmann, D. Golze, J. Wilhelm, S. Chulkov, M. H. Bani-Hashemian, V. Weber, U. Borštnik, M. Taillefumier, A. S. Jakobovits, A. Lazzaro, H. Pabst, T. Müller, R. Schade, M. Guidon, S. Andermatt, N. Holmberg, G. K. Schenter, A. Hehn, A. Bussy, F. Belleflamme, G. Tabacchi, A. Glöß, M. Lass, I. Bethune, C. J. Mundy, C. Plessl, M. Watkins, J. VandeVondele, M. Krack and J. Hutter, J. Chem. Phys., 2020, 152, 194103 CrossRef PubMed.
I. Souza, J. Íñiguez and D. Vanderbilt, Phys. Rev. Lett., 2002, 89, 117602 CrossRef PubMed.
P. Umari and A. Pasquarello, Phys. Rev. Lett., 2002, 89, 157602 CrossRef CAS.
M. Stengel, N. A. Spaldin and D. Vanderbilt, Nat. Phys., 2009, 5, 304–308 Search PubMed.
S. A. Barr and A. Z. Panagiotopoulos, Phys. Rev. E:Stat., Nonlinear, Soft Matter Phys., 2012, 86, 016703 CrossRef CAS PubMed.
G. G. Lorentz, Bernstein Polynomials, American Mathematical Society, New York, NY, 1986 Search PubMed.
A. Hjorth Larsen, J. Jørgen Mortensen, J. Blomqvist, I. E. Castelli, R. Christensen, M. Duak, J. Friis, M. N. Groves, B. Hammer, C. Hargus, E. D. Hermes, P. C. Jennings, P. Bjerre Jensen, J. Kermode, J. R. Kitchin, E. Leonhard Kolsbjerg, J. Kubal, K. Kaasbjerg, S. Lysgaard, J. Bergmann Maronsson, T. Maxson, T. Olsen, L. Pastewka, A. Peterson, C. Rostgaard, J. Schiøtz, O. Schütt, M. Strange, K. S. Thygesen, T. Vegge, L. Vilhelmsen, M. Walter, Z. Zeng and K. W. Jacobsen, J. Phys.: Condens. Matter, 2017, 29, 273002 CrossRef PubMed.
G. Bussi, D. Donadio and M. Parrinello, J. Chem. Phys., 2007, 126, 014101 CrossRef.

Click here to see how this site uses Cookies. View our privacy policy here.