Open Access Article
María Judit Montes de Oca-Estévez
a,
Javier Hernández-Rojas
*b and
Rita Prosmiti
*a
aInstitute of Fundamental Physics, CSIC (IFF-CSIC), Serrano 123, 28006 Madrid, Spain. E-mail: rita@iff.csic.es
bDepartamento de Física and IUdEA, Universidad de La Laguna, 38200 Tenerife, Spain
First published on 27th February 2026
We present a systematic description of the interactions governing HenH+ clusters, derived from gold-standard ab initio data. Two-, three-, and four-body potential contributions generated using CCSD(T)/CBS calculations and a kernel-based machine learning approach are integrated within a many-body expansion formalism. Benchmark tests show that inclusion of up to four-body terms is required to achieve consistent accuracy for larger clusters. Using the validated potential, global minimum searches reveal that structural organization proceeds through sequential binding of He atoms to a rigid linear HeH+He core, with formation of well-defined solvation motifs, while the more weakly bound He atoms display increasing degrees of delocalization. Zero-point-corrected energetics reproduce the experimentally reported stability patterns for n ≤ 13 clusters, while the onset of pronounced He delocalization, associated with microscopic superfluid-like behavior, introduces nontrivial quantum effects that influence stability trends for larger clusters. From a computational perspective, these results establish the central importance of higher-order many-body effects and quantum contributions in accurately describing proton microsolvation in He, while providing a reliable and comprehensive framework for interpreting experimental stability trends in proton-bound noble-gas clusters.
Among these systems, proton-bound complexes of the form HenHm+(n, m ≥ 1) provide a unique platform for studying microsolvation of H+, H2+, and H3+ – species of central importance in the interstellar medium (ISM).19–22 Experimentally, their characterization has relied primarily on high-resolution mass spectrometry. Early studies by Kojima et al.23 identified HenH+ (up to n = 14) and HenH3+ (up to n = 13), revealing distinct magic numbers at n = 13 and n = 10–11, respectively, while more recent ultrahigh-resolution spectroscopy measurements extended these data up to n = 30, quantifying binding energies and uncovering HenH2+ clusters for the first time.24 Additional studies reported enhanced stabilities at n = 2, 6, 11, and 13,25,26 and identified signatures of microscopic superfluidity in the intermediate-size clusters.25 Moreover, cryogenic IR spectroscopy has provided structural fingerprints for small complexes (n = 3–6).27
From the theoretical side, the strongly bound and highly symmetric He2H+ molecular ion is known to act as the structural core of HenH+ clusters, as is the case for the ArnH+ clusters.14,28 Several attempts have been made to calculate the equilibrium structures and relative energies of HenH+ for n ≥ 2.6,17,29–39 As n increases, additional He atoms interact only through very weak dispersive forces, and previous studies have reported multiple low-symmetry minimum-energy structures.25,29,30,34 However, the observed progressive energetic decrease in the calculated evaporation energies between n = 3–6, suggests that He6H+ has a remarkable stability. The latest study25 has also incorporated experimental information obtained via cryogenic ion-trap mass spectrometry up to n = 7, which provides complementary evidence on stability patterns and supports the theoretical characterization.
Despite decades of experimental and theoretical work, a quantitative and transferable description of the interactions governing HenH+ clusters remains a central challenge. Accurate modeling requires treating subtle many-body effects and long–range interactions, along with the sizable quantum contributions arising from the light and weakly bound He atoms. Existing potential energy surfaces (PESs) provide valuable insights for small complexes,10,11,15 but they are not designed to scale reliably to larger cluster sizes. Moreover, recent analyses of related protonated noble-gas clusters14 show that interaction terms up to the four-body (4B) level are necessary to capture the energetics with quantitative fidelity.
The objective of this work is to establish a systematic and size-consistent description of the interactions in HenH+ clusters. To this end, we develop a high-accuracy potential energy surface for the He3H+ tetramer based on single and double excitation coupled cluster with perturbative triples at complete basis set limit (CCSD(T)/CBS) calculations and a reproducing kernel Hilbert space (RKHS) representation. This PES is then integrated within a many-body expansion (2B, 3B, and 4B) to build scalable interaction models applicable to clusters up to n = 34. Extensive benchmark calculations against additional CCSD(T)/CBS reference data demonstrate that accurate modeling requires inclusion of 4B terms, whereas lower-order truncations lead to substantial errors. Finally, the validated 4B potential is employed to characterize the global minima, microsolvation motifs, and size-dependent stabilities of HenH+ clusters. Our results reproduce known magic numbers and predict new stability patterns at larger sizes, offering a comprehensive framework for understanding proton microsolvation in He.
of a system with
-bodies or monomers into 1-body (1B), 2-body (2B), …,
-body
contributions,
![]() | (1) |
-body
term. Each m-body contribution isolates the interaction arising from groups of m monomers, enabling a controlled, hierarchical description of many-body effects.
In practice, the MBE is truncated at a finite order, and in this work, we evaluate the predictive performance of the analytical models V truncated at the 2B, 3B and 4B terms for energetics of HenH+ clusters, retaining accordingly up to:
![]() | (2) |
![]() | (3) |
![]() | (4) |
This decomposition simplifies the analysis of complex cluster, while systematically incorporating higher-order interactions. Here,
denotes the vector connecting the proton to the i-th He atom (with angular coordinates θi, ϕi), while r and R12 = re represent the HeH+ and HeH+He bond vectors in eqn (3) and (4), respectively (see Fig. 1), Ril vectors connect the i-th and l-th He atoms in each HenH+ cluster. In this sense, the VHeH+(R),
and
are the corresponding 2B, 3B and 4B CCSD(T)/CBS potentials for the HeHe,41 He2H+,10 and He3H+ systems, respectively, as developed in the present work and employed in the MBE approach.
, with n = 5 and 6, in order to obtain the CCSD(T)/CBS[56] energies, ECBS = EHF/AV6Z + EcorrCBS.
The configuration space is sampled using the R and θ coordinates with R = 1.30–8.0 Å and θ = 0–90° for re = 1.8496 Å, the equilibrium He–H+–He bondlength from the CCSD(T)/AV6Z geometry optimizations. In total, 1900 grid points were considered, with the interaction energies defined as
. The energy of the total dissociation He3H+ → He + He + He + H+ corresponds to zero (see Table S1 in the SI), whereas
,
, and EHe are the CCSD(T)/CBS[56] energies of He3H+, He2H+, and He at each configuration, respectively. The basis set superposition error (BSSE)47 is rendered negligible upon extrapolation of the interaction energies to the complete basis set (CBS) limit.
The ML-PES was constructed using the reproducing kernel Hilbert space (RKHS) approach,48 in which the potential energy function V(x) is interpolated from a set of known ab initio energies Vi corresponding to molecular configurations xi. By applying the representer theorem, a general functional relationship can be written as a linear combination of kernel functions,
, where Ci are expansion coefficients and K(x, x′) is the reproducing kernel. The coefficients are obtained by solving the linear system C = K−1 y, with
the vector of input energies,
the coefficient vector, and K the
kernel matrix.
For multidimensional systems, the full kernel is constructed as a direct product of one-dimensional (1D) kernels,
, where D is the dimensionality of the coordinate space and kd are the corresponding 1D kernels. Several RKHS kernels have been proposed for different applications; in this work we employ the kn,m1 kernel for the distance-like coordinates R, and the k2 kernel for the angle-like coordinates θ. The angular coordinate is expressed through its reduced form z = cos
θ. The grid contains NR points along R and Nθ points along θ. The distance-like reproducing kernel
is given by
, where χ, χ′
R, χ> and χ< denote the larger and smaller of (χ, χ′), B is the beta function, and 2F1 is the Gaussian hypergeometric function. The parameters n′ and m′ control the smoothness and asymptotic decay of the kernel; here we use n′ = 2 and m′ = 3, corresponding to the leading R−4 dispersion interaction between He and the He–H+–He molecular ion. The angle-like kernel is expanded in terms of Legendre polynomials
, where
denotes the Legendre polynomial of order
.
The final expression of the ML-PES is given as
![]() | (5) |
The best-performed RKHS PES model was identified using a hold-out cross-validation procedure applied to six different training-grid datasets of 200 (20 × 10), 380 (20 × 19), 300 (30 × 10), 570 (30 × 19), 400 (40 × 10), and 760 (40 × 19) points. These configurations varied systematically in the number of radial (20, 30, or 40) and angular (10 or 19) sampling points, thereby providing comprehensive coverage of the two-dimensional coordinate space.
In the left panel of Fig. 2, the energy distribution of the training energies data indicates that approximately 90% of the computed points lie within the first dissociation channel of the system (see Table S1 in the SI), demonstrating that the most physically relevant regions of the PES are well represented in the training set.
In turn, in the right panel of Fig. 2 shows a systematic improvement in the RKHS ML-PES accuracy as the size of the training dataset increases. The root-mean-square-error (RMSE) values were evaluated over several independent testing sets, including the full 1140-point validation grid, the 1090-points below the dissociation limit, and two randomly selected subsets of 570 and 543 points, and all exhibit the same monotonic error reduction. This trend is clearly observed across the 200/380-, 300/570-, and 400/760-point comparisons, highlighting the direct correlation between model performance and the amount of training data.
A closer inspection of the radial sampling reveals that increasing the number of R points from 20 to 40 leads to a substantial and consistent decrease in RMSE, with values ranging from 56.66 to 34.78 cm−1 for energies up to the total dissociation. Notably, the errors remain essentially unchanged when comparing sparsely sampled angular grids (200 = 20 × 10, 300 = 30 × 10, 400 = 40 × 10) with their more densely sampled counterparts (380 = 20 × 19, 570 = 30 × 19, 760 = 40 × 19). This confirms that the θ coordinate plays a minor role in the overall accuracy, whereas the number of radial points is the dominant factor controlling the PES quality. For these reasons, we have selected the 40 × 10 training model for the construction of the He3H+ PES.
The correlation plot in Fig. 3 illustrates the performance of the selected RKHS ML-PES model against both the training set (400 configurations) and the full testing set (1140 configurations), covering the attractive and repulsive regions of the potential. The corresponding RMSE and mean-absolute-error (MAE) values for different energy intervals are also plotted in the figure. The similar RMSE and MAE values indicates that the RKHS model does not produce sporadic large errors and that its predictive performance remains uniform across configurations of different energies. Importantly, the model preserves a controlled error behavior outside the training region, maintaining deviations of approximately 200 cm−1 for the 1090 configurations lying below the dissociation threshold, and errors up to 3% for regions above the dissociation limit.
Moreover, the upper panel of Fig. 4 compares the RKHS ML-PES (solid lines) with the reference CCSD(T)/CBS[56] energies (circle symbols) as a function of the R and θ coordinates (see inset plot). The potential energy curves are shown for θ ranging from 0 to 90° in increments of 5°. In addition, the full 2D RKHS ML potential for the He3H+ cation is also displayed as a contour plot in the lower panel of Fig. 4, in the (θ, R) plane over the energy range −21
450 to −21
050 cm−1. The presence of well-defined minima on the He3H+ PES is clearly evident. The global minimum corresponds to a T-shaped configuration with a well depth of −21
427.60 cm−1 for both the RKHS ML-PES and the CCSD(T)/CBS reference data. This minimum is located at R = 2.16 Å, which is only 0.15 Å larger than the equilibrium distance predicted by the CCSD(T)/AV6Z geometry optimization. The excellent agreement between the ab initio data and the present RKHS ML-PES demonstrates that the RKHS model provides an accurate and smooth representation of this region of the potential energy surface.
![]() | ||
| Fig. 5 Potential curves obtained for the HenH+ (with n = 2, 3, 4 and 6) clusters using the sum-of-potentials approach of eqn (2)–(4) (see color dashed lines), together with the calculated CCSD(T)/CBS[56] interaction energies (black circles) as a function of the indicated Ri coordinates. | ||
The upper panels of Fig. 5 illustrate the results for the n = 2 and n = 3 systems. In the n = 2 case, the HeH+ internuclear distance r was fixed at its equilibrium value of 0.92480 Å, and the V3B is represented by the He2H+ CCSD(T)/CBS potential, while for n = 3, the linear He–H+–He was fixed at re = 1.8496 Å, with the V4B described by the present He3H+ CCSD(T)/CBS[56] PES. For larger clusters, the V2B, V3B and V4B schemes represent increasingly refined truncations of the many-body expansion. For the n = 4 system, we examined selected configurations in which the He–H+–He core was kept at its equilibrium geometry re, and the two additional He atoms were arranged in T-shaped configurations (see the insets in the middle panels). In the middle left panel, the distance to the He atom located outside the core was fixed at 2.10 Å. In the middle right panel, a symmetry constraint R3 = R4 was imposed. For the n = 6 system, the external He atoms were placed in a symmetric square ring around the proton (lower panels of Fig. 5). In the lower left panel, three of the He-H+ distances were fixed at 2.18 Å, whereas in the lower right panel all four He atoms in the ring were allowed to move symmetrically, satisfying R3 = R4 = R5 = R6.
Analysis of the results reveals that, for all systems studied, the V2B model exhibits substantial discrepancies relative to the benchmark CCSD(T)/CBS[56] data. In the vicinity of the potential-well minimum, deviations reach approximately 5500 cm−1, clearly demonstrating that a purely pairwise description is insufficient for modeling the energetics of the HenH+ clusters. Incorporating higher-order contributions through the V3B model significantly reduces these errors. Upon inclusion of the 3B terms, the discrepancies in the well depth decrease to approximately 400 cm−1 for the He3H+, He4H+, and He6H+ complexes, as shown in the top-right, middle-left, and bottom-left panels of Fig. 5, respectively. However, for the second geometrical arrangements of He4H+ and He6H+ (middle-right and bottom-right panels of Fig. 5), the discrepancies increase to roughly 750 and 1500 cm−1, respectively. These cases indicate a limitation of the V3B model, which still fails to fully capture the interaction energy landscape for certain configurations. A substantial improvement is achieved with the V4B potential model. In this case, the deviations from the CCSD(T)/CBS reference data remain below 20 cm−1 across all examined configurations. Based on this validated level of accuracy, the V4B model is adopted for the characterization of interaction energies in all HenH+ clusters with n ≥ 3.
The BH algorithm, originally introduced by Li and Scheraga under the “Monte Carlo plus energy minimization” framework,50 enables the efficient identification of low-energy structures for clusters of up to n = 34 He atoms. This method is particularly well suited for exploring stable configurations in atomic and molecular clusters, both neutral and charged, as well as for more general structural optimization problems.51 By transforming the potential energy landscape into a collection of basins and allowing stochastic hops between local minima, BH provides a robust strategy for sampling the complex configurational space associated with these microsolvated ionic clusters.
All results were obtained using an optimization temperature of kBT = 0.8 meV, and four independent trajectories for each cluster size. In all cases, the global minimum was located within fewer than 104 BH steps. Analysis of the optimized geometries of the HenH+ complexes (see Table S2 in the SI and Fig. 6) reveals a consistent structural motif across all cluster sizes: a central linear core that remains essentially unchanged as additional He atoms are added. This core corresponds to two strongly bound He atoms surrounding the proton, with a binding energy of approximately 4560 cm−1, in sharp contrast to the significantly weaker interactions (50–330 cm−1) associated with the remaining He atoms. This energetic distinction is also reflected in the bond lengths, which are markedly shorter in the core (∼0.925 Å) compared to the more distant peripheral He-H+ connections (2.15–2.25 Å), as shown in Fig. 6.
Furthermore, the optimized geometries reveal that the He atoms outside the central core preferentially occupy positions nearly perpendicular to the principal axis defined by the He–H+–He trimer. This trend persists as the cluster grows, and the first solvation shell is completed at n = 6 with a square bipyramidal arrangement. At n = 7, the addition of one helium atom leads to the formation of a pentagonal bipyramid. For n = 10, the three additional He atoms form a triangular arrangement on one side of the He–H+–He axis at distances of approximately 3.05 Å. When the cluster reaches n = 13, the corresponding triangular motif on the opposite side of the core is completed. Beyond this size, the growth of the HenH+ clusters proceeds through the sequential formation of triangular and pentagonal units on both sides of the central axis, reflecting the layered and increasingly diffuse character of the solvation shells. It is worth noting that the He16H+ cluster also exhibits an asymmetric structure, with one He atom occupying a linear position adjacent to the side pentagon at distance of 3.54 Å from the proton, while the opposite side contains a triangular unit. In contrast, the optimized geometry of He19H+ is fully symmetric, featuring a central pentagon flanked by two complete pentagonal rings, with the remaining two He atoms located along the linear axis at opposite ends of the He–H+–He core. By n = 24, a second central pentagonal ring at distance of 4.2 Å is completed, signaling the onset of a more extended third solvation layer, together with two complete side pentagons at n = 34. This structural analysis indicates that the cationic core forms a remarkably strong bond, whereas the additional He atoms are stabilized only through much weaker interactions. Consequently, the overall properties of the clusters are largely governed by the interplay between the He–H+–He trimer and each subsequently attached He atom.
The optimized potential energies, together with the zero point energy (ZPE) corrected values are presented in Fig. 7 as a function of cluster size up to n = 34. The corresponding MP2/AVQZ and CCSD(T)/AVQZ data are also shown for clusters up to n = 13. As the number of He atoms increases, the potential energy decreases monotonically, reflecting the progressive stabilization of the microsolvated ion. Notably, the energy curve does not follow a single linear trend; instead, four distinct slopes are observed. The changes in slope observed at n = 6, 13, and 19 correspond to the completion of successive solvation layers and are fully consistent with the structural growth patterns identified for the HenH+ clusters. These breakpoints mark the formation of the first and second solvation shells, which appear as pentagonal rings arranged around the He–H+–He axis, and they clearly illustrate the shell-by-shell assembly of the He environment surrounding the central core. This classical layering behavior persists even after the inclusion of ZPE corrections, although the slope transitions become noticeably less pronounced. The ZPE-corrected energies additionally reveal some atypical behavior in the range n = 14–19, which we discuss in detail below.
![]() | ||
Fig. 7 Optimized BH potential energies and corrections for the HenH+ clusters using the V4B SOP expansion. | ||
Fig. 8 displays the single-atom evaporation energies (En−1 − En) as a function of cluster size, obtained from the BH and BH + ZPE simulations and compared with the experimental measurements available for the HenH+ clusters. The experimental data reveal several notable stability features. Ref. 24 reports measurements up to n = 30, identifying peaks at n = 6 and n = 13, along with smaller features at n = 9 for HenH+. In contrast, ref. 26, which extends the measurements to n = 50, finds enhanced stability for the HenH+ only at n = 6, 11, and 13. Despite these differences, all available experimental datasets consistently support enhanced stability at n = 6 and n = 13, and very similar pattern for n > 15 HenH+ clusters. The features at n = 9 and n = 11, however, appear to vary between experiments, suggesting a dependence on experimental conditions or detection sensitivity.
![]() | ||
| Fig. 8 Single-atom evaporative energies (solid lines) for HenH+ clusters, obtained from BH optimizations using the V4B potential and including harmonic ZPE corrections. Experimentally measured ion-yield distributions from (a) ref. 24 and (b) ref. 26 are also shown. | ||
Our BH + ZPE calculations reproduce the experimental peaks for the HenH+ at n = 6, 9, 11 and 13 with remarkable fidelity. Likewise, this approach predicts new peaks at n = 18, and 24, which currently lack experimental confirmation but appear as potential candidates for future verification. The observed changes in the experimental ion-yield curves for n between 15 to 23 and 24 to 32 may therefore be rationalized as signatures of such structural reorganizations within the growing patterns. However, the region between n = 14 and 19 requires particular attention. In this range, the harmonic ZPE approach appears to break down, as both the potential and evaporation energy curves exhibit an unusually flat dependence on cluster size. This behaviour is consistent with the presence of energetically close local minima and indicates substantial anharmonicity and He delocalization, suggesting that more accurate quantum treatments should be employed. Such features mirror the mobility-enhanced regions observed experimentally/theoretically,25,27 which are considered as classical signatures of microscopic superfluidity. For comparison, the BH potential calculations also show peaks at n = 6 and 13, but produce a different overall profile, with additional peaks shifted to n = 10, 15, 17, and 19. Again the discrepancy with experimental curves highlights the critical importance of quantum ZPE corrections, especially in larger clusters where the high delocalization of He atoms significantly influences the relative stability of protonated He clusters. In this context, comparison with ArnH+ clusters14,26 shows that, beyond differences in the strengths of the underlying interactions, quantum effects are expected to be significantly more pronounced in HenH+. Likewise, in contrast to alkali-ion, e.g. Li+, microsolvation in He, where ion-induced dipole interactions dominate, while the presence of a covalently stabilized proton-bound core in HenH+ leads to sharper structural reorganizations and enhanced quantum contributions.
Using the V4B validated PESs, global optimization searches reveal a strongly bound linear He–H+–He structural core across all cluster sizes, onto which additional, weakly bound He atoms attach forming outer layers. The first solvation shell closes at n = 6 forming a square bipyramidal geometry; further growth proceeds through sequential formation of triangular, tetragonal and pentagonal units on both sides of the central pentagon ring around the He–H+–He axis, as seen for n = 19, 24 and 34.
Zero-point-corrected BH + ZPE energetics reproduce with high fidelity the experimentally established stability enhancements at n = 6, 9, 11, and 13, and predict additional stability features at n = 18 and 24, which remain to be experimentally verified. Importantly, the energetic behaviour between n = 14 to 19 deserves particular attention, as both the BH potential and evaporation energy curves, including harmonic ZPE corrections, exhibit an anomalously flat dependence on cluster size. This flattening correlates with the presence of multiple energetically near degenerate minima, signalling significant anharmonicity and pronounced delocalization of the He atoms, allowing permutation-like motion typical of superfluid response. Such effects lie beyond the harmonic ZPE approximation and indicate that more advanced quantum treatments (e.g., anharmonic corrections, path-integral or diffusion Monte Carlo methods) are required to accurately capture the size-dependent stability and structural reorganization occurring in such protonated He clusters.
Overall, this study provides a detailed mechanistic understanding of proton microsolvation in He atoms, characterizing the structural, energetic, and experimental stability trends across a wide cluster sizes. Beyond offering new predictions for unexplored cluster sizes, the methodology established here is broadly transferable and can be extended to other weakly bound protonated noble-gas clusters, enabling systematic and size-consistent modeling of complex quantum solvation phenomena.
| This journal is © The Royal Society of Chemistry 2026 |